From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39547) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1btTcD-00074h-EZ for qemu-devel@nongnu.org; Mon, 10 Oct 2016 01:59:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1btTc9-0004Rf-73 for qemu-devel@nongnu.org; Mon, 10 Oct 2016 01:59:08 -0400 Date: Mon, 10 Oct 2016 16:05:14 +1100 From: David Gibson Message-ID: <20161010050514.GE22498@umbus.fritz.box> References: <1475519097-27611-1-git-send-email-duanj@linux.vnet.ibm.com> <1475519097-27611-6-git-send-email-duanj@linux.vnet.ibm.com> <20161007033607.GO18490@umbus.fritz.box> <20161007145251.9563.39710@loki> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="zaRBsRFn0XYhEU69" Content-Disposition: inline In-Reply-To: <20161007145251.9563.39710@loki> Subject: Re: [Qemu-devel] [QEMU PATCH v5 5/6] migration: spapr: migrate ccs_list in spapr state List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Michael Roth Cc: Jianjun Duan , qemu-devel@nongnu.org, qemu-ppc@nongnu.org, dmitry@daynix.com, peter.maydell@linaro.org, kraxel@redhat.com, mst@redhat.com, pbonzini@redhat.com, veroniabahaa@gmail.com, quintela@redhat.com, amit.shah@redhat.com, mreitz@redhat.com, kwolf@redhat.com, rth@twiddle.net, aurelien@aurel32.net, leon.alrae@imgtec.com, blauwirbel@gmail.com, mark.cave-ayland@ilande.co.uk, dgilbert@redhat.com --zaRBsRFn0XYhEU69 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Oct 07, 2016 at 09:52:51AM -0500, Michael Roth wrote: > Quoting David Gibson (2016-10-06 22:36:07) > > On Mon, Oct 03, 2016 at 11:24:56AM -0700, Jianjun Duan wrote: > > > ccs_list in spapr state maintains the device tree related > > > information on the rtas side for hotplugged devices. In racing > > > situations between hotplug events and migration operation, a rtas > > > hotplug event could be migrated from the source guest to target > > > guest, or the source guest could have not yet finished fetching > > > the device tree when migration is started, the target will try > > > to finish fetching the device tree. By migrating ccs_list, the > > > target can fetch the device tree properly. > > >=20 > > > ccs_list is put in a subsection in the spapr state VMSD to make > > > sure migration across different versions is not broken. > > >=20 > > > Signed-off-by: Jianjun Duan > >=20 > > I'm still not entirely convinced we need to migrate the ccs_list. > > What would happen if we did this: > >=20 > > * Keep a flag which indicates whether the guest is in the middle of > > the configure_connector process. > > - I'm not sure if that would need to be a new bit of state, or > > if we could deduce it from the value of the isolation and > > allocation states > > - If it's new state, we'd need to migrate it, obviously not if > > we can derive it from other state flags > >=20 > > * On the destination during post_load, if there was an in-progress > > configure_connector on the source, we set another "stale > > configure" flag > >=20 > > * When a configure_connector call is attempted on the destination > > with the stale configure flag set, return an error > >=20 > > The question is, if we choose the right error, can we get the guest to > > either restart the configure from scratch, or fail gracefully, so the > > operator can restart the hotplug >=20 > To get the configure to restart, the guest's configure_connector > implementation would need to changed. Current code in drmgr would just > bail on any error, and I'd imagine the in-kernel version does the same. >=20 > So at least for existing guests, the only option is failing the command > at the operator's interface, namely device_add. device_add is > asynchronous to the actual hotplug event handling however. So if we want > to convey failure to the user, it would have to be either in the form of > a new QMP event emitted to convey hotplug success or error, or through > device_add itself by implementing something like the async QMP rework > that Marc-Andre posted some time back (which still seems to be a topic > of debate). Which either approach, something like libvirt, with adequate > state-tracking for pending hotplug events, could handle an error event > on the target side post-migrate and convey that to the user somehow. >=20 > That's probably a much larger discussion if it comes to that, but doable > in theory. >=20 > But even that wouldn't get us totally out of the woods: DRC state can > still be modified outside of hotplug. For instance, a guest should be > able to do: >=20 > drmgr -c pci -s -r > drmgr -c pci -s -a >=20 > to return a device to firmware and then later take it back and > reconfigure it. I'm not aware of any common case where this would occur, > but it's not disallowed by the specification, and performing a migration > between these 2 operations would currently break this since the default > coldplug state on target assumes a configured state in source. Thanks, that's a good case for why we need this. Can you please fold this description into your commit messages so it's there for posterity. --=20 David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson --zaRBsRFn0XYhEU69 Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAEBCAAGBQJX+yGKAAoJEGw4ysog2bOSmrAQAM9dnzCQGJhKJZlY/yLTPLGK gBRBg+9zjdVUNSUOPqjnVLZuKjiKIkQScLrPj6pxkjOvmWFSYzFcKTd9nuPZ8FjB Vy0ZoBdWEgDTVCUh9rD082IAHzxBIHJxMK82f3gsfvVxJeKAegZO3KCGl71ziLsw BHUzjK5f+A7txTE2DMVz04sW6GTrdSpOQi6hJUa6MDGRhaZXn0eAVHQIozZJzeed D5n6q0h4pAgq2obBWp8y/BDtXkDhudd37Ct9p3uIVbLH6fulNa3F7Sl6jUB4r4no FPhAqtzEO5ZNS/wG+2SIUPHL17MnMwKSKlrmIvaufshNeSeCVuWDf4t5f92Nuyjh mOCL5MehigM1xJTsiRMCQYwvGg24TPQSbD4UXSIKdmaNkN4kCUCW0n+jFrOOM0dq rKOSxcZSAofk91uD1wjsWIO/1vDRckIbrdIDQm1qj86mG9I39kl92xcOQv+WSyBs Z9xnrYyryOb+ev0mrVxlZIYQGQF1k+TCNkKbMMGA+CJ6u7d9RFMiEKFhqliMKCR/ Fgrvm3IwITZRR45qrFcco7604eEbN39Jc0mCMbSKpYDGWAX2eAmmrNIm56ga/flb NMSiXePlSDZax88/T90nFw1Y5T1P3gBawhnWEsERSEpySGKxVSl2xCa1TyN7xop8 XbTwyLejiXRMa841Awdc =FeY1 -----END PGP SIGNATURE----- --zaRBsRFn0XYhEU69--