Date: Thu, 28 Sep 2017 13:58:15 +1000
From: David Gibson
Message-ID: <20170928035815.GX12504@umbus>
Subject: Re: [Qemu-devel] [PATCH v3 2/5] ppc: spapr: Handle "ibm,nmi-register" and "ibm,nmi-interlock" RTAS calls
To: Aravinda Prasad
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, aik@ozlabs.ru, mahesh@linux.vnet.ibm.com, benh@au1.ibm.com, paulus@samba.org, sam.bobroff@au1.ibm.com

On Wed, Sep 27, 2017 at 05:23:51PM +0530, Aravinda Prasad wrote:
> 
> 
> On Wednesday 27 September 2017 12:45 PM, David Gibson wrote:
> > On Thu, Sep 21, 2017 at 02:39:06PM +0530, Aravinda Prasad wrote:
> >>
> >>
> >> On Tuesday 22 August 2017 07:38 AM, David Gibson wrote:
> >>
> >> [ . . . ]
> >>
> >>>>>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> >>>>>> index 46012b3..eee8d33 100644
> >>>>>> --- a/include/hw/ppc/spapr.h
> >>>>>> +++ b/include/hw/ppc/spapr.h
> >>>>>> @@ -123,6 +123,12 @@ struct sPAPRMachineState {
> >>>>>>       * occurs during the unplug process. */
> >>>>>>      QTAILQ_HEAD(, sPAPRDIMMState) pending_dimm_unplugs;
> >>>>>> 
> >>>>>> +    /* State related to "ibm,nmi-register" and "ibm,nmi-interlock" calls */
> >>>>>> +    target_ulong guest_machine_check_addr;
> >>>>>> +    bool mc_in_progress;
> >>>>>> +    int mc_cpu;
> >>>>>
> >>>>> mc_cpu isn't actually used yet in this patch.  In any case it and
> >>>>> mc_in_progress could probably be folded together, no?
> >>>>
> >>>> It is possible to fold mc_cpu and mc_in_progress together with the
> >>>> convention that if it is set to -1, mc is not in progress; otherwise
> >>>> it is set to the CPU handling the mc.
> >>>>
> >>>>>
> >>>>> These values will also need to be migrated, AFAICT.
> >>>>
> >>>> I am thinking of how to handle the migration when machine check
> >>>> handling is in progress. Probably wait for machine check handling to
> >>>> complete before migrating, as the error could be irrelevant once
> >>>> migrated to new hardware. If that is the case we don't need to
> >>>> migrate these values.
> >>>
> >>> Ok.
> >>
> >> This is what I think about handling a machine check during migration,
> >> based on my understanding of the VM migration code.
> >>
> >> There are two possibilities here. First, migration can be initiated
> >> while the machine check handling is in progress. Second, a machine
> >> check error can happen when the migration is in progress.
> >>
> >> To handle the first case we can add a migrate_add_blocker() call when
> >> we start handling the machine check error and issue
> >> migrate_del_blocker() when done. I think this should solve the issue.
> >>
> >> The second case is a bit tricky. The migration has already started and
> >> hence the migrate_add_blocker() call will fail. We also cannot wait
> >> till the completion of the migration to handle the machine check error,
> >> as the VM's data could be corrupt.
> >>
> >> Machine check errors should not be an issue when the migration is in
> >> the RAM copy phase, as the VM is still active with vCPUs running. The
> >> problem is when we hit a machine check when the migration is about to
> >> complete. For example:
> >>
> >> 1. vCPU2 hits a machine check error during migration.
> >>
> >> 2. KVM causes a VM exit on vCPU2 and the NIP of vCPU2 is changed to the
> >> guest-registered machine check handler.
> >>
> >> 3. migration_completion() issues vm_stop(), and hence either vCPU2 is
> >> never scheduled again on the source hardware or vCPU2 is preempted
> >> while executing the machine check handler.
> >>
> >> 4. vCPU2 is resumed on the target hardware and either starts or
> >> continues processing the machine check error. This could be a problem,
> >> as these errors are specific to the source hardware. For instance, when
> >> the guest issues memory poisoning upon such an error, a clean page on
> >> the target hardware is poisoned while the corrupt page on the source
> >> hardware is not poisoned.
> >>
> >> This second case of hitting a machine check during the final phase of
> >> migration is rare, but I wanted to check what others think about it.
> > 
> > So, I've had a bit of a think about this.  I don't recall if these
> > fwnmi machine checks are expected on guest RAM, or guest IO addresses.
> 
> It is expected on guest RAM. I am not sure about guest IO addresses.
> 
> > 
> > 1) If RAM
> > 
> > What exactly is the guest's notification for?  Even without
> > migration, the host is free to move guest memory around in host
> > memory, so it seems any hardware-level poking should be done on the
> > host side.
> 
> If the error is a correctable error, then the host takes care of it by
> moving the page to a different location, and the guest need not be and
> will not be notified. The guest will be notified if the host is not able
> to fully recover. Hence we hit FWNMI in the guest when RAM errors are
> not recovered by the host.

Ok.

> > Is it just to notify the guest that we weren't able to fully recover
> > on the host side and that the page may contain corrupted data?  If
> > that's so, then it seems resuming the handling on the destination is
> > still right.  It may be new good RAM, but the contents we migrated
> > could still be corrupt from the machine check event on the source.
> 
> Yes. This is what I am doing in my v5 patch set, which I am about to
> post. Additionally, I block migration while processing machine check
> errors.
> 
> > 
> > 2) If IO
> > 
> > AFAICT this could only happen with VFIO passthrough devices.. but it
> > shouldn't be possible to migrate if there are any of those.
> > 
> 
> I am not very sure about IO errors.

Ok.  It sounds like that's not the primary case you're interested in, so
I guess we can ignore it for now.
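For illustration, here is a minimal sketch of the mc_cpu / mc_in_progress
folding discussed earlier in the thread. The field and helper names are
hypothetical, not taken from the patch; only the -1 convention comes from
the discussion above.

/*
 * In include/hw/ppc/spapr.h, instead of the separate mc_in_progress and
 * mc_cpu fields, keep a single CPU index (illustrative name):
 *
 *     int mc_handling_cpu;    -1: no machine check being handled,
 *                             >= 0: index of the CPU handling it
 */

static inline bool spapr_mc_in_progress(sPAPRMachineState *spapr)
{
    return spapr->mc_handling_cpu != -1;
}

static inline void spapr_mc_start(sPAPRMachineState *spapr, int cpu_index)
{
    spapr->mc_handling_cpu = cpu_index;
}

static inline void spapr_mc_done(sPAPRMachineState *spapr)
{
    spapr->mc_handling_cpu = -1;
}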
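Similarly, a rough sketch of the migrate_add_blocker() /
migrate_del_blocker() bracketing described earlier in the thread, covering
the first case (migration not yet started). The helper names and error
handling are illustrative and not taken from the v5 patch set; it assumes
QEMU's existing migration blocker API from "migration/blocker.h".

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "migration/blocker.h"

static Error *mc_migration_blocker;

/* Called when guest machine check handling starts. */
static void spapr_mc_block_migration(void)
{
    Error *local_err = NULL;

    error_setg(&mc_migration_blocker, "machine check handling in progress");
    if (migrate_add_blocker(mc_migration_blocker, &local_err) < 0) {
        /* Migration is already in flight (the second case discussed
         * above): the blocker cannot be added, so that race needs to be
         * handled separately. */
        error_free(local_err);
        error_free(mc_migration_blocker);
        mc_migration_blocker = NULL;
    }
}

/* Called once machine check handling is complete, e.g. from the
 * "ibm,nmi-interlock" path. */
static void spapr_mc_unblock_migration(void)
{
    if (mc_migration_blocker) {
        migrate_del_blocker(mc_migration_blocker);
        error_free(mc_migration_blocker);
        mc_migration_blocker = NULL;
    }
}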
-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson