Date: Fri, 13 Nov 2015 12:57:15 +1100
From: David Gibson
Subject: Re: [Qemu-devel] [PATCH 4/4] target-ppc: Handle NMI guest exit
To: Thomas Huth
Cc: benh@au1.ibm.com, aik@ozlabs.ru, agraf@suse.de, qemu-devel@nongnu.org, qemu-ppc@nongnu.org, sam.bobroff@au1.ibm.com, Aravinda Prasad, paulus@samba.org
Message-ID: <20151113015715.GJ4886@voom.redhat.com>
In-Reply-To: <56445E7B.5010904@redhat.com>

On Thu, Nov 12, 2015 at 10:40:11AM +0100, Thomas Huth wrote:
> On 12/11/15 09:09, Thomas Huth wrote:
> > On 11/11/15 18:16, Aravinda Prasad wrote:
> >> Memory errors such as bit flips that cannot be corrected
> >> by hardware are passed on to the kernel for handling.
> >> If the memory address in error belongs to the guest, the
> >> guest kernel is responsible for taking suitable action.
> >> Patch [1] enhances KVM to exit the guest with the exit
> >> reason set to KVM_EXIT_NMI in such cases.
> >>
> >> This patch handles the KVM_EXIT_NMI exit. If the guest OS
> >> has registered a machine check handling routine by
> >> calling "ibm,nmi-register", the handler builds the
> >> error log and invokes the registered handler; otherwise it
> >> invokes the handler at 0x200.
> >>
> >> [1] http://marc.info/?l=kvm-ppc&m=144726114408289
> >>
> >> Signed-off-by: Aravinda Prasad

[snip]

> >> +        env->nip = 0x200;
> >> +        return 0;
> >> +    }
> >> +
> >> +    qemu_mutex_lock(&spapr->mc_in_progress);
> >
> > Using a mutex here is definitely wrong. The kvm_arch_handle_exit() code
> > is run under the Big QEMU Lock™ (see qemu_mutex_lock_iothread() in
> > kvm_cpu_exec()),
>
> In case you're looking for the calls, I just noticed that the
> qemu_mutex_lock_iothread() calls have recently been pushed into
> kvm_arch_handle_exit() itself.
>
> > so if you ever got one thread waiting for this
> > mutex here, it could never be unlocked again in rtas_ibm_nmi_interlock(),
> > because the other code would wait forever to get the BQL ==> deadlock.
> >
> > I think if you want to be able to handle multiple NMIs at once, you
> > likely need something like an error log per CPU instead. And if an NMI
> > happens on one CPU while there is already an NMI handler running on the
> > very same CPU, you could likely simply track this with a boolean
> > variable and put the CPU into checkstop if this happens?
>
> Ok, I've now had a look at the LoPAPR spec, and if I've got that right,
> you really have to serialize the NMIs in case they happen on multiple
> CPUs at the same time. So I guess the best thing you can do here is
> something like:
>
>     while (spapr->mc_in_progress) {
>         /*
>          * There is already another NMI in progress, thus we need
>          * to yield here to wait until it has been finished
>          */
>         qemu_mutex_unlock_iothread();
>         usleep(10);
>         qemu_mutex_lock_iothread();
>     }
>     spapr->mc_in_progress = true;
>
> Also LoPAPR talks about 'subsequent processors report "fatal error
> previously reported"', so maybe the other processors should report that
> condition in this case?
> And of course you've also got to check that the same CPU is not getting
> multiple NMIs before the interlock function has been called again.

You should be able to avoid the nasty usleep by using a pthread
condition variable. So here you'd have:

    while (spapr->mc_in_progress) {
        pthread_cond_wait(&mc_delivery_cond, &qemu_iothread_lock);
    }
    spapr->mc_in_progress = true;

Or.. there may be qemu wrappers around the pthread functions you
should be using.
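
For reference, a rough, untested sketch of how that could look with
the QemuCond/QemuMutex wrappers from include/qemu/thread.h. The field
names (mc_in_progress, mc_delivery_cond) are illustrative, and it
assumes the BQL mutex (qemu_global_mutex in cpus.c) can be made
reachable at this point, e.g. via a small accessor:

    /* Hypothetical additions to the sPAPR machine state: */
    bool mc_in_progress;
    QemuCond mc_delivery_cond;   /* qemu_cond_init() this at machine init */

    /* In the KVM_EXIT_NMI handler, which already holds the BQL: */
    while (spapr->mc_in_progress) {
        /*
         * qemu_cond_wait() atomically drops the given mutex while
         * sleeping and retakes it before returning, so there is no
         * window where the BQL is held while we spin.
         */
        qemu_cond_wait(&spapr->mc_delivery_cond, &qemu_global_mutex);
    }
    spapr->mc_in_progress = true;

    /* In rtas_ibm_nmi_interlock(), when delivery is finished: */
    spapr->mc_in_progress = false;
    /* Wake one waiter; that keeps deliveries serialized as LoPAPR wants */
    qemu_cond_signal(&spapr->mc_delivery_cond);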

Once delivery of a single MC is complete, you'd use
pthread_cond_signal() to wake up any additional ones.

pthread_cond_wait automatically drops the specified mutex internally,
so access to mc_in_progress itself is still protected by the iothread
mutex.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson