From: Aravinda Prasad 
Date: Fri, 14 Nov 2014 13:54:24 +0530
Subject: Re: [Qemu-devel] [PATCH v3 4/4] target-ppc: Handle ibm,nmi-register RTAS call
To: David Gibson 
Cc: qemu-ppc@nongnu.org, benh@au1.ibm.com, aik@au1.ibm.com, qemu-devel@nongnu.org, paulus@samba.org
Message-ID: <5465BC38.6090106@linux.vnet.ibm.com>
In-Reply-To: <20141114004237.GD18600@voom.fritz.box>
References: <20141105071019.26196.93729.stgit@aravindap>
 <20141105071315.26196.68104.stgit@aravindap>
 <20141111031635.GF15270@voom.redhat.com>
 <5461B04F.5080204@linux.vnet.ibm.com>
 <20141113035206.GH7291@voom.fritz.box>
 <54644886.9050803@linux.vnet.ibm.com>
 <20141113103235.GK7291@voom.fritz.box>
 <54649A80.5000204@linux.vnet.ibm.com>
 <20141113124454.GB18600@voom.fritz.box>
 <5464C207.3020903@linux.vnet.ibm.com>
 <20141114004237.GD18600@voom.fritz.box>

On Friday 14 November 2014 06:12 AM, David Gibson wrote:
> On Thu, Nov 13, 2014 at 08:06:55PM +0530, Aravinda Prasad wrote:
>>
>>
>> On Thursday 13 November 2014 06:14 PM, David Gibson wrote:
>>> On Thu, Nov 13, 2014 at 05:18:16PM +0530, Aravinda Prasad wrote:
>>>> On Thursday 13 November 2014 04:02 PM, David Gibson wrote:
>>>>> On Thu, Nov 13, 2014 at 11:28:30AM +0530, Aravinda Prasad wrote:
>>> [snip]
>>>>>>>>> Having to retry the hcall from here seems very awkward. This is a
>>>>>>>>> private hcall, so you can define it to do whatever retries are
>>>>>>>>> necessary internally (and I don't think your current implementation
>>>>>>>>> can fail anyway).
>>>>>>>>
>>>>>>>> Retrying is required in cases where multiple processors experience
>>>>>>>> a machine check at or about the same time. As per PAPR, subsequent
>>>>>>>> processors should serialize and wait for the first processor to
>>>>>>>> issue the ibm,nmi-interlock call. The second processor retries if
>>>>>>>> the first processor which received a machine check is still reading
>>>>>>>> the error log and is yet to issue the ibm,nmi-interlock call.
>>>>>>>
>>>>>>> Hmm.. ok. But I don't see any mechanism in the patches by which
>>>>>>> H_REPORT_MC_ERR will report failure if another CPU has an MC in
>>>>>>> progress.
>>>>>>
>>>>>> h_report_mc_err returns 0 if another VCPU is processing a machine
>>>>>> check, and in that case we retry. h_report_mc_err returns the error
>>>>>> log address if no other VCPU is processing a machine check.
>>>>>
>>>>> Uh.. how? I'm only seeing one return statement in the implementation
>>>>> in 3/4.
>>>>
>>>> This part is in 4/4, which handles the ibm,nmi-interlock call in
>>>> h_report_mc_err():
>>>>
>>>> +    if (mc_in_progress == 1) {
>>>> +        return 0;
>>>> +    }
>>>
>>> Ah, right, missed the change to h_report_mc_err() in the later patch.
>>>
>>>>>>>> Retrying cannot be done internally in the h_report_mc_err hcall:
>>>>>>>> only one thread can succeed in entering QEMU upon parallel hcalls,
>>>>>>>> and hence retrying inside the hcall will not allow the
>>>>>>>> ibm,nmi-interlock from the first CPU to succeed.
>>>>>>>
>>>>>>> It's possible, but would require some fiddling inside the h_call to
>>>>>>> unlock and wait for the other CPUs to finish, so yes, it might be
>>>>>>> more trouble than it's worth.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> +    mtsprg  2,4
>>>>>>>>>
>>>>>>>>> Um.. doesn't this clobber the value of r3 you saved in SPRG2 just
>>>>>>>>> above?
>>>>>>>>
>>>>>>>> The r3 saved in SPRG2 is moved to the RTAS area in the private
>>>>>>>> hcall, and hence it is fine to clobber r3 here.
>>>>>>>
>>>>>>> Ok, if you're going to do some magic register saving inside the
>>>>>>> HCALL, why not do the SRR[01] and CR restoration inside there as
>>>>>>> well.
>>>>>>
>>>>>> SRR0/1 are clobbered while returning from the HCALL and hence cannot
>>>>>> be restored in the HCALL. For CR, we need to do the restoration here,
>>>>>> as we clobber CR after returning from the HCALL (the instruction
>>>>>> checking the return value of the hcall clobbers CR).
>>>>>
>>>>> Hrm. AFAICT SRR0/1 shouldn't be clobbered when returning from an
>>>>
>>>> As an hcall is an interrupt, SRR0 is set to the nip and SRR1 to the msr
>>>> just before executing rfid.
>>>
>>> AFAICT the return path from the hypervisor - including for hcalls -
>>> uses HSRR0/1 and hrfid, so ordinary SRR0/SRR1 should be ok.
>>
>> I see SRR0 and SRR1 clobbered when the HCALL from the guest returns.
>> Previous discussion on this is in the link below:
>>
>> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg01148.html
>
> Hrm. Well, I guess if it happened it happened, but Alex's explanation
> for why doesn't make sense to me.
>
> Did you execute cpu_synchronize_state() *before* attempting to set
> SRR0/1 in the hcall?

Yes, I did.

>
>> Further, I searched the QEMU source code but could not find whether it
>> is using rfid/hrfid. However, the ISA for the sc instruction mentions
>> that SRR0 and SRR1 are modified.
>
> Well, of course it isn't in the qemu source; the low-level return to the
> guest is within the host kernel, specifically fast_guest_return in
> arch/powerpc/kvm/book3s_hv_rmhandlers.S, which uses hrfid.
>
> If I'm reading the ISA correctly then yes, SRR0/1 are clobbered on
> entry, but that's on *entry*, so they can be overwritten by the hcall
> handler itself.

Hmm.. ok. I need to take a look into it in detail.

>
>

-- 
Regards,
Aravinda
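
For reference, a minimal standalone C sketch of the serialization/retry
protocol discussed in this thread: h_report_mc_err() returns 0 while
another VCPU owns the error log (so the caller retries), otherwise it
claims the log and returns its address, and ibm,nmi-interlock releases it.
The names GUEST_ERROR_LOG_ADDR, machine_check_entry() and nmi_interlock()
are placeholders for illustration, not the names used in the patch series.

/*
 * Illustration only; not the actual QEMU patch code.
 */
#include <stdint.h>
#include <stdio.h>

#define GUEST_ERROR_LOG_ADDR 0x7000ULL   /* hypothetical RTAS-area address */

static volatile int mc_in_progress;      /* set while one VCPU owns the log */

/* Private hcall: claim the error log, or tell the caller to retry. */
static uint64_t h_report_mc_err(void)
{
    if (mc_in_progress == 1) {
        return 0;                        /* another VCPU owns it: retry */
    }
    mc_in_progress = 1;                  /* claim the per-guest error log */
    return GUEST_ERROR_LOG_ADDR;
}

/* ibm,nmi-interlock: the owning VCPU is done reading the error log. */
static void nmi_interlock(void)
{
    mc_in_progress = 0;
}

/* Guest-side view: retry the hcall until the error log is available. */
static uint64_t machine_check_entry(void)
{
    uint64_t log;

    while ((log = h_report_mc_err()) == 0) {
        /* busy-wait; the owner releases the log via ibm,nmi-interlock */
    }
    return log;
}

int main(void)
{
    uint64_t log = machine_check_entry();
    printf("error log at 0x%llx\n", (unsigned long long)log);
    nmi_interlock();
    return 0;
}

In the actual series the retry loop sits in the guest-side machine check
assembly (the code around the mtsprg instruction quoted above) rather than
inside the hcall, since, as noted in the thread, only one thread can be
executing inside QEMU for the hcall at a time.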