From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Gibson
Subject: Re: Reset problem vs. MMIO emulation, hypercalls, etc...
Date: Wed, 8 Aug 2012 21:59:43 +1000
Message-ID: <20120808115943.GU16664@truffala.fritz.box>
References: <20120803174113.GA13174@amt.cnet>
 <1344033008.24037.67.camel@pasglop>
 <20120806031344.GG16664@truffala.fritz.box>
 <1344286677.24037.100.camel@pasglop>
 <20120807013228.GL16664@truffala.fritz.box>
 <5020D5EB.9060104@redhat.com>
 <20120807121442.GN16664@truffala.fritz.box>
 <5021148D.4000107@redhat.com>
 <20120808004948.GO16664@truffala.fritz.box>
 <50222A52.6010201@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Benjamin Herrenschmidt , Marcelo Tosatti , kvm@vger.kernel.org, Alexander Graf , Paul Mackerras , kvm-ppc@vger.kernel.org
To: Avi Kivity
Return-path:
Content-Disposition: inline
In-Reply-To: <50222A52.6010201@redhat.com>
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On Wed, Aug 08, 2012 at 11:58:58AM +0300, Avi Kivity wrote:
> On 08/08/2012 03:49 AM, David Gibson wrote:
> >> > We never have irqchip in kernel (because we haven't written that yet)
> >> > but we still sleep in-kernel for CEDE.  I haven't spotted any problem
> >> > with that, but now I'm wondering if there is one, since x86 doesn't do
> >> > it in what seems like the analogous situation.
> >> >
> >> > It's possible this works because our decrementer (timer) interrupts
> >> > are different at the core level from external interrupts coming from
> >> > the PIC, and *are* handled in kernel, but I haven't actually followed
> >> > the logic to work out if this is the case.
> >> >
> >> >> Meaning the normal state of things is to sleep in
> >> >> the kernel (whether or not you have an emulated interrupt controller in
> >> >> the kernel -- the term irqchip in kernel is overloaded for x86).
> >> >
> >> > Uh.. overloaded in what way?
> >>
> >> On x86, irqchip-in-kernel means that the local APICs, the IOAPIC, and
> >> the two PICs are emulated in the kernel.  Now the IOAPIC and the PICs
> >> correspond to non-x86 interrupt controllers, but the local APIC is more
> >> tightly coupled to the core.  Interrupt acceptance by the core is an
> >> operation that involves synchronous communication with the local APIC:
> >> the APIC presents the interrupt, the core accepts it based on the value
> >> of the interrupt enable flag and possibly a register (CR8), then the
> >> APIC updates the ISR and IRR.
> >>
> >> The upshot is that if the local APIC is in userspace, interrupts must be
> >> synchronous with vcpu execution, so that KVM_INTERRUPT is a vcpu ioctl
> >> and HLT is emulated in userspace (so that local APIC emulation can check
> >> if an interrupt wakes it up or not).
> >
> > Sorry, still not 100% getting it.  When the vcpu is actually running
> > code, that synchronous communication must still be accomplished via
> > the KVM_INTERRUPT ioctl, yes?  So what makes HLT different, that the
> > communication can't be accomplished in that case?
>
> No, you're correct.  HLT could have been emulated in userspace, it just
> wasn't.  The correct statement is that HLT was arbitrarily chosen to be
> emulated in userspace with the synchronous model, but the asynchronous
> model forced it into the kernel.

Aha!  Ok, understood.  Uh, assuming you meant kernelspace, not
userspace in the first line there, anyway.

Ok, so I am now reassured that our current handling of CEDE in
kernelspace does not cause problems.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
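
For illustration only, here is a toy model of the acceptance handshake
described in the quoted explanation: the APIC presents its highest pending
vector, the core accepts it depending on the interrupt enable flag and CR8,
and the APIC then moves the vector from IRR to ISR.  This is a simplified
sketch, not KVM or QEMU code.  The class and method names (ToyLocalApic,
ToyCore, try_accept) are made up for the example, and the priority rule
(a vector's class is its upper four bits and must exceed both CR8 and the
class of anything in service) is a reduced version of the real APIC rules.

```python
class ToyLocalApic:
    """Hypothetical, simplified local APIC holding IRR and ISR as sets."""

    def __init__(self):
        self.irr = set()  # vectors requested but not yet accepted
        self.isr = set()  # vectors accepted and not yet acknowledged

    def raise_irq(self, vector):
        self.irr.add(vector)

    def highest_pending(self):
        # The APIC presents the highest-numbered pending vector, if any.
        return max(self.irr) if self.irr else None

    def accept(self, vector):
        # Acceptance moves the vector from IRR to ISR.
        self.irr.discard(vector)
        self.isr.add(vector)

    def eoi(self):
        # End of interrupt: retire the highest in-service vector.
        if self.isr:
            self.isr.remove(max(self.isr))


class ToyCore:
    """Hypothetical vcpu core: interrupt enable flag, CR8, halted state."""

    def __init__(self, apic):
        self.apic = apic
        self.interrupts_enabled = True  # stands in for the IF flag
        self.cr8 = 0                    # task priority class, 0..15
        self.halted = False

    def try_accept(self):
        """Run one synchronous handshake; return the accepted vector or None."""
        vector = self.apic.highest_pending()
        if vector is None or not self.interrupts_enabled:
            return None
        in_service = max(self.apic.isr) >> 4 if self.apic.isr else 0
        if (vector >> 4) <= max(self.cr8, in_service):
            return None  # masked by CR8 or by an in-service interrupt
        self.apic.accept(vector)
        self.halted = False  # an accepted interrupt wakes a halted core
        return vector
```

Whether a halted vcpu wakes depends on core state (the enable flag, CR8)
and APIC state (IRR, ISR) together, so in this model whoever emulates the
halt has to be able to run try_accept() against both.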