From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
Date: Tue, 6 Aug 2013 14:23:25 +0100
Message-ID: <5200F8CD.6070504@citrix.com>
References: <1375735116-31466-1-git-send-email-andrew.cooper3@citrix.com> <5200C95F02000078000E9893@nat28.tlf.novell.com> <5200D0B8.4060706@citrix.com> <5200FDA002000078000E999E@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1V6hEX-0000k0-75 for xen-devel@lists.xenproject.org; Tue, 06 Aug 2013 13:23:29 +0000
In-Reply-To: <5200FDA002000078000E999E@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: xen-devel, Keir Fraser, Tim Deegan
List-Id: xen-devel@lists.xenproject.org

On 06/08/13 12:44, Jan Beulich wrote:
>>>> On 06.08.13 at 12:32, Andrew Cooper wrote:
>> The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.
> Not all that big.
>
>> The full boot xl dmesg is attached, but it appears that there are 8
>> broadcast hpets. This is further backed up by the 'i' debugkey (also
>> attached)
> Right. And with fewer CPUs than HPET channels, you could get
> the system into a mode where each CPU uses a dedicated channel
> ("maxcpus=4", suppressing registration of all the disabled ones).

Does this setup actually mean that there are 8 hpets which are all
broadcasting to every pcpu? The affinities listed in debug-keys 'i' seem
to be towards single pcpus, but the order looks peculiar to say the least.

>
>> Keir: (merging your thread back here)
>> I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
>> interrupts, there are rather more than 8 occurrences of
>> handle_hpet_broadcast() in the stack. If the occurrences were just
>> function pointers on the stack, I would expect to see several
>> handle_hpet_broadcast()+0x0/0x268
> Which further hints at some earlier problem. I suppose you don't
> happen to have a dump of that crash, or else you could inspect
> the IRQ descriptors as well as the stack for whether all instances
> came from the same IRQ/vector.
>
> Jan
>

Sadly no - the crashdump analyser grabbed the double fault IST, rather
than the entire contents of the main stack.

I shall extend the analyser to pick up the main stack as well; it does
cross IST boundaries for call traces. I shall see how easy it is to make
it parse the irq_desc's & friends as well on crash, although for this
case it might be easier just to tweak the double fault handler.

~Andrew