From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [RFC Patch] x86/hpet: Disable interrupts while
 running hpet interrupt handler.
Date: Tue, 6 Aug 2013 14:23:25 +0100
Message-ID: <5200F8CD.6070504@citrix.com>
References: <1375735116-31466-1-git-send-email-andrew.cooper3@citrix.com>
	<5200C95F02000078000E9893@nat28.tlf.novell.com>
	<5200D0B8.4060706@citrix.com>
	<5200FDA002000078000E999E@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <Andrew.Cooper3@citrix.com>) id 1V6hEX-0000k0-75
	for xen-devel@lists.xenproject.org; Tue, 06 Aug 2013 13:23:29 +0000
In-Reply-To: <5200FDA002000078000E999E@nat28.tlf.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>, Keir Fraser <keir@xen.org>, Tim Deegan <tim@xen.org>
List-Id: xen-devel@lists.xenproject.org

On 06/08/13 12:44, Jan Beulich wrote:
>>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> The machine we found this crash on is a Dell R310.  4 CPUs, 16G Ram.
> Not all that big.
>
>> The full boot xl dmesg is attached, but it appears that there are 8
>> broadcast hpets.  This is further backed up by the 'i' debugkey (also
>> attached)
> Right. And with fewer CPUs than HPET channels, you could get
> the system into a mode where each CPU uses a dedicated channel
> ("maxcpus=4", suppressing registration of all the disabled ones).

Does this setup actually mean that there are 8 hpets which are all
broadcasting to every pcpu?  The affinities listed in debug-keys 'i'
seem to be towards single pcpus, but the order looks peculiar to say the
least.

>
>> Keir: (merging your thread back here)
>>   I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
>> interrupts, there are rather more than 8 occurrences of
>> handle_hpet_broadcast() in the stack.  If the occurrences were just
>> function pointers on the stack, I would expect to see several
>> handle_hpet_broadcast()+0x0/0x268
> Which further hints at some earlier problem. I suppose you don't
> happen to have a dump of that crash, or else you could inspect
> the IRQ descriptors as well as the stack for whether all instances
> came from the same IRQ/vector.
>
> Jan
>

Sadly no - the crashdump analyser grabbed the double fault IST, rather
than the entire contents of the main stack.  I shall extend the analyser
to pick up the main stack as well; it does cross IST boundaries for call
traces.  I shall see how easy it is to make it parse the irq_desc's &
friends as well on crash, although for this case it might be easier just
to tweak the double fault handler.
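
To illustrate the +0x0 argument quoted above: a stored function pointer
resolves to the start of the function (offset 0), whereas a return address
normally resolves to a nonzero offset inside it.  A minimal sketch of that
check, in Python.  The trace-line format, the sample addresses and the name
classify_entries are assumptions for illustration and are not taken from the
existing analyser.

```python
import re
from collections import Counter

# Assumed trace-line shape: "[<address>] symbol+0xOFFSET/0xSIZE".
# The optional "()" after the symbol matches how the name is written above.
ENTRY = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+(\w+)(?:\(\))?\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"
)


def classify_entries(trace_lines, symbol):
    """Count how often `symbol` appears at offset 0 versus inside the function."""
    counts = Counter()
    for line in trace_lines:
        match = ENTRY.search(line)
        if match is None or match.group(2) != symbol:
            continue
        offset = int(match.group(3), 16)
        # Offset 0 looks like a bare function pointer; anything else looks
        # like a return address pushed by a call or an interrupted context.
        counts["entry_point" if offset == 0 else "mid_function"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the real crash.
    sample = [
        "(XEN)    [<ffff82c4c01a0000>] handle_hpet_broadcast+0x0/0x268",
        "(XEN)    [<ffff82c4c01a001a>] handle_hpet_broadcast+0x1a/0x268",
        "(XEN)    [<ffff82c4c0160010>] do_IRQ+0x10/0x20",
    ]
    print(classify_entries(sample, "handle_hpet_broadcast"))
```

A trace dominated by "mid_function" entries would point towards genuine
nested invocations rather than stale pointers left on the stack.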

~Andrew