From: Keir Fraser <keir.xen@gmail.com>
To: Jan Beulich <JBeulich@suse.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Tim Deegan <tim@xen.org>, Xen-devel List <xen-devel@lists.xen.org>
Subject: Re: HPET stack overflow, and general problems with do_IRQ()
Date: Fri, 16 Aug 2013 16:34:44 +0100
Message-ID: <CE340524.5B164%keir.xen@gmail.com>
In-Reply-To: <520DF68B02000078000EC79A@nat28.tlf.novell.com>

On 16/08/2013 08:53, "Jan Beulich" <JBeulich@suse.com> wrote:

>>>> On 15.08.13 at 22:21, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Hello,
>> 
>> I have finally managed to get a full stack dump from affected hardware.
>> 
>> The logs can be found here (including hypervisor with debugging symbols):
>> 
>> http://xenbits.xen.org/people/andrewcoop/hpet-overflow-full-stackdump.tar.gz
>> 
>> The interesting log file is xen.pcpu0.stack.log
>> 
>> By my count (grepping for e008 as CS), there are 8 exception frames
>> on the Xen stack (all stack page 6)
>> 
>> However, because of the early ack() at the LAPIC, and disabling of
>> interrupts, the vectors (in order of interrupts arriving) are
>> 
>> c1, 99, b1, b9, a9, a1, 91, 89
> 
> So these are all HPET interrupts as it seems to me. You said the
> box just has 8 of them, so the fundamental problem is not the
> general handling of interrupts that you talk about below, but the
> fact that _all_ these channels are bound to CPU0: That's an
> insane side effect of the way channel management works when
> there are (potentially) more CPUs than channels. So _I_ think
> this is what needs fixing.
> 
> That's even more so because the above sequence would be impossible
> for guest interrupts (which don't get EOI-ed immediately, and
> interrupts don't get re-enabled on that path either). Hence in the
> discussion here we only need to be concerned with interrupts that
> Xen uses for itself: timer, console, iommu, and HPET. Out of these,
> timer and console - going through the IO-APIC - are safe from this
> because of how io_apic.c implements the ->ack()/->end() pairs.
> Both IOMMU implementations ack their IRQs in the LAPIC only in
> ->end(). And that's what I suggested switching HPET to as well. And
> contrary to what I said about this earlier, disabling interrupts in the
> ->end() handler isn't even necessary, as it already gets called with
> them disabled.
> 
> So we have two possible fixes to the HPET, either of which is
> very likely to deal with the problem on its own.

Additionally, with per-vcpu stacks we could have a larger per-cpu irq stack.
It would be easier to grow that without 'wasting' memory. Although I think
Jan's arguments above do make sense.

 -- Keir

> Jan
> 
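
As a rough illustration of the nesting discussed above, here is a toy Python model (not Xen code) comparing an early LAPIC ack with an ack deferred to ->end(). It assumes that handlers run with interrupts enabled, that each new vector arrives while every handler delivered so far is still on the stack, and that the TPR and all other interrupt sources can be ignored. The names priority_class and max_nesting are hypothetical; the only hardware rule modelled is that the LAPIC delivers a vector only when its priority class (vector >> 4) exceeds that of the highest in-service vector.

```python
def priority_class(vector):
    """LAPIC priority class: the upper four bits of the vector number."""
    return vector >> 4


def max_nesting(arrivals, early_ack):
    """Deepest handler nesting when each vector arrives mid-handler.

    Toy model, not Xen code. Assumes handlers run with interrupts enabled
    and that every vector in `arrivals` shows up while all handlers
    delivered so far are still on the stack.
    """
    stack = []    # vectors whose handlers are currently on the stack
    deepest = 0
    for vector in arrivals:
        # With an early ack the in-service bit is already clear, so nothing
        # masks the newcomer. With the ack deferred to ->end(), every
        # handler on the stack still holds its in-service bit.
        in_service = [] if early_ack else stack
        blocking = max((priority_class(v) for v in in_service), default=-1)
        if priority_class(vector) > blocking:
            stack.append(vector)          # delivered now: one more frame
            deepest = max(deepest, len(stack))
        # Otherwise the vector stays pending until the blocker finishes,
        # so it runs later at a depth no greater than the current one.
    return deepest


# Vectors in the arrival order reported in the stack dump.
observed = [0xC1, 0x99, 0xB1, 0xB9, 0xA9, 0xA1, 0x91, 0x89]
print(max_nesting(observed, early_ack=True))           # 8
print(max_nesting(observed, early_ack=False))          # 1
print(max_nesting(sorted(observed), early_ack=False))  # 5
```

Under these assumptions the deferred ack bounds nesting by the number of distinct priority classes in use (five for these eight vectors) regardless of arrival order, whereas the early ack lets all eight frames pile up.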
