From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	"Keir (Xen.org)" <keir@xen.org>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Ian Jackson <Ian.Jackson@eu.citrix.com>,
	Daniel De Graaf <dgdegra@tycho.nsa.gov>
Subject: Re: [xen-unstable test] 11946: regressions - FAIL
Date: Mon, 7 May 2012 15:41:47 +0100
Message-ID: <4FA7DF2B.3020000@citrix.com>
In-Reply-To: <4FA7EB80020000780008204B@nat28.tlf.novell.com>

On 07/05/2012 14:34, Jan Beulich wrote:
>>>> On 07.05.12 at 13:50, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 07/05/2012 09:10, Jan Beulich wrote:
>>>>>> On 05.05.12 at 02:21, AP <apxeng@gmail.com> wrote:
>>>> (XEN) *** IRQ BUG found ***
>>>> (XEN) CPU0 -Testing vector 236 from bitmap
>>> 236 = 0xec = FIRST_LEGACY_VECTOR + 0x0c, i.e. an IRQ12 coming
>>> in through the 8259A. Something fundamentally fishy must be going
>>> on here, and I would suppose the code in question shouldn't even be
>>> reached for legacy vectors.
>>>
>>> Furthermore, calling dump_irqs() from the debugging code with
>>> desc->lock still held makes it impossible to get full output, as that
>>> function wants to lock all initialized IRQ descriptors.
>> Yes - it has been vector 236 on each of the 3 reported failures from AP,
>> and I believe it was also vector 236 in the one case I managed to
>> reproduce the issue.
>>
>> However, once we have set up the IO-APIC, the 8259A should not be used
>> any more.  The boot dmesg shows that io_ack_method is indeed "old" (which
>> was going to be my first suggestion), and that EOI Broadcast Suppression
>> is enabled, which I have already identified as a source of problems for
>> some customers.  As a 'fix', I provided the ability for
>> "io_ack_method=new" to prevent EOI Broadcast Suppression being enabled. 
>> This was upstreamed in c/s 24870:9bf3ec036bef, but apparently has not
>> completely fixed the customer problems - just made it substantially more
>> rare.
>>
>> AP: Can you manually invoke the 'i' debug key and provide that - it will
>> help to see how Xen is setting up the IO-APIC(s) on your system.
> Seeing the 'z' output might also be helpful, especially to see whether
> any of the IO-APICs' RTEs is an ExtINT one.
>
> Further, checking that no 8259A IRQ got (or was left) enabled for
> some reason might be useful as well (cached_irq_mask plus the raw
> port 0x21 and 0xA1 values).
>
> In any case the debugging code's locking should be fixed.
>
> Jan
>

It appears we have two functions to dump the IO-APIC state:
__print_IO_APIC() which gets called on boot and from 'z', and
dump_ioapic_irq_info() which gets called from the end of 'i'.  These
should probably be consolidated somehow.

As for the debugging, perhaps replace the call to dump_irqs() with a call
to dump_ioapic_irq_info() instead.

Given that the legacy vectors can't migrate, is it wise to include them in
the loop in irq_move_cleanup_interrupt()?  In fact, is it wise to include
any vector above LAST_DYNAMIC_VECTOR?
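
To make the question concrete, here is a minimal sketch of a scan bounded
at LAST_DYNAMIC_VECTOR.  It is not a patch against irq.c: the helper names
vector_is_legacy() and count_cleanup_candidates() are hypothetical, the
in_use[] array merely stands in for whatever per-vector state the real loop
consults, and the numeric values are assumptions (only
FIRST_LEGACY_VECTOR = 0xe0 follows from the "0xec = FIRST_LEGACY_VECTOR +
0x0c" arithmetic quoted above).

```c
#include <stdbool.h>

/*
 * Assumed vector layout, for illustration only.  FIRST_LEGACY_VECTOR is
 * derived from the quoted arithmetic (0xec - 0x0c = 0xe0); the other
 * three values are assumptions and should be checked against the tree.
 */
#define FIRST_DYNAMIC_VECTOR 0x20
#define LAST_DYNAMIC_VECTOR  0xdf
#define FIRST_LEGACY_VECTOR  0xe0
#define LAST_LEGACY_VECTOR   0xef

/* Hypothetical helper: true if the vector lies in the legacy (8259A) range. */
static bool vector_is_legacy(unsigned int vector)
{
    return vector >= FIRST_LEGACY_VECTOR && vector <= LAST_LEGACY_VECTOR;
}

/*
 * Hypothetical helper: count the vectors that a cleanup scan would visit
 * if it stopped at LAST_DYNAMIC_VECTOR.  in_use[] is a stand-in for the
 * per-vector state; anything above LAST_DYNAMIC_VECTOR is never examined.
 */
static unsigned int count_cleanup_candidates(const bool *in_use,
                                             unsigned int nr_vectors)
{
    unsigned int vector, count = 0;

    for ( vector = FIRST_DYNAMIC_VECTOR;
          vector <= LAST_DYNAMIC_VECTOR && vector < nr_vectors;
          vector++ )
        if ( in_use[vector] )
            count++;

    return count;
}
```

With these assumed values, vector 236 is classified as legacy and would
never be examined by the bounded scan, so the debugging code would not be
reached for it.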

~Andrew
