From: Andrew Cooper <andrew.cooper3@citrix.com>
To: AP <apxeng@gmail.com>, Ian Jackson <Ian.Jackson@eu.citrix.com>,
	"Keir (Xen.org)" <keir@xen.org>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Ian Campbell <Ian.Campbell@citrix.com>
Subject: Re: [xen-unstable test] 11946: regressions - FAIL
Date: Fri, 4 May 2012 21:11:19 +0100
Message-ID: <4FA437E7.6040105@citrix.com>
In-Reply-To: <CAGU+auvEvjy3qi3d9ZxMyFdHxauGL6wC=out07rxF-0sMfP8jg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 5167 bytes --]

On 04/05/12 20:48, AP wrote:
> On Tue, Mar 27, 2012 at 3:36 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> On Tue, 2012-02-14 at 10:44 +0000, Ian Campbell wrote:
>>> On Mon, 2012-02-13 at 20:16 +0000, xen.org wrote:
>>>> flight 11946 xen-unstable real [real]
>>>> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/
>>>>
>>>> Regressions :-(
>>>>
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>>  test-amd64-i386-xl-credit2    7 debian-install            fail REGR. vs. 11944
>>> Host crash:
>>> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/test-amd64-i386-xl-credit2/serial-woodlouse.log
>>>
>>> This is the debug Andrew Cooper added recently to track down the IRQ
>>> assertion we've been seeing, sadly it looks like the debug code tries to
>>> call xfree from interrupt context and therefore doesn't produce full
>>> output :-(
>> Are we still seeing the issue this debugging was intended to address? We
>> don't seem to be seeing the host crashes any more. Should the debug code
>> be patched up as in the following patch, otherwise when we do see it it
>> doesn't end up printing any useful info.
>>
>> Someone recently reported bugs.debian.org/665433 to Debian, is this the
>> same underlying issue? That report is with Xen 4.0 FWIW.
> I hit the assertion failure that the debugging code introduced
> (xen-unstable 25256:9dda0efd8ce1). Can the fix to the debugging code be
> checked in until the original issue has been fixed?
>
> Thanks,
> AP
>
> (XEN) *** IRQ BUG found ***
> (XEN) CPU0 -Testing vector 236 from bitmap 41,47,49,57,64,72,80,88,96,100,104,120,136,152,160-161,168,171,192,200-201,208
> (XEN) Guest interrupt information:
> (XEN)    IRQ:   0 affinity:01 vec:f0 type=IO-APIC-edge status=00000000 mapped, unbound
> (XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:607
> (XEN) ----[ Xen-4.2-unstable  x86_64  debug=y  Tainted:    C ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82c48012cefb>] xfree+0x33/0x118
> (XEN) RFLAGS: 0000000000010002   CONTEXT: hypervisor
> (XEN) rax: 0000000000000000   rbx: ffff830214ac0080   rcx: 0000000000000000
> (XEN) rdx: ffff82c4802d8880   rsi: 0000000000000083   rdi: 0000000000000000
> (XEN) rbp: ffff82c4802b7c78   rsp: ffff82c4802b7c58   r8:  0000000000000004
> (XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000010
> (XEN) r12: ffff830214ac0c80   r13: 000000000000000c   r14: ffff830214ac0ca8
> (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000426f0
> (XEN) cr3: 0000000168971000   cr2: 0000000001095e00
> (XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802b7c58:
> (XEN)    ffff830214ac0080 ffff830214ac0c80 000000000000000c ffff830214ac0ca8
> (XEN)    ffff82c4802b7ce8 ffff82c4801664d4 ffff82c4802e214a ffff82c400000020
> (XEN)    ffff82c4802b7cf8 0000000000000083 ffff830214ac00a8 0000000000000000
> (XEN)    00000000000000ec 00000000000000ec ffff830214ac0c80 000000000000000c
> (XEN)    ffff830214ac0ca8 ffff82c480302760 ffff82c4802b7d58 ffff82c480168000
> (XEN)    ffff82c4802b7f18 ffff82c4802b7f18 000000ec00000000 ffff82c4802b7f18
> (XEN)    0000000000000000 0000000000000000 ffff82c480302324 0000000000000020
> (XEN)    ffff82c4802b7dd8 0000000000000003 0000000000000000 0000000000000000
> (XEN)    ffff82c4802b7dc8 ffff82c4801683d3 ffff8300da991000 ffff8300da996000
> (XEN)    0000000000000000 ffffffff802b7d90 ffff82c480159160 ffff82c4802b7e20
> (XEN)    ffff82c48015d7db ffff82c4802b7f18 ffff8300da991000 0000000000000003
> (XEN)    0000000000000000 0000000000000000 00007d3b7fd48207 ffff82c480160426
> (XEN)    0000000000000000 0000000000000000 0000000000000003 ffff8300da991000
> (XEN)    ffff82c4802b7ef8 ffff82c4802b7f18 0000000000000282 ffff82c4802319a0
> (XEN)    00000000deadbeef 0000000000000000 ffff83021c0b8081 0000000000000000
> (XEN)    0000000000000048 ffff8801d7227ec0 ffff8300da991000 0000002000000000
> (XEN)    ffff82c4801865c1 000000000000e008 0000000000000202 ffff82c4802b7e88
> (XEN)    000000000000e010 0000000000000003 ffff82c4802b7ef8 ffff82c4802230d8
> (XEN)    ffff82c4802b7f18 0000000000000000 0000000000000246 ffffffff810013aa
> (XEN)    0000000000000000 ffffffff810013aa 000000000000e030 0000000000000246
> (XEN) Xen call trace:
> (XEN)    [<ffff82c48012cefb>] xfree+0x33/0x118
> (XEN)    [<ffff82c4801664d4>] dump_irqs+0x2a4/0x2e8
> (XEN)    [<ffff82c480168000>] irq_move_cleanup_interrupt+0x29f/0x2db
> (XEN)    [<ffff82c4801683d3>] do_IRQ+0x9e/0x5a4
> (XEN)    [<ffff82c480160426>] common_interrupt+0x26/0x30
> (XEN)    [<ffff82c4801865c1>] async_exception_cleanup+0x1/0x35a
> (XEN)    [<ffff82c480228438>] syscall_enter+0xc8/0x122
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:607
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
The attached patch should prevent this panic, allowing for all the debug
information to be printed to the console.
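
As a rough illustration of the guard (not the patch itself), here is a
standalone user-space C model. Every sim_* name is hypothetical:
sim_in_irq stands in for in_irq(), sim_show_irq_sid() for
xsm_show_irq_sid(), and sim_xfree() for xfree(). The printed line format is
made up as well. The only point it tries to show is that the allocation and
the matching free are both skipped in irq context, so the dump loop still
visits every IRQ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for Xen's in_irq(); here it is only a flag. */
static bool sim_in_irq;

/* Bookkeeping so the effect of the guard can be observed. */
static int sim_free_count;   /* number of sim_xfree() calls            */
static int sim_violations;   /* sim_xfree() calls made in irq context  */

/* Stand-in for xfree(); the real one would trip ASSERT(!in_irq()). */
static void sim_xfree(void *p)
{
    if (sim_in_irq)
        sim_violations++;
    sim_free_count++;
    free(p);
}

/* Stand-in for xsm_show_irq_sid(): returns a heap-allocated label. */
static char *sim_show_irq_sid(int irq)
{
    char *buf = malloc(32);

    if (buf)
        snprintf(buf, 32, "label-for-irq-%d", irq);
    return buf;
}

/* Model of the dump loop; returns how many IRQ lines were printed. */
static int sim_dump_irqs(int nr_irqs)
{
    char *ssid = NULL;
    int printed = 0;

    assert(nr_irqs >= 0);

    for (int irq = 0; irq < nr_irqs; irq++) {
        /* Allocate the label only when it is safe to free it later. */
        if (!sim_in_irq)
            ssid = sim_show_irq_sid(irq);

        printf("IRQ:%4d", irq);
        if (ssid)
            printf(" label=%s", ssid);
        printf("\n");
        printed++;

        /* Same guard on the free path, so irq context never frees. */
        if (!sim_in_irq) {
            sim_xfree(ssid);
            ssid = NULL;   /* extra tidy-up in this sketch only */
        }
    }
    return printed;
}
```

Calling sim_dump_irqs() with sim_in_irq set to true prints every line without
a label and never reaches sim_xfree(); with it set to false, each iteration
allocates and frees one label.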

-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


[-- Attachment #2: irq-fix-dump_irqs.patch --]
[-- Type: text/x-patch, Size: 1504 bytes --]

# HG changeset patch
# Parent 3b563b2c79f991f226bd383d40402d96ddf9a168
x86/irq: Prevent call to xfree in dump_irqs while in an irq context.

Because of c/s 24707:96987c324a4f, dump_irqs() can now be called in an irq
context when a bug condition is encountered.  In that case, skip the call to
xsm_show_irq_sid() and the subsequent call to xfree().

This prevents an assertion failure in xfree(), and should allow all the debug
information to be dumped, before failing with a BUG() because of the underlying
race condition we are attempting to reproduce.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 3b563b2c79f9 xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2039,7 +2039,7 @@ static void dump_irqs(unsigned char key)
     struct domain *d;
     const struct pirq *info;
     unsigned long flags;
-    char *ssid;
+    char *ssid = NULL;
 
     printk("Guest interrupt information:\n");
 
@@ -2051,7 +2051,8 @@ static void dump_irqs(unsigned char key)
         if ( !irq_desc_initialized(desc) || desc->handler == &no_irq_type )
             continue;
 
-        ssid = xsm_show_irq_sid(irq);
+        if ( ! in_irq() )
+            ssid = xsm_show_irq_sid(irq);
 
         spin_lock_irqsave(&desc->lock, flags);
 
@@ -2098,7 +2099,8 @@ static void dump_irqs(unsigned char key)
 
         spin_unlock_irqrestore(&desc->lock, flags);
 
-        xfree(ssid);
+        if ( ! in_irq() )
+            xfree(ssid);
     }
 
     dump_ioapic_irq_info();

