From: Andrew Cooper <andrew.cooper3@citrix.com>
To: AP <apxeng@gmail.com>, Ian Jackson <Ian.Jackson@eu.citrix.com>,
"Keir (Xen.org)" <keir@xen.org>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>,
"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
Ian Campbell <Ian.Campbell@citrix.com>
Subject: Re: [xen-unstable test] 11946: regressions - FAIL
Date: Fri, 4 May 2012 21:11:19 +0100
Message-ID: <4FA437E7.6040105@citrix.com>
In-Reply-To: <CAGU+auvEvjy3qi3d9ZxMyFdHxauGL6wC=out07rxF-0sMfP8jg@mail.gmail.com>
On 04/05/12 20:48, AP wrote:
> On Tue, Mar 27, 2012 at 3:36 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> On Tue, 2012-02-14 at 10:44 +0000, Ian Campbell wrote:
>>> On Mon, 2012-02-13 at 20:16 +0000, xen.org wrote:
>>>> flight 11946 xen-unstable real [real]
>>>> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/
>>>>
>>>> Regressions :-(
>>>>
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>> test-amd64-i386-xl-credit2 7 debian-install fail REGR. vs. 11944
>>> Host crash:
>>> http://www.chiark.greenend.org.uk/~xensrcts/logs/11946/test-amd64-i386-xl-credit2/serial-woodlouse.log
>>>
>>> This is the debug Andrew Cooper added recently to track down the IRQ
>>> assertion we've been seeing, sadly it looks like the debug code tries to
>>> call xfree from interrupt context and therefore doesn't produce full
>>> output :-(
>> Are we still seeing the issue this debugging was intended to address? We
>> don't seem to be seeing the host crashes any more. Should the debug code
>> be patched up as in the following patch, otherwise when we do see it it
>> doesn't end up printing any useful info.
>>
>> Someone recently reported bugs.debian.org/665433 to Debian, is this the
>> same underlying issue? That report is with Xen 4.0 FWIW.
> I saw the issue (xen-unstable 25256:9dda0efd8ce1) that the debugging
> code added. Can the fix to the debugging code be checked in until the
> original issue has been fixed?
>
> Thanks,
> AP
>
> (XEN) *** IRQ BUG found ***
> (XEN) CPU0 -Testing vector 236 from bitmap 41,47,49,57,64,72,80,88,96,100,104,120,136,152,160-161,168,171,192,200-201,208
> (XEN) Guest interrupt information:
> (XEN) IRQ: 0 affinity:01 vec:f0 type=IO-APIC-edge status=00000000 mapped, unbound
> (XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:607
> (XEN) ----[ Xen-4.2-unstable x86_64 debug=y Tainted: C ]----
> (XEN) CPU: 0
> (XEN) RIP: e008:[<ffff82c48012cefb>] xfree+0x33/0x118
> (XEN) RFLAGS: 0000000000010002 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: ffff830214ac0080 rcx: 0000000000000000
> (XEN) rdx: ffff82c4802d8880 rsi: 0000000000000083 rdi: 0000000000000000
> (XEN) rbp: ffff82c4802b7c78 rsp: ffff82c4802b7c58 r8: 0000000000000004
> (XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000010
> (XEN) r12: ffff830214ac0c80 r13: 000000000000000c r14: ffff830214ac0ca8
> (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000426f0
> (XEN) cr3: 0000000168971000 cr2: 0000000001095e00
> (XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff82c4802b7c58:
> (XEN) ffff830214ac0080 ffff830214ac0c80 000000000000000c ffff830214ac0ca8
> (XEN) ffff82c4802b7ce8 ffff82c4801664d4 ffff82c4802e214a ffff82c400000020
> (XEN) ffff82c4802b7cf8 0000000000000083 ffff830214ac00a8 0000000000000000
> (XEN) 00000000000000ec 00000000000000ec ffff830214ac0c80 000000000000000c
> (XEN) ffff830214ac0ca8 ffff82c480302760 ffff82c4802b7d58 ffff82c480168000
> (XEN) ffff82c4802b7f18 ffff82c4802b7f18 000000ec00000000 ffff82c4802b7f18
> (XEN) 0000000000000000 0000000000000000 ffff82c480302324 0000000000000020
> (XEN) ffff82c4802b7dd8 0000000000000003 0000000000000000 0000000000000000
> (XEN) ffff82c4802b7dc8 ffff82c4801683d3 ffff8300da991000 ffff8300da996000
> (XEN) 0000000000000000 ffffffff802b7d90 ffff82c480159160 ffff82c4802b7e20
> (XEN) ffff82c48015d7db ffff82c4802b7f18 ffff8300da991000 0000000000000003
> (XEN) 0000000000000000 0000000000000000 00007d3b7fd48207 ffff82c480160426
> (XEN) 0000000000000000 0000000000000000 0000000000000003 ffff8300da991000
> (XEN) ffff82c4802b7ef8 ffff82c4802b7f18 0000000000000282 ffff82c4802319a0
> (XEN) 00000000deadbeef 0000000000000000 ffff83021c0b8081 0000000000000000
> (XEN) 0000000000000048 ffff8801d7227ec0 ffff8300da991000 0000002000000000
> (XEN) ffff82c4801865c1 000000000000e008 0000000000000202 ffff82c4802b7e88
> (XEN) 000000000000e010 0000000000000003 ffff82c4802b7ef8 ffff82c4802230d8
> (XEN) ffff82c4802b7f18 0000000000000000 0000000000000246 ffffffff810013aa
> (XEN) 0000000000000000 ffffffff810013aa 000000000000e030 0000000000000246
> (XEN) Xen call trace:
> (XEN) [<ffff82c48012cefb>] xfree+0x33/0x118
> (XEN) [<ffff82c4801664d4>] dump_irqs+0x2a4/0x2e8
> (XEN) [<ffff82c480168000>] irq_move_cleanup_interrupt+0x29f/0x2db
> (XEN) [<ffff82c4801683d3>] do_IRQ+0x9e/0x5a4
> (XEN) [<ffff82c480160426>] common_interrupt+0x26/0x30
> (XEN) [<ffff82c4801865c1>] async_exception_cleanup+0x1/0x35a
> (XEN) [<ffff82c480228438>] syscall_enter+0xc8/0x122
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Assertion '!in_irq()' failed at xmalloc_tlsf.c:607
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
The attached patch should prevent this panic, allowing all of the debug
information to be printed to the console.
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
[-- Attachment #2: irq-fix-dump_irqs.patch --]
[-- Type: text/x-patch, Size: 1504 bytes --]
# HG changeset patch
# Parent 3b563b2c79f991f226bd383d40402d96ddf9a168
x86/irq: Prevent call to xfree in dump_irqs while in an irq context.
Because of c/s 24707:96987c324a4f, dump_irqs() can now be called in irq
context when a bug condition is encountered. If this is the case, skip the
call to xsm_show_irq_sid() and the subsequent call to xfree().
This prevents an assertion failure in xfree(), and should allow all the debug
information to be dumped before failing with a BUG() because of the underlying
race condition we are attempting to reproduce.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff -r 3b563b2c79f9 xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -2039,7 +2039,7 @@ static void dump_irqs(unsigned char key)
     struct domain *d;
     const struct pirq *info;
     unsigned long flags;
-    char *ssid;
+    char *ssid = NULL;
     printk("Guest interrupt information:\n");
@@ -2051,7 +2051,8 @@ static void dump_irqs(unsigned char key)
         if ( !irq_desc_initialized(desc) || desc->handler == &no_irq_type )
             continue;
-        ssid = xsm_show_irq_sid(irq);
+        if ( ! in_irq() )
+            ssid = xsm_show_irq_sid(irq);
         spin_lock_irqsave(&desc->lock, flags);
@@ -2098,7 +2099,8 @@ static void dump_irqs(unsigned char key)
         spin_unlock_irqrestore(&desc->lock, flags);
-        xfree(ssid);
+        if ( ! in_irq() )
+            xfree(ssid);
     }
     dump_ioapic_irq_info();
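
For illustration only, the guard the patch applies can be sketched as a
standalone C fragment. Everything here is a hypothetical stand-in rather than
Xen code: fake_in_irq models in_irq(), show_irq_sid() models the allocating
xsm_show_irq_sid(), free_sid() models xfree() with its '!in_irq()' assertion,
and dump_irqs_sketch() models only the loop structure of dump_irqs(). It is a
sketch under those assumptions, not the hypervisor implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Xen internals (assumptions, not real Xen code). */
static bool fake_in_irq;   /* models the result of in_irq() */
static int live_allocs;    /* counts labels allocated but not yet freed */

static bool in_irq(void)
{
    return fake_in_irq;
}

/* Models xsm_show_irq_sid(): returns a heap-allocated label string. */
static char *show_irq_sid(int irq)
{
    char *s;

    assert(!in_irq());     /* allocation is not allowed in irq context */
    s = malloc(32);
    if (s != NULL) {
        snprintf(s, 32, "sid-for-irq-%d", irq);
        live_allocs++;
    }
    return s;
}

/* Models xfree(), including the assertion that tripped in the log above. */
static void free_sid(char *s)
{
    assert(!in_irq());
    if (s != NULL)
        live_allocs--;
    free(s);
}

/* Models the loop in dump_irqs(): returns the number of IRQs printed. */
static int dump_irqs_sketch(int nr_irqs)
{
    char *ssid = NULL;     /* stays NULL for the whole dump in irq context */
    int printed = 0;

    for (int irq = 0; irq < nr_irqs; irq++) {
        if (!in_irq())
            ssid = show_irq_sid(irq);

        printf("IRQ:%4d label=%s\n", irq, ssid ? ssid : "(skipped)");
        printed++;

        if (!in_irq())
            free_sid(ssid);
    }
    return printed;
}
```

Calling dump_irqs_sketch() with fake_in_irq set to true walks every IRQ
without reaching either assertion, at the cost of omitting the label; with
fake_in_irq set to false each label is allocated, printed and freed as before.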