* [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
@ 2013-08-05 20:38 Andrew Cooper
2013-08-06 4:49 ` Keir Fraser
2013-08-06 8:01 ` Jan Beulich
0 siblings, 2 replies; 11+ messages in thread
From: Andrew Cooper @ 2013-08-05 20:38 UTC (permalink / raw)
To: Xen-devel; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich, Tim Deegan
Automated testing on Xen-4.3 testing tip found an interesting issue
(XEN) *** DOUBLE FAULT ***
(XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f
(XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: 0000000000000020 rcx: 0000000000000100
(XEN) rdx: ffff82c4c032dfc0 rsi: ffff83043f2c6068 rdi: ffff83043f2c6008
(XEN) rbp: ffff83043f2c6048 rsp: ffff83043f2c6000 r8: 0000000000000001
(XEN) r9: 0000000000000000 r10: ffff83043f2c76f0 r11: 0000000000000000
(XEN) r12: ffff83043f2c6008 r13: 7fffffffffffffff r14: ffff83043f2c6068
(XEN) r15: 000003343036797b cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) cr3: 0000000403c40000 cr2: ffff83043f2c5ff8
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0
(XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000):
(XEN) ffff83043f2c6008: [<ffff82c4c01aa73d>] cpuidle_wakeup_mwait+0x2d/0xba
(XEN) ffff83043f2c6058: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c60c8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c60d8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c61a8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6230: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6268: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c62d8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c62e8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c63b8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6440: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6478: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6498: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c64e8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c64f8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6550: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
(XEN) ffff83043f2c6568: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c65c8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6650: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6688: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c66f8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6708: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c67d8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6860: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6898: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6908: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6918: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6938: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c6988: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c69e8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6a70: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6aa8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6b18: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6c80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
(XEN) ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f
(XEN) ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c6f28: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d
(XEN) ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde
(XEN) ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239
(XEN) ffff83043f2c7108: [<ffff82c4c01e9497>] __get_gfn_type_access+0x12b/0x20e
(XEN) ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d
(XEN) ffff83043f2c71c8: [<ffff82c4c01f4660>] map_domain_gfn_3_levels+0x43/0x13a
(XEN) ffff83043f2c7208: [<ffff82c4c01f4b6b>] guest_walk_tables_3_levels+0x414/0x489
(XEN) ffff83043f2c7288: [<ffff82c4c0223988>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306
(XEN) ffff83043f2c7338: [<ffff82c4c0223b35>] hap_gva_to_gfn_3_levels+0x1f/0x2a
(XEN) ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc
(XEN) ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d
(XEN) ffff83043f2c73c8: [<ffff82c4c01b6d34>] hvmemul_virtual_to_linear+0x102/0x153
(XEN) ffff83043f2c7408: [<ffff82c4c01c1538>] hvm_copy_from_guest_virt+0x15/0x17
(XEN) ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8
(XEN) ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14
(XEN) ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10
(XEN) ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309
(XEN) ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c75c0: [<ffff82c4c012a100>] _spin_trylock_recursive+0x63/0x93
(XEN) ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c7668: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN) ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2
(XEN) ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN) ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf
(XEN) ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60
(XEN) ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0
(XEN) ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb
(XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
(XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
(XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
(XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
(XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
(XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
(XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
(XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
(XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
(XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
(XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) DOUBLE FAULT -- system shutdown
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
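As a quick cross-check of the register dump above, the sketch below redoes the stack arithmetic. The three constants are copied from the log; the helper names are made up for illustration and are not Xen functions.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the crash log above. */
#define STACK_BASE 0xffff83043f2c6000ULL  /* lowest valid stack address */
#define STACK_TOP  0xffff83043f2c8000ULL  /* end of the valid stack range */
#define FAULT_CR2  0xffff83043f2c5ff8ULL  /* faulting address from cr2 */

/* Size of the valid stack range in bytes (hypothetical helper). */
static uint64_t stack_size(uint64_t base, uint64_t top)
{
    assert(top > base);
    return top - base;
}

/* Distance of a faulting address below the stack base (hypothetical helper). */
static uint64_t bytes_below_base(uint64_t base, uint64_t addr)
{
    assert(addr < base);
    return base - addr;
}
```

With these inputs, stack_size(STACK_BASE, STACK_TOP) is 8192 bytes and bytes_below_base(STACK_BASE, FAULT_CR2) is 8, which is consistent with a single 8-byte push running off the bottom of the stack while rsp sat exactly at the base.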
The HPET interrupt handler runs with interrupts enabled, because of the
spin_unlock_irq() in:
    while ( desc->status & IRQ_PENDING )
    {
        desc->status &= ~IRQ_PENDING;
        spin_unlock_irq(&desc->lock);
        tsc_in = tb_init_done ? get_cycles() : 0;
        action->handler(irq, action->dev_id, regs);
        TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
        spin_lock_irq(&desc->lock);
    }
in do_IRQ().
Clearly there are cases where HPET interrupts arrive faster than
handle_hpet_broadcast() can process them, I presume in part because of the
large amount of cpumask manipulation.
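To put a rough number on the nesting visible in the trace, the sketch below takes the stack slots that hold handle_hpet_broadcast+0x5b and measures the distance between them. The addresses are copied from the log; the helper names are hypothetical, and the result is only as trustworthy as the trace itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stack slots holding handle_hpet_broadcast+0x5b in the trace above. */
static const uint64_t hpet_slots[] = {
    0xffff83043f2c6268ULL, 0xffff83043f2c6478ULL, 0xffff83043f2c6688ULL,
    0xffff83043f2c6898ULL, 0xffff83043f2c6aa8ULL, 0xffff83043f2c6cb8ULL,
};
#define NR_SLOTS (sizeof(hpet_slots) / sizeof(hpet_slots[0]))

/* Return the common distance between consecutive slots, or 0 if it varies. */
static uint64_t uniform_stride(const uint64_t *slots, size_t n)
{
    assert(n >= 2);
    uint64_t stride = slots[1] - slots[0];
    for (size_t i = 2; i < n; i++)
        if (slots[i] - slots[i - 1] != stride)
            return 0;
    return stride;
}

/* Upper bound on nested entries that fit in a stack of the given size. */
static uint64_t max_nesting(uint64_t stack_bytes, uint64_t stride)
{
    assert(stride != 0);
    return stack_bytes / stride;
}
```

The six slots are evenly spaced 0x210 (528) bytes apart, so even a completely empty 8 KiB stack would hold at most 15 such nested interrupt entries, and fewer once the guest-handling frames at the top of the trace are counted.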
One solution, presented in this patch, is to disable interrupts while running
the HPET event handler, but the patch is RFC because I don't really like it;
it feels a little bit like a hack.
handle_hpet_broadcast() is clearly too long as an interrupt handler, and
perhaps effort should be made to reduce it if possible.
Then again, interrupt handlers for this (and other Xen-consumed interrupts?)
should probably be run with interrupts disabled, to prevent exactly these
kinds of problems.
Thoughts/comments?
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
---
xen/arch/x86/hpet.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 946d133..46e14a9 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -229,7 +229,9 @@ static void hpet_interrupt_handler(int irq, void *data,
         return;
     }
 
+    local_irq_disable();
     ch->event_handler(ch);
+    local_irq_enable();
 }
 
 static void hpet_msi_unmask(struct irq_desc *desc)
--
1.7.10.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-05 20:38 [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler Andrew Cooper
@ 2013-08-06 4:49 ` Keir Fraser
2013-08-06 8:01 ` Jan Beulich
1 sibling, 0 replies; 11+ messages in thread
From: Keir Fraser @ 2013-08-06 4:49 UTC (permalink / raw)
To: Andrew Cooper, Xen-devel; +Cc: Tim Deegan, Jan Beulich
On 05/08/2013 21:38, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> The hpet interrupt handler runs with interrupts enabled, due to this the
> spin_unlock_irq() in:
>
>     while ( desc->status & IRQ_PENDING )
>     {
>         desc->status &= ~IRQ_PENDING;
>         spin_unlock_irq(&desc->lock);
>         tsc_in = tb_init_done ? get_cycles() : 0;
>         action->handler(irq, action->dev_id, regs);
>         TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>         spin_lock_irq(&desc->lock);
>     }
>
> in do_IRQ().
But the handler sets IRQ_INPROGRESS before entering this loop, which should
prevent reentry of the irq handler (hpet_interrupt_handler). So I don't see
how the multiple reentering of hpet_interrupt_handler in the call trace is
possible.
-- Keir
^ permalink raw reply [flat|nested] 11+ messages in thread
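The IRQ_INPROGRESS argument above can be illustrated with a small user-space model. This is a simplified sketch, not Xen's do_IRQ(): the flag values, struct irq_model, and both model_* functions are assumptions made for illustration, and locking is left out entirely.

```c
#include <assert.h>

#define IRQ_INPROGRESS 0x1u  /* illustrative values, not Xen's */
#define IRQ_PENDING    0x2u

struct irq_model {
    unsigned int status;
    int depth;      /* current handler nesting for this IRQ */
    int max_depth;  /* deepest nesting observed */
    int runs;       /* completed handler invocations */
    int refire;     /* times the same IRQ fires again mid-handler */
};

static void model_do_irq(struct irq_model *d);

static void model_handler(struct irq_model *d)
{
    /* The handler is only ever reached with IRQ_INPROGRESS set. */
    assert(d->status & IRQ_INPROGRESS);
    d->depth++;
    if (d->depth > d->max_depth)
        d->max_depth = d->depth;
    if (d->refire > 0) {
        d->refire--;
        model_do_irq(d);  /* same IRQ arrives while the handler runs */
    }
    d->depth--;
    d->runs++;
}

static void model_do_irq(struct irq_model *d)
{
    d->status |= IRQ_PENDING;
    if (d->status & IRQ_INPROGRESS)
        return;           /* the outer invocation will pick it up */
    d->status |= IRQ_INPROGRESS;
    while (d->status & IRQ_PENDING) {
        d->status &= ~IRQ_PENDING;
        model_handler(d);
    }
    d->status &= ~IRQ_INPROGRESS;
}
```

In this model, an IRQ that fires three more times while its handler is running is handled four times in total, but never nested more than one deep. The flag lives in the per-IRQ descriptor, so the model says nothing about nesting across different IRQs.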
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-05 20:38 [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler Andrew Cooper
2013-08-06 4:49 ` Keir Fraser
@ 2013-08-06 8:01 ` Jan Beulich
2013-08-06 10:32 ` Andrew Cooper
1 sibling, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2013-08-06 8:01 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan
>>> On 05.08.13 at 22:38, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Automated testing on Xen-4.3 testing tip found an interesting issue
>
> (XEN) *** DOUBLE FAULT ***
> (XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
The call trace is suspicious in ways beyond what Keir already pointed out -
with debug=y, there shouldn't be bogus entries listed, yet ...
> (XEN) CPU: 3
> (XEN) RIP: e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f
> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: 0000000000000020 rcx: 0000000000000100
> (XEN) rdx: ffff82c4c032dfc0 rsi: ffff83043f2c6068 rdi: ffff83043f2c6008
> (XEN) rbp: ffff83043f2c6048 rsp: ffff83043f2c6000 r8: 0000000000000001
> (XEN) r9: 0000000000000000 r10: ffff83043f2c76f0 r11: 0000000000000000
> (XEN) r12: ffff83043f2c6008 r13: 7fffffffffffffff r14: ffff83043f2c6068
> (XEN) r15: 000003343036797b cr0: 0000000080050033 cr4: 00000000000026f0
> (XEN) cr3: 0000000403c40000 cr2: ffff83043f2c5ff8
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0
> (XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000):
[... removed redundant stuff]
> (XEN) ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
> (XEN) ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c6c80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
> (XEN) ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
> (XEN) ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
> (XEN) ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
> (XEN) ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
> (XEN) ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f
> (XEN) ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
> (XEN) ffff83043f2c6f28: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
> (XEN) ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d
> (XEN) ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde
> (XEN) ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239
> (XEN) ffff83043f2c7108: [<ffff82c4c01e9497>] __get_gfn_type_access+0x12b/0x20e
> (XEN) ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d
> (XEN) ffff83043f2c71c8: [<ffff82c4c01f4660>] map_domain_gfn_3_levels+0x43/0x13a
> (XEN) ffff83043f2c7208: [<ffff82c4c01f4b6b>] guest_walk_tables_3_levels+0x414/0x489
> (XEN) ffff83043f2c7288: [<ffff82c4c0223988>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306
> (XEN) ffff83043f2c7338: [<ffff82c4c0223b35>] hap_gva_to_gfn_3_levels+0x1f/0x2a
> (XEN) ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc
> (XEN) ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d
> (XEN) ffff83043f2c73c8: [<ffff82c4c01b6d34>] hvmemul_virtual_to_linear+0x102/0x153
> (XEN) ffff83043f2c7408: [<ffff82c4c01c1538>] hvm_copy_from_guest_virt+0x15/0x17
> (XEN) ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8
> (XEN) ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14
> (XEN) ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10
> (XEN) ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309
... how would this end up getting called from do_IRQ()?
> (XEN) ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
> (XEN) ffff83043f2c75c0: [<ffff82c4c012a100>] _spin_trylock_recursive+0x63/0x93
> (XEN) ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
> (XEN) ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
> (XEN) ffff83043f2c7668: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
> (XEN) ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
> (XEN) ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2
> (XEN) ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
> (XEN) ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf
> (XEN) ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60
> (XEN) ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0
> (XEN) ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb
And this one looks bogus too. Question therefore is whether the problem you
describe isn't a consequence of an earlier issue.
> (XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
> (XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
> (XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
> (XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
> (XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
> (XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
> (XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
> (XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
> (XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
> (XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
> (XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
> (XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
> (XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 3:
> (XEN) DOUBLE FAULT -- system shutdown
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> The hpet interrupt handler runs with interrupts enabled, due to this the
> spin_unlock_irq() in:
>
>     while ( desc->status & IRQ_PENDING )
>     {
>         desc->status &= ~IRQ_PENDING;
>         spin_unlock_irq(&desc->lock);
>         tsc_in = tb_init_done ? get_cycles() : 0;
>         action->handler(irq, action->dev_id, regs);
>         TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>         spin_lock_irq(&desc->lock);
>     }
>
> in do_IRQ().
>
> Clearly there are cases where the frequency of the HPET interrupt is faster
> than the time it takes to process handle_hpet_broadcast(), I presume in part
> because of the large amount of cpumask manipulation.
How many CPUs (and how many usable HPET channels) does the system have that
this crash was observed on?
Jan
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-06 8:01 ` Jan Beulich
@ 2013-08-06 10:32 ` Andrew Cooper
2013-08-06 11:44 ` Jan Beulich
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2013-08-06 10:32 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Tim Deegan
[-- Attachment #1: Type: text/plain, Size: 3512 bytes --]
On 06/08/13 09:01, Jan Beulich wrote:
>>>> On 05.08.13 at 22:38, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Automated testing on Xen-4.3 testing tip found an interesting issue
>>
>> (XEN) *** DOUBLE FAULT ***
>> (XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
> The call trace is suspicious in ways beyond what Keir already
> pointed out - with debug=y, there shouldn't be bogus entries listed,
> yet ...
show_stack_overflow() doesn't have a debug case which follows frame
pointers. I shall submit a patch for this presently, and put it into
XenServer in the hope of getting a better stack trace in the future.
<snip>
> And this one looks bogus too. Question therefore is whether the
> problem you describe isn't a consequence of an earlier issue.
There is nothing apparently interesting preceding the crash. Just some
spew from an HVM domain using the 0x39 debug port.
>
>> (XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
>> (XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
>> (XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
>> (XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
>> (XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
>> (XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
>> (XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
>> (XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
>> (XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
>> (XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
>> (XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 3:
>> (XEN) DOUBLE FAULT -- system shutdown
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>>
>> The hpet interrupt handler runs with interrupts enabled, due to this the
>> spin_unlock_irq() in:
>>
>>     while ( desc->status & IRQ_PENDING )
>>     {
>>         desc->status &= ~IRQ_PENDING;
>>         spin_unlock_irq(&desc->lock);
>>         tsc_in = tb_init_done ? get_cycles() : 0;
>>         action->handler(irq, action->dev_id, regs);
>>         TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>>         spin_lock_irq(&desc->lock);
>>     }
>>
>> in do_IRQ().
>>
>> Clearly there are cases where the frequency of the HPET interrupt is faster
>> than the time it takes to process handle_hpet_broadcast(), I presume in part
>> because of the large amount of cpumask manipulation.
> How many CPUs (and how many usable HPET channels) does the
> system have that this crash was observed on?
>
> Jan
The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.
The full boot xl dmesg is attached, but it appears that there are 8 broadcast
hpets. This is further backed up by the 'i' debugkey (also attached).
Keir: (merging your thread back here)
I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
interrupts, there are rather more than 8 occurrences of
handle_hpet_broadcast() in the stack. If the occurrences were just
function pointers on the stack, I would expect to see several
handle_hpet_broadcast()+0x0/0x268
~Andrew
[-- Attachment #2: xl-dmesg-boot --]
[-- Type: text/plain, Size: 11950 bytes --]
 __  __            _  _    _____  ___
 \ \/ /___ _ __   | || |  |___ / / _ \
  \  // _ \ '_ \  | || |_   |_ \| | | |
  /  \  __/ | | | |__   _| ___) | |_| |
 /_/\_\___|_| |_|    |_|(_)____(_)___/
(XEN) Xen version 4.3.0 (root@uk.xensource.com) (x86_64-linux-gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)) debug=y Mon Aug 5 16:03:49 EDT 2013
(XEN) Latest ChangeSet: 27215:b5c2bcac14ad, pq 59:2de62343c69b
(XEN) Bootloader: SYSLINUX 4.06 0x51f8b10e
(XEN) Command line: com1=115200,8n1 console=com1,vga mem=1024G dom0_max_vcpus=4 dom0_mem=752M,max:752M watchdog lowmem_emergency_pool=1M crashkernel=64M@32M cpuid_mask_xsave_eax=0
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN) Found 1 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 000000000009e000 (usable)
(XEN) 0000000000100000 - 00000000bf699000 (usable)
(XEN) 00000000bf699000 - 00000000bf6af000 (reserved)
(XEN) 00000000bf6af000 - 00000000bf6ce000 (ACPI data)
(XEN) 00000000bf6ce000 - 00000000c0000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fe000000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000000440000000 (usable)
(XEN) Kdump: 64MB (65536kB) at 0x2000000
(XEN) ACPI: RSDP 000F12D0, 0024 (r2 DELL )
(XEN) ACPI: XSDT 000F13D0, 008C (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: FACP BF6C3BB4, 00F4 (r3 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: DSDT BF6AF000, 3E5B (r1 DELL PE_SC3 1 INTL 20050624)
(XEN) ACPI: FACS BF6C6000, 0040
(XEN) ACPI: APIC BF6C3478, 0152 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: SPCR BF6C35CC, 0050 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: HPET BF6C3620, 0038 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: DMAR BF6C365C, 00A8 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: MCFG BF6C3850, 003C (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: WD__ BF6C3890, 0134 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: SLIC BF6C39C8, 0024 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: ERST BF6B2FDC, 0270 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: HEST BF6B324C, 03A8 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: BERT BF6B2E5C, 0030 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: EINJ BF6B2E8C, 0150 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: TCPA BF6C3B4C, 0064 (r2 DELL PE_SC3 1 DELL 1)
(XEN) System RAM: 16374MB (16767196kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-0000000440000000
(XEN) Domain heap initialised DMA width 32 bits
(XEN) found SMP MP-table at 000fe710
(XEN) DMI 2.6 present.
(XEN) Using APIC driver bigsmp
(XEN) ACPI: PM-Timer IO Port: 0x808
(XEN) ACPI: SLEEP INFO: pm1x_cnt[804,0], pm1x_evt[800,0]
(XEN) ACPI: wakeup_vec[bf6c600c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) Processor #0 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) Processor #2 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) Processor #4 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) Processor #6 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x24] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x25] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x26] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x27] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x09] lapic_id[0x28] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x29] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x2a] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x2b] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x2c] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x2d] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x2e] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x10] lapic_id[0x2f] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x11] lapic_id[0x30] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x12] lapic_id[0x31] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x13] lapic_id[0x32] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x14] lapic_id[0x33] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x15] lapic_id[0x34] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x16] lapic_id[0x35] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x17] lapic_id[0x36] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x18] lapic_id[0x37] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x19] lapic_id[0x38] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1a] lapic_id[0x39] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1b] lapic_id[0x3a] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1c] lapic_id[0x3b] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1d] lapic_id[0x3c] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1e] lapic_id[0x3d] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1f] lapic_id[0x3e] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x20] lapic_id[0x3f] disabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode: Phys. Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
(XEN) Xen ERST support is initialized.
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 32 CPUs (28 hotplug CPUs)
(XEN) IRQ limits: 24 GSI, 760 MSI/MSI-X
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2394.052 MHz processor.
(XEN) Initing memory sharing.
(XEN) Cannot set CPU xsave feature mask on CPU#0
(XEN) mce_intel.c:717: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
(XEN) Intel machine check reporting enabled
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping not enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN) - Dom0 mode: Relaxed
(XEN) Interrupt remapping disabled
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using new ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) [2013-08-06 10:04:38] Platform timer is 14.318MHz HPET
(XEN) [2013-08-06 10:04:38] Allocated console ring of 64 KiB.
(XEN) [2013-08-06 10:04:38] mwait-idle: MWAIT substates: 0x1120
(XEN) [2013-08-06 10:04:38] mwait-idle: v0.4 model 0x1e
(XEN) [2013-08-06 10:04:38] mwait-idle: lapic_timer_reliable_states 0x2
(XEN) [2013-08-06 10:04:38] HPET: 8 timers (8 will be used for broadcast)
(XEN) [2013-08-06 10:04:38] VMX: Supported advanced features:
(XEN) [2013-08-06 10:04:38] - APIC MMIO access virtualisation
(XEN) [2013-08-06 10:04:38] - APIC TPR shadow
(XEN) [2013-08-06 10:04:38] - Extended Page Tables (EPT)
(XEN) [2013-08-06 10:04:38] - Virtual-Processor Identifiers (VPID)
(XEN) [2013-08-06 10:04:38] - Virtual NMI
(XEN) [2013-08-06 10:04:38] - MSR direct-access bitmap
(XEN) [2013-08-06 10:04:38] HVM: ASIDs enabled.
(XEN) [2013-08-06 10:04:38] HVM: VMX enabled
(XEN) [2013-08-06 10:04:38] HVM: Hardware Assisted Paging (HAP) detected
(XEN) [2013-08-06 10:04:38] HVM: HAP page sizes: 4kB, 2MB
(XEN) [2013-08-06 10:04:37] Cannot set CPU xsave feature mask on CPU#1
(XEN) [2013-08-06 10:04:37] Cannot set CPU xsave feature mask on CPU#2
(XEN) [2013-08-06 10:04:37] Cannot set CPU xsave feature mask on CPU#3
(XEN) [2013-08-06 10:04:39] Brought up 4 CPUs
(XEN) [2013-08-06 10:04:39] Testing NMI watchdog --- CPU#0 okay. CPU#1 okay. CPU#2 okay. CPU#3 okay.
(XEN) [2013-08-06 10:04:39] ACPI sleep modes: S3
(XEN) [2013-08-06 10:04:39] mcheck_poll: Machine check polling timer started.
(XEN) [2013-08-06 10:04:39] *** LOADING DOMAIN 0 ***
(XEN) [2013-08-06 10:04:39] elf_parse_binary: phdr: paddr=0x100000 memsz=0x3f4000
(XEN) [2013-08-06 10:04:39] elf_parse_binary: phdr: paddr=0x4f4000 memsz=0x1a7000
(XEN) [2013-08-06 10:04:39] elf_parse_binary: memory: 0x100000 -> 0x69b000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: GUEST_OS = "linux"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: GUEST_VERSION = "2.6"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: XEN_VERSION = "xen-3.0"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: VIRT_BASE = 0xc0000000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: PADDR_OFFSET = 0x0
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: ENTRY = 0xc0100000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: HYPERCALL_PAGE = 0xc0101000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: HV_START_LOW = 0xf5800000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: FEATURES = "writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: PAE_MODE = "yes"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: unknown xen elf note (0xd)
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: LOADER = "generic"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: SUSPEND_CANCEL = 0x1
(XEN) [2013-08-06 10:04:39] elf_xen_addr_calc_check: addresses:
(XEN) [2013-08-06 10:04:39] virt_base = 0xc0000000
(XEN) [2013-08-06 10:04:39] elf_paddr_offset = 0x0
(XEN) [2013-08-06 10:04:39] virt_offset = 0xc0000000
(XEN) [2013-08-06 10:04:39] virt_kstart = 0xc0100000
(XEN) [2013-08-06 10:04:39] virt_kend = 0xc069b000
(XEN) [2013-08-06 10:04:39] virt_entry = 0xc0100000
(XEN) [2013-08-06 10:04:39] p2m_base = 0xffffffffffffffff
(XEN) [2013-08-06 10:04:39] Xen kernel: 64-bit, lsb, compat32
(XEN) [2013-08-06 10:04:39] Dom0 kernel: 32-bit, PAE, lsb, paddr 0x100000 -> 0x69b000
(XEN) [2013-08-06 10:04:39] PHYSICAL MEMORY ARRANGEMENT:
(XEN) [2013-08-06 10:04:39] Dom0 alloc.: 0000000433800000->0000000434000000 (188393 pages to be allocated)
(XEN) [2013-08-06 10:04:39] Init. ramdisk: 000000043f7e9000->000000043ffffe00
(XEN) [2013-08-06 10:04:39] VIRTUAL MEMORY ARRANGEMENT:
(XEN) [2013-08-06 10:04:39] Loaded kernel: 00000000c0100000->00000000c069b000
(XEN) [2013-08-06 10:04:39] Init. ramdisk: 00000000c069b000->00000000c0eb1e00
(XEN) [2013-08-06 10:04:39] Phys-Mach map: 00000000c0eb2000->00000000c0f6e000
(XEN) [2013-08-06 10:04:39] Start info: 00000000c0f6e000->00000000c0f6e4b4
(XEN) [2013-08-06 10:04:39] Page tables: 00000000c0f6f000->00000000c0f7c000
(XEN) [2013-08-06 10:04:39] Boot stack: 00000000c0f7c000->00000000c0f7d000
(XEN) [2013-08-06 10:04:39] TOTAL: 00000000c0000000->00000000c1000000
(XEN) [2013-08-06 10:04:39] ENTRY ADDRESS: 00000000c0100000
(XEN) [2013-08-06 10:04:39] Dom0 has maximum 4 VCPUs
(XEN) [2013-08-06 10:04:39] elf_load_binary: phdr 0 at 0xc0100000 -> 0xc04f4000
(XEN) [2013-08-06 10:04:39] elf_load_binary: phdr 1 at 0xc04f4000 -> 0xc05c9000
(XEN) [2013-08-06 10:04:40] Scrubbing Free RAM: ..........................................................................................................................................................done.
(XEN) [2013-08-06 10:04:43] Initial low memory virq threshold set at 0x4000 pages.
(XEN) [2013-08-06 10:04:43] Std. Loglevel: All
(XEN) [2013-08-06 10:04:43] Guest Loglevel: All
(XEN) [2013-08-06 10:04:43] Xen is relinquishing VGA console.
(XEN) [2013-08-06 10:04:43] *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) [2013-08-06 10:04:43] Freed 320kB init memory.
[-- Attachment #3: xl-debugkeys-i --]
[-- Type: text/plain, Size: 9521 bytes --]
(XEN) [2013-08-06 10:25:32] Guest interrupt information:
(XEN) [2013-08-06 10:25:32] IRQ: 0 affinity:00000001 vec:f0 type=IO-APIC-edge status=00000000 timer_interrupt+0/0x18f
(XEN) [2013-08-06 10:25:32] IRQ: 1 affinity:00000001 vec:30 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 3 affinity:00000001 vec:38 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 4 affinity:00000001 vec:f1 type=IO-APIC-edge status=00000000 ns16550_interrupt+0/0x6a
(XEN) [2013-08-06 10:25:32] IRQ: 5 affinity:00000001 vec:40 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 6 affinity:00000001 vec:48 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 7 affinity:00000001 vec:50 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 8 affinity:00000001 vec:58 type=IO-APIC-edge status=00000010 in-flight=0 domain-list=0: 8(----),
(XEN) [2013-08-06 10:25:32] IRQ: 9 affinity:00000001 vec:60 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 9(----),
(XEN) [2013-08-06 10:25:32] IRQ: 10 affinity:00000001 vec:68 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 11 affinity:00000001 vec:70 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 12 affinity:00000001 vec:78 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 13 affinity:00000001 vec:88 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 14 affinity:00000001 vec:90 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 15 affinity:00000001 vec:98 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 16 affinity:00000008 vec:4a type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 16(----),
(XEN)
[2013-08-06 10:25:32] IRQ: 17 affinity:00000001 vec:d8 type=IO-APIC-level status=00000002 mapped, unbound (XEN) [2013-08-06 10:25:32] IRQ: 20 affinity:00000001 vec:a0 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 20(----), (XEN) [2013-08-06 10:25:32] IRQ: 21 affinity:00000001 vec:d0 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 21(----), (XEN) [2013-08-06 10:25:32] IRQ: 22 affinity:00000001 vec:a4 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 22(----), (XEN) [2013-08-06 10:25:32] IRQ: 24 affinity:0000000f vec:28 type=DMA_MSI status=00000000 iommu_page_fault+0/0x12 (XEN) [2013-08-06 10:25:32] IRQ: 25 affinity:00000001 vec:c6 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 26 affinity:00000008 vec:3f type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 27 affinity:00000001 vec:47 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 28 affinity:00000002 vec:4f type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 29 affinity:00000008 vec:57 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 30 affinity:00000008 vec:2f type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 31 affinity:00000001 vec:86 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 32 affinity:00000002 vec:37 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40 (XEN) [2013-08-06 10:25:32] IRQ: 33 affinity:00000001 vec:3a type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:279(----), (XEN) [2013-08-06 10:25:32] IRQ: 34 affinity:00000001 vec:42 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:278(----), (XEN) [2013-08-06 10:25:32] IRQ: 35 affinity:00000001 vec:4a type=PCI-MSI status=00000002 mapped, unbound (XEN) [2013-08-06 10:25:32] IRQ: 36 
affinity:00000001 vec:52 type=PCI-MSI status=00000002 mapped, unbound (XEN) [2013-08-06 10:25:32] IRQ: 37 affinity:00000001 vec:aa type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:275(----), (XEN) [2013-08-06 10:25:32] IRQ: 38 affinity:00000001 vec:3f type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:274(----), (XEN) [2013-08-06 10:25:32] IRQ: 39 affinity:00000001 vec:c7 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:273(----), (XEN) [2013-08-06 10:25:32] IRQ: 40 affinity:00000001 vec:cf type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:272(----), (XEN) [2013-08-06 10:25:32] IRQ: 41 affinity:00000001 vec:d7 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:271(----), (XEN) [2013-08-06 10:25:32] IRQ: 42 affinity:00000001 vec:df type=PCI-MSI/-X status=00000002 mapped, unbound (XEN) [2013-08-06 10:25:32] IRQ: 43 affinity:00000001 vec:a8 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:269(----), (XEN) [2013-08-06 10:25:32] IRQ: 44 affinity:00000001 vec:b0 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:268(----), (XEN) [2013-08-06 10:25:32] IRQ: 45 affinity:00000001 vec:b8 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:267(----), (XEN) [2013-08-06 10:25:32] IRQ: 46 affinity:00000001 vec:c0 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:266(----), (XEN) [2013-08-06 10:25:32] IRQ: 47 affinity:00000001 vec:c8 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:265(----), (XEN) [2013-08-06 10:25:32] IRQ: 48 affinity:00000001 vec:21 type=PCI-MSI/-X status=00000002 mapped, unbound (XEN) [2013-08-06 10:25:32] IO-APIC interrupt information: (XEN) [2013-08-06 10:25:32] IRQ 0 Vec240: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 2: vec=f0 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 1 Vec 48: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 1: vec=30 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 
10:25:32] IRQ 3 Vec 56: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 3: vec=38 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 4 Vec241: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 4: vec=f1 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 5 Vec 64: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 5: vec=40 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 6 Vec 72: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 6: vec=48 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 7 Vec 80: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 7: vec=50 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 8 Vec 88: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 8: vec=58 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 9 Vec 96: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 9: vec=60 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=L mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 10 Vec104: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 10: vec=68 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 11 Vec112: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 11: vec=70 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 12 Vec120: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 12: vec=78 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 13 Vec136: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 13: vec=88 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 14 Vec144: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 14: vec=90 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 
(XEN) [2013-08-06 10:25:32] IRQ 15 Vec152: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 15: vec=98 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 16 Vec 74: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 16: vec=4a delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:6 (XEN) [2013-08-06 10:25:32] IRQ 17 Vec216: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 17: vec=d8 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=1 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 20 Vec160: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 20: vec=a0 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 21 Vec208: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 21: vec=d0 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0 (XEN) [2013-08-06 10:25:32] IRQ 22 Vec164: (XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 22: vec=a4 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0 [-- Attachment #4: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-06 10:32 ` Andrew Cooper
@ 2013-08-06 11:44 ` Jan Beulich
2013-08-06 13:23 ` Andrew Cooper
0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2013-08-06 11:44 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan

>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.

Not all that big.

> The full boot xl dmesg is attached, but it appears that there are 8
> broadcast hpets. This is further backed up by the 'i' debugkey (also
> attached)

Right. And with fewer CPUs than HPET channels, you could get
the system into a mode where each CPU uses a dedicated channel
("maxcpus=4", suppressing registration of all the disabled ones).

> Keir: (merging your thread back here)
> I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
> interrupts, there are rather more than 8 occurrences of
> handle_hpet_broadcast() in the stack. If the occurrences were just
> function pointers on the stack, I would expect to see several
> handle_hpet_broadcast()+0x0/0x268

Which further hints at some earlier problem. I suppose you don't
happen to have a dump of that crash, or else you could inspect
the IRQ descriptors as well as the stack for whether all instances
came from the same IRQ/vector.

Jan
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-06 11:44 ` Jan Beulich
@ 2013-08-06 13:23 ` Andrew Cooper
2013-08-06 13:57 ` Jan Beulich
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2013-08-06 13:23 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Tim Deegan

On 06/08/13 12:44, Jan Beulich wrote:
>>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.
> Not all that big.
>
>> The full boot xl dmesg is attached, but it appears that there are 8
>> broadcast hpets. This is further backed up by the 'i' debugkey (also
>> attached)
> Right. And with fewer CPUs than HPET channels, you could get
> the system into a mode where each CPU uses a dedicated channel
> ("maxcpus=4", suppressing registration of all the disabled ones).

Does this setup actually mean that there are 8 hpets which are all
broadcasting to every pcpu? The affinities listed in debug-keys 'i'
seem to be towards single pcpus, but the order looks peculiar to say the
least.

>> Keir: (merging your thread back here)
>> I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
>> interrupts, there are rather more than 8 occurrences of
>> handle_hpet_broadcast() in the stack. If the occurrences were just
>> function pointers on the stack, I would expect to see several
>> handle_hpet_broadcast()+0x0/0x268
> Which further hints at some earlier problem. I suppose you don't
> happen to have a dump of that crash, or else you could inspect
> the IRQ descriptors as well as the stack for whether all instances
> came from the same IRQ/vector.
>
> Jan

Sadly no - the crashdump analyser grabbed the double fault IST, rather
than the entire contents of the main stack. I shall extend the analyser
to pick up the main stack as well; it does cross IST boundaries for call
traces.

I shall see how easy it is to make it parse the irq_desc's & friends as
well on crash, although for this case it might be easier just to tweak
the double fault handler.

~Andrew
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-06 13:23 ` Andrew Cooper
@ 2013-08-06 13:57 ` Jan Beulich
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2013-08-06 13:57 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan

>>> On 06.08.13 at 15:23, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 06/08/13 12:44, Jan Beulich wrote:
>>>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.
>> Not all that big.
>>
>>> The full boot xl dmesg is attached, but it appears that there are 8
>>> broadcast hpets. This is further backed up by the 'i' debugkey (also
>>> attached)
>> Right. And with fewer CPUs than HPET channels, you could get
>> the system into a mode where each CPU uses a dedicated channel
>> ("maxcpus=4", suppressing registration of all the disabled ones).
>
> Does this setup actually mean that there are 8 hpets which are all
> broadcasting to every pcpu? The affinities listed in debug-keys 'i'
> seem to be towards single pcpus, but the order looks peculiar to say the
> least.

No, each channel will be used for just one CPU when there are at
least as many channels as CPUs. The difference between not using
said command line option and using it is that in the former case a
new channel would get assigned to a CPU each time it needs one,
while in the latter case a static (pre-)assignment is used, i.e. each
CPU will always use the same single channel.

Jan
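The two assignment modes Jan describes can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not Xen's HPET code: the names `channels_init`, `channel_get_dynamic`, `channel_put` and `channel_get_static`, and the constant `NR_CHANNELS`, are hypothetical and exist purely for illustration.

```c
#include <assert.h>

#define NR_CHANNELS 8     /* hypothetical: matches "HPET: 8 timers" in the log */
#define NO_OWNER    (-1)

/* Hypothetical model of a broadcast channel: only tracks which CPU owns it. */
struct channel {
    int owner;
};

static struct channel channels[NR_CHANNELS];

/* Mark every channel as free. */
void channels_init(void)
{
    for (int i = 0; i < NR_CHANNELS; i++)
        channels[i].owner = NO_OWNER;
}

/*
 * Dynamic mode: a CPU takes whichever channel is free at the moment it
 * needs one, so the same CPU may end up on different channels over time.
 * Returns the channel index, or -1 if none is free.
 */
int channel_get_dynamic(int cpu)
{
    for (int i = 0; i < NR_CHANNELS; i++) {
        if (channels[i].owner == NO_OWNER) {
            channels[i].owner = cpu;
            return i;
        }
    }
    return -1;
}

/* Release a channel obtained from channel_get_dynamic(). */
void channel_put(int idx)
{
    assert(idx >= 0 && idx < NR_CHANNELS);
    channels[idx].owner = NO_OWNER;
}

/*
 * Static mode: CPU n is pre-assigned channel n and always uses it.
 * Only valid when there are at least as many channels as CPUs.
 */
int channel_get_static(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CHANNELS);
    channels[cpu].owner = cpu;
    return cpu;
}
```

In the dynamic mode of this model, a CPU that releases its channel and later asks again may be handed a different index, which would be one possible reason for the channel-to-CPU affinities in a snapshot looking unordered. In the static mode the mapping never changes.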
* Re: Hpet interrupt overflow
2013-08-06 13:57 ` Jan Beulich
@ 2013-08-13 9:03 ` Andrew Cooper
2013-08-13 9:22 ` Tim Deegan
2013-08-13 11:59 ` Jan Beulich
0 siblings, 2 replies; 11+ messages in thread
From: Andrew Cooper @ 2013-08-13 9:03 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Tim Deegan

[-- Attachment #1: Type: text/plain, Size: 1204 bytes --]

On 06/08/13 14:57, Jan Beulich wrote:
>>> Right. And with fewer CPUs than HPET channels, you could get
>>> the system into a mode where each CPU uses a dedicated channel
>>> ("maxcpus=4", suppressing registration of all the disabled ones).
>> Does this setup actually mean that there are 8 hpets which are all
>> broadcasting to every pcpu? The affinities listed in debug-keys 'i'
>> seem to be towards single pcpus, but the order looks peculiar to say the
>> least.
> No, each channel will be used for just one CPU when there are at
> least as many channels as CPUs. The difference between not using
> said command line option and using it is that in the former case a
> new channel would get assigned to a CPU each time it needs one,
> while in the latter case a static (pre-)assignment is used, i.e. each
> CPU will always use the same single channel.
>
> Jan

We had another crash, this time with a proper stack trace. (This was
using an early version of the stack trace improvements series.)

From the stack trace (now correctly with frame pointers), we see 9 calls
to handle_hpet_broadcast().

This indicates that the current logic does not correctly prevent
repeated delivery of interrupts.

~Andrew [-- Attachment #2: stack-trace.log --] [-- Type: text/x-log, Size: 11436 bytes --] (XEN) [2013-08-12 22:57:42] *** DOUBLE FAULT *** (XEN) [2013-08-12 22:57:42] ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]---- (XEN) [2013-08-12 22:57:42] CPU: 2 (XEN) [2013-08-12 22:57:42] RIP: e008:[<ffff82c4c012a578>] _spin_lock_irqsave+0/0x5e (XEN) [2013-08-12 22:57:42] RFLAGS: 0000000000010292 CONTEXT: hypervisor (XEN) [2013-08-12 22:57:42] rax: ffff82c4c01a7b39 rbx: ffff83043f2d6168 rcx: ffff83043f2bab30 (XEN) [2013-08-12 22:57:42] rdx: ffff83043f2dac88 rsi: ffff83043f2ba300 rdi: ffff83043f2ba320 (XEN) [2013-08-12 22:57:42] rbp: ffff83043f2d6068 rsp: ffff83043f2d6000 r8: 0000000000000000 (XEN) [2013-08-12 22:57:42] r9: 0000000000000000 r10: ffff83043f2d76f0 r11: 0000000000000000 (XEN) [2013-08-12 22:57:42] r12: ffff83043f2ba300 r13: 0000000000000073 r14: ffff83043f281e24 (XEN) [2013-08-12 22:57:42] r15: ffff83043f281e00 cr0: 000000008005003b cr4: 00000000000026f0 (XEN) [2013-08-12 22:57:42] cr3: 000000041e04e000 cr2: ffff83043f2d5ff8 (XEN) [2013-08-12 22:57:42] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) [2013-08-12 22:57:42] Valid stack range: ffff83043f2d6000-ffff83043f2d8000, sp=ffff83043f2d6000, tss.esp0=ffff83043f2d7fc0 (XEN) [2013-08-12 22:57:42] Xen stack overflow (dumping trace ffff83043f2d6000-ffff83043f2d8000): (XEN) [2013-08-12 22:57:42] ffff83043f2d6008: [<ffff82c4c01a7b56>] handle_hpet_broadcast+0x1d/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d6078: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40 (XEN) [2013-08-12 22:57:42] ffff83043f2d6088: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d6168: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d61f0: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) [2013-08-12 22:57:42] ffff83043f2d6228: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d6298: 
[<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40 (XEN) [2013-08-12 22:57:42] ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65 (XEN) [2013-08-12 22:57:42] ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d6410: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) [2013-08-12 22:57:42] ffff83043f2d6448: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d6470: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65 (XEN) [2013-08-12 22:57:42] ffff83043f2d64b8: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40 (XEN) [2013-08-12 22:57:42] ffff83043f2d64c8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d64d8: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d6518: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d65a8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d6600: [<ffff82c4c01a7b39>] handle_hpet_broadcast+0/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d6630: [<ffff82c4c012a583>] _spin_lock_irqsave+0xb/0x5e (XEN) [2013-08-12 22:57:42] ffff83043f2d6678: [<ffff82c4c01a7b56>] handle_hpet_broadcast+0x1d/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d66e8: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40 (XEN) [2013-08-12 22:57:42] ffff83043f2d66f8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d6758: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d67d8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d6860: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) [2013-08-12 22:57:42] ffff83043f2d6898: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268 (XEN) [2013-08-12 22:57:42] 
ffff83043f2d6908: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40 (XEN) [2013-08-12 22:57:42] ffff83043f2d6918: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d6988: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d69b8: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d69f8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d6a80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) [2013-08-12 22:57:42] ffff83043f2d6ab8: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d6b28: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40 (XEN) [2013-08-12 22:57:42] ffff83043f2d6b38: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d6b58: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d6ba8: [<ffff82c4c01a7ce9>] handle_hpet_broadcast+0x1b0/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d6c18: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d6ca0: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42 (XEN) [2013-08-12 22:57:42] ffff83043f2d6cd8: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268 (XEN) [2013-08-12 22:57:42] ffff83043f2d6d48: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40 (XEN) [2013-08-12 22:57:42] ffff83043f2d6d58: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d6dc8: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d6e38: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d6ec0: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41 (XEN) [2013-08-12 22:57:42] ffff83043f2d6ee8: [<ffff82c4c017071a>] do_IRQ+0xae8/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d6f00: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65 (XEN) [2013-08-12 
22:57:42] ffff83043f2d6f78: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba (XEN) [2013-08-12 22:57:42] ffff83043f2d6fc8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70 (XEN) [2013-08-12 22:57:42] ffff83043f2d7050: [<ffff82c4c01f0d3f>] ept_get_entry+0x2f/0x239 (XEN) [2013-08-12 22:57:42] ffff83043f2d7078: [<ffff82c4c01f0d3f>] ept_get_entry+0x2f/0x239 (XEN) [2013-08-12 22:57:42] ffff83043f2d70b8: [<ffff82c4c016fcda>] do_IRQ+0xa8/0xbc7 (XEN) [2013-08-12 22:57:42] ffff83043f2d7108: [<ffff82c4c01e97a7>] __get_gfn_type_access+0x12b/0x20e (XEN) [2013-08-12 22:57:42] ffff83043f2d7158: [<ffff82c4c01e9fd2>] get_page_from_gfn_p2m+0xc8/0x25d (XEN) [2013-08-12 22:57:42] ffff83043f2d71c8: [<ffff82c4c01f4970>] map_domain_gfn_3_levels+0x43/0x13a (XEN) [2013-08-12 22:57:42] ffff83043f2d7208: [<ffff82c4c01f4bf2>] guest_walk_tables_3_levels+0x18b/0x489 (XEN) [2013-08-12 22:57:42] ffff83043f2d7248: [<ffff82c4c01f0f37>] ept_get_entry+0x227/0x239 (XEN) [2013-08-12 22:57:42] ffff83043f2d7288: [<ffff82c4c0223c98>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306 (XEN) [2013-08-12 22:57:42] ffff83043f2d7338: [<ffff82c4c0223e45>] hap_gva_to_gfn_3_levels+0x1f/0x2a (XEN) [2013-08-12 22:57:42] ffff83043f2d7348: [<ffff82c4c01ebf2e>] paging_gva_to_gfn+0xb6/0xcc (XEN) [2013-08-12 22:57:42] ffff83043f2d7398: [<ffff82c4c01b72e6>] hvmemul_linear_to_phys+0xf3/0x24f (XEN) [2013-08-12 22:57:42] ffff83043f2d7418: [<ffff82c4c01b801f>] __hvmemul_read+0x179/0x1c8 (XEN) [2013-08-12 22:57:42] ffff83043f2d7498: [<ffff82c4c01b80c1>] hvmemul_read+0x12/0x14 (XEN) [2013-08-12 22:57:42] ffff83043f2d74a8: [<ffff82c4c0193aa9>] read_ulong+0xe/0x10 (XEN) [2013-08-12 22:57:42] ffff83043f2d74b8: [<ffff82c4c0196338>] x86_emulate+0x1df1/0x11309 (XEN) [2013-08-12 22:57:42] ffff83043f2d7510: [<ffff82c4c01b806e>] hvmemul_insn_fetch+0/0x41 (XEN) [2013-08-12 22:57:42] ffff83043f2d7530: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309 (XEN) [2013-08-12 22:57:42] ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309 (XEN) 
[2013-08-12 22:57:42] ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e (XEN) [2013-08-12 22:57:42] ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124 (XEN) [2013-08-12 22:57:42] ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde (XEN) [2013-08-12 22:57:42] ffff83043f2d7788: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde (XEN) [2013-08-12 22:57:42] ffff83043f2d77b8: [<ffff82c4c01f0f37>] ept_get_entry+0x227/0x239 (XEN) [2013-08-12 22:57:42] ffff83043f2d7848: [<ffff82c4c017788f>] get_page+0x27/0xf2 (XEN) [2013-08-12 22:57:42] ffff83043f2d7898: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde (XEN) [2013-08-12 22:57:42] ffff83043f2d78c8: [<ffff82c4c01f0f37>] ept_get_entry+0x227/0x239 (XEN) [2013-08-12 22:57:42] ffff83043f2d7a98: [<ffff82c4c01b8260>] hvm_emulate_one+0x127/0x1bf (XEN) [2013-08-12 22:57:42] ffff83043f2d7aa8: [<ffff82c4c01b6f1b>] hvmemul_get_seg_reg+0x49/0x60 (XEN) [2013-08-12 22:57:42] ffff83043f2d7ae8: [<ffff82c4c01c3bc5>] handle_mmio+0x55/0x1f0 (XEN) [2013-08-12 22:57:42] ffff83043f2d7b10: [<ffff82c4c01b8260>] hvm_emulate_one+0x127/0x1bf (XEN) [2013-08-12 22:57:42] ffff83043f2d7b20: [<ffff82c4c01b6f1b>] hvmemul_get_seg_reg+0x49/0x60 (XEN) [2013-08-12 22:57:42] ffff83043f2d7c48: [<ffff82c4c01e9700>] __get_gfn_type_access+0x84/0x20e (XEN) [2013-08-12 22:57:42] ffff83043f2d7c98: [<ffff82c4c01bcff3>] hvm_hap_nested_page_fault+0x25d/0x456 (XEN) [2013-08-12 22:57:42] ffff83043f2d7d18: [<ffff82c4c01e1557>] vmx_vmexit_handler+0x140a/0x17ba (XEN) [2013-08-12 22:57:42] ffff83043f2d7d30: [<ffff82c4c01be8c5>] hvm_do_resume+0xc6/0x1b7 (XEN) [2013-08-12 22:57:42] ffff83043f2d7da8: [<ffff82c4c01ce19c>] vpic_get_highest_priority_irq+0xaa/0xc6 (XEN) [2013-08-12 22:57:42] ffff83043f2d7db8: [<ffff82c4c015f972>] vcpu_kick+0x20/0x6c (XEN) [2013-08-12 22:57:42] ffff83043f2d7dd8: [<ffff82c4c01ce22f>] vpic_update_int_output+0x77/0xa2 (XEN) [2013-08-12 22:57:43] ffff83043f2d7df8: [<ffff82c4c01ce363>] 
vpic_irq_positive_edge+0x80/0x85 (XEN) [2013-08-12 22:57:43] ffff83043f2d7e18: [<ffff82c4c01c4b30>] assert_irq+0x27/0x32 (XEN) [2013-08-12 22:57:43] ffff83043f2d7e38: [<ffff82c4c01c4bca>] hvm_isa_irq_assert+0x8f/0xa4 (XEN) [2013-08-12 22:57:43] ffff83043f2d7e58: [<ffff82c4c01cb3d0>] vlapic_accept_pic_intr+0x21/0x2b (XEN) [2013-08-12 22:57:43] ffff83043f2d7e68: [<ffff82c4c01cf86d>] pt_update_irq+0x267/0x2ea (XEN) [2013-08-12 22:57:43] ffff83043f2d7e78: [<ffff82c4c01cb3d0>] vlapic_accept_pic_intr+0x21/0x2b (XEN) [2013-08-12 22:57:43] ffff83043f2d7e88: [<ffff82c4c01bd239>] hvm_interrupt_blocked+0x4d/0xe9 (XEN) [2013-08-12 22:57:43] ffff83043f2d7ec8: [<ffff82c4c01defa9>] vmx_vmenter_helper+0x60/0x139 (XEN) [2013-08-12 22:57:43] ffff83043f2d7f18: [<ffff82c4c01e7739>] vmx_asm_do_vmentry+0/0xe7 (XEN) [2013-08-12 22:57:43] (XEN) [2013-08-12 22:57:43] (XEN) [2013-08-12 22:57:43] **************************************** (XEN) [2013-08-12 22:57:43] Panic on CPU 2: (XEN) [2013-08-12 22:57:43] DOUBLE FAULT -- system shutdown (XEN) [2013-08-12 22:57:43] **************************************** (XEN) [2013-08-12 22:57:43] (XEN) [2013-08-12 22:57:43] Reboot in five seconds... (XEN) [2013-08-12 22:57:43] Executing crash image [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: Hpet interrupt overflow
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
@ 2013-08-13 9:22 ` Tim Deegan
2013-08-13 9:33 ` Andrew Cooper
2013-08-13 11:59 ` Jan Beulich
1 sibling, 1 reply; 11+ messages in thread
From: Tim Deegan @ 2013-08-13 9:22 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Jan Beulich

Hi,

At 10:03 +0100 on 13 Aug (1376388226), Andrew Cooper wrote:
> We had another crash, this time with a proper stack trace. (This was
> using an early version of the stack trace improvements series.)
>
> From the stack trace (now correctly with frame pointers), we see 9 calls
> to handle_hpet_broadcast().

Hmmm. I don't think this can be following frame pointers -- or if it is,
something very odd is happening here:

  ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
  ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
  ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70

and here:

  ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
  ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e
  ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124
  ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde

Tim.
* Re: Hpet interrupt overflow
2013-08-13 9:22 ` Tim Deegan
@ 2013-08-13 9:33 ` Andrew Cooper
0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2013-08-13 9:33 UTC (permalink / raw)
To: Tim Deegan; +Cc: xen-devel, Keir Fraser, Jan Beulich

On 13/08/13 10:22, Tim Deegan wrote:
> Hi,
>
> At 10:03 +0100 on 13 Aug (1376388226), Andrew Cooper wrote:
>> We had another crash, this time with a proper stack trace. (This was
>> using an early version of the stack trace improvements series.)
>>
>> From the stack trace (now correctly with frame pointers), we see 9 calls
>> to handle_hpet_broadcast().
> Hmmm. I don't think this can be following frame pointers -- or if it is,
> something very odd is happening here:
>
>   ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
>   ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
>   ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
>
> and here:
>
>   ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
>   ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e
>   ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124
>   ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde
>
> Tim.

Hmm yes. I will double check the frame pointer through exception frame
logic.

~Andrew
* Re: Hpet interrupt overflow
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
2013-08-13 9:22 ` Tim Deegan
@ 2013-08-13 11:59 ` Jan Beulich
1 sibling, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2013-08-13 11:59 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan

>>> On 13.08.13 at 11:03, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> We had another crash, this time with a proper stack trace. (This was
> using an early version of the stack trace improvements series.)
>
> From the stack trace (now correctly with frame pointers), we see 9 calls
> to handle_hpet_broadcast().
>
> This indicates that the current logic does not correctly prevent
> repeated delivery of interrupts.

And this was with a 1:1 CPU <-> HPET channel mapping (not visible
from just the stack trace)?

In any case, could you try moving the call to ack_APIC_irq() from
hpet_msi_ack() to hpet_msi_end() (the latter may need to be
re-created depending on the Xen version you do this with)?

Or, as another alternative, call hpet_msi_{,un}mask() from the two
functions respectively (albeit I think this might result in lost
interrupts).

Potentially hpet_msi_end() would then also need to disable
interrupts before doing either of these actions.

Jan
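Why the position of the acknowledgement could matter for nesting depth can be shown with a toy model. This is a sketch under loud assumptions and not Xen code: `lapic_model`, `ack_policy`, `handler`, `try_deliver` and `max_nesting` are hypothetical names, the model tracks a single vector with one in-service flag, and it assumes the handler body runs with interrupts enabled and that a further interrupt is always waiting. Real local APIC priority handling is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: where the handler signals end-of-interrupt (EOI). */
enum ack_policy { ACK_AT_START, ACK_AT_END };

/* Hypothetical single-vector model of interrupt delivery. */
struct lapic_model {
    bool in_service;  /* set on delivery, cleared by the EOI */
    int  pending;     /* interrupts still waiting to be delivered */
    int  depth;       /* current handler nesting */
    int  max_depth;   /* deepest nesting seen so far */
};

static void try_deliver(struct lapic_model *m, enum ack_policy policy);

/* Model of the interrupt handler, with the EOI placed per policy. */
static void handler(struct lapic_model *m, enum ack_policy policy)
{
    m->depth++;
    if (m->depth > m->max_depth)
        m->max_depth = m->depth;

    if (policy == ACK_AT_START)
        m->in_service = false;   /* EOI before the work is done */

    /* Handler body runs with interrupts enabled: delivery window. */
    try_deliver(m, policy);

    if (policy == ACK_AT_END)
        m->in_service = false;   /* EOI only after the work is done */

    m->depth--;
}

/* Deliver one pending interrupt unless the vector is still in service. */
static void try_deliver(struct lapic_model *m, enum ack_policy policy)
{
    if (m->pending > 0 && !m->in_service) {
        m->pending--;
        m->in_service = true;
        handler(m, policy);
    }
}

/* Run the model and report the deepest handler nesting reached. */
int max_nesting(enum ack_policy policy, int interrupts)
{
    struct lapic_model m = { false, interrupts, 0, 0 };

    while (m.pending > 0)
        try_deliver(&m, policy);

    assert(m.depth == 0);
    return m.max_depth;
}
```

In this model, an early EOI lets every waiting interrupt stack on top of the previous handler invocation, so the nesting depth grows with the number of interrupts. A late EOI keeps the vector in service for the whole handler and limits the depth to one. Whether the same holds on the affected machine is exactly what the experiment proposed above would have to confirm.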
end of thread, other threads:[~2013-08-13 11:59 UTC | newest]

Thread overview: 11+ messages
2013-08-05 20:38 [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler Andrew Cooper
2013-08-06 4:49 ` Keir Fraser
2013-08-06 8:01 ` Jan Beulich
2013-08-06 10:32 ` Andrew Cooper
2013-08-06 11:44 ` Jan Beulich
2013-08-06 13:23 ` Andrew Cooper
2013-08-06 13:57 ` Jan Beulich
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
2013-08-13 9:22 ` Tim Deegan
2013-08-13 9:33 ` Andrew Cooper
2013-08-13 11:59 ` Jan Beulich