* [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
From: Andrew Cooper @ 2013-08-05 20:38 UTC
To: Xen-devel; +Cc: Andrew Cooper, Keir Fraser, Jan Beulich, Tim Deegan
Automated testing on Xen-4.3 testing tip found an interesting issue:
(XEN) *** DOUBLE FAULT ***
(XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f
(XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: 0000000000000020 rcx: 0000000000000100
(XEN) rdx: ffff82c4c032dfc0 rsi: ffff83043f2c6068 rdi: ffff83043f2c6008
(XEN) rbp: ffff83043f2c6048 rsp: ffff83043f2c6000 r8: 0000000000000001
(XEN) r9: 0000000000000000 r10: ffff83043f2c76f0 r11: 0000000000000000
(XEN) r12: ffff83043f2c6008 r13: 7fffffffffffffff r14: ffff83043f2c6068
(XEN) r15: 000003343036797b cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) cr3: 0000000403c40000 cr2: ffff83043f2c5ff8
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0
(XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000):
(XEN) ffff83043f2c6008: [<ffff82c4c01aa73d>] cpuidle_wakeup_mwait+0x2d/0xba
(XEN) ffff83043f2c6058: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c60c8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c60d8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c61a8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6230: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6268: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c62d8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c62e8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c63b8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6440: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6478: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6498: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c64e8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c64f8: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6550: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
(XEN) ffff83043f2c6568: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c65c8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6650: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6688: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c66f8: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6708: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c67d8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6860: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6898: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6908: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6918: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6938: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c6988: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c69e8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6a70: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6aa8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6b18: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6c80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
(XEN) ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
(XEN) ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
(XEN) ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
(XEN) ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f
(XEN) ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c6f28: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
(XEN) ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d
(XEN) ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde
(XEN) ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239
(XEN) ffff83043f2c7108: [<ffff82c4c01e9497>] __get_gfn_type_access+0x12b/0x20e
(XEN) ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d
(XEN) ffff83043f2c71c8: [<ffff82c4c01f4660>] map_domain_gfn_3_levels+0x43/0x13a
(XEN) ffff83043f2c7208: [<ffff82c4c01f4b6b>] guest_walk_tables_3_levels+0x414/0x489
(XEN) ffff83043f2c7288: [<ffff82c4c0223988>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306
(XEN) ffff83043f2c7338: [<ffff82c4c0223b35>] hap_gva_to_gfn_3_levels+0x1f/0x2a
(XEN) ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc
(XEN) ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d
(XEN) ffff83043f2c73c8: [<ffff82c4c01b6d34>] hvmemul_virtual_to_linear+0x102/0x153
(XEN) ffff83043f2c7408: [<ffff82c4c01c1538>] hvm_copy_from_guest_virt+0x15/0x17
(XEN) ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8
(XEN) ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14
(XEN) ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10
(XEN) ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309
(XEN) ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c75c0: [<ffff82c4c012a100>] _spin_trylock_recursive+0x63/0x93
(XEN) ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
(XEN) ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) ffff83043f2c7668: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
(XEN) ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN) ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2
(XEN) ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
(XEN) ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
(XEN) ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf
(XEN) ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60
(XEN) ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0
(XEN) ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb
(XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
(XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
(XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
(XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
(XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
(XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
(XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
(XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
(XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
(XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
(XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
(XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) DOUBLE FAULT -- system shutdown
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
The hpet interrupt handler runs with interrupts enabled, due to the
spin_unlock_irq() in:
    while ( desc->status & IRQ_PENDING )
    {
        desc->status &= ~IRQ_PENDING;
        spin_unlock_irq(&desc->lock);
        tsc_in = tb_init_done ? get_cycles() : 0;
        action->handler(irq, action->dev_id, regs);
        TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
        spin_lock_irq(&desc->lock);
    }
in do_IRQ().
Clearly there are cases where HPET interrupts arrive faster than
handle_hpet_broadcast() can process them, I presume in part because of the
large amount of cpumask manipulation.
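For a sense of scale, the numbers in the trace above can be turned into a rough bound on how much nesting the stack tolerates. The sketch below is plain arithmetic on addresses copied from the trace. The macro and function names are made up for illustration, and treating the distance between two consecutive handle_hpet_broadcast+0x5b entries as the cost of one nesting level is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* "Valid stack range" printed by Xen in the trace. */
#define STACK_BASE 0xffff83043f2c6000ULL
#define STACK_TOP  0xffff83043f2c8000ULL

/* Stack slots of two consecutive handle_hpet_broadcast+0x5b entries. */
#define NEST_LOWER 0xffff83043f2c6268ULL
#define NEST_UPPER 0xffff83043f2c6478ULL

/* Total size of the stack range in bytes. */
uint64_t stack_bytes(void)
{
    return STACK_TOP - STACK_BASE;
}

/* Assumed stack cost of one nested interrupt (common_interrupt ->
 * do_IRQ -> hpet_interrupt_handler -> handle_hpet_broadcast). */
uint64_t bytes_per_nesting(void)
{
    return NEST_UPPER - NEST_LOWER;
}

/* Upper bound on nesting levels if the whole stack were available. */
uint64_t max_nesting_levels(void)
{
    assert(bytes_per_nesting() != 0);
    return stack_bytes() / bytes_per_nesting();
}
```

With these values the stack range is 8192 bytes and one nesting level costs 528 bytes, so at most 15 levels fit, and fewer in practice because the VMEXIT path already occupies part of the stack.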
One solution, presented in this patch, is to disable interrupts while running
the hpet event handler, but the patch is RFC because I don't really like it as
it feels a little bit like a hack.
handle_hpet_broadcast() is clearly too long as an interrupt handler, and
perhaps effort should be made to reduce it if possible.
Then again, interrupt handlers for this (and other Xen-consumed interrupts?)
should probably be run with interrupts disabled, to prevent exactly these
kinds of problems.
Thoughts/comments?
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
---
xen/arch/x86/hpet.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/xen/arch/x86/hpet.c b/xen/arch/x86/hpet.c
index 946d133..46e14a9 100644
--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -229,7 +229,9 @@ static void hpet_interrupt_handler(int irq, void *data,
         return;
     }
 
+    local_irq_disable();
     ch->event_handler(ch);
+    local_irq_enable();
 }
 
 static void hpet_msi_unmask(struct irq_desc *desc)
--
1.7.10.4
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
From: Keir Fraser @ 2013-08-06 4:49 UTC
To: Andrew Cooper, Xen-devel; +Cc: Tim Deegan, Jan Beulich
On 05/08/2013 21:38, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> The hpet interrupt handler runs with interrupts enabled, due to this the
> spin_unlock_irq() in:
>
> while ( desc->status & IRQ_PENDING )
> {
> desc->status &= ~IRQ_PENDING;
> spin_unlock_irq(&desc->lock);
> tsc_in = tb_init_done ? get_cycles() : 0;
> action->handler(irq, action->dev_id, regs);
> TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
> spin_lock_irq(&desc->lock);
> }
>
> in do_IRQ().
But the handler sets IRQ_INPROGRESS before entering this loop, which should
prevent reentry of the irq handler (hpet_interrupt_handler). So I don't see
how the multiple reentering of hpet_interrupt_handler in the call trace is
possible.
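To make the reentry argument concrete, here is a minimal model of the IRQ_PENDING / IRQ_INPROGRESS protocol. It is a sketch under assumptions, not the Xen implementation: the flag values, struct model_desc and all model_* names are hypothetical, and the lock and real interrupt delivery are replaced by a direct recursive call that stands in for the same IRQ firing while its handler runs.

```c
#include <assert.h>

/* Flag values are placeholders, not the ones Xen uses. */
#define IRQ_INPROGRESS 0x1u
#define IRQ_PENDING    0x2u

struct model_desc {
    unsigned int status;
    int depth;      /* current handler nesting for this IRQ */
    int max_depth;  /* deepest nesting observed */
    int runs;       /* number of handler invocations */
    int reraise;    /* how often the IRQ fires again mid-handler */
};

void model_do_irq(struct model_desc *desc);

void model_handler(struct model_desc *desc)
{
    desc->depth++;
    desc->runs++;
    if (desc->depth > desc->max_depth)
        desc->max_depth = desc->depth;

    /* Same IRQ arrives again while the handler runs with interrupts on. */
    if (desc->reraise > 0) {
        desc->reraise--;
        model_do_irq(desc);
    }
    desc->depth--;
}

void model_do_irq(struct model_desc *desc)
{
    desc->status |= IRQ_PENDING;
    if (desc->status & IRQ_INPROGRESS)
        return; /* the outer invocation will notice IRQ_PENDING */

    desc->status |= IRQ_INPROGRESS;
    while (desc->status & IRQ_PENDING) {
        desc->status &= ~IRQ_PENDING;
        model_handler(desc);
    }
    desc->status &= ~IRQ_INPROGRESS;
}

/* Run the model with the given number of mid-handler re-raises. */
struct model_desc model_run(int reraise)
{
    struct model_desc desc = { 0, 0, 0, 0, reraise };
    model_do_irq(&desc);
    return desc;
}
```

In this model the handler for a single descriptor never nests: re-raised interrupts are replayed by the outer loop instead. The guard is per descriptor, so it says nothing about handlers for other IRQs.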
-- Keir
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
From: Jan Beulich @ 2013-08-06 8:01 UTC
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan
>>> On 05.08.13 at 22:38, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Automated testing on Xen-4.3 testing tip found an interesting issue
>
> (XEN) *** DOUBLE FAULT ***
> (XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
The call trace is suspicious in ways beyond what Keir already
pointed out - with debug=y, there shouldn't be bogus entries listed,
yet ...
> (XEN) CPU: 3
> (XEN) RIP: e008:[<ffff82c4c01003d0>] __bitmap_and+0/0x3f
> (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: 0000000000000020 rcx: 0000000000000100
> (XEN) rdx: ffff82c4c032dfc0 rsi: ffff83043f2c6068 rdi: ffff83043f2c6008
> (XEN) rbp: ffff83043f2c6048 rsp: ffff83043f2c6000 r8: 0000000000000001
> (XEN) r9: 0000000000000000 r10: ffff83043f2c76f0 r11: 0000000000000000
> (XEN) r12: ffff83043f2c6008 r13: 7fffffffffffffff r14: ffff83043f2c6068
> (XEN) r15: 000003343036797b cr0: 0000000080050033 cr4: 00000000000026f0
> (XEN) cr3: 0000000403c40000 cr2: ffff83043f2c5ff8
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
> (XEN) Valid stack range: ffff83043f2c6000-ffff83043f2c8000, sp=ffff83043f2c6000, tss.esp0=ffff83043f2c7fc0
> (XEN) Xen stack overflow (dumping trace ffff83043f2c6000-ffff83043f2c8000):
[... removed redundant stuff]
> (XEN) ffff83043f2c6b28: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
> (XEN) ffff83043f2c6bf8: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c6c80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
> (XEN) ffff83043f2c6cb8: [<ffff82c4c01a78d4>] handle_hpet_broadcast+0x5b/0x268
> (XEN) ffff83043f2c6d28: [<ffff82c4c01a7b41>] hpet_interrupt_handler+0x3e/0x40
> (XEN) ffff83043f2c6d38: [<ffff82c4c0170500>] do_IRQ+0x99a/0xa4f
> (XEN) ffff83043f2c6e08: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c6e90: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
> (XEN) ffff83043f2c6eb8: [<ffff82c4c01704d6>] do_IRQ+0x970/0xa4f
> (XEN) ffff83043f2c6ed8: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
> (XEN) ffff83043f2c6f28: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
> (XEN) ffff83043f2c6f88: [<ffff82c4c016806f>] common_interrupt+0x5f/0x70
> (XEN) ffff83043f2c7010: [<ffff82c4c0164f94>] unmap_domain_page+0x6/0x32d
> (XEN) ffff83043f2c7048: [<ffff82c4c01ef69d>] ept_next_level+0x9c/0xde
> (XEN) ffff83043f2c7078: [<ffff82c4c01f0ab3>] ept_get_entry+0xb3/0x239
> (XEN) ffff83043f2c7108: [<ffff82c4c01e9497>] __get_gfn_type_access+0x12b/0x20e
> (XEN) ffff83043f2c7158: [<ffff82c4c01e9cc2>] get_page_from_gfn_p2m+0xc8/0x25d
> (XEN) ffff83043f2c71c8: [<ffff82c4c01f4660>] map_domain_gfn_3_levels+0x43/0x13a
> (XEN) ffff83043f2c7208: [<ffff82c4c01f4b6b>] guest_walk_tables_3_levels+0x414/0x489
> (XEN) ffff83043f2c7288: [<ffff82c4c0223988>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306
> (XEN) ffff83043f2c7338: [<ffff82c4c0223b35>] hap_gva_to_gfn_3_levels+0x1f/0x2a
> (XEN) ffff83043f2c7348: [<ffff82c4c01ebc1e>] paging_gva_to_gfn+0xb6/0xcc
> (XEN) ffff83043f2c7398: [<ffff82c4c01bedf2>] __hvm_copy+0x57/0x36d
> (XEN) ffff83043f2c73c8: [<ffff82c4c01b6d34>] hvmemul_virtual_to_linear+0x102/0x153
> (XEN) ffff83043f2c7408: [<ffff82c4c01c1538>] hvm_copy_from_guest_virt+0x15/0x17
> (XEN) ffff83043f2c7418: [<ffff82c4c01b7cd3>] __hvmemul_read+0x12d/0x1c8
> (XEN) ffff83043f2c7498: [<ffff82c4c01b7dc1>] hvmemul_read+0x12/0x14
> (XEN) ffff83043f2c74a8: [<ffff82c4c01937e9>] read_ulong+0xe/0x10
> (XEN) ffff83043f2c74b8: [<ffff82c4c0195924>] x86_emulate+0x169d/0x11309
... how would this end up getting called from do_IRQ()?
> (XEN) ffff83043f2c7558: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
> (XEN) ffff83043f2c75c0: [<ffff82c4c012a100>] _spin_trylock_recursive+0x63/0x93
> (XEN) ffff83043f2c75d8: [<ffff82c4c0170564>] do_IRQ+0x9fe/0xa4f
> (XEN) ffff83043f2c7618: [<ffff82c4c01aa7bd>] cpuidle_wakeup_mwait+0xad/0xba
> (XEN) ffff83043f2c7668: [<ffff82c4c01a7a29>] handle_hpet_broadcast+0x1b0/0x268
> (XEN) ffff83043f2c76c8: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c7788: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c77b8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
> (XEN) ffff83043f2c7848: [<ffff82c4c01775ef>] get_page+0x27/0xf2
> (XEN) ffff83043f2c7898: [<ffff82c4c01ef6a5>] ept_next_level+0xa4/0xde
> (XEN) ffff83043f2c78c8: [<ffff82c4c01f0c27>] ept_get_entry+0x227/0x239
> (XEN) ffff83043f2c7a98: [<ffff82c4c01b7f60>] hvm_emulate_one+0x127/0x1bf
> (XEN) ffff83043f2c7aa8: [<ffff82c4c01b6c1b>] hvmemul_get_seg_reg+0x49/0x60
> (XEN) ffff83043f2c7ae8: [<ffff82c4c01c38c5>] handle_mmio+0x55/0x1f0
> (XEN) ffff83043f2c7b38: [<ffff82c4c0108208>] do_event_channel_op+0/0x10cb
And this one looks bogus too. Question therefore is whether the
problem you describe isn't a consequence of an earlier issue.
> (XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
> (XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
> (XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
> (XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
> (XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
> (XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
> (XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
> (XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
> (XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
> (XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
> (XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
> (XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
> (XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 3:
> (XEN) DOUBLE FAULT -- system shutdown
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
> The hpet interrupt handler runs with interrupts enabled, due to this the
> spin_unlock_irq() in:
>
> while ( desc->status & IRQ_PENDING )
> {
> desc->status &= ~IRQ_PENDING;
> spin_unlock_irq(&desc->lock);
> tsc_in = tb_init_done ? get_cycles() : 0;
> action->handler(irq, action->dev_id, regs);
> TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
> spin_lock_irq(&desc->lock);
> }
>
> in do_IRQ().
>
> Clearly there are cases where the frequency of the HPET interrupt is faster
> than the time it takes to process handle_hpet_broadcast(), I presume in part
> because of the large amount of cpumask manipulation.
How many CPUs (and how many usable HPET channels) does the
system have that this crash was observed on?
Jan
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
From: Andrew Cooper @ 2013-08-06 10:32 UTC
To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Tim Deegan
On 06/08/13 09:01, Jan Beulich wrote:
>>>> On 05.08.13 at 22:38, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Automated testing on Xen-4.3 testing tip found an interesting issue
>>
>> (XEN) *** DOUBLE FAULT ***
>> (XEN) ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
> The call trace is suspicious in ways beyond what Keir already
> pointed out - with debug=y, there shouldn't be bogus entries listed,
> yet ...
show_stack_overflow() doesn't have a debug case which follows frame
pointers. I shall submit a patch for this presently, and put it into
XenServer in the hope of getting a better stack trace in the future.
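The difference between the two ways of dumping a stack can be shown on a toy model. This is only a sketch and does not reproduce show_stack_overflow(): the stack is an array of words, saved frame pointers are array indices rather than addresses, and TEXT_START, TEXT_END, demo_stack and both function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range of code addresses. */
#define TEXT_START 0x1000u
#define TEXT_END   0x2000u

int is_text(uintptr_t v)
{
    return v >= TEXT_START && v < TEXT_END;
}

/* Scan every word and count anything that looks like a code address. */
size_t scan_stack(const uintptr_t *stack, size_t nwords)
{
    size_t hits = 0;

    for (size_t i = 0; i < nwords; i++)
        if (is_text(stack[i]))
            hits++;
    return hits;
}

/* Follow the frame chain: stack[fp] holds the index of the caller's
 * frame, stack[fp + 1] holds the return address of this frame. */
size_t walk_frames(const uintptr_t *stack, size_t nwords, size_t fp)
{
    size_t hits = 0;

    while (nwords >= 2 && fp < nwords - 1) {
        if (is_text(stack[fp + 1]))
            hits++;
        if (stack[fp] <= fp)
            break; /* the chain must move towards the stack top */
        fp = stack[fp];
    }
    return hits;
}

/* Three real frames plus one stale code pointer left in a local slot. */
const uintptr_t demo_stack[8] = {
    3, 0x1010,  /* frame 0: saved fp, return address */
    0x1500,     /* stale value that happens to point into text */
    6, 0x1020,  /* frame 1 */
    42,         /* ordinary local */
    0, 0x1030   /* frame 2: saved fp of 0 ends the chain */
};
```

On demo_stack the word scan reports four entries, including the stale 0x1500, while the frame walk reports only the three return addresses.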
<snip>
> And this one looks bogus too. Question therefore is whether the
> problem you describe isn't a consequence of an earlier issue.
There is nothing apparently interesting preceding the crash. Just some
spew from an HVM domain using the 0x39 debug port.
>
>> (XEN) ffff83043f2c7b48: [<ffff82c4c0128bb3>] vcpu_unblock+0x4b/0x4d
>> (XEN) ffff83043f2c7c48: [<ffff82c4c01e9400>] __get_gfn_type_access+0x94/0x20e
>> (XEN) ffff83043f2c7c98: [<ffff82c4c01bccf3>] hvm_hap_nested_page_fault+0x25d/0x456
>> (XEN) ffff83043f2c7d18: [<ffff82c4c01e1257>] vmx_vmexit_handler+0x140a/0x17ba
>> (XEN) ffff83043f2c7d30: [<ffff82c4c01be519>] hvm_do_resume+0x1a/0x1b7
>> (XEN) ffff83043f2c7d60: [<ffff82c4c01dae73>] vmx_do_resume+0x13b/0x15a
>> (XEN) ffff83043f2c7da8: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN) ffff83043f2c7e20: [<ffff82c4c0128091>] schedule+0x82a/0x839
>> (XEN) ffff83043f2c7e50: [<ffff82c4c012a1e1>] _spin_lock+0x11/0x48
>> (XEN) ffff83043f2c7e68: [<ffff82c4c01cb132>] vlapic_has_pending_irq+0x3f/0x85
>> (XEN) ffff83043f2c7e88: [<ffff82c4c01c50a7>] hvm_vcpu_has_pending_irq+0x9b/0xcd
>> (XEN) ffff83043f2c7ec8: [<ffff82c4c01deca9>] vmx_vmenter_helper+0x60/0x139
>> (XEN) ffff83043f2c7f18: [<ffff82c4c01e7439>] vmx_asm_do_vmentry+0/0xe7
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 3:
>> (XEN) DOUBLE FAULT -- system shutdown
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>>
>> The hpet interrupt handler runs with interrupts enabled, due to this the
>> spin_unlock_irq() in:
>>
>> while ( desc->status & IRQ_PENDING )
>> {
>> desc->status &= ~IRQ_PENDING;
>> spin_unlock_irq(&desc->lock);
>> tsc_in = tb_init_done ? get_cycles() : 0;
>> action->handler(irq, action->dev_id, regs);
>> TRACE_3D(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
>> spin_lock_irq(&desc->lock);
>> }
>>
>> in do_IRQ().
>>
>> Clearly there are cases where the frequency of the HPET interrupt is faster
>> than the time it takes to process handle_hpet_broadcast(), I presume in part
>> because of the large amount of cpumask manipulation.
> How many CPUs (and how many usable HPET channels) does the
> system have that this crash was observed on?
>
> Jan
The machine we found this crash on is a Dell R310. 4 CPUs, 16G RAM.
The full boot xl dmesg is attached, but it appears that there are 8
broadcast hpets. This is further backed up by the 'i' debugkey (also
attached).
Keir: (merging your thread back here)
I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
interrupts, there are rather more than 8 occurrences of
handle_hpet_broadcast() in the stack. If the occurrences were just
function pointers on the stack, I would expect to see several
handle_hpet_broadcast()+0x0/0x268 entries.
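The offset check can be applied mechanically to the trace entries. The helper below is a hypothetical sketch (the name entry_offset_is_zero is made up) that parses the symbol+offset/size form Xen prints and reports whether the offset is zero, which is what a bare function pointer would show, as opposed to a return address in the middle of a function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Parse "symbol+offset/size" as printed in the trace.
 * Returns 1 if the offset is zero, 0 if it is non-zero,
 * and -1 if the entry does not have the expected form. */
int entry_offset_is_zero(const char *entry)
{
    char sym[64];
    unsigned long off, size;

    assert(entry != NULL);
    if (sscanf(entry, "%63[^+]+%lx/%lx", sym, &off, &size) != 3)
        return -1;
    return off == 0;
}
```

For example, handle_hpet_broadcast+0x5b/0x268 is classified as non-zero, while do_event_channel_op+0/0x10cb is classified as zero.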
~Andrew
[-- Attachment #2: xl-dmesg-boot --]
[-- Type: text/plain, Size: 11950 bytes --]
__ __ _ _ _____ ___
\ \/ /___ _ __ | || | |___ / / _ \
\ // _ \ '_ \ | || |_ |_ \| | | |
/ \ __/ | | | |__ _| ___) | |_| |
/_/\_\___|_| |_| |_|(_)____(_)___/
(XEN) Xen version 4.3.0 (root@uk.xensource.com) (x86_64-linux-gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-46)) debug=y Mon Aug 5 16:03:49 EDT 2013
(XEN) Latest ChangeSet: 27215:b5c2bcac14ad, pq 59:2de62343c69b
(XEN) Bootloader: SYSLINUX 4.06 0x51f8b10e
(XEN) Command line: com1=115200,8n1 console=com1,vga mem=1024G dom0_max_vcpus=4 dom0_mem=752M,max:752M watchdog lowmem_emergency_pool=1M crashkernel=64M@32M cpuid_mask_xsave_eax=0
(XEN) Video information:
(XEN) VGA is text mode 80x25, font 8x16
(XEN) VBE/DDC methods: V2; EDID transfer time: 1 seconds
(XEN) Disc information:
(XEN) Found 1 MBR signatures
(XEN) Found 1 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN) 0000000000000000 - 000000000009e000 (usable)
(XEN) 0000000000100000 - 00000000bf699000 (usable)
(XEN) 00000000bf699000 - 00000000bf6af000 (reserved)
(XEN) 00000000bf6af000 - 00000000bf6ce000 (ACPI data)
(XEN) 00000000bf6ce000 - 00000000c0000000 (reserved)
(XEN) 00000000e0000000 - 00000000f0000000 (reserved)
(XEN) 00000000fe000000 - 0000000100000000 (reserved)
(XEN) 0000000100000000 - 0000000440000000 (usable)
(XEN) Kdump: 64MB (65536kB) at 0x2000000
(XEN) ACPI: RSDP 000F12D0, 0024 (r2 DELL )
(XEN) ACPI: XSDT 000F13D0, 008C (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: FACP BF6C3BB4, 00F4 (r3 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: DSDT BF6AF000, 3E5B (r1 DELL PE_SC3 1 INTL 20050624)
(XEN) ACPI: FACS BF6C6000, 0040
(XEN) ACPI: APIC BF6C3478, 0152 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: SPCR BF6C35CC, 0050 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: HPET BF6C3620, 0038 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: DMAR BF6C365C, 00A8 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: MCFG BF6C3850, 003C (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: WD__ BF6C3890, 0134 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: SLIC BF6C39C8, 0024 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: ERST BF6B2FDC, 0270 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: HEST BF6B324C, 03A8 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: BERT BF6B2E5C, 0030 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: EINJ BF6B2E8C, 0150 (r1 DELL PE_SC3 1 DELL 1)
(XEN) ACPI: TCPA BF6C3B4C, 0064 (r2 DELL PE_SC3 1 DELL 1)
(XEN) System RAM: 16374MB (16767196kB)
(XEN) No NUMA configuration found
(XEN) Faking a node at 0000000000000000-0000000440000000
(XEN) Domain heap initialised DMA width 32 bits
(XEN) found SMP MP-table at 000fe710
(XEN) DMI 2.6 present.
(XEN) Using APIC driver bigsmp
(XEN) ACPI: PM-Timer IO Port: 0x808
(XEN) ACPI: SLEEP INFO: pm1x_cnt[804,0], pm1x_evt[800,0]
(XEN) ACPI: wakeup_vec[bf6c600c], vec_size[20]
(XEN) ACPI: Local APIC address 0xfee00000
(XEN) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
(XEN) Processor #0 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
(XEN) Processor #2 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)
(XEN) Processor #4 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)
(XEN) Processor #6 7:14 APIC version 21
(XEN) ACPI: LAPIC (acpi_id[0x05] lapic_id[0x24] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x06] lapic_id[0x25] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x07] lapic_id[0x26] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x08] lapic_id[0x27] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x09] lapic_id[0x28] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x29] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x2a] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x2b] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x2c] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x2d] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x2e] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x10] lapic_id[0x2f] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x11] lapic_id[0x30] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x12] lapic_id[0x31] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x13] lapic_id[0x32] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x14] lapic_id[0x33] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x15] lapic_id[0x34] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x16] lapic_id[0x35] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x17] lapic_id[0x36] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x18] lapic_id[0x37] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x19] lapic_id[0x38] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1a] lapic_id[0x39] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1b] lapic_id[0x3a] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1c] lapic_id[0x3b] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1d] lapic_id[0x3c] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1e] lapic_id[0x3d] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x1f] lapic_id[0x3e] disabled)
(XEN) ACPI: LAPIC (acpi_id[0x20] lapic_id[0x3f] disabled)
(XEN) ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
(XEN) ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0])
(XEN) IOAPIC[0]: apic_id 0, version 32, address 0xfec00000, GSI 0-23
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
(XEN) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
(XEN) ACPI: IRQ0 used by override.
(XEN) ACPI: IRQ2 used by override.
(XEN) ACPI: IRQ9 used by override.
(XEN) Enabling APIC mode: Phys. Using 1 I/O APICs
(XEN) ACPI: HPET id: 0x8086a701 base: 0xfed00000
(XEN) Xen ERST support is initialized.
(XEN) Using ACPI (MADT) for SMP configuration information
(XEN) SMP: Allowing 32 CPUs (28 hotplug CPUs)
(XEN) IRQ limits: 24 GSI, 760 MSI/MSI-X
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2394.052 MHz processor.
(XEN) Initing memory sharing.
(XEN) Cannot set CPU xsave feature mask on CPU#0
(XEN) mce_intel.c:717: MCA Capability: BCAST 1 SER 0 CMCI 1 firstbank 0 extended MCE MSR 0
(XEN) Intel machine check reporting enabled
(XEN) PCI: MCFG configuration 0: base e0000000 segment 0000 buses 00 - ff
(XEN) PCI: MCFG area at e0000000 reserved in E820
(XEN) PCI: Using MCFG for segment 0000 bus 00-ff
(XEN) Intel VT-d iommu 0 supported page sizes: 4kB.
(XEN) Intel VT-d Snoop Control enabled.
(XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
(XEN) Intel VT-d Queued Invalidation enabled.
(XEN) Intel VT-d Interrupt Remapping not enabled.
(XEN) Intel VT-d Shared EPT tables not enabled.
(XEN) I/O virtualisation enabled
(XEN) - Dom0 mode: Relaxed
(XEN) Interrupt remapping disabled
(XEN) ENABLING IO-APIC IRQs
(XEN) -> Using new ACK method
(XEN) ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
(XEN) [2013-08-06 10:04:38] Platform timer is 14.318MHz HPET
(XEN) [2013-08-06 10:04:38] Allocated console ring of 64 KiB.
(XEN) [2013-08-06 10:04:38] mwait-idle: MWAIT substates: 0x1120
(XEN) [2013-08-06 10:04:38] mwait-idle: v0.4 model 0x1e
(XEN) [2013-08-06 10:04:38] mwait-idle: lapic_timer_reliable_states 0x2
(XEN) [2013-08-06 10:04:38] HPET: 8 timers (8 will be used for broadcast)
(XEN) [2013-08-06 10:04:38] VMX: Supported advanced features:
(XEN) [2013-08-06 10:04:38] - APIC MMIO access virtualisation
(XEN) [2013-08-06 10:04:38] - APIC TPR shadow
(XEN) [2013-08-06 10:04:38] - Extended Page Tables (EPT)
(XEN) [2013-08-06 10:04:38] - Virtual-Processor Identifiers (VPID)
(XEN) [2013-08-06 10:04:38] - Virtual NMI
(XEN) [2013-08-06 10:04:38] - MSR direct-access bitmap
(XEN) [2013-08-06 10:04:38] HVM: ASIDs enabled.
(XEN) [2013-08-06 10:04:38] HVM: VMX enabled
(XEN) [2013-08-06 10:04:38] HVM: Hardware Assisted Paging (HAP) detected
(XEN) [2013-08-06 10:04:38] HVM: HAP page sizes: 4kB, 2MB
(XEN) [2013-08-06 10:04:37] Cannot set CPU xsave feature mask on CPU#1
(XEN) [2013-08-06 10:04:37] Cannot set CPU xsave feature mask on CPU#2
(XEN) [2013-08-06 10:04:37] Cannot set CPU xsave feature mask on CPU#3
(XEN) [2013-08-06 10:04:39] Brought up 4 CPUs
(XEN) [2013-08-06 10:04:39] Testing NMI watchdog --- CPU#0 okay. CPU#1 okay. CPU#2 okay. CPU#3 okay.
(XEN) [2013-08-06 10:04:39] ACPI sleep modes: S3
(XEN) [2013-08-06 10:04:39] mcheck_poll: Machine check polling timer started.
(XEN) [2013-08-06 10:04:39] *** LOADING DOMAIN 0 ***
(XEN) [2013-08-06 10:04:39] elf_parse_binary: phdr: paddr=0x100000 memsz=0x3f4000
(XEN) [2013-08-06 10:04:39] elf_parse_binary: phdr: paddr=0x4f4000 memsz=0x1a7000
(XEN) [2013-08-06 10:04:39] elf_parse_binary: memory: 0x100000 -> 0x69b000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: GUEST_OS = "linux"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: GUEST_VERSION = "2.6"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: XEN_VERSION = "xen-3.0"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: VIRT_BASE = 0xc0000000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: PADDR_OFFSET = 0x0
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: ENTRY = 0xc0100000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: HYPERCALL_PAGE = 0xc0101000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: HV_START_LOW = 0xf5800000
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: FEATURES = "writable_page_tables|writable_descriptor_tables|auto_translated_physmap|pae_pgdir_above_4gb|supervisor_mode_kernel"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: PAE_MODE = "yes"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: unknown xen elf note (0xd)
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: LOADER = "generic"
(XEN) [2013-08-06 10:04:39] elf_xen_parse_note: SUSPEND_CANCEL = 0x1
(XEN) [2013-08-06 10:04:39] elf_xen_addr_calc_check: addresses:
(XEN) [2013-08-06 10:04:39] virt_base = 0xc0000000
(XEN) [2013-08-06 10:04:39] elf_paddr_offset = 0x0
(XEN) [2013-08-06 10:04:39] virt_offset = 0xc0000000
(XEN) [2013-08-06 10:04:39] virt_kstart = 0xc0100000
(XEN) [2013-08-06 10:04:39] virt_kend = 0xc069b000
(XEN) [2013-08-06 10:04:39] virt_entry = 0xc0100000
(XEN) [2013-08-06 10:04:39] p2m_base = 0xffffffffffffffff
(XEN) [2013-08-06 10:04:39] Xen kernel: 64-bit, lsb, compat32
(XEN) [2013-08-06 10:04:39] Dom0 kernel: 32-bit, PAE, lsb, paddr 0x100000 -> 0x69b000
(XEN) [2013-08-06 10:04:39] PHYSICAL MEMORY ARRANGEMENT:
(XEN) [2013-08-06 10:04:39] Dom0 alloc.: 0000000433800000->0000000434000000 (188393 pages to be allocated)
(XEN) [2013-08-06 10:04:39] Init. ramdisk: 000000043f7e9000->000000043ffffe00
(XEN) [2013-08-06 10:04:39] VIRTUAL MEMORY ARRANGEMENT:
(XEN) [2013-08-06 10:04:39] Loaded kernel: 00000000c0100000->00000000c069b000
(XEN) [2013-08-06 10:04:39] Init. ramdisk: 00000000c069b000->00000000c0eb1e00
(XEN) [2013-08-06 10:04:39] Phys-Mach map: 00000000c0eb2000->00000000c0f6e000
(XEN) [2013-08-06 10:04:39] Start info: 00000000c0f6e000->00000000c0f6e4b4
(XEN) [2013-08-06 10:04:39] Page tables: 00000000c0f6f000->00000000c0f7c000
(XEN) [2013-08-06 10:04:39] Boot stack: 00000000c0f7c000->00000000c0f7d000
(XEN) [2013-08-06 10:04:39] TOTAL: 00000000c0000000->00000000c1000000
(XEN) [2013-08-06 10:04:39] ENTRY ADDRESS: 00000000c0100000
(XEN) [2013-08-06 10:04:39] Dom0 has maximum 4 VCPUs
(XEN) [2013-08-06 10:04:39] elf_load_binary: phdr 0 at 0xc0100000 -> 0xc04f4000
(XEN) [2013-08-06 10:04:39] elf_load_binary: phdr 1 at 0xc04f4000 -> 0xc05c9000
(XEN) [2013-08-06 10:04:40] Scrubbing Free RAM: ..........................................................................................................................................................done.
(XEN) [2013-08-06 10:04:43] Initial low memory virq threshold set at 0x4000 pages.
(XEN) [2013-08-06 10:04:43] Std. Loglevel: All
(XEN) [2013-08-06 10:04:43] Guest Loglevel: All
(XEN) [2013-08-06 10:04:43] Xen is relinquishing VGA console.
(XEN) [2013-08-06 10:04:43] *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) [2013-08-06 10:04:43] Freed 320kB init memory.
[-- Attachment #3: xl-debugkeys-i --]
[-- Type: text/plain, Size: 9521 bytes --]
(XEN) [2013-08-06 10:25:32] Guest interrupt information:
(XEN) [2013-08-06 10:25:32] IRQ: 0 affinity:00000001 vec:f0 type=IO-APIC-edge status=00000000 timer_interrupt+0/0x18f
(XEN) [2013-08-06 10:25:32] IRQ: 1 affinity:00000001 vec:30 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 3 affinity:00000001 vec:38 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 4 affinity:00000001 vec:f1 type=IO-APIC-edge status=00000000 ns16550_interrupt+0/0x6a
(XEN) [2013-08-06 10:25:32] IRQ: 5 affinity:00000001 vec:40 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 6 affinity:00000001 vec:48 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 7 affinity:00000001 vec:50 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 8 affinity:00000001 vec:58 type=IO-APIC-edge status=00000010 in-flight=0 domain-list=0: 8(----),
(XEN) [2013-08-06 10:25:32] IRQ: 9 affinity:00000001 vec:60 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 9(----),
(XEN) [2013-08-06 10:25:32] IRQ: 10 affinity:00000001 vec:68 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 11 affinity:00000001 vec:70 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 12 affinity:00000001 vec:78 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 13 affinity:00000001 vec:88 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 14 affinity:00000001 vec:90 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 15 affinity:00000001 vec:98 type=IO-APIC-edge status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 16 affinity:00000008 vec:4a type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 16(----),
(XEN) [2013-08-06 10:25:32] IRQ: 17 affinity:00000001 vec:d8 type=IO-APIC-level status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 20 affinity:00000001 vec:a0 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 20(----),
(XEN) [2013-08-06 10:25:32] IRQ: 21 affinity:00000001 vec:d0 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 21(----),
(XEN) [2013-08-06 10:25:32] IRQ: 22 affinity:00000001 vec:a4 type=IO-APIC-level status=00000010 in-flight=0 domain-list=0: 22(----),
(XEN) [2013-08-06 10:25:32] IRQ: 24 affinity:0000000f vec:28 type=DMA_MSI status=00000000 iommu_page_fault+0/0x12
(XEN) [2013-08-06 10:25:32] IRQ: 25 affinity:00000001 vec:c6 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 26 affinity:00000008 vec:3f type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 27 affinity:00000001 vec:47 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 28 affinity:00000002 vec:4f type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 29 affinity:00000008 vec:57 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 30 affinity:00000008 vec:2f type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 31 affinity:00000001 vec:86 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 32 affinity:00000002 vec:37 type=HPET-MSI status=00000000 hpet_interrupt_handler+0/0x40
(XEN) [2013-08-06 10:25:32] IRQ: 33 affinity:00000001 vec:3a type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:279(----),
(XEN) [2013-08-06 10:25:32] IRQ: 34 affinity:00000001 vec:42 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:278(----),
(XEN) [2013-08-06 10:25:32] IRQ: 35 affinity:00000001 vec:4a type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 36 affinity:00000001 vec:52 type=PCI-MSI status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 37 affinity:00000001 vec:aa type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:275(----),
(XEN) [2013-08-06 10:25:32] IRQ: 38 affinity:00000001 vec:3f type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:274(----),
(XEN) [2013-08-06 10:25:32] IRQ: 39 affinity:00000001 vec:c7 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:273(----),
(XEN) [2013-08-06 10:25:32] IRQ: 40 affinity:00000001 vec:cf type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:272(----),
(XEN) [2013-08-06 10:25:32] IRQ: 41 affinity:00000001 vec:d7 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:271(----),
(XEN) [2013-08-06 10:25:32] IRQ: 42 affinity:00000001 vec:df type=PCI-MSI/-X status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IRQ: 43 affinity:00000001 vec:a8 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:269(----),
(XEN) [2013-08-06 10:25:32] IRQ: 44 affinity:00000001 vec:b0 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:268(----),
(XEN) [2013-08-06 10:25:32] IRQ: 45 affinity:00000001 vec:b8 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:267(----),
(XEN) [2013-08-06 10:25:32] IRQ: 46 affinity:00000001 vec:c0 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:266(----),
(XEN) [2013-08-06 10:25:32] IRQ: 47 affinity:00000001 vec:c8 type=PCI-MSI/-X status=00000010 in-flight=0 domain-list=0:265(----),
(XEN) [2013-08-06 10:25:32] IRQ: 48 affinity:00000001 vec:21 type=PCI-MSI/-X status=00000002 mapped, unbound
(XEN) [2013-08-06 10:25:32] IO-APIC interrupt information:
(XEN) [2013-08-06 10:25:32] IRQ 0 Vec240:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 2: vec=f0 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 1 Vec 48:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 1: vec=30 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 3 Vec 56:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 3: vec=38 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 4 Vec241:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 4: vec=f1 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 5 Vec 64:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 5: vec=40 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 6 Vec 72:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 6: vec=48 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 7 Vec 80:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 7: vec=50 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 8 Vec 88:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 8: vec=58 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 9 Vec 96:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 9: vec=60 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=L mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 10 Vec104:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 10: vec=68 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 11 Vec112:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 11: vec=70 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 12 Vec120:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 12: vec=78 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 13 Vec136:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 13: vec=88 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 14 Vec144:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 14: vec=90 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 15 Vec152:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 15: vec=98 delivery=Fixed dest=P status=0 polarity=0 irr=0 trig=E mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 16 Vec 74:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 16: vec=4a delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:6
(XEN) [2013-08-06 10:25:32] IRQ 17 Vec216:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 17: vec=d8 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=1 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 20 Vec160:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 20: vec=a0 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 21 Vec208:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 21: vec=d0 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0
(XEN) [2013-08-06 10:25:32] IRQ 22 Vec164:
(XEN) [2013-08-06 10:25:32] Apic 0x00, Pin 22: vec=a4 delivery=Fixed dest=P status=0 polarity=1 irr=0 trig=L mask=0 dest_id:0
[-- Attachment #4: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-06 10:32 ` Andrew Cooper
@ 2013-08-06 11:44 ` Jan Beulich
2013-08-06 13:23 ` Andrew Cooper
0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2013-08-06 11:44 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan
>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> The machine we found this crash on is a Dell R310. 4 CPUs, 16G Ram.
Not all that big.
> The full boot xl dmesg is attached, but it appears that there are 8
> broadcast hpets. This is further backed up by the 'i' debugkey (also
> attached)
Right. And with fewer CPUs than HPET channels, you could get
the system into a mode where each CPU uses a dedicated channel
("maxcpus=4", suppressing registration of all the disabled ones).
> Keir: (merging your thread back here)
> I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
> interrupts, there are rather more than 8 occurrences of
> handle_hpet_broadcast() in the stack. If the occurrences were just
> function pointers on the stack, I would expect to see several
> handle_hpet_broadcast()+0x0/0x268
Which further hints at some earlier problem. I suppose you don't
happen to have a dump of that crash, or else you could inspect
the IRQ descriptors as well as the stack for whether all instances
came from the same IRQ/vector.
Jan
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-06 11:44 ` Jan Beulich
@ 2013-08-06 13:23 ` Andrew Cooper
2013-08-06 13:57 ` Jan Beulich
0 siblings, 1 reply; 11+ messages in thread
From: Andrew Cooper @ 2013-08-06 13:23 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Tim Deegan
On 06/08/13 12:44, Jan Beulich wrote:
>>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> The machine we found this crash on is a Dell R310. 4 CPUs, 16G Ram.
> Not all that big.
>
>> The full boot xl dmesg is attached, but it appears that there are 8
>> broadcast hpets. This is further backed up by the 'i' debugkey (also
>> attached)
> Right. And with fewer CPUs than HPET channels, you could get
> the system into a mode where each CPU uses a dedicated channel
> ("maxcpus=4", suppressing registration of all the disabled ones).
Does this setup actually mean that there are 8 hpets which are all
broadcasting to every pcpu? The affinities listed in debug-keys 'i'
seem to be towards single pcpus, but the order looks peculiar to say the
least.
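As a quick check of what those affinities mean, the hex masks printed by the 'i' debug key can be decoded into CPU numbers. The following is a minimal sketch, not Xen code; `affinity_weight` and `affinity_single_cpu` are hypothetical helper names made up for illustration, and the mask is assumed to fit in 32 bits as in the attached output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Xen function): number of CPUs set in an
 * affinity mask as printed by the 'i' debug key, e.g. 0x00000008. */
int affinity_weight(uint32_t mask)
{
    int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: the CPU number if exactly one bit is set,
 * otherwise -1 (mask targets zero or several CPUs). */
int affinity_single_cpu(uint32_t mask)
{
    int cpu = 0;

    if (affinity_weight(mask) != 1)
        return -1;
    assert(mask != 0);      /* guaranteed by the weight check above */
    while (!(mask & 1)) {
        mask >>= 1;
        cpu++;
    }
    return cpu;
}
```

Applied to the HPET-MSI lines in the attached output (masks 1, 8, 1, 2, 8, 8, 1, 2 for IRQs 25 to 32), this gives CPUs 0, 3, 0, 1, 3, 3, 0, 1: each mask names a single pcpu, but several channels point at the same one.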
>
>> Keir: (merging your thread back here)
>> I see your point regarding IRQ_INPROGRESS, but even with 8 hpet
>> interrupts, there are rather more than 8 occurrences of
>> handle_hpet_broadcast() in the stack. If the occurrences were just
>> function pointers on the stack, I would expect to see several
>> handle_hpet_broadcast()+0x0/0x268
> Which further hints at some earlier problem. I suppose you don't
> happen to have a dump of that crash, or else you could inspect
> the IRQ descriptors as well as the stack for whether all instances
> came from the same IRQ/vector.
>
> Jan
>
Sadly no - the crashdump analyser grabbed the double fault IST, rather
than the entire contents of the main stack. I shall extend the analyser
to pick up the main stack as well; it does cross IST boundaries for call
traces. I shall see how easy it is to make it parse the irq_desc's &
friends as well on crash, although for this case it might be easier just
to tweak the double fault handler.
~Andrew
* Re: [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler.
2013-08-06 13:23 ` Andrew Cooper
@ 2013-08-06 13:57 ` Jan Beulich
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
0 siblings, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2013-08-06 13:57 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan
>>> On 06.08.13 at 15:23, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 06/08/13 12:44, Jan Beulich wrote:
>>>>> On 06.08.13 at 12:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> The machine we found this crash on is a Dell R310. 4 CPUs, 16G Ram.
>> Not all that big.
>>
>>> The full boot xl dmesg is attached, but it appears that there are 8
>>> broadcast hpets. This is further backed up by the 'i' debugkey (also
>>> attached)
>> Right. And with fewer CPUs than HPET channels, you could get
>> the system into a mode where each CPU uses a dedicated channel
>> ("maxcpus=4", suppressing registration of all the disabled ones).
>
> Does this setup actually mean that there are 8 hpets which are all
> broadcasting to every pcpu? The affinities listed in debug-keys 'i'
> seem to be towards single pcpus, but the order looks peculiar to say the
> least.
No, each channel will be used for just one CPU when there are at
least as many channels as CPUs. The difference between not using
said command line option and using it is that in the former case a
new channel would get assigned to a CPU each time it needs one,
while in the latter case a static (pre-)assignment is used, i.e. each
CPU will always use the same single channel.
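The two modes described above can be sketched as a toy model. This is only an illustration under stated assumptions: the `hpet_pool` structure, the function names, and the "CPU n uses channel n" rule for the static case are all hypothetical and are not taken from Xen's hpet code.

```c
#include <assert.h>

#define MAX_CHANNELS 32
#define NO_CHANNEL   (-1)
#define NO_OWNER     (-1)

/* Toy model only: not Xen's data structures. */
struct hpet_pool {
    int nr_channels;
    int owner[MAX_CHANNELS];  /* CPU currently using each channel */
};

void pool_init(struct hpet_pool *p, int nr_channels)
{
    assert(nr_channels > 0 && nr_channels <= MAX_CHANNELS);
    p->nr_channels = nr_channels;
    for (int ch = 0; ch < nr_channels; ch++)
        p->owner[ch] = NO_OWNER;
}

/* Dynamic mode: a CPU takes whichever channel is free each time it
 * needs one, so it may end up on a different channel every time. */
int get_channel_dynamic(struct hpet_pool *p, int cpu)
{
    for (int ch = 0; ch < p->nr_channels; ch++) {
        if (p->owner[ch] == NO_OWNER) {
            p->owner[ch] = cpu;
            return ch;
        }
    }
    return NO_CHANNEL;
}

/* Release a channel taken in dynamic mode. */
void put_channel(struct hpet_pool *p, int ch)
{
    assert(ch >= 0 && ch < p->nr_channels);
    p->owner[ch] = NO_OWNER;
}

/* Static mode: a fixed pre-assignment. The identity mapping used here
 * (CPU n -> channel n) is an assumption made for the sketch. */
int get_channel_static(const struct hpet_pool *p, int cpu)
{
    return (cpu >= 0 && cpu < p->nr_channels) ? cpu : NO_CHANNEL;
}
```

In the dynamic model the same CPU can be handed channel 1 on one occasion and channel 0 on the next, whereas the static lookup returns the same channel for a given CPU every time.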
Jan
* Re: Hpet interrupt overflow
2013-08-06 13:57 ` Jan Beulich
@ 2013-08-13 9:03 ` Andrew Cooper
2013-08-13 9:22 ` Tim Deegan
2013-08-13 11:59 ` Jan Beulich
0 siblings, 2 replies; 11+ messages in thread
From: Andrew Cooper @ 2013-08-13 9:03 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Tim Deegan
[-- Attachment #1: Type: text/plain, Size: 1204 bytes --]
On 06/08/13 14:57, Jan Beulich wrote:
>>> Right. And with fewer CPUs than HPET channels, you could get
>>> the system into a mode where each CPU uses a dedicated channel
>>> ("maxcpus=4", suppressing registration of all the disabled ones).
>> Does this setup actually mean that there are 8 hpets which are all
>> broadcasting to every pcpu? The affinities listed in debug-keys 'i'
>> seem to be towards single pcpus, but the order looks peculiar to say the
>> least.
> No, each channel will be used for just one CPU when there are at
> least as many channels as CPUs. The difference between not using
> said command line option and using it is that in the former case a
> new channel would get assigned to a CPU each time it needs one,
> while in the latter case a static (pre-)assignment is used, i.e. each
> CPU will always use the same single channel.
>
> Jan
>
We had another crash, this time with a proper stack trace. (This was
using an early version of the stack trace improvements series.)
From the stack trace (now correctly with frame pointers), we see 9 calls
to handle_hpet_broadcast().
This indicates that the current logic does not correctly prevent
repeated delivery of interrupts.
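When counting entries, it helps to separate plausible return addresses from stale function pointers, per the earlier point that a bare function pointer would show up at offset 0. A minimal sketch, assuming the "[<addr>] symbol+offset/length" line format of the attached log; `classify_line` and `count_inside` are hypothetical names, not part of any Xen tool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum hit {
    HIT_NONE,    /* line does not mention the symbol */
    HIT_ENTRY,   /* symbol+0: the function's own address, not a return address */
    HIT_INSIDE   /* symbol+non-zero offset: plausible return address */
};

/* Classify one trace line of the form
 *   "... <stack addr>: [<text addr>] symbol+OFFSET/LENGTH"
 * with respect to a given symbol. */
enum hit classify_line(const char *line, const char *sym)
{
    const char *p;
    size_t n;

    assert(line != NULL && sym != NULL);
    n = strlen(sym);

    /* ">] " ends the text address; a plain "] " would also match
     * the timestamp prefix of the log lines. */
    p = strstr(line, ">] ");
    if (p == NULL)
        return HIT_NONE;
    p += 3;

    if (strncmp(p, sym, n) != 0 || p[n] != '+')
        return HIT_NONE;

    /* Base 0 accepts both "0" and "0x1d"; parsing stops at '/'. */
    return strtoul(p + n + 1, NULL, 0) == 0 ? HIT_ENTRY : HIT_INSIDE;
}

/* Count the lines where sym appears with a non-zero offset. */
int count_inside(const char *const *lines, int nr, const char *sym)
{
    int count = 0;

    for (int i = 0; i < nr; i++)
        if (classify_line(lines[i], sym) == HIT_INSIDE)
            count++;
    return count;
}
```

On the attached trace this classifies 8 of the 9 handle_hpet_broadcast entries as HIT_INSIDE and one (the entry at ffff83043f2d6600, printed as +0/0x268) as HIT_ENTRY.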
~Andrew
[-- Attachment #2: stack-trace.log --]
[-- Type: text/x-log, Size: 11436 bytes --]
(XEN) [2013-08-12 22:57:42] *** DOUBLE FAULT ***
(XEN) [2013-08-12 22:57:42] ----[ Xen-4.3.0 x86_64 debug=y Not tainted ]----
(XEN) [2013-08-12 22:57:42] CPU: 2
(XEN) [2013-08-12 22:57:42] RIP: e008:[<ffff82c4c012a578>] _spin_lock_irqsave+0/0x5e
(XEN) [2013-08-12 22:57:42] RFLAGS: 0000000000010292 CONTEXT: hypervisor
(XEN) [2013-08-12 22:57:42] rax: ffff82c4c01a7b39 rbx: ffff83043f2d6168 rcx: ffff83043f2bab30
(XEN) [2013-08-12 22:57:42] rdx: ffff83043f2dac88 rsi: ffff83043f2ba300 rdi: ffff83043f2ba320
(XEN) [2013-08-12 22:57:42] rbp: ffff83043f2d6068 rsp: ffff83043f2d6000 r8: 0000000000000000
(XEN) [2013-08-12 22:57:42] r9: 0000000000000000 r10: ffff83043f2d76f0 r11: 0000000000000000
(XEN) [2013-08-12 22:57:42] r12: ffff83043f2ba300 r13: 0000000000000073 r14: ffff83043f281e24
(XEN) [2013-08-12 22:57:42] r15: ffff83043f281e00 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) [2013-08-12 22:57:42] cr3: 000000041e04e000 cr2: ffff83043f2d5ff8
(XEN) [2013-08-12 22:57:42] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) [2013-08-12 22:57:42] Valid stack range: ffff83043f2d6000-ffff83043f2d8000, sp=ffff83043f2d6000, tss.esp0=ffff83043f2d7fc0
(XEN) [2013-08-12 22:57:42] Xen stack overflow (dumping trace ffff83043f2d6000-ffff83043f2d8000):
(XEN) [2013-08-12 22:57:42] ffff83043f2d6008: [<ffff82c4c01a7b56>] handle_hpet_broadcast+0x1d/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6078: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40
(XEN) [2013-08-12 22:57:42] ffff83043f2d6088: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d6168: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d61f0: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) [2013-08-12 22:57:42] ffff83043f2d6228: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6298: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40
(XEN) [2013-08-12 22:57:42] ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
(XEN) [2013-08-12 22:57:42] ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d6410: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) [2013-08-12 22:57:42] ffff83043f2d6448: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6470: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
(XEN) [2013-08-12 22:57:42] ffff83043f2d64b8: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40
(XEN) [2013-08-12 22:57:42] ffff83043f2d64c8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d64d8: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d6518: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d65a8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d6600: [<ffff82c4c01a7b39>] handle_hpet_broadcast+0/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6630: [<ffff82c4c012a583>] _spin_lock_irqsave+0xb/0x5e
(XEN) [2013-08-12 22:57:42] ffff83043f2d6678: [<ffff82c4c01a7b56>] handle_hpet_broadcast+0x1d/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d66e8: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40
(XEN) [2013-08-12 22:57:42] ffff83043f2d66f8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d6758: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d67d8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d6860: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) [2013-08-12 22:57:42] ffff83043f2d6898: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6908: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40
(XEN) [2013-08-12 22:57:42] ffff83043f2d6918: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d6988: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d69b8: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d69f8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d6a80: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) [2013-08-12 22:57:42] ffff83043f2d6ab8: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6b28: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40
(XEN) [2013-08-12 22:57:42] ffff83043f2d6b38: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d6b58: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d6ba8: [<ffff82c4c01a7ce9>] handle_hpet_broadcast+0x1b0/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6c18: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d6ca0: [<ffff82c4c012a535>] _spin_unlock_irqrestore+0x40/0x42
(XEN) [2013-08-12 22:57:42] ffff83043f2d6cd8: [<ffff82c4c01a7b94>] handle_hpet_broadcast+0x5b/0x268
(XEN) [2013-08-12 22:57:42] ffff83043f2d6d48: [<ffff82c4c01a7e01>] hpet_interrupt_handler+0x3e/0x40
(XEN) [2013-08-12 22:57:42] ffff83043f2d6d58: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d6dc8: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d6e38: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d6ec0: [<ffff82c4c012a577>] _spin_unlock_irq+0x40/0x41
(XEN) [2013-08-12 22:57:42] ffff83043f2d6ee8: [<ffff82c4c017071a>] do_IRQ+0xae8/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d6f00: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
(XEN) [2013-08-12 22:57:42] ffff83043f2d6f78: [<ffff82c4c01aaabd>] cpuidle_wakeup_mwait+0xad/0xba
(XEN) [2013-08-12 22:57:42] ffff83043f2d6fc8: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
(XEN) [2013-08-12 22:57:42] ffff83043f2d7050: [<ffff82c4c01f0d3f>] ept_get_entry+0x2f/0x239
(XEN) [2013-08-12 22:57:42] ffff83043f2d7078: [<ffff82c4c01f0d3f>] ept_get_entry+0x2f/0x239
(XEN) [2013-08-12 22:57:42] ffff83043f2d70b8: [<ffff82c4c016fcda>] do_IRQ+0xa8/0xbc7
(XEN) [2013-08-12 22:57:42] ffff83043f2d7108: [<ffff82c4c01e97a7>] __get_gfn_type_access+0x12b/0x20e
(XEN) [2013-08-12 22:57:42] ffff83043f2d7158: [<ffff82c4c01e9fd2>] get_page_from_gfn_p2m+0xc8/0x25d
(XEN) [2013-08-12 22:57:42] ffff83043f2d71c8: [<ffff82c4c01f4970>] map_domain_gfn_3_levels+0x43/0x13a
(XEN) [2013-08-12 22:57:42] ffff83043f2d7208: [<ffff82c4c01f4bf2>] guest_walk_tables_3_levels+0x18b/0x489
(XEN) [2013-08-12 22:57:42] ffff83043f2d7248: [<ffff82c4c01f0f37>] ept_get_entry+0x227/0x239
(XEN) [2013-08-12 22:57:42] ffff83043f2d7288: [<ffff82c4c0223c98>] hap_p2m_ga_to_gfn_3_levels+0x178/0x306
(XEN) [2013-08-12 22:57:42] ffff83043f2d7338: [<ffff82c4c0223e45>] hap_gva_to_gfn_3_levels+0x1f/0x2a
(XEN) [2013-08-12 22:57:42] ffff83043f2d7348: [<ffff82c4c01ebf2e>] paging_gva_to_gfn+0xb6/0xcc
(XEN) [2013-08-12 22:57:42] ffff83043f2d7398: [<ffff82c4c01b72e6>] hvmemul_linear_to_phys+0xf3/0x24f
(XEN) [2013-08-12 22:57:42] ffff83043f2d7418: [<ffff82c4c01b801f>] __hvmemul_read+0x179/0x1c8
(XEN) [2013-08-12 22:57:42] ffff83043f2d7498: [<ffff82c4c01b80c1>] hvmemul_read+0x12/0x14
(XEN) [2013-08-12 22:57:42] ffff83043f2d74a8: [<ffff82c4c0193aa9>] read_ulong+0xe/0x10
(XEN) [2013-08-12 22:57:42] ffff83043f2d74b8: [<ffff82c4c0196338>] x86_emulate+0x1df1/0x11309
(XEN) [2013-08-12 22:57:42] ffff83043f2d7510: [<ffff82c4c01b806e>] hvmemul_insn_fetch+0/0x41
(XEN) [2013-08-12 22:57:42] ffff83043f2d7530: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
(XEN) [2013-08-12 22:57:42] ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
(XEN) [2013-08-12 22:57:42] ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e
(XEN) [2013-08-12 22:57:42] ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124
(XEN) [2013-08-12 22:57:42] ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde
(XEN) [2013-08-12 22:57:42] ffff83043f2d7788: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde
(XEN) [2013-08-12 22:57:42] ffff83043f2d77b8: [<ffff82c4c01f0f37>] ept_get_entry+0x227/0x239
(XEN) [2013-08-12 22:57:42] ffff83043f2d7848: [<ffff82c4c017788f>] get_page+0x27/0xf2
(XEN) [2013-08-12 22:57:42] ffff83043f2d7898: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde
(XEN) [2013-08-12 22:57:42] ffff83043f2d78c8: [<ffff82c4c01f0f37>] ept_get_entry+0x227/0x239
(XEN) [2013-08-12 22:57:42] ffff83043f2d7a98: [<ffff82c4c01b8260>] hvm_emulate_one+0x127/0x1bf
(XEN) [2013-08-12 22:57:42] ffff83043f2d7aa8: [<ffff82c4c01b6f1b>] hvmemul_get_seg_reg+0x49/0x60
(XEN) [2013-08-12 22:57:42] ffff83043f2d7ae8: [<ffff82c4c01c3bc5>] handle_mmio+0x55/0x1f0
(XEN) [2013-08-12 22:57:42] ffff83043f2d7b10: [<ffff82c4c01b8260>] hvm_emulate_one+0x127/0x1bf
(XEN) [2013-08-12 22:57:42] ffff83043f2d7b20: [<ffff82c4c01b6f1b>] hvmemul_get_seg_reg+0x49/0x60
(XEN) [2013-08-12 22:57:42] ffff83043f2d7c48: [<ffff82c4c01e9700>] __get_gfn_type_access+0x84/0x20e
(XEN) [2013-08-12 22:57:42] ffff83043f2d7c98: [<ffff82c4c01bcff3>] hvm_hap_nested_page_fault+0x25d/0x456
(XEN) [2013-08-12 22:57:42] ffff83043f2d7d18: [<ffff82c4c01e1557>] vmx_vmexit_handler+0x140a/0x17ba
(XEN) [2013-08-12 22:57:42] ffff83043f2d7d30: [<ffff82c4c01be8c5>] hvm_do_resume+0xc6/0x1b7
(XEN) [2013-08-12 22:57:42] ffff83043f2d7da8: [<ffff82c4c01ce19c>] vpic_get_highest_priority_irq+0xaa/0xc6
(XEN) [2013-08-12 22:57:42] ffff83043f2d7db8: [<ffff82c4c015f972>] vcpu_kick+0x20/0x6c
(XEN) [2013-08-12 22:57:42] ffff83043f2d7dd8: [<ffff82c4c01ce22f>] vpic_update_int_output+0x77/0xa2
(XEN) [2013-08-12 22:57:43] ffff83043f2d7df8: [<ffff82c4c01ce363>] vpic_irq_positive_edge+0x80/0x85
(XEN) [2013-08-12 22:57:43] ffff83043f2d7e18: [<ffff82c4c01c4b30>] assert_irq+0x27/0x32
(XEN) [2013-08-12 22:57:43] ffff83043f2d7e38: [<ffff82c4c01c4bca>] hvm_isa_irq_assert+0x8f/0xa4
(XEN) [2013-08-12 22:57:43] ffff83043f2d7e58: [<ffff82c4c01cb3d0>] vlapic_accept_pic_intr+0x21/0x2b
(XEN) [2013-08-12 22:57:43] ffff83043f2d7e68: [<ffff82c4c01cf86d>] pt_update_irq+0x267/0x2ea
(XEN) [2013-08-12 22:57:43] ffff83043f2d7e78: [<ffff82c4c01cb3d0>] vlapic_accept_pic_intr+0x21/0x2b
(XEN) [2013-08-12 22:57:43] ffff83043f2d7e88: [<ffff82c4c01bd239>] hvm_interrupt_blocked+0x4d/0xe9
(XEN) [2013-08-12 22:57:43] ffff83043f2d7ec8: [<ffff82c4c01defa9>] vmx_vmenter_helper+0x60/0x139
(XEN) [2013-08-12 22:57:43] ffff83043f2d7f18: [<ffff82c4c01e7739>] vmx_asm_do_vmentry+0/0xe7
(XEN) [2013-08-12 22:57:43]
(XEN) [2013-08-12 22:57:43]
(XEN) [2013-08-12 22:57:43] ****************************************
(XEN) [2013-08-12 22:57:43] Panic on CPU 2:
(XEN) [2013-08-12 22:57:43] DOUBLE FAULT -- system shutdown
(XEN) [2013-08-12 22:57:43] ****************************************
(XEN) [2013-08-12 22:57:43]
(XEN) [2013-08-12 22:57:43] Reboot in five seconds...
(XEN) [2013-08-12 22:57:43] Executing crash image
* Re: Hpet interrupt overflow
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
@ 2013-08-13 9:22 ` Tim Deegan
2013-08-13 9:33 ` Andrew Cooper
2013-08-13 11:59 ` Jan Beulich
1 sibling, 1 reply; 11+ messages in thread
From: Tim Deegan @ 2013-08-13 9:22 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Jan Beulich
Hi,
At 10:03 +0100 on 13 Aug (1376388226), Andrew Cooper wrote:
> We had another crash, this time with a proper stack trace. (This was
> using an early version of the stack trace improvements series.)
>
> From the stack trace (now correctly with frame pointers), we see 9 calls
> to handle_hpet_broadcast().
Hmmm. I don't think this can be following frame pointers -- or if it is
something very odd is happening here:
ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
and here:
ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e
ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124
ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde
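For illustration, the difference between following frame pointers and simply reporting every stack word that looks like a text address can be shown on a toy stack. This is a sketch under simplifying assumptions, not Xen's trace code: the stack is an array indexed by word, a saved frame pointer is stored as an array index, and the text range and function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical text segment bounds for the toy model. */
#define TEXT_START 0x1000u
#define TEXT_END   0x2000u

int is_text(uint64_t v)
{
    return v >= TEXT_START && v < TEXT_END;
}

/* Heuristic scan: report every stack word that looks like a text
 * address, including stale values left behind by earlier calls. */
size_t scan_stack(const uint64_t *stack, size_t nr, uint64_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    for (size_t i = 0; i < nr; i++)
        if (is_text(stack[i]))
            out[n++] = stack[i];
    return n;
}

/* Frame-pointer walk: stack[fp] holds the caller's frame index and
 * stack[fp + 1] the return address. Only live frames are reported. */
size_t walk_frames(const uint64_t *stack, size_t nr, size_t fp, uint64_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    while (fp + 1 < nr) {
        out[n++] = stack[fp + 1];
        if (stack[fp] <= fp)    /* chain must move towards the stack base */
            break;
        fp = (size_t)stack[fp];
    }
    return n;
}
```

A walk that really follows the frame chain cannot report an entry that sits between two linked frames, which is why a stray entry between two otherwise adjacent frames suggests the output came from a scan.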
Tim.
* Re: Hpet interrupt overflow
2013-08-13 9:22 ` Tim Deegan
@ 2013-08-13 9:33 ` Andrew Cooper
0 siblings, 0 replies; 11+ messages in thread
From: Andrew Cooper @ 2013-08-13 9:33 UTC (permalink / raw)
To: Tim Deegan; +Cc: xen-devel, Keir Fraser, Jan Beulich
On 13/08/13 10:22, Tim Deegan wrote:
> Hi,
>
> At 10:03 +0100 on 13 Aug (1376388226), Andrew Cooper wrote:
>> We had another crash, this time with a proper stack trace. (This was
>> using an early version of the stack trace improvements series.)
>>
>> From the stack trace (now correctly with frame pointers), we see 9 calls
>> to handle_hpet_broadcast().
> Hmmm. I don't think this can be following frame pointers -- or if it is
> something very odd is happening here:
>
> ffff83043f2d62a8: [<ffff82c4c0170744>] do_IRQ+0xb12/0xbc7
> ffff83043f2d6340: [<ffff82c4c012a178>] _spin_lock_irq+0x28/0x65
> ffff83043f2d6388: [<ffff82c4c016805f>] common_interrupt+0x5f/0x70
>
> and here:
>
> ffff83043f2d7548: [<ffff82c4c01a24a7>] x86_emulate+0xdf60/0x11309
> ffff83043f2d75a8: [<ffff82c4c0107774>] evtchn_set_pending+0xc0/0x18e
> ffff83043f2d75d8: [<ffff82c4c0107900>] notify_via_xen_event_channel+0xbe/0x124
> ffff83043f2d76c8: [<ffff82c4c01ef9b5>] ept_next_level+0xa4/0xde
>
> Tim.
>
>
Hmm yes. I will double check the frame pointer through exception frame
logic.
~Andrew
* Re: Hpet interrupt overflow
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
2013-08-13 9:22 ` Tim Deegan
@ 2013-08-13 11:59 ` Jan Beulich
1 sibling, 0 replies; 11+ messages in thread
From: Jan Beulich @ 2013-08-13 11:59 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan
>>> On 13.08.13 at 11:03, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> We had another crash, this time with a proper stack trace. (This was
> using an early version of the stack trace improvements series.)
>
> From the stack trace (now correctly with frame pointers), we see 9 calls
> to handle_hpet_broadcast().
>
> This indicates that the current logic does not correctly prevent
> repeated delivery of interrupts.
And this was with a 1:1 CPU <-> HPET channel mapping (not
visible from just the stack trace)?
In any case, could you try moving the call to ack_APIC_irq() from
hpet_msi_ack() to hpet_msi_end() (the latter may need to be
re-created depending on the Xen version you do this with). Or,
as another alternative, call hpet_msi_{,un}mask() from the two
functions respectively (albeit I think this might result in lost
interrupts).
Potentially hpet_msi_end() would then also need to disable
interrupts before doing either of these actions.
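To illustrate why the position of the EOI matters, here is a toy model of a single vector on a local APIC. It is a sketch under stated assumptions, not Xen code: the structure and function names are hypothetical, all interrupts are assumed to share one vector, and the handler is assumed to run with interrupts enabled. The only hardware behaviour relied on is that a vector whose in-service bit is set is not delivered again until the EOI.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one vector on a local APIC (hypothetical names). */
struct lapic_model {
    bool in_service;  /* set on delivery, cleared by the EOI */
    int  pending;     /* interrupts of this vector still to arrive */
    int  depth;       /* current handler nesting */
    int  max_depth;   /* deepest nesting seen */
};

static void deliver(struct lapic_model *m, bool eoi_early);

/* A point in the handler where interrupts are enabled. */
static void irq_window(struct lapic_model *m, bool eoi_early)
{
    if (m->pending > 0 && !m->in_service) {
        m->pending--;
        deliver(m, eoi_early);
    }
}

static void deliver(struct lapic_model *m, bool eoi_early)
{
    m->in_service = true;
    if (++m->depth > m->max_depth)
        m->max_depth = m->depth;

    if (eoi_early)
        m->in_service = false;  /* EOI issued from the ack hook */

    irq_window(m, eoi_early);   /* handler body */

    if (!eoi_early)
        m->in_service = false;  /* EOI issued from the end hook */
    m->depth--;
}

/* Deepest handler nesting for nr_irqs back-to-back interrupts. */
int max_nesting(int nr_irqs, bool eoi_early)
{
    struct lapic_model m = { .pending = nr_irqs };

    assert(nr_irqs >= 0);
    while (m.pending > 0) {
        m.pending--;
        deliver(&m, eoi_early);
    }
    return m.max_depth;
}
```

With the early EOI the nesting depth in this model grows with the number of pending interrupts, while with the late EOI it stays at 1. The model issues the late EOI and leaves the handler in one step, which corresponds to the end hook running with interrupts disabled; it says nothing about nesting between different vectors.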
Jan
Thread overview: 11+ messages
2013-08-05 20:38 [RFC Patch] x86/hpet: Disable interrupts while running hpet interrupt handler Andrew Cooper
2013-08-06 4:49 ` Keir Fraser
2013-08-06 8:01 ` Jan Beulich
2013-08-06 10:32 ` Andrew Cooper
2013-08-06 11:44 ` Jan Beulich
2013-08-06 13:23 ` Andrew Cooper
2013-08-06 13:57 ` Jan Beulich
2013-08-13 9:03 ` Hpet interrupt overflow Andrew Cooper
2013-08-13 9:22 ` Tim Deegan
2013-08-13 9:33 ` Andrew Cooper
2013-08-13 11:59 ` Jan Beulich