xen-devel.lists.xenproject.org archive mirror
* Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace
@ 2010-03-05 12:47 Pasi Kärkkäinen
  2010-03-05 13:00 ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Pasi Kärkkäinen @ 2010-03-05 12:47 UTC (permalink / raw)
  To: xen-devel

Hello,

I have a PV domU running Debian Lenny 2.6.26-2-xen-686. It has 2 vcpus, 2 GB of memory,
and it crashes every 14-30 days.

Basically the guest gets stuck somehow and starts to consume all the CPU time it can get.
"xm console <guest>" doesn't let me do anything (I can't log in, but I can see the
messages on the console - no errors there), and the guest doesn't respond from the network either.

It's not completely crashed, it's just stuck in some loop.
There are no errors or anything special in "xm log".

Any ideas?

I tried running xenctx on it a couple of times from dom0:

eip: c0105c0f jiffies_to_st+0x17
esp: dd747dcc
eax: 91dc1272   ebx: 2e88c443   ecx: 03020006   edx: 91dc1272
esi: 91dc1272   edi: dd747dec   ebp: c0378184
 cs: 00000061    ds: 0000007b    fs: 000000d8    gs: 00000033

Stack:
 00000003 00000103 dd747dec c023d164 00000000 00000001 00000000 00000000
 00000006 c123872c 00000022 c023f968 00002221 c0378184 00002221 00000000
 00000100 c02cbedf ed4026c0 00000000 c0106444 0000007b c011007b 000000d8
 fffffef0 c0101227 00000061 00000246 c023cdd9 00000fd0 c1234020 00000000

Code:
89 c6 53 8b 1d 80 81 37 c0 0f ae e8 66 90 f6 c3 01 74 04 f3 90 <eb> ec a1 40 81 37 c0 89 f1 29 c1

Call Trace:
  [<c0105c0f>] jiffies_to_st+0x17 <--
  [<c023d164>] xen_poll_irq+0x41
  [<c023f968>] xen_spin_wait+0xcc
  [<c02cbedf>] _spin_lock+0x31
  [<c0106444>] timer_interrupt+0x37
  [<c011007b>] __change_page_attr_set_clr+0x4cd
  [<c0101227>] hypercall_page+0x227
  [<c023cdd9>] force_evtchn_callback+0xa
  [<c0122648>] current_fs_time+0x13
  [<c0184c61>] mnt_drop_write+0x1b
  [<c01502e7>] generic_file_aio_read+0x49a
  [<c0149901>] handle_IRQ_event+0x36
  [<c014aa15>] handle_level_irq+0x90
  [<c0105b00>] do_IRQ+0x4d
  [<c023d8c7>] evtchn_do_upcall+0xfa
  [<c010412c>] hypervisor_callback+0x3c
  [<c013326b>] current_kernel_time+0xb
  [<c012263d>] current_fs_time+0x8
  [<c0223b1f>] tty_write+0x191
  [<c0225b5e>] n_tty_open+0x88
  [<c022398e>] send_break+0x5e
  [<c017060c>] vfs_write+0x83
  [<c0170bde>] sys_write+0x3c
  [<c0103f76>] syscall_call+0x7


running it again:

eip: c0105c06 jiffies_to_st+0xe
esp: dd747dcc
eax: 91dc1272   ebx: 2e88c443   ecx: 03020006   edx: 91dc1272
esi: 91dc1272   edi: dd747dec   ebp: c0378184
 cs: 00000061    ds: 0000007b    fs: 000000d8    gs: 00000033

Stack:
 00000003 00000103 dd747dec c023d164 00000000 00000001 00000000 00000000
 00000006 c123872c 00000022 c023f968 00002221 c0378184 00002221 00000000
 00000100 c02cbedf ed4026c0 00000000 c0106444 0000007b c011007b 000000d8
 fffffef0 c0101227 00000061 00000246 c023cdd9 00000fd0 c1234020 00000000

Code:
f1 89 43 14 5b c3 c3 57 56 89 c6 53 8b 1d 80 81 37 c0 0f ae e8 <66> 90 f6 c3 01 74 04 f3 90 eb ec

Call Trace:
  [<c0105c06>] jiffies_to_st+0xe <--
  [<c023d164>] xen_poll_irq+0x41
  [<c023f968>] xen_spin_wait+0xcc
  [<c02cbedf>] _spin_lock+0x31
  [<c0106444>] timer_interrupt+0x37
  [<c011007b>] __change_page_attr_set_clr+0x4cd
  [<c0101227>] hypercall_page+0x227
  [<c023cdd9>] force_evtchn_callback+0xa
  [<c0122648>] current_fs_time+0x13
  [<c0184c61>] mnt_drop_write+0x1b
  [<c01502e7>] generic_file_aio_read+0x49a
  [<c0149901>] handle_IRQ_event+0x36
  [<c014aa15>] handle_level_irq+0x90
  [<c0105b00>] do_IRQ+0x4d
  [<c023d8c7>] evtchn_do_upcall+0xfa
  [<c010412c>] hypervisor_callback+0x3c
  [<c013326b>] current_kernel_time+0xb
  [<c012263d>] current_fs_time+0x8
  [<c0223b1f>] tty_write+0x191
  [<c0225b5e>] n_tty_open+0x88
  [<c022398e>] send_break+0x5e
  [<c017060c>] vfs_write+0x83
  [<c0170bde>] sys_write+0x3c
  [<c0103f76>] syscall_call+0x7


-- Pasi


* Re: Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace
  2010-03-05 12:47 Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace Pasi Kärkkäinen
@ 2010-03-05 13:00 ` Jan Beulich
  2010-03-05 13:12   ` Pasi Kärkkäinen
  2010-03-07 15:52   ` Pasi Kärkkäinen
  0 siblings, 2 replies; 7+ messages in thread
From: Jan Beulich @ 2010-03-05 13:00 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

>>> Pasi Kärkkäinen<pasik@iki.fi> 05.03.10 13:47 >>>
>Basically the guest gets stuck somehow and starts to consume all the CPU time it can get.

It would seem you posted the xenctx output for only one of the two
vCPUs (which, it seems, is unable to acquire xtime_lock) - the
question is what the other vCPU is doing meanwhile.

Jan


* Re: Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace
  2010-03-05 13:00 ` Jan Beulich
@ 2010-03-05 13:12   ` Pasi Kärkkäinen
  2010-03-07 15:52   ` Pasi Kärkkäinen
  1 sibling, 0 replies; 7+ messages in thread
From: Pasi Kärkkäinen @ 2010-03-05 13:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Fri, Mar 05, 2010 at 01:00:39PM +0000, Jan Beulich wrote:
> >>> Pasi Kärkkäinen<pasik@iki.fi> 05.03.10 13:47 >>>
> >Basically the guest gets stuck somehow and starts to consume all the CPU time it can get.
> 
> It would seem you posted the xenctx output for only one of the two
> vCPUs (which, it seems, is unable to acquire xtime_lock) - the
> question is what the other vCPU is doing meanwhile.
> 

Damn, I didn't realize that... and the domU has already been destroyed/rebooted.

-- Pasi


* Re: Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace
  2010-03-05 13:00 ` Jan Beulich
  2010-03-05 13:12   ` Pasi Kärkkäinen
@ 2010-03-07 15:52   ` Pasi Kärkkäinen
  2010-03-08  8:05     ` Jan Beulich
  1 sibling, 1 reply; 7+ messages in thread
From: Pasi Kärkkäinen @ 2010-03-07 15:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Fri, Mar 05, 2010 at 01:00:39PM +0000, Jan Beulich wrote:
> >>> Pasi Kärkkäinen<pasik@iki.fi> 05.03.10 13:47 >>>
> >Basically the guest gets stuck somehow and starts to consume all the CPU time it can get.
> 
> It would seem you posted the xenctx output for only one of the two
> vCPUs (which, it seems, is unable to acquire xtime_lock) - the
> question is what the other vCPU is doing meanwhile.
> 

Ok, another guest crashed, so here are the xenctx outputs for both vcpus:

vcpu 0:

eip: c0105c0f jiffies_to_st+0x17
esp: ec8d3cd4
eax: bc58158b   ebx: 15e6990b   ecx: 03020006   edx: bc58158b
esi: bc58158b   edi: ec8d3cf4   ebp: c0378184
 cs: 00000061    ds: 0000007b    fs: 000000d8    gs: 00000033

Stack:
 00000003 00000103 ec8d3cf4 c023d00c 00000000 00000001 00000000 00000000
 00000006 c149d72c 00000086 c023f80f 00008685 c0378184 00008685 00000000
 00000100 c02cbd87 ed4026c0 00000000 c0106444 095ff7f8 0354f7fa 00000000
 001200fa 007a0001 00000000 00000000 00000000 00000000 c1499020 00000000

Code:
89 c6 53 8b 1d 80 81 37 c0 0f ae e8 66 90 f6 c3 01 74 04 f3 90 <eb> ec a1 40 81 37 c0 89 f1 29 c1

Call Trace:
  [<c0105c0f>] jiffies_to_st+0x17 <--
  [<c023d00c>] startup_pirq+0x46
  [<c023f80f>] uuid_show+0x4d
  [<c02cbd87>] rwsem_down_read_failed+0x23
  [<c0106444>] timer_interrupt+0x37
  [<c026c4a8>] register_netdevice_notifier+0xeb
  [<c0149949>] handle_bad_irq+0x10
  [<c014aa5d>] handle_level_irq+0xd8
  [<c0105b00>] do_IRQ+0x4d
  [<c023d76f>] xen_clear_irq_pending+0x3
  [<c010412c>] hypervisor_callback+0x3c
  [<c01332b2>] timekeeping_suspend+0x1c
  [<c012269d>] sys_stime+0x8
  [<c0181d5d>] igrab+0x1d
  [<c015032d>] generic_file_aio_read+0x4e0
  [<c016fe8e>] do_sync_write+0xb3
  [<c012ec98>] wake_bit_function+0x23
  [<c0265ded>] skb_copy_and_csum_bits+0x22f
  [<c026966a>] netdev_create_hash+0x20
  [<c01b945b>] security_bprm_post_apply_creds+0x1
  [<c016fdcf>] do_sync_readv_writev+0xed
  [<c017061e>] vfs_write+0x95
  [<c0170a6f>] sys_lseek+0x15
  [<c0103f76>] syscall_call+0x7


vcpu 1:

eip: c0105c0f jiffies_to_st+0x17
esp: ed447d60
eax: bc58158b   ebx: 15e6990b   ecx: 03020009   edx: bc58158b
esi: bc58158b   edi: ed447d80   ebp: c14a4b40
 cs: 00000061    ds: 0000007b    fs: 000000d8    gs: 00000000

Stack:
 00000003 00000107 ed447d80 c023d00c 00000000 00000001 00000000 00000000
 00000009 c14a472c 000000fb c023f80f 0000fbfa c14a4b40 0000fbfa ed447dcc
 ed4a15e0 c02cbd87 c03bfb40 c14a4b40 c01175ac ed4a15e0 ed42bb84 00000000
 00000000 c01177c6 00000003 00000001 ed4c5fbc ed42bb84 00000001 00000000

Code:
89 c6 53 8b 1d 80 81 37 c0 0f ae e8 66 90 f6 c3 01 74 04 f3 90 <eb> ec a1 40 81 37 c0 89 f1 29 c1

Call Trace:
  [<c0105c0f>] jiffies_to_st+0x17 <--
  [<c023d00c>] startup_pirq+0x46
  [<c023f80f>] uuid_show+0x4d
  [<c02cbd87>] rwsem_down_read_failed+0x23
  [<c01175ac>] sched_move_task+0x71
  [<c01177c6>] try_to_wake_up+0xbc
  [<c012eca5>] wake_bit_function+0x30
  [<c0114b84>] sys_sched_get_priority_max+0x16
  [<c011684b>] print_cfs_rq+0x84
  [<c012c27d>] flush_cpu_workqueue+0xc
  [<c012c5c5>] queue_work+0x28
  [<c012c620>] __cancel_work_timer+0x3
  [<c0106644>] timer_interrupt+0x237
  [<c011684b>] print_cfs_rq+0x84
  [<c0105f74>] get_nsec_offset+0xe
  [<c0149949>] handle_bad_irq+0x10
  [<c014aa5d>] handle_level_irq+0xd8
  [<c0105b00>] do_IRQ+0x4d
  [<c023d76f>] xen_clear_irq_pending+0x3
  [<c010412c>] hypervisor_callback+0x3c
  [<c01013a7>] hypercall_page+0x3a7
  [<c0105f52>] xen_safe_halt+0x9f
  [<c01028ab>] xen_idle+0x1b
  [<c0102810>] cpu_idle+0xa8


-- Pasi


* Re: Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace
  2010-03-07 15:52   ` Pasi Kärkkäinen
@ 2010-03-08  8:05     ` Jan Beulich
  2010-03-08  8:36       ` Pasi Kärkkäinen
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2010-03-08  8:05 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

>>> Pasi Kärkkäinen<pasik@iki.fi> 07.03.10 16:52 >>>
>Ok, another guest crashed, so here are the xenctx outputs for both vcpus:

Those would require some cleaning up - afaict these are imprecise call
traces, which are pretty hard to analyze without the corresponding
binary. In any case, both vCPUs appear to be trying to acquire different
spin locks; chances are good that this is simply an ABBA deadlock.

But it's certainly also suspicious that *both* have handle_bad_irq()
on their call stacks.

Jan

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

[snip - full xenctx output for both vcpus, quoted verbatim from the previous message]


* Re: Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace
  2010-03-08  8:05     ` Jan Beulich
@ 2010-03-08  8:36       ` Pasi Kärkkäinen
  2010-03-08  8:54         ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Pasi Kärkkäinen @ 2010-03-08  8:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On Mon, Mar 08, 2010 at 08:05:28AM +0000, Jan Beulich wrote:
> >>> Pasi Kärkkäinen<pasik@iki.fi> 07.03.10 16:52 >>>
> >Ok, another guest crashed, so here are the xenctx outputs for both vcpus:
> 
> Those would require some cleaning up - afaict these are imprecise call
> traces, which are pretty hard to analyze without the corresponding
> binary. In any case, both vCPUs appear to be trying to acquire different
> spin locks; chances are good that this is simply an ABBA deadlock.
> 
> But it's certainly also suspicious that *both* have handle_bad_irq()
> on their call stacks.
> 

I still have the guest up, still in the crashed state.
Anything I should try?

-- Pasi

> Jan
> 
> [snip - full quoted xenctx output, identical to the previous message]


* Re: Debian Lenny 2.6.26-2-xen-686 crashing as multi-vcpu domU, stack trace
  2010-03-08  8:36       ` Pasi Kärkkäinen
@ 2010-03-08  8:54         ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2010-03-08  8:54 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

>>> Pasi Kärkkäinen<pasik@iki.fi> 08.03.10 09:36 >>>
>I still have the guest up, still in the crashed state.
>Anything I should try?

It's not so much something to try - it's the analysis of the call
stacks that is going to get you forward. In particular, after
reconstructing the true call stacks (i.e. with all false entries removed)
and determining which two locks the two vCPUs are trying to
acquire, it should be possible to tell whether each of those locks is
currently held by the respective other vCPU.

Plus this work would also show whether the handle_bad_irq()
entries on the stacks are directly, or only indirectly (and hence only
possibly), related to the issue.

Jan

