* xen pci passthrough hung task instead of terminate
@ 2010-07-25 15:35 Sander Eikelenboom
  2010-07-26 15:53 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 3+ messages in thread
From: Sander Eikelenboom @ 2010-07-25 15:35 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel

Hi Konrad,

I have tried both your trees, together with some experimental usb3 stuff.
It seems to work, apart from some usb3 problems: after several hours of video grabbing they crash the program, but instead of terminating it keeps hanging.
Since xen_evtchn is in the stack trace, I'm wondering if any Xen parts are causing it to hang instead of terminate.

--
Sander



Jul 25 16:54:26 security kernel: [26400.136170] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 25 16:54:26 security kernel: [26400.136191] motion        D ffffffff810049f9     0  1556      1 0x00000000
Jul 25 16:54:26 security kernel: [26400.136220]  ffff88001fce6800 0000000000000286 0000000000000001 0000000000014580
Jul 25 16:54:26 security kernel: [26400.136254]  ffff88001e251fd8 ffff88001e251fd8 ffff88001e088100 0000000000014580
Jul 25 16:54:26 security kernel: [26400.136285]  0000000000014580 0000000000014580 ffff88001e088100 0000000000000001
Jul 25 16:54:26 security kernel: [26400.136316] Call Trace:
Jul 25 16:54:26 security kernel: [26400.136346]  [<ffffffff8142c33c>] ? __mutex_lock_slowpath+0xda/0x125
Jul 25 16:54:26 security kernel: [26400.136374]  [<ffffffff8142c1e1>] ? mutex_lock+0x12/0x28
Jul 25 16:54:26 security kernel: [26400.136399]  [<ffffffffa0015ea5>] ? videobuf_streamoff+0x13/0x34 [videobuf_core]
Jul 25 16:54:26 security kernel: [26400.136424]  [<ffffffff81005cc5>] ? xen_force_evtchn_callback+0x9/0xa
Jul 25 16:54:26 security kernel: [26400.136449]  [<ffffffffa008b5a8>] ? vidioc_streamoff+0x7e/0xb5 [em28xx]
Jul 25 16:54:26 security kernel: [26400.136473]  [<ffffffffa00355fe>] ? __video_do_ioctl+0x181f/0x3cc7 [videodev]
Jul 25 16:54:26 security kernel: [26400.136496]  [<ffffffff8100631f>] ? xen_restore_fl_direct_end+0x0/0x1
Jul 25 16:54:26 security kernel: [26400.136517]  [<ffffffff8142d2a4>] ? _raw_spin_unlock_irqrestore+0xc/0xd
Jul 25 16:54:26 security kernel: [26400.136539]  [<ffffffff81393cda>] ? sock_def_readable+0x3b/0x5d
Jul 25 16:54:26 security kernel: [26400.136561]  [<ffffffff81404296>] ? unix_dgram_sendmsg+0x428/0x4b2
Jul 25 16:54:26 security kernel: [26400.136580]  [<ffffffff810058fa>] ? xen_set_pte_at+0x196/0x1b6
Jul 25 16:54:26 security kernel: [26400.136600]  [<ffffffff810036bd>] ? __raw_callee_save_xen_make_pte+0x11/0x1e
Jul 25 16:54:26 security kernel: [26400.136620]  [<ffffffff81390c1e>] ? sock_sendmsg+0xd1/0xec
Jul 25 16:54:26 security kernel: [26400.136641]  [<ffffffff810b117c>] ? __do_fault+0x3eb/0x426
Jul 25 16:54:26 security kernel: [26400.136662]  [<ffffffffa0037d38>] ? video_ioctl2+0x292/0x32e [videodev]
Jul 25 16:54:26 security kernel: [26400.136684]  [<ffffffff8139271a>] ? sys_sendto+0x10d/0x127
Jul 25 16:54:26 security kernel: [26400.136702]  [<ffffffff81006332>] ? check_events+0x12/0x20
Jul 25 16:54:26 security kernel: [26400.136722]  [<ffffffffa003310b>] ? v4l2_ioctl+0x38/0x3a [videodev]
Jul 25 16:54:26 security kernel: [26400.136742]  [<ffffffff810d45be>] ? vfs_ioctl+0x69/0x92
Jul 25 16:54:26 security kernel: [26400.136760]  [<ffffffff810d4a6e>] ? do_vfs_ioctl+0x411/0x43c
Jul 25 16:54:26 security kernel: [26400.136779]  [<ffffffff810c874c>] ? vfs_write+0x134/0x169
Jul 25 16:54:26 security kernel: [26400.136797]  [<ffffffff810d4aea>] ? sys_ioctl+0x51/0x70
Jul 25 16:54:26 security kernel: [26400.136815]  [<ffffffff810086c2>] ? system_call_fastpath+0x16/0x1b


* Re: xen pci passthrough hung task instead of terminate
  2010-07-25 15:35 xen pci passthrough hung task instead of terminate Sander Eikelenboom
@ 2010-07-26 15:53 ` Konrad Rzeszutek Wilk
  2010-07-26 15:55   ` Sander Eikelenboom
  0 siblings, 1 reply; 3+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-07-26 15:53 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel

On Sun, Jul 25, 2010 at 05:35:07PM +0200, Sander Eikelenboom wrote:
> Hi Konrad,
> 
> I have tried both your trees, together with some experimental usb3 stuff.

How many CPUs do you have assigned to your guest?

I presume this problem does not appear on bare metal? Though
looking at the stack, I would think it would too - it does not
look Xen specific - just that a mutex is deadlocked.
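
When checking which code each frame of such a trace belongs to, it can help to split every line into address, symbol, offset and module. The following is a minimal C sketch, not an existing tool: `struct trace_frame` and `parse_trace_frame()` are hypothetical names, and the line format is assumed to be the one used in the trace from the original report. It also records the '?' marker, which the kernel prints for addresses it found on the stack but could not confirm as part of the call chain, so a frame such as xen_force_evtchn_callback marked that way is not necessarily a caller of mutex_lock.

```c
#include <stdio.h>
#include <string.h>

/* One parsed frame from a kernel "Call Trace:" line (hypothetical helper type). */
struct trace_frame {
    unsigned long long addr;  /* address found on the stack */
    char symbol[64];          /* function name, e.g. "mutex_lock" */
    unsigned int offset;      /* offset into the function */
    unsigned int size;        /* total size of the function */
    char module[32];          /* module name, empty for built-in code */
    int uncertain;            /* 1 if the frame carries a '?' marker */
};

/*
 * Parse one line such as
 *   " [<ffffffffa0015ea5>] ? videobuf_streamoff+0x13/0x34 [videobuf_core]"
 * A syslog prefix in front of the frame is skipped.
 * Returns 1 on success, 0 if the line does not contain a frame.
 */
int parse_trace_frame(const char *line, struct trace_frame *out)
{
    const char *p = strstr(line, "[<");
    int consumed = 0;

    if (!p)
        return 0;
    memset(out, 0, sizeof(*out));

    /* Address between "[<" and ">]"; %n records how much was consumed. */
    if (sscanf(p, "[<%llx>] %n", &out->addr, &consumed) != 1 || consumed == 0)
        return 0;
    p += consumed;

    /* Optional '?' marker: frame not verified by the unwinder. */
    if (*p == '?') {
        out->uncertain = 1;
        p++;
        while (*p == ' ')
            p++;
    }

    /* symbol+0xOFFSET/0xSIZE, optionally followed by " [module]". */
    if (sscanf(p, "%63[^+]+0x%x/0x%x [%31[^]]]",
               out->symbol, &out->offset, &out->size, out->module) < 3)
        return 0;
    return 1;
}
```

Feeding it the videobuf_streamoff line from the report yields symbol `videobuf_streamoff`, module `videobuf_core`, and the uncertain flag set; a line such as "Call Trace:" is rejected.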

> It seems to work, apart from some usb3 problems: after several hours of video grabbing they crash the program, but instead of terminating it keeps hanging.
> Since xen_evtchn is in the stack trace, I'm wondering if any Xen parts are causing it to hang instead of terminate.

Here is what the documentation comment for mutex_lock() says:

/***
 * mutex_lock - acquire the mutex
 * @lock: the mutex to be acquired
 *
 * Lock the mutex exclusively for this task. If the mutex is not
 * available right now, it will sleep until it can get it.
 *
 * The mutex must later on be released by the same task that
 * acquired it. Recursive locking is not allowed. The task
 * may not exit without first unlocking the mutex. Also, kernel
 * memory where the mutex resides mutex must not be freed with
 * the mutex still locked. The mutex must first be initialized
 * (or statically defined) before it can be locked. memset()-ing
 * the mutex to 0 is not allowed.
 *
 * ( The CONFIG_DEBUG_MUTEXES .config option turns on debugging
 *   checks that will enforce the restrictions and will also do
 *   deadlock debugging. )
 *
 * This function is similar to (but not equivalent to) down()

So I think the next step is to try CONFIG_DEBUG_MUTEXES, and see
what it tells you.
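
The rule that a task may not exit without first unlocking the mutex can be illustrated in userspace. The sketch below uses POSIX threads, not the kernel mutex API, and all function names in it (`leaky_worker`, `try_lock_with_timeout`, `run_demo`) are made up for the illustration. A worker thread takes a default (non-robust) mutex and returns without releasing it. A later locker would then block forever, so the sketch uses pthread_mutex_timedlock() to give up after a timeout rather than hang. Whether the em28xx/videobuf path contains this kind of bug is not established here; that is what CONFIG_DEBUG_MUTEXES is meant to help find out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Default mutex: non-robust, so a dead owner is never cleaned up. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Buggy worker: takes the lock and returns without unlocking it. */
static void *leaky_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    return NULL;  /* missing pthread_mutex_unlock(&demo_lock) */
}

/*
 * Try to take the lock, giving up after timeout_sec seconds.
 * Returns 0 if the lock was taken (and released again),
 * or the error from pthread_mutex_timedlock(), e.g. ETIMEDOUT.
 */
int try_lock_with_timeout(int timeout_sec)
{
    struct timespec deadline;
    int ret;

    /* pthread_mutex_timedlock() takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    ret = pthread_mutex_timedlock(&demo_lock, &deadline);
    if (ret == 0)
        pthread_mutex_unlock(&demo_lock);
    return ret;
}

/* Run the leaky worker to completion, then try to take the lock. */
int run_demo(void)
{
    pthread_t worker;

    if (pthread_create(&worker, NULL, leaky_worker, NULL) != 0)
        return -1;
    pthread_join(worker, NULL);

    /* The owner is gone but the mutex is still held. */
    return try_lock_with_timeout(1);
}
```

On Linux with glibc, run_demo() is expected to return ETIMEDOUT after about one second (build with -pthread). A plain pthread_mutex_lock() in the same place would never return, which is the userspace analogue of a task stuck in D state inside mutex_lock.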



* Re: xen pci passthrough hung task instead of terminate
  2010-07-26 15:53 ` Konrad Rzeszutek Wilk
@ 2010-07-26 15:55   ` Sander Eikelenboom
  0 siblings, 0 replies; 3+ messages in thread
From: Sander Eikelenboom @ 2010-07-26 15:55 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

2 vcpus.

It is a good idea to try with just 1 for now :)




-- 
Best regards,
 Sander                            mailto:linux@eikelenboom.it
