* [Xenomai-help] machine hangs running oprofile on xenomai kernel
@ 2010-06-28 16:48 Sydir, Jerry
  2010-06-29 14:47 ` Jan Kiszka
  0 siblings, 1 reply; 7+ messages in thread
From: Sydir, Jerry @ 2010-06-28 16:48 UTC (permalink / raw)
  To: xenomai@xenomai.org

Hello,

I am having trouble running oprofile on a Xenomai kernel. I am running Xenomai 2.5.3 built into a 2.6.32.11 kernel on an Intel Core 2 Duo. I have tried both the 0.9.3 and 0.9.6 versions of oprofile. When I try to start collection ("opcontrol --start"), the machine hangs. Oprofile runs without trouble on a clean 2.6.32.11 kernel built with the same configuration as the Xenomai version.

Is this a known limitation? Are there any configuration settings that I need to use?

Best Regards,
Jerry


* Re: [Xenomai-help] machine hangs running oprofile on xenomai kernel
  2010-06-28 16:48 [Xenomai-help] machine hangs running oprofile on xenomai kernel Sydir, Jerry
@ 2010-06-29 14:47 ` Jan Kiszka
  2010-06-29 21:27   ` Sydir, Jerry
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kiszka @ 2010-06-29 14:47 UTC (permalink / raw)
  To: Sydir, Jerry; +Cc: xenomai@xenomai.org

Sydir, Jerry wrote:
> Hello,
> 
> I am having trouble running oprofile on a Xenomai kernel. I am running Xenomai 2.5.3 built into a 2.6.32.11 kernel on an Intel Core 2 Duo. I have tried both the 0.9.3 and 0.9.6 versions of oprofile. When I try to start collection ("opcontrol --start"), the machine hangs. Oprofile runs without trouble on a clean 2.6.32.11 kernel built with the same configuration as the Xenomai version.
> 
> Is this a known limitation? Are there any configuration settings that I need to use?

Could you check whether loading oprofile with the module parameter timer=1
helps? In that mode oprofile samples from the timer interrupt instead of NMIs,
so it will no longer profile Xenomai code, but the result may point to the NMI path.
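
A rough sketch of that test (not taken from this thread): the sequence below unloads oprofile and reloads the module with timer=1. The `run` helper, the `oprofile_timer_mode` function and the `DRY_RUN` variable are hypothetical names used only for this illustration. By default the commands are printed rather than executed, since the real ones need root and assume that oprofile is built as a module and that `opcontrol` has already been set up.

```shell
#!/bin/sh
# Sketch: restart oprofile in timer-interrupt mode instead of NMI mode.
# DRY_RUN is a hypothetical switch of this sketch: 1 (default) prints the
# commands, anything else executes them (requires root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

oprofile_timer_mode() {
    run opcontrol --deinit          # shut down the daemon and unload the module
    run modprobe oprofile timer=1   # reload the module, forcing timer mode
    run opcontrol --init            # make the oprofile driver interface available
    run opcontrol --start           # start collection with the existing setup
}
```

Calling `oprofile_timer_mode` prints the four commands; running it as root with `DRY_RUN=0` would execute them instead.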

But oprofile might have its own problems. This is what I got testing it
on 2.6.34 (opcontrol --dump in a profiling session):

[  667.539673] =======================================================
[  667.540063] [ INFO: possible circular locking dependency detected ]
[  667.540063] 2.6.34-xeno_64 #24
[  667.540063] -------------------------------------------------------
[  667.540063] oprofiled/6497 is trying to acquire lock:
[  667.540063]  (&mm->mmap_sem){++++++}, at: [<ffffffff8111fd9b>] might_fault+0x68/0xb5
[  667.540063]
[  667.540063] but task is already holding lock:
[  667.540063]  (dcookie_mutex){+.+.+.}, at: [<ffffffff81190980>] sys_lookup_dcookie+0x44/0x182
[  667.540063]
[  667.540063] which lock already depends on the new lock.
[  667.540063]
[  667.540063]
[  667.540063] the existing dependency chain (in reverse order) is:
[  667.540063]
[  667.540063] -> #1 (dcookie_mutex){+.+.+.}:
[  667.540063]        [<ffffffff81065f0e>] __lock_acquire+0x150b/0x1862
[  667.540063]        [<ffffffff8106635d>] lock_acquire+0xf8/0x122
[  667.540063]        [<ffffffff8132ed77>] mutex_lock_nested+0x69/0x336
[  667.540063]        [<ffffffff81190aed>] get_dcookie+0x2f/0x13e
[  667.540063]        [<ffffffffa02d4fbb>] sync_buffer+0x1a9/0x413 [oprofile]
[  667.540063]        [<ffffffffa02d523b>] task_exit_notify+0x16/0x1a [oprofile]
[  667.540063]        [<ffffffff81334813>] notifier_call_chain+0x38/0x60
[  667.540063]        [<ffffffff810569c6>] __blocking_notifier_call_chain+0x52/0x6f
[  667.540063]        [<ffffffff810569f7>] blocking_notifier_call_chain+0x14/0x16
[  667.540063]        [<ffffffff810599ed>] profile_task_exit+0x1a/0x1c
[  667.540063]        [<ffffffff81039e37>] do_exit+0x2a/0x705
[  667.540063]        [<ffffffff8103a58a>] do_group_exit+0x78/0xa1
[  667.540063]        [<ffffffff8103a5ca>] sys_exit_group+0x17/0x1b
[  667.540063]        [<ffffffff81002b7f>] system_call_fastpath+0x16/0x1b
[  667.540063]
[  667.540063] -> #0 (&mm->mmap_sem){++++++}:
[  667.540063]        [<ffffffff81065c33>] __lock_acquire+0x1230/0x1862
[  667.540063]        [<ffffffff8106635d>] lock_acquire+0xf8/0x122
[  667.540063]        [<ffffffff8111fdc8>] might_fault+0x95/0xb5
[  667.540063]        [<ffffffff81190a76>] sys_lookup_dcookie+0x13a/0x182
[  667.540063]        [<ffffffff81002b7f>] system_call_fastpath+0x16/0x1b
[  667.540063]
[  667.540063] other info that might help us debug this:
[  667.540063]
[  667.540063] 1 lock held by oprofiled/6497:
[  667.540063]  #0:  (dcookie_mutex){+.+.+.}, at: [<ffffffff81190980>] sys_lookup_dcookie+0x44/0x182
[  667.540063]
[  667.540063] stack backtrace:
[  667.540063] Pid: 6497, comm: oprofiled Not tainted 2.6.34-xeno_64 #24
[  667.540063] Call Trace:
[  667.540063]  [<ffffffff810644a7>] print_circular_bug+0xb3/0xc2
[  667.540063]  [<ffffffff81065c33>] __lock_acquire+0x1230/0x1862
[  667.540063]  [<ffffffff8106635d>] lock_acquire+0xf8/0x122
[  667.540063]  [<ffffffff8111fd9b>] ? might_fault+0x68/0xb5
[  667.540063]  [<ffffffff8111fdc8>] might_fault+0x95/0xb5
[  667.540063]  [<ffffffff8111fd9b>] ? might_fault+0x68/0xb5
[  667.540063]  [<ffffffff81190a76>] sys_lookup_dcookie+0x13a/0x182
[  667.540063]  [<ffffffff81002b7f>] system_call_fastpath+0x16/0x1b
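
Reading the report: oprofiled holds dcookie_mutex in sys_lookup_dcookie and then wants mmap_sem (via might_fault), while lockdep has already recorded the opposite order, with dcookie_mutex taken on the task-exit path through sync_buffer, presumably with mmap_sem held. A small sketch for pulling the lock names of such a chain out of a saved kernel log follows; `lockdep_chain` is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Sketch: list the locks named in a lockdep "existing dependency chain".
# lockdep_chain (hypothetical name) reads kernel log text on stdin and prints
# one "<index> <lock name>" pair for every "-> #N (lock)" line it finds.
lockdep_chain() {
    sed -n 's/.*-> #\([0-9][0-9]*\) (\([^)]*\)).*/\1 \2/p'
}
```

Fed the report above (for example with `dmesg | lockdep_chain`), it prints `1 dcookie_mutex` followed by `0 &mm->mmap_sem`.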

Maybe perf works, maybe it also has issues when using NMIs. I haven't
tried this yet over Xenomai.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



* Re: [Xenomai-help] machine hangs running oprofile on xenomai kernel
  2010-06-29 14:47 ` Jan Kiszka
@ 2010-06-29 21:27   ` Sydir, Jerry
  2010-06-29 22:33     ` Jan Kiszka
  0 siblings, 1 reply; 7+ messages in thread
From: Sydir, Jerry @ 2010-06-29 21:27 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai@xenomai.org

Hi Jan,

Thanks for your reply. I tried loading the oprofile module with timer=1, and oprofile seems to work (it doesn't hang the system and spits out some results). So it appears that the use of NMIs is the problem. Do you have any suggestions on what to try next?

Best Regards,
Jerry


* Re: [Xenomai-help] machine hangs running oprofile on xenomai kernel
  2010-06-29 21:27   ` Sydir, Jerry
@ 2010-06-29 22:33     ` Jan Kiszka
  2010-06-29 22:36       ` Gilles Chanteperdrix
  2010-06-30  0:02       ` Sydir, Jerry
  0 siblings, 2 replies; 7+ messages in thread
From: Jan Kiszka @ 2010-06-29 22:33 UTC (permalink / raw)
  To: Sydir, Jerry; +Cc: xenomai@xenomai.org

[please do not top-post]

Sydir, Jerry wrote:
> Hi Jan,
> 
> Thanks for your reply. I tried loading the oprofile module with timer=1, and oprofile seems to work (it doesn't hang the system and spits out some results). So it appears that the use of NMIs is the problem. Do you have any suggestions on what to try next?

Now someone would have to dig into the code path taken by oprofile
during NMI handling to find the reason for the lock-up. Did you have any
real-time jobs running when starting the profiling, or was Xenomai idle?

On the other hand, oprofile is a legacy interface. It probably makes
more sense to try and, if required, fix perf instead.

Jan



* Re: [Xenomai-help] machine hangs running oprofile on xenomai kernel
  2010-06-29 22:33     ` Jan Kiszka
@ 2010-06-29 22:36       ` Gilles Chanteperdrix
  2010-06-30  0:13         ` Sydir, Jerry
  2010-06-30  0:02       ` Sydir, Jerry
  1 sibling, 1 reply; 7+ messages in thread
From: Gilles Chanteperdrix @ 2010-06-29 22:36 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai@xenomai.org

Jan Kiszka wrote:
> [please do not top-post]
> 
> Sydir, Jerry wrote:
>> Hi Jan,
>>
>> Thanks for your reply. I tried loading the oprofile module with timer=1, and oprofile seems to work (it doesn't hang the system and spits out some results). So it appears that the use of NMIs is the problem. Do you have any suggestions on what to try next?
> 
> Now someone would have to dig into the code path taken by oprofile
> during NMI handling to find the reason for the lock-up. Did you have any
> real-time jobs running when starting the profiling, or was Xenomai idle?
> 
> On the other hand, oprofile is a legacy interface. It probably makes
> more sense to try and, if required, fix perf instead.

Couldn't there be some bad interaction with the Xenomai NMI watchdog?

-- 
					    Gilles.



* Re: [Xenomai-help] machine hangs running oprofile on xenomai kernel
  2010-06-29 22:33     ` Jan Kiszka
  2010-06-29 22:36       ` Gilles Chanteperdrix
@ 2010-06-30  0:02       ` Sydir, Jerry
  1 sibling, 0 replies; 7+ messages in thread
From: Sydir, Jerry @ 2010-06-30  0:02 UTC (permalink / raw)
  To: jan.kiszka@domain.hid; +Cc: xenomai@xenomai.org



Jan Kiszka wrote:

>[please do not top-post]
>
>Sydir, Jerry wrote:
>> Hi Jan,
>>
>> Thanks for your reply. I tried loading the oprofile module with timer=1, and oprofile seems to work (it doesn't hang the system and spits out some results). So it appears that the use of NMIs is the problem. Do you have any suggestions on what to try next?
>
>
>Now someone would have to dig into the code path taken by oprofile
>during NMI handling to find the reason for the lock-up. Did you have any
>real-time jobs running when starting the profiling, or was Xenomai idle?
>
>On the other hand, oprofile is a legacy interface. It probably makes
>more sense to try and, if required, fix perf instead.
>
>Jan

I was not running any real-time jobs, so Xenomai was idle. 

I just tried running perf on a non-real-time command, and the system hung in the same way as with oprofile. Here is the command that I tried:
perf -e cycles -c 400000 find / jerry*


Jerry Sydir



* Re: [Xenomai-help] machine hangs running oprofile on xenomai kernel
  2010-06-29 22:36       ` Gilles Chanteperdrix
@ 2010-06-30  0:13         ` Sydir, Jerry
  0 siblings, 0 replies; 7+ messages in thread
From: Sydir, Jerry @ 2010-06-30  0:13 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org



Gilles Chanteperdrix wrote:

>Jan Kiszka wrote:
>> [please do not top-post]
>>
>> Sydir, Jerry wrote:
>>> Hi Jan,
>>>
>>> Thanks for your reply. I tried loading the oprofile module with timer=1, and oprofile seems to work (it doesn't hang the system and spits out some results). So it appears that the use of NMIs is the problem. Do you have any suggestions on what to try next?
>>
>> Now someone would have to dig into the code path taken by oprofile
>> during NMI handling to find the reason for the lock-up. Did you have any
>> real-time jobs running when starting the profiling, or was Xenomai idle?
>>
>> On the other hand, oprofile is a legacy interface. It probably makes
>> more sense to try and, if required, fix perf instead.
>
>Couldn't there be some bad interaction with the Xenomai NMI watchdog?
>
>--
>					    Gilles.

I don't think the NMI watchdog is enabled in my configuration (CONFIG_XENO_HW_NMI_DEBUG_LATENCY is not set). I'm not very familiar with Xenomai, so I'm not sure whether this is what you are talking about.
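
For reference, a minimal sketch of how such a check could be scripted. `config_state` is a hypothetical helper name, and the config file location in the usage note is an assumption that varies between systems. The symbol is the one quoted above; whether it is the only option governing the Xenomai NMI watchdog is not established in this thread.

```shell
#!/bin/sh
# Sketch: report how a kernel config symbol is set in a .config-style file.
# config_state is a hypothetical helper: $1 is the config file, $2 the symbol.
# Prints the assigned value (e.g. "y" or "m"), "not set" for an explicitly
# disabled symbol, or "absent" if the file does not mention the symbol.
config_state() {
    if grep -q "^$2=" "$1"; then
        # e.g. CONFIG_FOO=y  ->  y
        grep "^$2=" "$1" | head -n 1 | cut -d= -f2-
    elif grep -q "^# $2 is not set" "$1"; then
        echo "not set"
    else
        echo "absent"
    fi
}
```

A possible invocation, assuming the running kernel's configuration was installed under /boot: `config_state "/boot/config-$(uname -r)" CONFIG_XENO_HW_NMI_DEBUG_LATENCY`.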

Jerry Sydir

