[Xenomai-help] isolating unwanted mode switch


* [Xenomai-help] isolating unwanted mode switch
@ 2007-01-23 20:48 Jeff Weber
  2007-01-23 21:45 ` Dmitry Adamushko
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Weber @ 2007-01-23 20:48 UTC (permalink / raw)
  To: Xenomai Help

Greetings,

I have a multi-threaded C++ realtime application that is encountering an unwanted switch to secondary mode.
To isolate the mode switch, I've enabled the task mode T_WARNSW to deliver the signal SIGXCPU.
gdb shows me where the SIGXCPU signal was delivered to the real time thread, presumably upon the
transition from primary to secondary mode:

Here's the backtrace of the core file:

0  0x080eb5ce in AMSC::CSeqMeas::input (this=0x81af900, dat=0xb642cccc, 
    tm=@0x81b1ac0) at prj/src/amscseqmeas.cpp:1656
(gdb) bt
#0  0x080eb5ce in AMSC::CSeqMeas::input (this=0x81af900, dat=0xb642cccc, tm=@0x81b1ac0) at prj/src/amscseqmeas.cpp:1656
#1  0x080de1cd in fastdelegate::FastDelegate2<unsigned short*, RtTime const&, void>::operator() (this=0x8253330, p1=0xb642cccc, p2=@0x81b1ac0) at prj/src/fastdelegate.h:1076
#2  0x080dc1f6 in AMSC::CRtA2dHdlr::hsThreadwIrqFnc (this=0x81b1b20) at prj/src/amsca2d.cpp:669
#3  0x080ba3ef in fastdelegate::FastDelegate0<void>::operator() (this=0x81b1b58) at prj/src/fastdelegate.h:906
#4  0x080c22c3 in AMSC::CRtThread::starter (arg=0x81b1b28) at prj/src/amscthread.h:444
#5  0xb7eb944e in rt_task_trampoline (cookie=0x81b1b28) at /usr/src/xenomai-2.2.4/src/skins/native/task.c:89
#6  0xb7f1234b in start_thread () from /lib/libpthread.so.0
#7  0xb7d4165e in clone () from /lib/libc.so.6
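
For reference, the same kind of backtrace can also be logged in-process instead of being taken from a core file. The following is only a sketch in C++ built on glibc's backtrace() facilities; it is not code from the application above. The names on_sigxcpu, install_warnsw_handler and g_mode_switch_seen are made up for illustration, and it assumes the real-time thread has already enabled T_WARNSW (in the native skin that is normally done with rt_task_set_mode(0, T_WARNSW, NULL)). backtrace() is not formally async-signal-safe, so treat this as a debugging aid only.

```cpp
#include <cassert>
#include <csignal>
#include <cstring>
#include <execinfo.h>
#include <unistd.h>

// Set by the handler so the rest of the program can tell that a
// mode switch was reported (hypothetical name).
static volatile std::sig_atomic_t g_mode_switch_seen = 0;

// SIGXCPU handler: dumps the current call chain to stderr.
// It runs after the switch to secondary mode has already happened,
// so the Linux system calls made here cause no further harm.
extern "C" void on_sigxcpu(int)
{
    static const char msg[] = "SIGXCPU: switch to secondary mode, backtrace:\n";
    void *frames[32];
    const int depth = backtrace(frames, 32);

    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    // Writes one line per frame straight to the file descriptor, no malloc().
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);
    g_mode_switch_seen = 1;
}

// Installs the handler; call once from main() before any RT thread starts.
bool install_warnsw_handler()
{
    // The first backtrace() call may load libgcc dynamically; do that here
    // rather than inside the signal handler.
    void *warmup[1];
    backtrace(warmup, 1);

    struct sigaction sa;
    std::memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigxcpu;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGXCPU, &sa, nullptr) == 0;
}
```

The addresses printed this way can be fed to gdb or addr2line, which gives the same frame information as the core file without stopping the process.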

Here's source code (x86 assembly view) of frame #0 near the SIGXCPU delivery point:
Dump of assembler code from 0x80eb5a4 to 0x80eb6a4:
    0x080eb5a4 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+232>:   mov    0x8(%ebp),%eax
    0x080eb5a7 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+235>:   mov    0x71c(%eax),%ecx
    0x080eb5ad <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+241>:   mov    0x8(%ebp),%eax
    0x080eb5b0 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+244>:   mov    0x720(%eax),%eax
    0x080eb5b6 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+250>:   mov    %eax,%edx
    0x080eb5b8 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+252>:   mov    %edx,%eax
    0x080eb5ba <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+254>:   shl    $0x2,%eax
    0x080eb5bd <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+257>:   add    %edx,%eax
    0x080eb5bf <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+259>:   shl    $0x2,%eax
    0x080eb5c2 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+262>:   add    %eax,%ecx
    0x080eb5c4 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+264>:   mov    0x10(%ebp),%eax
    0x080eb5c7 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+267>:   mov    0x4(%eax),%edx
    0x080eb5ca <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+270>:   mov    (%eax),%eax
    0x080eb5cc <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+272>:   mov    %eax,(%ecx)
    0x080eb5ce <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+274>:   mov    %edx,0x4(%ecx)  <-frame pointer here
    0x080eb5d1 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+277>:   mov    0x8(%ebp),%eax
    0x080eb5d4 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+280>:   mov    0x71c(%eax),%ecx
    0x080eb5da <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+286>:   mov    0x8(%ebp),%eax
    0x080eb5dd <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+289>:   mov    0x720(%eax),%eax
    0x080eb5e3 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+295>:   mov    %eax,%edx
    0x080eb5e5 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+297>:   mov    %edx,%eax
    0x080eb5e7 <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+299>:   shl    $0x2,%eax
    0x080eb5ea <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+302>:   add    %edx,%eax
    0x080eb5ec <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+304>:   shl    $0x2,%eax
    0x080eb5ef <_ZN4AMSC8CSeqMeas5inputEPtRK6RtTime+307>:   lea    (%ecx,%eax,1),%ebx

Hmmm.  I expected to see a system call at or near the signal receipt point.  To confirm, I coded up
a quick test case that called printf() from a realtime thread in primary mode, with T_WARNSW enabled.
Sure enough, the signalled frame from the simple test case looks as suspected:
Dump of assembler code for function __kernel_vsyscall:
    0xffffe400 <__kernel_vsyscall+0>:       push   %ecx
    0xffffe401 <__kernel_vsyscall+1>:       push   %edx
    0xffffe402 <__kernel_vsyscall+2>:       push   %ebp
    0xffffe403 <__kernel_vsyscall+3>:       mov    %esp,%ebp
    0xffffe405 <__kernel_vsyscall+5>:       sysenter 
    0xffffe407 <__kernel_vsyscall+7>:       nop    
    0xffffe408 <__kernel_vsyscall+8>:       nop    
    0xffffe409 <__kernel_vsyscall+9>:       nop    
    0xffffe40a <__kernel_vsyscall+10>:      nop    
    0xffffe40b <__kernel_vsyscall+11>:      nop    
    0xffffe40c <__kernel_vsyscall+12>:      nop    
    0xffffe40d <__kernel_vsyscall+13>:      nop    
    0xffffe40e <__kernel_vsyscall+14>:      jmp    0xffffe403 <__kernel_vsyscall+3>
    0xffffe410 <__kernel_vsyscall+16>:      pop    %ebp               <- frame pointer here
    0xffffe411 <__kernel_vsyscall+17>:      pop    %edx
    0xffffe412 <__kernel_vsyscall+18>:      pop    %ecx
    0xffffe413 <__kernel_vsyscall+19>:      ret    
    0xffffe414 <__kernel_vsyscall+20>:      nop    
    0xffffe415 <__kernel_vsyscall+21>:      nop    

So in the original multi-threaded realtime application above, I see no evidence of a system call
at the point the SIGXCPU was sent.

What am I missing?

Are there other events (not system calls) that initiate a switch to secondary mode?

Or, what is the best way to isolate the unwanted mode switch?

my config:
Xenomai 2.2.4
Linux 2.6.17.14

	thanks,
	Jeff


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-23 20:48 [Xenomai-help] isolating unwanted mode switch Jeff Weber
@ 2007-01-23 21:45 ` Dmitry Adamushko
  2007-01-23 22:27   ` Jeff Weber
  0 siblings, 1 reply; 18+ messages in thread
From: Dmitry Adamushko @ 2007-01-23 21:45 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai help

Hi,

>
> Are there other events (not system calls) that initiate a switch to secondary mode?

exceptions (CPU exceptions, e.g. page faults). I presume, you have
done mlockall(CURRENT | FUTURE), haven't you?

As a starting point, take a look at what /proc/xenomai/faults provides.

Moreover, enable "Nucleus debugging support" in your kernel config. If
it's an exception indeed, you would see a message like :

"Switching %s to secondary mode after exception #%u from "
                             "user-space at 0x%lx (pid %d)\n"

with proper parameters, of course.

-- 
Best regards,
Dmitry Adamushko

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-23 21:45 ` Dmitry Adamushko
@ 2007-01-23 22:27   ` Jeff Weber
  2007-01-24 10:08     ` Dmitry Adamushko
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff Weber @ 2007-01-23 22:27 UTC (permalink / raw)
  To: Dmitry Adamushko; +Cc: Xenomai help

Dmitry:

On Tuesday 23 January 2007 15:45, Dmitry Adamushko wrote:
> > Are there other events (not system calls) that initiate a switch to
> > secondary mode?
>
> exceptions (CPU exceptions, e.g. page faults). I presume, you have
> done mlockall(CURRENT | FUTURE), haven't you?
Yes.  My Linux application follows this model:

main () {
	 mlockall(MCL_CURRENT | MCL_FUTURE);
	rt_task_shadow( ... )  // become a realtime thread

	// for N threads:
		rt_task_spawn( ... ) // fork off child RT thread
}

It is one of the child RT threads that encounters the SIGXCPU unwanted mode 
switch.  Is this the correct way to call mlockall()?

>
> As a starting point, take a look at what /proc/xenomai/faults provides.
Aha!  All faults are 0 except for:
TRAP         CPU0
 ...
 14:          331    (Page fault)
...

So indeed I have an error in how I call mlockall().  (and I do error check the 
return value in main() ).
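
To watch that counter from a test program rather than by eye, the trap lines can be parsed. This is a sketch only: it assumes the line layout shown in the excerpt above ("<trap>: <count> (<name>)") with a single CPU column, and the function names parse_fault_line and read_fault_count are hypothetical, not part of Xenomai.

```cpp
#include <cassert>
#include <cstdio>

// Parses one line in the layout shown above, e.g.
//   " 14:          331    (Page fault)"
// Returns true and fills trap/count only for lines of the form "<number>: <number>".
// Header lines such as "TRAP         CPU0" and elisions such as " ..." are rejected.
bool parse_fault_line(const char *line, int &trap, unsigned long &count)
{
    int t = 0;
    unsigned long c = 0;
    if (std::sscanf(line, " %d: %lu", &t, &c) != 2)
        return false;
    trap = t;
    count = c;
    return true;
}

// Returns the counter for one trap number, or -1 if the file cannot be opened.
// Assumption: one CPU column, as in the excerpt above.
long read_fault_count(const char *path, int wanted_trap)
{
    std::FILE *f = std::fopen(path, "r");
    if (!f)
        return -1;

    char line[256];
    long total = 0;
    while (std::fgets(line, sizeof line, f)) {
        int trap = 0;
        unsigned long count = 0;
        if (parse_fault_line(line, trap, count) && trap == wanted_trap)
            total += static_cast<long>(count);
    }
    std::fclose(f);
    return total;
}
```

Calling read_fault_count("/proc/xenomai/faults", 14) before and after a test run, outside the real-time threads, shows whether the page-fault count is still growing.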

>
> Moreover, enable "Nucleus debugging support" in your kernel config. If
> it's an exception indeed, you would see a message like :
Will do.  I always stayed away from enabling XENO_OPT_DEBUG in the past 
because of the associated "scary" warning:
	"... Do not switch this option on unless you really know what you
        are doing."

While I was reconfiguring my kernel, I came across the related config:
XENO_OPT_DEBUG_QUEUES, which has even "scarier" warning:
	"... It adds even more runtime
        overhead then CONFIG_XENO_OPT_DEBUG, use with care."

Is the  XENO_OPT_DEBUG_QUEUES option safe and useful to enable too?

	thanks,
	Jeff


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-23 22:27   ` Jeff Weber
@ 2007-01-24 10:08     ` Dmitry Adamushko
  2007-01-24 11:22       ` [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails Daniel Schnell
  2007-01-24 17:52       ` [Xenomai-help] isolating unwanted mode switch Jeff Weber
  0 siblings, 2 replies; 18+ messages in thread
From: Dmitry Adamushko @ 2007-01-24 10:08 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai help

On 23/01/07, Jeff Weber <jweber@domain.hid> wrote:
> Dmitry:
> > ...
> > exceptions (CPU exceptions, e.g. page faults). I presume, you have
> > done mlockall(CURRENT | FUTURE), haven't you?
> Yes.  My Linux application follows this model:
>
> main () {
>          mlockall(MCL_CURRENT | MCL_FUTURE);
>         rt_task_shadow( ... )  // become a realtime thread
>
>         // for N threads:
>                 rt_task_spawn( ... ) // fork off child RT thread
> }
>
> It is one of the child RT threads that encounters the SIGXCPU unwanted mode
> switch.  Is this the correct way to call mlockall()?

Looks ok. Just to avoid getting into the same trap twice. You don't
use system(), fork() or alike beasts in your code, do you?


> > Moreover, enable "Nucleus debugging support" in your kernel config. If
> > it's an exception indeed, you would see a message like :
> Will do.  I always stayed away from enabling XENO_OPT_DEBUG in the past
> because of the associated "scary" warning:
>         "... Do not switch this option on unless you really know what you
>         are doing."
> While I was reconfiguring my kernel, I came across the related config:
> XENO_OPT_DEBUG_QUEUES, which has even "scarier" warning:
>         "... It adds even more runtime
>         overhead then CONFIG_XENO_OPT_DEBUG, use with care."
>
> Is the  XENO_OPT_DEBUG_QUEUES option safe and useful to enable too?

None of them will blow your boards up (or at least, they were not
designed to act this way :) They are helpful in debugging of some -
primarily internal - issues but may provide some useful info in
general.

DEBUG_QUEUES enables additional debugging messages for various queue
operations, that's not something you need now.

On the other hand, the "switch to secondary mode..." message that is
enabled by XENO_OPT_DEBUG may give you an address where the exception
has been taken.


>
>         thanks,
>         Jeff
>

-- 
Best regards,
Dmitry Adamushko


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-24 10:08     ` Dmitry Adamushko
@ 2007-01-24 11:22       ` Daniel Schnell
  2007-01-24 14:44         ` Wolfgang Grandegger
  2007-01-24 17:52       ` [Xenomai-help] isolating unwanted mode switch Jeff Weber
  1 sibling, 1 reply; 18+ messages in thread
From: Daniel Schnell @ 2007-01-24 11:22 UTC (permalink / raw)
  To: Xenomai help

Hi,

I want to patch the latest Denx 2.4.25 kernel with Xenomai-2.3.0

The prepare-kernel.sh script fails with the following message:

[root@domain.hid xenomai-2.3.0]# scripts/prepare-kernel.sh --arch=ppc
--adeos=ksrc/arch/powerpc/patches/adeos-ipipe-2.4.25-ppc-CVS-20060707-1.
2-01.patch --linux=../linuxppc_2_4_devel-git-xenomai-2.3.0
patching file arch/ppc/config.in
Hunk #1 succeeded at 858 (offset 40 lines).
patching file arch/ppc/kernel/Makefile
patching file arch/ppc/kernel/entry.S
patching file arch/ppc/kernel/head.S
patching file arch/ppc/kernel/head_440.S
patching file arch/ppc/kernel/head_44x.S
patching file arch/ppc/kernel/head_4xx.S
patching file arch/ppc/kernel/head_8xx.S
patching file arch/ppc/kernel/head_e500.S
patching file arch/ppc/kernel/idle.c
patching file arch/ppc/kernel/ipipe-core.c
patching file arch/ppc/kernel/ipipe-root.c
patching file arch/ppc/kernel/irq.c
patching file arch/ppc/kernel/ppc_ksyms.c
patching file arch/ppc/kernel/traps.c
patching file arch/ppc/mm/fault.c
patching file include/asm-ppc/hw_irq.h
patching file include/asm-ppc/ipipe.h
patching file include/asm-ppc/mmu_context.h
patching file include/asm-ppc/system.h
patching file include/linux/ipipe.h
patching file include/linux/sched.h
Hunk #3 succeeded at 424 with fuzz 1.
patching file init/main.c
Hunk #1 succeeded at 412 (offset 3 lines).
patching file kernel/Makefile
patching file kernel/exit.c
patching file kernel/fork.c
patching file kernel/ipipe/Makefile
patching file kernel/ipipe/core.c
patching file kernel/ipipe/generic.c
patching file kernel/printk.c
patching file kernel/sched.c
patching file kernel/signal.c
?


Then the script unexpectedly exits without an error message (just a "?").


Any hints appreciated.



Best regards,

Daniel Schnell.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-24 11:22       ` [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails Daniel Schnell
@ 2007-01-24 14:44         ` Wolfgang Grandegger
  2007-01-24 17:04           ` Daniel Schnell
  0 siblings, 1 reply; 18+ messages in thread
From: Wolfgang Grandegger @ 2007-01-24 14:44 UTC (permalink / raw)
  To: Daniel Schnell; +Cc: Xenomai help

Hi Daniel,

Daniel Schnell wrote:
> Hi,
> 
> I want to patch the latest Denx 2.4.25 kernel with Xenomai-2.3.0
> 
> The prepare-kernel.sh script fails with the following message:
> 
> [root@domain.hid xenomai-2.3.0]# scripts/prepare-kernel.sh --arch=ppc
> --adeos=ksrc/arch/powerpc/patches/adeos-ipipe-2.4.25-ppc-CVS-20060707-1.
> 2-01.patch --linux=../linuxppc_2_4_devel-git-xenomai-2.3.0
> patching file arch/ppc/config.in
> Hunk #1 succeeded at 858 (offset 40 lines).
> patching file arch/ppc/kernel/Makefile
> patching file arch/ppc/kernel/entry.S
> patching file arch/ppc/kernel/head.S
> patching file arch/ppc/kernel/head_440.S
> patching file arch/ppc/kernel/head_44x.S
> patching file arch/ppc/kernel/head_4xx.S
> patching file arch/ppc/kernel/head_8xx.S
> patching file arch/ppc/kernel/head_e500.S
> patching file arch/ppc/kernel/idle.c
> patching file arch/ppc/kernel/ipipe-core.c
> patching file arch/ppc/kernel/ipipe-root.c
> patching file arch/ppc/kernel/irq.c
> patching file arch/ppc/kernel/ppc_ksyms.c
> patching file arch/ppc/kernel/traps.c
> patching file arch/ppc/mm/fault.c
> patching file include/asm-ppc/hw_irq.h
> patching file include/asm-ppc/ipipe.h
> patching file include/asm-ppc/mmu_context.h
> patching file include/asm-ppc/system.h
> patching file include/linux/ipipe.h
> patching file include/linux/sched.h
> Hunk #3 succeeded at 424 with fuzz 1.
> patching file init/main.c
> Hunk #1 succeeded at 412 (offset 3 lines).
> patching file kernel/Makefile
> patching file kernel/exit.c
> patching file kernel/fork.c
> patching file kernel/ipipe/Makefile
> patching file kernel/ipipe/core.c
> patching file kernel/ipipe/generic.c
> patching file kernel/printk.c
> patching file kernel/sched.c
> patching file kernel/signal.c
> ?
> 
> 
> Then the script unexpectedly exits without an error message (just a "?").
> 
> 
> Any hints appreciated.

I am unable to reproduce your problem with a fresh linuxppc_2_4_devel tree 
and xenomai-2.3.0. Could you please use the "--verbose" option with 
prepare-kernel.sh and post the result?

Wolfgang.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-24 14:44         ` Wolfgang Grandegger
@ 2007-01-24 17:04           ` Daniel Schnell
  2007-01-24 21:33             ` Wolfgang Grandegger
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Schnell @ 2007-01-24 17:04 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: Xenomai help

Wolfgang Grandegger wrote:

> I am unable to reproduce your problem with a fresh linuxppc_2_4_devel
> tree and xenomai-2.3.0. Could you please use the "--verbose" option
> with prepare-kernel.sh and post the result?

Hmm, I used

make mrproper; git pull

to get the latest deltas from the repository. But maybe there is a saner
way to do it ? I am pulling a fresh tree at the moment, but this will
take quite a while according to the progress bar.

The output from the script is

Preparing kernel 2.4.25 in
/home/daniel/projects/mach4/linuxppc_2_4_devel-git-xenomai-2.3.0...
Adeos found - bypassing patch.
Adeos/ppc 1.2-01 installed.

And a "?" is printed on stderr.

In the meantime I got the fresh Linux kernel tree; the same happens here.

I pulled it via

git clone git://www.denx.de/git/linuxppc_2_4_devel.git
linuxppc_2_4_devel-git-xenomai-2.3.0

Best regards,

Daniel.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-24 10:08     ` Dmitry Adamushko
  2007-01-24 11:22       ` [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails Daniel Schnell
@ 2007-01-24 17:52       ` Jeff Weber
  2007-01-24 18:55         ` Eric Noulard
  2007-01-25 10:50         ` Philippe Gerum
  1 sibling, 2 replies; 18+ messages in thread
From: Jeff Weber @ 2007-01-24 17:52 UTC (permalink / raw)
  To: Dmitry Adamushko; +Cc: Xenomai help

Dmitry:

On Wednesday 24 January 2007 04:08, Dmitry Adamushko wrote:
> On 23/01/07, Jeff Weber <jweber@domain.hid> wrote:
> > Dmitry:
> > > ...
> > > exceptions (CPU exceptions, e.g. page faults). I presume, you have
> > > done mlockall(CURRENT | FUTURE), haven't you?
> >
> > Yes.  My Linux application follows this model:
> >
> > main () {
> >          mlockall(MCL_CURRENT | MCL_FUTURE);
> >         rt_task_shadow( ... )  // become a realtime thread
> >
> >         // for N threads:
> >                 rt_task_spawn( ... ) // fork off child RT thread
> > }
> >
> > It is one of the child RT threads that encounters the SIGXCPU unwanted
> > mode switch.  Is this the correct way to call mlockall()?
>
> Looks ok. Just to avoid getting into the same trap twice. You don't
> use system(), fork() or alike beasts in your code, do you?
No.

BTW, a primary mode thread in my application encountered the page fault 
writing to the BSS segment.

How would my application encounter a page fault after calling
mlockall(CURRENT | FUTURE) ?

Perhaps an answer is that mlockall() prevents pages from ever being paged out, 
but the application may still encounter page faults upon referring to pages 
that were never paged in when mlockall() was called.  If true, then my 
application may continue to encounter page faults, and hence switches to 
secondary mode until all pages in the application are loaded.

Could you, or any Linux+Xenomai memory paging expert comment on how/when pages 
for an application are loaded?

Is it possible to load all the text, bss and "initial" (initial data segment 
from ELF load image + thread stacks) data pages for an application before 
calling mlockall() ?
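
One common technique for this is to touch the memory explicitly before the time-critical code starts, so that any first-access page fault is taken early. The sketch below shows that technique in C++ under stated assumptions: the names prefault_region, prefault_stack and kStackPrefaultBytes are hypothetical, and the 64 KiB stack depth is an arbitrary example value. Whether this is needed at all once mlockall() has succeeded is exactly the open question here, so this illustrates the approach and says nothing definitive about Xenomai's behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>

// How much stack to touch per thread; an arbitrary example value that
// must stay below the thread's actual stack size.
static const std::size_t kStackPrefaultBytes = 64 * 1024;

// Reads and writes back one byte in every page of [base, base + len).
// The write forces the kernel to give the process a private, writable
// page; the contents are left unchanged.
void prefault_region(void *base, std::size_t len)
{
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    volatile unsigned char *p = static_cast<volatile unsigned char *>(base);

    for (std::size_t off = 0; off < len; off += page)
        p[off] = p[off];
    if (len > 0)
        p[len - 1] = p[len - 1];   // make sure the last page is covered too
}

// Touches kStackPrefaultBytes of the calling thread's stack.
// Call it once at the top of each thread function, before the RT loop.
void prefault_stack()
{
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    volatile unsigned char dummy[kStackPrefaultBytes];

    for (std::size_t off = 0; off < sizeof dummy; off += page)
        dummy[off] = 0;
    dummy[sizeof dummy - 1] = 0;
}
```

A large BSS object would be handled with prefault_region(&object, sizeof object) from main(), after mlockall() and before the real-time threads are spawned.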

	thanks,
	Jeff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-24 17:52       ` [Xenomai-help] isolating unwanted mode switch Jeff Weber
@ 2007-01-24 18:55         ` Eric Noulard
  2007-01-25  7:30           ` M. Koehrer
  2007-01-25 10:50         ` Philippe Gerum
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Noulard @ 2007-01-24 18:55 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai help

2007/1/24, Jeff Weber <jweber@domain.hid>:
> Dmitry:
>
> On Wednesday 24 January 2007 04:08, Dmitry Adamushko wrote:
> > On 23/01/07, Jeff Weber <jweber@domain.hid> wrote:
> > > Dmitry:
> > > > ...
> > > > exceptions (CPU exceptions, e.g. page faults). I presume, you have
> > > > done mlockall(CURRENT | FUTURE), haven't you?
> > >
> > > Yes.  My Linux application follows this model:
> > >
> > > main () {
> > >          mlockall(MCL_CURRENT | MCL_FUTURE);
> > >         rt_task_shadow( ... )  // become a realtime thread
> > >
> > >         // for N threads:
> > >                 rt_task_spawn( ... ) // fork off child RT thread
> > > }
> > >
> > > It is one of the child RT threads that encounters the SIGXCPU unwanted
> > > mode switch.  Is this the correct way to call mlockall()?
> >
> > Looks ok. Just to avoid getting into the same trap twice. You don't
> > use system(), fork() or alike beasts in your code, do you?
> No.
>
> BTW, a primary mode thread in my application encountered the page fault
> writing to the BSS segment.

Are your variables all properly initialized, before mlockall ?

>
> How would my application encounter a page fault after calling
> mlockall(CURRENT | FUTURE) ?
>
> Perhaps an answer is that mlockall() prevents pages from ever being paged out,
> but the application may still encounter page faults upon referring to pages
> that were never paged in when mlockall() was called.  If true, then my
> application may continue to encounter page faults, and hence switches to
> secondary mode until all pages in the application are loaded.

I am neither a memory-handling expert nor a Xenomai expert,
but I think you may get a page fault when you require more memory
than you had used before you called mlockall(CURRENT | FUTURE),
for example through a growing stack, mmap calls, malloc, etc.

Does your application have some unbounded or not pre-computed memory
requirement (recursive calls, malloc/calloc/alloca calls, mmap calls, etc.)?

I can imagine the case where your system is short of memory:
your app then requires more memory, and you get a page fault.

As a simple check, did you verify that your mlockall syscall succeeded?
I remember an application of mine which did not check the return
code, and I discovered later that the call had not succeeded :(((
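
The return-value check can be as small as the sketch below. It uses only the POSIX mlockall() call; the wrapper name lock_all_memory is made up for illustration. Per the Linux man page, the call typically fails with ENOMEM or EPERM when the process lacks the privilege or RLIMIT_MEMLOCK is too low, which is easy to miss if the result is ignored.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <sys/mman.h>

// Locks all current and future pages of the process into RAM.
// Returns 0 on success, or the errno value reported by mlockall().
int lock_all_memory()
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        const int err = errno;   // save it before stdio can overwrite it
        std::fprintf(stderr, "mlockall failed: %s\n", std::strerror(err));
        return err;
    }
    return 0;
}
```

In main() this would be called before rt_task_shadow(), and the program should refuse to continue if the result is non-zero.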


-- 
Erk


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-24 17:04           ` Daniel Schnell
@ 2007-01-24 21:33             ` Wolfgang Grandegger
  2007-01-25  7:14               ` Wolfgang Grandegger
  0 siblings, 1 reply; 18+ messages in thread
From: Wolfgang Grandegger @ 2007-01-24 21:33 UTC (permalink / raw)
  To: Daniel Schnell; +Cc: Xenomai help

[-- Attachment #1: Type: text/plain, Size: 1067 bytes --]

Daniel Schnell wrote:
> Wolfgang Grandegger wrote:
> 
>> I am unable to reproduce your problem with a fresh linuxppc_2_4_devel
>> tree and xenomai-2.3.0. Could you please use the "--verbose" option
>> with prepare-kernel.sh and post the result?
> 
> 
> Hmm, I used
> 
> make mrproper; git pull
> 
> to get the latest deltas from the repository. But maybe there is a saner
> way to do it ? I am pulling a fresh tree at the moment, but this will
> take quite a while according to the progress bar.
> 
> The output from the script is
> 
> Preparing kernel 2.4.25 in
> /home/daniel/projects/mach4/linuxppc_2_4_devel-git-xenomai-2.3.0...
> Adeos found - bypassing patch.
> Adeos/ppc 1.2-01 installed.
> 
> And a "?" is printed on stderr.
> 
> 
> In the meantime I got the fresh Linux kernel tree; the same happens here.
> 
> I pulled it via
> 
> git clone git://www.denx.de/git/linuxppc_2_4_devel.git
> linuxppc_2_4_devel-git-xenomai-2.3.0

The problem is with the attached code snippet. It works under FC4 but 
not under FC6, as I just realized. Still puzzled...

Wolfgang.


[-- Attachment #2: test_ed --]
[-- Type: text/plain, Size: 85 bytes --]

#!/bin/bash
ed -s Makefile <<EOF
/DRIVERS := \$(DRIVERS-y)
^r Modules.frag

.
wq
EOF

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-24 21:33             ` Wolfgang Grandegger
@ 2007-01-25  7:14               ` Wolfgang Grandegger
  2007-01-25  9:03                 ` Daniel Schnell
  2007-01-29 15:36                 ` Philippe Gerum
  0 siblings, 2 replies; 18+ messages in thread
From: Wolfgang Grandegger @ 2007-01-25  7:14 UTC (permalink / raw)
  Cc: Xenomai help

[-- Attachment #1: Type: text/plain, Size: 1589 bytes --]

Wolfgang Grandegger wrote:
> Daniel Schnell wrote:
>> Wolfgang Grandegger wrote:
>>
>>> I am unable to reproduce your problem with a fresh linuxppc_2_4_devel
>>> tree and xenomai-2.3.0. Could you please use the "--verbose" option
>>> with prepare-kernel.sh and post the result?
>>
>>
>> Hmm, I used
>>
>> make mrproper; git pull
>>
>> to get the latest deltas from the repository. But maybe there is a saner
>> way to do it ? I am pulling a fresh tree at the moment, but this will
>> take quite a while according to the progress bar.
>>
>> The output from the script is
>>
>> Preparing kernel 2.4.25 in
>> /home/daniel/projects/mach4/linuxppc_2_4_devel-git-xenomai-2.3.0...
>> Adeos found - bypassing patch.
>> Adeos/ppc 1.2-01 installed.
>>
>> And a "?" is printed on stderr.
>>
>>
>> In the meantime I got the fresh Linux kernel tree; the same happens here.
>>
>> I pulled it via
>>
>> git clone git://www.denx.de/git/linuxppc_2_4_devel.git
>> linuxppc_2_4_devel-git-xenomai-2.3.0
> 
> The problem is with the attached code snippet. It works under FC4 but 
> not under FC6, as I just realized. Still puzzled...
> 
> Wolfgang.
> 
> 
> ------------------------------------------------------------------------
> 
> #!/bin/bash
> ed -s Makefile <<EOF
> /DRIVERS := \$(DRIVERS-y)
> ^r Modules.frag
> 
> .
> wq
> EOF
> 
> 
> ------------------------------------------------------------------------

The attached patch fixes the problem by replacing "^" with "-1" for 
"previous line". It works now with FC4 and FC6. Likely it's an 
incompatibility issue between ed-0.2-3 and ed-0.3-0.fc6.

Wolfgang.


[-- Attachment #2: xenomai-ed-fix.patch --]
[-- Type: text/x-patch, Size: 474 bytes --]

+ diff -u xenomai-2.3.0/scripts/prepare-kernel.sh.ED xenomai-2.3.0/scripts/prepare-kernel.sh
--- xenomai-2.3.0/scripts/prepare-kernel.sh.ED	2007-01-24 21:45:24.000000000 +0100
+++ xenomai-2.3.0/scripts/prepare-kernel.sh	2007-01-24 23:13:46.000000000 +0100
@@ -443,7 +443,7 @@
     if ! grep -q CONFIG_XENO $linux_tree/Makefile; then
 	patch_ed Makefile <<EOF
 /DRIVERS := \$(DRIVERS-y)
-^r $xenomai_root/scripts/Modules.frag
+-1r $xenomai_root/scripts/Modules.frag
 
 .
 wq

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-24 18:55         ` Eric Noulard
@ 2007-01-25  7:30           ` M. Koehrer
  2007-01-25  8:24             ` Eric Noulard
  0 siblings, 1 reply; 18+ messages in thread
From: M. Koehrer @ 2007-01-25  7:30 UTC (permalink / raw)
  To: eric.noulard, jweber; +Cc: xenomai

Hi all,

one other issue that comes to my mind:
The usage of C++ in a real-time application can be fairly difficult, as 
C++ might create temporary objects (using new(), which will probably 
be implemented with malloc()). It is hard to identify what's happening with
the memory without an excellent understanding of the compiler.
That's why I try to avoid C++ in real-time applications and use plain C...
I do not know Jeff's application; however, I think it could be worth
having a look at that.

Regards

Mathias
> > On Wednesday 24 January 2007 04:08, Dmitry Adamushko wrote:
> > > On 23/01/07, Jeff Weber <jweber@domain.hid> wrote:
> > > > Dmitry:
> > > > > ...
> > > > > exceptions (CPU exceptions, e.g. page faults). I presume, you have
> > > > > done mlockall(CURRENT | FUTURE), haven't you?
> > > >
> > > > Yes.  My Linux application follows this model:
> > > >
> > > > main () {
> > > >          mlockall(MCL_CURRENT | MCL_FUTURE);
> > > >         rt_task_shadow( ... )  // become a realtime thread
> > > >
> > > >         // for N threads:
> > > >                 rt_task_spawn( ... ) // fork off child RT thread
> > > > }
> > > >
> > > > It is one of the child RT threads that encounters the SIGXCPU unwanted
> > > > mode switch.  Is this the correct way to call mlockall()?
> > >
> > > Looks ok. Just to avoid getting into the same trap twice. You don't
> > > use system(), fork() or alike beasts in your code, do you?
> > No.
> >
> > BTW, a primary mode thread in my application encountered the page fault
> > writing to the BSS segment.
> 
> Are your variables all properly initialized, before mlockall ?
> 
> >
> > How would my application encounter a page fault after calling
> > mlockall(CURRENT | FUTURE) ?
> >
> > Perhaps an answer is that mlockall() prevents pages from ever being paged out,
> > but the application may still encounter page faults upon referring to pages
> > that were never paged in when mlockall() was called.  If true, then my
> > application may continue to encounter page faults, and hence switches to
> > secondary mode until all pages in the application are loaded.
> 
> I am neither a memory-handling expert nor a Xenomai expert,
> but I think you may get a page fault when you require more memory
> than you had used before you called mlockall(CURRENT | FUTURE),
> for example through a growing stack, mmap calls, malloc, etc.
> 
> Does your application have some unbounded or not pre-computed memory
> requirement (recursive calls, malloc/calloc/alloca calls, mmap calls, etc.)?
> 
> I can imagine the case where your system is short of memory:
> your app then requires more memory, and you get a page fault.
> 
> As a simple check, did you verify that your mlockall syscall succeeded?
> I remember an application of mine which did not check the return
> code, and I discovered later that the call had not succeeded :(((


-- 
Mathias Koehrer
mathias_koehrer@domain.hid




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-25  7:30           ` M. Koehrer
@ 2007-01-25  8:24             ` Eric Noulard
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Noulard @ 2007-01-25  8:24 UTC (permalink / raw)
  To: M. Koehrer; +Cc: xenomai

2007/1/25, M. Koehrer <mathias_koehrer@domain.hid>:
> Hi all,
>
> one other issue that comes to my mind:
> The usage of C++ in a real-time application can be fairly difficult, as
> C++ might create temporary objects (using new(), which will probably
> be implemented with malloc()). It is hard to identify what's happening with
> the memory without an excellent understanding of the compiler.

I think you could try replacing (or overloading for your class) operator new/delete
(see http://www.cantrip.org/wave12.html) in order to catch such problems.
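
A minimal sketch of that idea in C++, assuming the goal is only to detect heap allocations made while a thread believes it is in its time-critical section. The names t_in_rt_section, g_rt_allocs and RtSectionGuard are hypothetical; the replaced functions are the standard global operator new and operator delete, which the array forms call by default.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// True while the current thread is inside its time-critical section.
static thread_local bool t_in_rt_section = false;

// Number of heap allocations observed inside time-critical sections.
static std::atomic<unsigned long> g_rt_allocs{0};

// Replacement for the global allocation function: same behaviour as the
// default one, plus a counter for allocations made in an RT section.
void *operator new(std::size_t size)
{
    if (t_in_rt_section)
        g_rt_allocs.fetch_add(1, std::memory_order_relaxed);
    if (void *p = std::malloc(size != 0 ? size : 1))
        return p;
    throw std::bad_alloc();
}

void operator delete(void *p) noexcept { std::free(p); }
void operator delete(void *p, std::size_t) noexcept { std::free(p); }

// RAII helper that marks the enclosing scope as time-critical.
struct RtSectionGuard {
    RtSectionGuard() { t_in_rt_section = true; }
    ~RtSectionGuard() { t_in_rt_section = false; }
};
```

Placing an RtSectionGuard at the top of the real-time loop and checking g_rt_allocs after a run shows whether hidden temporaries reached the heap. A stricter variant could call abort() inside operator new so that the debugger stops at the offending allocation.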

> That's why I try to avoid C++ in real-time applications and use plain C...
> I do not know Jeff's application; however, I think it could be worth
> having a look at that.

But this choice is easier and I would go for it too :))

-- 
Erk


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-25  7:14               ` Wolfgang Grandegger
@ 2007-01-25  9:03                 ` Daniel Schnell
  2007-01-25  9:38                   ` Wolfgang Grandegger
  2007-01-29 15:36                 ` Philippe Gerum
  1 sibling, 1 reply; 18+ messages in thread
From: Daniel Schnell @ 2007-01-25  9:03 UTC (permalink / raw)
  To: Xenomai help

Wolfgang Grandegger wrote:
> 
> The attached patch fixes the problem by replacing "^" with "-1" for
> "previous line". It works now with FC4 and FC6. Likely it's an
> incompatibility issue between ed-0.2-3 and ed-0.3-0.fc6.
> 
> Wolfgang.

Yes this fixes the problem for me. I can successfully patch the kernel
now.

Thanks for the quick fix.


Best regards,

Daniel Schnell.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-25  9:03                 ` Daniel Schnell
@ 2007-01-25  9:38                   ` Wolfgang Grandegger
  0 siblings, 0 replies; 18+ messages in thread
From: Wolfgang Grandegger @ 2007-01-25  9:38 UTC (permalink / raw)
  To: Daniel Schnell; +Cc: Xenomai help

Daniel Schnell wrote:
> Wolfgang Grandegger wrote:
>> The attached patch fixes the problem by replacing "^" with "-1" for
>> "previous line". It works now with FC4 and FC6. Likely it's an
>> incompatibility issue between ed-0.2-3 and ed-0.3-0.fc6.
>>
>> Wolfgang.
> 
> Yes this fixes the problem for me. I can successfully patch the kernel
> now.
> 
> Thanks for the quick fix.

The change is also documented:

  FC4: man ed
  -
  ^       The  previous  line.   This  is  equivalent  to  -1  and may be
          repeated with cumulative effect.

  FC6: man ed
  -       The  previous  line.   This  is  equivalent  to  -1  and may be
          repeated with cumulative effect.

The "^" disappeared obviously with ed v0.3.0 :-(.

Wolfgang.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-24 17:52       ` [Xenomai-help] isolating unwanted mode switch Jeff Weber
  2007-01-24 18:55         ` Eric Noulard
@ 2007-01-25 10:50         ` Philippe Gerum
  2007-01-25 17:04           ` Jeff Weber
  1 sibling, 1 reply; 18+ messages in thread
From: Philippe Gerum @ 2007-01-25 10:50 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai help

On Wed, 2007-01-24 at 11:52 -0600, Jeff Weber wrote:
> Dmitry:
> 
> On Wednesday 24 January 2007 04:08, Dmitry Adamushko wrote:
> > On 23/01/07, Jeff Weber <jweber@domain.hid> wrote:
> > > Dmitry:
> > > > ...
> > > > exceptions (CPU exceptions, e.g. page faults). I presume, you have
> > > > done mlockall(MCL_CURRENT | MCL_FUTURE), haven't you?
> > >
> > > Yes.  My Linux application follows this model:
> > >
> > > main () {
> > >         mlockall(MCL_CURRENT | MCL_FUTURE);
> > >         rt_task_shadow( ... )  // become a realtime thread
> > >
> > >         // for N threads:
> > >                 rt_task_spawn( ... ) // fork off child RT thread
> > > }
> > >
> > > It is one of the child RT threads that encounters the SIGXCPU unwanted
> > > mode switch.  Is this the correct way to call mlockall()?
> >
> > Looks ok. Just to avoid getting into the same trap twice: you don't
> > use system(), fork() or similar beasts in your code, do you?
> No.
> 
> BTW, a primary mode thread in my application encountered the page fault
> while writing to the BSS segment.
> 
> How would my application encounter a page fault after calling
> mlockall(MCL_CURRENT | MCL_FUTURE)?
> 

Because Linux initially maps .bss sections to the zero page, which is
copy-on-write (COW). The first write to any of the data it contains
raises a page fault. mlockall() does try to make all mapped pages
present, but does not break COW. That's the problem we are about to
solve in the next I-pipe patch series.

Spawning a RT thread involves cloning, which is also a source of future
on-demand mappings involving COW pages, and as such, would trigger
subsequent page faults. We usually don't notice them because the COWish
memory is written and committed while the application is still
initializing.
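
One way to commit such memory while the application is still
initializing is to write to every page of a static object before the
time-critical phase starts. This is only a sketch under that assumption;
prefault_in_place() is a hypothetical helper, not part of Xenomai or
the I-pipe:

```cpp
#include <cstddef>
#include <unistd.h>

// Writes one byte back to each page of the writable range
// [addr, addr + len), leaving its contents unchanged. The volatile
// accesses keep the compiler from dropping the seemingly redundant stores.
void prefault_in_place(void* addr, std::size_t len) {
    if (addr == nullptr || len == 0) return;
    const long ps = sysconf(_SC_PAGESIZE);
    const std::size_t page = ps > 0 ? static_cast<std::size_t>(ps) : 4096;
    volatile unsigned char* p = static_cast<volatile unsigned char*>(addr);
    for (std::size_t off = 0; off < len; off += page) {
        p[off] = p[off];  // read, then write: the write is what breaks COW
    }
    // The range need not be page-aligned, so its last byte may sit on a
    // page that the stride above did not reach.
    p[len - 1] = p[len - 1];
}
```

For a zero-initialized static array `buf`, calling
prefault_in_place(buf, sizeof buf) once after mlockall() and before
entering the run phase should move the first-write fault into
initialization.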

> Perhaps an answer is that mlockall() prevents pages from ever being paged out, 
> but the application may still encounter page faults upon referring to pages 
> that were never paged in when mlockall() was called.  If true, then my 
> application may continue to encounter page faults, and hence switches to 
> secondary mode until all pages in the application are loaded.
> 
> Could you, or any Linux+Xenomai memory paging expert comment on how/when pages 
> for an application are loaded?
> Is it possible to load all the text, bss and "initial" (initial data segment 
> from ELF load image + thread stacks) data pages for an application before 
> calling mlockall() ?

mlockall() will take care of the load image and stack (even if only
initially reserved by the glibc, mlockall() does commit the stack space;
that's why the default 8 MB stack size picked by the glibc is somewhat
exuberant when your app happens to create a number of threads under
mlockall conditions). Shared libraries are also COWish things, since
there is some fixup to do during the first references.
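
For plain POSIX threads, the committed stack can be kept small by
requesting an explicit size in the thread attributes; the native skin's
rt_task_create() and rt_task_spawn() take a stack size argument for the
same purpose. A sketch using only the POSIX calls, where
init_attr_with_stack() is a hypothetical helper name:

```cpp
#include <cstddef>
#include <pthread.h>

// Initializes *attr and requests a stack of stack_bytes.
// Returns 0 on success or an errno-style code; on failure *attr is
// left destroyed and must not be used.
int init_attr_with_stack(pthread_attr_t* attr, std::size_t stack_bytes) {
    int rc = pthread_attr_init(attr);
    if (rc != 0) return rc;
    rc = pthread_attr_setstacksize(attr, stack_bytes);
    if (rc != 0) pthread_attr_destroy(attr);
    return rc;
}
```

The initialized attribute object is then passed to pthread_create() and
released with pthread_attr_destroy(). The size to request depends on the
deepest call chain of the thread, so any concrete value is an
application-specific choice.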

Gilles is working on a patch that solves a number of on-demand mapping
issues initially observed on ARM and ppc, but generic enough to affect
other archs; I'll merge it into the I-pipe 1.7 series for x86 in the near
future.

> 	thanks,
> 	Jeff
> 
-- 
Philippe.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] isolating unwanted mode switch
  2007-01-25 10:50         ` Philippe Gerum
@ 2007-01-25 17:04           ` Jeff Weber
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff Weber @ 2007-01-25 17:04 UTC (permalink / raw)
  To: rpm; +Cc: Xenomai help

A correction to my previous post on this problem: the memory in question was
previously allocated heap that was apparently not yet paged in.  The pointer
to the memory was in bss, so at the time I assumed the faulting memory
was in bss as well.  I have a "solution" which is working for now: write
(zero out) dynamically allocated memory before first use.  (The dynamic
allocation happens after the mlockall().)
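
That workaround could look roughly like the following. This is a sketch,
not the code from the application; alloc_prefaulted() is a hypothetical
name. The stores go through a volatile pointer because an optimizing
compiler may otherwise merge malloc() followed by memset() into a
calloc() call, which can hand back untouched pages:

```cpp
#include <cstddef>
#include <cstdlib>

// Allocates n bytes and writes a zero to every one of them, so each page
// backing the block has been written once before first real use.
// Returns nullptr if the allocation fails; release with std::free().
void* alloc_prefaulted(std::size_t n) {
    void* mem = std::malloc(n);
    if (mem == nullptr) return nullptr;
    volatile unsigned char* p = static_cast<volatile unsigned char*>(mem);
    for (std::size_t i = 0; i < n; ++i) {
        p[i] = 0;  // volatile store: cannot be elided or deferred
    }
    return mem;
}
```

Calling this during the initialization phase, after mlockall(), means
the page faults for the block are taken there.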

My application is a legacy C++ program with an initialization phase,
where all RT tasks are created and all memory is allocated, and a run phase,
where the task T_WARNSW mode is enabled to report unwanted mode switches.
I am in the process of porting this application from Linux-2.4.27+rtai-3.1.0
to Linux-2.6.17.14+Xenomai-2.2.4.  (As a side note, it has now become
interesting to consider whether and how page faults were affecting the
Linux-2.4+RTAI version of this application.)
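
For the development builds, a SIGXCPU handler that dumps a backtrace
avoids having to load a core file for each report. The sketch below uses
only POSIX sigaction() and the glibc-specific backtrace() functions;
install_sigxcpu_handler() and mode_switch_count() are hypothetical
names, and enabling T_WARNSW itself (via the native skin's
rt_task_set_mode()) is left out:

```cpp
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

// Number of SIGXCPU deliveries seen so far.
static volatile sig_atomic_t g_mode_switches = 0;

extern "C" void on_sigxcpu(int) {
    void* frames[32];
    g_mode_switches = g_mode_switches + 1;
    // Dump the call chain of the thread that received the signal.
    const int n = backtrace(frames, 32);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
}

// Installs the handler; returns true on success.
bool install_sigxcpu_handler() {
    struct sigaction sa = {};
    sa.sa_handler = on_sigxcpu;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGXCPU, &sa, nullptr) == 0;
}

int mode_switch_count() { return static_cast<int>(g_mode_switches); }
```

backtrace() may load libgcc and allocate memory on its first call, so
calling it once during initialization is a common precaution.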

From the discussions in this thread, there will always be a non-zero
probability that my application will still encounter future page faults. So
I've decided to enable T_WARNSW for the application while under development,
but T_WARNSW will not be enabled for deployed applications.  Instead, I may
periodically parse /proc/xenomai/faults and /proc/xenomai/stat.
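
A simple form of that monitoring is to compare successive snapshots of
the file from a non-realtime thread. The sketch below makes no
assumption about the layout of /proc/xenomai/faults and only detects
that its text changed; read_whole_file() and snapshot_changed() are
hypothetical helpers:

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <utility>

// Returns the full contents of path, or an empty string if it cannot
// be opened.
std::string read_whole_file(const std::string& path) {
    std::ifstream in(path);
    std::ostringstream out;
    if (in) out << in.rdbuf();
    return out.str();
}

// Compares the file with the previous snapshot held in `last`.
// On a difference, stores the new contents in `last` and returns true.
bool snapshot_changed(const std::string& path, std::string& last) {
    std::string now = read_whole_file(path);
    if (now == last) return false;
    last = std::move(now);
    return true;
}
```

A monitoring loop would call snapshot_changed("/proc/xenomai/faults",
last) at a low rate and log the new contents when it returns true. A
whole-file comparison is less useful for /proc/xenomai/stat, where
counters are expected to change between reads.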

	Jeff

On Thursday 25 January 2007 04:50, Philippe Gerum wrote:

> Gilles is working on a patch that solves a number of on-demand mapping
> issues initially observed on ARM and ppc, but generic enough to affect
> other archs; I'll merge it into the I-pipe 1.7 series for x86 in the near
> future.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails
  2007-01-25  7:14               ` Wolfgang Grandegger
  2007-01-25  9:03                 ` Daniel Schnell
@ 2007-01-29 15:36                 ` Philippe Gerum
  1 sibling, 0 replies; 18+ messages in thread
From: Philippe Gerum @ 2007-01-29 15:36 UTC (permalink / raw)
  To: Wolfgang Grandegger; +Cc: Xenomai help

On Thu, 2007-01-25 at 08:14 +0100, Wolfgang Grandegger wrote:

[...]

> The attached patch fixes the problem by replacing "^" with "-1" for 
> "previous line". It works now with FC4 and FC6. Likely it's an 
> incompatibility issue between ed-0.2-3 and ed-0.3-0.fc6.
> 

Merged, thanks.

-- 
Philippe.




^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2007-01-29 15:36 UTC | newest]

Thread overview: 18+ messages
2007-01-23 20:48 [Xenomai-help] isolating unwanted mode switch Jeff Weber
2007-01-23 21:45 ` Dmitry Adamushko
2007-01-23 22:27   ` Jeff Weber
2007-01-24 10:08     ` Dmitry Adamushko
2007-01-24 11:22       ` [Xenomai-help] Patching latest Denx 2.4.25-devel kernel with xenomai-2.3.0 fails Daniel Schnell
2007-01-24 14:44         ` Wolfgang Grandegger
2007-01-24 17:04           ` Daniel Schnell
2007-01-24 21:33             ` Wolfgang Grandegger
2007-01-25  7:14               ` Wolfgang Grandegger
2007-01-25  9:03                 ` Daniel Schnell
2007-01-25  9:38                   ` Wolfgang Grandegger
2007-01-29 15:36                 ` Philippe Gerum
2007-01-24 17:52       ` [Xenomai-help] isolating unwanted mode switch Jeff Weber
2007-01-24 18:55         ` Eric Noulard
2007-01-25  7:30           ` M. Koehrer
2007-01-25  8:24             ` Eric Noulard
2007-01-25 10:50         ` Philippe Gerum
2007-01-25 17:04           ` Jeff Weber
