[Xenomai-help] Page faults

* [Xenomai-help] Page faults
@ 2006-02-28 14:22 Jeroen Van den Keybus
  2006-02-28 15:05 ` Jan Kiszka
  2006-02-28 16:31 ` Philippe Gerum
  0 siblings, 2 replies; 17+ messages in thread
From: Jeroen Van den Keybus @ 2006-02-28 14:22 UTC (permalink / raw)
  To: xenomai@xenomai.org

I'm observing a considerable number of page faults (5090 after an hour or
so), each one associated with an MSW (mode switch) increase in
/proc/xenomai/stat. I'm missing RT deadlines on those occasions and I want
to fix it, so I would like to know what these page faults actually are.

Jeroen.


* Re: [Xenomai-help] Page faults
  2006-02-28 14:22 [Xenomai-help] Page faults Jeroen Van den Keybus
@ 2006-02-28 15:05 ` Jan Kiszka
  2006-02-28 15:29   ` Jeroen Van den Keybus
  2006-02-28 16:31 ` Philippe Gerum
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2006-02-28 15:05 UTC (permalink / raw)
  To: Jeroen Van den Keybus; +Cc: xenomai@xenomai.org

Jeroen Van den Keybus wrote:
> I'm observing a considerable number of page faults (5090 after an hour or
> so), each one associated with an MSW (mode switch) increase in
> /proc/xenomai/stat. I'm missing RT deadlines on those occasions and I want
> to fix it, so I would like to know what these page faults actually are.
> 

Maybe not all of your user memory is cleanly locked and got swapped out
(swapping activated?). If you don't see kernel oopses and your programs
don't receive segfaults, these faults are references to swapped out or
not yet mapped in pages. BTW, what does the kernel console tell you?

The deadline misses as a result are "normal": your RT thread gets
switched to secondary mode to handle the fault, and handling it may take
some time...

Jan


* Re: [Xenomai-help] Page faults
  2006-02-28 15:05 ` Jan Kiszka
@ 2006-02-28 15:29   ` Jeroen Van den Keybus
  2006-02-28 16:29     ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Jeroen Van den Keybus @ 2006-02-28 15:29 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai@xenomai.org

>
> Maybe not all of your user memory is cleanly locked and got swapped out
> (swapping activated?). If you don't see kernel oopses and your programs
> don't receive segfaults, these faults are references to swapped out or
> not yet mapped in pages. BTW, what does the kernel console tell you?

I do have an mlockall(MCL_CURRENT | MCL_FUTURE) call in my main() (non-RT)
task. Do I need to repeat that for the real-time tasks as well?

I'm quite certain that my kernel is configured for swapping. I did not see
any harm in it, as mlockall() was called. Should I try turning it off (and
if so, where)?

I do not see any oopses and the dmesg log is clean, apart from one
incidental spurious IRQ7 interrupt. Is 'dmesg' what you mean by 'kernel
console'?

Jeroen.


* Re: [Xenomai-help] Page faults
  2006-02-28 15:29   ` Jeroen Van den Keybus
@ 2006-02-28 16:29     ` Jan Kiszka
  0 siblings, 0 replies; 17+ messages in thread
From: Jan Kiszka @ 2006-02-28 16:29 UTC (permalink / raw)
  To: Jeroen Van den Keybus; +Cc: xenomai@xenomai.org

Jeroen Van den Keybus wrote:
>> Maybe not all of your user memory is cleanly locked and got swapped out
>> (swapping activated?). If you don't see kernel oopses and your programs
>> don't receive segfaults, these faults are references to swapped out or
>> not yet mapped in pages. BTW, what does the kernel console tell you?
> 
> 
> I have a mlockall(MCL_CURRENT | MCL_FUTURE) in my main() (non-RT) task
> alright. Do I need to repeat that for the real-time tasks as well ?

You should not have to. I have observed cases where malloc'ed memory was
not immediately available and had to be touched first. I guess this
depends on how glibc requests the memory. So, if unsure, do a memset() on
freshly allocated memory.
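
The advice above can be sketched as follows. This is a minimal
illustration, assuming a Linux/glibc process that has already called
mlockall(); the helper name alloc_prefaulted() is hypothetical. It writes
one byte per page through a volatile pointer rather than calling memset(),
because some compilers may fold malloc() followed by memset(0) into
calloc(), which does not necessarily touch the pages.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate a buffer and write to every page of it,
 * so that each page is mapped before time-critical code uses it.
 * Only the touched bytes are set to 0; the rest is left uninitialized. */
static void *alloc_prefaulted(size_t size)
{
    assert(size > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                    /* assumed fallback page size */

    volatile unsigned char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    /* One write per page is enough to force the mapping. */
    for (size_t off = 0; off < size; off += (size_t)page)
        buf[off] = 0;
    buf[size - 1] = 0;                  /* covers the final partial page */

    return (void *)buf;
}
```

Buffers obtained this way would be allocated during the startup phase,
before any real-time task touches them.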

> 
> I'm quite certain that my kernel is configured for swapping. I did not see
> any harm in it, as a mlockall() was given. Should I try turning it off (if
> so, where) ?

Swapping does not do any harm, correct. I just wanted to check what may
trigger the problem.

> 
> I do not see 'oopses' and the dmesg log is clean, apart from one
> incidental spurious IRQ7 interrupt. Is 'dmesg' what you mean by 'kernel
> console' ?
> 

Ah, I overlooked a dependency. Try enabling CONFIG_XENO_OPT_DEBUG. Then
the nucleus should print which task is being relaxed due to faults:

"Switching XYZ to secondary mode after exception #? from user-space at
0x??? (pid ?)"

Jan




* Re: [Xenomai-help] Page faults
  2006-02-28 14:22 [Xenomai-help] Page faults Jeroen Van den Keybus
  2006-02-28 15:05 ` Jan Kiszka
@ 2006-02-28 16:31 ` Philippe Gerum
  2006-02-28 17:08   ` Jeroen Van den Keybus
  1 sibling, 1 reply; 17+ messages in thread
From: Philippe Gerum @ 2006-02-28 16:31 UTC (permalink / raw)
  To: Jeroen Van den Keybus; +Cc: xenomai@xenomai.org

Jeroen Van den Keybus wrote:
> I'm observing a considerable number of page faults (5090 after an hour
> or so), each one associated with an MSW (mode switch) increase in
> /proc/xenomai/stat. I'm missing RT deadlines on those occasions and I
> want to fix it, so I would like to know what these page faults actually are.
>  

Since page faults cause mode switches as you observed, you might want to try using 
a built-in feature that sends a SIGXCPU signal to a task going back to the Linux 
domain (e.g. typically relaxing to process a fault). The way to do this is as follows:

- set the T_WARNSW bit for your task using the rt_task_set_mode() call. Something 
like rt_task_set_mode(0,T_WARNSW,NULL) would do.

- code a Linux signal handler, and register it for receiving SIGXCPU.

The signal is always sent on behalf of the relaxing task, so you just need to
inspect the backtrace to discover the cause of the mode switch by analysing the
inner frames. You may either use GDB and set a breakpoint in the handler, or use
the backtrace() family of routines (backtrace(), backtrace_symbols(), ...)
available from glibc.

ksrc/skins/native/snippets/sigxcpu.c gives an example of such use.


-- 

Philippe.


* Re: [Xenomai-help] Page faults
  2006-02-28 16:31 ` Philippe Gerum
@ 2006-02-28 17:08   ` Jeroen Van den Keybus
  0 siblings, 0 replies; 17+ messages in thread
From: Jeroen Van den Keybus @ 2006-02-28 17:08 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

Debugging tools also pointed to 'normal' page faults. That made me look at
the mlockall() call, which turned out to be failing. I had made a mistake in
evaluating its (non-zero) return value, so the error went unnoticed.

The reason mlockall() fails is that I executed the real-time program as
non-root. (Guess I'll have to figure out that daunting sudoers thing...)
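
A minimal sketch of a return-value check that would have caught this,
assuming a standard Linux mlockall(), which returns 0 on success and -1
with errno set on failure. The helper name lock_all_memory() is
hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: lock current and future pages in RAM.
 * Returns 0 on success, or the errno value reported by mlockall(). */
static int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int err = errno;   /* save before any other call can change it */

        fprintf(stderr, "mlockall failed: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

When run without sufficient privileges or with a low RLIMIT_MEMLOCK, the
reported error would typically be EPERM or ENOMEM.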

Thanks for your advice, which led me in the right direction.

Jeroen.


* [Xenomai-help] page faults
@ 2007-04-16 19:49 Jeff Weber
  2007-04-16 20:05 ` Philippe Gerum
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Weber @ 2007-04-16 19:49 UTC (permalink / raw)
  To: Xenomai Help

I need some help finding the cause for page faults in my RT application.
My app has a startup phase, where I can tolerate page faults, and a hard 
realtime phase, where page faults cannot be tolerated.  Trouble is, I 
continue to get page faults in the hard realtime phase.

The app encounters a page fault while writing to a heap buffer.  I've even 
added steps to clear the entire buffer in the startup phase, after 
mlockall(), to ensure each page is locked in place.  Here's a timeline of the 
fault:

static constructor runs and allocates buffer from heap
application main() runs
begin application startup phase
mlockall(MCL_CURRENT | MCL_FUTURE) runs and returns 0
entire buffer cleared
paranoid code even verifies buffer addr 0x81f4000 contents are 0
begin application hard realtime phase
time elapses ...
page fault copying stack addr 0xb645a2d1 to heap buffer addr 0x81f4000

The page fault is confirmed by the kernel debug message:
Xenomai: Switching mythread to secondary mode after exception #14 from 
user-space at 0x80fb8c5 (pid 3558)

as well as the delivery of SIGXCPU to my application (at my request).

How do I prevent this page fault?

Is this issue covered by the recent NOCOW activity?

	thanks,
	Jeff

my config:
CPU: VIA Nehemiah (i386)
ipipe version: 1.5-00
Xenomai: 2.2.4
Linux kernel: 2.6.17.14


* Re: [Xenomai-help] page faults
  2007-04-16 19:49 [Xenomai-help] page faults Jeff Weber
@ 2007-04-16 20:05 ` Philippe Gerum
  2007-04-16 20:20   ` Jeff Weber
  0 siblings, 1 reply; 17+ messages in thread
From: Philippe Gerum @ 2007-04-16 20:05 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai Help

On Mon, 2007-04-16 at 14:49 -0500, Jeff Weber wrote:
> I need some help finding the cause for page faults in my RT application.
> My app has a startup phase, where I can tolerate page faults, and a hard 
> realtime phase, where page faults cannot be tolerated.  Trouble is, I 
> continue to get page faults in the hard realtime phase.
> 
> The app encounters a page fault while writing to a heap buffer.  I've even 
> added steps to clear the entire buffer in the startup phase, after 
> mlockall(), to ensure each page is locked in place.  Here's a timeline of the 
> fault:
> 
> static constructor runs and allocates buffer from heap
> application main() runs
> begin application startup phase
> mlockall(MCL_CURRENT | MCL_FUTURE) runs and returns 0
> entire buffer cleared
> paranoid code even verifies buffer addr 0x81f4000 contents are 0
> begin application hard realtime phase
> time elapses ...
> page fault copying stack addr 0xb645a2d1 to heap buffer addr 0x81f4000
> 
> The page fault is confirmed by the kernel debug message:
> Xenomai: Switching mythread to secondary mode after exception #14 from 
> user-space at 0x80fb8c5 (pid 3558)
> 

Could you disassemble the code around location 0x80fb8c5?

> as well as the delivery of SIGXCPU to my application (at my request).
> 
> How do I prevent this page fault?
> 
> Is this issue covered by the recent NOCOW activity?
> 

Possibly. You need I-pipe 1.7-03 and Xenomai >= v2.3.1 to get the
ondemand mapping scheme disabled by the nucleus when your thread starts.

-- 
Philippe.





* Re: [Xenomai-help] page faults
  2007-04-16 20:05 ` Philippe Gerum
@ 2007-04-16 20:20   ` Jeff Weber
  2007-04-16 20:43     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Weber @ 2007-04-16 20:20 UTC (permalink / raw)
  To: rpm; +Cc: Xenomai Help

On Monday 16 April 2007 15:05, Philippe Gerum wrote:

> Could you disassemble the code around location 0x80fb8c5?
The latest version of my code has moved the addresses a bit:
Xenomai: Switching mythread to secondary mode after exception #14 from user-space at 0x80fb8cd (pid 3590)

(gdb) disas
Dump of assembler code for function _ZN4AMSC8CRtDequeIcE9push_backERKc:
0x080fb8b4 <_ZN4AMSC8CRtDequeIcE9push_backERKc+0>:      push   %ebp
0x080fb8b5 <_ZN4AMSC8CRtDequeIcE9push_backERKc+1>:      mov    %esp,%ebp
0x080fb8b7 <_ZN4AMSC8CRtDequeIcE9push_backERKc+3>:      sub    $0x8,%esp
0x080fb8ba <_ZN4AMSC8CRtDequeIcE9push_backERKc+6>:      mov    0x8(%ebp),%edx
0x080fb8bd <_ZN4AMSC8CRtDequeIcE9push_backERKc+9>:      mov    0x8(%ebp),%eax
0x080fb8c0 <_ZN4AMSC8CRtDequeIcE9push_backERKc+12>:     mov    0x8(%eax),%eax
0x080fb8c3 <_ZN4AMSC8CRtDequeIcE9push_backERKc+15>:     mov    (%edx),%edx
0x080fb8c5 <_ZN4AMSC8CRtDequeIcE9push_backERKc+17>:     add    %eax,%edx
0x080fb8c7 <_ZN4AMSC8CRtDequeIcE9push_backERKc+19>:     mov    0xc(%ebp),%eax
0x080fb8ca <_ZN4AMSC8CRtDequeIcE9push_backERKc+22>:     movzbl (%eax),%eax
0x080fb8cd <_ZN4AMSC8CRtDequeIcE9push_backERKc+25>:     mov    %al,(%edx)
0x080fb8cf <_ZN4AMSC8CRtDequeIcE9push_backERKc+27>:     mov    0x8(%ebp),%eax
0x080fb8d2 <_ZN4AMSC8CRtDequeIcE9push_backERKc+30>:     add    $0x8,%eax
0x080fb8d5 <_ZN4AMSC8CRtDequeIcE9push_backERKc+33>:     mov    %eax,0x4(%esp)
0x080fb8d9 <_ZN4AMSC8CRtDequeIcE9push_backERKc+37>:     mov    0x8(%ebp),%eax
0x080fb8dc <_ZN4AMSC8CRtDequeIcE9push_backERKc+40>:     mov    %eax,(%esp)
0x080fb8df <_ZN4AMSC8CRtDequeIcE9push_backERKc+43>:     call   0x80fcb18 <_ZN4AMSC8CRtDequeIcE3incERi>

>
> > as well as the delivery of SIGXCPU to my application (at my request).
> >
> > How do I prevent this page fault?
> >
> > Is this issue covered by the recent NOCOW activity?
>
> Possibly. You need I-pipe 1.7-03 and Xenomai >= v2.3.1 to get the
> ondemand mapping scheme disabled by the nucleus when your thread starts.
I am not familiar with the purpose and implementation of the NOCOW patch.
How would the patch affect my page fault issue?

	thanks,
	Jeff



* Re: [Xenomai-help] page faults
  2007-04-16 20:20   ` Jeff Weber
@ 2007-04-16 20:43     ` Gilles Chanteperdrix
  2007-04-16 21:27       ` Jeff Weber
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-04-16 20:43 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai Help

Jeff Weber wrote:
 > I am not familiar with the purpose and implementation of the NOCOW patch.
 > How would the patch affect my page fault issue?

If the fault you observe is due to an access to some memory after a call
to fork or one of its derivatives (such as system, popen, etc.), the
patch would have copied the whole real-time process address space at
fork time instead of setting up COW mappings.

-- 


					    Gilles Chanteperdrix.



* Re: [Xenomai-help] page faults
  2007-04-16 20:43     ` Gilles Chanteperdrix
@ 2007-04-16 21:27       ` Jeff Weber
  2007-04-16 21:34         ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Weber @ 2007-04-16 21:27 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai Help

On Monday 16 April 2007 15:43, Gilles Chanteperdrix wrote:
> If the fault you observe is due to an access to some memory after a call
> to fork or one of its derivatives (such as system, popen, etc.), the
> patch would have copied the whole real-time process address space at
> fork time instead of setting up COW mappings.
No process forks are involved.  Though mlockall() was called from Linux 
main(), and the page fault was encountered by a separate Xenomai task.  
Here's the task history:

static constructor allocates buffer from heap
enter main()
call mlockall(), verify return status == 0
buffer is forcibly cleared using memset(buffer, 0, sizeof(buffer))
main calls rt_task_shadow() to create Xenomai startup task
startup task spawns Xenomai communications task via rt_task_start()
communications task encounters page fault

Let me know if you are able to spot the source of the page faults.

	thanks,
	Jeff



* Re: [Xenomai-help] page faults
  2007-04-16 21:27       ` Jeff Weber
@ 2007-04-16 21:34         ` Gilles Chanteperdrix
  2007-04-17 13:21           ` Jeff Weber
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-04-16 21:34 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai Help

Jeff Weber wrote:
 > On Monday 16 April 2007 15:43, Gilles Chanteperdrix wrote:
 > > If the fault you observe is due to an access to some memory after a call
 > > to fork or one of its derivatives (such as system, popen, etc.), the
 > > patch would have copied the whole real-time process address space at
 > > fork time instead of setting up COW mappings.
 > No process forks are involved.  Though mlockall() was called from Linux 
 > main(), and the page fault was encountered by a separate Xenomai task.  
 > Here's the task history:

The fork may well be hidden in some library. The best way to know if
there is really no fork is to register a callback with pthread_atfork.
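
A minimal sketch of such a callback, assuming glibc on Linux; the names
on_fork_prepare(), install_fork_detector() and fork_seen are hypothetical.
The prepare handler runs in the parent just before fork() proceeds, so a
message or a breakpoint there reveals the caller. Note that recent glibc
versions are documented as no longer running atfork handlers for system()
and popen(), so this check may miss those on newer systems.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical flag set when any fork() is about to happen. */
static volatile int fork_seen;

/* Prepare handler: called in the parent right before fork() proceeds. */
static void on_fork_prepare(void)
{
    static const char msg[] = "fork() detected\n";
    ssize_t r;

    fork_seen = 1;
    r = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)r;                /* nothing useful to do if the write fails */
}

/* Register the detector; returns 0 on success, an error number otherwise. */
static int install_fork_detector(void)
{
    return pthread_atfork(on_fork_prepare, NULL, NULL);
}
```

Calling install_fork_detector() early in main() and setting a GDB
breakpoint on on_fork_prepare() would show which library issued the fork.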

-- 


					    Gilles Chanteperdrix.



* Re: [Xenomai-help] page faults
  2007-04-16 21:34         ` Gilles Chanteperdrix
@ 2007-04-17 13:21           ` Jeff Weber
  2007-04-17 19:17             ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Weber @ 2007-04-17 13:21 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai Help

On Monday 16 April 2007 16:34, Gilles Chanteperdrix wrote:
> The fork may well be hidden in some library. The best way to know if
> there is really no fork is to register a callback with pthread_atfork.
pthread_atfork confirms that there is no fork.

	Jeff
-- 
Jeff Weber
American Superconductor Corp.



* Re: [Xenomai-help] page faults
  2007-04-17 13:21           ` Jeff Weber
@ 2007-04-17 19:17             ` Gilles Chanteperdrix
  2007-04-17 20:59               ` Jeff Weber
  2007-04-20 16:43               ` Jeff Weber
  0 siblings, 2 replies; 17+ messages in thread
From: Gilles Chanteperdrix @ 2007-04-17 19:17 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai Help

Jeff Weber wrote:
 > pthread_atfork confirms that there is no fork.

Ok. I am afraid you will have to help us a bit. Could you try Xenomai
2.3.1 in case the nocow patch magically solves your issue?

If it does not, could you try cutting your program down to a small
example that we could run to reproduce the issue?

If reducing your program is not possible, the only option left is to
start debugging this issue. A good starting point would be to put some
printks in arch/i386/mm/fault.c to see what kind of page fault is
causing the mode switch.

-- 


					    Gilles Chanteperdrix.



* Re: [Xenomai-help] page faults
  2007-04-17 19:17             ` Gilles Chanteperdrix
@ 2007-04-17 20:59               ` Jeff Weber
  2007-04-20 16:43               ` Jeff Weber
  1 sibling, 0 replies; 17+ messages in thread
From: Jeff Weber @ 2007-04-17 20:59 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai Help

On Tuesday 17 April 2007 14:17, Gilles Chanteperdrix wrote:

> Ok. I am afraid you will have to help us a bit. Could you try Xenomai
> 2.3.1 in case the nocow patch magically solves your issue ?
Will do.  That may take a little while due to other considerations on this 
end.  I'll get back to you.
>
> If it does not, could you try sizing down your program to a small
> example that we could run to reproduce the issue ?
I've already been working on this, with no luck so far.
>
> If reducing your program is not possible, the only option left is to
> start debugging this issue. A good starting point would be to put some
> printks in arch/i386/mm/fault.c to see what kind of page fault is
> causing the mode switch.
Let's hope it doesn't come to this.  :-)

	Jeff



* Re: [Xenomai-help] page faults
  2007-04-17 19:17             ` Gilles Chanteperdrix
  2007-04-17 20:59               ` Jeff Weber
@ 2007-04-20 16:43               ` Jeff Weber
  2007-04-20 17:24                 ` Philippe Gerum
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff Weber @ 2007-04-20 16:43 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai Help

On Tuesday 17 April 2007 14:17, Gilles Chanteperdrix wrote:
> Ok. I am afraid you will have to help us a bit. Could you try Xenomai
> 2.3.1 in case the nocow patch magically solves your issue ?
Indeed, stepping up to:

linux-2.6.20.3
xenomai-2.3.1
ipipe-1.7-03

magically solved my page faults.

	thanks!
	Jeff



* Re: [Xenomai-help] page faults
  2007-04-20 16:43               ` Jeff Weber
@ 2007-04-20 17:24                 ` Philippe Gerum
  0 siblings, 0 replies; 17+ messages in thread
From: Philippe Gerum @ 2007-04-20 17:24 UTC (permalink / raw)
  To: Jeff Weber; +Cc: Xenomai Help

On Fri, 2007-04-20 at 11:43 -0500, Jeff Weber wrote:
> On Tuesday 17 April 2007 14:17, Gilles Chanteperdrix wrote:
> > Ok. I am afraid you will have to help us a bit. Could you try Xenomai
> > 2.3.1 in case the nocow patch magically solves your issue ?
> Indeed, stepping up to:
> 
> linux-2.6.20.3
> xenomai-2.3.1
> ipipe-1.7-03
> 
> magically solved my page faults.

Ok, so some of the memory being referenced must have been sitting in a
COWed shared VM page, since 1.7-03 eagerly breaks COW for all the
virtual memory available to real-time threads.

In case you really, really, really don't want to upgrade to 2.3.x for
product maintenance issues (albeit I would really, really, really
recommend it, because there are really, really, really nice improvements
there over 2.2.x), I've crafted a quick patch to disable ondemand VM
mappings with older Xenomai revs using recent I-pipe patches. IOW, this
patch against v2.2.x should allow you to run the latter over I-pipe
1.7-03, and make use of its COW-breaking feature (uncompiled, untested).

--- include/asm-generic/hal.h	(revision 2323)
+++ include/asm-generic/hal.h	(working copy)
@@ -304,6 +304,13 @@
 #define PF_EVNOTIFY  0
 #endif	/* !PF_EVNOTIFY */
 
+#ifdef VM_PINNED
+#define rthal_disable_ondemand_mappings(tsk)   ipipe_disable_ondemand_mappings(tsk)
+#else /* !VM_PINNED */
+/* In case the I-pipe does not allow disabling ondemand mappings. */
+#define rthal_disable_ondemand_mappings(tsk)   (0)
+#endif	/* !VM_PINNED */
+
 #ifdef CONFIG_KGDB
 #define rthal_set_foreign_stack(ipd)	ipipe_set_foreign_stack(ipd)
 #define rthal_clear_foreign_stack(ipd)	ipipe_clear_foreign_stack(ipd)
Index: ksrc/nucleus/shadow.c
===================================================================
--- ksrc/nucleus/shadow.c	(revision 2323)
+++ ksrc/nucleus/shadow.c	(working copy)
@@ -828,6 +828,14 @@
 			thread->name, current->pid,
 			xnthread_base_priority(thread));
 
+#ifdef CONFIG_MMU
+	if (!(current->mm->def_flags & VM_LOCKED))
+		send_sig(SIGXCPU, current, 1);
+	else
+		if ((err = rthal_disable_ondemand_mappings(current)))
+			return err;
+#endif /* CONFIG_MMU */
+
 	/* Switch on propagation of normal kernel events for the bound
 	   task. This is basically a per-task event filter which
 	   restricts event notifications (e.g. syscalls) to tasks
@@ -836,11 +844,6 @@
 	   plain (i.e. non-Xenomai) Linux tasks. */
 	current->flags |= PF_EVNOTIFY;
 
-#ifdef CONFIG_MMU
-	if (!(current->mm->def_flags & VM_LOCKED))
-		send_sig(SIGXCPU, current, 1);
-#endif /* CONFIG_MMU */
-
 	current->cap_effective |=
 	    CAP_TO_MASK(CAP_IPC_LOCK) |
 	    CAP_TO_MASK(CAP_SYS_RAWIO) | CAP_TO_MASK(CAP_SYS_NICE);
-- 
Philippe.




