* [Xenomai-core] Stalled xenomai domain with head-optimisation
@ 2006-05-11 16:22 Jan Kiszka
  2006-05-11 22:32 ` [Xenomai-core] " Philippe Gerum
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kiszka @ 2006-05-11 16:22 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core


Hi Philippe,

I had a bit of "fun" today trying to get some of our robotic hardware
running with latest Xenomai / Ipipe, also in order to test recent RTDM
fixes. It turned out that the head-optimised variant easily creates that
infamous stalled Xenomai domain, e.g. like this one:

> :     fn                 -212+   3.323  sched_clock+0xd (schedule+0x112)
> :     fn                 -209+   2.045  __ipipe_stall_root+0x8 (schedule+0x18e)
> :    *fn                 -207+   1.428  deactivate_task+0x9 (schedule+0x21e)
> :    *fn                 -205+   4.417  dequeue_task+0xa (deactivate_task+0x1a)
> :    *fn                 -201+   2.635  recalc_task_prio+0xd (schedule+0x317)
> :    *fn                 -198+   2.345  effective_prio+0x9 (recalc_task_prio+0x108)
> :    *fn                 -196+   3.443  requeue_task+0xa (schedule+0x344)
> :    *fn                 -192+   2.582  __ipipe_dispatch_event+0xe (schedule+0x412)
> :    *fn                 -190!  11.808  schedule_event+0xd (__ipipe_dispatch_event+0x5e)
> :|   *fn                 -178+   8.135  __switch_to+0xc (schedule+0x4fe)
> :    *fn                 -170+   3.714  __ipipe_unstall_root+0x8 (schedule+0x536)
> :     fn                 -166+   2.105  finish_wait+0xa (xnpipe_read+0x17c)
> :     fn                 -164+   1.368  __ipipe_test_and_stall_root+0x8 (finish_wait+0xae)
> :    *fn                 -163+   1.203  __ipipe_restore_root+0x8 (finish_wait+0x70)
> :    *fn                 -161+   6.210  __ipipe_unstall_root+0x8 (__ipipe_restore_root+0x2b)
> :|  * fn                 -155+   1.706  fput+0x8 (sys_read+0x5d)
> :|  * fn                 -153+   2.413  __ipipe_stall_root+0x8 (syscall_exit+0x5)
> :   **fn                 -151+   1.984  do_notify_resume+0x9 (work_notifysig+0x13)
> :   **fn                 -149+   1.894  do_signal+0x11 (do_notify_resume+0x2f)
> :   **fn                 -147+   1.330  get_signal_to_deliver+0xe (do_signal+0x4a)
> :   **fn                 -146+   2.022  __ipipe_stall_root+0x8 (get_signal_to_deliver+0x24)
> :   **fn                 -144+   2.060  dequeue_signal+0xb (get_signal_to_deliver+0xe9)
> :   **fn                 -142+   2.030  __dequeue_signal+0xe (dequeue_signal+0x21)
> :   **fn                 -140+   1.902  next_signal+0x9 (__dequeue_signal+0x1c)

This does not happen when I switch off Xenomai's head-optimisation. I
took this trace by patching shadow.c like this:

--- ksrc/nucleus/shadow.c       (revision 1074)
+++ ksrc/nucleus/shadow.c       (working copy)
@@ -1096,6 +1096,8 @@ static inline int do_hisyscall_event(uns
     xnthread_t *thread;
     u_long sysflags;

+    if (test_bit(IPIPE_STALL_FLAG, &rthal_domain.cpudata[0].status))
+        ipipe_trace_freeze(0);
     if (!nkpod || testbits(nkpod->status, XNPIDLE))
         goto no_skin;
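
The debugging hook above freezes the trace the moment a syscall arrives while the stall flag of the Xenomai domain is still set. A minimal userspace sketch of that idea follows. All `model_*` names and `MODEL_STALL_BIT` are hypothetical stand-ins, not the real I-pipe or tracer API, and the trace is reduced to an event counter:

```c
#include <stdbool.h>

/* Hypothetical stand-in for IPIPE_STALL_FLAG; the real bit number may differ. */
#define MODEL_STALL_BIT 0UL

/* Hypothetical, heavily simplified trace buffer: only counts events. */
struct model_trace {
    bool frozen;            /* once set, no further events are recorded */
    unsigned long events;   /* number of events recorded so far */
};

/* Userspace analogue of test_bit(): is bit nr set in *word? */
static bool model_test_bit(unsigned long nr, const unsigned long *word)
{
    return (*word >> nr) & 1UL;
}

/* Record one event unless the trace has been frozen. */
static void model_trace_event(struct model_trace *t)
{
    if (!t->frozen)
        t->events++;
}

/* Stop recording so the history leading up to this point is preserved. */
static void model_trace_freeze(struct model_trace *t)
{
    t->frozen = true;
}

/*
 * Syscall-entry check: if the domain is unexpectedly stalled, freeze the
 * trace and report it; otherwise keep tracing as usual.
 */
static bool model_syscall_entry(const unsigned long *domain_status,
                                struct model_trace *t)
{
    if (model_test_bit(MODEL_STALL_BIT, domain_status)) {
        model_trace_freeze(t);
        return true;
    }
    model_trace_event(t);
    return false;
}
```

Freezing at the first point where the inconsistency is visible is what keeps the preceding call history, like the excerpt quoted above, from being overwritten.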


You can reproduce the problem without special hardware by loading the
tims.ko module of our RACK framework [1], then starting tims_msg_client
(main/tims/router), and finally terminating it with ^C. The issue seems
to be somehow related to the pipe usage of TiMS.


Besides this bad news, there is fortunately also a lot of light: The
RTDM fixes and reorganisation did not cause regressions (phew...). Well,
and our RACK framework (+ various in-house extensions) runs really
smoothly over Xenomai. Specifically, terminating and reloading
applications during runtime, which used to be a nightmare with /other
RT-extensions/, works fine and causes neither latency spikes nor even
worse effects.

I did some benchmarking on a production system today with "latency -p
1000 -f", and got about 130 us worst-case jitter (266 MHz Pentium-MMX,
tracer enabled) for this highest-prio task. And all this happened while
running various RT and non-RT jobs (e.g. cache calibrator) + xeno_16550A
(2 ports, one at 500 kbit/s) in background. =8)

Jan


[1] http://developer.berlios.de/projects/rack



* [Xenomai-core] Re: Stalled xenomai domain with head-optimisation
  2006-05-11 16:22 [Xenomai-core] Stalled xenomai domain with head-optimisation Jan Kiszka
@ 2006-05-11 22:32 ` Philippe Gerum
  2006-05-11 23:56   ` Jan Kiszka
  0 siblings, 1 reply; 5+ messages in thread
From: Philippe Gerum @ 2006-05-11 22:32 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Hi Philippe,
> 
> I had a bit "fun" today trying to get some of our robotic hardware
> running with latest Xenomai / Ipipe, also in order to test recent RTDM
> fixes. It turned out that the head-optimised variant easily creates that
> infamous stalled Xenomai domain, e.g. like this one:
> 

Eeek... Ok, two things come to mind for debugging this issue. The first is
to check that the assumption in ipipe_restore_pipeline_head() holds, so could
you please test the patch below and see if the situation improves (it cannot
get worse anyway):

--- include/linux/ipipe.h~	2006-05-08 12:17:06.000000000 +0200
+++ include/linux/ipipe.h	2006-05-12 00:17:00.000000000 +0200
@@ -563,7 +563,9 @@
  static inline void ipipe_restore_pipeline_head(unsigned long x)
  {
  	struct ipipe_domain *head = __ipipe_pipeline_head();
+#if 0
  	if (x != test_bit(IPIPE_STALL_FLAG, &head->cpudata[ipipe_processor_id()].status))
+#endif
  		__ipipe_restore_pipeline_head(head,x);
  }
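
For readers following along, the difference between the guarded and the unguarded restore can be sketched in plain C. This is a simplified userspace model under stated assumptions, not the real I-pipe code: `struct model_head` and the `model_*` functions are hypothetical, and the model assumes that restoring to the unstalled state also delivers whatever interrupts were logged in the meantime.

```c
#include <stdbool.h>

/* Hypothetical model of the head domain's per-CPU state. */
struct model_head {
    bool stalled;        /* stand-in for the stall flag */
    unsigned pending;    /* interrupts logged but not yet delivered */
    unsigned delivered;  /* interrupts handed to their handlers */
};

/* Unconditional restore: set the flag and, when unstalling, flush the log. */
static void model_restore(struct model_head *h, bool x)
{
    h->stalled = x;
    if (!x) {
        h->delivered += h->pending;
        h->pending = 0;
    }
}

/*
 * Guarded restore: skip all work when the flag already has the requested
 * value. This is only safe if "flag already matches" implies "nothing is
 * left to do", which is the assumption under test.
 */
static void model_restore_if_changed(struct model_head *h, bool x)
{
    if (x != h->stalled)
        model_restore(h, x);
}
```

In this model, a head that is unstalled but still has logged interrupts is left untouched by the guarded variant, whereas the unconditional variant flushes them. Whether such a state can occur in the real pipeline is exactly what the patch probes.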


Second, if the first try is unsuccessful, could you try disabling the wired
interrupt support as shown below, keeping the rest of the invariant pipeline
head optimizations active?

--- kernel/ipipe/core.c~	2006-05-07 18:05:28.000000000 +0200
+++ kernel/ipipe/core.c	2006-05-11 18:34:57.000000000 +0200
@@ -482,8 +482,10 @@
  	if (ipd->irqs[irq].control & IPIPE_SYSTEM_MASK)
  		return -EPERM;

+#if 0
  	if (!test_bit(IPIPE_AHEAD_FLAG, &ipd->flags))
  		/* Silently unwire interrupts for non-heading domains. */
+#endif
  		modemask &= ~IPIPE_WIRED_MASK;

  	spin_lock_irqsave_hw(&__ipipe_pipelock, flags);

-- 

Philippe.



* Re: [Xenomai-core] Re: Stalled xenomai domain with head-optimisation
  2006-05-11 22:32 ` [Xenomai-core] " Philippe Gerum
@ 2006-05-11 23:56   ` Jan Kiszka
  2006-05-12  6:15     ` Philippe Gerum
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kiszka @ 2006-05-11 23:56 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core

Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Hi Philippe,
>>
>> I had a bit "fun" today trying to get some of our robotic hardware
>> running with latest Xenomai / Ipipe, also in order to test recent RTDM
>> fixes. It turned out that the head-optimised variant easily creates that
>> infamous stalled Xenomai domain, e.g. like this one:
>>
> 
> Eeek... Ok, two things come to mind for debugging this issue. The first is
> to check that the assumption in ipipe_restore_pipeline_head() holds, so could
> you please test the patch below and see if the situation improves (it cannot
> get worse anyway):
> 
> --- include/linux/ipipe.h~    2006-05-08 12:17:06.000000000 +0200
> +++ include/linux/ipipe.h    2006-05-12 00:17:00.000000000 +0200
> @@ -563,7 +563,9 @@
>  static inline void ipipe_restore_pipeline_head(unsigned long x)
>  {
>      struct ipipe_domain *head = __ipipe_pipeline_head();
> +#if 0
>      if (x != test_bit(IPIPE_STALL_FLAG, &head->cpudata[ipipe_processor_id()].status))
> +#endif
>          __ipipe_restore_pipeline_head(head,x);
>  }
> 
> 
> Second, if the first try is unsuccessful, could you try disabling the wired
> interrupt support as shown below, keeping the rest of the invariant pipeline
> head optimizations active?
> 
> --- kernel/ipipe/core.c~    2006-05-07 18:05:28.000000000 +0200
> +++ kernel/ipipe/core.c    2006-05-11 18:34:57.000000000 +0200
> @@ -482,8 +482,10 @@
>      if (ipd->irqs[irq].control & IPIPE_SYSTEM_MASK)
>          return -EPERM;
> 
> +#if 0
>      if (!test_bit(IPIPE_AHEAD_FLAG, &ipd->flags))
>          /* Silently unwire interrupts for non-heading domains. */
> +#endif
>          modemask &= ~IPIPE_WIRED_MASK;
> 
>      spin_lock_irqsave_hw(&__ipipe_pipelock, flags);
> 

As long as I didn't mess up my build (it's late...): no effect from
either patch.

Jan



* Re: [Xenomai-core] Re: Stalled xenomai domain with head-optimisation
  2006-05-11 23:56   ` Jan Kiszka
@ 2006-05-12  6:15     ` Philippe Gerum
  2006-05-12  7:54       ` Jan Kiszka
  0 siblings, 1 reply; 5+ messages in thread
From: Philippe Gerum @ 2006-05-12  6:15 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> 
> As long as I didn't mess up my build (it's late...): no effect from
> either patch.
> 

Ok, thanks. So does the following patch on the trunk/ make the bug
disappear when using vanilla Adeos support?

--- include/asm-generic/hal.h	(revision 1078)
+++ include/asm-generic/hal.h	(working copy)
@@ -118,7 +118,7 @@
  /* Obsolete Adeos patches do not support the invariant pipeline head
     optimization, so we check for the presence of __ipipe_pipeline_head
     to detect it. */
-#if defined(CONFIG_XENO_OPT_PIPELINE_HEAD) && defined(__ipipe_pipeline_head)
+#if defined(CONFIG_XENO_OPT_PIPELINE_HEAD) && defined(__ipipe_pipeline_head) && 0
  #define rthal_local_irq_disable()	ipipe_stall_pipeline_head()
  #define rthal_local_irq_enable()	ipipe_unstall_pipeline_head()
  #define rthal_local_irq_save(x)		((x) = !!ipipe_test_and_stall_pipeline_head())
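
As a side note on the mechanics of this patch: appending `&& 0` makes the `#if` condition false regardless of the two `defined()` tests, so the build falls through to the non-optimised definitions. The sketch below illustrates that, together with the save/restore idiom the macros follow. Every `MODEL_*`/`model_*` name is hypothetical, and the stall state is modelled as bit 0 of a plain variable rather than real per-CPU pipeline data.

```c
/* Hypothetical stand-ins for the config option and the detection macro. */
#define MODEL_OPT_PIPELINE_HEAD 1
#define model_pipeline_head() 0

/* The trailing "&& 0" forces the #else branch even though both are defined. */
#if defined(MODEL_OPT_PIPELINE_HEAD) && defined(model_pipeline_head) && 0
#define MODEL_FAST_PATH 1
#else
#define MODEL_FAST_PATH 0
#endif

/* Bit 0 models the stall flag of the domain being protected. */
static unsigned long model_status;

/* Set the stall bit and return its previous value. */
static unsigned long model_test_and_stall(void)
{
    unsigned long old = model_status & 1UL;
    model_status |= 1UL;
    return old;
}

/* Save the previous stall state into x, normalised to 0 or 1 by "!!". */
#define model_irq_save(x) ((x) = !!model_test_and_stall())

/* Put the stall bit back to the state captured by model_irq_save(). */
static void model_irq_restore(unsigned long x)
{
    model_status = (model_status & ~1UL) | (x & 1UL);
}
```

Nested save/restore pairs unwind correctly in this model because each level restores the state it observed, not a fixed value.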

-- 

Philippe.



* Re: [Xenomai-core] Re: Stalled xenomai domain with head-optimisation
  2006-05-12  6:15     ` Philippe Gerum
@ 2006-05-12  7:54       ` Jan Kiszka
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Kiszka @ 2006-05-12  7:54 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai-core


Philippe Gerum wrote:
> Jan Kiszka wrote:
>>
>> As long as I didn't mess up my build (it's late...): no effect from
>> either patch.
>>
> 
> Ok, thanks. So does the following patch on the trunk/ make the bug
> disappear when using vanilla Adeos support?
> 
> --- include/asm-generic/hal.h    (revision 1078)
> +++ include/asm-generic/hal.h    (working copy)
> @@ -118,7 +118,7 @@
>  /* Obsolete Adeos patches do not support the invariant pipeline head
>     optimization, so we check for the presence of __ipipe_pipeline_head
>     to detect it. */
> -#if defined(CONFIG_XENO_OPT_PIPELINE_HEAD) && defined(__ipipe_pipeline_head)
> +#if defined(CONFIG_XENO_OPT_PIPELINE_HEAD) && defined(__ipipe_pipeline_head) && 0
>  #define rthal_local_irq_disable()    ipipe_stall_pipeline_head()
>  #define rthal_local_irq_enable()    ipipe_unstall_pipeline_head()
>  #define rthal_local_irq_save(x)        ((x) = !!ipipe_test_and_stall_pipeline_head())
> 

This works now.

Jan

