[Xenomai-core] Watchdog / immediate Linux signal delivery

* [Xenomai-core] Watchdog / immediate Linux signal delivery
@ 2009-03-08 10:42 Jan Kiszka
  2009-03-08 13:41 ` Gilles Chanteperdrix
  2009-03-09 15:50 ` Philippe Gerum
  0 siblings, 2 replies; 21+ messages in thread
From: Jan Kiszka @ 2009-03-08 10:42 UTC (permalink / raw)
  To: xenomai-core

Hi,

The watchdog is currently broken in trunk ("zombie [...] would not
die..."). In fact, it should also be broken in older versions, but only
the recent thread termination rework made this visible.

When a Xenomai CPU hog is caught by the watchdog, xnpod_delete_thread is
invoked, causing the current thread to be set in zombie state and
scheduled out. But as its Linux mate still exists, hell breaks loose once
Linux tries to get rid of it (the Xenomai zombie is scheduled in again).
In short: calling xnpod_delete_thread(<self>) for a shadow thread is not
working, and probably never worked cleanly.

There are basically two approaches to fix it. The first one is to find a
different way to kill (or only suspend?) the current shadow thread when
the watchdog strikes. The second one brought me to another issue: raise
SIGKILL for the current thread and make sure that it can be processed by
Linux (e.g. via xnpod_suspend_thread(<cpu-hog>)). Unfortunately, there is
no way to force a shadow thread into secondary mode to handle pending
Linux signals unless that thread issues a syscall once in a while. And
that raises the question of whether we should improve this as well while
we are at it.

Granted, non-broken Xenomai user space threads always issue frequent
syscalls, otherwise the system would starve (and the watchdog would come
around). On the other hand, delaying signals till syscall prologues is
different from plain Linux behaviour...

Comments, ideas?

Jan

* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-08 10:42 [Xenomai-core] Watchdog / immediate Linux signal delivery Jan Kiszka
@ 2009-03-08 13:41 ` Gilles Chanteperdrix
  2009-03-08 14:23   ` Jan Kiszka
  2009-03-09 15:50 ` Philippe Gerum
  1 sibling, 1 reply; 21+ messages in thread
From: Gilles Chanteperdrix @ 2009-03-08 13:41 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
> no way to force a shadow thread into secondary mode to handle pending
> Linux signals unless that thread issues a syscall once in a while. And
> that raises the question if we shouldn't improve this as well while we
> are on it.
> 
> Granted, non-broken Xenomai user space threads always issue frequent
> syscalls, otherwise the system would starve (and the watchdog would come
> around). On the other hand, delaying signals till syscall prologues is
> different from plain Linux behaviour...
> 
> Comments, ideas?

Philippe and I discussed having a way to force threads to relax, and we
both had patches to make this work. However, the issue we recently had
with the emulated iret on x86 makes me think that we cannot relax at
just any point in time: the code surrounding the relax has to be written
to allow a relax to occur.

-- 
					    Gilles.


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-08 13:41 ` Gilles Chanteperdrix
@ 2009-03-08 14:23   ` Jan Kiszka
  2009-03-08 14:43     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kiszka @ 2009-03-08 14:23 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai-core

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>> no way to force a shadow thread into secondary mode to handle pending
>> Linux signals unless that thread issues a syscall once in a while. And
>> that raises the question if we shouldn't improve this as well while we
>> are on it.
>>
>> Granted, non-broken Xenomai user space threads always issue frequent
>> syscalls, otherwise the system would starve (and the watchdog would come
>> around). On the other hand, delaying signals till syscall prologues is
>> different from plain Linux behaviour...
>>
>> Comments, ideas?
> 
> We discussed the issue of having a way to force threads to relax with
> Philippe, and we both had patches to make this work. However, the issue
> we recently had with the emulated iret on x86 makes me think that we can
> not relax at any point in time, the code surrounding the relax has to be
> made to allow a relax to occur.
> 

Those issues were fixed. If we have similar problems around
__ipipe_handle_irq (I would expect the relaxation to take place in
xnintr_*_handler), then they should be fixed as well.

The problem is that I currently do not see any other way of cleanly
terminating or debugging some Xenomai user space thread doing "while
(1);" (or any more complicated variation).

Jan


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-08 14:23   ` Jan Kiszka
@ 2009-03-08 14:43     ` Gilles Chanteperdrix
  2009-03-08 14:55       ` Jan Kiszka
  0 siblings, 1 reply; 21+ messages in thread
From: Gilles Chanteperdrix @ 2009-03-08 14:43 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>>> no way to force a shadow thread into secondary mode to handle pending
>>> Linux signals unless that thread issues a syscall once in a while. And
>>> that raises the question if we shouldn't improve this as well while we
>>> are on it.
>>>
>>> Granted, non-broken Xenomai user space threads always issue frequent
>>> syscalls, otherwise the system would starve (and the watchdog would come
>>> around). On the other hand, delaying signals till syscall prologues is
>>> different from plain Linux behaviour...
>>>
>>> Comments, ideas?
>> We discussed the issue of having a way to force threads to relax with
>> Philippe, and we both had patches to make this work. However, the issue
>> we recently had with the emulated iret on x86 makes me think that we can
>> not relax at any point in time, the code surrounding the relax has to be
>> made to allow a relax to occur.
>>
> 
> Those issues were fixed. If we have similar problems around
> __ipipe_handle_irq (I would expect the relaxation to take place in
> xnintr_*_handler), then they should be fixed as well.
> 
> The problem is that I currently do not see any other way of cleanly
> terminating or debugging some Xenomai user space thread doing "while
> (1);" (or any more complicated variation).

I am not really opposed to the "force relax upon signal" thing, but the
current approach used to work, at least with v2.3.x, so it must be my
recent rework of thread termination which broke things. Maybe we can
repair them?

-- 
					    Gilles.


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-08 14:43     ` Gilles Chanteperdrix
@ 2009-03-08 14:55       ` Jan Kiszka
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kiszka @ 2009-03-08 14:55 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai-core

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>>>> no way to force a shadow thread into secondary mode to handle pending
>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>> that raises the question if we shouldn't improve this as well while we
>>>> are on it.
>>>>
>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>> syscalls, otherwise the system would starve (and the watchdog would come
>>>> around). On the other hand, delaying signals till syscall prologues is
>>>> different from plain Linux behaviour...
>>>>
>>>> Comments, ideas?
>>> We discussed the issue of having a way to force threads to relax with
>>> Philippe, and we both had patches to make this work. However, the issue
>>> we recently had with the emulated iret on x86 makes me think that we can
>>> not relax at any point in time, the code surrounding the relax has to be
>>> made to allow a relax to occur.
>>>
>> Those issues were fixed. If we have similar problems around
>> __ipipe_handle_irq (I would expect the relaxation to take place in
>> xnintr_*_handler), then they should be fixed as well.
>>
>> The problem is that I currently do not see any other way of cleanly
>> terminating or debugging some Xenomai user space thread doing "while
>> (1);" (or any more complicated variation).
> 
> I am not really opposed to the "force relax upon signal", thing, but the
> current approach used to work, at least with v2.3.x, so it must be my
> recent rework of thread termination which broke things, maybe we can
> repair them?

That was my first impression as well, but I have a feeling that it was
only silently broken so far. IIRC, Xenomai shadow threads must not run
in unmapped state after deleting their nucleus counterpart (but that's
what we do when the watchdog fires!). Rather, proper termination for
shadow threads goes via relaxing and then do_exit.

Jan


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-08 10:42 [Xenomai-core] Watchdog / immediate Linux signal delivery Jan Kiszka
  2009-03-08 13:41 ` Gilles Chanteperdrix
@ 2009-03-09 15:50 ` Philippe Gerum
  2009-03-09 16:44   ` Jan Kiszka
  1 sibling, 1 reply; 21+ messages in thread
From: Philippe Gerum @ 2009-03-09 15:50 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Hi,
> 
> the watchdog is currently broken in trunk ("zombie [...] would not
> die..."). In fact, it should also be broken in older versions, but only
> recent thread termination rework made this visible.
> 
> When a Xenomai CPU hog is caught by the watchdog, xnpod_delete_thread is
> invoked, causing the current thread to be set in zombie state and
> scheduled out. But as its Linux mate still exist, hell breaks loose once
> Linux tries to get rid of it (the Xenomai zombie is scheduled in again).
> In short: calling xnpod_delete_thread(<self>) for a shadow thread is not
> working, probably never worked cleanly.

Nak, it is a regression introduced by the scheduler changes in 2.5.x. We should
detect _any_ shadow thread that schedules out in primary mode then regains
control in secondary mode, like we do in the 2.4.x series, not only _relaxing_
shadow threads. It is perfectly valid to have the Linux task orphaned by the
deletion of its shadow TCB until Xenomai notices the issue and reaps it; the
problem was that this regression prevented the nucleus from getting the memo.

The following patch should fix the issue:

Index: include/asm-generic/system.h
===================================================================
--- include/asm-generic/system.h	(revision 4676)
+++ include/asm-generic/system.h	(working copy)
@@ -311,6 +311,11 @@
 	return !!s;
 }
 
+static inline int xnarch_root_domain_p(void)
+{
+	return rthal_current_domain == rthal_root_domain;
+}
+
 #ifdef CONFIG_SMP
 
 #define xnlock_get(lock)		__xnlock_get(lock  XNLOCK_DBG_CONTEXT)
Index: ksrc/nucleus/pod.c
===================================================================
--- ksrc/nucleus/pod.c	(revision 4676)
+++ ksrc/nucleus/pod.c	(working copy)
@@ -2137,7 +2137,7 @@
 void __xnpod_schedule(struct xnsched *sched)
 {
 	struct xnthread *prev, *next, *curr = sched->curr;
-	int zombie, switched = 0, need_resched, relaxing;
+	int zombie, switched = 0, need_resched, shadow;
 	spl_t s;
 
 	if (xnarch_escalate())
@@ -2174,9 +2174,9 @@
 		   next, xnthread_name(next));
 
 #ifdef CONFIG_XENO_OPT_PERVASIVE
-	relaxing = xnthread_test_state(prev, XNRELAX);
+	shadow = xnthread_test_state(prev, XNSHADOW);
 #else
-	(void)relaxing;
+	(void)shadow;
 #endif /* CONFIG_XENO_OPT_PERVASIVE */
 
 	if (xnthread_test_state(next, XNROOT)) {
@@ -2204,12 +2204,18 @@
 
 #ifdef CONFIG_XENO_OPT_PERVASIVE
 	/*
-	 * Test whether we are relaxing a thread. In such a case, we
-	 * are here the epilogue of Linux' schedule, and should skip
-	 * xnpod_schedule epilogue.
+	 * Test whether we transitioned from primary mode to secondary
+	 * over a shadow thread. This may happen in two cases:
+	 *
+	 * 1) the shadow thread just relaxed.
+	 * 2) the shadow TCB has just been deleted, in which case
+	 * we have to reap the mated Linux side as well.
+	 *
+	 * In both cases, we are running over the epilogue of Linux's
+	 * schedule, and should skip our epilogue code.
 	 */
-	if (relaxing)
-		goto relax_epilogue;
+	if (shadow && xnarch_root_domain_p())
+		goto shadow_epilogue;
 #endif /* CONFIG_XENO_OPT_PERVASIVE */
 
 	switched = 1;
@@ -2252,7 +2258,7 @@
 	return;
 
 #ifdef CONFIG_XENO_OPT_PERVASIVE
-      relax_epilogue:
+      shadow_epilogue:
 	{
 		spl_t ignored;
 
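The change in the test that selects the special epilogue can be shown
with a toy model. This is only a sketch: struct prev_thread and both
helper functions are hypothetical stand-ins, not nucleus code, and the
two flags merely model XNSHADOW and XNRELAX.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state examined in __xnpod_schedule(). */
struct prev_thread {
	bool shadow;	/* models XNSHADOW: thread is mated to a Linux task */
	bool relaxing;	/* models XNRELAX: thread is moving to secondary mode */
};

/* Pre-patch test: only a relaxing thread takes the special epilogue. */
bool old_takes_epilogue(const struct prev_thread *prev)
{
	return prev->relaxing;
}

/*
 * Post-patch test: any shadow thread that regains control over the root
 * (Linux) domain takes it, including one whose TCB was just deleted.
 */
bool new_takes_epilogue(const struct prev_thread *prev, bool in_root_domain)
{
	assert(!prev->relaxing || prev->shadow);	/* only shadows relax */
	return prev->shadow && in_root_domain;
}
```

In this model, a shadow whose TCB was deleted by the watchdog has
shadow set and relaxing clear: the old test misses it, while the new
test catches it as soon as it resumes over the root domain.
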
> 
> There are basically two approaches to fix it: The first one is to find a
> different way to kill (or only suspend?)

Suspending the hog won't work, particularly when GDB is involved, because a 
pending non-lethal Linux signal may cause the suspended shadow to resume 
immediately for processing the signal, therefore defeating the purpose of the 
watchdog, leading to an infinite loop. This is why we moved from suspension to 
deletion upon watchdog trigger in 2.3 (2.2 used to suspend only).

> the current shadow thread when
> the watchdog strikes. The second one brought me to another issue: Raise
> SIGKILL for the current thread and make sure that it can be processed by
> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
> no way to force a shadow thread into secondary mode to handle pending
> Linux signals unless that thread issues a syscall once in a while. And
> that raises the question if we shouldn't improve this as well while we
> are on it.
> 
> Granted, non-broken Xenomai user space threads always issue frequent
> syscalls, otherwise the system would starve (and the watchdog would come
> around). On the other hand, delaying signals till syscall prologues is
> different from plain Linux behaviour...
> 
> Comments, ideas?
> 

We probably need a two-stage approach: first record that the thread was bumped
out and suspend it from the watchdog handler to give Linux a chance to run
again, then finish the work, killing it for good, the next time the root thread
is scheduled in on the same CPU.
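
The two stages above can be sketched as a small state machine. This is
only an illustration under assumed names: struct hog, watchdog_fire and
root_scheduled_in are hypothetical and do not exist in the nucleus.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU hog as seen by the watchdog. */
enum hog_state { HOG_RUNNING, HOG_SUSPENDED, HOG_KILLED };

struct hog {
	enum hog_state state;
	bool bumped;	/* set by stage 1, consumed by stage 2 */
};

/* Stage 1, watchdog handler: suspend the hog so Linux can run again. */
void watchdog_fire(struct hog *t)
{
	if (t->state != HOG_RUNNING)
		return;
	t->bumped = true;
	t->state = HOG_SUSPENDED;
}

/* Stage 2, root thread scheduled in on the same CPU: finish the job. */
void root_scheduled_in(struct hog *t)
{
	if (!t->bumped)
		return;
	assert(t->state == HOG_SUSPENDED);
	t->bumped = false;
	t->state = HOG_KILLED;
}
```

Calling root_scheduled_in() on a thread that was never bumped is a
no-op, so ordinary switches to the root thread stay unaffected in this
model.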

> Jan


-- 
Philippe.


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 15:50 ` Philippe Gerum
@ 2009-03-09 16:44   ` Jan Kiszka
  2009-03-09 17:02     ` Gilles Chanteperdrix
  2009-03-09 17:09     ` Philippe Gerum
  0 siblings, 2 replies; 21+ messages in thread
From: Jan Kiszka @ 2009-03-09 16:44 UTC (permalink / raw)
  To: rpm; +Cc: xenomai-core

Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Hi,
>>
>> the watchdog is currently broken in trunk ("zombie [...] would not
>> die..."). In fact, it should also be broken in older versions, but only
>> recent thread termination rework made this visible.
>>
>> When a Xenomai CPU hog is caught by the watchdog, xnpod_delete_thread is
>> invoked, causing the current thread to be set in zombie state and
>> scheduled out. But as its Linux mate still exist, hell breaks loose once
>> Linux tries to get rid of it (the Xenomai zombie is scheduled in again).
>> In short: calling xnpod_delete_thread(<self>) for a shadow thread is not
>> working, probably never worked cleanly.
> 
> Nak, it is a regression introduced by the scheduler changes in 2.5.x. We should 
> detect _any_ shadow thread that schedules out in primary mode then regains 
> control in secondary mode like we do in the 2.4.x series, not only _relaxing_ 
> shadow threads. It is perfectly valid to have the Linux task orphaned from the 
> deletion of its shadow TCB until Xenomai notices the issue and reaps it; problem 
> was that such regression prevented the nucleus to get the memo.
> 
> The following patch should fix the issue:
> 
>   Index: include/asm-generic/system.h
> ===================================================================
> --- include/asm-generic/system.h	(revision 4676)
> +++ include/asm-generic/system.h	(working copy)
> @@ -311,6 +311,11 @@
>   	return !!s;
>   }
> 
> +static inline int xnarch_root_domain_p(void)
> +{
> +	return rthal_current_domain == rthal_root_domain;
> +}
> +
>   #ifdef CONFIG_SMP
> 
>   #define xnlock_get(lock)		__xnlock_get(lock  XNLOCK_DBG_CONTEXT)
> Index: ksrc/nucleus/pod.c
> ===================================================================
> --- ksrc/nucleus/pod.c	(revision 4676)
> +++ ksrc/nucleus/pod.c	(working copy)
> @@ -2137,7 +2137,7 @@
>   void __xnpod_schedule(struct xnsched *sched)
>   {
>   	struct xnthread *prev, *next, *curr = sched->curr;
> -	int zombie, switched = 0, need_resched, relaxing;
> +	int zombie, switched = 0, need_resched, shadow;
>   	spl_t s;
> 
>   	if (xnarch_escalate())
> @@ -2174,9 +2174,9 @@
>   		   next, xnthread_name(next));
> 
>   #ifdef CONFIG_XENO_OPT_PERVASIVE
> -	relaxing = xnthread_test_state(prev, XNRELAX);
> +	shadow = xnthread_test_state(prev, XNSHADOW);
>   #else
> -	(void)relaxing;
> +	(void)shadow;
>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
> 
>   	if (xnthread_test_state(next, XNROOT)) {
> @@ -2204,12 +2204,18 @@
> 
>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>   	/*
> -	 * Test whether we are relaxing a thread. In such a case, we
> -	 * are here the epilogue of Linux' schedule, and should skip
> -	 * xnpod_schedule epilogue.
> +	 * Test whether we transitioned from primary mode to secondary
> +	 * over a shadow thread. This may happen in two cases:
> +	 *
> +	 * 1) the shadow thread just relaxed.
> +	 * 2) the shadow TCB has just been deleted, in which case
> +	 * we have to reap the mated Linux side as well.
> +	 *
> +	 * In both cases, we are running over the epilogue of Linux's
> +	 * schedule, and should skip our epilogue code.
>   	 */
> -	if (relaxing)
> -		goto relax_epilogue;
> +	if (shadow && xnarch_root_domain_p())
> +		goto shadow_epilogue;
>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
> 
>   	switched = 1;
> @@ -2252,7 +2258,7 @@
>   	return;
> 
>   #ifdef CONFIG_XENO_OPT_PERVASIVE
> -      relax_epilogue:
> +      shadow_epilogue:
>   	{
>   		spl_t ignored;

Finally makes sense and works (but your posting was corrupted). Great.

> 
>> There are basically two approaches to fix it: The first one is to find a
>> different way to kill (or only suspend?)
> 
> Suspending the hog won't work, particularly when GDB is involved, because a 
> pending non-lethal Linux signal may cause the suspended shadow to resume 
> immediately for processing the signal, therefore defeating the purpose of the 
> watchdog, leading to an infinite loop. This is why we moved from suspension to 
> deletion upon watchdog trigger in 2.3 (2.2 used to suspend only).

Yes, that became clear to me in the meantime, too.

> 
>   the current shadow thread when
>> the watchdog strikes. The second one brought me to another issue: Raise
>> SIGKILL for the current thread and make sure that it can be processed by
>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>> no way to force a shadow thread into secondary mode to handle pending
>> Linux signals unless that thread issues a syscall once in a while. And
>> that raises the question if we shouldn't improve this as well while we
>> are on it.
>>
>> Granted, non-broken Xenomai user space threads always issue frequent
>> syscalls, otherwise the system would starve (and the watchdog would come
>> around). On the other hand, delaying signals till syscall prologues is
>> different from plain Linux behaviour...
>>
>> Comments, ideas?
>>
> 
> We probably need a two-stage approach: first record the thread was bumped out 
> and suspend it from the watchdog handler to give Linux a chance to run again, 
> then finish the work, killing it for good, next time the root thread is 
> scheduled in on the same CPU.

That confuses me again: The watchdog issue is solved now, no? We are
only left with the scenario of breaking out of a user space loop of some
Xenomai thread via a Linux signal (which implies SMP - otherwise there
is no chance to raise the signal...).

Meanwhile I played with a lightweight approach to relax a thread
that received a signal (according to do_sigwake_event). It worked, but only
once, due to a limitation (if not a bug) of I-pipe x86: __ipipe_run_isr
does not handle the case where a non-root handler alters the
current domain, which corrupts the IPIPE_SYNC_FLAG states of the
involved domains. I will try to fix this and post my signaling proposal so
that this work is not lost.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 16:44   ` Jan Kiszka
@ 2009-03-09 17:02     ` Gilles Chanteperdrix
  2009-03-09 17:15       ` Jan Kiszka
  2009-03-09 17:27       ` Philippe Gerum
  2009-03-09 17:09     ` Philippe Gerum
  1 sibling, 2 replies; 21+ messages in thread
From: Gilles Chanteperdrix @ 2009-03-09 17:02 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> the watchdog strikes. The second one brought me to another issue: Raise
>>> SIGKILL for the current thread and make sure that it can be processed by
>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>>> no way to force a shadow thread into secondary mode to handle pending
>>> Linux signals unless that thread issues a syscall once in a while. And
>>> that raises the question if we shouldn't improve this as well while we
>>> are on it.
>>>
>>> Granted, non-broken Xenomai user space threads always issue frequent
>>> syscalls, otherwise the system would starve (and the watchdog would come
>>> around). On the other hand, delaying signals till syscall prologues is
>>> different from plain Linux behaviour...
>>>
>>> Comments, ideas?
>>>
>> We probably need a two-stage approach: first record the thread was bumped out 
>> and suspend it from the watchdog handler to give Linux a chance to run again, 
>> then finish the work, killing it for good, next time the root thread is 
>> scheduled in on the same CPU.
> 
> That confuses me again: The watchdog issue is solved now, no? We are
> only left with the scenario of breaking out of a user space loop of some
> Xenomai thread via a Linux signal (which implies SMP - otherwise there
> is no chance to raise the signal...).
> 
> Meanwhile I played with some light-weight approach to relax a thread
> that received a signal (according to do_sigwake_event). Worked, but only
> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
> it does not handle the case that a non-root handler may alter the
> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
> involved domains. Will try to fix this and post my signaling proposal so
> that this work is not lost.

If we go that way, I would vote for SIGSEGV instead of SIGKILL.
This would allow installing a handler to dump the backtrace, or even let
gdb stop at the point of the infinite loop, and a SIGSEGV handler
is not expected to recover (well, except when implementing
COW in user space, but that does not fit well with real-time threads).

-- 
                                                 Gilles.


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 16:44   ` Jan Kiszka
  2009-03-09 17:02     ` Gilles Chanteperdrix
@ 2009-03-09 17:09     ` Philippe Gerum
  2009-03-09 17:12       ` Jan Kiszka
  2009-03-09 17:27       ` Jan Kiszka
  1 sibling, 2 replies; 21+ messages in thread
From: Philippe Gerum @ 2009-03-09 17:09 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Hi,
>>>
>>> the watchdog is currently broken in trunk ("zombie [...] would not
>>> die..."). In fact, it should also be broken in older versions, but only
>>> recent thread termination rework made this visible.
>>>
>>> When a Xenomai CPU hog is caught by the watchdog, xnpod_delete_thread is
>>> invoked, causing the current thread to be set in zombie state and
>>> scheduled out. But as its Linux mate still exist, hell breaks loose once
>>> Linux tries to get rid of it (the Xenomai zombie is scheduled in again).
>>> In short: calling xnpod_delete_thread(<self>) for a shadow thread is not
>>> working, probably never worked cleanly.
>> Nak, it is a regression introduced by the scheduler changes in 2.5.x. We should 
>> detect _any_ shadow thread that schedules out in primary mode then regains 
>> control in secondary mode like we do in the 2.4.x series, not only _relaxing_ 
>> shadow threads. It is perfectly valid to have the Linux task orphaned from the 
>> deletion of its shadow TCB until Xenomai notices the issue and reaps it; problem 
>> was that such regression prevented the nucleus to get the memo.
>>
>> The following patch should fix the issue:
>>
>>   Index: include/asm-generic/system.h
>> ===================================================================
>> --- include/asm-generic/system.h	(revision 4676)
>> +++ include/asm-generic/system.h	(working copy)
>> @@ -311,6 +311,11 @@
>>   	return !!s;
>>   }
>>
>> +static inline int xnarch_root_domain_p(void)
>> +{
>> +	return rthal_current_domain == rthal_root_domain;
>> +}
>> +
>>   #ifdef CONFIG_SMP
>>
>>   #define xnlock_get(lock)		__xnlock_get(lock  XNLOCK_DBG_CONTEXT)
>> Index: ksrc/nucleus/pod.c
>> ===================================================================
>> --- ksrc/nucleus/pod.c	(revision 4676)
>> +++ ksrc/nucleus/pod.c	(working copy)
>> @@ -2137,7 +2137,7 @@
>>   void __xnpod_schedule(struct xnsched *sched)
>>   {
>>   	struct xnthread *prev, *next, *curr = sched->curr;
>> -	int zombie, switched = 0, need_resched, relaxing;
>> +	int zombie, switched = 0, need_resched, shadow;
>>   	spl_t s;
>>
>>   	if (xnarch_escalate())
>> @@ -2174,9 +2174,9 @@
>>   		   next, xnthread_name(next));
>>
>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>> -	relaxing = xnthread_test_state(prev, XNRELAX);
>> +	shadow = xnthread_test_state(prev, XNSHADOW);
>>   #else
>> -	(void)relaxing;
>> +	(void)shadow;
>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>
>>   	if (xnthread_test_state(next, XNROOT)) {
>> @@ -2204,12 +2204,18 @@
>>
>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>   	/*
>> -	 * Test whether we are relaxing a thread. In such a case, we
>> -	 * are here the epilogue of Linux' schedule, and should skip
>> -	 * xnpod_schedule epilogue.
>> +	 * Test whether we transitioned from primary mode to secondary
>> +	 * over a shadow thread. This may happen in two cases:
>> +	 *
>> +	 * 1) the shadow thread just relaxed.
>> +	 * 2) the shadow TCB has just been deleted, in which case
>> +	 * we have to reap the mated Linux side as well.
>> +	 *
>> +	 * In both cases, we are running over the epilogue of Linux's
>> +	 * schedule, and should skip our epilogue code.
>>   	 */
>> -	if (relaxing)
>> -		goto relax_epilogue;
>> +	if (shadow && xnarch_root_domain_p())
>> +		goto shadow_epilogue;
>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>
>>   	switched = 1;
>> @@ -2252,7 +2258,7 @@
>>   	return;
>>
>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>> -      relax_epilogue:
>> +      shadow_epilogue:
>>   	{
>>   		spl_t ignored;
> 
> Finally makes sense and works (but your posting was corrupted). Great.
> 
>>> There are basically two approaches to fix it: The first one is to find a
>>> different way to kill (or only suspend?)
>> Suspending the hog won't work, particularly when GDB is involved, because a 
>> pending non-lethal Linux signal may cause the suspended shadow to resume 
>> immediately for processing the signal, therefore defeating the purpose of the 
>> watchdog, leading to an infinite loop. This is why we moved from suspension to 
>> deletion upon watchdog trigger in 2.3 (2.2 used to suspend only).
> 
> Yes, that became clear to me in the meantime, too.
> 
>>   the current shadow thread when
>>> the watchdog strikes. The second one brought me to another issue: Raise
>>> SIGKILL for the current thread and make sure that it can be processed by
>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>>> no way to force a shadow thread into secondary mode to handle pending
>>> Linux signals unless that thread issues a syscall once in a while. And
>>> that raises the question if we shouldn't improve this as well while we
>>> are on it.
>>>
>>> Granted, non-broken Xenomai user space threads always issue frequent
>>> syscalls, otherwise the system would starve (and the watchdog would come
>>> around). On the other hand, delaying signals till syscall prologues is
>>> different from plain Linux behaviour...
>>>
>>> Comments, ideas?
>>>
>> We probably need a two-stage approach: first record the thread was bumped out 
>> and suspend it from the watchdog handler to give Linux a chance to run again, 
>> then finish the work, killing it for good, next time the root thread is 
>> scheduled in on the same CPU.
> 
> That confuses me again: The watchdog issue is solved now, no? We are
> only left with the scenario of breaking out of a user space loop of some
> Xenomai thread via a Linux signal (which implies SMP - otherwise there
> is no chance to raise the signal...).
>

If you first suspend the hog, then send it a lethal signal, you solve both
issues: first, Linux is allowed to run eventually; second, your task won't be
able to resume running the faulty code, only to process SIGKILL, which can be
made pending early enough because the nucleus decides when Linux resumes.

> Meanwhile I played with some light-weight approach to relax a thread
> that received a signal (according to do_sigwake_event). Worked, but only
> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
> it does not handle the case that a non-root handler may alter the
> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
> involved domains.

It is not a bug, this is wanted. An ISR must neither change the current domain
nor migrate CPUs; allowing this would open Pandora's box.

> Will try to fix this and post my signaling proposal so
> that this work is not lost.
> 
> Jan
> 


-- 
Philippe.


* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:09     ` Philippe Gerum
@ 2009-03-09 17:12       ` Jan Kiszka
  2009-03-09 17:37         ` Philippe Gerum
  2009-03-09 17:27       ` Jan Kiszka
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Kiszka @ 2009-03-09 17:12 UTC (permalink / raw)
  To: rpm; +Cc: xenomai-core

Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> Hi,
>>>>
>>>> the watchdog is currently broken in trunk ("zombie [...] would not
>>>> die..."). In fact, it should also be broken in older versions, but only
>>>> recent thread termination rework made this visible.
>>>>
>>>> When a Xenomai CPU hog is caught by the watchdog,
>>>> xnpod_delete_thread is
>>>> invoked, causing the current thread to be set in zombie state and
>>>> scheduled out. But as its Linux mate still exist, hell breaks loose
>>>> once
>>>> Linux tries to get rid of it (the Xenomai zombie is scheduled in
>>>> again).
>>>> In short: calling xnpod_delete_thread(<self>) for a shadow thread is
>>>> not
>>>> working, probably never worked cleanly.
>>> Nak, it is a regression introduced by the scheduler changes in 2.5.x.
>>> We should detect _any_ shadow thread that schedules out in primary
>>> mode then regains control in secondary mode like we do in the 2.4.x
>>> series, not only _relaxing_ shadow threads. It is perfectly valid to
>>> have the Linux task orphaned from the deletion of its shadow TCB
>>> until Xenomai notices the issue and reaps it; problem was that such
>>> regression prevented the nucleus to get the memo.
>>>
>>> The following patch should fix the issue:
>>>
>>>   Index: include/asm-generic/system.h
>>> ===================================================================
>>> --- include/asm-generic/system.h    (revision 4676)
>>> +++ include/asm-generic/system.h    (working copy)
>>> @@ -311,6 +311,11 @@
>>>       return !!s;
>>>   }
>>>
>>> +static inline int xnarch_root_domain_p(void)
>>> +{
>>> +    return rthal_current_domain == rthal_root_domain;
>>> +}
>>> +
>>>   #ifdef CONFIG_SMP
>>>
>>>   #define xnlock_get(lock)        __xnlock_get(lock  XNLOCK_DBG_CONTEXT)
>>> Index: ksrc/nucleus/pod.c
>>> ===================================================================
>>> --- ksrc/nucleus/pod.c    (revision 4676)
>>> +++ ksrc/nucleus/pod.c    (working copy)
>>> @@ -2137,7 +2137,7 @@
>>>   void __xnpod_schedule(struct xnsched *sched)
>>>   {
>>>       struct xnthread *prev, *next, *curr = sched->curr;
>>> -    int zombie, switched = 0, need_resched, relaxing;
>>> +    int zombie, switched = 0, need_resched, shadow;
>>>       spl_t s;
>>>
>>>       if (xnarch_escalate())
>>> @@ -2174,9 +2174,9 @@
>>>              next, xnthread_name(next));
>>>
>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>> -    relaxing = xnthread_test_state(prev, XNRELAX);
>>> +    shadow = xnthread_test_state(prev, XNSHADOW);
>>>   #else
>>> -    (void)relaxing;
>>> +    (void)shadow;
>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>
>>>       if (xnthread_test_state(next, XNROOT)) {
>>> @@ -2204,12 +2204,18 @@
>>>
>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>       /*
>>> -     * Test whether we are relaxing a thread. In such a case, we
>>> -     * are here the epilogue of Linux' schedule, and should skip
>>> -     * xnpod_schedule epilogue.
>>> +     * Test whether we transitioned from primary mode to secondary
>>> +     * over a shadow thread. This may happen in two cases:
>>> +     *
>>> +     * 1) the shadow thread just relaxed.
>>> +     * 2) the shadow TCB has just been deleted, in which case
>>> +     * we have to reap the mated Linux side as well.
>>> +     *
>>> +     * In both cases, we are running over the epilogue of Linux's
>>> +     * schedule, and should skip our epilogue code.
>>>        */
>>> -    if (relaxing)
>>> -        goto relax_epilogue;
>>> +    if (shadow && xnarch_root_domain_p())
>>> +        goto shadow_epilogue;
>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>
>>>       switched = 1;
>>> @@ -2252,7 +2258,7 @@
>>>       return;
>>>
>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>> -      relax_epilogue:
>>> +      shadow_epilogue:
>>>       {
>>>           spl_t ignored;
>>
>> Finally makes sense and works (but your posting was corrupted). Great.
>>
>>>> There are basically two approaches to fix it: The first one is to
>>>> find a
>>>> different way to kill (or only suspend?)
>>> Suspending the hog won't work, particularly when GDB is involved,
>>> because a pending non-lethal Linux signal may cause the suspended
>>> shadow to resume immediately for processing the signal, therefore
>>> defeating the purpose of the watchdog, leading to an infinite loop.
>>> This is why we moved from suspension to deletion upon watchdog
>>> trigger in 2.3 (2.2 used to suspend only).
>>
>> Yes, that became clear to me in the meantime, too.
>>
>>>> the current shadow thread when
>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>> SIGKILL for the current thread and make sure that it can be
>>>> processed by
>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately,
>>>> there is
>>>> no way to force a shadow thread into secondary mode to handle pending
>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>> that raises the question if we shouldn't improve this as well while we
>>>> are on it.
>>>>
>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>> syscalls, otherwise the system would starve (and the watchdog would
>>>> come
>>>> around). On the other hand, delaying signals till syscall prologues is
>>>> different from plain Linux behaviour...
>>>>
>>>> Comments, ideas?
>>>>
>>> We probably need a two-stage approach: first record the thread was
>>> bumped out and suspend it from the watchdog handler to give Linux a
>>> chance to run again, then finish the work, killing it for good, next
>>> time the root thread is scheduled in on the same CPU.
>>
>> That confuses me again: The watchdog issue is solved now, no? We are
>> only left with the scenario of breaking out of a user space loop of some
>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>> is no chance to raise the signal...).
>>
> 
> If you first suspend the hog, then send it a lethal signal, you solve
> both issues: first Linux is allowed to run eventually, then your task
> won't be able to resume running the faulty code, but solely to process
> SIGKILL, which can be made pending early enough because the nucleus
> decides when Linux resumes.

I'm not interested in SIGKILL here, rather in SIGSTOP to do debugging.
That is currently impossible.

> 
>> Meanwhile I played with some light-weight approach to relax a thread
>> that received a signal (according to do_sigwake_event). Worked, but only
>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>> it does not handle the case that a non-root handler may alter the
>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>> involved domains.
> 
> It is not a bug, this is wanted. ISR must neither change the current
> domain nor migrate CPU; allowing this would open Pandora's box.

OK, then please elaborate on this a bit more in the adeos-main thread
and explain why __ipipe_sync_stage currently reloads the domain.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:02     ` Gilles Chanteperdrix
@ 2009-03-09 17:15       ` Jan Kiszka
  2009-03-09 17:27       ` Philippe Gerum
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Kiszka @ 2009-03-09 17:15 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai-core

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>> SIGKILL for the current thread and make sure that it can be processed by
>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>>>> no way to force a shadow thread into secondary mode to handle pending
>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>> that raises the question if we shouldn't improve this as well while we
>>>> are on it.
>>>>
>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>> syscalls, otherwise the system would starve (and the watchdog would come
>>>> around). On the other hand, delaying signals till syscall prologues is
>>>> different from plain Linux behaviour...
>>>>
>>>> Comments, ideas?
>>>>
>>> We probably need a two-stage approach: first record the thread was bumped out 
>>> and suspend it from the watchdog handler to give Linux a chance to run again, 
>>> then finish the work, killing it for good, next time the root thread is 
>>> scheduled in on the same CPU.
>> That confuses me again: The watchdog issue is solved now, no? We are
>> only left with the scenario of breaking out of a user space loop of some
>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>> is no chance to raise the signal...).
>>
>> Meanwhile I played with some light-weight approach to relax a thread
>> that received a signal (according to do_sigwake_event). Worked, but only
>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>> it does not handle the case that a non-root handler may alter the
>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>> involved domains. Will try to fix this and post my signaling proposal so
>> that this work is not lost.
> 
> If we go that way, I would vote for a SIGSEGV instead of the SIGKILL.
> This would allow to install a handler to dump the backtrace, or even gdb
> to be stopped at the point of the infinite loop, and a SIGSEGV handler
> is not expected to recover (well, except in cases of implementation of
> COW in user-space, but that does not fit well with real-time threads).

Yeah, I also thought about such a mechanism to allow gdb to catch the
problem. But as a first step, I do not plan to convert the watchdog kill
mechanism.

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:09     ` Philippe Gerum
  2009-03-09 17:12       ` Jan Kiszka
@ 2009-03-09 17:27       ` Jan Kiszka
  2009-03-09 17:38         ` Philippe Gerum
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Kiszka @ 2009-03-09 17:27 UTC (permalink / raw)
  To: rpm; +Cc: xenomai-core

Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Meanwhile I played with some light-weight approach to relax a thread
>> that received a signal (according to do_sigwake_event). Worked, but only
>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>> it does not handle the case that a non-root handler may alter the
>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>> involved domains.
> 
> It is not a bug, this is wanted. ISR must neither change the current
> domain nor migrate CPU; allowing this would open Pandora's box.

And if there is no way to migrate from within an ISR, we can bury any
attempt to deliver signals to spinning Xenomai threads - or what other
context would remain to Xenomai for triggering migration?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:02     ` Gilles Chanteperdrix
  2009-03-09 17:15       ` Jan Kiszka
@ 2009-03-09 17:27       ` Philippe Gerum
  1 sibling, 0 replies; 21+ messages in thread
From: Philippe Gerum @ 2009-03-09 17:27 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Jan Kiszka, xenomai-core

Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>> SIGKILL for the current thread and make sure that it can be processed by
>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>>>> no way to force a shadow thread into secondary mode to handle pending
>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>> that raises the question if we shouldn't improve this as well while we
>>>> are on it.
>>>>
>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>> syscalls, otherwise the system would starve (and the watchdog would come
>>>> around). On the other hand, delaying signals till syscall prologues is
>>>> different from plain Linux behaviour...
>>>>
>>>> Comments, ideas?
>>>>
>>> We probably need a two-stage approach: first record the thread was bumped out 
>>> and suspend it from the watchdog handler to give Linux a chance to run again, 
>>> then finish the work, killing it for good, next time the root thread is 
>>> scheduled in on the same CPU.
>> That confuses me again: The watchdog issue is solved now, no? We are
>> only left with the scenario of breaking out of a user space loop of some
>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>> is no chance to raise the signal...).
>>
>> Meanwhile I played with some light-weight approach to relax a thread
>> that received a signal (according to do_sigwake_event). Worked, but only
>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>> it does not handle the case that a non-root handler may alter the
>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>> involved domains. Will try to fix this and post my signaling proposal so
>> that this work is not lost.
> 
> If we go that way, I would vote for a SIGSEGV instead of the SIGKILL.
> This would allow to install a handler to dump the backtrace, or even gdb
> to be stopped at the point of the infinite loop, and a SIGSEGV handler
> is not expected to recover (well, except in cases of implementation of
> COW in user-space, but that does not fit well with real-time threads).
> 

Makes sense.

-- 
Philippe.



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:12       ` Jan Kiszka
@ 2009-03-09 17:37         ` Philippe Gerum
  2009-03-09 23:49           ` Jan Kiszka
  0 siblings, 1 reply; 21+ messages in thread
From: Philippe Gerum @ 2009-03-09 17:37 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
>>>> Jan Kiszka wrote:
>>>>> Hi,
>>>>>
>>>>> the watchdog is currently broken in trunk ("zombie [...] would not
>>>>> die..."). In fact, it should also be broken in older versions, but only
>>>>> recent thread termination rework made this visible.
>>>>>
>>>>> When a Xenomai CPU hog is caught by the watchdog,
>>>>> xnpod_delete_thread is
>>>>> invoked, causing the current thread to be set in zombie state and
>>>>> scheduled out. But as its Linux mate still exist, hell breaks loose
>>>>> once
>>>>> Linux tries to get rid of it (the Xenomai zombie is scheduled in
>>>>> again).
>>>>> In short: calling xnpod_delete_thread(<self>) for a shadow thread is
>>>>> not
>>>>> working, probably never worked cleanly.
>>>> Nak, it is a regression introduced by the scheduler changes in 2.5.x.
>>>> We should detect _any_ shadow thread that schedules out in primary
>>>> mode then regains control in secondary mode like we do in the 2.4.x
>>>> series, not only _relaxing_ shadow threads. It is perfectly valid to
>>>> have the Linux task orphaned from the deletion of its shadow TCB
>>>> until Xenomai notices the issue and reaps it; problem was that such
>>>> regression prevented the nucleus to get the memo.
>>>>
>>>> The following patch should fix the issue:
>>>>
>>>>   Index: include/asm-generic/system.h
>>>> ===================================================================
>>>> --- include/asm-generic/system.h    (revision 4676)
>>>> +++ include/asm-generic/system.h    (working copy)
>>>> @@ -311,6 +311,11 @@
>>>>       return !!s;
>>>>   }
>>>>
>>>> +static inline int xnarch_root_domain_p(void)
>>>> +{
>>>> +    return rthal_current_domain == rthal_root_domain;
>>>> +}
>>>> +
>>>>   #ifdef CONFIG_SMP
>>>>
>>>>   #define xnlock_get(lock)        __xnlock_get(lock  XNLOCK_DBG_CONTEXT)
>>>> Index: ksrc/nucleus/pod.c
>>>> ===================================================================
>>>> --- ksrc/nucleus/pod.c    (revision 4676)
>>>> +++ ksrc/nucleus/pod.c    (working copy)
>>>> @@ -2137,7 +2137,7 @@
>>>>   void __xnpod_schedule(struct xnsched *sched)
>>>>   {
>>>>       struct xnthread *prev, *next, *curr = sched->curr;
>>>> -    int zombie, switched = 0, need_resched, relaxing;
>>>> +    int zombie, switched = 0, need_resched, shadow;
>>>>       spl_t s;
>>>>
>>>>       if (xnarch_escalate())
>>>> @@ -2174,9 +2174,9 @@
>>>>              next, xnthread_name(next));
>>>>
>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>> -    relaxing = xnthread_test_state(prev, XNRELAX);
>>>> +    shadow = xnthread_test_state(prev, XNSHADOW);
>>>>   #else
>>>> -    (void)relaxing;
>>>> +    (void)shadow;
>>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>>
>>>>       if (xnthread_test_state(next, XNROOT)) {
>>>> @@ -2204,12 +2204,18 @@
>>>>
>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>       /*
>>>> -     * Test whether we are relaxing a thread. In such a case, we
>>>> -     * are here the epilogue of Linux' schedule, and should skip
>>>> -     * xnpod_schedule epilogue.
>>>> +     * Test whether we transitioned from primary mode to secondary
>>>> +     * over a shadow thread. This may happen in two cases:
>>>> +     *
>>>> +     * 1) the shadow thread just relaxed.
>>>> +     * 2) the shadow TCB has just been deleted, in which case
>>>> +     * we have to reap the mated Linux side as well.
>>>> +     *
>>>> +     * In both cases, we are running over the epilogue of Linux's
>>>> +     * schedule, and should skip our epilogue code.
>>>>        */
>>>> -    if (relaxing)
>>>> -        goto relax_epilogue;
>>>> +    if (shadow && xnarch_root_domain_p())
>>>> +        goto shadow_epilogue;
>>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>>
>>>>       switched = 1;
>>>> @@ -2252,7 +2258,7 @@
>>>>       return;
>>>>
>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>> -      relax_epilogue:
>>>> +      shadow_epilogue:
>>>>       {
>>>>           spl_t ignored;
>>> Finally makes sense and works (but your posting was corrupted). Great.
>>>
>>>>> There are basically two approaches to fix it: The first one is to
>>>>> find a
>>>>> different way to kill (or only suspend?)
>>>> Suspending the hog won't work, particularly when GDB is involved,
>>>> because a pending non-lethal Linux signal may cause the suspended
>>>> shadow to resume immediately for processing the signal, therefore
>>>> defeating the purpose of the watchdog, leading to an infinite loop.
>>>> This is why we moved from suspension to deletion upon watchdog
>>>> trigger in 2.3 (2.2 used to suspend only).
>>> Yes, that became clear to me in the meantime, too.
>>>
>>>>> the current shadow thread when
>>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>>> SIGKILL for the current thread and make sure that it can be
>>>>> processed by
>>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately,
>>>>> there is
>>>>> no way to force a shadow thread into secondary mode to handle pending
>>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>>> that raises the question if we shouldn't improve this as well while we
>>>>> are on it.
>>>>>
>>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>>> syscalls, otherwise the system would starve (and the watchdog would
>>>>> come
>>>>> around). On the other hand, delaying signals till syscall prologues is
>>>>> different from plain Linux behaviour...
>>>>>
>>>>> Comments, ideas?
>>>>>
>>>> We probably need a two-stage approach: first record the thread was
>>>> bumped out and suspend it from the watchdog handler to give Linux a
>>>> chance to run again, then finish the work, killing it for good, next
>>>> time the root thread is scheduled in on the same CPU.
>>> That confuses me again: The watchdog issue is solved now, no? We are
>>> only left with the scenario of breaking out of a user space loop of some
>>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>>> is no chance to raise the signal...).
>>>
>> If you first suspend the hog, then send it a lethal signal, you solve
>> both issues: first Linux is allowed to run eventually, then your task
>> won't be able to resume running the faulty code, but solely to process
>> SIGKILL, which can be made pending early enough because the nucleus
>> decides when Linux resumes.
> 
> I'm not interested in SIGKILL here, rather in SIGSTOP to do debugging.
> That is currently impossible.
> 
>>> Meanwhile I played with some light-weight approach to relax a thread
>>> that received a signal (according to do_sigwake_event). Worked, but only
>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>> it does not handle the case that a non-root handler may alter the
>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>> involved domains.
>> It is not a bug, this is wanted. ISR must neither change the current
>> domain nor migrate CPU; allowing this would open Pandora's box.
> 
> OK, then please elaborate on this a bit more in the adeos-main thread
> and explain why __ipipe_sync_stage currently reloads the domain.
>

ipipe_cpudom_ptr() may be affected by CPU migration within the _root_ domain, 
which does not mean that non-root domains are allowed to migrate and/or change 
domains.

> Jan
> 


-- 
Philippe.



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:27       ` Jan Kiszka
@ 2009-03-09 17:38         ` Philippe Gerum
  2009-03-09 17:58           ` Gilles Chanteperdrix
  2009-03-09 23:46           ` Jan Kiszka
  0 siblings, 2 replies; 21+ messages in thread
From: Philippe Gerum @ 2009-03-09 17:38 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Meanwhile I played with some light-weight approach to relax a thread
>>> that received a signal (according to do_sigwake_event). Worked, but only
>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>> it does not handle the case that a non-root handler may alter the
>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>> involved domains.
>> It is not a bug, this is wanted. ISR must neither change the current
>> domain nor migrate CPU; allowing this would open Pandora's box.
> 
> And if there is no way to migrate from within an ISR, we can bury any
> attempt to deliver signals to spinning Xenomai threads - or what other
> context would remain to Xenomai for triggering migration?
>

The two-phase solution I have mentioned would work.

> Jan
> 


-- 
Philippe.



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:38         ` Philippe Gerum
@ 2009-03-09 17:58           ` Gilles Chanteperdrix
  2009-03-10 11:09             ` Jan Kiszka
  2009-03-09 23:46           ` Jan Kiszka
  1 sibling, 1 reply; 21+ messages in thread
From: Gilles Chanteperdrix @ 2009-03-09 17:58 UTC (permalink / raw)
  To: rpm; +Cc: Jan Kiszka, xenomai-core

Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> Meanwhile I played with some light-weight approach to relax a thread
>>>> that received a signal (according to do_sigwake_event). Worked, but only
>>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>>> it does not handle the case that a non-root handler may alter the
>>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>>> involved domains.
>>> It is not a bug, this is wanted. ISR must neither change the current
>>> domain nor migrate CPU; allowing this would open Pandora's box.
>> And if there is no way to migrate from within an ISR, we can bury any
>> attempt to deliver signals to spinning Xenomai threads - or what other
>> context would remain to Xenomai for triggering migration?
>>
> 
> The two-phase solution I have mentioned would work.

Something like:

Index: ksrc/nucleus/sched.c
===================================================================
--- ksrc/nucleus/sched.c        (revision 4678)
+++ ksrc/nucleus/sched.c        (working copy)
@@ -75,7 +75,13 @@ static void xnsched_watchdog_handler(str
                           thread, xnthread_name(thread));
                xnprintf("watchdog triggered -- killing runaway thread '%s'\n",
                         xnthread_name(thread));
-               xnpod_delete_thread(thread);
+#ifdef CONFIG_XENO_OPT_PERVASIVE
+               if (xnthread_user_task(thread)) {
+                       xnpod_suspend_thread(thread);
+                       xnshadow_send_sig(thread, SIGSEGV, 0, 1);
+               } else
+#endif /* CONFIG_XENO_OPT_PERVASIVE */
+                       xnpod_delete_thread(thread);
                xnsched_reset_watchdog(sched);
        }
 }


-- 
                                                 Gilles.



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:38         ` Philippe Gerum
  2009-03-09 17:58           ` Gilles Chanteperdrix
@ 2009-03-09 23:46           ` Jan Kiszka
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Kiszka @ 2009-03-09 23:46 UTC (permalink / raw)
  To: rpm; +Cc: xenomai-core


Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> Meanwhile I played with some light-weight approach to relax a thread
>>>> that received a signal (according to do_sigwake_event). Worked, but only
>>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>>> it does not handle the case that a non-root handler may alter the
>>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>>> involved domains.
>>> It is not a bug, this is wanted. ISR must neither change the current
>>> domain nor migrate CPU; allowing this would open Pandora's box.
>> And if there is no way to migrate from within an ISR, we can bury any
>> attempt to deliver signals to spinning Xenomai threads - or what other
>> context would remain to Xenomai for triggering migration?
>>
> 
> The two-phase solution I have mentioned would work.

I think you can only handle lethal signals that way, not non-lethal ones
like SIGSTOP.

Jan




* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:37         ` Philippe Gerum
@ 2009-03-09 23:49           ` Jan Kiszka
  2009-03-10  9:20             ` Philippe Gerum
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kiszka @ 2009-03-09 23:49 UTC (permalink / raw)
  To: rpm; +Cc: xenomai-core


Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Hi,
>>>>>>
>>>>>> the watchdog is currently broken in trunk ("zombie [...] would not
>>>>>> die..."). In fact, it should also be broken in older versions, but only
>>>>>> recent thread termination rework made this visible.
>>>>>>
>>>>>> When a Xenomai CPU hog is caught by the watchdog,
>>>>>> xnpod_delete_thread is
>>>>>> invoked, causing the current thread to be set in zombie state and
>>>>>> scheduled out. But as its Linux mate still exist, hell breaks loose
>>>>>> once
>>>>>> Linux tries to get rid of it (the Xenomai zombie is scheduled in
>>>>>> again).
>>>>>> In short: calling xnpod_delete_thread(<self>) for a shadow thread is
>>>>>> not
>>>>>> working, probably never worked cleanly.
>>>>> Nak, it is a regression introduced by the scheduler changes in 2.5.x.
>>>>> We should detect _any_ shadow thread that schedules out in primary
>>>>> mode then regains control in secondary mode like we do in the 2.4.x
>>>>> series, not only _relaxing_ shadow threads. It is perfectly valid to
>>>>> have the Linux task orphaned from the deletion of its shadow TCB
>>>>> until Xenomai notices the issue and reaps it; problem was that such
>>>>> regression prevented the nucleus to get the memo.
>>>>>
>>>>> The following patch should fix the issue:
>>>>>
>>>>>   Index: include/asm-generic/system.h
>>>>> ===================================================================
>>>>> --- include/asm-generic/system.h    (revision 4676)
>>>>> +++ include/asm-generic/system.h    (working copy)
>>>>> @@ -311,6 +311,11 @@
>>>>>       return !!s;
>>>>>   }
>>>>>
>>>>> +static inline int xnarch_root_domain_p(void)
>>>>> +{
>>>>> +    return rthal_current_domain == rthal_root_domain;
>>>>> +}
>>>>> +
>>>>>   #ifdef CONFIG_SMP
>>>>>
>>>>>   #define xnlock_get(lock)        __xnlock_get(lock  XNLOCK_DBG_CONTEXT)
>>>>> Index: ksrc/nucleus/pod.c
>>>>> ===================================================================
>>>>> --- ksrc/nucleus/pod.c    (revision 4676)
>>>>> +++ ksrc/nucleus/pod.c    (working copy)
>>>>> @@ -2137,7 +2137,7 @@
>>>>>   void __xnpod_schedule(struct xnsched *sched)
>>>>>   {
>>>>>       struct xnthread *prev, *next, *curr = sched->curr;
>>>>> -    int zombie, switched = 0, need_resched, relaxing;
>>>>> +    int zombie, switched = 0, need_resched, shadow;
>>>>>       spl_t s;
>>>>>
>>>>>       if (xnarch_escalate())
>>>>> @@ -2174,9 +2174,9 @@
>>>>>              next, xnthread_name(next));
>>>>>
>>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>> -    relaxing = xnthread_test_state(prev, XNRELAX);
>>>>> +    shadow = xnthread_test_state(prev, XNSHADOW);
>>>>>   #else
>>>>> -    (void)relaxing;
>>>>> +    (void)shadow;
>>>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>>>
>>>>>       if (xnthread_test_state(next, XNROOT)) {
>>>>> @@ -2204,12 +2204,18 @@
>>>>>
>>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>>       /*
>>>>> -     * Test whether we are relaxing a thread. In such a case, we
>>>>> -     * are here the epilogue of Linux' schedule, and should skip
>>>>> -     * xnpod_schedule epilogue.
>>>>> +     * Test whether we transitioned from primary mode to secondary
>>>>> +     * over a shadow thread. This may happen in two cases:
>>>>> +     *
>>>>> +     * 1) the shadow thread just relaxed.
>>>>> +     * 2) the shadow TCB has just been deleted, in which case
>>>>> +     * we have to reap the mated Linux side as well.
>>>>> +     *
>>>>> +     * In both cases, we are running over the epilogue of Linux's
>>>>> +     * schedule, and should skip our epilogue code.
>>>>>        */
>>>>> -    if (relaxing)
>>>>> -        goto relax_epilogue;
>>>>> +    if (shadow && xnarch_root_domain_p())
>>>>> +        goto shadow_epilogue;
>>>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>>>
>>>>>       switched = 1;
>>>>> @@ -2252,7 +2258,7 @@
>>>>>       return;
>>>>>
>>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>> -      relax_epilogue:
>>>>> +      shadow_epilogue:
>>>>>       {
>>>>>           spl_t ignored;
>>>> Finally makes sense and works (but your posting was corrupted). Great.
>>>>
>>>>>> There are basically two approaches to fix it: The first one is to
>>>>>> find a
>>>>>> different way to kill (or only suspend?)
>>>>> Suspending the hog won't work, particularly when GDB is involved,
>>>>> because a pending non-lethal Linux signal may cause the suspended
>>>>> shadow to resume immediately for processing the signal, therefore
>>>>> defeating the purpose of the watchdog, leading to an infinite loop.
>>>>> This is why we moved from suspension to deletion upon watchdog
>>>>> trigger in 2.3 (2.2 used to suspend only).
>>>> Yes, that became clear to me in the meantime, too.
>>>>
>>>>>   the current shadow thread when
>>>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>>>> SIGKILL for the current thread and make sure that it can be
>>>>>> processed by
>>>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately,
>>>>>> there is
>>>>>> no way to force a shadow thread into secondary mode to handle pending
>>>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>>>> that raises the question if we shouldn't improve this as well while we
>>>>>> are on it.
>>>>>>
>>>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>>>> syscalls, otherwise the system would starve (and the watchdog would
>>>>>> come
>>>>>> around). On the other hand, delaying signals till syscall prologues is
>>>>>> different from plain Linux behaviour...
>>>>>>
>>>>>> Comments, ideas?
>>>>>>
>>>>> We probably need a two-stage approach: first record the thread was
>>>>> bumped out and suspend it from the watchdog handler to give Linux a
>>>>> chance to run again, then finish the work, killing it for good, next
>>>>> time the root thread is scheduled in on the same CPU.
>>>> That confuses me again: The watchdog issue is solved now, no? We are
>>>> only left with the scenario of breaking out of a user space loop of some
>>>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>>>> is no chance to raise the signal...).
>>>>
>>> If you first suspend the hog, then send it a lethal signal, you solve
>>> both issues: first Linux is allowed to run eventually, then your task
>>> won't be able to resume running the faulty code, but solely to process
>>> SIGKILL, which can be made pending early enough because the nucleus
>>> decides when Linux resumes.
>> I'm not interested in SIGKILL here, rather in SIGSTOP to do debugging.
>> That is currently impossible.
>>
>>>> Meanwhile I played with some light-weight approach to relax a thread
>>>> that received a signal (according to do_sigwake_event). Worked, but only
>>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>>> it does not handle the case that a non-root handler may alter the
>>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>>> involved domains.
>>> It is not a bug, this is wanted. ISR must neither change the current
>>> domain nor migrate CPU; allowing this would open Pandora's box.
>> OK, then please elaborate on this a bit more in the adeos-main thread
>> and explain why __ipipe_sync_stage currently reloads the domain.
>>
> 
> ipipe_cpudom_ptr() may be affected by CPU migration within the _root_ domain, 
> which does not mean that non-root domains are allowed to migrate and/or change 
> domains.

ipd or ipipe_current_domain should not be affected by CPU migration, so
I still see no point in re-reading the current domain unless it actually
changes.

Jan




* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 23:49           ` Jan Kiszka
@ 2009-03-10  9:20             ` Philippe Gerum
  0 siblings, 0 replies; 21+ messages in thread
From: Philippe Gerum @ 2009-03-10  9:20 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
>>>> Jan Kiszka wrote:
>>>>> Philippe Gerum wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> the watchdog is currently broken in trunk ("zombie [...] would not
>>>>>>> die..."). In fact, it should also be broken in older versions, but only
>>>>>>> recent thread termination rework made this visible.
>>>>>>>
>>>>>>> When a Xenomai CPU hog is caught by the watchdog,
>>>>>>> xnpod_delete_thread is
>>>>>>> invoked, causing the current thread to be set in zombie state and
>>>>>>> scheduled out. But as its Linux mate still exist, hell breaks loose
>>>>>>> once
>>>>>>> Linux tries to get rid of it (the Xenomai zombie is scheduled in
>>>>>>> again).
>>>>>>> In short: calling xnpod_delete_thread(<self>) for a shadow thread is
>>>>>>> not
>>>>>>> working, probably never worked cleanly.
>>>>>> Nak, it is a regression introduced by the scheduler changes in 2.5.x.
>>>>>> We should detect _any_ shadow thread that schedules out in primary
>>>>>> mode then regains control in secondary mode like we do in the 2.4.x
>>>>>> series, not only _relaxing_ shadow threads. It is perfectly valid to
>>>>>> have the Linux task orphaned from the deletion of its shadow TCB
>>>>>> until Xenomai notices the issue and reaps it; problem was that such
>>>>>> regression prevented the nucleus to get the memo.
>>>>>>
>>>>>> The following patch should fix the issue:
>>>>>>
>>>>>>   Index: include/asm-generic/system.h
>>>>>> ===================================================================
>>>>>> --- include/asm-generic/system.h    (revision 4676)
>>>>>> +++ include/asm-generic/system.h    (working copy)
>>>>>> @@ -311,6 +311,11 @@
>>>>>>       return !!s;
>>>>>>   }
>>>>>>
>>>>>> +static inline int xnarch_root_domain_p(void)
>>>>>> +{
>>>>>> +    return rthal_current_domain == rthal_root_domain;
>>>>>> +}
>>>>>> +
>>>>>>   #ifdef CONFIG_SMP
>>>>>>
>>>>>>   #define xnlock_get(lock)        __xnlock_get(lock  XNLOCK_DBG_CONTEXT)
>>>>>> Index: ksrc/nucleus/pod.c
>>>>>> ===================================================================
>>>>>> --- ksrc/nucleus/pod.c    (revision 4676)
>>>>>> +++ ksrc/nucleus/pod.c    (working copy)
>>>>>> @@ -2137,7 +2137,7 @@
>>>>>>   void __xnpod_schedule(struct xnsched *sched)
>>>>>>   {
>>>>>>       struct xnthread *prev, *next, *curr = sched->curr;
>>>>>> -    int zombie, switched = 0, need_resched, relaxing;
>>>>>> +    int zombie, switched = 0, need_resched, shadow;
>>>>>>       spl_t s;
>>>>>>
>>>>>>       if (xnarch_escalate())
>>>>>> @@ -2174,9 +2174,9 @@
>>>>>>              next, xnthread_name(next));
>>>>>>
>>>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>>> -    relaxing = xnthread_test_state(prev, XNRELAX);
>>>>>> +    shadow = xnthread_test_state(prev, XNSHADOW);
>>>>>>   #else
>>>>>> -    (void)relaxing;
>>>>>> +    (void)shadow;
>>>>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>>>>
>>>>>>       if (xnthread_test_state(next, XNROOT)) {
>>>>>> @@ -2204,12 +2204,18 @@
>>>>>>
>>>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>>>       /*
>>>>>> -     * Test whether we are relaxing a thread. In such a case, we
>>>>>> -     * are here the epilogue of Linux' schedule, and should skip
>>>>>> -     * xnpod_schedule epilogue.
>>>>>> +     * Test whether we transitioned from primary mode to secondary
>>>>>> +     * over a shadow thread. This may happen in two cases:
>>>>>> +     *
>>>>>> +     * 1) the shadow thread just relaxed.
>>>>>> +     * 2) the shadow TCB has just been deleted, in which case
>>>>>> +     * we have to reap the mated Linux side as well.
>>>>>> +     *
>>>>>> +     * In both cases, we are running over the epilogue of Linux's
>>>>>> +     * schedule, and should skip our epilogue code.
>>>>>>        */
>>>>>> -    if (relaxing)
>>>>>> -        goto relax_epilogue;
>>>>>> +    if (shadow && xnarch_root_domain_p())
>>>>>> +        goto shadow_epilogue;
>>>>>>   #endif /* CONFIG_XENO_OPT_PERVASIVE */
>>>>>>
>>>>>>       switched = 1;
>>>>>> @@ -2252,7 +2258,7 @@
>>>>>>       return;
>>>>>>
>>>>>>   #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>>> -      relax_epilogue:
>>>>>> +      shadow_epilogue:
>>>>>>       {
>>>>>>           spl_t ignored;
>>>>> Finally makes sense and works (but your posting was corrupted). Great.
>>>>>
>>>>>>> There are basically two approaches to fix it: The first one is to
>>>>>>> find a
>>>>>>> different way to kill (or only suspend?)
>>>>>> Suspending the hog won't work, particularly when GDB is involved,
>>>>>> because a pending non-lethal Linux signal may cause the suspended
>>>>>> shadow to resume immediately for processing the signal, therefore
>>>>>> defeating the purpose of the watchdog, leading to an infinite loop.
>>>>>> This is why we moved from suspension to deletion upon watchdog
>>>>>> trigger in 2.3 (2.2 used to suspend only).
>>>>> Yes, that became clear to me in the meantime, too.
>>>>>
>>>>>>   the current shadow thread when
>>>>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>>>>> SIGKILL for the current thread and make sure that it can be
>>>>>>> processed by
>>>>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately,
>>>>>>> there is
>>>>>>> no way to force a shadow thread into secondary mode to handle pending
>>>>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>>>>> that raises the question if we shouldn't improve this as well while we
>>>>>>> are on it.
>>>>>>>
>>>>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>>>>> syscalls, otherwise the system would starve (and the watchdog would
>>>>>>> come
>>>>>>> around). On the other hand, delaying signals till syscall prologues is
>>>>>>> different from plain Linux behaviour...
>>>>>>>
>>>>>>> Comments, ideas?
>>>>>>>
>>>>>> We probably need a two-stage approach: first record the thread was
>>>>>> bumped out and suspend it from the watchdog handler to give Linux a
>>>>>> chance to run again, then finish the work, killing it for good, next
>>>>>> time the root thread is scheduled in on the same CPU.
>>>>> That confuses me again: The watchdog issue is solved now, no? We are
>>>>> only left with the scenario of breaking out of a user space loop of some
>>>>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>>>>> is no chance to raise the signal...).
>>>>>
>>>> If you first suspend the hog, then send it a lethal signal, you solve
>>>> both issues: first Linux is allowed to run eventually, then your task
>>>> won't be able to resume running the faulty code, but solely to process
>>>> SIGKILL, which can be made pending early enough because the nucleus
>>>> decides when Linux resumes.
>>> I'm not interested in SIGKILL here, rather in SIGSTOP to do debugging.
>>> That is currently impossible.
>>>
>>>>> Meanwhile I played with some light-weight approach to relax a thread
>>>>> that received a signal (according to do_sigwake_event). Worked, but only
>>>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>>>> it does not handle the case that a non-root handler may alter the
>>>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>>>> involved domains.
>>>> It is not a bug, this is wanted. ISR must neither change the current
>>>> domain nor migrate CPU; allowing this would open Pandora's box.
>>> OK, then please elaborate on this a bit more in the adeos-main thread
>>> and explain why __ipipe_sync_stage currently reloads the domain.
>>>
>> ipipe_cpudom_ptr() may be affected by CPU migration within the _root_ domain, 
>> which does not mean that non-root domains are allowed to migrate and/or change 
>> domains.
> 
> ipd or ipipe_current_domain should not be affected by CPU migration, so
> I still see no point in re-reading the current domain unless it actually
> changes.
>

Please re-read:
>> which does not mean that non-root domains are allowed to migrate and/or change 
>> domains.

That also means that the root domain may migrate CPU and/or domain, not the
others. I'm not opposed to reconsidering domain migration for non-root domains,
but this has implications. Not all pipeline code is able to handle such a
situation, particularly not all ipipe_sync_pipeline call sites. Each and every
call site should be inspected to make sure that important assumptions are not
broken. The fact that __ipipe_run_isr() does not consider such migration
possible when non-root domains are involved is telling. Again, this was done on
purpose. Sorry for the bad news.
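
To make the rule concrete, here is a toy model of it in plain C. It is only a
sketch: every name in it (toy_cpu, toy_run_isr, and so on) is hypothetical and
none of it is I-pipe code. It assumes a single per-CPU "current domain" field
and merely checks that a handler run on behalf of a non-root domain leaves that
field untouched, while root handlers are exempt.

```c
#include <stdbool.h>

/* Toy model only: these names are illustrative, not I-pipe symbols. */
enum toy_domain { TOY_ROOT, TOY_NONROOT };

struct toy_cpu {
    enum toy_domain current;   /* domain currently running on this CPU */
};

typedef void (*toy_isr_t)(struct toy_cpu *cpu);

/*
 * Run a handler on behalf of `dom` and report whether the rule held:
 * a handler run for a non-root domain must leave the current domain
 * unchanged. Root handlers are exempt in this model.
 */
static bool toy_run_isr(struct toy_cpu *cpu, enum toy_domain dom, toy_isr_t isr)
{
    enum toy_domain before;

    cpu->current = dom;
    before = cpu->current;
    isr(cpu);

    if (dom == TOY_NONROOT && cpu->current != before)
        return false;   /* the handler switched domains behind our back */
    return true;
}

/* A handler that respects the rule. */
static void toy_well_behaved_isr(struct toy_cpu *cpu)
{
    (void)cpu;
}

/* A handler that breaks the rule by switching to the root domain. */
static void toy_migrating_isr(struct toy_cpu *cpu)
{
    cpu->current = TOY_ROOT;
}
```

In this model, running toy_migrating_isr for TOY_NONROOT is flagged as a
violation, which is the kind of assumption each real call site would have to be
audited for before non-root migration could be allowed.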

> Jan
> 


-- 
Philippe.



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-09 17:58           ` Gilles Chanteperdrix
@ 2009-03-10 11:09             ` Jan Kiszka
  2009-03-10 13:17               ` Gilles Chanteperdrix
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kiszka @ 2009-03-10 11:09 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai-core

Gilles Chanteperdrix wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>> Philippe Gerum wrote:
>>>> Jan Kiszka wrote:
>>>>> Meanwhile I played with some light-weight approach to relax a thread
>>>>> that received a signal (according to do_sigwake_event). Worked, but only
>>>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>>>> it does not handle the case that a non-root handler may alter the
>>>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>>>> involved domains.
>>>> It is not a bug, this is wanted. ISR must neither change the current
>>>> domain nor migrate CPU; allowing this would open Pandora's box.
>>> And if there is no way to migrate from within an ISR, we can bury any
>>> attempt to deliver signals to spinning Xenomai threads - or what other
>>> context would remain to Xenomai for triggering migration?
>>>
>> The two-phase solution I have mentioned would work.
> 
> Something like:
> 
> Index: ksrc/nucleus/sched.c
> ===================================================================
> --- ksrc/nucleus/sched.c        (revision 4678)
> +++ ksrc/nucleus/sched.c        (working copy)
> @@ -75,7 +75,13 @@ static void xnsched_watchdog_handler(str
>                            thread, xnthread_name(thread));
>                 xnprintf("watchdog triggered -- killing runaway thread
> '%s'\n",
>                          xnthread_name(thread));
> -               xnpod_delete_thread(thread);
> +#ifdef CONFIG_XENO_OPT_PERVASIVE
> +               if (xnthread_user_task(thread)) {
> +                       xnpod_suspend_thread(thread);
> +                       xnshadow_send_sig(thread, SIGSEGV, 0, 1);
> +               } else
> +#endif /* CONFIG_XENO_OPT_PERVASIVE */
> +                       xnpod_delete_thread(thread);
>                 xnsched_reset_watchdog(sched);
>         }
>  }

Looks good - but we first have to establish that famous relax-on-pending
signal thing...
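
For readers following along, the two-phase idea quoted above can be modelled in
plain user-space C. This is a rough sketch under stated assumptions, not nucleus
code: the toy_* types and functions are hypothetical, and stage 2 simply assumes
that the pending signal does get processed once Linux runs again, which is
exactly the relax-on-pending-signal piece that is still missing.

```c
#include <signal.h>
#include <stdbool.h>

/* Toy model only: none of these types or functions are Xenomai APIs. */
enum toy_state { TOY_RUNNING, TOY_SUSPENDED, TOY_DELETED };

struct toy_thread {
    const char *name;
    bool user_task;        /* true for a shadow (user-space) thread */
    enum toy_state state;
    int pending_signal;    /* 0 means no signal is pending */
};

/* Stage 1: what the watchdog handler does when it catches a hog. */
static void toy_watchdog_handler(struct toy_thread *t)
{
    if (t->user_task) {
        /* Park the hog so Linux can run again, leave a lethal signal pending. */
        t->state = TOY_SUSPENDED;
        t->pending_signal = SIGSEGV;
    } else {
        /* Kernel-only threads have no Linux mate: delete them directly. */
        t->state = TOY_DELETED;
    }
}

/*
 * Stage 2: what happens once the root (Linux) side is scheduled in again.
 * Assumes the pending signal is actually delivered at that point.
 * Returns true if the thread was reaped here.
 */
static bool toy_root_resumes(struct toy_thread *t)
{
    if (t->state == TOY_SUSPENDED && t->pending_signal != 0) {
        t->state = TOY_DELETED;
        t->pending_signal = 0;
        return true;
    }
    return false;
}
```

A shadow thread goes TOY_RUNNING, then TOY_SUSPENDED with SIGSEGV pending, then
TOY_DELETED once toy_root_resumes() runs; a kernel thread is deleted in stage 1
and stage 2 has nothing left to do.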

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux



* Re: [Xenomai-core] Watchdog / immediate Linux signal delivery
  2009-03-10 11:09             ` Jan Kiszka
@ 2009-03-10 13:17               ` Gilles Chanteperdrix
  0 siblings, 0 replies; 21+ messages in thread
From: Gilles Chanteperdrix @ 2009-03-10 13:17 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-core

Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> Jan Kiszka wrote:
>>>>>> Meanwhile I played with some light-weight approach to relax a thread
>>>>>> that received a signal (according to do_sigwake_event). Worked, but only
>>>>>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>>>>>> it does not handle the case that a non-root handler may alter the
>>>>>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>>>>>> involved domains.
>>>>> It is not a bug, this is wanted. ISR must neither change the current
>>>>> domain nor migrate CPU; allowing this would open Pandora's box.
>>>> And if there is no way to migrate from within an ISR, we can bury any
>>>> attempt to deliver signals to spinning Xenomai threads - or what other
>>>> context would remain to Xenomai for triggering migration?
>>>>
>>> The two-phase solution I have mentioned would work.
>> Something like:
>>
>> Index: ksrc/nucleus/sched.c
>> ===================================================================
>> --- ksrc/nucleus/sched.c        (revision 4678)
>> +++ ksrc/nucleus/sched.c        (working copy)
>> @@ -75,7 +75,13 @@ static void xnsched_watchdog_handler(str
>>                            thread, xnthread_name(thread));
>>                 xnprintf("watchdog triggered -- killing runaway thread
>> '%s'\n",
>>                          xnthread_name(thread));
>> -               xnpod_delete_thread(thread);
>> +#ifdef CONFIG_XENO_OPT_PERVASIVE
>> +               if (xnthread_user_task(thread)) {
>> +                       xnpod_suspend_thread(thread);
>> +                       xnshadow_send_sig(thread, SIGSEGV, 0, 1);
>> +               } else
>> +#endif /* CONFIG_XENO_OPT_PERVASIVE */
>> +                       xnpod_delete_thread(thread);
>>                 xnsched_reset_watchdog(sched);
>>         }
>>  }
> 
> Looks good - but we first have to establish that famous relax-on-pending
> signal thing...

Yes, obviously. In other words, it does not work...

-- 
                                                 Gilles.



Thread overview: 21+ messages
2009-03-08 10:42 [Xenomai-core] Watchdog / immediate Linux signal delivery Jan Kiszka
2009-03-08 13:41 ` Gilles Chanteperdrix
2009-03-08 14:23   ` Jan Kiszka
2009-03-08 14:43     ` Gilles Chanteperdrix
2009-03-08 14:55       ` Jan Kiszka
2009-03-09 15:50 ` Philippe Gerum
2009-03-09 16:44   ` Jan Kiszka
2009-03-09 17:02     ` Gilles Chanteperdrix
2009-03-09 17:15       ` Jan Kiszka
2009-03-09 17:27       ` Philippe Gerum
2009-03-09 17:09     ` Philippe Gerum
2009-03-09 17:12       ` Jan Kiszka
2009-03-09 17:37         ` Philippe Gerum
2009-03-09 23:49           ` Jan Kiszka
2009-03-10  9:20             ` Philippe Gerum
2009-03-09 17:27       ` Jan Kiszka
2009-03-09 17:38         ` Philippe Gerum
2009-03-09 17:58           ` Gilles Chanteperdrix
2009-03-10 11:09             ` Jan Kiszka
2009-03-10 13:17               ` Gilles Chanteperdrix
2009-03-09 23:46           ` Jan Kiszka
