* [Xenomai] SIGXCPU with rt_mutex_release
@ 2012-12-31 17:32 Mariusz Janiak
  2012-12-31 17:41 ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Mariusz Janiak @ 2012-12-31 17:32 UTC (permalink / raw)
  To: Xenomai

Hi,

I ran into a problem when running the OROCOS helloworld example. Xenomai generates a SIGXCPU signal when rt_mutex_release(...) is called. I installed a signal handler for each real-time thread created by OROCOS and get the following result:

SIGDEBUG received, reason 4: affected by priority inversion
/worek/install/orocos-toolchain-xeno/install/lib/liborocos-rtt-xenomai.so.2.6(my_warn_upon_switch+0x44)[0x7f687ad75024]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f687a5dbcb0]
/usr/xenomai/lib/libnative.so.3(rt_mutex_release+0xbb)[0x7f687a7ecf5b]
/worek/install/orocos-toolchain-xeno/install/lib/liborocos-rtt-xenomai.so.2.6(_ZN3RTT2os6ThreadD1Ev+0x129)[0x7f687ad73d89]
/worek/install/orocos-toolchain-xeno/install/lib/liborocos-rtt-xenomai.so.2.6(_ZN3RTT8ActivityD2Ev+0x37)[0x7f687ad272e7]
/worek/install/orocos-toolchain-xeno/install/lib/liborocos-rtt-xenomai.so.2.6(_ZN3RTT8ActivityD0Ev+0x9)[0x7f687ad27339]
/worek/install/orocos-toolchain-xeno/install/lib/liborocos-rtt-xenomai.so.2.6(_ZN5boost10shared_ptrIN3RTT4base17ActivityInterfaceEEaSERKS4_+0x4e)[0x7f687ad3fb6e]
/worek/install/orocos-toolchain-xeno/install/lib/liborocos-rtt-xenomai.so.2.6(_ZN3RTT11TaskContext11setActivityEPNS_4base17ActivityInterfaceE+0x72)[0x7f687ad34d32]
./helloworld(_ZN3OCL10HelloWorldC2ESs+0x19f)[0x4505ff]
./helloworld(_Z13ORO_main_impliPPc+0xe3)[0x445393]
./helloworld(main+0x83)[0x444d63]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f6879a1976d]
./helloworld[0x444f01]
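
A handler along these lines can also record the reason code rather than only dumping a backtrace. This is a minimal sketch in plain POSIX so it can be tried without Xenomai. It assumes the reason travels in the low byte of the queued signal value, which should be checked against the Xenomai headers in use. The names on_sigdebug, install_sigdebug_handler, fake_sigdebug and REASON_PRIOINV are hypothetical; only the value 4 comes from the log above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical name; the value 4 is the reason printed in the log above. */
#define REASON_PRIOINV 4

static volatile sig_atomic_t last_reason = -1;

/* Three-argument handler: only valid together with SA_SIGINFO. */
static void on_sigdebug(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    /* Assumption: the reason code sits in the low byte of the queued value. */
    last_reason = si->si_value.sival_int & 0xff;
}

static int install_sigdebug_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_sigdebug;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGXCPU, &sa, NULL);
}

/* Stand-in for the kernel side: queue SIGXCPU to this process with a reason. */
static int fake_sigdebug(int reason)
{
    union sigval v;

    assert(reason >= 0 && reason <= 0xff);
    v.sival_int = reason;
    return sigqueue(getpid(), SIGXCPU, v);
}
```

After install_sigdebug_handler(), a call such as fake_sigdebug(REASON_PRIOINV) leaves the code in last_reason, where the main program can read it.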

There is a much bigger mess when the application exits. I am not sure this is a Xenomai problem; if not, sorry for bothering. I haven't checked it with a previous Xenomai release yet.

Best regards and happy new year,
Mariusz

* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2012-12-31 17:32 Mariusz Janiak
@ 2012-12-31 17:41 ` Gilles Chanteperdrix
  2012-12-31 17:52   ` Peter Soetens
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2012-12-31 17:41 UTC (permalink / raw)
  To: Mariusz Janiak; +Cc: Xenomai

On 12/31/2012 06:32 PM, Mariusz Janiak wrote:

> Hi,
> 
> I have met problem when I run OROCOS helloWord example. Xenomai
> generate SIGXCPU signal when rt_mutex_release(...) is called. I have
> installed signal handler for each realtime thread created by OROCOS
> and I get following result
> 
> SIGDEBUG received, reason 4: affected by priority inversion


There is a problem in your application, namely a mutex owned by a thread
running with the SCHED_OTHER scheduling policy, released while Xenomai
did not notice that the same thread had taken the mutex. This could
happen if for instance you change the scheduling policy while holding a
mutex.

Please try reducing the error to a simple testcase which will allow us
to investigate this issue.

> There is much bigger mess during exiting from application. I am not
> pretty sure this is a Xenomai problem, if not sorry for bothering. I
> haven check it with previous Xenomai release yet.


We need facts to start working, again, please write a simple test case
to help investigating the issue.

-- 
                                                                Gilles.


* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2012-12-31 17:41 ` Gilles Chanteperdrix
@ 2012-12-31 17:52   ` Peter Soetens
  2012-12-31 17:55     ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Soetens @ 2012-12-31 17:52 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

Hi all,

Thanks for the analysis. I'd propose to move this to bugs.orocos.org where
we can try to reproduce and track this bug in orocos.

We've been testing in our buildfarm against xenomai 2.5, so this might be
something we did not notice until now...

Peter

On Monday, 31 December 2012, Gilles Chanteperdrix
(gilles.chanteperdrix@xenomai.org) wrote:

> On 12/31/2012 06:32 PM, Mariusz Janiak wrote:
>
> > Hi,
> >
> > I have met problem when I run OROCOS helloWord example. Xenomai
> > generate SIGXCPU signal when rt_mutex_release(...) is called. I have
> > installed signal handler for each realtime thread created by OROCOS
> > and I get following result
> >
> > SIGDEBUG received, reason 4: affected by priority inversion
>
>
> There is a problem in your application, namely a mutex owned by a thread
> running with the SCHED_OTHER scheduling policy, released while Xenomai
> did not notice that the same thread had taken the mutex. This could
> happen if for instance you change the scheduling policy while holding a
> mutex.
>
> Please try reducing the error to a simple testcase which will allow us
> to investigate this issue.
>
> > There is much bigger mess during exiting from application. I am not
> > pretty sure this is a Xenomai problem, if not sorry for bothering. I
> > haven check it with previous Xenomai release yet.
>
>
> We need facts to start working, again, please write a simple test case
> to help investigating the issue.
>
> --
>                                                                 Gilles.

* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2012-12-31 17:52   ` Peter Soetens
@ 2012-12-31 17:55     ` Gilles Chanteperdrix
  0 siblings, 0 replies; 17+ messages in thread
From: Gilles Chanteperdrix @ 2012-12-31 17:55 UTC (permalink / raw)
  To: peter; +Cc: Xenomai

On 12/31/2012 06:52 PM, Peter Soetens wrote:

> Hi all,
> 
> Thanks for the analysis. I'd propose to move this to bugs.orocos.org
> where we can try to reproduce and track this bug in orocos.
> 
> We've been testing in our buildfarm against xenomai 2.5, so this might
> be something we did not notice until now...


It is related to the automatic migration of threads running with
SCHED_OTHER policy introduced in Xenomai 2.6.


-- 
                                                                Gilles.


* Re: [Xenomai] SIGXCPU with rt_mutex_release
@ 2013-01-08 11:12 Mariusz Janiak
  2013-01-08 13:33 ` Jan Kiszka
  2013-01-08 19:43 ` Gilles Chanteperdrix
  0 siblings, 2 replies; 17+ messages in thread
From: Mariusz Janiak @ 2013-01-08 11:12 UTC (permalink / raw)
  To: Xenomai

Hi Gilles,

As you suggested, I have prepared a simple test case that demonstrates how Xenomai is used by OROCOS. It behaves exactly like the helloworld example. The scheduler is chosen before any mutex is touched, so in my opinion this is not the case you described. What is really surprising is that replacing TM_NONBLOCK with TM_INFINITE in the final rt_mutex_acquire call works like magic and suppresses the signal. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);', so why is the signal generated? If we enable T_WARNSW in the thread, SIGXCPU is generated the first time the mutex is locked in the thread.

The code follows:

/*****************************************************************************
 * mutexTest.c
 * Xenomai mutex test for SIGXCPU
 * Mariusz Janiak
 * Wroclaw 2013
 *****************************************************************************/

#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <execinfo.h>
#include <sched.h>
#include <native/task.h>
#include <native/mutex.h>
#include <native/sem.h>

#define ORO_SCHED_OTHER 1 /** Soft real-time */

#ifdef UNUSED
#elif defined(__GNUC__)
# define UNUSED(x) UNUSED_ ## x __attribute__((unused))
#else
# define UNUSED(x) x
#endif

void test(void *UNUSED(arg));
void warn_upon_switch(int UNUSED(sig));

RT_TASK  mainTask, testTask; 
RT_MUTEX mutex;
RT_SEM   sem;

void warn_upon_switch(int UNUSED(sig))
{
  void *bt[32];
  int nentries;
  
  nentries = backtrace(bt, sizeof(bt) / sizeof(bt[0]));
  backtrace_symbols_fd(bt, nentries, fileno(stderr));
}

void test(void *UNUSED(arg))
{
  
  /* thread_function() in Thread.cpp:83 -- Thread::configure() in 
     Thread.cpp:489 -- rtos_task_set_period(...) in fosi_internal.cpp:387 --
     rtos_task_make_periodic(...) in fosi_internal.cpp:378*/
  rt_task_set_periodic(NULL, TM_NOW, TM_INFINITE);
  /* thread_function() in Thread.cpp:86 -- rtos_sem_signal(...) in fosi.h:188*/
  rt_sem_v(&sem);
  /* thread_function() in Thread.cpp:89 -- MutexLock */
  rt_mutex_acquire(&mutex, TM_INFINITE);
  rt_mutex_release(&mutex);
  /* thread_function() in Thread.cpp:116 -- rtos_sem_wait(...) in fosi.h:194 */
  rt_sem_p(&sem, TM_INFINITE);
}

int main(int UNUSED(argc), char *UNUSED(argv[]))
{
  int                ret=0;
  struct sched_param param;
  struct sigaction   sa;

  /* rtos_task_create_main(...) in fosi_internal.cpp:82 */
  mlockall(MCL_CURRENT|MCL_FUTURE);
  /* rtos_task_create_main(...) in fosi_internal.cpp:91 */
  param.sched_priority = sched_get_priority_max(ORO_SCHED_OTHER);
  if (param.sched_priority != -1 ){
    sched_setscheduler(0, ORO_SCHED_OTHER, &param);
  }
  /* rtos_task_create_main(...) in fosi_internal.cpp:102 */
  ret = rt_task_shadow(&mainTask, "MutexTest", 0, 0);
  if(ret < 0){
    printf("ERROR: rt_task_shadow(...)\n");
    return -1;
  }
  /* rtos_task_create_main(...) in fosi_internal.cpp:162 */
  sa.sa_handler = warn_upon_switch; /* one-argument handler: SA_SIGINFO is not set */
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  sigaction(SIGXCPU, &sa, 0);

  /* Thread::Thread(...) in Thread.cpp:238 -- automatic in constructor */
  rt_mutex_create(&mutex, "breaker");
  /* Thread::setup(...) in Thread.cpp:257 -- in MutexLock constructor */
  ret = rt_mutex_acquire(&mutex, TM_INFINITE);
  /* Thread::setup(...) in Thread.cpp:264 -- rtos_sem_init(...) in fosi.h:176*/
  rt_sem_create(&sem, "sem", 0, S_PRIO);
  /* Thread::setup(...) in Thread.cpp:293 -- rtos_task_create(...) in 
     fosi_internal.cpp:240 */
  ret = rt_task_spawn(&testTask, "testTask", 128000, 1, T_JOINABLE | (0 & T_CPUMASK), test, NULL);
  if(ret < 0){
    printf("ERROR: rt_task_spawn(...)\n");
    return -1;
  }
  /* Thread::setup(...) in Thread.cpp:309 -- rtos_sem_wait(...) in fosi.h:194 */
  rt_sem_p(&sem, TM_INFINITE);
  /* Thread::setup(...) in Thread.cpp:324 --  in MutexLock destructor */
  rt_mutex_release(&mutex);

  /* Do something */
  sleep(1);

  /* Thread::terminate() in Thread.cpp:614 -- rtos_sem_signal(...) in 
     fosi.h:188*/
  rt_sem_v(&sem);
  /* Thread::terminate() in Thread.cpp:616 -- rtos_task_delete(...) in 
     fosi_internal.cpp:490*/
  rt_task_join(&testTask); 
  rt_task_delete(&testTask);
  /* Thread::~Thread() in Thread.cpp:326 -- automatic in destructor when 
     TaskContext::setActivity(...) is called by HelloWord object (default 
     activity created in object constructor is replaced by new activity)*/
  rt_mutex_acquire(&mutex, TM_NONBLOCK); /* rtos_mutex_trylock(...) in 
                                            fosi.h:247 */
  rt_mutex_release(&mutex);              /* we get the SIG here!!! */
  return 0;
}

Build procedure (you need Obj, Dep and Bin directories in the current directory, and Xenomai installed in /usr/xenomai):

gcc -c -Wp,-MM,-MP,-MT,mutexTest.o,-MF,Dep/mutexTest.d -O2 -ggdb3 -DDEBUG_EN   -I/usr/xenomai/include -D_GNU_SOURCE -D_REENTRANT -D__XENO__ -Wall -Wextra -Wcast-align -Wimplicit -Wpointer-arith  -Wswitch -Wredundant-decls -Wreturn-type  -Wunused  -Wsign-compare -Waggregate-return  -Wnested-externs  -Wmissing-prototypes -Wstrict-prototypes -Wmissing-declarations -Wbad-function-cast mutexTest.c -o Obj/mutexTest.o

gcc -o Bin/mutexTest.run Obj/mutexTest.o  -lnative -L/usr/xenomai/lib -lxenomai -lpthread -lrt -lm

Mariusz

* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-08 11:12 [Xenomai] SIGXCPU with rt_mutex_release Mariusz Janiak
@ 2013-01-08 13:33 ` Jan Kiszka
  2013-01-08 19:40   ` Gilles Chanteperdrix
  2013-01-08 19:43 ` Gilles Chanteperdrix
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2013-01-08 13:33 UTC (permalink / raw)
  To: Mariusz Janiak; +Cc: Xenomai

On 2013-01-08 12:12, Mariusz Janiak wrote:
> Hi GIlles,
> 
> As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why 
> signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 

This should cure the issue (there was a check for XNTRAPSW missing):

diff --git a/ksrc/nucleus/synch.c b/ksrc/nucleus/synch.c
index e10be47..c1465dc 100644
--- a/ksrc/nucleus/synch.c
+++ b/ksrc/nucleus/synch.c
@@ -687,10 +687,11 @@ xnsynch_release_thread(struct xnsynch *synch, struct xnthread *lastowner)
 
 #ifdef CONFIG_XENO_OPT_PERVASIVE
 	if (xnthread_test_state(lastowner, XNOTHER)) {
-		if (xnthread_get_rescnt(lastowner) == 0)
-			xnshadow_send_sig(lastowner, SIGDEBUG,
-					  SIGDEBUG_MIGRATE_PRIOINV, 1);
-		else
+		if (xnthread_get_rescnt(lastowner) == 0) {
+			if (xnthread_test_state(lastowner, XNTRAPSW))
+				xnshadow_send_sig(lastowner, SIGDEBUG,
+						  SIGDEBUG_MIGRATE_PRIOINV, 1);
+		} else
 			xnthread_dec_rescnt(lastowner);
 	}
 #endif

Thanks for providing the test case.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


* Re: [Xenomai] SIGXCPU with rt_mutex_release
@ 2013-01-08 17:42 Mariusz Janiak
  0 siblings, 0 replies; 17+ messages in thread
From: Mariusz Janiak @ 2013-01-08 17:42 UTC (permalink / raw)
  To: Xenomai

On Tuesday, 8 January 2013 14:33, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2013-01-08 12:12, Mariusz Janiak wrote:
> > Hi GIlles,
> > 
> > As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why
> > signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 
> 
> This should cure the issue (there was a check for XNTRAPSW missing):
> 
> diff --git a/ksrc/nucleus/synch.c b/ksrc/nucleus/synch.c
> index e10be47..c1465dc 100644
> --- a/ksrc/nucleus/synch.c
> +++ b/ksrc/nucleus/synch.c
> @@ -687,10 +687,11 @@ xnsynch_release_thread(struct xnsynch *synch, struct xnthread *lastowner)
>  
>  #ifdef CONFIG_XENO_OPT_PERVASIVE
>  	if (xnthread_test_state(lastowner, XNOTHER)) {
> -		if (xnthread_get_rescnt(lastowner) == 0)
> -			xnshadow_send_sig(lastowner, SIGDEBUG,
> -					  SIGDEBUG_MIGRATE_PRIOINV, 1);
> -		else
> +		if (xnthread_get_rescnt(lastowner) == 0) {
> +			if (xnthread_test_state(lastowner, XNTRAPSW))
> +				xnshadow_send_sig(lastowner, SIGDEBUG,
> +						  SIGDEBUG_MIGRATE_PRIOINV, 1);
> +		} else
>  			xnthread_dec_rescnt(lastowner);
>  	}
>  #endif
> 
> Thanks for providing the test case.
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux

Thank you. With this patch, the unexpected SIGXCPU no longer occurs in my test case or in the OROCOS helloworld example.

Best regards,
Mariusz

* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-08 13:33 ` Jan Kiszka
@ 2013-01-08 19:40   ` Gilles Chanteperdrix
  0 siblings, 0 replies; 17+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-08 19:40 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 01/08/2013 02:33 PM, Jan Kiszka wrote:

> On 2013-01-08 12:12, Mariusz Janiak wrote:
>> Hi GIlles,
>>
>> As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why 
>> signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 
> 
> This should cure the issue (there was a check for XNTRAPSW missing):


No, the check is unconditional because, if this happens, it is a serious
bug which should be signalled to the user. This patch cures nothing.

-- 
                                                                Gilles.


* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-08 11:12 [Xenomai] SIGXCPU with rt_mutex_release Mariusz Janiak
  2013-01-08 13:33 ` Jan Kiszka
@ 2013-01-08 19:43 ` Gilles Chanteperdrix
  2013-01-08 21:06   ` Jan Kiszka
  1 sibling, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-08 19:43 UTC (permalink / raw)
  To: Mariusz Janiak; +Cc: Xenomai

On 01/08/2013 12:12 PM, Mariusz Janiak wrote:

> Hi GIlles,
> 
> As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why 
> signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 


I guess the test could be simpler, simply:

rt_mutex_acquire
rt_task_create
rt_mutex_release
rt_mutex_acquire
rt_mutex_release

Anyway, calling rt_task_create while holding a real-time mutex is itself
a priority inversion: any thread in primary mode waiting for the mutex
will now have to wait for a task running in secondary mode, so it may be
blocked for an unbounded amount of time. So using a real-time mutex
for this is completely useless; you should be using a glibc
pthread_mutex_t. If compiling for the posix skin, use
__real_pthread_mutex_lock.
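
A minimal sketch of that suggestion, in plain POSIX with no Xenomai calls: the setup-time lock is an ordinary pthread_mutex_t, held while the worker is created. The names setup_lock, worker and start_worker_under_lock are hypothetical. With the posix skin, the lock and unlock calls would have to go through the __real_ variants mentioned above to reach glibc.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Ordinary glibc mutex, used only around thread creation and setup. */
static pthread_mutex_t setup_lock = PTHREAD_MUTEX_INITIALIZER;
static int worker_started;

static void *worker(void *arg)
{
    int rc;

    (void)arg;
    /* Blocks until the creator has finished its setup and dropped the lock. */
    rc = pthread_mutex_lock(&setup_lock);
    assert(rc == 0);
    worker_started = 1;
    rc = pthread_mutex_unlock(&setup_lock);
    assert(rc == 0);
    (void)rc;
    return NULL;
}

/* Creates the worker while holding the setup lock, then waits for it. */
static int start_worker_under_lock(void)
{
    pthread_t tid;
    int rc, err;

    rc = pthread_mutex_lock(&setup_lock);
    assert(rc == 0);
    err = pthread_create(&tid, NULL, worker, NULL);
    rc = pthread_mutex_unlock(&setup_lock);
    assert(rc == 0);
    (void)rc;
    if (err != 0)
        return err;
    return pthread_join(tid, NULL);
}
```

Since nothing real-time ever waits on setup_lock, holding it across a call that runs in secondary mode cannot delay a primary-mode thread.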

Now, how this can cause the issue you observe remains to be understood,
and probably requires a fix.

-- 
                                                                Gilles.


* Re: [Xenomai] SIGXCPU with rt_mutex_release
@ 2013-01-08 20:51 Mariusz Janiak
  0 siblings, 0 replies; 17+ messages in thread
From: Mariusz Janiak @ 2013-01-08 20:51 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

> I guess the test could be simpler, simply:
> 
> rt_mutex_acquire
> rt_task_create
> rt_mutex_release
> rt_mutex_acquire
> rt_mutex_release

Yes, this is the minimal subset.

> Anyway, calling rt_task_create while holding a real-time mutex is itself
> a priority inversion: any thread in primary mode waiting for the mutex
> will now have to wait for task running in secondary mode, so may be
> block during an unbounded amount of time. So, using a real-time mutex
> for this is completely useless you should be using a glibc
> pthread_mutex_t. If compiling for the posix skin, use
> __real_pthread_mutex_lock.

I fully understand your point, but in the OROCOS framework this mutex is held only during task creation and environment setup. After that, when the real-time task does its job, it runs without being blocked from the secondary domain. Of course you are right that a standard mutex would be better in this case.

> Now, how this can cause the issue you observe remains to be understood,
> and probably requires a fix.

I would be grateful for your help in solving this problem. Until then I will use Jan's patch.

Mariusz

* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-08 19:43 ` Gilles Chanteperdrix
@ 2013-01-08 21:06   ` Jan Kiszka
  2013-01-08 21:09     ` Jan Kiszka
  2013-01-09 13:30     ` Jan Kiszka
  0 siblings, 2 replies; 17+ messages in thread
From: Jan Kiszka @ 2013-01-08 21:06 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 2013-01-08 20:43, Gilles Chanteperdrix wrote:
> On 01/08/2013 12:12 PM, Mariusz Janiak wrote:
> 
>> Hi GIlles,
>>
>> As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why 
>> signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 
> 
> 
> I guess the test could be simpler, simply:
> 
> rt_mutex_acquire
> rt_task_create
> rt_mutex_release
> rt_mutex_acquire
> rt_mutex_release
> 
> Anyway, calling rt_task_create while holding a real-time mutex is itself
> a priority inversion: any thread in primary mode waiting for the mutex
> will now have to wait for task running in secondary mode, so may be
> block during an unbounded amount of time. So, using a real-time mutex
> for this is completely useless you should be using a glibc
> pthread_mutex_t. If compiling for the posix skin, use
> __real_pthread_mutex_lock.
> 
> Now, how this can cause the issue you observe remains to be understood,
> and probably requires a fix.

OK, second try: We do not update the new owner's hrescnt if we acquire a
mutex via trylock. This applies both to rt_mutex_acquire_inner and
pthread_mutex_trylock. Probably, this should be done in the
corresponding syscall wrapper as both services are also used for the
in-kernel API.
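
The effect can be reproduced with a toy model of the counter. This is only a sketch of the bookkeeping and not nucleus code; the release branch mirrors the check in xnsynch_release_thread shown in the earlier diff, and every toy_ name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_thread {
    unsigned int rescnt; /* mutexes the bookkeeping believes are held */
    int signalled;       /* set when a release finds the counter at zero */
};

/* Blocking acquire: ownership is taken and counted. */
static void toy_lock_counted(struct toy_thread *t)
{
    assert(t != NULL);
    t->rescnt++;
}

/* Trylock as described above: ownership is taken, the counter is not touched. */
static void toy_trylock_uncounted(struct toy_thread *t)
{
    assert(t != NULL);
    (void)t;
}

/* Release: a zero counter is reported, otherwise it is decremented. */
static void toy_release(struct toy_thread *t)
{
    assert(t != NULL);
    if (t->rescnt == 0)
        t->signalled = 1;
    else
        t->rescnt--;
}

/* Lock/release, then a second acquire of either kind followed by a release. */
static int toy_run(int last_acquire_is_trylock)
{
    struct toy_thread t = { 0, 0 };

    toy_lock_counted(&t);
    toy_release(&t);
    if (last_acquire_is_trylock)
        toy_trylock_uncounted(&t);
    else
        toy_lock_counted(&t);
    toy_release(&t);
    return t.signalled;
}
```

In this model toy_run(1) reports the signal and toy_run(0) does not, which matches the observation that switching TM_NONBLOCK to TM_INFINITE makes the signal go away.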

Jan


* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-08 21:06   ` Jan Kiszka
@ 2013-01-08 21:09     ` Jan Kiszka
  2013-01-09 13:30     ` Jan Kiszka
  1 sibling, 0 replies; 17+ messages in thread
From: Jan Kiszka @ 2013-01-08 21:09 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 2013-01-08 22:06, Jan Kiszka wrote:
> On 2013-01-08 20:43, Gilles Chanteperdrix wrote:
>> On 01/08/2013 12:12 PM, Mariusz Janiak wrote:
>>
>>> Hi GIlles,
>>>
>>> As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why 
>>> signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 
>>
>>
>> I guess the test could be simpler, simply:
>>
>> rt_mutex_acquire
>> rt_task_create
>> rt_mutex_release
>> rt_mutex_acquire
>> rt_mutex_release
>>
>> Anyway, calling rt_task_create while holding a real-time mutex is itself
>> a priority inversion: any thread in primary mode waiting for the mutex
>> will now have to wait for task running in secondary mode, so may be
>> block during an unbounded amount of time. So, using a real-time mutex
>> for this is completely useless you should be using a glibc
>> pthread_mutex_t. If compiling for the posix skin, use
>> __real_pthread_mutex_lock.
>>
>> Now, how this can cause the issue you observe remains to be understood,
>> and probably requires a fix.
> 
> OK, second try: We do not update the new owner's hrescnt if we acquire a
> mutex via trylock. This applies both to rt_mutex_acquire_inner and
> pthread_mutex_trylock. Probably, this should be done in the
> corresponding syscall wrapper as both services are also used for the
> in-kernel API.
> 

BTW, signaling this inconsistent state via SIGDEBUG and no further
indication what went wrong and what is different to "normal"
SIGDEBUG_MIGRATE_PRIOINV is not a good idea. We should issue a kernel
log message or, better, use a different, specific reason code.
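
As a sketch of what a specific reason code could look like from the application side: the names and the value 1000 below are hypothetical placeholders, not Xenomai definitions. Only reason 4 and its text come from the log in the original report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum {
    TOY_REASON_PRIOINV = 4,             /* value seen in the original report */
    TOY_REASON_RESCNT_IMBALANCE = 1000  /* hypothetical value, illustration only */
};

/* Maps a reason code to a message so the two cases can be told apart. */
static const char *toy_reason_text(int reason)
{
    switch (reason) {
    case TOY_REASON_PRIOINV:
        return "affected by priority inversion";
    case TOY_REASON_RESCNT_IMBALANCE:
        return "resource count imbalance";
    default:
        return "unknown reason";
    }
}

/* Formats a report line in the style of the log in the original report. */
static void toy_format_report(char *buf, size_t len, int reason)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "SIGDEBUG received, reason %d: %s",
             reason, toy_reason_text(reason));
}
```

With a dedicated code, a report for the imbalance case no longer reads like an ordinary priority inversion.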

Jan


* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-08 21:06   ` Jan Kiszka
  2013-01-08 21:09     ` Jan Kiszka
@ 2013-01-09 13:30     ` Jan Kiszka
  2013-01-12 18:43       ` Gilles Chanteperdrix
  1 sibling, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2013-01-09 13:30 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 2013-01-08 22:06, Jan Kiszka wrote:
> On 2013-01-08 20:43, Gilles Chanteperdrix wrote:
>> On 01/08/2013 12:12 PM, Mariusz Janiak wrote:
>>
>>> Hi GIlles,
>>>
>>> As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why 
>>> signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 
>>
>>
>> I guess the test could be simpler, simply:
>>
>> rt_mutex_acquire
>> rt_task_create
>> rt_mutex_release
>> rt_mutex_acquire
>> rt_mutex_release
>>
>> Anyway, calling rt_task_create while holding a real-time mutex is itself
>> a priority inversion: any thread in primary mode waiting for the mutex
>> will now have to wait for task running in secondary mode, so may be
>> block during an unbounded amount of time. So, using a real-time mutex
>> for this is completely useless you should be using a glibc
>> pthread_mutex_t. If compiling for the posix skin, use
>> __real_pthread_mutex_lock.
>>
>> Now, how this can cause the issue you observe remains to be understood,
>> and probably requires a fix.
> 
> OK, second try: We do not update the new owner's hrescnt if we acquire a
> mutex via trylock. This applies both to rt_mutex_acquire_inner and
> pthread_mutex_trylock. Probably, this should be done in the
> corresponding syscall wrapper as both services are also used for the
> in-kernel API.

Here is the corresponding patch:
http://www.xenomai.org/pipermail/xenomai-git/2013-January/000336.html

Adding it to the inner functions turned out to be cleaner. I also queued
a patch to change the reason code when reporting a rescnt imbalance in
the future. Please review/merge.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux


* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-09 13:30     ` Jan Kiszka
@ 2013-01-12 18:43       ` Gilles Chanteperdrix
  2013-01-13 12:29         ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-12 18:43 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 01/09/2013 02:30 PM, Jan Kiszka wrote:

> On 2013-01-08 22:06, Jan Kiszka wrote:
>> On 2013-01-08 20:43, Gilles Chanteperdrix wrote:
>>> On 01/08/2013 12:12 PM, Mariusz Janiak wrote:
>>>
>>>> Hi GIlles,
>>>>
>>>> As you suggested, I have prepared simple test case that demonstrate how Xenomai is utilized by OROCOS. This test case behaves exactly the same like helloword example. Scheduler is chosen before any mutex are processed, so in my opinion it is not the case which you defined. What is really surprising is that the replacing TM_NONBLOCK with TM_INFINITE, in one before last line, do magic and suppress signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);' so why 
>>>> signal is generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when mutex is locked first time in the thread. 
>>>
>>>
>>> I guess the test could be simpler, simply:
>>>
>>> rt_mutex_acquire
>>> rt_task_create
>>> rt_mutex_release
>>> rt_mutex_acquire
>>> rt_mutex_release
>>>
>>> Anyway, calling rt_task_create while holding a real-time mutex is itself
>>> a priority inversion: any thread in primary mode waiting for the mutex
>>> will now have to wait for task running in secondary mode, so may be
>>> block during an unbounded amount of time. So, using a real-time mutex
>>> for this is completely useless you should be using a glibc
>>> pthread_mutex_t. If compiling for the posix skin, use
>>> __real_pthread_mutex_lock.
>>>
>>> Now, how this can cause the issue you observe remains to be understood,
>>> and probably requires a fix.
>>
>> OK, second try: We do not update the new owner's hrescnt if we acquire a
>> mutex via trylock. This applies both to rt_mutex_acquire_inner and
>> pthread_mutex_trylock. Probably, this should be done in the
>> corresponding syscall wrapper as both services are also used for the
>> in-kernel API.
> 
> Here is the corresponding patch:

> http://www.xenomai.org/pipermail/xenomai-git/2013-January/000336.html

Ok, so, if I understand correctly, the whole orocos testcase boils down to:
trylock
unlock

We should move the incrementation of the resource counter to
xnsynch_fast_acquire. We will be left with only two places to patch: the
native and posix trylock in the !FASTSYNCH case.
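
A toy model of that idea, assuming a single helper that both takes ownership and counts it, so that lock and trylock cannot diverge. This is not the real xnsynch_fast_acquire; every toy_ name is hypothetical and the model ignores atomicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_owner {
    unsigned int rescnt; /* resources counted for this thread */
    bool imbalance;      /* set if a release finds the counter at zero */
};

struct toy_lock {
    struct toy_owner *owner; /* NULL when the lock is free */
};

/* The one place where ownership is taken and counted. */
static bool toy_fast_acquire(struct toy_lock *l, struct toy_owner *t)
{
    assert(l != NULL && t != NULL);
    if (l->owner != NULL)
        return false;
    l->owner = t;
    t->rescnt++;
    return true;
}

/* Trylock is only a thin wrapper, so it inherits the accounting. */
static bool toy_trylock(struct toy_lock *l, struct toy_owner *t)
{
    return toy_fast_acquire(l, t);
}

/* Unlock: a zero counter is flagged, otherwise it is decremented. */
static void toy_unlock(struct toy_lock *l)
{
    struct toy_owner *t;

    assert(l != NULL);
    t = l->owner;
    if (t == NULL)
        return;
    if (t->rescnt == 0)
        t->imbalance = true;
    else
        t->rescnt--;
    l->owner = NULL;
}
```

In this model a trylock followed by an unlock leaves rescnt at zero with no imbalance flagged, which is the behaviour the two-line test case expects.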

-- 
                                                                Gilles.


* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-12 18:43       ` Gilles Chanteperdrix
@ 2013-01-13 12:29         ` Jan Kiszka
  2013-01-13 12:35           ` Jan Kiszka
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2013-01-13 12:29 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 2013-01-12 19:43, Gilles Chanteperdrix wrote:
> On 01/09/2013 02:30 PM, Jan Kiszka wrote:
> 
>> On 2013-01-08 22:06, Jan Kiszka wrote:
>>> On 2013-01-08 20:43, Gilles Chanteperdrix wrote:
>>>> On 01/08/2013 12:12 PM, Mariusz Janiak wrote:
>>>>
>>>>> Hi Gilles,
>>>>>
>>>>> As you suggested, I have prepared a simple test case that demonstrates how Xenomai is used by OROCOS. This test case behaves exactly like the helloworld example. The scheduler is chosen before any mutexes are processed, so in my opinion it is not the case you described. What is really surprising is that replacing TM_NONBLOCK with TM_INFINITE in the second-to-last line works like magic and suppresses signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);', so why is the signal generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when a mutex is locked for the first time in the thread.
>>>>
>>>>
>>>> I guess the test could be simpler, simply:
>>>>
>>>> rt_mutex_acquire
>>>> rt_task_create
>>>> rt_mutex_release
>>>> rt_mutex_acquire
>>>> rt_mutex_release
>>>>
>>>> Anyway, calling rt_task_create while holding a real-time mutex is itself
>>>> a priority inversion: any thread in primary mode waiting for the mutex
>>>> will now have to wait for a task running in secondary mode, so it may be
>>>> blocked for an unbounded amount of time. So, using a real-time mutex
>>>> for this is completely useless; you should be using a glibc
>>>> pthread_mutex_t. If compiling for the posix skin, use
>>>> __real_pthread_mutex_lock.
>>>>
>>>> Now, how this can cause the issue you observe remains to be understood,
>>>> and probably requires a fix.
>>>
>>> OK, second try: We do not update the new owner's hrescnt if we acquire a
>>> mutex via trylock. This applies both to rt_mutex_acquire_inner and
>>> pthread_mutex_trylock. Probably, this should be done in the
>>> corresponding syscall wrapper as both services are also used for the
>>> in-kernel API.
>>
>> Here is the corresponding patch:
> 
>> http://www.xenomai.org/pipermail/xenomai-git/2013-January/000336.html
> 
> Ok, so, if I understand correctly, the whole orocos testcase boils down to:
> trylock
> unlock
> 
> We should move the incrementation of the resource counter to
> xnsynch_fast_acquire. We will be left with only two places to patch: the
> native and posix trylock in the !FASTSYNCH case.

xnsynch_fast_acquire is shared with user space code and therefore
references no kernel types.
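
As a rough sketch of what such a shared fast path can look like, the
following C11 code acquires a lock with a single compare-and-swap on an
atomic word. It is an assumption-laden illustration, not the Xenomai
implementation: toy_fast_acquire, TOY_NO_HANDLE and the return-value
convention are hypothetical. The point is that the function only sees an
atomic word and an integer handle, so it has no thread structure whose
resource counter it could update.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

#define TOY_NO_HANDLE 0UL   /* hypothetical "unowned" marker */

/*
 * Try to install new_owner in the lock word. Only plain integers and an
 * atomic word are involved, so the same source could be built for user
 * space and for the kernel.
 *
 * Returns 0 on success, -EBUSY if new_owner already owns the lock,
 * and -EAGAIN if another owner holds it.
 */
static int toy_fast_acquire(atomic_ulong *fastlock, unsigned long new_owner)
{
    unsigned long expected = TOY_NO_HANDLE;

    assert(new_owner != TOY_NO_HANDLE);

    if (atomic_compare_exchange_strong(fastlock, &expected, new_owner))
        return 0;

    /* On failure, `expected` holds the current owner's handle. */
    return expected == new_owner ? -EBUSY : -EAGAIN;
}
```

Any per-thread bookkeeping therefore has to happen in the callers that
know which thread structure the handle belongs to.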

Jan




* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-13 12:29         ` Jan Kiszka
@ 2013-01-13 12:35           ` Jan Kiszka
  2013-01-13 12:52             ` Gilles Chanteperdrix
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Kiszka @ 2013-01-13 12:35 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: Xenomai

On 2013-01-13 13:29, Jan Kiszka wrote:
> On 2013-01-12 19:43, Gilles Chanteperdrix wrote:
>> On 01/09/2013 02:30 PM, Jan Kiszka wrote:
>>
>>> On 2013-01-08 22:06, Jan Kiszka wrote:
>>>> On 2013-01-08 20:43, Gilles Chanteperdrix wrote:
>>>>> On 01/08/2013 12:12 PM, Mariusz Janiak wrote:
>>>>>
>>>>>> Hi Gilles,
>>>>>>
>>>>>> As you suggested, I have prepared a simple test case that demonstrates how Xenomai is used by OROCOS. This test case behaves exactly like the helloworld example. The scheduler is chosen before any mutexes are processed, so in my opinion it is not the case you described. What is really surprising is that replacing TM_NONBLOCK with TM_INFINITE in the second-to-last line works like magic and suppresses signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);', so why is the signal generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when a mutex is locked for the first time in the thread.
>>>>>
>>>>>
>>>>> I guess the test could be simpler, simply:
>>>>>
>>>>> rt_mutex_acquire
>>>>> rt_task_create
>>>>> rt_mutex_release
>>>>> rt_mutex_acquire
>>>>> rt_mutex_release
>>>>>
>>>>> Anyway, calling rt_task_create while holding a real-time mutex is itself
>>>>> a priority inversion: any thread in primary mode waiting for the mutex
>>>>> will now have to wait for a task running in secondary mode, so it may be
>>>>> blocked for an unbounded amount of time. So, using a real-time mutex
>>>>> for this is completely useless; you should be using a glibc
>>>>> pthread_mutex_t. If compiling for the posix skin, use
>>>>> __real_pthread_mutex_lock.
>>>>>
>>>>> Now, how this can cause the issue you observe remains to be understood,
>>>>> and probably requires a fix.
>>>>
>>>> OK, second try: We do not update the new owner's hrescnt if we acquire a
>>>> mutex via trylock. This applies both to rt_mutex_acquire_inner and
>>>> pthread_mutex_trylock. Probably, this should be done in the
>>>> corresponding syscall wrapper as both services are also used for the
>>>> in-kernel API.
>>>
>>> Here is the corresponding patch:
>>
>>> http://www.xenomai.org/pipermail/xenomai-git/2013-January/000336.html
>>
>> Ok, so, if I understand correctly, the whole orocos testcase boils down to:
>> trylock
>> unlock
>>
>> We should move the incrementation of the resource counter to
>> xnsynch_fast_acquire. We will be left with only two places to patch: the
>> native and posix trylock in the !FASTSYNCH case.
> 
> xnsynch_fast_acquire is shared with user space code and therefore
> references no kernel types.

...and there are more spots than those after successful
xnsynch_fast_acquire.

Jan




* Re: [Xenomai] SIGXCPU with rt_mutex_release
  2013-01-13 12:35           ` Jan Kiszka
@ 2013-01-13 12:52             ` Gilles Chanteperdrix
  0 siblings, 0 replies; 17+ messages in thread
From: Gilles Chanteperdrix @ 2013-01-13 12:52 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 01/13/2013 01:35 PM, Jan Kiszka wrote:

> On 2013-01-13 13:29, Jan Kiszka wrote:
>> On 2013-01-12 19:43, Gilles Chanteperdrix wrote:
>>> On 01/09/2013 02:30 PM, Jan Kiszka wrote:
>>>
>>>> On 2013-01-08 22:06, Jan Kiszka wrote:
>>>>> On 2013-01-08 20:43, Gilles Chanteperdrix wrote:
>>>>>> On 01/08/2013 12:12 PM, Mariusz Janiak wrote:
>>>>>>
>>>>>>> Hi Gilles,
>>>>>>>
>>>>>>> As you suggested, I have prepared a simple test case that demonstrates how Xenomai is used by OROCOS. This test case behaves exactly like the helloworld example. The scheduler is chosen before any mutexes are processed, so in my opinion it is not the case you described. What is really surprising is that replacing TM_NONBLOCK with TM_INFINITE in the second-to-last line works like magic and suppresses signal generation. Furthermore, there is no call to 'rt_task_set_mode(0, T_WARNSW, NULL);', so why is the signal generated? If we enable T_WARNSW in the thread, SIGXCPU is generated when a mutex is locked for the first time in the thread.
>>>>>>
>>>>>>
>>>>>> I guess the test could be simpler, simply:
>>>>>>
>>>>>> rt_mutex_acquire
>>>>>> rt_task_create
>>>>>> rt_mutex_release
>>>>>> rt_mutex_acquire
>>>>>> rt_mutex_release
>>>>>>
>>>>>> Anyway, calling rt_task_create while holding a real-time mutex is itself
>>>>>> a priority inversion: any thread in primary mode waiting for the mutex
>>>>>> will now have to wait for a task running in secondary mode, so it may be
>>>>>> blocked for an unbounded amount of time. So, using a real-time mutex
>>>>>> for this is completely useless; you should be using a glibc
>>>>>> pthread_mutex_t. If compiling for the posix skin, use
>>>>>> __real_pthread_mutex_lock.
>>>>>>
>>>>>> Now, how this can cause the issue you observe remains to be understood,
>>>>>> and probably requires a fix.
>>>>>
>>>>> OK, second try: We do not update the new owner's hrescnt if we acquire a
>>>>> mutex via trylock. This applies both to rt_mutex_acquire_inner and
>>>>> pthread_mutex_trylock. Probably, this should be done in the
>>>>> corresponding syscall wrapper as both services are also used for the
>>>>> in-kernel API.
>>>>
>>>> Here is the corresponding patch:
>>>
>>>> http://www.xenomai.org/pipermail/xenomai-git/2013-January/000336.html
>>>
>>> Ok, so, if I understand correctly, the whole orocos testcase boils down to:
>>> trylock
>>> unlock
>>>
>>> We should move the incrementation of the resource counter to
>>> xnsynch_fast_acquire. We will be left with only two places to patch: the
>>> native and posix trylock in the !FASTSYNCH case.
>>
>> xnsynch_fast_acquire is shared with user space code and therefore
>> references no kernel types.
> 
> ...and there are more spots than those after successful
> xnsynch_fast_acquire.


Ok, merged your patch, but removed the resource counter incrementation
when re-locking a recursive mutex.
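
The reasoning behind that change can be sketched with a toy recursive
mutex in C. This is not the merged patch: toy_thread, toy_rmutex,
toy_rlock and toy_runlock are hypothetical names, and the model assumes
that the resource counter is decremented only when the last nesting
level is released. Under that assumption, counting each re-lock would
leave the counter too high after the final unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not Xenomai types. */
struct toy_thread { int hrescnt; };     /* distinct resources held */
struct toy_rmutex {
    struct toy_thread *owner;
    int lockcnt;                        /* recursion depth */
};

/* Acquire without waiting; re-locking by the owner only nests. */
static bool toy_rlock(struct toy_rmutex *m, struct toy_thread *t)
{
    if (m->owner == t) {
        m->lockcnt++;                   /* re-lock: no counter update */
        return true;
    }
    if (m->owner != NULL)
        return false;                   /* held by someone else */
    m->owner = t;
    m->lockcnt = 1;
    t->hrescnt++;                       /* first acquisition counts once */
    return true;
}

/* Release one nesting level; the counter drops on the last one only. */
static void toy_runlock(struct toy_rmutex *m, struct toy_thread *t)
{
    assert(m->owner == t && m->lockcnt > 0);
    if (--m->lockcnt == 0) {
        m->owner = NULL;
        t->hrescnt--;
    }
}
```

Locking the same toy mutex twice leaves hrescnt at 1, and it returns to 0
only after the second toy_runlock call.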

-- 
                                                                Gilles.



Thread overview: 17+ messages
2013-01-08 11:12 [Xenomai] SIGXCPU with rt_mutex_release Mariusz Janiak
2013-01-08 13:33 ` Jan Kiszka
2013-01-08 19:40   ` Gilles Chanteperdrix
2013-01-08 19:43 ` Gilles Chanteperdrix
2013-01-08 21:06   ` Jan Kiszka
2013-01-08 21:09     ` Jan Kiszka
2013-01-09 13:30     ` Jan Kiszka
2013-01-12 18:43       ` Gilles Chanteperdrix
2013-01-13 12:29         ` Jan Kiszka
2013-01-13 12:35           ` Jan Kiszka
2013-01-13 12:52             ` Gilles Chanteperdrix
  -- strict thread matches above, loose matches on Subject: below --
2013-01-08 20:51 Mariusz Janiak
2013-01-08 17:42 Mariusz Janiak
2012-12-31 17:32 Mariusz Janiak
2012-12-31 17:41 ` Gilles Chanteperdrix
2012-12-31 17:52   ` Peter Soetens
2012-12-31 17:55     ` Gilles Chanteperdrix
