* [Xenomai-help] Issue with Auto relax and nested mutexes
@ 2012-01-09 20:50 Makarand Pradhan
  2012-01-10 15:11 ` Philippe Gerum
  0 siblings, 1 reply; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-09 20:50 UTC (permalink / raw)
  To: xenomai@xenomai.org

[-- Attachment #1: Type: text/plain, Size: 1453 bytes --]

Hi,

I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.

I am noticing an issue with the auto-relax feature when using mutexes.
I am using nested mutexes. The code is attached to this email.

The problem is that I am not relaxing after an RT thread grabs and
releases a mutex. On further investigation, I found that the rescnt
is not going down to 0. Another observation is that I do not hit
rt_mutex_release in the kernel in the problem scenario, I believe when
the thread undergoes a priority inversion. This may be a problem, as the
rescnt would not get decremented. I am not sure how the mutex is being
released without hitting rt_mutex_release. Am I missing anything?

If both tasks run at priority 0, I stay in the secondary domain,
rt_mutex_release is invoked as expected, and the rescnt goes down to
0 once all the mutexes are released.

Has anyone faced this problem?

Rgds,
Makarand





-- 
___________________________________________________________________________
NOTICE OF CONFIDENTIALITY:
This e-mail and any attachments may contain confidential and privileged information.  If you are
not the intended recipient, please notify the sender immediately by return e-mail and delete this
e-mail and any copies.  Any dissemination or use of this information by a person other than the
intended recipient is unauthorized and may be illegal.
_____________________________________________________________________

  


[-- Attachment #2: prio.c --]
[-- Type: text/x-csrc, Size: 2123 bytes --]

#include <xenomai/asm-generic/bits/current.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <stdio.h>
#include <native/task.h>
#include <native/mutex.h>
#include <nucleus/thread.h>
#include <rtdk.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdbool.h>	/* for 'true' used in the task loops */

RT_TASK task0, task1, task2; 
RT_MUTEX mux1, mux2, mux3, mux4;
char buff[8192];

void task1_func(void *arg)
{
	RT_TASK_INFO info;
	int current_in_primary = 0;

	while(true)
	{
		rt_task_inquire(&task1, &info);
		current_in_primary = !(xeno_get_current_mode() & XNRELAX);
		rt_printf("bP: %i, cp: %i, mode: %i\n", 
			info.bprio, info.cprio, current_in_primary);
		rt_mutex_acquire(&mux1, TM_INFINITE);
		rt_mutex_acquire(&mux2, TM_INFINITE);
		rt_mutex_acquire(&mux3, TM_INFINITE);
		rt_printf("Acquire complete\n");

		rt_task_sleep(2222222222LL);
		rt_mutex_release(&mux3);
		rt_mutex_release(&mux2);
		rt_mutex_release(&mux1);
		rt_printf("Release complete\n");
	}
}

void task2_func(void *arg)
{
	int counter = 0;
	while(true)
	{
		if (2 == counter)
		{
			rt_printf("Grabbing mux in HP\n");
			rt_mutex_acquire(&mux1, TM_INFINITE);
			rt_mutex_acquire(&mux2, TM_INFINITE);
			rt_mutex_acquire(&mux3, TM_INFINITE);
			rt_printf("Mux held by Task2\n");
		}
		rt_mutex_acquire(&mux4, TM_INFINITE);
		rt_task_sleep(4444444444LL);
		if (2 == counter)
		{
			rt_mutex_release(&mux3);
			rt_mutex_release(&mux2);
			rt_mutex_release(&mux1);
		}
		rt_mutex_release(&mux4);
		counter++;
	}
}

int main(int argc, char **argv)
{
	int prio1, prio2;

	/* Both priorities are required: argv[1] and argv[2] are read below. */
	if (argc < 3)
	{
		printf("Usage: %s prio1 prio2\n", argv[0]);
		exit(1);
	}

	prio1 = atoi(argv[1]);
	prio2 = atoi(argv[2]);

	mlockall(MCL_CURRENT|MCL_FUTURE);

	rt_print_auto_init(1);

	rt_task_shadow(&task0, "Task 0", 10, 0);

	rt_mutex_create(&mux1, "test_mux1");
	rt_mutex_create(&mux2, "test_mux2");
	rt_mutex_create(&mux3, "test_mux3");
	rt_mutex_create(&mux4, "test_mux4");

	printf("Spawning: tasks\n");
	rt_task_spawn(&task1, "TTTTTTTTTTT", 0, prio1, 0, task1_func, NULL);
	rt_task_spawn(&task2, "TASK2", 0, prio2, 0, task2_func, NULL);

	while (1) {
		rt_task_sleep(5000000LL);
	}
}


* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-09 20:50 [Xenomai-help] Issue with Auto relax and nested mutexes Makarand Pradhan
@ 2012-01-10 15:11 ` Philippe Gerum
  2012-01-10 15:26   ` Makarand Pradhan
  0 siblings, 1 reply; 23+ messages in thread
From: Philippe Gerum @ 2012-01-10 15:11 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai@xenomai.org

On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
> Hi,
>
> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>
> I am noticing an issue while using the auto relax feature related to
> mutexes. I am using nested mutexes. The code is attached to this email.
>
> The problem is that I am not relaxing after an RT thread grabs and
> releases a mutex. On further investigation, I found that the rescnt
> is not going down to 0.

From your code, task1 would auto-relax only if started with priority 0,
which is what I get here:

-bash-3.2# ./relax 0 1
Spawning: tasks
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
...

Conversely, I get the right behavior when setting a non-zero priority
for task1:

-bash-3.2# ./relax 1 0
Spawning: tasks
bP: 1, cp: 1, mode: 1
Acquire complete
Release complete
bP: 1, cp: 1, mode: 1
Acquire complete
Release complete
bP: 1, cp: 1, mode: 1
Acquire complete
...

In any case, the priority of task2 should have no impact on the result.

I'm running current 2.6 HEAD commit (168da46de), kernel 3.1.5/powerpc32 
(52xx), pipeline 2.13-06.

Which priority arguments are you passing to your test program?

> Another observation is that I do not hit
> rt_mutex_release in the kernel in the problem scenario, I believe when
> the thread undergoes a priority inversion. This may be a problem as the
> rescnt would not get decremented. Not sure how the mutex is releasing
> without hitting rt_mutex_release or am I missing anything?
>

These are fast mutexes; the thread does not have to jump to kernel space
unless the released mutex was actually contended.
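
To make the fast-path idea concrete, here is a minimal sketch in portable C11. It is not Xenomai's actual code: it assumes a hypothetical lock word that holds the owner's handle plus a "contended" bit, and the names fast_mutex, FAST_CONTENDED, fast_release and slow_release are invented for illustration. The release completes in user space with a single compare-and-swap when nobody is waiting, and falls back to a simulated kernel path otherwise.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word layout: 0 = free, otherwise the owner's handle.
 * The top bit is set when at least one thread is waiting (contended). */
#define FAST_CONTENDED 0x80000000u

struct fast_mutex {
	_Atomic uint32_t word;
	int slow_releases;	/* how often the slow path was taken */
};

/* Stand-in for the system call that would hand the lock to a waiter. */
static void slow_release(struct fast_mutex *m)
{
	m->slow_releases++;
	atomic_store(&m->word, 0);
}

/* Returns true when the release completed without the slow path. */
static bool fast_release(struct fast_mutex *m, uint32_t self)
{
	uint32_t expected = self;

	/* Fast path: the word holds exactly our handle, so nobody waits. */
	if (atomic_compare_exchange_strong(&m->word, &expected, 0))
		return true;

	/* The contended bit is set: defer to the (simulated) kernel. */
	slow_release(m);
	return false;
}
```

With this model, releasing an uncontended mutex never touches slow_release, which would be consistent with not seeing rt_mutex_release in a kernel trace for an rt thread.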

> If both tasks run at priority 0, I stay in the secondary
> domain, rt_mutex_release is invoked as expected, and the rescnt goes down to
> 0 once all the mutexes are released.
>
> Has anyone faced this problem?
>

I'm not sure there is a problem yet. Auto-relax applies to non-rt Xenomai
threads only (i.e. prio == 0).



-- 
Philippe.



* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:11 ` Philippe Gerum
@ 2012-01-10 15:26   ` Makarand Pradhan
  2012-01-10 15:38     ` Philippe Gerum
  2012-01-10 15:40     ` Makarand Pradhan
  0 siblings, 2 replies; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-10 15:26 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

Hi Philippe,

You are right. Task 1 needs to be started with prio 0. I start seeing
the problem after task2 grabs the mutexes and releases them. The first
task never jumps back to secondary mode. Here is my output. The mode never
goes back to 0 after "Grabbing mux in HP", and the rescnt stays stuck at
1 in the kernel.

root@domain.hid:~# ./relax 0 1
Spawning: tasks
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Grabbing mux in HP
Mux held by Task2
Release complete
bP: 0, cp: 0, mode: 1
Acquire complete
Release complete
bP: 0, cp: 0, mode: 1
Acquire complete

Rgds,
Mak.





  




* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:26   ` Makarand Pradhan
@ 2012-01-10 15:38     ` Philippe Gerum
  2012-01-10 15:40     ` Makarand Pradhan
  1 sibling, 0 replies; 23+ messages in thread
From: Philippe Gerum @ 2012-01-10 15:38 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai@xenomai.org

On 01/10/2012 04:26 PM, Makarand Pradhan wrote:
> Hi Philippe,
>
> You are right. Task 1 needs to be started with prio 0. I start seeing
> the problem after task2 grabs the mutexes and releases them. The first
> task never jumps back to secondary mode. Here is my output. The mode never
> goes back to 0 after "Grabbing mux in HP", and the rescnt stays stuck at
> 1 in the kernel.

Eeek. Ok, I definitely need to reproduce this. I'm slightly buried alive 
in Xenomai 3 stuff right now, but I'll have a look at this asap.



-- 
Philippe.



* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:26   ` Makarand Pradhan
  2012-01-10 15:38     ` Philippe Gerum
@ 2012-01-10 15:40     ` Makarand Pradhan
  2012-01-10 15:40       ` Philippe Gerum
  1 sibling, 1 reply; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-10 15:40 UTC (permalink / raw)
  To: xenomai

Another point:

"These are fast mutexes, the thread does not have to jump to kernel space
unless the released mutex was actually contended."

When the first task is started with prio 0, I always see rt_mutex_release invoked in the kernel, even when there is no contention.

I have an instrumented kernel. The kernel trace is given below. In this trace, only task1 is running at prio 0. It should be easy to follow:

Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
Jan 10 10:37:01 ruggedcom kernel: RML
Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
Jan 10 10:37:01 ruggedcom kernel: RML
Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
Jan 10 10:37:01 ruggedcom kernel: RML
Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
Jan 10 10:37:03 ruggedcom kernel: RML
Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
Jan 10 10:37:03 ruggedcom kernel: RML
Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
Jan 10 10:37:03 ruggedcom kernel: RML
Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0


root@domain.hid:~# ./a.out 0 1
Spawning: tasks
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
^C


Rgds,
Mak.






  




* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:40     ` Makarand Pradhan
@ 2012-01-10 15:40       ` Philippe Gerum
  2012-01-10 15:42         ` Philippe Gerum
  0 siblings, 1 reply; 23+ messages in thread
From: Philippe Gerum @ 2012-01-10 15:40 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai

On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
> Another point:
>
> "These are fast mutexes, the thread does not have to jump to kernel space
> unless the released mutex was actually contended."
>
> When the first task is started with prio 0, I always see
> rt_mutex_release invoked in the kernel, even when there is no
> contention.

I should have added: "unless there is no contention ... or the caller is 
a non-rt thread". This is because we have to jump to kernel space to 
track rescnt.
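
As a rough illustration of why the resource count matters, the sketch below models the bookkeeping in plain C. It is a simplified model built on assumptions, not the nucleus code: the struct model_thread, its fields, and the "tracked" flag are invented here. It assumes that acquiring a mutex moves a non-rt thread into primary mode and that the thread relaxes again only when the count returns to zero.

```c
#include <stdbool.h>

/* Hypothetical per-thread state; the field names are illustrative only. */
struct model_thread {
	int rescnt;		/* number of resources (mutexes) held */
	bool in_primary;	/* true = primary mode, false = relaxed */
};

/* Acquiring a resource is assumed to switch the thread to primary mode. */
static void model_acquire(struct model_thread *t)
{
	t->rescnt++;
	t->in_primary = true;
}

/* tracked == false models a release that the accounting never sees,
 * for example one that completed entirely in user space. */
static void model_release(struct model_thread *t, bool tracked)
{
	if (tracked && t->rescnt > 0)
		t->rescnt--;
	if (t->rescnt == 0)
		t->in_primary = false;	/* auto-relax */
}
```

In this model, three acquires followed by three tracked releases bring rescnt back to 0 and the thread relaxes. If one of the releases is untracked, rescnt stays at 1 and the thread remains in primary mode, which matches the shape of the symptom reported earlier in the thread. Whether that is the actual cause is not established here.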



-- 
Philippe.



* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:40       ` Philippe Gerum
@ 2012-01-10 15:42         ` Philippe Gerum
  2012-01-10 15:51           ` Makarand Pradhan
  0 siblings, 1 reply; 23+ messages in thread
From: Philippe Gerum @ 2012-01-10 15:42 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai

On 01/10/2012 04:40 PM, Philippe Gerum wrote:
> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>> Another point:
>>
>> "These are fast mutexes, the thread does not have to jump to kernel space
>> unless the released mutex was actually contended."
>>
>> When the first task is started with prio 0, I always see
>> rt_mutex_release invoked in the kernel, even when there is no
>> contention.
>
> I should have added: "unless there is no contention ... or the caller is
> a non-rt thread". This is because we have to jump to kernel space to
> track rescnt.
>

Ok, next try: "unless the mutex was contended ... or the caller is
a non-rt thread".
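
Spelled out as code, the corrected rule might look like the following. This is only a restatement of the sentence above under the assumption, taken from earlier in this thread, that "non-rt" means a base priority of 0; the function name release_needs_syscall is made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: must a mutex release enter the kernel?
 * Assumes a non-rt Xenomai thread is one with base priority 0. */
static bool release_needs_syscall(bool contended, int base_prio)
{
	bool nonrt = (base_prio == 0);

	/* Contended: a waiter has to be woken up by the kernel.
	 * Non-rt caller: the kernel has to keep rescnt up to date. */
	return contended || nonrt;
}
```

Only the combination "uncontended and rt caller" stays on the user-space fast path.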

>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>> Hi Philippe,
>>>
>>> You are right. Task 1 needs to be started with prio 0. I start seeing
>>> the problem after task2 grabs the mutexes and releases them. The first
>>> task never jumps back to secondary. Here is my output. The mode never
>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays stuck at
>>> 1 in the kernel.
>>>
>>> root@domain.hid:~# ./relax 0 1
>>> Spawning: tasks
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Grabbing mux in HP
>>> Mux held by Task2
>>> Release complete
>>> bP: 0, cp: 0, mode: 1
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 1
>>> Acquire complete
>>>
>>> Rgds,
>>> Mak.
>>>
>>>
>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>> Hi,
>>>>>
>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>
>>>>> I am noticing an issue while using the auto relax feature related to
>>>>> mutexes. I am using nested mutexes. The code is attached to this
>>>>> email.
>>>>>
>>>>> The problem is that I am not relaxing after a RT thread grabs and
>>>>> releases a mutex. On further investigation, it was noted that the
>>>>> rescnt
>>>>> is not going down to 0.
>>>> From your code, task1 would auto-relax only if started with priority 0,
>>>> which is what I get here:
>>>>
>>>> -bash-3.2# ./relax 0 1
>>>> Spawning: tasks
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Release complete
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Release complete
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Release complete
>>>> ...
>>>>
>>>> Conversely, I get the right behavior if setting a non-zero priority to
>>>> task1:
>>>>
>>>> -bash-3.2# ./relax 1 0
>>>> Spawning: tasks
>>>> bP: 1, cp: 1, mode: 1
>>>> Acquire complete
>>>> Release complete
>>>> bP: 1, cp: 1, mode: 1
>>>> Acquire complete
>>>> Release complete
>>>> bP: 1, cp: 1, mode: 1
>>>> Acquire complete
>>>> ...
>>>>
>>>> In any case, the priority of task2 should have no impact on the result.
>>>>
>>>> I'm running current 2.6 HEAD commit (168da46de), kernel 3.1.5/powerpc32
>>>> (52xx), pipeline 2.13-06.
>>>>
>>>> Which priority arguments are you passing to your test program?
>>>>
>>>>> Another observation is that I do not hit
>>>>> rt_mutex_release in the kernel in the problem scenario, I believe when
>>>>> the thread undergoes a priority inversion. This may be a problem as the
>>>>> rescnt would not get decremented. Not sure how the mutex is releasing
>>>>> without hitting rt_mutex_release or am I missing anything?
>>>>>
>>>> These are fast mutexes, the thread does not have to jump to kernel
>>>> space
>>>> unless the released mutex was actually contended.
>>>>
>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>> secondary
>>>>> domain, rt_mutex_release is invoked as expected, the rescnt goes
>>>>> down to
>>>>> 0 when all the mutexes are released.
>>>>>
>>>>> Has anyone faced this problem?
>>>>>
>>>> I'm unsure there is any yet. Auto-relax applies to non-rt Xenomai
>>>> threads only (i.e. prio == 0).
>>>>
>>>>> Rgds,
>>>>> Makarand


-- 
Philippe.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:42         ` Philippe Gerum
@ 2012-01-10 15:51           ` Makarand Pradhan
  2012-01-10 17:51             ` Philippe Gerum
  2012-01-10 18:08             ` Philippe Gerum
  0 siblings, 2 replies; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-10 15:51 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

Based on my testing, the rescnt is not decremented when task1 gets a
priority boost and starts running with priority 1.

That would imply that we may be checking the current priority when
deciding whether to invoke rt_mutex_release in the kernel. I will try
to check it out.
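
To make the suspicion concrete, here is a toy model of it in C. It is
only a sketch under a stated assumption: that both the tracking on
acquire and the untracking on release are keyed on the current
priority being 0. The struct, field and function names are
hypothetical and do not come from the Xenomai sources.

```c
#include <assert.h>

/* Hypothetical, heavily simplified model of one thread; not Xenomai code. */
struct model_thread {
	int bprio;   /* base priority */
	int cprio;   /* current, possibly boosted, priority */
	int rescnt;  /* resources tracked on behalf of a non-rt thread */
};

/* Acquire: the resource is tracked only while the thread looks non-rt. */
static void model_acquire(struct model_thread *t)
{
	if (t->cprio == 0)
		t->rescnt++;
}

/* Release: assumed to untrack only if the thread still looks non-rt. */
static void model_release(struct model_thread *t)
{
	if (t->cprio == 0) {
		assert(t->rescnt > 0);
		t->rescnt--;
	}
}

/* No boost between acquire and release: rescnt returns to 0. */
static int model_plain_release(void)
{
	struct model_thread t = { .bprio = 0, .cprio = 0, .rescnt = 0 };

	model_acquire(&t);
	model_release(&t);
	return t.rescnt;
}

/* Boosted to prio 1 while holding the mutex: the decrement is skipped. */
static int model_boosted_release(void)
{
	struct model_thread t = { .bprio = 0, .cprio = 0, .rescnt = 0 };

	model_acquire(&t);   /* rescnt is now 1 */
	t.cprio = 1;         /* boost while the mutex is held */
	model_release(&t);   /* cprio != 0, so rescnt is left at 1 */
	t.cprio = t.bprio;   /* boost dropped after the release */
	return t.rescnt;
}
```

In this model the boosted case leaves rescnt at 1, which is the same
symptom as the rescnt staying stuck at 1 in my earlier output.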

Rgds,
Mak.






^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:51           ` Makarand Pradhan
@ 2012-01-10 17:51             ` Philippe Gerum
  2012-01-10 18:08             ` Philippe Gerum
  1 sibling, 0 replies; 23+ messages in thread
From: Philippe Gerum @ 2012-01-10 17:51 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai@xenomai.org

On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
> Based on my testing, it is noted that the rescnt is not released when
> task1 gets a priority boost and starts running with priority 1. That's
> when the rescnt is not decremented.
>

Ah, this is likely a bad side effect of __xnsched_rt_setparam. I need
to have a deeper look, but this would make sense.

> It would imply that we may be checking the current priority while
> testing if we want to invoke rt_mutex_release in kernel. Will try to
> check it out.
>
> Rgds,
> Mak.
>


-- 
Philippe.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 15:51           ` Makarand Pradhan
  2012-01-10 17:51             ` Philippe Gerum
@ 2012-01-10 18:08             ` Philippe Gerum
  2012-01-10 18:39               ` Makarand Pradhan
  1 sibling, 1 reply; 23+ messages in thread
From: Philippe Gerum @ 2012-01-10 18:08 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai@xenomai.org

On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
> Based on my testing, it is noted that the rescnt is not released when 
> task1 gets a priority boost and starts running with priority 1. That's 
> when the rescnt is not decremented.
> 
> It would imply that we may be checking the current priority while 
> testing if we want to invoke rt_mutex_release in kernel. Will try to 
> check it out.

Does this help in your case?

diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
index cc1cefa..6ac8fd7 100644
--- a/include/nucleus/sched-rt.h
+++ b/include/nucleus/sched-rt.h
@@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct xnthread *thread,
 {
 	thread->cprio = p->rt.prio;
 	if (xnthread_test_state(thread, XNSHADOW)) {
-		if (thread->cprio)
+		if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
 			xnthread_clear_state(thread, XNOTHER);
 		else
 			xnthread_set_state(thread, XNOTHER);
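
For the boosted case reported above, the two tests can be compared
with a small stand-alone model. This is a sketch, not the nucleus
code: the flag values and the model_thread type are hypothetical, the
XNSHADOW check is left out (the thread is assumed to be a shadow), and
XNBOOST is assumed to be already set when the parameter change runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for the model; not the nucleus definitions. */
#define M_XNBOOST 0x1u
#define M_XNOTHER 0x2u

struct model_thread {
	int bprio;       /* base priority */
	int cprio;       /* current priority */
	unsigned state;  /* M_XNBOOST | M_XNOTHER bits */
};

/* Test before the patch: keyed on the current priority. */
static void setparam_before(struct model_thread *t, int prio)
{
	t->cprio = prio;
	if (t->cprio)
		t->state &= ~M_XNOTHER;
	else
		t->state |= M_XNOTHER;
}

/* Test as posted in the patch: keyed on base priority and boost state. */
static void setparam_patched(struct model_thread *t, int prio)
{
	t->cprio = prio;
	if (t->bprio || !(t->state & M_XNBOOST))
		t->state &= ~M_XNOTHER;
	else
		t->state |= M_XNOTHER;
}

/* A base prio 0 thread boosted to prio 1: is it still flagged XNOTHER? */
static bool boosted_keeps_xnother(void (*setparam)(struct model_thread *, int))
{
	struct model_thread t = {
		.bprio = 0,
		.cprio = 0,
		.state = M_XNOTHER | M_XNBOOST,
	};

	setparam(&t, 1);
	return (t.state & M_XNOTHER) != 0;
}
```

With the old test the boosted thread loses XNOTHER, with the patched
test it keeps it. The model exercises only this boosted case and says
nothing about other callers of the function.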


-- 
Philippe.


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 18:08             ` Philippe Gerum
@ 2012-01-10 18:39               ` Makarand Pradhan
  2012-01-10 19:10                 ` Makarand Pradhan
  0 siblings, 1 reply; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-10 18:39 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

Hi Philippe,

A bit surprised to see a change in sched-rt.h. I had another problem 
earlier where XNOTHER was not getting set after a priority change, and I 
had to look at the code that you have now modified. At the time I 
temporarily worked around it by setting XNOTHER in 
rt_task_set_priority. I think this patch would fix that problem as well.

Will test the patch and get back with the results.

Thanks and Rgds,
Mak.

On 10/01/12 01:08 PM, Philippe Gerum wrote:
> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>> Based on my testing, it is noted that the rescnt is not released when
>> task1 gets a priority boost and starts running with priority 1. That's
>> when the rescnt is not decremented.
>>
>> It would imply that we may be checking the current priority while
>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>> check it out.
> Does this help in your case?
>
> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
> index cc1cefa..6ac8fd7 100644
> --- a/include/nucleus/sched-rt.h
> +++ b/include/nucleus/sched-rt.h
> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct xnthread *thread,
>   {
>   	thread->cprio = p->rt.prio;
>   	if (xnthread_test_state(thread, XNSHADOW)) {
> -		if (thread->cprio)
> +		if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>   			xnthread_clear_state(thread, XNOTHER);
>   		else
>   			xnthread_set_state(thread, XNOTHER);
>> Rgds,
>> Mak.
>>
>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>> Another point:
>>>>>
>>>>> "These are fast mutexes, the thread does not have to jump to kernel
>>>>> space
>>>>> unless the released mutex was actually contended."
>>>>>
>>>>> When the first task is started with prio 0, I always see that
>>>>> rt_mutex_release is invoked in the kernel. even when there is no
>>>>> contention.
>>>> I should have added: "unless there is no contention ... or the caller is
>>>> a non-rt thread". This is because we have to jump to kernel space to
>>>> track rescnt.
>>>>
>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>> a non-rt thread".
>>>
>>>>> I have an instrumented kernel. The kernel trace is given below. In this
>>>>> trace only task1 is running at prio 0. It should be easy to follow:
>>>>>
>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>
>>>>>
>>>>> root@domain.hid:~# ./a.out 0 1
>>>>> Spawning: tasks
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> Release complete
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> Release complete
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> ^C
>>>>>
>>>>>
>>>>> Rgds,
>>>>> Mak.
>>>>>
>>>>>
>>>>>
>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>> Hi Philippe,
>>>>>>
>>>>>> You are right. Task 1 needs to be started with prio 0. I start
>>>>>> seeing
>>>>>> the problem after task2 grabs the mutexes and releases them. The first
>>>>>> task never jumps back to secondary. Here is my output. The mode never
>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays
>>>>>> stuck at
>>>>>> 1 in the kernel.
>>>>>>
>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>> Spawning: tasks
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Grabbing mux in HP
>>>>>> Mux held by Task2
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 1
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 1
>>>>>> Acquire complete
>>>>>>
>>>>>> Rgds,
>>>>>> Mak.
>>>>>>
>>>>>>
>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>
>>>>>>>> I am noticing an issue while using the auto relax feature related to
>>>>>>>> mutexes. I am using nested mutexes. The code is attached to this
>>>>>>>> email.
>>>>>>>>
>>>>>>>> The problem is that I am not relaxing after a RT thread grabs and
>>>>>>>> releases a mutex. On further investigation, it was noted that the
>>>>>>>> rescnt
>>>>>>>> is not going down to 0.
>>>>>>>  From your code, task1 would auto-relax only if started with
>>>>>>> priority 0,
>>>>>>> which is what I get here:
>>>>>>>
>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>> Spawning: tasks
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> ...
>>>>>>>
>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>> priority to
>>>>>>> task1:
>>>>>>>
>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>> Spawning: tasks
>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>> Acquire complete
>>>>>>> ...
>>>>>>>
>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>> result.
>>>>>>>
>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>> 3.1.5/powerpc32
>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>
>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>
>>>>>>>> Another observation is that I do not hit
>>>>>>>> rt_mutex_release in the kernel in the problem scenario, I believe
>>>>>>>> when
>>>>>>>> the thread undergoes a priority inversion. This may be a problem
>>>>>>>> as the
>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>> released
>>>>>>>> without hitting rt_mutex_release, or am I missing anything?
>>>>>>>>
>>>>>>> These are fast mutexes, the thread does not have to jump to kernel
>>>>>>> space
>>>>>>> unless the released mutex was actually contended.
>>>>>>>
>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>> secondary
>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt goes
>>>>>>>> down to
>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>
>>>>>>>> Has anyone faced this problem?
>>>>>>>>
>>>>>>> I'm not sure there is a problem yet. Auto-relax applies to non-rt Xenomai
>>>>>>> threads only (i.e. prio == 0).
>>>>>>>
>>>>>>>> Rgds,
>>>>>>>> Makarand
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Xenomai-help mailing list
>>>>>>>> Xenomai-help@domain.hid
>>>>>>>> https://mail.gna.org/listinfo/xenomai-help
>>
>





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 18:39               ` Makarand Pradhan
@ 2012-01-10 19:10                 ` Makarand Pradhan
  2012-01-10 20:30                   ` Philippe Gerum
  2012-01-18 22:41                   ` Makarand Pradhan
  0 siblings, 2 replies; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-10 19:10 UTC (permalink / raw)
  To: xenomai

The patch does work. Thanks.

Will it be available in the next release of Xenomai?

Rgds,
Mak

root@domain.hid:~# ./relax 0 1
Spawning: tasks
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Grabbing mux in HP
Mux held by Task2
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
Release complete
bP: 0, cp: 0, mode: 0
Acquire complete
^C
root@domain.hid:~#


On 10/01/12 01:39 PM, Makarand Pradhan wrote:
> Hi Philippe,
>
> A bit surprised to see a change in sched-rt.h. I had another problem
> earlier where the XNOTHER was not getting set after a priority change. I
> had to look at the code that you have modified. Although I had
> temporarily worked around it by setting the XNOTHER in
> rt_task_set_priority. I think this would fix that problem as well.
>
> Will test the patch and get back with the results.
>
> Thanks and Rgds,
> Mak.
>
> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>> Based on my testing, it is noted that the rescnt is not released when
>>> task1 gets a priority boost and starts running with priority 1. That's
>>> when the rescnt is not decremented.
>>>
>>> It would imply that we may be checking the current priority while
>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>> check it out.
>> Does this help in your case?
>>
>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>> index cc1cefa..6ac8fd7 100644
>> --- a/include/nucleus/sched-rt.h
>> +++ b/include/nucleus/sched-rt.h
>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct xnthread *thread,
>>    {
>>        thread->cprio = p->rt.prio;
>>        if (xnthread_test_state(thread, XNSHADOW)) {
>> -             if (thread->cprio)
>> +             if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>                        xnthread_clear_state(thread, XNOTHER);
>>                else
>>                        xnthread_set_state(thread, XNOTHER);
>>> Rgds,
>>> Mak.
>>>
>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>> Another point:
>>>>>>
>>>>>> "These are fast mutexes, the thread does not have to jump to kernel
>>>>>> space
>>>>>> unless the released mutex was actually contended."
>>>>>>
>>>>>> When the first task is started with prio 0, I always see that
>>>>>> rt_mutex_release is invoked in the kernel. even when there is no
>>>>>> contention.
>>>>> I should have added: "unless there is no contention ... or the caller is
>>>>> a non-rt thread". This is because we have to jump to kernel space to
>>>>> track rescnt.
>>>>>
>>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>>> a non-rt thread".
>>>>
>>>>>> I have an instrumented kernel. The kernel trace is given below. In this
>>>>>> trace only task1 is running at prio 0. It should be easy to follow:
>>>>>>
>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>
>>>>>>
>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>> Spawning: tasks
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> ^C
>>>>>>
>>>>>>
>>>>>> Rgds,
>>>>>> Mak.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>> Hi Philippe,
>>>>>>>
>>>>>>> You are right. Task 1 needs to be started with prio 0. I start
>>>>>>> seeing
>>>>>>> the problem after task2 grabs the mutexes and releases them. The first
>>>>>>> task never jumps back to secondary. Here is my output. The mode never
>>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays
>>>>>>> stuck at
>>>>>>> 1 in the kernel.
>>>>>>>
>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>> Spawning: tasks
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Grabbing mux in HP
>>>>>>> Mux held by Task2
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>> Acquire complete
>>>>>>>
>>>>>>> Rgds,
>>>>>>> Mak.
>>>>>>>
>>>>>>>
>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>
>>>>>>>>> I am noticing an issue while using the auto relax feature related to
>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to this
>>>>>>>>> email.
>>>>>>>>>
>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs and
>>>>>>>>> releases a mutex. On further investigation, it was noted that the
>>>>>>>>> rescnt
>>>>>>>>> is not going down to 0.
>>>>>>>>    From your code, task1 would auto-relax only if started with
>>>>>>>> priority 0,
>>>>>>>> which is what I get here:
>>>>>>>>
>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>> Spawning: tasks
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> ...
>>>>>>>>
>>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>>> priority to
>>>>>>>> task1:
>>>>>>>>
>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>> Spawning: tasks
>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>> Acquire complete
>>>>>>>> ...
>>>>>>>>
>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>> result.
>>>>>>>>
>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>> 3.1.5/powerpc32
>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>
>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>
>>>>>>>>> Another observation is that I do not hit
>>>>>>>>> rt_mutex_release in the kernel in the problem scenario, I believe
>>>>>>>>> when
>>>>>>>>> the thread undergoes a priority inversion. This may be a problem
>>>>>>>>> as the
>>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>>> released
>>>>>>>>> without hitting rt_mutex_release, or am I missing anything?
>>>>>>>>>
>>>>>>>> These are fast mutexes, the thread does not have to jump to kernel
>>>>>>>> space
>>>>>>>> unless the released mutex was actually contended.
>>>>>>>>
>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>> secondary
>>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt goes
>>>>>>>>> down to
>>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>>
>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>
>>>>>>>> I'm not sure there is a problem yet. Auto-relax applies to non-rt Xenomai
>>>>>>>> threads only (i.e. prio == 0).
>>>>>>>>
>>>>>>>>> Rgds,
>>>>>>>>> Makarand
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> Xenomai-help mailing list
>>>>>>>>> Xenomai-help@domain.hid
>>>>>>>>> https://mail.gna.org/listinfo/xenomai-help
>
>
>
>
>
> _______________________________________________
> Xenomai-help mailing list
> Xenomai-help@domain.hid
> https://mail.gna.org/listinfo/xenomai-help





^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 19:10                 ` Makarand Pradhan
@ 2012-01-10 20:30                   ` Philippe Gerum
  2012-01-18 22:41                   ` Makarand Pradhan
  1 sibling, 0 replies; 23+ messages in thread
From: Philippe Gerum @ 2012-01-10 20:30 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai

On 01/10/2012 08:10 PM, Makarand Pradhan wrote:
> The patch does work. Thanks.
>
> Will it be available in the next release of xenomai?
>

This needs more testing, but this bug is a must-fix for 2.6.1 for sure. 
Your test code helps a lot in this respect. Thanks for the heads up.

> Rgds,
> Mak
>
> root@domain.hid:~# ./relax 0 1
> Spawning: tasks
> bP: 0, cp: 0, mode: 0
> Acquire complete
> Release complete
> bP: 0, cp: 0, mode: 0
> Acquire complete
> Release complete
> bP: 0, cp: 0, mode: 0
> Acquire complete
> Release complete
> bP: 0, cp: 0, mode: 0
> Acquire complete
> Grabbing mux in HP
> Mux held by Task2
> Release complete
> bP: 0, cp: 0, mode: 0
> Acquire complete
> Release complete
> bP: 0, cp: 0, mode: 0
> Acquire complete
> ^C
> root@domain.hid:~#
>
>
> On 10/01/12 01:39 PM, Makarand Pradhan wrote:
>> Hi Philippe,
>>
>> A bit surprised to see a change in sched-rt.h. I had another problem
>> earlier where the XNOTHER was not getting set after a priority change. I
>> had to look at the code that you have modified. Although I had
>> temporarily worked around it by setting the XNOTHER in
>> rt_task_set_priority. I think this would fix that problem as well.
>>
>> Will test the patch and get back with the results.
>>
>> Thanks and Rgds,
>> Mak.
>>
>> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>>> Based on my testing, it is noted that the rescnt is not released when
>>>> task1 gets a priority boost and starts running with priority 1. That's
>>>> when the rescnt is not decremented.
>>>>
>>>> It would imply that we may be checking the current priority while
>>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>>> check it out.
>>> Does this help in your case?
>>>
>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>> index cc1cefa..6ac8fd7 100644
>>> --- a/include/nucleus/sched-rt.h
>>> +++ b/include/nucleus/sched-rt.h
>>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct xnthread *thread,
>>>   {
>>>       thread->cprio = p->rt.prio;
>>>       if (xnthread_test_state(thread, XNSHADOW)) {
>>> -             if (thread->cprio)
>>> +             if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>                       xnthread_clear_state(thread, XNOTHER);
>>>               else
>>>                       xnthread_set_state(thread, XNOTHER);
>>>> Rgds,
>>>> Mak.
>>>>
>>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>>> Another point:
>>>>>>>
>>>>>>> "These are fast mutexes, the thread does not have to jump to kernel
>>>>>>> space
>>>>>>> unless the released mutex was actually contended."
>>>>>>>
>>>>>>> When the first task is started with prio 0, I always see that
>>>>>>> rt_mutex_release is invoked in the kernel. even when there is no
>>>>>>> contention.
>>>>>> I should have added: "unless there is no contention ... or the
>>>>>> caller is
>>>>>> a non-rt thread". This is because we have to jump to kernel space to
>>>>>> track rescnt.
>>>>>>
>>>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>>>> a non-rt thread".
>>>>>
>>>>>>> I have an instrumented kernel. The kernel trace is given below.
>>>>>>> In this
>>>>>>> trace only task1 is running at prio 0. It should be easy to follow:
>>>>>>>
>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>
>>>>>>>
>>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>>> Spawning: tasks
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> ^C
>>>>>>>
>>>>>>>
>>>>>>> Rgds,
>>>>>>> Mak.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>>> Hi Philippe,
>>>>>>>>
>>>>>>>> You are right. Task 1 needs to be started with prio 0. I start
>>>>>>>> seeing
>>>>>>>> the problem after task2 grabs the mutexes and releases them. The
>>>>>>>> first
>>>>>>>> task never jumps back to secondary. Here is my output. The mode
>>>>>>>> never
>>>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays
>>>>>>>> stuck at
>>>>>>>> 1 in the kernel.
>>>>>>>>
>>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>>> Spawning: tasks
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Grabbing mux in HP
>>>>>>>> Mux held by Task2
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>> Acquire complete
>>>>>>>>
>>>>>>>> Rgds,
>>>>>>>> Mak.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>>
>>>>>>>>>> I am noticing an issue while using the auto relax feature
>>>>>>>>>> related to
>>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to this
>>>>>>>>>> email.
>>>>>>>>>>
>>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs and
>>>>>>>>>> releases a mutex. On further investigation, it was noted that the
>>>>>>>>>> rescnt
>>>>>>>>>> is not going down to 0.
>>>>>>>>> From your code, task1 would auto-relax only if started with
>>>>>>>>> priority 0,
>>>>>>>>> which is what I get here:
>>>>>>>>>
>>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>>> Spawning: tasks
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>>>> priority to
>>>>>>>>> task1:
>>>>>>>>>
>>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>>> Spawning: tasks
>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>> Acquire complete
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>>> result.
>>>>>>>>>
>>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>>> 3.1.5/powerpc32
>>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>>
>>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>>
>>>>>>>>>> Another observation is that I do not hit
>>>>>>>>>> rt_mutex_release in the kernel in the problem scenario, I believe
>>>>>>>>>> when
>>>>>>>>>> the thread undergoes a priority inversion. This may be a problem
>>>>>>>>>> as the
>>>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>>>> releasing
>>>>>>>>>> without hitting rt_mutex_release or am I missing anything?
>>>>>>>>>>
>>>>>>>>> These are fast mutexes, the thread does not have to jump to kernel
>>>>>>>>> space
>>>>>>>>> unless the released mutex was actually contended.
>>>>>>>>>
>>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>>> secondary
>>>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt goes
>>>>>>>>>> down to
>>>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>>>
>>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>>
>>>>>>>>> I'm unsure there is any yet. Auto-relax applies to non-rt Xenomai
>>>>>>>>> threads only (i.e. prio == 0).
>>>>>>>>>
>>>>>>>>>> Rgds,
>>>>>>>>>> Makarand


-- 
Philippe.



* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-10 19:10                 ` Makarand Pradhan
  2012-01-10 20:30                   ` Philippe Gerum
@ 2012-01-18 22:41                   ` Makarand Pradhan
  2012-01-19 10:17                     ` Gilles Chanteperdrix
  2012-01-19 11:25                     ` Philippe Gerum
  1 sibling, 2 replies; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-18 22:41 UTC (permalink / raw)
  To: xenomai@xenomai.org

Hi,

Another problem was encountered with rescnt related to nested mutexes.

This time the rescnt is not incremented because the XNOTHER bit is not 
set. As a result, a SIGDEBUG or SIGXCPU is delivered to the thread, 
which crashes my application.

The scenario is as follows:

1. Thread started with priority 0. (Relaxed)
2. This thread uses mutexes, which causes priority inversions.
3. At some point, rt_task_set_priority is called to change the priority 
(RT 85).
4. Some time later the priority is set back to 0.

The problem again revolves around setting XNOTHER. In the problem 
scenario, the XNOTHER bit is not set in xnsynch_acquire. Hence the 
rescnt is not incremented.

The reason is that, during rt_task_set_priority, 
__xnsched_rt_setparam is invoked before the thread is reniced.

To resolve this issue, I had to set the XNOTHER bit either in 
__xnpod_set_thread_schedparam after the thread was reniced, or in 
rt_task_set_priority. Both code changes are given below:


rt_task_set_priority(....

+        if (0==prio)
+        {
+                xnthread_set_state(&task->thread_base, XNOTHER);
+        }


xnpod_set_thread_schedparam(...

#ifdef CONFIG_XENO_OPT_PERVASIVE
         if (propagate) {
                 if (xnthread_test_state(thread, XNRELAX))
                         xnshadow_renice(thread);
                 else if (xnthread_test_state(thread, XNSHADOW))
                         xnthread_set_info(thread, XNPRIOSET);
         }

+       if (xnthread_test_state(thread, XNSHADOW)) {
+               // if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
+               if (thread->bprio)
+                       xnthread_clear_state(thread, XNOTHER);
+               else
+                       xnthread_set_state(thread, XNOTHER);
+       }


Setting XNOTHER in rt_task_set_priority does not look appropriate. I 
believe the right place is in xnpod_set_thread_schedparam.
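
To compare the XNOTHER tests mentioned so far, here is a small 
stand-alone toy model. It is only a sketch: struct toy_thread and the 
three helper functions are made-up names, the "boosted" field merely 
stands in for the XNBOOST state bit, and nothing here models the order 
in which the nucleus updates bprio and cprio. Each helper returns true 
when the corresponding test would leave XNOTHER set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the fields discussed here; NOT the real struct xnthread. */
struct toy_thread {
	int bprio;    /* base priority */
	int cprio;    /* current priority, possibly boosted */
	bool boosted; /* stands in for the XNBOOST state bit */
};

/* Test keyed on the current priority: clear XNOTHER if cprio != 0. */
static bool xnother_by_cprio(const struct toy_thread *t)
{
	assert(t != NULL);
	return t->cprio == 0;
}

/* Commented-out test in the diff above:
 * clear XNOTHER if (bprio || !XNBOOST), set it otherwise. */
static bool xnother_by_boost_test(const struct toy_thread *t)
{
	assert(t != NULL);
	return !(t->bprio != 0 || !t->boosted);
}

/* Test used in the diff above: clear XNOTHER if bprio != 0. */
static bool xnother_by_bprio(const struct toy_thread *t)
{
	assert(t != NULL);
	return t->bprio == 0;
}
```

In this model, a thread with bprio 0 that is boosted to cprio 1 keeps 
XNOTHER under the last two tests but loses it under the cprio test. An 
unboosted thread with bprio 0 and cprio 0 keeps XNOTHER under the cprio 
and bprio tests but not under the commented-out one.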

Would highly appreciate your views.

Rgds,
Mak








* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-18 22:41                   ` Makarand Pradhan
@ 2012-01-19 10:17                     ` Gilles Chanteperdrix
  2012-01-19 11:25                     ` Philippe Gerum
  1 sibling, 0 replies; 23+ messages in thread
From: Gilles Chanteperdrix @ 2012-01-19 10:17 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai@xenomai.org

On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
> Hi,
> 
> Another problem was encountered with rescnt related to nested mutexes.
> 
> This time the rescnt is not incrementing because the XNOTHER bit is not 
> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing 
> my application to crash.

That is deliberate. We explicitly decided not to support this case and
intentionally send a SIGDEBUG when it happens, in order to signal that
you are doing an unsupported operation.
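
To see when the signal fires instead of letting its default action
terminate the process, a handler can be installed. This is a minimal
sketch, assuming the signal arrives as SIGXCPU as the report above
suggests. The names debug_signal_seen, on_debug_signal and
install_debug_handler are made up for illustration, and only plain POSIX
calls are used, nothing Xenomai-specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t is safe to write from a signal handler. */
static volatile sig_atomic_t debug_signal_seen;

/* Only records which signal arrived; no logging from signal context. */
static void on_debug_signal(int sig, siginfo_t *info, void *ctx)
{
	(void)info;
	(void)ctx;
	debug_signal_seen = sig;
}

/* Installs the handler for SIGXCPU; returns 0 on success, -1 on error. */
static int install_debug_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_debug_signal;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGXCPU, &sa, NULL);
}
```

Calling install_debug_handler() early in main() and checking
debug_signal_seen afterwards only helps locate the offending call; it
does not make the operation supported.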

Can you show us a scenario where there is no other solution than
changing priority while holding a mutex?

-- 
					    Gilles.



* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-18 22:41                   ` Makarand Pradhan
  2012-01-19 10:17                     ` Gilles Chanteperdrix
@ 2012-01-19 11:25                     ` Philippe Gerum
  2012-01-19 12:29                       ` Gilles Chanteperdrix
  2012-01-19 15:22                       ` Makarand Pradhan
  1 sibling, 2 replies; 23+ messages in thread
From: Philippe Gerum @ 2012-01-19 11:25 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai@xenomai.org

On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
> Hi,
>
> Another problem was encountered with rescnt related to nested mutexes.
>
> This time the rescnt is not incrementing because the XNOTHER bit is not
> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
> my application to crash.
>
> The scenario is as follows:
>
> 1. Thread started with priority 0. (Relaxed)
> 2. This thread uses mutexes which causes Priority Inversions.
> 3. At some point, a rt_task_set_priority is done to change the priority.
> (RT 85).
> 4. Some time later the priority is set back to 0.

If I understand it properly, your runtime scenario is badly broken I'm 
afraid. In contrast to priority ceiling, priority inheritance is about 
leaving the responsibility to the _kernel_ to pick the best dynamic 
priority for your thread to solve a priority inversion.

Therefore, by changing your dynamic priority while holding a mutex, your 
application is preventing the kernel from doing the job you previously 
assigned to it. Worse, you could be causing unexpected latencies to 
other threads your application has no clue about, or for which it just 
can't tell whether they compete with your thread for accessing the 
resource at that specific time.

After all, it is your application that defined the contended mutex, 
and as such the fact that priority inheritance might be involved at some 
point. If you don't trust the kernel and want to deal with priorities 
manually during resource contention, then maybe you should use a 
different mutual exclusion mechanism not implementing priority 
inheritance, e.g. a plain binary semaphore.
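
The difference between the two mechanisms can be sketched with a 
single-threaded toy model. Everything in it is hypothetical (toy_task, 
toy_pi_mutex, toy_bsem and their helpers are not Xenomai types or 
calls), it handles one lock only and ignores nesting. The point is that 
the mutex records an owner, so the lock code can boost that owner on 
contention, while the binary semaphore has no owner to boost.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task {
	int bprio; /* base priority, chosen by the application */
	int cprio; /* current priority, adjusted by the lock code */
};

/* Mutex with priority inheritance: it remembers who holds it. */
struct toy_pi_mutex {
	struct toy_task *owner;
};

/* Returns 1 if acquired, 0 if the caller would have to block.
 * On contention, the owner inherits the waiter's higher priority. */
static int toy_pi_lock(struct toy_pi_mutex *m, struct toy_task *t)
{
	assert(m != NULL && t != NULL);
	if (m->owner == NULL) {
		m->owner = t;
		return 1;
	}
	if (t->cprio > m->owner->cprio)
		m->owner->cprio = t->cprio;
	return 0;
}

/* Releasing drops any boost: the owner returns to its base priority. */
static void toy_pi_unlock(struct toy_pi_mutex *m)
{
	assert(m != NULL && m->owner != NULL);
	m->owner->cprio = m->owner->bprio;
	m->owner = NULL;
}

/* Binary semaphore: a 0/1 count, no owner, hence nothing to boost. */
struct toy_bsem {
	int count;
};

/* Returns 1 if the token was taken, 0 if the caller would block. */
static int toy_bsem_p(struct toy_bsem *s)
{
	assert(s != NULL);
	if (s->count == 0)
		return 0;
	s->count = 0;
	return 1;
}

static void toy_bsem_v(struct toy_bsem *s)
{
	assert(s != NULL);
	s->count = 1;
}
```

With the toy mutex, a priority-0 holder is raised to 85 when a 
priority-85 task contends and drops back to 0 on unlock. With the toy 
semaphore, priorities are never touched, so any manual priority change 
is entirely up to the application.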

>
> The problem again revolves around setting XNOTHER. In the problem
> scenario, the XNOTHER bit is not set in xnsynch_acquire. Hence the
> rescnt is not incremented.
>
> The reason for that is, while doing a rt_task_set_priority,
> __xnsched_rt_setparam is invoked before the thread is reniced.
>
> To resolve this issue, I had to set the XNOTHER bit in
> __xnpod_set_thread_schedparam after the thread was reniced or in
> rt_task_set_priority. Both the code changes are given below:
>
>
> rt_task_set_priority(....
>
> + if (0==prio)
> + {
> + xnthread_set_state(&task->thread_base, XNOTHER);
> + }
>
>
> xnpod_set_thread_schedparam(...
>
> #ifdef CONFIG_XENO_OPT_PERVASIVE
> if (propagate) {
> if (xnthread_test_state(thread, XNRELAX))
> xnshadow_renice(thread);
> else if (xnthread_test_state(thread, XNSHADOW))
> xnthread_set_info(thread, XNPRIOSET);
> }
>
> + if (xnthread_test_state(thread, XNSHADOW)) {
> + // if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
> + if (thread->bprio)
> + xnthread_clear_state(thread, XNOTHER);
> + else
> + xnthread_set_state(thread, XNOTHER);
> + }
>
>
> Setting XNOTHER in rt_task_set_priority does not look appropriate. I
> believe the right place is in the xnpod_set_thread_schedparam.
>
> Would highly appreciate your views.
>
> Rgds,
> Mak
>
>
> On 10/01/12 02:10 PM, Makarand Pradhan wrote:
>> The patch does work. Thanks.
>>
>> Will it be available in the next release of xenomai?
>>
>> Rgds,
>> Mak
>>
>> root@domain.hid:~# ./relax 0 1
>> Spawning: tasks
>> bP: 0, cp: 0, mode: 0
>> Acquire complete
>> Release complete
>> bP: 0, cp: 0, mode: 0
>> Acquire complete
>> Release complete
>> bP: 0, cp: 0, mode: 0
>> Acquire complete
>> Release complete
>> bP: 0, cp: 0, mode: 0
>> Acquire complete
>> Grabbing mux in HP
>> Mux held by Task2
>> Release complete
>> bP: 0, cp: 0, mode: 0
>> Acquire complete
>> Release complete
>> bP: 0, cp: 0, mode: 0
>> Acquire complete
>> ^C
>> root@domain.hid:~#
>>
>>
>> On 10/01/12 01:39 PM, Makarand Pradhan wrote:
>>> Hi Phillipe,
>>>
>>> A bit surprised to see a change in sched-rt.h. I had another problem
>>> earlier where the XNOTHER was not getting set after a priority change. I
>>> had to look at the code that you have modified. Although I had
>>> temporarily worked around it by setting the XNOTHER in
>>> rt_task_set_priority. I think this would fix that problem as well.
>>>
>>> Will test the patch and get back with the results.
>>>
>>> Thanks and Rgds,
>>> Mak.
>>>
>>> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>>>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>>>> Based on my testing, it is noted that the rescnt is not released when
>>>>> task1 gets a priority boost and starts running with priority 1. That's
>>>>> when the rescnt is not decremented.
>>>>>
>>>>> It would imply that we may be checking the current priority while
>>>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>>>> check it out.
>>>> Does this help in your case?
>>>>
>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>> index cc1cefa..6ac8fd7 100644
>>>> --- a/include/nucleus/sched-rt.h
>>>> +++ b/include/nucleus/sched-rt.h
>>>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct
>>>> xnthread *thread,
>>>> {
>>>> thread->cprio = p->rt.prio;
>>>> if (xnthread_test_state(thread, XNSHADOW)) {
>>>> - if (thread->cprio)
>>>> + if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>> xnthread_clear_state(thread, XNOTHER);
>>>> else
>>>> xnthread_set_state(thread, XNOTHER);
>>>>> Rgds,
>>>>> Mak.
>>>>>
>>>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>>>> Another point:
>>>>>>>>
>>>>>>>> "These are fast mutexes, the thread does not have to jump to kernel
>>>>>>>> space
>>>>>>>> unless the released mutex was actually contented."
>>>>>>>>
>>>>>>>> When the first task is started with prio 0, I always see that
>>>>>>>> rt_mutex_release is invoked in the kernel. even when there is no
>>>>>>>> contention.
>>>>>>> I should have added: "unless there is no contention ... or the
>>>>>>> caller is
>>>>>>> a non-rt thread". This is because we have to jump to kernel space to
>>>>>>> track rescnt.
>>>>>>>
>>>>>> Ok, next try: "unless the mutex was contented ... or the caller is
>>>>>> a non-rt thread".
>>>>>>
>>>>>>>> I have an instrumented kernel. The kernel trace is given below.
>>>>>>>> In this
>>>>>>>> trace only task1 is running at prio 0. It should be easy to follow:
>>>>>>>>
>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>
>>>>>>>>
>>>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>>>> Spawning: tasks
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> Release complete
>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>> Acquire complete
>>>>>>>> ^C
>>>>>>>>
>>>>>>>>
>>>>>>>> Rgds,
>>>>>>>> Mak.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>>>> Hi Phillippe,
>>>>>>>>>
>>>>>>>>> You are right. Task 1 requires to be started with prio 0. I start
>>>>>>>>> seeing
>>>>>>>>> the problem after task2 grabs the mutex and releases them. The
>>>>>>>>> first
>>>>>>>>> task never jumps back to seconodary. Here is my output. The
>>>>>>>>> mode never
>>>>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays
>>>>>>>>> stuck at
>>>>>>>>> 1 in the kernel.
>>>>>>>>>
>>>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>>>> Spawning: tasks
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Grabbing mux in HP
>>>>>>>>> Mux held by Task2
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>> Acquire complete
>>>>>>>>>
>>>>>>>>> Rgds,
>>>>>>>>> Mak.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>>>
>>>>>>>>>>> I am noticing an issue while using the auto relax feature
>>>>>>>>>>> related to
>>>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to this
>>>>>>>>>>> email.
>>>>>>>>>>>
>>>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs
>>>>>>>>>>> and
>>>>>>>>>>> releases a mutex. On further investigation, it was noted that
>>>>>>>>>>> the
>>>>>>>>>>> rescnt
>>>>>>>>>>> is not going down to 0.
>>>>>>>>>> From your code, task1 would auto-relax only if started with
>>>>>>>>>> priority 0,
>>>>>>>>>> which is what I get here:
>>>>>>>>>>
>>>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>>>> Spawning: tasks
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>>>>> priority to
>>>>>>>>>> task1:
>>>>>>>>>>
>>>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>>>> Spawning: tasks
>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>> Acquire complete
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>>>> result.
>>>>>>>>>>
>>>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>>>> 3.1.5/powerpc32
>>>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>>>
>>>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>>>
>>>>>>>>>>> Another observation is that I do not hit
>>>>>>>>>>> rt_mutex_release in the kernel in the problem scenario, I
>>>>>>>>>>> believe
>>>>>>>>>>> when
>>>>>>>>>>> the thread undergoes a priority inversion.This may be a problem
>>>>>>>>>>> as the
>>>>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>>>>> releasing
>>>>>>>>>>> wiithout hitting rt_mutex_relase or am I missing anything?
>>>>>>>>>>>
>>>>>>>>>> These are fast mutexes, the thread does not have to jump to
>>>>>>>>>> kernel
>>>>>>>>>> space
>>>>>>>>>> unless the released mutex was actually contented.
>>>>>>>>>>
>>>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>>>> secondary
>>>>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt goes
>>>>>>>>>>> down to
>>>>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>>>>
>>>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>>>
>>>>>>>>>> I'm unsure there is any yet. Auto-relax applies to non-rt
>>>>>>>>>> Xenomai
>>>>>>>>>> threads only (i.e. prio == 0).
>>>>>>>>>>
>>>>>>>>>>> Rgds,
>>>>>>>>>>> Makarand
>>
>
>


-- 
Philippe.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-19 11:25                     ` Philippe Gerum
@ 2012-01-19 12:29                       ` Gilles Chanteperdrix
  2012-01-19 15:35                         ` Makarand Pradhan
  2012-01-19 15:22                       ` Makarand Pradhan
  1 sibling, 1 reply; 23+ messages in thread
From: Gilles Chanteperdrix @ 2012-01-19 12:29 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

On 01/19/2012 12:25 PM, Philippe Gerum wrote:
> On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
>> Hi,
>>
>> Another problem was encountered with rescnt related to nested mutexes.
>>
>> This time the rescnt is not incrementing because the XNOTHER bit is not
>> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
>> my application to crash.
>>
>> The scenario is as follows:
>>
>> 1. Thread started with priority 0. (Relaxed)
>> 2. This thread uses mutexes which causes Priority Inversions.
>> 3. At some point, a rt_task_set_priority is done to change the priority.
>> (RT 85).
>> 4. Some time later the priority is set back to 0.
> 
> If I understand it properly, your runtime scenario is badly broken I'm 
> afraid. By contrast to priority ceiling, priority inheritance is about 
> leaving the responsibility to the _kernel_ to pick the best dynamic 
> priority for your thread to solve a priority inversion.
> 
> Therefore, by changing your dynamic priority while holding a mutex, your 
> application is preventing the kernel from doing the job you previously
> assigned to it. Worse, you could be causing unexpected latencies to
> other threads your application has no clue about, or just can't tell 
> whether they compete with your thread for accessing the resource at that 
> specific time.
> 
> After all, this is your application that defined the contended mutex,
> and as such the fact that priority inheritance might be involved at some 
> point. If you don't trust the kernel and want to deal with priorities 
> manually during resource contention, then maybe you should use a 
> different mutual exclusion mechanism not implementing priority 
> inheritance, e.g. a plain binary semaphore.

Even if you implement manual priority ceiling, you should change
priority before acquiring the mutex, not after, otherwise there is a
race condition.
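
For what it is worth, that ordering can be captured in a small helper.
The sketch below is plain C, not Xenomai code: struct ceiling_ops,
with_ceiling() and the demo_* callbacks are made-up names, and the
callbacks merely stand in for whatever priority and mutex services the
application really uses. The demo callbacks only record the call order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callbacks standing in for the real priority/mutex calls. */
struct ceiling_ops {
	int (*set_priority)(void *ctx, int prio);
	int (*lock)(void *ctx);
	int (*unlock)(void *ctx);
};

/*
 * Run section() at the ceiling priority: raise first, then lock, so the
 * thread never holds the mutex at its old priority. The priority is
 * restored only once the mutex has been dropped.
 */
static int with_ceiling(const struct ceiling_ops *ops, void *ctx,
			int ceiling, int normal, void (*section)(void *ctx))
{
	int ret, err;

	ret = ops->set_priority(ctx, ceiling);
	if (ret)
		return ret;

	ret = ops->lock(ctx);
	if (ret == 0) {
		if (section != NULL)
			section(ctx);
		ret = ops->unlock(ctx);
	}

	/* Always restore the priority, with the mutex no longer held. */
	err = ops->set_priority(ctx, normal);
	return ret ? ret : err;
}

/* Demo only: record the call order in a small trace buffer. */
struct trace {
	char log[16];
	size_t len;
};

static void trace_add(struct trace *t, char c)
{
	assert(t != NULL);
	if (t->len + 1 < sizeof(t->log)) {
		t->log[t->len++] = c;
		t->log[t->len] = '\0';
	}
}

static int demo_set_priority(void *ctx, int prio)
{
	(void)prio;
	trace_add(ctx, 'P');
	return 0;
}

static int demo_lock(void *ctx)
{
	trace_add(ctx, 'L');
	return 0;
}

static int demo_unlock(void *ctx)
{
	trace_add(ctx, 'U');
	return 0;
}

static void demo_section(void *ctx)
{
	trace_add(ctx, 'S');
}

/* Copy the recorded call order into out (at least 16 bytes). */
static int ceiling_demo(char *out)
{
	static const struct ceiling_ops ops = {
		demo_set_priority, demo_lock, demo_unlock
	};
	struct trace t = { "", 0 };
	int ret = with_ceiling(&ops, &t, 85, 0, demo_section);

	strcpy(out, t.log);
	return ret;
}
```

Calling ceiling_demo() fills the buffer with one letter per call, in
order: P (set priority), L (lock), S (section), U (unlock), P (restore
priority). Both priority changes happen with the mutex not held.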

-- 
                                                                Gilles.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-19 11:25                     ` Philippe Gerum
  2012-01-19 12:29                       ` Gilles Chanteperdrix
@ 2012-01-19 15:22                       ` Makarand Pradhan
  2012-01-19 15:49                         ` Philippe Gerum
  1 sibling, 1 reply; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-19 15:22 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

Hi Philippe,

I think I may not have communicated the scenario properly. I am not 
trying to control the priorities from user space during resource 
contention. That is left to the kernel. Let me try again.

At some point, my application, which was relaxed, has to run with a
real-time priority. That's when I invoke rt_task_set_priority to change
the base priority. Once the critical section is over, the thread has to
relax again, so the priority is set back to 0.

The rt_task_set_priority API allows me to change the task priority on 
the fly, so I think that the operation is supported and legal. Please feel
free to correct me if that is not true.

This change of priorities does introduce the race condition that was
encountered, which can be handled properly in the kernel using either of
the two approaches that were mentioned.

Your comments are highly valued and I look forward to your opinions.

Rgds,
Mak.

On 19/01/12 06:25 AM, Philippe Gerum wrote:
> On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
>> Hi,
>>
>> Another problem was encountered with rescnt related to nested mutexes.
>>
>> This time the rescnt is not incrementing because the XNOTHER bit is not
>> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
>> my application to crash.
>>
>> The scenario is as follows:
>>
>> 1. Thread started with priority 0. (Relaxed)
>> 2. This thread uses mutexes which causes Priority Inversions.
>> 3. At some point, a rt_task_set_priority is done to change the priority.
>> (RT 85).
>> 4. Some time later the priority is set back to 0.
> If I understand it properly, your runtime scenario is badly broken I'm
> afraid. By contrast to priority ceiling, priority inheritance is about
> leaving the responsibility to the _kernel_ to pick the best dynamic
> priority for your thread to solve a priority inversion.
>
> Therefore, by changing your dynamic priority while holding a mutex, your
> application is preventing the kernel from doing the job you previously
> assigned to it. Worse, you could be causing unexpected latencies to
> other threads your application has no clue about, or just can't tell
> whether they compete with your thread for accessing the resource at that
> specific time.
>
> After all, this is your application that defined the contended mutex,
> and as such the fact that priority inheritance might be involved at some
> point. If you don't trust the kernel and want to deal with priorities
> manually during resource contention, then maybe you should use a
> different mutual exclusion mechanism not implementing priority
> inheritance, e.g. a plain binary semaphore.
>
>> The problem again revolves around setting XNOTHER. In the problem
>> scenario, the XNOTHER bit is not set in xnsynch_acquire. Hence the
>> rescnt is not incremented.
>>
>> The reason for that is, while doing a rt_task_set_priority,
>> __xnsched_rt_setparam is invoked before the thread is reniced.
>>
>> To resolve this issue, I had to set the XNOTHER bit in
>> __xnpod_set_thread_schedparam after the thread was reniced or in
>> rt_task_set_priority. Both the code changes are given below:
>>
>>
>> rt_task_set_priority(....
>>
>> + if (0==prio)
>> + {
>> + xnthread_set_state(&task->thread_base, XNOTHER);
>> + }
>>
>>
>> xnpod_set_thread_schedparam(...
>>
>> #ifdef CONFIG_XENO_OPT_PERVASIVE
>> if (propagate) {
>> if (xnthread_test_state(thread, XNRELAX))
>> xnshadow_renice(thread);
>> else if (xnthread_test_state(thread, XNSHADOW))
>> xnthread_set_info(thread, XNPRIOSET);
>> }
>>
>> + if (xnthread_test_state(thread, XNSHADOW)) {
>> + // if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>> + if (thread->bprio)
>> + xnthread_clear_state(thread, XNOTHER);
>> + else
>> + xnthread_set_state(thread, XNOTHER);
>> + }
>>
>>
>> Setting XNOTHER in rt_task_set_priority does not look appropriate. I
>> believe the right place is in the xnpod_set_thread_schedparam.
>>
>> Would highly appreciate your views.
>>
>> Rgds,
>> Mak
>>
>>
>> On 10/01/12 02:10 PM, Makarand Pradhan wrote:
>>> The patch does work. Thanks.
>>>
>>> Will it be available in the next release of xenomai?
>>>
>>> Rgds,
>>> Mak
>>>
>>> root@domain.hid:~# ./relax 0 1
>>> Spawning: tasks
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Grabbing mux in HP
>>> Mux held by Task2
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> Release complete
>>> bP: 0, cp: 0, mode: 0
>>> Acquire complete
>>> ^C
>>> root@domain.hid:~#
>>>
>>>
>>> On 10/01/12 01:39 PM, Makarand Pradhan wrote:
>>>> Hi Philippe,
>>>>
>>>> A bit surprised to see a change in sched-rt.h. I had another problem
>>>> earlier where the XNOTHER was not getting set after a priority change. I
>>>> had to look at the code that you have modified. Although I had
>>>> temporarily worked around it by setting the XNOTHER in
>>>> rt_task_set_priority. I think this would fix that problem as well.
>>>>
>>>> Will test the patch and get back with the results.
>>>>
>>>> Thanks and Rgds,
>>>> Mak.
>>>>
>>>> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>>>>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>>>>> Based on my testing, it is noted that the rescnt is not released when
>>>>>> task1 gets a priority boost and starts running with priority 1. That's
>>>>>> when the rescnt is not decremented.
>>>>>>
>>>>>> It would imply that we may be checking the current priority while
>>>>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>>>>> check it out.
>>>>> Does this help in your case?
>>>>>
>>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>>> index cc1cefa..6ac8fd7 100644
>>>>> --- a/include/nucleus/sched-rt.h
>>>>> +++ b/include/nucleus/sched-rt.h
>>>>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct
>>>>> xnthread *thread,
>>>>> {
>>>>> thread->cprio = p->rt.prio;
>>>>> if (xnthread_test_state(thread, XNSHADOW)) {
>>>>> - if (thread->cprio)
>>>>> + if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>>> xnthread_clear_state(thread, XNOTHER);
>>>>> else
>>>>> xnthread_set_state(thread, XNOTHER);
>>>>>> Rgds,
>>>>>> Mak.
>>>>>>
>>>>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>>>>> Another point:
>>>>>>>>>
>>>>>>>>> "These are fast mutexes, the thread does not have to jump to kernel
>>>>>>>>> space
>>>>>>>>> unless the released mutex was actually contended."
>>>>>>>>>
>>>>>>>>> When the first task is started with prio 0, I always see that
>>>>>>>>> rt_mutex_release is invoked in the kernel, even when there is no
>>>>>>>>> contention.
>>>>>>>> I should have added: "unless there is no contention ... or the
>>>>>>>> caller is
>>>>>>>> a non-rt thread". This is because we have to jump to kernel space to
>>>>>>>> track rescnt.
>>>>>>>>
>>>>>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>>>>>> a non-rt thread".
>>>>>>>
>>>>>>>>> I have an instrumented kernel. The kernel trace is given below.
>>>>>>>>> In this
>>>>>>>>> trace only task1 is running at prio 0. It should be easy to follow:
>>>>>>>>>
>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>>>>> Spawning: tasks
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> Release complete
>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>> Acquire complete
>>>>>>>>> ^C
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Rgds,
>>>>>>>>> Mak.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>>>>> Hi Philippe,
>>>>>>>>>>
>>>>>>>>>> You are right. Task 1 requires to be started with prio 0. I start
>>>>>>>>>> seeing
>>>>>>>>>> the problem after task2 grabs the mutex and releases them. The
>>>>>>>>>> first
>>>>>>>>>> task never jumps back to secondary. Here is my output. The
>>>>>>>>>> mode never
>>>>>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays
>>>>>>>>>> stuck at
>>>>>>>>>> 1 in the kernel.
>>>>>>>>>>
>>>>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>>>>> Spawning: tasks
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Grabbing mux in HP
>>>>>>>>>> Mux held by Task2
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>> Acquire complete
>>>>>>>>>>
>>>>>>>>>> Rgds,
>>>>>>>>>> Mak.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>>>>
>>>>>>>>>>>> I am noticing an issue while using the auto relax feature
>>>>>>>>>>>> related to
>>>>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to this
>>>>>>>>>>>> email.
>>>>>>>>>>>>
>>>>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs
>>>>>>>>>>>> and
>>>>>>>>>>>> releases a mutex. On further investigation, it was noted that
>>>>>>>>>>>> the
>>>>>>>>>>>> rescnt
>>>>>>>>>>>> is not going down to 0.
>>>>>>>>>>>  From your code, task1 would auto-relax only if started with
>>>>>>>>>>> priority 0,
>>>>>>>>>>> which is what I get here:
>>>>>>>>>>>
>>>>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>>>>>> priority to
>>>>>>>>>>> task1:
>>>>>>>>>>>
>>>>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> ...
>>>>>>>>>>>
>>>>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>>>>> result.
>>>>>>>>>>>
>>>>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>>>>> 3.1.5/powerpc32
>>>>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>>>>
>>>>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>>>>
>>>>>>>>>>>> Another observation is that I do not hit
>>>>>>>>>>>> rt_mutex_release in the kernel in the problem scenario, I
>>>>>>>>>>>> believe
>>>>>>>>>>>> when
>>>>>>>>>>>> the thread undergoes a priority inversion. This may be a problem
>>>>>>>>>>>> as the
>>>>>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>>>>>> releasing
>>>>>>>>>>>> without hitting rt_mutex_release, or am I missing anything?
>>>>>>>>>>>>
>>>>>>>>>>> These are fast mutexes, the thread does not have to jump to
>>>>>>>>>>> kernel
>>>>>>>>>>> space
>>>>>>>>>>> unless the released mutex was actually contended.
>>>>>>>>>>>
>>>>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>>>>> secondary
>>>>>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt goes
>>>>>>>>>>>> down to
>>>>>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>>>>>
>>>>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>>>>
>>>>>>>>>>> I'm unsure there is any yet. Auto-relax applies to non-rt
>>>>>>>>>>> Xenomai
>>>>>>>>>>> threads only (i.e. prio == 0).
>>>>>>>>>>>
>>>>>>>>>>>> Rgds,
>>>>>>>>>>>> Makarand
>>
>
> --
> Philippe.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-19 12:29                       ` Gilles Chanteperdrix
@ 2012-01-19 15:35                         ` Makarand Pradhan
  0 siblings, 0 replies; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-19 15:35 UTC (permalink / raw)
  To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org

Hi Gilles,

GC: "Even if you implement manual priority ceiling, you should change
priority before acquiring the mutex, not after, otherwise there is a
race condition."

That's how it's being done.

The problem is basically due to the XNOTHER bit being set before the base priority is adjusted in the kernel.

I have tried to explain the scenario again in my last email.

Your inputs will be highly appreciated.

Rgds,
Mak




On 19/01/12 07:29 AM, Gilles Chanteperdrix wrote:
> On 01/19/2012 12:25 PM, Philippe Gerum wrote:
>> On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
>>> Hi,
>>>
>>> Another problem was encountered with rescnt related to nested mutexes.
>>>
>>> This time the rescnt is not incrementing because the XNOTHER bit is not
>>> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
>>> my application to crash.
>>>
>>> The scenario is as follows:
>>>
>>> 1. Thread started with priority 0. (Relaxed)
>>> 2. This thread uses mutexes which causes Priority Inversions.
>>> 3. At some point, a rt_task_set_priority is done to change the priority.
>>> (RT 85).
>>> 4. Some time later the priority is set back to 0.
>> If I understand it properly, your runtime scenario is badly broken I'm
>> afraid. By contrast to priority ceiling, priority inheritance is about
>> leaving the responsibility to the _kernel_ to pick the best dynamic
>> priority for your thread to solve a priority inversion.
>>
>> Therefore, by changing your dynamic priority while holding a mutex, your
>> application is preventing the kernel from doing the job you previously
>> assigned to it. Worse, you could be causing unexpected latencies to
>> other threads your application has no clue about, or just can't tell
>> whether they compete with your thread for accessing the resource at that
>> specific time.
>>
>> After all, this is your application that defined the contended mutex,
>> and as such the fact that priority inheritance might be involved at some
>> point. If you don't trust the kernel and want to deal with priorities
>> manually during resource contention, then maybe you should use a
>> different mutual exclusion mechanism not implementing priority
>> inheritance, e.g. a plain binary semaphore.
> Even if you implement manual priority ceiling, you should change
> priority before acquiring the mutex, not after, otherwise there is a
> race condition.
>



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-19 15:22                       ` Makarand Pradhan
@ 2012-01-19 15:49                         ` Philippe Gerum
  2012-01-19 16:22                           ` Makarand Pradhan
  0 siblings, 1 reply; 23+ messages in thread
From: Philippe Gerum @ 2012-01-19 15:49 UTC (permalink / raw)
  To: Makarand Pradhan; +Cc: xenomai@xenomai.org

On 01/19/2012 04:22 PM, Makarand Pradhan wrote:
> Hi Philippe,
>
> I think I may have not communicated the scenario properly. I am not
> trying to control the priorities from user space during resource
> contention. That is left to the kernel. Let me try again.
>
> At some point, my application which was relaxed has to run with a real
> time priority. That's when I invoke rt_task_set_priority to change the
> base priority. After the critical section is past, the thread has to
> relax again where the priority is set to 0 again.
>
> The rt_task_set_priority API allows me to change the task priority on
> the fly, so I think that the operation is supported and legal. Please feel
> free to correct me if that is not true.

What is not supported is:

get_mutex
set_priority(current)
release_mutex

This can't be a valid operation, because of the reason I mentioned 
earlier. So, regardless of the reason why you call 
rt_task_set_priority(), you may not call it while holding a mutex.

What happens afterward, e.g. not getting the auto-relax feature back, is
irrelevant in this case.

So, if your app does:

set_priority(current, > 0)
get_mutex
release_mutex
set_priority(current, 0)

then fine, and not getting back the auto-relax after switching to 
priority 0 would indeed reveal a kernel issue. But the former scenario 
is wrong.
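
As a rough illustration of the difference between the two sequences,
here is a toy model in plain C. It is only a sketch and not Xenomai
code: every toy_* name, the struct layout and the -1 return value are
made up. It assumes, in line with the rescnt discussion earlier in this
thread, that the resource count is only maintained while the caller is
a non-rt thread (base priority 0), and that a release finding the count
at zero is reported as an imbalance.

```c
#include <assert.h>

/* Toy model only: none of these names exist in Xenomai. */
struct toy_thread {
	int bprio;	/* base priority, 0 means non-rt */
	int rescnt;	/* resources held, tracked for non-rt threads only */
};

static void toy_set_priority(struct toy_thread *t, int prio)
{
	t->bprio = prio;
}

/* Count the resource only if the caller is non-rt at this point. */
static void toy_get_mutex(struct toy_thread *t)
{
	if (t->bprio == 0)
		t->rescnt++;
}

/* Return 0 on success, -1 if the count does not balance. */
static int toy_release_mutex(struct toy_thread *t)
{
	assert(t->rescnt >= 0);
	if (t->bprio != 0)
		return 0;	/* rt thread: nothing to track */
	if (t->rescnt == 0)
		return -1;	/* release without a counted acquire */
	t->rescnt--;
	return 0;
}

/* set_priority(> 0), get_mutex, release_mutex, set_priority(0) */
static int toy_supported_sequence(void)
{
	struct toy_thread t = { 0, 0 };
	int ret;

	toy_set_priority(&t, 85);
	toy_get_mutex(&t);
	ret = toy_release_mutex(&t);
	toy_set_priority(&t, 0);

	/* 0 if the release succeeded and nothing is left counted. */
	return ret ? ret : t.rescnt;
}

/* get_mutex, set_priority(0), release_mutex */
static int toy_unsupported_sequence(void)
{
	struct toy_thread t = { 85, 0 };

	toy_get_mutex(&t);		/* not counted, thread is rt */
	toy_set_priority(&t, 0);	/* changed while holding the mutex */
	return toy_release_mutex(&t);	/* count is 0: imbalance */
}
```

In this model toy_supported_sequence() returns 0, while
toy_unsupported_sequence() returns -1 because the acquire and the
release were accounted for under different priorities.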

>
> This change of priorities does introduce the race condition that was
> encountered which can be handled properly in the kernel using any of the
> 2 approaches that were mentioned.
>
> Your comments are highly valued and I look forward to your opinions.
>
> Rgds,
> Mak.
>
> On 19/01/12 06:25 AM, Philippe Gerum wrote:
>> On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
>>> Hi,
>>>
>>> Another problem was encountered with rescnt related to nested mutexes.
>>>
>>> This time the rescnt is not incrementing because the XNOTHER bit is not
>>> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
>>> my application to crash.
>>>
>>> The scenario is as follows:
>>>
>>> 1. Thread started with priority 0. (Relaxed)
>>> 2. This thread uses mutexes which causes Priority Inversions.
>>> 3. At some point, a rt_task_set_priority is done to change the priority.
>>> (RT 85).
>>> 4. Some time later the priority is set back to 0.
>> If I understand it properly, your runtime scenario is badly broken I'm
>> afraid. By contrast to priority ceiling, priority inheritance is about
>> leaving the responsibility to the _kernel_ to pick the best dynamic
>> priority for your thread to solve a priority inversion.
>>
>> Therefore, by changing your dynamic priority while holding a mutex, your
>> application is preventing the kernel from doing the job you previously
>> assigned to it. Worse, you could be causing unexpected latencies to
>> other threads your application has no clue about, or just can't tell
>> whether they compete with your thread for accessing the resource at that
>> specific time.
>>
>> After all, this is your application that defined the contended mutex,
>> and as such the fact that priority inheritance might be involved at some
>> point. If you don't trust the kernel and want to deal with priorities
>> manually during resource contention, then maybe you should use a
>> different mutual exclusion mechanism not implementing priority
>> inheritance, e.g. a plain binary semaphore.
>>
>>> The problem again revolves around setting XNOTHER. In the problem
>>> scenario, the XNOTHER bit is not set in xnsynch_acquire. Hence the
>>> rescnt is not incremented.
>>>
>>> The reason for that is, while doing a rt_task_set_priority,
>>> __xnsched_rt_setparam is invoked before the thread is reniced.
>>>
>>> To resolve this issue, I had to set the XNOTHER bit in
>>> __xnpod_set_thread_schedparam after the thread was reniced or in
>>> rt_task_set_priority. Both the code changes are given below:
>>>
>>>
>>> rt_task_set_priority(....
>>>
>>> + if (0==prio)
>>> + {
>>> + xnthread_set_state(&task->thread_base, XNOTHER);
>>> + }
>>>
>>>
>>> xnpod_set_thread_schedparam(...
>>>
>>> #ifdef CONFIG_XENO_OPT_PERVASIVE
>>> if (propagate) {
>>> if (xnthread_test_state(thread, XNRELAX))
>>> xnshadow_renice(thread);
>>> else if (xnthread_test_state(thread, XNSHADOW))
>>> xnthread_set_info(thread, XNPRIOSET);
>>> }
>>>
>>> + if (xnthread_test_state(thread, XNSHADOW)) {
>>> + // if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>> + if (thread->bprio)
>>> + xnthread_clear_state(thread, XNOTHER);
>>> + else
>>> + xnthread_set_state(thread, XNOTHER);
>>> + }
>>>
>>>
>>> Setting XNOTHER in rt_task_set_priority does not look appropriate. I
>>> believe the right place is in the xnpod_set_thread_schedparam.
>>>
>>> Would highly appreciate your views.
>>>
>>> Rgds,
>>> Mak
>>>
>>>
>>> On 10/01/12 02:10 PM, Makarand Pradhan wrote:
>>>> The patch does work. Thanks.
>>>>
>>>> Will it be available in the next release of xenomai?
>>>>
>>>> Rgds,
>>>> Mak
>>>>
>>>> root@domain.hid:~# ./relax 0 1
>>>> Spawning: tasks
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Release complete
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Release complete
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Release complete
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Grabbing mux in HP
>>>> Mux held by Task2
>>>> Release complete
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> Release complete
>>>> bP: 0, cp: 0, mode: 0
>>>> Acquire complete
>>>> ^C
>>>> root@domain.hid:~#
>>>>
>>>>
>>>> On 10/01/12 01:39 PM, Makarand Pradhan wrote:
>>>>> Hi Philippe,
>>>>>
>>>>> A bit surprised to see a change in sched-rt.h. I had another problem
>>>>> earlier where the XNOTHER was not getting set after a priority
>>>>> change. I
>>>>> had to look at the code that you have modified. Although I had
>>>>> temporarily worked around it by setting the XNOTHER in
>>>>> rt_task_set_priority. I think this would fix that problem as well.
>>>>>
>>>>> Will test the patch and get back with the results.
>>>>>
>>>>> Thanks and Rgds,
>>>>> Mak.
>>>>>
>>>>> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>>>>>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>>>>>> Based on my testing, it is noted that the rescnt is not released
>>>>>>> when
>>>>>>> task1 gets a priority boost and starts running with priority 1.
>>>>>>> That's
>>>>>>> when the rescnt is not decremented.
>>>>>>>
>>>>>>> It would imply that we may be checking the current priority while
>>>>>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>>>>>> check it out.
>>>>>> Does this help in your case?
>>>>>>
>>>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>>>> index cc1cefa..6ac8fd7 100644
>>>>>> --- a/include/nucleus/sched-rt.h
>>>>>> +++ b/include/nucleus/sched-rt.h
>>>>>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct
>>>>>> xnthread *thread,
>>>>>> {
>>>>>> thread->cprio = p->rt.prio;
>>>>>> if (xnthread_test_state(thread, XNSHADOW)) {
>>>>>> - if (thread->cprio)
>>>>>> + if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>>>> xnthread_clear_state(thread, XNOTHER);
>>>>>> else
>>>>>> xnthread_set_state(thread, XNOTHER);
>>>>>>> Rgds,
>>>>>>> Mak.
>>>>>>>
>>>>>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>>>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>>>>>> Another point:
>>>>>>>>>>
>>>>>>>>>> "These are fast mutexes, the thread does not have to jump to
>>>>>>>>>> kernel
>>>>>>>>>> space
>>>>>>>>>> unless the released mutex was actually contended."
>>>>>>>>>>
>>>>>>>>>> When the first task is started with prio 0, I always see that
>>>>>>>>>> rt_mutex_release is invoked in the kernel. even when there is no
>>>>>>>>>> contention.
>>>>>>>>> I should have added: "unless there is no contention ... or the
>>>>>>>>> caller is
>>>>>>>>> a non-rt thread". This is because we have to jump to kernel
>>>>>>>>> space to
>>>>>>>>> track rescnt.
>>>>>>>>>
>>>>>>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>>>>>>> a non-rt thread".
>>>>>>>>
>>>>>>>>>> I have an instrumented kernel. The kernel trace is given below.
>>>>>>>>>> In this
>>>>>>>>>> trace only task1 is running at prio 0. It should be easy to
>>>>>>>>>> follow:
>>>>>>>>>>
>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>>>>>> Spawning: tasks
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> Release complete
>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>> Acquire complete
>>>>>>>>>> ^C
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Rgds,
>>>>>>>>>> Mak.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>>>>>> Hi Philippe,
>>>>>>>>>>>
>>>>>>>>>>> You are right. Task 1 requires to be started with prio 0. I
>>>>>>>>>>> start
>>>>>>>>>>> seeing
>>>>>>>>>>> the problem after task2 grabs the mutex and releases them. The
>>>>>>>>>>> first
>>>>>>>>>>> task never jumps back to secondary. Here is my output. The
>>>>>>>>>>> mode never
>>>>>>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays
>>>>>>>>>>> stuck at
>>>>>>>>>>> 1 in the kernel.
>>>>>>>>>>>
>>>>>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Grabbing mux in HP
>>>>>>>>>>> Mux held by Task2
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>> Acquire complete
>>>>>>>>>>>
>>>>>>>>>>> Rgds,
>>>>>>>>>>> Mak.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am noticing an issue while using the auto relax feature
>>>>>>>>>>>>> related to
>>>>>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to
>>>>>>>>>>>>> this
>>>>>>>>>>>>> email.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs
>>>>>>>>>>>>> and
>>>>>>>>>>>>> releases a mutex. On further investigation, it was noted that
>>>>>>>>>>>>> the
>>>>>>>>>>>>> rescnt
>>>>>>>>>>>>> is not going down to 0.
>>>>>>>>>>>> From your code, task1 would auto-relax only if started with
>>>>>>>>>>>> priority 0,
>>>>>>>>>>>> which is what I get here:
>>>>>>>>>>>>
>>>>>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>>>>>>> priority to
>>>>>>>>>>>> task1:
>>>>>>>>>>>>
>>>>>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>>>>>> result.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>>>>>> 3.1.5/powerpc32
>>>>>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>>>>>
>>>>>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>>>>>
>>>>>>>>>>>>> Another observation is that I do not hit
>>>>>>>>>>>>> rt_mutex_release in the kernel in the problem scenario, I
>>>>>>>>>>>>> believe
>>>>>>>>>>>>> when
>>>>>>>>>>>>> the thread undergoes a priority inversion. This may be a
>>>>>>>>>>>>> problem
>>>>>>>>>>>>> as the
>>>>>>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>>>>>>> releasing
>>>>>>>>>>>>> without hitting rt_mutex_release, or am I missing anything?
>>>>>>>>>>>>>
>>>>>>>>>>>> These are fast mutexes, the thread does not have to jump to
>>>>>>>>>>>> kernel
>>>>>>>>>>>> space
>>>>>>>>>>>> unless the released mutex was actually contended.
>>>>>>>>>>>>
>>>>>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>>>>>> secondary
>>>>>>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt
>>>>>>>>>>>>> goes
>>>>>>>>>>>>> down to
>>>>>>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>>>>>
>>>>>>>>>>>> I'm unsure there is any yet. Auto-relax applies to non-rt
>>>>>>>>>>>> Xenomai
>>>>>>>>>>>> threads only (i.e. prio == 0).
>>>>>>>>>>>>
>>>>>>>>>>>>> Rgds,
>>>>>>>>>>>>> Makarand
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Xenomai-help mailing list
>>>>>>>>>>>>> Xenomai-help@domain.hid
>>>>>>>>>>>>> https://mail.gna.org/listinfo/xenomai-help
>>>
>>
>> --
>> Philippe.
>
>


-- 
Philippe.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-19 15:49                         ` Philippe Gerum
@ 2012-01-19 16:22                           ` Makarand Pradhan
  2012-01-19 16:39                             ` Makarand Pradhan
  0 siblings, 1 reply; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-19 16:22 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

Hi,

The scenario is:

Start task prio 0
<Task is running>
Set priority 85
<Do something>
Set priority 0
get_mutex <-- rescnt not incremented as XNOTHER is not set.
<Do something> <-- XNOTHER gets set somewhere
Release mutex <-- Sends a SIGDEBUG/SIGXCPU

I am trying to write a simple app that will reveal the issue and will send it
out shortly.
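
Until the real test app is posted, the sequence above can be sketched as a toy model in plain C. This is not Xenomai code: the struct, the toy_* function names and the refresh_on_relax switch are hypothetical, and the rule that a counted release with rescnt at 0 raises the signal is an assumption made only to mirror the symptom described in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: names echo the thread, but none of this is Xenomai code. */
struct toy_thread {
    int  bprio;     /* base priority */
    bool xnother;   /* stands in for the XNOTHER state bit */
    int  rescnt;    /* resources counted while flagged non-rt */
    bool sigdebug;  /* stands in for SIGDEBUG/SIGXCPU delivery */
};

/* Assumption: leaving priority 0 clears the flag at once, while returning
 * to 0 only sets it at once when refresh_on_relax is true. */
static void toy_set_priority(struct toy_thread *t, int prio, bool refresh_on_relax)
{
    t->bprio = prio;
    if (prio != 0)
        t->xnother = false;
    else if (refresh_on_relax)
        t->xnother = true;
}

/* Only threads flagged non-rt are counted on acquire. */
static void toy_acquire(struct toy_thread *t)
{
    if (t->xnother)
        t->rescnt++;
}

/* Assumption: a counted release with nothing recorded raises the signal. */
static void toy_release(struct toy_thread *t)
{
    assert(t->rescnt >= 0);
    if (!t->xnother)
        return;
    if (t->rescnt == 0)
        t->sigdebug = true;
    else
        t->rescnt--;
}

/* Replays the sequence; returns true if the toy signal was raised. */
bool toy_run_scenario(bool refresh_on_relax)
{
    struct toy_thread t = { .bprio = 0, .xnother = true, .rescnt = 0, .sigdebug = false };

    toy_set_priority(&t, 85, refresh_on_relax); /* Set priority 85 */
    toy_set_priority(&t, 0, refresh_on_relax);  /* Set priority 0 */
    toy_acquire(&t);                            /* get_mutex */
    t.xnother = (t.bprio == 0);                 /* flag catches up late */
    toy_release(&t);                            /* Release mutex */
    return t.sigdebug;
}
```

Calling toy_run_scenario(false) replays the failing order, where the flag is stale at acquire time and the release finds nothing counted. toy_run_scenario(true) models the flag being refreshed as soon as the priority returns to 0, so acquire and release stay balanced.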

Rgds,
Mak

On 19/01/12 10:49 AM, Philippe Gerum wrote:
> On 01/19/2012 04:22 PM, Makarand Pradhan wrote:
>> Hi Philippe,
>>
>> I think I may have not communicated the scenario properly. I am not
>> trying to control the priorities from user space during resource
>> contention. That is left to the kernel. Let me try again.
>>
>> At some point, my application which was relaxed has to run with a real
>> time priority. That's when I invoke rt_task_set_priority to change the
>> base priority. After the critical section is past, the thread has to
>> relax again where the priority is set to 0 again.
>>
>> The rt_task_set_priority API allows me to change the task priority on
>> the fly, so I think that the operation is supported and legal. Please feel
>> free to correct me if that is not true.
> What is not supported is:
>
> get_mutex
> set_priority(current)
> release_mutex
>
> This can't be a valid operation, because of the reason I mentioned
> earlier. So, regardless of the reason why you call
> rt_task_set_priority(), you may not call it while holding a mutex.
>
> What happens afterward, e.g. not getting the auto-relax feature back is
> irrelevant in this case.
>
> So, if your app does:
>
> set_priority(current, > 0)
> get_mutex
> release_mutex
> set_priority(current, 0)
>
> then fine, and not getting back the auto-relax after switching to
> priority 0 would indeed reveal a kernel issue. But the former scenario
> is wrong.
>
>> This change of priorities does introduce the race condition that was
>> encountered which can be handled properly in the kernel using any of the
>> 2 approaches that were mentioned.
>>
>> Your comments are highly valued and I look forward to your opinions.
>>
>> Rgds,
>> Mak.
>>
>> On 19/01/12 06:25 AM, Philippe Gerum wrote:
>>> On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
>>>> Hi,
>>>>
>>>> Another problem was encountered with rescnt related to nested mutexes.
>>>>
>>>> This time the rescnt is not incrementing because the XNOTHER bit is not
>>>> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
>>>> my application to crash.
>>>>
>>>> The scenario is as follows:
>>>>
>>>> 1. Thread started with priority 0. (Relaxed)
>>>> 2. This thread uses mutexes which causes Priority Inversions.
>>>> 3. At some point, a rt_task_set_priority is done to change the priority.
>>>> (RT 85).
>>>> 4. Some time later the priority is set back to 0.
>>> If I understand it properly, your runtime scenario is badly broken I'm
>>> afraid. By contrast to priority ceiling, priority inheritance is about
>>> leaving the responsibility to the _kernel_ to pick the best dynamic
>>> priority for your thread to solve a priority inversion.
>>>
>>> Therefore, by changing your dynamic priority while holding a mutex, your
>>> application is preventing the kernel from doing the job you previously
>>> assigned to it. Worse, you could be causing unexpected latencies to
>>> other threads your application has no clue about, or just can't tell
>>> whether they compete with your thread for accessing the resource at that
>>> specific time.
>>>
>>> After all, this is your application that defined the contended mutex,
>>> and as such the fact that priority inheritance might be involved at some
>>> point. If you don't trust the kernel and want to deal with priorities
>>> manually during resource contention, then maybe you should use a
>>> different mutual exclusion mechanism not implementing priority
>>> inheritance, e.g. a plain binary semaphore.
>>>
>>>> The problem again revolves around setting XNOTHER. In the problem
>>>> scenario, the XNOTHER bit is not set in xnsynch_acquire. Hence the
>>>> rescnt is not incremented.
>>>>
>>>> The reason for that is, while doing a rt_task_set_priority,
>>>> __xnsched_rt_setparam is invoked before the thread is reniced.
>>>>
>>>> To resolve this issue, I had to set the XNOTHER bit in
>>>> __xnpod_set_thread_schedparam after the thread was reniced or in
>>>> rt_task_set_priority. Both the code changes are given below:
>>>>
>>>>
>>>> rt_task_set_priority(....
>>>>
>>>> + if (0==prio)
>>>> + {
>>>> + xnthread_set_state(&task->thread_base, XNOTHER);
>>>> + }
>>>>
>>>>
>>>> xnpod_set_thread_schedparam(...
>>>>
>>>> #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>> if (propagate) {
>>>> if (xnthread_test_state(thread, XNRELAX))
>>>> xnshadow_renice(thread);
>>>> else if (xnthread_test_state(thread, XNSHADOW))
>>>> xnthread_set_info(thread, XNPRIOSET);
>>>> }
>>>>
>>>> + if (xnthread_test_state(thread, XNSHADOW)) {
>>>> + // if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>> + if (thread->bprio)
>>>> + xnthread_clear_state(thread, XNOTHER);
>>>> + else
>>>> + xnthread_set_state(thread, XNOTHER);
>>>> + }
>>>>
>>>>
>>>> Setting XNOTHER in rt_task_set_priority does not look appropriate. I
>>>> believe the right place is in the xnpod_set_thread_schedparam.
>>>>
>>>> Would highly appreciate your views.
>>>>
>>>> Rgds,
>>>> Mak
>>>>
>>>>
>>>> On 10/01/12 02:10 PM, Makarand Pradhan wrote:
>>>>> The patch does work. Thanks.
>>>>>
>>>>> Will it be available in the next release of xenomai?
>>>>>
>>>>> Rgds,
>>>>> Mak
>>>>>
>>>>> root@domain.hid:~# ./relax 0 1
>>>>> Spawning: tasks
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> Release complete
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> Release complete
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> Release complete
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> Grabbing mux in HP
>>>>> Mux held by Task2
>>>>> Release complete
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> Release complete
>>>>> bP: 0, cp: 0, mode: 0
>>>>> Acquire complete
>>>>> ^C
>>>>> root@domain.hid:~#
>>>>>
>>>>>
>>>>> On 10/01/12 01:39 PM, Makarand Pradhan wrote:
>>>>>> Hi Philippe,
>>>>>>
>>>>>> A bit surprised to see a change in sched-rt.h. I had another problem
>>>>>> earlier where the XNOTHER was not getting set after a priority
>>>>>> change. I
>>>>>> had to look at the code that you have modified. Although I had
>>>>>> temporarily worked around it by setting the XNOTHER in
>>>>>> rt_task_set_priority. I think this would fix that problem as well.
>>>>>>
>>>>>> Will test the patch and get back with the results.
>>>>>>
>>>>>> Thanks and Rgds,
>>>>>> Mak.
>>>>>>
>>>>>> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>>>>>>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>>>>>>> Based on my testing, it is noted that the rescnt is not released
>>>>>>>> when
>>>>>>>> task1 gets a priority boost and starts running with priority 1.
>>>>>>>> That's
>>>>>>>> when the rescnt is not decremented.
>>>>>>>>
>>>>>>>> It would imply that we may be checking the current priority while
>>>>>>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>>>>>>> check it out.
>>>>>>> Does this help in your case?
>>>>>>>
>>>>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>>>>> index cc1cefa..6ac8fd7 100644
>>>>>>> --- a/include/nucleus/sched-rt.h
>>>>>>> +++ b/include/nucleus/sched-rt.h
>>>>>>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct
>>>>>>> xnthread *thread,
>>>>>>> {
>>>>>>> thread->cprio = p->rt.prio;
>>>>>>> if (xnthread_test_state(thread, XNSHADOW)) {
>>>>>>> - if (thread->cprio)
>>>>>>> + if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>>>>> xnthread_clear_state(thread, XNOTHER);
>>>>>>> else
>>>>>>> xnthread_set_state(thread, XNOTHER);
>>>>>>>> Rgds,
>>>>>>>> Mak.
>>>>>>>>
>>>>>>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>>>>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>>>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>>>>>>> Another point:
>>>>>>>>>>>
>>>>>>>>>>> "These are fast mutexes, the thread does not have to jump to
>>>>>>>>>>> kernel
>>>>>>>>>>> space
>>>>>>>>>>> unless the released mutex was actually contended."
>>>>>>>>>>>
>>>>>>>>>>> When the first task is started with prio 0, I always see that
>>>>>>>>>>> rt_mutex_release is invoked in the kernel. even when there is no
>>>>>>>>>>> contention.
>>>>>>>>>> I should have added: "unless there is no contention ... or the
>>>>>>>>>> caller is
>>>>>>>>>> a non-rt thread". This is because we have to jump to kernel
>>>>>>>>>> space to
>>>>>>>>>> track rescnt.
>>>>>>>>>>
>>>>>>>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>>>>>>>> a non-rt thread".
>>>>>>>>>
>>>>>>>>>>> I have an instrumented kernel. The kernel trace is given below.
>>>>>>>>>>> In this
>>>>>>>>>>> trace only task1 is running at prio 0. It should be easy to
>>>>>>>>>>> follow:
>>>>>>>>>>>
>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> Release complete
>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>> Acquire complete
>>>>>>>>>>> ^C
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Rgds,
>>>>>>>>>>> Mak.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>>>>>>> Hi Philippe,
>>>>>>>>>>>>
>>>>>>>>>>>> You are right. Task 1 requires to be started with prio 0. I
>>>>>>>>>>>> start
>>>>>>>>>>>> seeing
>>>>>>>>>>>> the problem after task2 grabs the mutex and releases them. The
>>>>>>>>>>>> first
>>>>>>>>>>>> task never jumps back to secondary. Here is my output. The
>>>>>>>>>>>> mode never
>>>>>>>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays
>>>>>>>>>>>> stuck at
>>>>>>>>>>>> 1 in the kernel.
>>>>>>>>>>>>
>>>>>>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Grabbing mux in HP
>>>>>>>>>>>> Mux held by Task2
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>
>>>>>>>>>>>> Rgds,
>>>>>>>>>>>> Mak.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am noticing an issue while using the auto relax feature
>>>>>>>>>>>>>> related to
>>>>>>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to
>>>>>>>>>>>>>> this
>>>>>>>>>>>>>> email.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> releases a mutex. On further investigation, it was noted that
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> rescnt
>>>>>>>>>>>>>> is not going down to 0.
>>>>>>>>>>>>> From your code, task1 would auto-relax only if started with
>>>>>>>>>>>>> priority 0,
>>>>>>>>>>>>> which is what I get here:
>>>>>>>>>>>>>
>>>>>>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>>>>>>>> priority to
>>>>>>>>>>>>> task1:
>>>>>>>>>>>>>
>>>>>>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>>>>>>> result.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>>>>>>> 3.1.5/powerpc32
>>>>>>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another observation is that I do not hit
>>>>>>>>>>>>>> rt_mutex_release in the kernel in the problem scenario, I
>>>>>>>>>>>>>> believe
>>>>>>>>>>>>>> when
>>>>>>>>>>>>>> the thread undergoes a priority inversion. This may be a
>>>>>>>>>>>>>> problem
>>>>>>>>>>>>>> as the
>>>>>>>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>>>>>>>> releasing
>>>>>>>>>>>>>> without hitting rt_mutex_release, or am I missing anything?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> These are fast mutexes, the thread does not have to jump to
>>>>>>>>>>>>> kernel
>>>>>>>>>>>>> space
>>>>>>>>>>>>> unless the released mutex was actually contended.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>>>>>>> secondary
>>>>>>>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt
>>>>>>>>>>>>>> goes
>>>>>>>>>>>>>> down to
>>>>>>>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm unsure there is any yet. Auto-relax applies to non-rt
>>>>>>>>>>>>> Xenomai
>>>>>>>>>>>>> threads only (i.e. prio == 0).
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Rgds,
>>>>>>>>>>>>>> Makarand
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Xenomai-help mailing list
>>>>>>>>>>>>>> Xenomai-help@domain.hid
>>>>>>>>>>>>>> https://mail.gna.org/listinfo/xenomai-help
>>> --
>>> Philippe.
>>
>
> --
> Philippe.



  



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-19 16:22                           ` Makarand Pradhan
@ 2012-01-19 16:39                             ` Makarand Pradhan
  2012-01-23 15:01                               ` Makarand Pradhan
  0 siblings, 1 reply; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-19 16:39 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

[-- Attachment #1: Type: text/plain, Size: 19946 bytes --]

The attached C file should trigger a SIGXCPU/SIGDEBUG, revealing the bug
that I am talking about.

Rgds,
Mak.



On 19/01/12 11:22 AM, Makarand Pradhan wrote:
> Hi,
>
> The scenario is:
>
> Start task prio 0
> <Task is running>
> Set priority 85
> <Do something>
> Set priority 0
> get_mutex <-- rescnt not incremented as XNOTHER is not set.
> <Do something> <-- XNOTHER gets set somewhere
> Release mutex <-- Sends a SIGDEBUG/SIGXCPU
>
> Am trying to write a simple app that will reveal the issue. Will send it
> out shortly.
>
> Rgds,
> Mak
>
> On 19/01/12 10:49 AM, Philippe Gerum wrote:
>> On 01/19/2012 04:22 PM, Makarand Pradhan wrote:
>>> Hi Philippe,
>>>
>>> I think I may have not communicated the scenario properly. I am not
>>> trying to control the priorities from user space during resource
>>> contention. That is left to the kernel. Let me try again.
>>>
>>> At some point, my application which was relaxed has to run with a real
>>> time priority. That's when I invoke rt_task_set_priority to change the
>>> base priority. After the critical section is past, the thread has to
>>> relax again where the priority is set to 0 again.
>>>
>>> The rt_task_set_priority API allows me to change the task priority on
>>> the fly, so I think that the operation is supported and legal. Please feel
>>> free to correct me if that is not true.
>> What is not supported is:
>>
>> get_mutex
>> set_priority(current)
>> release_mutex
>>
>> This can't be a valid operation, because of the reason I mentioned
>> earlier. So, regardless of the reason why you call
>> rt_task_set_priority(), you may not call it while holding a mutex.
>>
>> What happens afterward, e.g. not getting the auto-relax feature back is
>> irrelevant in this case.
>>
>> So, if your app does:
>>
>> set_priority(current, > 0)
>> get_mutex
>> release_mutex
>> set_priority(current, 0)
>>
>> then fine, and not getting back the auto-relax after switching to
>> priority 0 would indeed reveal a kernel issue. But the former scenario
>> is wrong.
>>
>>> This change of priorities does introduce the race condition that was
>>> encountered which can be handled properly in the kernel using any of the
>>> 2 approaches that were mentioned.
>>>
>>> Your comments are highly valued and I look forward to your opinions.
>>>
>>> Rgds,
>>> Mak.
>>>
>>> On 19/01/12 06:25 AM, Philippe Gerum wrote:
>>>> On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
>>>>> Hi,
>>>>>
>>>>> Another problem was encountered with rescnt related to nested mutexes.
>>>>>
>>>>> This time the rescnt is not incrementing because the XNOTHER bit is not
>>>>> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
>>>>> my application to crash.
>>>>>
>>>>> The scenario is as follows:
>>>>>
>>>>> 1. Thread started with priority 0. (Relaxed)
>>>>> 2. This thread uses mutexes which causes Priority Inversions.
>>>>> 3. At some point, a rt_task_set_priority is done to change the priority.
>>>>> (RT 85).
>>>>> 4. Some time later the priority is set back to 0.
>>>> If I understand it properly, your runtime scenario is badly broken I'm
>>>> afraid. By contrast to priority ceiling, priority inheritance is about
>>>> leaving the responsibility to the _kernel_ to pick the best dynamic
>>>> priority for your thread to solve a priority inversion.
>>>>
>>>> Therefore, by changing your dynamic priority while holding a mutex, your
>>>> application is preventing the kernel from doing the job you previously
>>>> assigned to it. Worse, you could be causing unexpected latencies to
>>>> other threads your application has no clue about, or just can't tell
>>>> whether they compete with your thread for accessing the resource at that
>>>> specific time.
>>>>
>>>> After all, it is your application that defined the contended mutex,
>>>> and as such the fact that priority inheritance might be involved at some
>>>> point. If you don't trust the kernel and want to deal with priorities
>>>> manually during resource contention, then maybe you should use a
>>>> different mutual exclusion mechanism not implementing priority
>>>> inheritance, e.g. a plain binary semaphore.
>>>>
>>>>> The problem again revolves around setting XNOTHER. In the problem
>>>>> scenario, the XNOTHER bit is not set in xnsynch_acquire. Hence the
>>>>> rescnt is not incremented.
>>>>>
>>>>> The reason for that is, while doing a rt_task_set_priority,
>>>>> __xnsched_rt_setparam is invoked before the thread is reniced.
>>>>>
>>>>> To resolve this issue, I had to set the XNOTHER bit in
>>>>> __xnpod_set_thread_schedparam after the thread was reniced or in
>>>>> rt_task_set_priority. Both the code changes are given below:
>>>>>
>>>>>
>>>>> rt_task_set_priority(....
>>>>>
>>>>> + if (0==prio)
>>>>> + {
>>>>> + xnthread_set_state(&task->thread_base, XNOTHER);
>>>>> + }
>>>>>
>>>>>
>>>>> xnpod_set_thread_schedparam(...
>>>>>
>>>>> #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>> if (propagate) {
>>>>> if (xnthread_test_state(thread, XNRELAX))
>>>>> xnshadow_renice(thread);
>>>>> else if (xnthread_test_state(thread, XNSHADOW))
>>>>> xnthread_set_info(thread, XNPRIOSET);
>>>>> }
>>>>>
>>>>> + if (xnthread_test_state(thread, XNSHADOW)) {
>>>>> + // if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>>> + if (thread->bprio)
>>>>> + xnthread_clear_state(thread, XNOTHER);
>>>>> + else
>>>>> + xnthread_set_state(thread, XNOTHER);
>>>>> + }
>>>>>
>>>>>
>>>>> Setting XNOTHER in rt_task_set_priority does not look appropriate. I
>>>>> believe the right place is in the xnpod_set_thread_schedparam.
>>>>>
>>>>> Would highly appreciate your views.
>>>>>
>>>>> Rgds,
>>>>> Mak
>>>>>
>>>>>
>>>>> On 10/01/12 02:10 PM, Makarand Pradhan wrote:
>>>>>> The patch does work. Thanks.
>>>>>>
>>>>>> Will it be available in the next release of xenomai?
>>>>>>
>>>>>> Rgds,
>>>>>> Mak
>>>>>>
>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>> Spawning: tasks
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Grabbing mux in HP
>>>>>> Mux held by Task2
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> Release complete
>>>>>> bP: 0, cp: 0, mode: 0
>>>>>> Acquire complete
>>>>>> ^C
>>>>>> root@domain.hid:~#
>>>>>>
>>>>>>
>>>>>> On 10/01/12 01:39 PM, Makarand Pradhan wrote:
>>>>>>> Hi Philippe,
>>>>>>>
>>>>>>> A bit surprised to see a change in sched-rt.h. I had another problem
>>>>>>> earlier where XNOTHER was not getting set after a priority change. I
>>>>>>> had to look at the code that you have modified, although I had
>>>>>>> temporarily worked around it by setting XNOTHER in
>>>>>>> rt_task_set_priority. I think this would fix that problem as well.
>>>>>>>
>>>>>>> Will test the patch and get back with the results.
>>>>>>>
>>>>>>> Thanks and Rgds,
>>>>>>> Mak.
>>>>>>>
>>>>>>> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>>>>>>>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>>>>>>>> Based on my testing, it is noted that the rescnt is not released when
>>>>>>>>> task1 gets a priority boost and starts running with priority 1. That's
>>>>>>>>> when the rescnt is not decremented.
>>>>>>>>>
>>>>>>>>> It would imply that we may be checking the current priority while
>>>>>>>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>>>>>>>> check it out.
>>>>>>>> Does this help in your case?
>>>>>>>>
>>>>>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>>>>>> index cc1cefa..6ac8fd7 100644
>>>>>>>> --- a/include/nucleus/sched-rt.h
>>>>>>>> +++ b/include/nucleus/sched-rt.h
>>>>>>>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct
>>>>>>>> xnthread *thread,
>>>>>>>> {
>>>>>>>> thread->cprio = p->rt.prio;
>>>>>>>> if (xnthread_test_state(thread, XNSHADOW)) {
>>>>>>>> - if (thread->cprio)
>>>>>>>> + if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>>>>>> xnthread_clear_state(thread, XNOTHER);
>>>>>>>> else
>>>>>>>> xnthread_set_state(thread, XNOTHER);
>>>>>>>>> Rgds,
>>>>>>>>> Mak.
>>>>>>>>>
>>>>>>>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>>>>>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>>>>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>>>>>>>> Another point:
>>>>>>>>>>>>
>>>>>>>>>>>> "These are fast mutexes, the thread does not have to jump to kernel
>>>>>>>>>>>> space unless the released mutex was actually contended."
>>>>>>>>>>>>
>>>>>>>>>>>> When the first task is started with prio 0, I always see that
>>>>>>>>>>>> rt_mutex_release is invoked in the kernel, even when there is no
>>>>>>>>>>>> contention.
>>>>>>>>>>> I should have added: "unless there is no contention ... or the caller is
>>>>>>>>>>> a non-rt thread". This is because we have to jump to kernel space to
>>>>>>>>>>> track rescnt.
>>>>>>>>>>>
>>>>>>>>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>>>>>>>>> a non-rt thread".
>>>>>>>>>>
>>>>>>>>>>>> I have an instrumented kernel. The kernel trace is given below. In this
>>>>>>>>>>>> trace only task1 is running at prio 0. It should be easy to follow:
>>>>>>>>>>>>
>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> Release complete
>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>> ^C
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Rgds,
>>>>>>>>>>>> Mak.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>>>>>>>> Hi Philippe,
>>>>>>>>>>>>>
>>>>>>>>>>>>> You are right. Task 1 needs to be started with prio 0. I start seeing
>>>>>>>>>>>>> the problem after task2 grabs the mutexes and releases them. The first
>>>>>>>>>>>>> task never jumps back to secondary. Here is my output. The mode never
>>>>>>>>>>>>> goes back to 0 after "Grabbing mux in HP" and the rescnt stays stuck at
>>>>>>>>>>>>> 1 in the kernel.
>>>>>>>>>>>>>
>>>>>>>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Grabbing mux in HP
>>>>>>>>>>>>> Mux held by Task2
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rgds,
>>>>>>>>>>>>> Mak.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am noticing an issue while using the auto relax feature related to
>>>>>>>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to this
>>>>>>>>>>>>>>> email.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs and
>>>>>>>>>>>>>>> releases a mutex. On further investigation, it was noted that the
>>>>>>>>>>>>>>> rescnt is not going down to 0.
>>>>>>>>>>>>>> From your code, task1 would auto-relax only if started with priority 0,
>>>>>>>>>>>>>> which is what I get here:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Conversely, I get the right behavior if setting a non-zero
>>>>>>>>>>>>>> priority to
>>>>>>>>>>>>>> task1:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>>>>>>>> result.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>>>>>>>> 3.1.5/powerpc32
>>>>>>>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another observation is that I do not hit rt_mutex_release in the
>>>>>>>>>>>>>>> kernel in the problem scenario, I believe when the thread undergoes
>>>>>>>>>>>>>>> a priority inversion. This may be a problem as the rescnt would not
>>>>>>>>>>>>>>> get decremented. Not sure how the mutex is releasing without hitting
>>>>>>>>>>>>>>> rt_mutex_release, or am I missing anything?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These are fast mutexes, the thread does not have to jump to kernel
>>>>>>>>>>>>>> space unless the released mutex was actually contended.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>>>>>>>> secondary domain, rt_mutex_release is invoked as expected, the
>>>>>>>>>>>>>>> rescnt goes down to 0 when all the mutexes are released.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm unsure there is any yet. Auto-relax applies to non-rt Xenomai
>>>>>>>>>>>>>> threads only (i.e. prio == 0).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Rgds,
>>>>>>>>>>>>>>> Makarand
>>>> --
>>>> Philippe.
>> --
>> Philippe.




[-- Attachment #2: set_prio_bug.c --]
[-- Type: text/x-csrc, Size: 1316 bytes --]

#include <xenomai/asm-generic/bits/current.h>
#include <sys/syscall.h>
#include <sys/mman.h>
#include <stdio.h>
#include <native/task.h>
#include <native/mutex.h>
#include <native/timer.h>
#include <nucleus/thread.h>
#include <rtdk.h>
#include <stdlib.h>
#include <unistd.h>

RT_TASK task0, task1, task2; 
RT_MUTEX mux1, mux2, mux3, mux4;
char buff[8192];

void task1_func(void *arg)
{
	RT_TASK_INFO info;
	int current_in_primary = 0;
	int i = 0;
	RTIME st, en;
	int iterations = 10000;

	rt_task_set_priority(&task1, 85);
	rt_task_set_priority(&task1, 0);
	rt_mutex_acquire(&mux1, TM_INFINITE);
	rt_task_sleep(2222222222LL);
	rt_mutex_release(&mux1);

	rt_printf("Should get a SIGDEBUG/SIGXCPU now\n");

}

int main(int argc, char **argv)
{
	int prio1, prio2;

	/* Both priorities are required: argv[2] is read below. */
	if (argc < 3)
	{
		printf("Please enter prio1 prio2\n");
		exit(1);
	}

	prio1 = atoi(argv[1]);
	prio2 = atoi(argv[2]);

	mlockall(MCL_CURRENT|MCL_FUTURE);

	rt_print_auto_init(1);

	rt_task_shadow(&task0, "Task 0", 10, 0);

	rt_mutex_create(&mux1, "test_mux1");
	rt_mutex_create(&mux2, "test_mux2");
	rt_mutex_create(&mux3, "test_mux3");
	rt_mutex_create(&mux4, "test_mux4");

	printf("Spawning: tasks %x\n", XNTHREAD_STATE_SPARE0);
	rt_task_spawn(&task1, "PRIO_BUG", 0, prio1, 0, task1_func, NULL);

	while (1) {
		rt_task_sleep(5000000LL);
	}
}

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Xenomai-help] Issue with Auto relax and nested mutexes
  2012-01-19 16:39                             ` Makarand Pradhan
@ 2012-01-23 15:01                               ` Makarand Pradhan
  0 siblings, 0 replies; 23+ messages in thread
From: Makarand Pradhan @ 2012-01-23 15:01 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai@xenomai.org

Hi Philippe,

Just wanted to send a gentle reminder regarding the issue related to 
task priority and mutexes.

The scenario under discussion was:

Start task prio 0
<Task is running>
Set priority 85
<Do something>
Set priority 0
get_mutex <-- rescnt not incremented as XNOTHER is not set.
<Do something> <-- XNOTHER gets set somewhere
Release mutex <-- Sends a SIGDEBUG/SIGXCPU
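
The sequence above can be pictured with a toy user-space model. This is
only a sketch of the bookkeeping as described in this thread (a counter
that is updated only while XNOTHER is set). It is not the nucleus code, and
struct model_thread, model_acquire and model_release are hypothetical
names.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread state discussed here. */
struct model_thread {
	bool xnother;	/* models the XNOTHER state bit */
	int rescnt;	/* models the held-resource counter */
};

/* Acquire: the counter is only tracked while XNOTHER is set. */
void model_acquire(struct model_thread *t)
{
	if (t->xnother)
		t->rescnt++;
}

/*
 * Release: returns false when the bookkeeping is inconsistent, i.e.
 * XNOTHER is set but no acquisition was counted. In the scenario above
 * this is the point where the SIGDEBUG/SIGXCPU shows up.
 */
bool model_release(struct model_thread *t)
{
	if (!t->xnother)
		return true;	/* untracked thread, nothing to check */
	if (t->rescnt == 0)
		return false;	/* release without a counted acquire */
	t->rescnt--;
	return true;
}
```

In this model, acquiring with xnother == false and then releasing after
xnother has become true makes model_release() return false, which is the
imbalance described in the scenario.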

I would highly appreciate your opinion. Do you think there is a
genuine issue here?

Rgds,
Mak.

On 19/01/12 11:39 AM, Makarand Pradhan wrote:
> The attached C file should trigger a SIGXCPU/SIGDEBUG, revealing the bug
> that I am talking about.
>
> Rgds,
> Mak.
>
>
>
> On 19/01/12 11:22 AM, Makarand Pradhan wrote:
>> Hi,
>>
>> The scenario is:
>>
>> Start task prio 0
>> <Task is running>
>> Set priority 85
>> <Do something>
>> Set priority 0
>> get_mutex <-- rescnt not incremented as XNOTHER is not set.
>> <Do something> <-- XNOTHER gets set somewhere
>> Release mutex <-- Sends a SIGDEBUG/SIGXCPU
>>
>> Am trying to write a simple app that will reveal the issue. Will send it
>> out shortly.
>>
>> Rgds,
>> Mak
>>
>> On 19/01/12 10:49 AM, Philippe Gerum wrote:
>>> On 01/19/2012 04:22 PM, Makarand Pradhan wrote:
>>>> Hi Philippe,
>>>>
>>>> I think I may have not communicated the scenario properly. I am not
>>>> trying to control the priorities from user space during resource
>>>> contention. That is left to the kernel. Let me try again.
>>>>
>>>> At some point, my application which was relaxed has to run with a real
>>>> time priority. That's when I invoke rt_task_set_priority to change the
>>>> base priority. After the critical section is past, the thread has to
>>>> relax again where the priority is set to 0 again.
>>>>
>>>> The rt_task_set_priority API allows me to change the task priority on
>>>> the fly, so I think that the operation is supported and legal. Pl feel
>>>> free to correct me if that is not true.
>>> What is not supported is:
>>>
>>> get_mutex
>>> set_priority(current)
>>> release_mutex
>>>
>>> This can't be a valid operation, because of the reason I mentioned
>>> earlier. So, regardless of the reason why you call
>>> rt_task_set_priority(), you may not call it while holding a mutex.
>>>
>>> What happens afterward, e.g. not getting the auto-relax feature back is
>>> irrelevant in this case.
>>>
>>> So, if your app does:
>>>
>>> set_priority(current, > 0)
>>> get_mutex
>>> release_mutex
>>> set_priority(current, 0)
>>>
>>> then fine, and not getting back the auto-relax after switching to
>>> priority 0 would indeed reveal a kernel issue. But the former scenario
>>> is wrong.
>>>
>>>> This change of priorities does introduce the race condition that was
>>>> encountered which can be handled properly in the kernel using any of the
>>>> 2 approaches that were mentioned.
>>>>
>>>> Your comments are highly valued and I look forward to your opinions.
>>>>
>>>> Rgds,
>>>> Mak.
>>>>
>>>> On 19/01/12 06:25 AM, Philippe Gerum wrote:
>>>>> On 01/18/2012 11:41 PM, Makarand Pradhan wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Another problem was encountered with rescnt related to nested mutexes.
>>>>>>
>>>>>> This time the rescnt is not incrementing because the XNOTHER bit is not
>>>>>> set, causing a SIGDEBUG or SIGXCPU to be delivered to the thread causing
>>>>>> my application to crash.
>>>>>>
>>>>>> The scenario is as follows:
>>>>>>
>>>>>> 1. Thread started with priority 0. (Relaxed)
>>>>>> 2. This thread uses mutexes which causes Priority Inversions.
>>>>>> 3. At some point, a rt_task_set_priority is done to change the priority.
>>>>>> (RT 85).
>>>>>> 4. Some time later the priority is set back to 0.
>>>>> If I understand it properly, your runtime scenario is badly broken I'm
>>>>> afraid. By contrast to priority ceiling, priority inheritance is about
>>>>> leaving the responsibility to the _kernel_ to pick the best dynamic
>>>>> priority for your thread to solve a priority inversion.
>>>>>
>>>>> Therefore, by changing your dynamic priority while holding a mutex, your
>>>>> application is preventing the kernel from doing the job you previously
>>>>> assigned to it. Worse, you could be causing unexpected latencies to
>>>>> other threads your application has no clue about, or just can't tell
>>>>> whether they compete with your thread for accessing the resource at that
>>>>> specific time.
>>>>>
>>>>> After all, it is your application that defined the contended mutex,
>>>>> and as such the fact that priority inheritance might be involved at some
>>>>> point. If you don't trust the kernel and want to deal with priorities
>>>>> manually during resource contention, then maybe you should use a
>>>>> different mutual exclusion mechanism not implementing priority
>>>>> inheritance, e.g. a plain binary semaphore.
>>>>>
>>>>>> The problem again revolves around setting XNOTHER. In the problem
>>>>>> scenario, the XNOTHER bit is not set in xnsynch_acquire. Hence the
>>>>>> rescnt is not incremented.
>>>>>>
>>>>>> The reason for that is, while doing a rt_task_set_priority,
>>>>>> __xnsched_rt_setparam is invoked before the thread is reniced.
>>>>>>
>>>>>> To resolve this issue, I had to set the XNOTHER bit in
>>>>>> __xnpod_set_thread_schedparam after the thread was reniced or in
>>>>>> rt_task_set_priority. Both the code changes are given below:
>>>>>>
>>>>>>
>>>>>> rt_task_set_priority(....
>>>>>>
>>>>>> + if (0==prio)
>>>>>> + {
>>>>>> + xnthread_set_state(&task->thread_base, XNOTHER);
>>>>>> + }
>>>>>>
>>>>>>
>>>>>> xnpod_set_thread_schedparam(...
>>>>>>
>>>>>> #ifdef CONFIG_XENO_OPT_PERVASIVE
>>>>>> if (propagate) {
>>>>>> if (xnthread_test_state(thread, XNRELAX))
>>>>>> xnshadow_renice(thread);
>>>>>> else if (xnthread_test_state(thread, XNSHADOW))
>>>>>> xnthread_set_info(thread, XNPRIOSET);
>>>>>> }
>>>>>>
>>>>>> + if (xnthread_test_state(thread, XNSHADOW)) {
>>>>>> + // if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>>>> + if (thread->bprio)
>>>>>> + xnthread_clear_state(thread, XNOTHER);
>>>>>> + else
>>>>>> + xnthread_set_state(thread, XNOTHER);
>>>>>> + }
>>>>>>
>>>>>>
>>>>>> Setting XNOTHER in rt_task_set_priority does not look appropriate. I
>>>>>> believe the right place is in the xnpod_set_thread_schedparam.
>>>>>>
>>>>>> Would highly appreciate your views.
>>>>>>
>>>>>> Rgds,
>>>>>> Mak
>>>>>>
>>>>>>
>>>>>> On 10/01/12 02:10 PM, Makarand Pradhan wrote:
>>>>>>> The patch does work. Thanks.
>>>>>>>
>>>>>>> Will it be available in the next release of xenomai?
>>>>>>>
>>>>>>> Rgds,
>>>>>>> Mak
>>>>>>>
>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>> Spawning: tasks
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Grabbing mux in HP
>>>>>>> Mux held by Task2
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> Release complete
>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>> Acquire complete
>>>>>>> ^C
>>>>>>> root@domain.hid:~#
>>>>>>>
>>>>>>>
>>>>>>> On 10/01/12 01:39 PM, Makarand Pradhan wrote:
>>>>>>>> Hi Philippe,
>>>>>>>>
>>>>>>>> A bit surprised to see a change in sched-rt.h. I had another problem
>>>>>>>> earlier where XNOTHER was not getting set after a priority change. I
>>>>>>>> had to look at the code that you have modified, although I had
>>>>>>>> temporarily worked around it by setting XNOTHER in
>>>>>>>> rt_task_set_priority. I think this would fix that problem as well.
>>>>>>>>
>>>>>>>> Will test the patch and get back with the results.
>>>>>>>>
>>>>>>>> Thanks and Rgds,
>>>>>>>> Mak.
>>>>>>>>
>>>>>>>> On 10/01/12 01:08 PM, Philippe Gerum wrote:
>>>>>>>>> On 01/10/2012 04:51 PM, Makarand Pradhan wrote:
>>>>>>>>>> Based on my testing, it is noted that the rescnt is not released when
>>>>>>>>>> task1 gets a priority boost and starts running with priority 1. That's
>>>>>>>>>> when the rescnt is not decremented.
>>>>>>>>>>
>>>>>>>>>> It would imply that we may be checking the current priority while
>>>>>>>>>> testing if we want to invoke rt_mutex_release in kernel. Will try to
>>>>>>>>>> check it out.
>>>>>>>>> Does this help in your case?
>>>>>>>>>
>>>>>>>>> diff --git a/include/nucleus/sched-rt.h b/include/nucleus/sched-rt.h
>>>>>>>>> index cc1cefa..6ac8fd7 100644
>>>>>>>>> --- a/include/nucleus/sched-rt.h
>>>>>>>>> +++ b/include/nucleus/sched-rt.h
>>>>>>>>> @@ -87,7 +87,7 @@ static inline void __xnsched_rt_setparam(struct
>>>>>>>>> xnthread *thread,
>>>>>>>>> {
>>>>>>>>> thread->cprio = p->rt.prio;
>>>>>>>>> if (xnthread_test_state(thread, XNSHADOW)) {
>>>>>>>>> - if (thread->cprio)
>>>>>>>>> + if (thread->bprio || !xnthread_test_state(thread, XNBOOST))
>>>>>>>>> xnthread_clear_state(thread, XNOTHER);
>>>>>>>>> else
>>>>>>>>> xnthread_set_state(thread, XNOTHER);
>>>>>>>>>> Rgds,
>>>>>>>>>> Mak.
>>>>>>>>>>
>>>>>>>>>> On 10/01/12 10:42 AM, Philippe Gerum wrote:
>>>>>>>>>>> On 01/10/2012 04:40 PM, Philippe Gerum wrote:
>>>>>>>>>>>> On 01/10/2012 04:40 PM, Makarand Pradhan wrote:
>>>>>>>>>>>>> Another point:
>>>>>>>>>>>>>
>>>>>>>>>>>>> "These are fast mutexes, the thread does not have to jump to kernel
>>>>>>>>>>>>> space unless the released mutex was actually contended."
>>>>>>>>>>>>>
>>>>>>>>>>>>> When the first task is started with prio 0, I always see that
>>>>>>>>>>>>> rt_mutex_release is invoked in the kernel, even when there is no
>>>>>>>>>>>>> contention.
>>>>>>>>>>>> I should have added: "unless there is no contention ... or the caller is
>>>>>>>>>>>> a non-rt thread". This is because we have to jump to kernel space to
>>>>>>>>>>>> track rescnt.
>>>>>>>>>>>>
>>>>>>>>>>> Ok, next try: "unless the mutex was contended ... or the caller is
>>>>>>>>>>> a non-rt thread".
>>>>>>>>>>>
>>>>>>>>>>>>> I have an instrumented kernel. The kernel trace is given below. In this
>>>>>>>>>>>>> trace only task1 is running at prio 0. It should be easy to follow:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 0, switched: 0
>>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>>> Jan 10 10:36:59 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: RML
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:01 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 1, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: __rt_mutex_release
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: RML
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: rt_mutex_release: lockcnt: 1
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: xnsynch_release_thread: BP: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 0, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: lo: rescnt: 1, switched: 1
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 2, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:03 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>>> Jan 10 10:37:04 ruggedcom kernel: hi: rescnt: 3, switched: 0
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> root@domain.hid:~# ./a.out 0 1
>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>> ^C
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Rgds,
>>>>>>>>>>>>> Mak.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 10/01/12 10:26 AM, Makarand Pradhan wrote:
>>>>>>>>>>>>>> Hi Philippe,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You are right. Task 1 needs to be started with prio 0. I start
>>>>>>>>>>>>>> seeing the problem after task2 grabs the mutexes and releases
>>>>>>>>>>>>>> them. The first task never jumps back to secondary. Here is my
>>>>>>>>>>>>>> output. The mode never goes back to 0 after "Grabbing mux in HP"
>>>>>>>>>>>>>> and the rescnt stays stuck at 1 in the kernel.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> root@domain.hid:~# ./relax 0 1
>>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Grabbing mux in HP
>>>>>>>>>>>>>> Mux held by Task2
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 1
>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Rgds,
>>>>>>>>>>>>>> Mak.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 10/01/12 10:11 AM, Philippe Gerum wrote:
>>>>>>>>>>>>>>> On 01/09/2012 09:50 PM, Makarand Pradhan wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am running kernel 3.0.0, xenomai: 2.6, powerpc 8360.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am noticing an issue while using the auto relax feature
>>>>>>>>>>>>>>>> related to
>>>>>>>>>>>>>>>> mutexes. I am using nested mutexes. The code is attached to
>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>> email.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The problem is that I am not relaxing after a RT thread grabs
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> releases a mutex. On further investigation, it was noted that
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> rescnt
>>>>>>>>>>>>>>>> is not going down to 0.
>>>>>>>>>>>>>>> From your code, task1 would auto-relax only if started with
>>>>>>>>>>>>>>> priority 0, which is what I get here:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -bash-3.2# ./relax 0 1
>>>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>>> bP: 0, cp: 0, mode: 0
>>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Conversely, I get the right behavior when setting a non-zero
>>>>>>>>>>>>>>> priority for task1:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -bash-3.2# ./relax 1 0
>>>>>>>>>>>>>>> Spawning: tasks
>>>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>>> Release complete
>>>>>>>>>>>>>>> bP: 1, cp: 1, mode: 1
>>>>>>>>>>>>>>> Acquire complete
>>>>>>>>>>>>>>> ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In any case, the priority of task2 should have no impact on the
>>>>>>>>>>>>>>> result.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm running current 2.6 HEAD commit (168da46de), kernel
>>>>>>>>>>>>>>> 3.1.5/powerpc32
>>>>>>>>>>>>>>> (52xx), pipeline 2.13-06.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Which priority arguments are you passing to your test program?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another observation is that I do not hit rt_mutex_release in
>>>>>>>>>>>>>>>> the kernel in the problem scenario, I believe when the thread
>>>>>>>>>>>>>>>> undergoes a priority inversion. This may be a problem as the
>>>>>>>>>>>>>>>> rescnt would not get decremented. Not sure how the mutex is
>>>>>>>>>>>>>>>> releasing without hitting rt_mutex_release, or am I missing
>>>>>>>>>>>>>>>> anything?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> These are fast mutexes: the thread does not have to jump to kernel
>>>>>>>>>>>>>>> space unless the released mutex was actually contended.
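
The fast-mutex remark above can be illustrated with a minimal sketch. It is not Xenomai's implementation: the lock-word layout, WAITERS_FLAG, fast_acquire(), fast_release(), slow_release() and demo() are all hypothetical names chosen for illustration, and the "kernel" is simulated by a counter. It only shows why a kernel-side release routine, and any bookkeeping done there, is never reached when the release is uncontended.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, otherwise the owner id.
 * The top bit is set (here, by hand) when another thread is waiting. */
#define WAITERS_FLAG 0x80000000u

struct fast_mutex {
    atomic_uint word;
};

/* Counts how often the simulated "syscall" path is taken. */
static int slow_path_calls;

/* Stand-in for the kernel-side release: only reached on contention. */
static void slow_release(struct fast_mutex *m)
{
    slow_path_calls++;
    atomic_store(&m->word, 0u);
}

/* Uncontended acquire: a single compare-and-swap in user space. */
static int fast_acquire(struct fast_mutex *m, unsigned int self)
{
    unsigned int expected = 0u;
    return atomic_compare_exchange_strong(&m->word, &expected, self);
}

/* Release: try the user-space CAS first, fall back to the slow path
 * only if the word no longer equals the plain owner id. */
static void fast_release(struct fast_mutex *m, unsigned int self)
{
    unsigned int expected = self;
    if (atomic_compare_exchange_strong(&m->word, &expected, 0u))
        return;                 /* no waiter: the kernel is never entered */
    slow_release(m);            /* waiter flagged: take the slow path */
}

/* Returns the number of slow-path releases after one uncontended and
 * one contended lock/unlock cycle. */
int demo(void)
{
    struct fast_mutex m;
    int ok;

    atomic_init(&m.word, 0u);
    slow_path_calls = 0;

    ok = fast_acquire(&m, 1u);
    assert(ok);
    fast_release(&m, 1u);       /* uncontended: stays in user space */
    assert(slow_path_calls == 0);

    ok = fast_acquire(&m, 1u);
    assert(ok);
    (void)ok;
    atomic_fetch_or(&m.word, WAITERS_FLAG);  /* simulate a blocked waiter */
    fast_release(&m, 1u);       /* contended: slow path is taken */

    return slow_path_calls;
}
```

Calling demo() from a main() shows the slow path being taken only for the second, contended release. Under this model, a trace statement placed in the kernel release routine would stay silent for every uncontended unlock.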
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If I have both the tasks running at priority 0, I stay in the
>>>>>>>>>>>>>>>> secondary
>>>>>>>>>>>>>>>> domain, rt_mutex_release is invoked as expected, the rescnt
>>>>>>>>>>>>>>>> goes
>>>>>>>>>>>>>>>> down to
>>>>>>>>>>>>>>>> 0 when all the mutexes are released.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Has anyone faced this problem?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm not sure there is a problem yet. Auto-relax applies to non-rt
>>>>>>>>>>>>>>> Xenomai threads only (i.e. prio == 0).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Rgds,
>>>>>>>>>>>>>>>> Makarand
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> Xenomai-help mailing list
>>>>>>>>>>>>>>>> Xenomai-help@domain.hid
>>>>>>>>>>>>>>>> https://mail.gna.org/listinfo/xenomai-help
>>>>> --
>>>>> Philippe.
>>> --
>>> Philippe.



  




end of thread, other threads:[~2012-01-23 15:01 UTC | newest]

Thread overview: 23+ messages
2012-01-09 20:50 [Xenomai-help] Issue with Auto relax and nested mutexes Makarand Pradhan
2012-01-10 15:11 ` Philippe Gerum
2012-01-10 15:26   ` Makarand Pradhan
2012-01-10 15:38     ` Philippe Gerum
2012-01-10 15:40     ` Makarand Pradhan
2012-01-10 15:40       ` Philippe Gerum
2012-01-10 15:42         ` Philippe Gerum
2012-01-10 15:51           ` Makarand Pradhan
2012-01-10 17:51             ` Philippe Gerum
2012-01-10 18:08             ` Philippe Gerum
2012-01-10 18:39               ` Makarand Pradhan
2012-01-10 19:10                 ` Makarand Pradhan
2012-01-10 20:30                   ` Philippe Gerum
2012-01-18 22:41                   ` Makarand Pradhan
2012-01-19 10:17                     ` Gilles Chanteperdrix
2012-01-19 11:25                     ` Philippe Gerum
2012-01-19 12:29                       ` Gilles Chanteperdrix
2012-01-19 15:35                         ` Makarand Pradhan
2012-01-19 15:22                       ` Makarand Pradhan
2012-01-19 15:49                         ` Philippe Gerum
2012-01-19 16:22                           ` Makarand Pradhan
2012-01-19 16:39                             ` Makarand Pradhan
2012-01-23 15:01                               ` Makarand Pradhan
