All of lore.kernel.org
 help / color / mirror / Atom feed
* [Xenomai] Prio ceiling mutex issues
@ 2016-05-09 19:03 Jan Kiszka
  2016-05-09 19:16 ` Jan Kiszka
  2016-05-09 19:30 ` Philippe Gerum
  0 siblings, 2 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:03 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai

Hi Philippe,

we ran into at least one bug of the current prio ceiling support for
synch objects: on xnsynch_release, we only reschedule if we woken up a
new owner. But we also have to reschedule after dropping the ceiling
prio as some higher prio thread may be waiting.

For that, I would propose to alter the return type and semantic of
xnsynch_release from "pointer to new owner" into "bool, true if
reschedule is needed". It seems, no caller of xnsynch_release makes a
non-boolean use of the return value anyway. And then we can simply
return true on "synch->status & XNSYNCH_PP".

That leads to the question what would happen in the fast case. Consider

pthread_mutex_lock(ceiling_mutex);
wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
pthread_mutex_unlock(ceiling_mutex);

To my current understanding, there is nothing enforcing a syscall for
the unlock caller if some thread woke up after the lock, thus may have a
high prio after the unlock. Or am I missing something?

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] Prio ceiling mutex issues
  2016-05-09 19:03 [Xenomai] Prio ceiling mutex issues Jan Kiszka
@ 2016-05-09 19:16 ` Jan Kiszka
  2016-05-09 19:22   ` Philippe Gerum
  2016-05-09 19:30 ` Philippe Gerum
  1 sibling, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:16 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai

On 2016-05-09 21:03, Jan Kiszka wrote:
> Hi Philippe,
> 
> we ran into at least one bug of the current prio ceiling support for
> synch objects: on xnsynch_release, we only reschedule if we woken up a
> new owner. But we also have to reschedule after dropping the ceiling
> prio as some higher prio thread may be waiting.
> 
> For that, I would propose to alter the return type and semantic of
> xnsynch_release from "pointer to new owner" into "bool, true if
> reschedule is needed". It seems, no caller of xnsynch_release makes a
> non-boolean use of the return value anyway. And then we can simply
> return true on "synch->status & XNSYNCH_PP".
> 
> That leads to the question what would happen in the fast case. Consider
> 
> pthread_mutex_lock(ceiling_mutex);
> wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
> pthread_mutex_unlock(ceiling_mutex);
> 
> To my current understanding, there is nothing enforcing a syscall for
> the unlock caller if some thread woke up after the lock, thus may have a
> high prio after the unlock. Or am I missing something?

OK, I think I got it: When we found a reason to reschedule after taking
such a ceiling lock lazily, we commit the prio-boost and enforce the
kernel exit on release via xnsynch_fast_ceiling in commit_ceiling,
right? Then we only need to fix the above scheduling issue and are fine.
I'll quickly hack up some patches for discussion.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] Prio ceiling mutex issues
  2016-05-09 19:16 ` Jan Kiszka
@ 2016-05-09 19:22   ` Philippe Gerum
  2016-05-09 19:27     ` Jan Kiszka
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2016-05-09 19:22 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 05/09/2016 09:16 PM, Jan Kiszka wrote:
> On 2016-05-09 21:03, Jan Kiszka wrote:
>> Hi Philippe,
>>
>> we ran into at least one bug of the current prio ceiling support for
>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>> new owner. But we also have to reschedule after dropping the ceiling
>> prio as some higher prio thread may be waiting.
>>
>> For that, I would propose to alter the return type and semantic of
>> xnsynch_release from "pointer to new owner" into "bool, true if
>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>> non-boolean use of the return value anyway. And then we can simply
>> return true on "synch->status & XNSYNCH_PP".
>>
>> That leads to the question what would happen in the fast case. Consider
>>
>> pthread_mutex_lock(ceiling_mutex);
>> wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
>> pthread_mutex_unlock(ceiling_mutex);
>>
>> To my current understanding, there is nothing enforcing a syscall for
>> the unlock caller if some thread woke up after the lock, thus may have a
>> high prio after the unlock. Or am I missing something?
> 
> OK, I think I got it: When we found a reason to reschedule after taking
> such a ceiling lock lazily, we commit the prio-boost and enforce the
> kernel exit on release via xnsynch_fast_ceiling in commit_ceiling,
> right?

Correct (at the very least this is the intent of the implementation).

-- 
Philippe.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] Prio ceiling mutex issues
  2016-05-09 19:22   ` Philippe Gerum
@ 2016-05-09 19:27     ` Jan Kiszka
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:27 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai

On 2016-05-09 21:22, Philippe Gerum wrote:
> On 05/09/2016 09:16 PM, Jan Kiszka wrote:
>> On 2016-05-09 21:03, Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> we ran into at least one bug of the current prio ceiling support for
>>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>>> new owner. But we also have to reschedule after dropping the ceiling
>>> prio as some higher prio thread may be waiting.
>>>
>>> For that, I would propose to alter the return type and semantic of
>>> xnsynch_release from "pointer to new owner" into "bool, true if
>>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>>> non-boolean use of the return value anyway. And then we can simply
>>> return true on "synch->status & XNSYNCH_PP".
>>>
>>> That leads to the question what would happen in the fast case. Consider
>>>
>>> pthread_mutex_lock(ceiling_mutex);
>>> wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
>>> pthread_mutex_unlock(ceiling_mutex);
>>>
>>> To my current understanding, there is nothing enforcing a syscall for
>>> the unlock caller if some thread woke up after the lock, thus may have a
>>> high prio after the unlock. Or am I missing something?
>>
>> OK, I think I got it: When we found a reason to reschedule after taking
>> such a ceiling lock lazily, we commit the prio-boost and enforce the
>> kernel exit on release via xnsynch_fast_ceiling in commit_ceiling,
>> right?
> 
> Correct (at the very least this is the intent of the implementation).
> 

Good... I pushed the two patches I have in mind, but both lack testing
beyond "builds for me". I will let them run to our scenario tomorrow
unless you see a problem.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] Prio ceiling mutex issues
  2016-05-09 19:03 [Xenomai] Prio ceiling mutex issues Jan Kiszka
  2016-05-09 19:16 ` Jan Kiszka
@ 2016-05-09 19:30 ` Philippe Gerum
  2016-05-09 19:32   ` Jan Kiszka
  1 sibling, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2016-05-09 19:30 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 05/09/2016 09:03 PM, Jan Kiszka wrote:
> Hi Philippe,
> 
> we ran into at least one bug of the current prio ceiling support for
> synch objects: on xnsynch_release, we only reschedule if we woken up a
> new owner. But we also have to reschedule after dropping the ceiling
> prio as some higher prio thread may be waiting.
> 
> For that, I would propose to alter the return type and semantic of
> xnsynch_release from "pointer to new owner" into "bool, true if
> reschedule is needed". It seems, no caller of xnsynch_release makes a
> non-boolean use of the return value anyway. And then we can simply
> return true on "synch->status & XNSYNCH_PP".

I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:

diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
index 976261d..f84cb6e 100644
--- a/kernel/cobalt/synch.c
+++ b/kernel/cobalt/synch.c
@@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
  *
  * @coretags{primary-only, might-switch}
  */
-struct xnthread *xnsynch_release(struct xnsynch *synch,
-				 struct xnthread *curr)
+static bool xnsynch_release(struct xnsynch *synch,
+			    struct xnthread *curr)
 {
-	struct xnthread *nextowner = NULL;
+	bool resched = false;
 	xnhandle_t currh, h;
 	atomic_t *lockp;
 	spl_t s;
@@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
 	h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
 	if ((h & ~XNSYNCH_FLCEIL) != currh)
 		/* FLCLAIM set, synch is contended. */
-		nextowner = transfer_ownership(synch, curr);
+		resched = transfer_ownership(synch, curr) != NULL;
 	else if (h != currh)	/* FLCEIL set, FLCLAIM clear. */
 		atomic_set(lockp, XN_NO_HANDLE);
 
-	if (synch->status & XNSYNCH_PP)
+	if (synch->status & XNSYNCH_PP) {
 		clear_pp_boost(synch, curr);
+		resched = true;
+	}
 
 	xnlock_put_irqrestore(&nklock, s);
 
-	return nextowner;
+	return resched;
 }
 EXPORT_SYMBOL_GPL(xnsynch_release);
 

-- 
Philippe.


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Xenomai] Prio ceiling mutex issues
  2016-05-09 19:30 ` Philippe Gerum
@ 2016-05-09 19:32   ` Jan Kiszka
  2016-05-10 13:32     ` Jan Kiszka
  2016-05-10 13:57     ` Philippe Gerum
  0 siblings, 2 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:32 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai

On 2016-05-09 21:30, Philippe Gerum wrote:
> On 05/09/2016 09:03 PM, Jan Kiszka wrote:
>> Hi Philippe,
>>
>> we ran into at least one bug of the current prio ceiling support for
>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>> new owner. But we also have to reschedule after dropping the ceiling
>> prio as some higher prio thread may be waiting.
>>
>> For that, I would propose to alter the return type and semantic of
>> xnsynch_release from "pointer to new owner" into "bool, true if
>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>> non-boolean use of the return value anyway. And then we can simply
>> return true on "synch->status & XNSYNCH_PP".
> 
> I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:
> 
> diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
> index 976261d..f84cb6e 100644
> --- a/kernel/cobalt/synch.c
> +++ b/kernel/cobalt/synch.c
> @@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
>   *
>   * @coretags{primary-only, might-switch}
>   */
> -struct xnthread *xnsynch_release(struct xnsynch *synch,
> -				 struct xnthread *curr)
> +static bool xnsynch_release(struct xnsynch *synch,
> +			    struct xnthread *curr)
>  {
> -	struct xnthread *nextowner = NULL;
> +	bool resched = false;
>  	xnhandle_t currh, h;
>  	atomic_t *lockp;
>  	spl_t s;
> @@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
>  	h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
>  	if ((h & ~XNSYNCH_FLCEIL) != currh)
>  		/* FLCLAIM set, synch is contended. */
> -		nextowner = transfer_ownership(synch, curr);
> +		resched = transfer_ownership(synch, curr) != NULL;
>  	else if (h != currh)	/* FLCEIL set, FLCLAIM clear. */
>  		atomic_set(lockp, XN_NO_HANDLE);
>  
> -	if (synch->status & XNSYNCH_PP)
> +	if (synch->status & XNSYNCH_PP) {
>  		clear_pp_boost(synch, curr);
> +		resched = true;
> +	}
>  
>  	xnlock_put_irqrestore(&nklock, s);
>  
> -	return nextowner;
> +	return resched;
>  }
>  EXPORT_SYMBOL_GPL(xnsynch_release);
>  
> 

OK, I didn't use bool (will change that), but otherwise did the same
with that two patches in for-forge. Or what do you mean?

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] Prio ceiling mutex issues
  2016-05-09 19:32   ` Jan Kiszka
@ 2016-05-10 13:32     ` Jan Kiszka
  2016-05-10 13:57     ` Philippe Gerum
  1 sibling, 0 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-10 13:32 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: Xenomai

On 2016-05-09 21:32, Jan Kiszka wrote:
> On 2016-05-09 21:30, Philippe Gerum wrote:
>> On 05/09/2016 09:03 PM, Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> we ran into at least one bug of the current prio ceiling support for
>>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>>> new owner. But we also have to reschedule after dropping the ceiling
>>> prio as some higher prio thread may be waiting.
>>>
>>> For that, I would propose to alter the return type and semantic of
>>> xnsynch_release from "pointer to new owner" into "bool, true if
>>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>>> non-boolean use of the return value anyway. And then we can simply
>>> return true on "synch->status & XNSYNCH_PP".
>>
>> I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:
>>
>> diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
>> index 976261d..f84cb6e 100644
>> --- a/kernel/cobalt/synch.c
>> +++ b/kernel/cobalt/synch.c
>> @@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
>>   *
>>   * @coretags{primary-only, might-switch}
>>   */
>> -struct xnthread *xnsynch_release(struct xnsynch *synch,
>> -				 struct xnthread *curr)
>> +static bool xnsynch_release(struct xnsynch *synch,
>> +			    struct xnthread *curr)
>>  {
>> -	struct xnthread *nextowner = NULL;
>> +	bool resched = false;
>>  	xnhandle_t currh, h;
>>  	atomic_t *lockp;
>>  	spl_t s;
>> @@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
>>  	h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
>>  	if ((h & ~XNSYNCH_FLCEIL) != currh)
>>  		/* FLCLAIM set, synch is contended. */
>> -		nextowner = transfer_ownership(synch, curr);
>> +		resched = transfer_ownership(synch, curr) != NULL;
>>  	else if (h != currh)	/* FLCEIL set, FLCLAIM clear. */
>>  		atomic_set(lockp, XN_NO_HANDLE);
>>  
>> -	if (synch->status & XNSYNCH_PP)
>> +	if (synch->status & XNSYNCH_PP) {
>>  		clear_pp_boost(synch, curr);
>> +		resched = true;
>> +	}
>>  
>>  	xnlock_put_irqrestore(&nklock, s);
>>  
>> -	return nextowner;
>> +	return resched;
>>  }
>>  EXPORT_SYMBOL_GPL(xnsynch_release);
>>  
>>
> 
> OK, I didn't use bool (will change that), but otherwise did the same
> with that two patches in for-forge. Or what do you mean?

Problem solved these ways, both for slow and fast path. Just received
the confirmation.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xenomai] Prio ceiling mutex issues
  2016-05-09 19:32   ` Jan Kiszka
  2016-05-10 13:32     ` Jan Kiszka
@ 2016-05-10 13:57     ` Philippe Gerum
  1 sibling, 0 replies; 8+ messages in thread
From: Philippe Gerum @ 2016-05-10 13:57 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Xenomai

On 05/09/2016 09:32 PM, Jan Kiszka wrote:
> On 2016-05-09 21:30, Philippe Gerum wrote:
>> On 05/09/2016 09:03 PM, Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> we ran into at least one bug of the current prio ceiling support for
>>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>>> new owner. But we also have to reschedule after dropping the ceiling
>>> prio as some higher prio thread may be waiting.
>>>
>>> For that, I would propose to alter the return type and semantic of
>>> xnsynch_release from "pointer to new owner" into "bool, true if
>>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>>> non-boolean use of the return value anyway. And then we can simply
>>> return true on "synch->status & XNSYNCH_PP".
>>
>> I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:
>>
>> diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
>> index 976261d..f84cb6e 100644
>> --- a/kernel/cobalt/synch.c
>> +++ b/kernel/cobalt/synch.c
>> @@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
>>   *
>>   * @coretags{primary-only, might-switch}
>>   */
>> -struct xnthread *xnsynch_release(struct xnsynch *synch,
>> -				 struct xnthread *curr)
>> +static bool xnsynch_release(struct xnsynch *synch,
>> +			    struct xnthread *curr)
>>  {
>> -	struct xnthread *nextowner = NULL;
>> +	bool resched = false;
>>  	xnhandle_t currh, h;
>>  	atomic_t *lockp;
>>  	spl_t s;
>> @@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
>>  	h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
>>  	if ((h & ~XNSYNCH_FLCEIL) != currh)
>>  		/* FLCLAIM set, synch is contended. */
>> -		nextowner = transfer_ownership(synch, curr);
>> +		resched = transfer_ownership(synch, curr) != NULL;
>>  	else if (h != currh)	/* FLCEIL set, FLCLAIM clear. */
>>  		atomic_set(lockp, XN_NO_HANDLE);
>>  
>> -	if (synch->status & XNSYNCH_PP)
>> +	if (synch->status & XNSYNCH_PP) {
>>  		clear_pp_boost(synch, curr);
>> +		resched = true;
>> +	}
>>  
>>  	xnlock_put_irqrestore(&nklock, s);
>>  
>> -	return nextowner;
>> +	return resched;
>>  }
>>  EXPORT_SYMBOL_GPL(xnsynch_release);
>>  
>>
> 
> OK, I didn't use bool (will change that), but otherwise did the same
> with that two patches in for-forge. Or what do you mean?
> 

No issue, e-mail crossing.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-05-10 13:57 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-09 19:03 [Xenomai] Prio ceiling mutex issues Jan Kiszka
2016-05-09 19:16 ` Jan Kiszka
2016-05-09 19:22   ` Philippe Gerum
2016-05-09 19:27     ` Jan Kiszka
2016-05-09 19:30 ` Philippe Gerum
2016-05-09 19:32   ` Jan Kiszka
2016-05-10 13:32     ` Jan Kiszka
2016-05-10 13:57     ` Philippe Gerum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.