* [Xenomai] Prio ceiling mutex issues
@ 2016-05-09 19:03 Jan Kiszka
2016-05-09 19:16 ` Jan Kiszka
2016-05-09 19:30 ` Philippe Gerum
0 siblings, 2 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:03 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai
Hi Philippe,
we ran into at least one bug of the current prio ceiling support for
synch objects: on xnsynch_release, we only reschedule if we woken up a
new owner. But we also have to reschedule after dropping the ceiling
prio as some higher prio thread may be waiting.
For that, I would propose to alter the return type and semantic of
xnsynch_release from "pointer to new owner" into "bool, true if
reschedule is needed". It seems, no caller of xnsynch_release makes a
non-boolean use of the return value anyway. And then we can simply
return true on "synch->status & XNSYNCH_PP".
That leads to the question what would happen in the fast case. Consider
pthread_mutex_lock(ceiling_mutex);
wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
pthread_mutex_unlock(ceiling_mutex);
To my current understanding, there is nothing enforcing a syscall for
the unlock caller if some thread woke up after the lock, thus may have a
high prio after the unlock. Or am I missing something?
Thanks,
Jan
--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] Prio ceiling mutex issues
2016-05-09 19:03 [Xenomai] Prio ceiling mutex issues Jan Kiszka
@ 2016-05-09 19:16 ` Jan Kiszka
2016-05-09 19:22 ` Philippe Gerum
2016-05-09 19:30 ` Philippe Gerum
1 sibling, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:16 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai
On 2016-05-09 21:03, Jan Kiszka wrote:
> Hi Philippe,
>
> we ran into at least one bug of the current prio ceiling support for
> synch objects: on xnsynch_release, we only reschedule if we woken up a
> new owner. But we also have to reschedule after dropping the ceiling
> prio as some higher prio thread may be waiting.
>
> For that, I would propose to alter the return type and semantic of
> xnsynch_release from "pointer to new owner" into "bool, true if
> reschedule is needed". It seems, no caller of xnsynch_release makes a
> non-boolean use of the return value anyway. And then we can simply
> return true on "synch->status & XNSYNCH_PP".
>
> That leads to the question what would happen in the fast case. Consider
>
> pthread_mutex_lock(ceiling_mutex);
> wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
> pthread_mutex_unlock(ceiling_mutex);
>
> To my current understanding, there is nothing enforcing a syscall for
> the unlock caller if some thread woke up after the lock, thus may have a
> high prio after the unlock. Or am I missing something?
OK, I think I got it: When we found a reason to reschedule after taking
such a ceiling lock lazily, we commit the prio-boost and enforce the
kernel exit on release via xnsynch_fast_ceiling in commit_ceiling,
right? Then we only need to fix the above scheduling issue and are fine.
I'll quickly hack up some patches for discussion.
Jan
--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] Prio ceiling mutex issues
2016-05-09 19:16 ` Jan Kiszka
@ 2016-05-09 19:22 ` Philippe Gerum
2016-05-09 19:27 ` Jan Kiszka
0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2016-05-09 19:22 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai
On 05/09/2016 09:16 PM, Jan Kiszka wrote:
> On 2016-05-09 21:03, Jan Kiszka wrote:
>> Hi Philippe,
>>
>> we ran into at least one bug of the current prio ceiling support for
>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>> new owner. But we also have to reschedule after dropping the ceiling
>> prio as some higher prio thread may be waiting.
>>
>> For that, I would propose to alter the return type and semantic of
>> xnsynch_release from "pointer to new owner" into "bool, true if
>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>> non-boolean use of the return value anyway. And then we can simply
>> return true on "synch->status & XNSYNCH_PP".
>>
>> That leads to the question what would happen in the fast case. Consider
>>
>> pthread_mutex_lock(ceiling_mutex);
>> wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
>> pthread_mutex_unlock(ceiling_mutex);
>>
>> To my current understanding, there is nothing enforcing a syscall for
>> the unlock caller if some thread woke up after the lock, thus may have a
>> high prio after the unlock. Or am I missing something?
>
> OK, I think I got it: When we found a reason to reschedule after taking
> such a ceiling lock lazily, we commit the prio-boost and enforce the
> kernel exit on release via xnsynch_fast_ceiling in commit_ceiling,
> right?
Correct (at the very least this is the intent of the implementation).
--
Philippe.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] Prio ceiling mutex issues
2016-05-09 19:22 ` Philippe Gerum
@ 2016-05-09 19:27 ` Jan Kiszka
0 siblings, 0 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:27 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai
On 2016-05-09 21:22, Philippe Gerum wrote:
> On 05/09/2016 09:16 PM, Jan Kiszka wrote:
>> On 2016-05-09 21:03, Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> we ran into at least one bug of the current prio ceiling support for
>>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>>> new owner. But we also have to reschedule after dropping the ceiling
>>> prio as some higher prio thread may be waiting.
>>>
>>> For that, I would propose to alter the return type and semantic of
>>> xnsynch_release from "pointer to new owner" into "bool, true if
>>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>>> non-boolean use of the return value anyway. And then we can simply
>>> return true on "synch->status & XNSYNCH_PP".
>>>
>>> That leads to the question what would happen in the fast case. Consider
>>>
>>> pthread_mutex_lock(ceiling_mutex);
>>> wake_up_thread(some_thread); // caller_prio < target_prio < ceiling_prio
>>> pthread_mutex_unlock(ceiling_mutex);
>>>
>>> To my current understanding, there is nothing enforcing a syscall for
>>> the unlock caller if some thread woke up after the lock, thus may have a
>>> high prio after the unlock. Or am I missing something?
>>
>> OK, I think I got it: When we found a reason to reschedule after taking
>> such a ceiling lock lazily, we commit the prio-boost and enforce the
>> kernel exit on release via xnsynch_fast_ceiling in commit_ceiling,
>> right?
>
> Correct (at the very least this is the intent of the implementation).
>
Good... I pushed the two patches I have in mind, but both lack testing
beyond "builds for me". I will let them run to our scenario tomorrow
unless you see a problem.
Jan
--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] Prio ceiling mutex issues
2016-05-09 19:03 [Xenomai] Prio ceiling mutex issues Jan Kiszka
2016-05-09 19:16 ` Jan Kiszka
@ 2016-05-09 19:30 ` Philippe Gerum
2016-05-09 19:32 ` Jan Kiszka
1 sibling, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2016-05-09 19:30 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai
On 05/09/2016 09:03 PM, Jan Kiszka wrote:
> Hi Philippe,
>
> we ran into at least one bug of the current prio ceiling support for
> synch objects: on xnsynch_release, we only reschedule if we woken up a
> new owner. But we also have to reschedule after dropping the ceiling
> prio as some higher prio thread may be waiting.
>
> For that, I would propose to alter the return type and semantic of
> xnsynch_release from "pointer to new owner" into "bool, true if
> reschedule is needed". It seems, no caller of xnsynch_release makes a
> non-boolean use of the return value anyway. And then we can simply
> return true on "synch->status & XNSYNCH_PP".
I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:
diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
index 976261d..f84cb6e 100644
--- a/kernel/cobalt/synch.c
+++ b/kernel/cobalt/synch.c
@@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
*
* @coretags{primary-only, might-switch}
*/
-struct xnthread *xnsynch_release(struct xnsynch *synch,
- struct xnthread *curr)
+static bool xnsynch_release(struct xnsynch *synch,
+ struct xnthread *curr)
{
- struct xnthread *nextowner = NULL;
+ bool resched = false;
xnhandle_t currh, h;
atomic_t *lockp;
spl_t s;
@@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
if ((h & ~XNSYNCH_FLCEIL) != currh)
/* FLCLAIM set, synch is contended. */
- nextowner = transfer_ownership(synch, curr);
+ resched = transfer_ownership(synch, curr) != NULL;
else if (h != currh) /* FLCEIL set, FLCLAIM clear. */
atomic_set(lockp, XN_NO_HANDLE);
- if (synch->status & XNSYNCH_PP)
+ if (synch->status & XNSYNCH_PP) {
clear_pp_boost(synch, curr);
+ resched = true;
+ }
xnlock_put_irqrestore(&nklock, s);
- return nextowner;
+ return resched;
}
EXPORT_SYMBOL_GPL(xnsynch_release);
--
Philippe.
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [Xenomai] Prio ceiling mutex issues
2016-05-09 19:30 ` Philippe Gerum
@ 2016-05-09 19:32 ` Jan Kiszka
2016-05-10 13:32 ` Jan Kiszka
2016-05-10 13:57 ` Philippe Gerum
0 siblings, 2 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-09 19:32 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai
On 2016-05-09 21:30, Philippe Gerum wrote:
> On 05/09/2016 09:03 PM, Jan Kiszka wrote:
>> Hi Philippe,
>>
>> we ran into at least one bug of the current prio ceiling support for
>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>> new owner. But we also have to reschedule after dropping the ceiling
>> prio as some higher prio thread may be waiting.
>>
>> For that, I would propose to alter the return type and semantic of
>> xnsynch_release from "pointer to new owner" into "bool, true if
>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>> non-boolean use of the return value anyway. And then we can simply
>> return true on "synch->status & XNSYNCH_PP".
>
> I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:
>
> diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
> index 976261d..f84cb6e 100644
> --- a/kernel/cobalt/synch.c
> +++ b/kernel/cobalt/synch.c
> @@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
> *
> * @coretags{primary-only, might-switch}
> */
> -struct xnthread *xnsynch_release(struct xnsynch *synch,
> - struct xnthread *curr)
> +static bool xnsynch_release(struct xnsynch *synch,
> + struct xnthread *curr)
> {
> - struct xnthread *nextowner = NULL;
> + bool resched = false;
> xnhandle_t currh, h;
> atomic_t *lockp;
> spl_t s;
> @@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
> h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
> if ((h & ~XNSYNCH_FLCEIL) != currh)
> /* FLCLAIM set, synch is contended. */
> - nextowner = transfer_ownership(synch, curr);
> + resched = transfer_ownership(synch, curr) != NULL;
> else if (h != currh) /* FLCEIL set, FLCLAIM clear. */
> atomic_set(lockp, XN_NO_HANDLE);
>
> - if (synch->status & XNSYNCH_PP)
> + if (synch->status & XNSYNCH_PP) {
> clear_pp_boost(synch, curr);
> + resched = true;
> + }
>
> xnlock_put_irqrestore(&nklock, s);
>
> - return nextowner;
> + return resched;
> }
> EXPORT_SYMBOL_GPL(xnsynch_release);
>
>
OK, I didn't use bool (will change that), but otherwise did the same
with that two patches in for-forge. Or what do you mean?
Jan
--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] Prio ceiling mutex issues
2016-05-09 19:32 ` Jan Kiszka
@ 2016-05-10 13:32 ` Jan Kiszka
2016-05-10 13:57 ` Philippe Gerum
1 sibling, 0 replies; 8+ messages in thread
From: Jan Kiszka @ 2016-05-10 13:32 UTC (permalink / raw)
To: Philippe Gerum; +Cc: Xenomai
On 2016-05-09 21:32, Jan Kiszka wrote:
> On 2016-05-09 21:30, Philippe Gerum wrote:
>> On 05/09/2016 09:03 PM, Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> we ran into at least one bug of the current prio ceiling support for
>>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>>> new owner. But we also have to reschedule after dropping the ceiling
>>> prio as some higher prio thread may be waiting.
>>>
>>> For that, I would propose to alter the return type and semantic of
>>> xnsynch_release from "pointer to new owner" into "bool, true if
>>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>>> non-boolean use of the return value anyway. And then we can simply
>>> return true on "synch->status & XNSYNCH_PP".
>>
>> I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:
>>
>> diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
>> index 976261d..f84cb6e 100644
>> --- a/kernel/cobalt/synch.c
>> +++ b/kernel/cobalt/synch.c
>> @@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
>> *
>> * @coretags{primary-only, might-switch}
>> */
>> -struct xnthread *xnsynch_release(struct xnsynch *synch,
>> - struct xnthread *curr)
>> +static bool xnsynch_release(struct xnsynch *synch,
>> + struct xnthread *curr)
>> {
>> - struct xnthread *nextowner = NULL;
>> + bool resched = false;
>> xnhandle_t currh, h;
>> atomic_t *lockp;
>> spl_t s;
>> @@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
>> h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
>> if ((h & ~XNSYNCH_FLCEIL) != currh)
>> /* FLCLAIM set, synch is contended. */
>> - nextowner = transfer_ownership(synch, curr);
>> + resched = transfer_ownership(synch, curr) != NULL;
>> else if (h != currh) /* FLCEIL set, FLCLAIM clear. */
>> atomic_set(lockp, XN_NO_HANDLE);
>>
>> - if (synch->status & XNSYNCH_PP)
>> + if (synch->status & XNSYNCH_PP) {
>> clear_pp_boost(synch, curr);
>> + resched = true;
>> + }
>>
>> xnlock_put_irqrestore(&nklock, s);
>>
>> - return nextowner;
>> + return resched;
>> }
>> EXPORT_SYMBOL_GPL(xnsynch_release);
>>
>>
>
> OK, I didn't use bool (will change that), but otherwise did the same
> with that two patches in for-forge. Or what do you mean?
Problem solved these ways, both for slow and fast path. Just received
the confirmation.
Jan
--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [Xenomai] Prio ceiling mutex issues
2016-05-09 19:32 ` Jan Kiszka
2016-05-10 13:32 ` Jan Kiszka
@ 2016-05-10 13:57 ` Philippe Gerum
1 sibling, 0 replies; 8+ messages in thread
From: Philippe Gerum @ 2016-05-10 13:57 UTC (permalink / raw)
To: Jan Kiszka; +Cc: Xenomai
On 05/09/2016 09:32 PM, Jan Kiszka wrote:
> On 2016-05-09 21:30, Philippe Gerum wrote:
>> On 05/09/2016 09:03 PM, Jan Kiszka wrote:
>>> Hi Philippe,
>>>
>>> we ran into at least one bug of the current prio ceiling support for
>>> synch objects: on xnsynch_release, we only reschedule if we woken up a
>>> new owner. But we also have to reschedule after dropping the ceiling
>>> prio as some higher prio thread may be waiting.
>>>
>>> For that, I would propose to alter the return type and semantic of
>>> xnsynch_release from "pointer to new owner" into "bool, true if
>>> reschedule is needed". It seems, no caller of xnsynch_release makes a
>>> non-boolean use of the return value anyway. And then we can simply
>>> return true on "synch->status & XNSYNCH_PP".
>>
>> I agree, but we need to keep the next owner check in the boolean status. Something like this would do I believe:
>>
>> diff --git a/kernel/cobalt/synch.c b/kernel/cobalt/synch.c
>> index 976261d..f84cb6e 100644
>> --- a/kernel/cobalt/synch.c
>> +++ b/kernel/cobalt/synch.c
>> @@ -913,10 +913,10 @@ static struct xnthread *transfer_ownership(struct xnsynch *synch,
>> *
>> * @coretags{primary-only, might-switch}
>> */
>> -struct xnthread *xnsynch_release(struct xnsynch *synch,
>> - struct xnthread *curr)
>> +static bool xnsynch_release(struct xnsynch *synch,
>> + struct xnthread *curr)
>> {
>> - struct xnthread *nextowner = NULL;
>> + bool resched = false;
>> xnhandle_t currh, h;
>> atomic_t *lockp;
>> spl_t s;
>> @@ -946,16 +946,18 @@ struct xnthread *xnsynch_release(struct xnsynch *synch,
>> h = atomic_cmpxchg(lockp, currh, XN_NO_HANDLE);
>> if ((h & ~XNSYNCH_FLCEIL) != currh)
>> /* FLCLAIM set, synch is contended. */
>> - nextowner = transfer_ownership(synch, curr);
>> + resched = transfer_ownership(synch, curr) != NULL;
>> else if (h != currh) /* FLCEIL set, FLCLAIM clear. */
>> atomic_set(lockp, XN_NO_HANDLE);
>>
>> - if (synch->status & XNSYNCH_PP)
>> + if (synch->status & XNSYNCH_PP) {
>> clear_pp_boost(synch, curr);
>> + resched = true;
>> + }
>>
>> xnlock_put_irqrestore(&nklock, s);
>>
>> - return nextowner;
>> + return resched;
>> }
>> EXPORT_SYMBOL_GPL(xnsynch_release);
>>
>>
>
> OK, I didn't use bool (will change that), but otherwise did the same
> with that two patches in for-forge. Or what do you mean?
>
No issue, e-mail crossing.
--
Philippe.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2016-05-10 13:57 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-05-09 19:03 [Xenomai] Prio ceiling mutex issues Jan Kiszka
2016-05-09 19:16 ` Jan Kiszka
2016-05-09 19:22 ` Philippe Gerum
2016-05-09 19:27 ` Jan Kiszka
2016-05-09 19:30 ` Philippe Gerum
2016-05-09 19:32 ` Jan Kiszka
2016-05-10 13:32 ` Jan Kiszka
2016-05-10 13:57 ` Philippe Gerum
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.