[CXP] Discussing the RTDM specification

All of lore.kernel.org
 help / color / mirror / Atom feed

* [CXP] Discussing the RTDM specification
@ 2020-12-18 14:19 Philippe Gerum
  2020-12-19  4:53 ` Fino Meng
  2020-12-23  9:45 ` Jan Kiszka
  0 siblings, 2 replies; 10+ messages in thread
From: Philippe Gerum @ 2020-12-18 14:19 UTC (permalink / raw)
  To: xenomai

This wiki page [1] contains a draft proposal about specifying which
services from the current RTDM interface should be part of the Common
Xenomai Platform. Some proposals for deprecation stand out:

- I suspect that only very few RTDM drivers are actually handling
  requests from other kernel-based drivers in real world applications,
  at least not enough to justify RTDM codifying these rare cases into a
  common interface (rtdm_open, rtdm_read, rtdm_write etc).

  In other words, although I would agree that a few particular drivers
  might want to export a couple of services to kernel-based clients in
  order to provide them some sort of backchannel, it seems wrong to
  require RTDM drivers to provide a kernel interface which would match
  their user interface in the same terms. For these specific cases, ad
  hoc code in these few drivers should be enough.

  Besides, I believe that most kernel->kernel request paths implemented
  by in-tree RTDM drivers have never been tested for years, if ever.
  Meanwhile, this kernel->kernel API introduces a basic exception case
  into many RTDM and driver code paths, e.g. for differentiating kernel
  vs user buffers, for only very few use cases.

  For these reasons, I would suggest to deprecate the kernel->kernel API
  from RTDM starting from 3.3, excluding it from the CXP in the same
  move.

- RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
  lock must go. For SMP scalability reasons, this big lock was
  eliminated from the EVL core, which means that all the attached
  semantics will not hold there. Serializing access to shared resources
  should be guaranteed by resource-specific locking, not by a giant
  traffic light like the big lock implements.

- rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
  condition on grabbing a mutex either means that:

  * the mutex was already locked on entry to the call if timeout ==
    RTDM_TIMEOUT_NONE,

  * the mutex was not released within the allotted time, which had to be
    long enough to prevent early shots. This means that something is
    most likely going really wrong in the software, since a mutex is
    supposed to cover fairly short, hopefully simple sections of code,
    exhibiting an obvious exit path.

  In the first case, we would be better off providing
  rtdm_trylock_mutex() which has clearer semantics, starting from 3.3,
  adding it to the CXP. The ship is most likely wrecked already in the
  second case, so using a timeout condition as a way towards recovery is
  unlikely to succeed at this point anyway.

[1] https://gitlab.denx.de/Xenomai/xenomai/-/wikis/CXP_RTDM

-- 
Philippe.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2020-12-18 14:19 [CXP] Discussing the RTDM specification Philippe Gerum
@ 2020-12-19  4:53 ` Fino Meng
  2020-12-23  9:45 ` Jan Kiszka
  1 sibling, 0 replies; 10+ messages in thread
From: Fino Meng @ 2020-12-19  4:53 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On Fri, Dec 18, 2020 at 03:19:44PM +0100, Philippe Gerum via Xenomai wrote:
> 
> This wiki page [1] contains a draft proposal about specifying which
> services from the current RTDM interface should be part of the Common
> Xenomai Platform. Some proposals for deprecation stand out:
> 
> - I suspect that only very few RTDM drivers are actually handling
>   requests from other kernel-based drivers in real world applications,
>   at least not enough to justify RTDM codifying these rare cases into a
>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
> 
>   In other words, although I would agree that a few particular drivers
>   might want to export a couple of services to kernel-based clients in
>   order to provide them some sort of backchannel, it seems wrong to
>   require RTDM drivers to provide a kernel interface which would match
>   their user interface in the same terms. For these specific cases, ad
>   hoc code in these few drivers should be enough.
> 
>   Besides, I believe that most kernel->kernel request paths implemented
>   by in-tree RTDM drivers have never been tested for years, if ever.
>   Meanwhile, this kernel->kernel API introduces a basic exception case
>   into many RTDM and driver code paths, e.g. for differentiating kernel
>   vs user buffers, for only very few use cases.
> 
>   For these reasons, I would suggest to deprecate the kernel->kernel API
>   from RTDM starting from 3.3, excluding it from the CXP in the same
>   move.

agree. 

> 
> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>   lock must go. For SMP scalability reasons, this big lock was
>   eliminated from the EVL core, which means that all the attached
>   semantics will not hold there. Serializing access to shared resources
>   should be guaranteed by resource-specific locking, not by a giant
>   traffic light like the big lock implements.
> 
> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>   condition on grabbing a mutex either means that:
> 
>   * the mutex was already locked on entry to the call if timeout ==
>     RTDM_TIMEOUT_NONE,
> 
>   * the mutex was not released within the allotted time, which had to be
>     long enough to prevent early shots. This means that something is
>     most likely going really wrong in the software, since a mutex is
>     supposed to cover fairly short, hopefully simple sections of code,
>     exhibiting an obvious exit path.
> 
>   In the first case, we would be better off providing
>   rtdm_trylock_mutex() which has clearer semantics, starting from 3.3,
>   adding it to the CXP. The ship is most likely wrecked already in the
>   second case, so using a timeout condition as a way towards recovery is
>   unlikely to succeed at this point anyway.

agree. 

rtdm timer can be used to implement some kind of timeout wakeup feature,
orthogonal to mutex API.

for beginners, simple rules are easy to remember:

spinlock: thread will busy waiting, no sleep, occopying CPU;
mutex: thread will go to sleep, abdicate CPU;

BR fino

> 
> [1] https://gitlab.denx.de/Xenomai/xenomai/-/wikis/CXP_RTDM
> 
> -- 
> Philippe.
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2020-12-18 14:19 [CXP] Discussing the RTDM specification Philippe Gerum
  2020-12-19  4:53 ` Fino Meng
@ 2020-12-23  9:45 ` Jan Kiszka
  2020-12-23 10:40   ` Philippe Gerum
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Kiszka @ 2020-12-23  9:45 UTC (permalink / raw)
  To: Philippe Gerum, xenomai

On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
> 
> This wiki page [1] contains a draft proposal about specifying which
> services from the current RTDM interface should be part of the Common
> Xenomai Platform. Some proposals for deprecation stand out:
> 
> - I suspect that only very few RTDM drivers are actually handling
>   requests from other kernel-based drivers in real world applications,
>   at least not enough to justify RTDM codifying these rare cases into a
>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
> 
>   In other words, although I would agree that a few particular drivers
>   might want to export a couple of services to kernel-based clients in
>   order to provide them some sort of backchannel, it seems wrong to
>   require RTDM drivers to provide a kernel interface which would match
>   their user interface in the same terms. For these specific cases, ad
>   hoc code in these few drivers should be enough.
> 
>   Besides, I believe that most kernel->kernel request paths implemented
>   by in-tree RTDM drivers have never been tested for years, if ever.
>   Meanwhile, this kernel->kernel API introduces a basic exception case
>   into many RTDM and driver code paths, e.g. for differentiating kernel
>   vs user buffers, for only very few use cases.
> 
>   For these reasons, I would suggest to deprecate the kernel->kernel API
>   from RTDM starting from 3.3, excluding it from the CXP in the same
>   move.

That's fine with me. The idea was once that something like bus drivers
would appear, but that never happened.

> 
> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>   lock must go. For SMP scalability reasons, this big lock was
>   eliminated from the EVL core, which means that all the attached
>   semantics will not hold there. Serializing access to shared resources
>   should be guaranteed by resource-specific locking, not by a giant
>   traffic light like the big lock implements.

This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
long ago, but users were migrated to cobalt_atomic_enter/leave which may
now make it look like we no longer need this. Maybe this is already the
case when using rtdm_waitqueue*, but let's convert everyone first.

> 
> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>   condition on grabbing a mutex either means that:
> 

I think you are missing the use cases:

mutex-lock-timed
...
wait-event-timed
...
mutex-unlock
(which goes long with timeout sequences)

In fact, all that could be replaced with

mutex-lock
...
atomic-entry
mutex-unlock
wait-event-timed
mutex-lock
atomic-leave
...
mutex-lock

but that is what we want to replace as well...

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2020-12-23  9:45 ` Jan Kiszka
@ 2020-12-23 10:40   ` Philippe Gerum
  2021-01-05 19:31     ` Jan Kiszka
  0 siblings, 1 reply; 10+ messages in thread
From: Philippe Gerum @ 2020-12-23 10:40 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai


Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>> 
>> This wiki page [1] contains a draft proposal about specifying which
>> services from the current RTDM interface should be part of the Common
>> Xenomai Platform. Some proposals for deprecation stand out:
>> 
>> - I suspect that only very few RTDM drivers are actually handling
>>   requests from other kernel-based drivers in real world applications,
>>   at least not enough to justify RTDM codifying these rare cases into a
>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>> 
>>   In other words, although I would agree that a few particular drivers
>>   might want to export a couple of services to kernel-based clients in
>>   order to provide them some sort of backchannel, it seems wrong to
>>   require RTDM drivers to provide a kernel interface which would match
>>   their user interface in the same terms. For these specific cases, ad
>>   hoc code in these few drivers should be enough.
>> 
>>   Besides, I believe that most kernel->kernel request paths implemented
>>   by in-tree RTDM drivers have never been tested for years, if ever.
>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>   vs user buffers, for only very few use cases.
>> 
>>   For these reasons, I would suggest to deprecate the kernel->kernel API
>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>   move.
>
> That's fine with me. The idea was once that something like bus drivers
> would appear, but that never happened.
>
>> 
>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>   lock must go. For SMP scalability reasons, this big lock was
>>   eliminated from the EVL core, which means that all the attached
>>   semantics will not hold there. Serializing access to shared resources
>>   should be guaranteed by resource-specific locking, not by a giant
>>   traffic light like the big lock implements.
>
> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
> long ago, but users were migrated to cobalt_atomic_enter/leave which may
> now make it look like we no longer need this. Maybe this is already the
> case when using rtdm_waitqueue*, but let's convert everyone first.

Alternatively, In-tree v3 drivers could also keep relying
RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for
them. Bottom line is to exclude from the CXP the whole idea that we may
schedule while holding a lock to protect against missed wake ups, in
general the very existence of any superlock which would cover everything
from top to bottom when serializing. I agree that having v3 converge
towards the CXP would be better though.

>
>> 
>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>   condition on grabbing a mutex either means that:
>> 
>
> I think you are missing the use cases:
>
> mutex-lock-timed
> ...
> wait-event-timed
> ...
> mutex-unlock
> (which goes long with timeout sequences)
>

There is a couple of issues with such use case: first we should never
ever sleep with a mutex held, this would trigger SIGDEBUG if done from
user ( a [binary] semaphore would at least prevent this problem), but
more importantly, how would this pattern allow the event to be signaled
given the waiter holds the lock the sender would need to acquire first?

> In fact, all that could be replaced with
>
> mutex-lock
> ...
> atomic-entry
> mutex-unlock
> wait-event-timed
> mutex-lock
> atomic-leave
> ...
> mutex-lock
>
> but that is what we want to replace as well...
>

Yep, that would work for v3, preventing wake ups to be missed the same way.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2020-12-23 10:40   ` Philippe Gerum
@ 2021-01-05 19:31     ` Jan Kiszka
  2021-01-09 17:01       ` Philippe Gerum
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kiszka @ 2021-01-05 19:31 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On 23.12.20 11:40, Philippe Gerum wrote:
> 
> Jan Kiszka <jan.kiszka@siemens.com> writes:
> 
>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>>>
>>> This wiki page [1] contains a draft proposal about specifying which
>>> services from the current RTDM interface should be part of the Common
>>> Xenomai Platform. Some proposals for deprecation stand out:
>>>
>>> - I suspect that only very few RTDM drivers are actually handling
>>>   requests from other kernel-based drivers in real world applications,
>>>   at least not enough to justify RTDM codifying these rare cases into a
>>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>>>
>>>   In other words, although I would agree that a few particular drivers
>>>   might want to export a couple of services to kernel-based clients in
>>>   order to provide them some sort of backchannel, it seems wrong to
>>>   require RTDM drivers to provide a kernel interface which would match
>>>   their user interface in the same terms. For these specific cases, ad
>>>   hoc code in these few drivers should be enough.
>>>
>>>   Besides, I believe that most kernel->kernel request paths implemented
>>>   by in-tree RTDM drivers have never been tested for years, if ever.
>>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>>   vs user buffers, for only very few use cases.
>>>
>>>   For these reasons, I would suggest to deprecate the kernel->kernel API
>>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>>   move.
>>
>> That's fine with me. The idea was once that something like bus drivers
>> would appear, but that never happened.
>>
>>>
>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>>   lock must go. For SMP scalability reasons, this big lock was
>>>   eliminated from the EVL core, which means that all the attached
>>>   semantics will not hold there. Serializing access to shared resources
>>>   should be guaranteed by resource-specific locking, not by a giant
>>>   traffic light like the big lock implements.
>>
>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
>> long ago, but users were migrated to cobalt_atomic_enter/leave which may
>> now make it look like we no longer need this. Maybe this is already the
>> case when using rtdm_waitqueue*, but let's convert everyone first.
> 
> Alternatively, In-tree v3 drivers could also keep relying
> RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for
> them. Bottom line is to exclude from the CXP the whole idea that we may
> schedule while holding a lock to protect against missed wake ups, in
> general the very existence of any superlock which would cover everything
> from top to bottom when serializing. I agree that having v3 converge
> towards the CXP would be better though.
> 

I'm fine with migrating to a new pattern first, drop that old RTDM
pattern and declare the new one as migration path. Same for below.

>>
>>>
>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>>   condition on grabbing a mutex either means that:
>>>
>>
>> I think you are missing the use cases:
>>
>> mutex-lock-timed
>> ...
>> wait-event-timed
>> ...
>> mutex-unlock
>> (which goes long with timeout sequences)
>>
> 
> There is a couple of issues with such use case: first we should never
> ever sleep with a mutex held, this would trigger SIGDEBUG if done from
> user ( a [binary] semaphore would at least prevent this problem), but
> more importantly, how would this pattern allow the event to be signaled
> given the waiter holds the lock the sender would need to acquire first?

Just look at the existing drivers for the use cases (which obviously
imply signalling without holding the mutex). The clash with user-level
debugging was indeed always there and is another reason to provide users
an alternative.

> 
>> In fact, all that could be replaced with
>>
>> mutex-lock
>> ...
>> atomic-entry
>> mutex-unlock
>> wait-event-timed
>> mutex-lock
>> atomic-leave
>> ...
>> mutex-lock
>>
>> but that is what we want to replace as well...
>>
> 
> Yep, that would work for v3, preventing wake ups to be missed the same way.
> 

So let's define something that serves both versions, thus is
CXP-compatible. It makes no sense defining CXP while ignoring the use cases.

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2021-01-05 19:31     ` Jan Kiszka
@ 2021-01-09 17:01       ` Philippe Gerum
  2021-01-11 10:48         ` Jan Kiszka
  0 siblings, 1 reply; 10+ messages in thread
From: Philippe Gerum @ 2021-01-09 17:01 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai


Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 23.12.20 11:40, Philippe Gerum wrote:
>> 
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>> 
>>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>>>>
>>>> This wiki page [1] contains a draft proposal about specifying which
>>>> services from the current RTDM interface should be part of the Common
>>>> Xenomai Platform. Some proposals for deprecation stand out:
>>>>
>>>> - I suspect that only very few RTDM drivers are actually handling
>>>>   requests from other kernel-based drivers in real world applications,
>>>>   at least not enough to justify RTDM codifying these rare cases into a
>>>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>>>>
>>>>   In other words, although I would agree that a few particular drivers
>>>>   might want to export a couple of services to kernel-based clients in
>>>>   order to provide them some sort of backchannel, it seems wrong to
>>>>   require RTDM drivers to provide a kernel interface which would match
>>>>   their user interface in the same terms. For these specific cases, ad
>>>>   hoc code in these few drivers should be enough.
>>>>
>>>>   Besides, I believe that most kernel->kernel request paths implemented
>>>>   by in-tree RTDM drivers have never been tested for years, if ever.
>>>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>>>   vs user buffers, for only very few use cases.
>>>>
>>>>   For these reasons, I would suggest to deprecate the kernel->kernel API
>>>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>>>   move.
>>>
>>> That's fine with me. The idea was once that something like bus drivers
>>> would appear, but that never happened.
>>>
>>>>
>>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>>>   lock must go. For SMP scalability reasons, this big lock was
>>>>   eliminated from the EVL core, which means that all the attached
>>>>   semantics will not hold there. Serializing access to shared resources
>>>>   should be guaranteed by resource-specific locking, not by a giant
>>>>   traffic light like the big lock implements.
>>>
>>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
>>> long ago, but users were migrated to cobalt_atomic_enter/leave which may
>>> now make it look like we no longer need this. Maybe this is already the
>>> case when using rtdm_waitqueue*, but let's convert everyone first.
>> 
>> Alternatively, In-tree v3 drivers could also keep relying
>> RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for
>> them. Bottom line is to exclude from the CXP the whole idea that we may
>> schedule while holding a lock to protect against missed wake ups, in
>> general the very existence of any superlock which would cover everything
>> from top to bottom when serializing. I agree that having v3 converge
>> towards the CXP would be better though.
>> 
>
> I'm fine with migrating to a new pattern first, drop that old RTDM
> pattern and declare the new one as migration path. Same for below.
>
>>>
>>>>
>>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>>>   condition on grabbing a mutex either means that:
>>>>
>>>
>>> I think you are missing the use cases:
>>>
>>> mutex-lock-timed
>>> ...
>>> wait-event-timed
>>> ...
>>> mutex-unlock
>>> (which goes long with timeout sequences)
>>>
>> 
>> There is a couple of issues with such use case: first we should never
>> ever sleep with a mutex held, this would trigger SIGDEBUG if done from
>> user ( a [binary] semaphore would at least prevent this problem), but
>> more importantly, how would this pattern allow the event to be signaled
>> given the waiter holds the lock the sender would need to acquire first?
>
> Just look at the existing drivers for the use cases (which obviously
> imply signalling without holding the mutex).
>

Excluding RTDM_EXECUTE_ATOMICALLY() which has no in-tree user, what
remains is solving the issue for users of the cobalt_atomic_{enter,
leave} pattern, i.e.:

kernel/drivers/can/rtcan_raw.c
kernel/drivers/can/rtcan_socket.c
kernel/drivers/ipc/bufp.c
kernel/drivers/ipc/iddp.c
kernel/drivers/ipc/rtipc.c
kernel/drivers/ipc/xddp.c
kernel/drivers/net/stack/rtmac/tdma/tdma_dev.c
kernel/drivers/testing/timerbench.c
kernel/drivers/udd/udd.c

For the call sites listed about, AFAICS we'd need to:

1. move any blocking call out of the locking scope, by rewriting these
as wait loops rechecking the condition under lock if/when required. Only
a few would need the latter in fact, as in many cases
cobalt_atomic_leave() immediately follows the blocking call in the code
flow.

2. provide _nosched variants for signaling calls
(e.g. rtdm_event_pulse_nosched()) and use them, invoking xnsched_run()
out of lock as appropriate.

However, I cannot find any code exhibiting the issue with mutexes in
these matches. Do you have an in-tree example of the problem you see to
point me at?

-- 
Philippe.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2021-01-09 17:01       ` Philippe Gerum
@ 2021-01-11 10:48         ` Jan Kiszka
  2021-01-11 13:14           ` Philippe Gerum
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kiszka @ 2021-01-11 10:48 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On 09.01.21 18:01, Philippe Gerum wrote:
> 
> Jan Kiszka <jan.kiszka@siemens.com> writes:
> 
>> On 23.12.20 11:40, Philippe Gerum wrote:
>>>
>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>
>>>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>>>>>
>>>>> This wiki page [1] contains a draft proposal about specifying which
>>>>> services from the current RTDM interface should be part of the Common
>>>>> Xenomai Platform. Some proposals for deprecation stand out:
>>>>>
>>>>> - I suspect that only very few RTDM drivers are actually handling
>>>>>   requests from other kernel-based drivers in real world applications,
>>>>>   at least not enough to justify RTDM codifying these rare cases into a
>>>>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>>>>>
>>>>>   In other words, although I would agree that a few particular drivers
>>>>>   might want to export a couple of services to kernel-based clients in
>>>>>   order to provide them some sort of backchannel, it seems wrong to
>>>>>   require RTDM drivers to provide a kernel interface which would match
>>>>>   their user interface in the same terms. For these specific cases, ad
>>>>>   hoc code in these few drivers should be enough.
>>>>>
>>>>>   Besides, I believe that most kernel->kernel request paths implemented
>>>>>   by in-tree RTDM drivers have never been tested for years, if ever.
>>>>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>>>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>>>>   vs user buffers, for only very few use cases.
>>>>>
>>>>>   For these reasons, I would suggest to deprecate the kernel->kernel API
>>>>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>>>>   move.
>>>>
>>>> That's fine with me. The idea was once that something like bus drivers
>>>> would appear, but that never happened.
>>>>
>>>>>
>>>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>>>>   lock must go. For SMP scalability reasons, this big lock was
>>>>>   eliminated from the EVL core, which means that all the attached
>>>>>   semantics will not hold there. Serializing access to shared resources
>>>>>   should be guaranteed by resource-specific locking, not by a giant
>>>>>   traffic light like the big lock implements.
>>>>
>>>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
>>>> long ago, but users were migrated to cobalt_atomic_enter/leave which may
>>>> now make it look like we no longer need this. Maybe this is already the
>>>> case when using rtdm_waitqueue*, but let's convert everyone first.
>>>
>>> Alternatively, In-tree v3 drivers could also keep relying
>>> RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for
>>> them. Bottom line is to exclude from the CXP the whole idea that we may
>>> schedule while holding a lock to protect against missed wake ups, in
>>> general the very existence of any superlock which would cover everything
>>> from top to bottom when serializing. I agree that having v3 converge
>>> towards the CXP would be better though.
>>>
>>
>> I'm fine with migrating to a new pattern first, drop that old RTDM
>> pattern and declare the new one as migration path. Same for below.
>>
>>>>
>>>>>
>>>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>>>>   condition on grabbing a mutex either means that:
>>>>>
>>>>
>>>> I think you are missing the use cases:
>>>>
>>>> mutex-lock-timed
>>>> ...
>>>> wait-event-timed
>>>> ...
>>>> mutex-unlock
>>>> (which goes long with timeout sequences)
>>>>
>>>
>>> There is a couple of issues with such use case: first we should never
>>> ever sleep with a mutex held, this would trigger SIGDEBUG if done from
>>> user ( a [binary] semaphore would at least prevent this problem), but
>>> more importantly, how would this pattern allow the event to be signaled
>>> given the waiter holds the lock the sender would need to acquire first?
>>
>> Just look at the existing drivers for the use cases (which obviously
>> imply signalling without holding the mutex).
>>
> 
> Excluding RTDM_EXECUTE_ATOMICALLY() which has no in-tree user, what
> remains is solving the issue for users of the cobalt_atomic_{enter,
> leave} pattern, i.e.:
> 
> kernel/drivers/can/rtcan_raw.c
> kernel/drivers/can/rtcan_socket.c
> kernel/drivers/ipc/bufp.c
> kernel/drivers/ipc/iddp.c
> kernel/drivers/ipc/rtipc.c
> kernel/drivers/ipc/xddp.c
> kernel/drivers/net/stack/rtmac/tdma/tdma_dev.c
> kernel/drivers/testing/timerbench.c
> kernel/drivers/udd/udd.c
> 
> For the call sites listed about, AFAICS we'd need to:
> 
> 1. move any blocking call out of the locking scope, by rewriting these
> as wait loops rechecking the condition under lock if/when required. Only
> a few would need the latter in fact, as in many cases
> cobalt_atomic_leave() immediately follows the blocking call in the code
> flow.
> 
> 2. provide _nosched variants for signaling calls
> (e.g. rtdm_event_pulse_nosched()) and use them, invoking xnsched_run()
> out of lock as appropriate.
> 
> However, I cannot find any code exhibiting the issue with mutexes in
> these matches. Do you have an in-tree example of the problem you see to
> point me at?
> 

All serial drivers use mutexes with timeout in order to make write
operations atomic and permit waiting for free buffers inside that atomic
section.

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2021-01-11 10:48         ` Jan Kiszka
@ 2021-01-11 13:14           ` Philippe Gerum
  2021-01-11 13:18             ` Jan Kiszka
  0 siblings, 1 reply; 10+ messages in thread
From: Philippe Gerum @ 2021-01-11 13:14 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai


Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 09.01.21 18:01, Philippe Gerum wrote:
>> 
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>> 
>>> On 23.12.20 11:40, Philippe Gerum wrote:
>>>>
>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>
>>>>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>>>>>>
>>>>>> This wiki page [1] contains a draft proposal about specifying which
>>>>>> services from the current RTDM interface should be part of the Common
>>>>>> Xenomai Platform. Some proposals for deprecation stand out:
>>>>>>
>>>>>> - I suspect that only very few RTDM drivers are actually handling
>>>>>>   requests from other kernel-based drivers in real world applications,
>>>>>>   at least not enough to justify RTDM codifying these rare cases into a
>>>>>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>>>>>>
>>>>>>   In other words, although I would agree that a few particular drivers
>>>>>>   might want to export a couple of services to kernel-based clients in
>>>>>>   order to provide them some sort of backchannel, it seems wrong to
>>>>>>   require RTDM drivers to provide a kernel interface which would match
>>>>>>   their user interface in the same terms. For these specific cases, ad
>>>>>>   hoc code in these few drivers should be enough.
>>>>>>
>>>>>>   Besides, I believe that most kernel->kernel request paths implemented
>>>>>>   by in-tree RTDM drivers have never been tested for years, if ever.
>>>>>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>>>>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>>>>>   vs user buffers, for only very few use cases.
>>>>>>
>>>>>>   For these reasons, I would suggest to deprecate the kernel->kernel API
>>>>>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>>>>>   move.
>>>>>
>>>>> That's fine with me. The idea was once that something like bus drivers
>>>>> would appear, but that never happened.
>>>>>
>>>>>>
>>>>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>>>>>   lock must go. For SMP scalability reasons, this big lock was
>>>>>>   eliminated from the EVL core, which means that all the attached
>>>>>>   semantics will not hold there. Serializing access to shared resources
>>>>>>   should be guaranteed by resource-specific locking, not by a giant
>>>>>>   traffic light like the big lock implements.
>>>>>
>>>>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
>>>>> long ago, but users were migrated to cobalt_atomic_enter/leave which may
>>>>> now make it look like we no longer need this. Maybe this is already the
>>>>> case when using rtdm_waitqueue*, but let's convert everyone first.
>>>>
>>>> Alternatively, In-tree v3 drivers could also keep relying
>>>> RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for
>>>> them. Bottom line is to exclude from the CXP the whole idea that we may
>>>> schedule while holding a lock to protect against missed wake ups, in
>>>> general the very existence of any superlock which would cover everything
>>>> from top to bottom when serializing. I agree that having v3 converge
>>>> towards the CXP would be better though.
>>>>
>>>
>>> I'm fine with migrating to a new pattern first, drop that old RTDM
>>> pattern and declare the new one as migration path. Same for below.
>>>
>>>>>
>>>>>>
>>>>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>>>>>   condition on grabbing a mutex either means that:
>>>>>>
>>>>>
>>>>> I think you are missing the use cases:
>>>>>
>>>>> mutex-lock-timed
>>>>> ...
>>>>> wait-event-timed
>>>>> ...
>>>>> mutex-unlock
>>>>> (which goes long with timeout sequences)
>>>>>
>>>>
>>>> There is a couple of issues with such use case: first we should never
>>>> ever sleep with a mutex held, this would trigger SIGDEBUG if done from
>>>> user ( a [binary] semaphore would at least prevent this problem), but
>>>> more importantly, how would this pattern allow the event to be signaled
>>>> given the waiter holds the lock the sender would need to acquire first?
>>>
>>> Just look at the existing drivers for the use cases (which obviously
>>> imply signalling without holding the mutex).
>>>
>> 
>> Excluding RTDM_EXECUTE_ATOMICALLY() which has no in-tree user, what
>> remains is solving the issue for users of the cobalt_atomic_{enter,
>> leave} pattern, i.e.:
>> 
>> kernel/drivers/can/rtcan_raw.c
>> kernel/drivers/can/rtcan_socket.c
>> kernel/drivers/ipc/bufp.c
>> kernel/drivers/ipc/iddp.c
>> kernel/drivers/ipc/rtipc.c
>> kernel/drivers/ipc/xddp.c
>> kernel/drivers/net/stack/rtmac/tdma/tdma_dev.c
>> kernel/drivers/testing/timerbench.c
>> kernel/drivers/udd/udd.c
>> 
>> For the call sites listed about, AFAICS we'd need to:
>> 
>> 1. move any blocking call out of the locking scope, by rewriting these
>> as wait loops rechecking the condition under lock if/when required. Only
>> a few would need the latter in fact, as in many cases
>> cobalt_atomic_leave() immediately follows the blocking call in the code
>> flow.
>> 
>> 2. provide _nosched variants for signaling calls
>> (e.g. rtdm_event_pulse_nosched()) and use them, invoking xnsched_run()
>> out of lock as appropriate.
>> 
>> However, I cannot find any code exhibiting the issue with mutexes in
>> these matches. Do you have an in-tree example of the problem you see to
>> point me at?
>> 
>
> All serial drivers use mutexes with timeout in order to make write
> operations atomic and permit waiting for free buffers inside that atomic
> section.

Ok, but none of these driver sleep with the superlock held, which is the
pattern I'm looking for at the moment. Sleeping with a mutex is a
different issue which also requires some fixing, but neither involves
the RTDM_EXECUTE_ATOMICALLY() or cobalt_atomic_enter/leave() constructs.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2021-01-11 13:14           ` Philippe Gerum
@ 2021-01-11 13:18             ` Jan Kiszka
  2021-01-16 15:41               ` Philippe Gerum
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kiszka @ 2021-01-11 13:18 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: xenomai

On 11.01.21 14:14, Philippe Gerum wrote:
> 
> Jan Kiszka <jan.kiszka@siemens.com> writes:
> 
>> On 09.01.21 18:01, Philippe Gerum wrote:
>>>
>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>
>>>> On 23.12.20 11:40, Philippe Gerum wrote:
>>>>>
>>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>>
>>>>>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>>>>>>>
>>>>>>> This wiki page [1] contains a draft proposal about specifying which
>>>>>>> services from the current RTDM interface should be part of the Common
>>>>>>> Xenomai Platform. Some proposals for deprecation stand out:
>>>>>>>
>>>>>>> - I suspect that only very few RTDM drivers are actually handling
>>>>>>>   requests from other kernel-based drivers in real world applications,
>>>>>>>   at least not enough to justify RTDM codifying these rare cases into a
>>>>>>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>>>>>>>
>>>>>>>   In other words, although I would agree that a few particular drivers
>>>>>>>   might want to export a couple of services to kernel-based clients in
>>>>>>>   order to provide them some sort of backchannel, it seems wrong to
>>>>>>>   require RTDM drivers to provide a kernel interface which would match
>>>>>>>   their user interface in the same terms. For these specific cases, ad
>>>>>>>   hoc code in these few drivers should be enough.
>>>>>>>
>>>>>>>   Besides, I believe that most kernel->kernel request paths implemented
>>>>>>>   by in-tree RTDM drivers have never been tested for years, if ever.
>>>>>>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>>>>>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>>>>>>   vs user buffers, for only very few use cases.
>>>>>>>
>>>>>>>   For these reasons, I would suggest to deprecate the kernel->kernel API
>>>>>>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>>>>>>   move.
>>>>>>
>>>>>> That's fine with me. The idea was once that something like bus drivers
>>>>>> would appear, but that never happened.
>>>>>>
>>>>>>>
>>>>>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>>>>>>   lock must go. For SMP scalability reasons, this big lock was
>>>>>>>   eliminated from the EVL core, which means that all the attached
>>>>>>>   semantics will not hold there. Serializing access to shared resources
>>>>>>>   should be guaranteed by resource-specific locking, not by a giant
>>>>>>>   traffic light like the big lock implements.
>>>>>>
>>>>>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
>>>>>> long ago, but users were migrated to cobalt_atomic_enter/leave which may
>>>>>> now make it look like we no longer need this. Maybe this is already the
>>>>>> case when using rtdm_waitqueue*, but let's convert everyone first.
>>>>>
>>>>> Alternatively, In-tree v3 drivers could also keep relying
>>>>> RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for
>>>>> them. Bottom line is to exclude from the CXP the whole idea that we may
>>>>> schedule while holding a lock to protect against missed wake ups, in
>>>>> general the very existence of any superlock which would cover everything
>>>>> from top to bottom when serializing. I agree that having v3 converge
>>>>> towards the CXP would be better though.
>>>>>
>>>>
>>>> I'm fine with migrating to a new pattern first, drop that old RTDM
>>>> pattern and declare the new one as migration path. Same for below.
>>>>
>>>>>>
>>>>>>>
>>>>>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>>>>>>   condition on grabbing a mutex either means that:
>>>>>>>
>>>>>>
>>>>>> I think you are missing the use cases:
>>>>>>
>>>>>> mutex-lock-timed
>>>>>> ...
>>>>>> wait-event-timed
>>>>>> ...
>>>>>> mutex-unlock
>>>>>> (which goes long with timeout sequences)
>>>>>>
>>>>>
>>>>> There is a couple of issues with such use case: first we should never
>>>>> ever sleep with a mutex held, this would trigger SIGDEBUG if done from
>>>>> user ( a [binary] semaphore would at least prevent this problem), but
>>>>> more importantly, how would this pattern allow the event to be signaled
>>>>> given the waiter holds the lock the sender would need to acquire first?
>>>>
>>>> Just look at the existing drivers for the use cases (which obviously
>>>> imply signalling without holding the mutex).
>>>>
>>>
>>> Excluding RTDM_EXECUTE_ATOMICALLY() which has no in-tree user, what
>>> remains is solving the issue for users of the cobalt_atomic_{enter,
>>> leave} pattern, i.e.:
>>>
>>> kernel/drivers/can/rtcan_raw.c
>>> kernel/drivers/can/rtcan_socket.c
>>> kernel/drivers/ipc/bufp.c
>>> kernel/drivers/ipc/iddp.c
>>> kernel/drivers/ipc/rtipc.c
>>> kernel/drivers/ipc/xddp.c
>>> kernel/drivers/net/stack/rtmac/tdma/tdma_dev.c
>>> kernel/drivers/testing/timerbench.c
>>> kernel/drivers/udd/udd.c
>>>
>>> For the call sites listed about, AFAICS we'd need to:
>>>
>>> 1. move any blocking call out of the locking scope, by rewriting these
>>> as wait loops rechecking the condition under lock if/when required. Only
>>> a few would need the latter in fact, as in many cases
>>> cobalt_atomic_leave() immediately follows the blocking call in the code
>>> flow.
>>>
>>> 2. provide _nosched variants for signaling calls
>>> (e.g. rtdm_event_pulse_nosched()) and use them, invoking xnsched_run()
>>> out of lock as appropriate.
>>>
>>> However, I cannot find any code exhibiting the issue with mutexes in
>>> these matches. Do you have an in-tree example of the problem you see to
>>> point me at?
>>>
>>
>> All serial drivers use mutexes with timeout in order to make write
>> operations atomic and permit waiting for free buffers inside that atomic
>> section.
> 
> Ok, but none of these driver sleep with the superlock held, which is the
> pattern I'm looking for at the moment. Sleeping with a mutex is a
> different issue which also requires some fixing, but neither involves
> the RTDM_EXECUTE_ATOMICALLY() or cobalt_atomic_enter/leave() constructs.
> 

It /could/ require it if it was not sleeping with mutexes.

Jan

-- 
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [CXP] Discussing the RTDM specification
  2021-01-11 13:18             ` Jan Kiszka
@ 2021-01-16 15:41               ` Philippe Gerum
  0 siblings, 0 replies; 10+ messages in thread
From: Philippe Gerum @ 2021-01-16 15:41 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai


Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 11.01.21 14:14, Philippe Gerum wrote:
>> 
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>> 
>>> On 09.01.21 18:01, Philippe Gerum wrote:
>>>>
>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>
>>>>> On 23.12.20 11:40, Philippe Gerum wrote:
>>>>>>
>>>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>>>
>>>>>>> On 18.12.20 15:19, Philippe Gerum via Xenomai wrote:
>>>>>>>>
>>>>>>>> This wiki page [1] contains a draft proposal about specifying which
>>>>>>>> services from the current RTDM interface should be part of the Common
>>>>>>>> Xenomai Platform. Some proposals for deprecation stand out:
>>>>>>>>
>>>>>>>> - I suspect that only very few RTDM drivers are actually handling
>>>>>>>>   requests from other kernel-based drivers in real world applications,
>>>>>>>>   at least not enough to justify RTDM codifying these rare cases into a
>>>>>>>>   common interface (rtdm_open, rtdm_read, rtdm_write etc).
>>>>>>>>
>>>>>>>>   In other words, although I would agree that a few particular drivers
>>>>>>>>   might want to export a couple of services to kernel-based clients in
>>>>>>>>   order to provide them some sort of backchannel, it seems wrong to
>>>>>>>>   require RTDM drivers to provide a kernel interface which would match
>>>>>>>>   their user interface in the same terms. For these specific cases, ad
>>>>>>>>   hoc code in these few drivers should be enough.
>>>>>>>>
>>>>>>>>   Besides, I believe that most kernel->kernel request paths implemented
>>>>>>>>   by in-tree RTDM drivers have never been tested for years, if ever.
>>>>>>>>   Meanwhile, this kernel->kernel API introduces a basic exception case
>>>>>>>>   into many RTDM and driver code paths, e.g. for differentiating kernel
>>>>>>>>   vs user buffers, for only very few use cases.
>>>>>>>>
>>>>>>>>   For these reasons, I would suggest to deprecate the kernel->kernel API
>>>>>>>>   from RTDM starting from 3.3, excluding it from the CXP in the same
>>>>>>>>   move.
>>>>>>>
>>>>>>> That's fine with me. The idea was once that something like bus drivers
>>>>>>> would appear, but that never happened.
>>>>>>>
>>>>>>>>
>>>>>>>> - RTDM_EXECUTE_ATOMICALLY() and related calls relying on the Cobalt big
>>>>>>>>   lock must go. For SMP scalability reasons, this big lock was
>>>>>>>>   eliminated from the EVL core, which means that all the attached
>>>>>>>>   semantics will not hold there. Serializing access to shared resources
>>>>>>>>   should be guaranteed by resource-specific locking, not by a giant
>>>>>>>>   traffic light like the big lock implements.
>>>>>>>
>>>>>>> This is more complicated: RTDM_EXECUTE_ATOMICALLY was in fact deprecated
>>>>>>> long ago, but users were migrated to cobalt_atomic_enter/leave which may
>>>>>>> now make it look like we no longer need this. Maybe this is already the
>>>>>>> case when using rtdm_waitqueue*, but let's convert everyone first.
>>>>>>
>>>>>> Alternatively, In-tree v3 drivers could also keep relying
>>>>>> RTDM_EXECUTE_ATOMICALLY, the v4 implementation would be different for
>>>>>> them. Bottom line is to exclude from the CXP the whole idea that we may
>>>>>> schedule while holding a lock to protect against missed wake ups, in
>>>>>> general the very existence of any superlock which would cover everything
>>>>>> from top to bottom when serializing. I agree that having v3 converge
>>>>>> towards the CXP would be better though.
>>>>>>
>>>>>
>>>>> I'm fine with migrating to a new pattern first, drop that old RTDM
>>>>> pattern and declare the new one as migration path. Same for below.
>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> - rtdm_mutex_timedlock() has dubious semantics. Hitting a timeout
>>>>>>>>   condition on grabbing a mutex either means that:
>>>>>>>>
>>>>>>>
>>>>>>> I think you are missing the use cases:
>>>>>>>
>>>>>>> mutex-lock-timed
>>>>>>> ...
>>>>>>> wait-event-timed
>>>>>>> ...
>>>>>>> mutex-unlock
>>>>>>> (which goes long with timeout sequences)
>>>>>>>
>>>>>>
>>>>>> There is a couple of issues with such use case: first we should never
>>>>>> ever sleep with a mutex held, this would trigger SIGDEBUG if done from
>>>>>> user ( a [binary] semaphore would at least prevent this problem), but
>>>>>> more importantly, how would this pattern allow the event to be signaled
>>>>>> given the waiter holds the lock the sender would need to acquire first?
>>>>>
>>>>> Just look at the existing drivers for the use cases (which obviously
>>>>> imply signalling without holding the mutex).
>>>>>
>>>>
>>>> Excluding RTDM_EXECUTE_ATOMICALLY() which has no in-tree user, what
>>>> remains is solving the issue for users of the cobalt_atomic_{enter,
>>>> leave} pattern, i.e.:
>>>>
>>>> kernel/drivers/can/rtcan_raw.c
>>>> kernel/drivers/can/rtcan_socket.c
>>>> kernel/drivers/ipc/bufp.c
>>>> kernel/drivers/ipc/iddp.c
>>>> kernel/drivers/ipc/rtipc.c
>>>> kernel/drivers/ipc/xddp.c
>>>> kernel/drivers/net/stack/rtmac/tdma/tdma_dev.c
>>>> kernel/drivers/testing/timerbench.c
>>>> kernel/drivers/udd/udd.c
>>>>
>>>> For the call sites listed about, AFAICS we'd need to:
>>>>
>>>> 1. move any blocking call out of the locking scope, by rewriting these
>>>> as wait loops rechecking the condition under lock if/when required. Only
>>>> a few would need the latter in fact, as in many cases
>>>> cobalt_atomic_leave() immediately follows the blocking call in the code
>>>> flow.
>>>>
>>>> 2. provide _nosched variants for signaling calls
>>>> (e.g. rtdm_event_pulse_nosched()) and use them, invoking xnsched_run()
>>>> out of lock as appropriate.
>>>>
>>>> However, I cannot find any code exhibiting the issue with mutexes in
>>>> these matches. Do you have an in-tree example of the problem you see to
>>>> point me at?
>>>>
>>>
>>> All serial drivers use mutexes with timeout in order to make write
>>> operations atomic and permit waiting for free buffers inside that atomic
>>> section.
>> 
>> Ok, but none of these driver sleep with the superlock held, which is the
>> pattern I'm looking for at the moment. Sleeping with a mutex is a
>> different issue which also requires some fixing, but neither involves
>> the RTDM_EXECUTE_ATOMICALLY() or cobalt_atomic_enter/leave() constructs.
>> 
>
> It /could/ require it if it was not sleeping with mutexes.
>

There is another way, which would be to align Cobalt on the common
pattern for waiting for events racelessly, i.e. by enqueuing the current
thread signaling the event under lock, but without rescheduling. Then
dropping the lock before calling xnsched_run() eventually.

For this, we would need the xnsynch_sleep_on_noresched() variant, which
would in turn call a no-resched variant of xnthread_suspend(), not
triggering a rescheduling if the suspended thread is current. This would
allow us to keep the signal+update sections atomic, without having to
reschedule while holding the lock. Likewise, the consumer side would
follow the pattern used with common wait queues. This is exactly what
the EVL core does, mimicking the common wait queues.

Ideally, it would be possible to amend the internal implementation of
the RTDM waitqueues to provide such support, without changing its
interface.

-- 
Philippe.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-01-16 15:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-12-18 14:19 [CXP] Discussing the RTDM specification Philippe Gerum
2020-12-19  4:53 ` Fino Meng
2020-12-23  9:45 ` Jan Kiszka
2020-12-23 10:40   ` Philippe Gerum
2021-01-05 19:31     ` Jan Kiszka
2021-01-09 17:01       ` Philippe Gerum
2021-01-11 10:48         ` Jan Kiszka
2021-01-11 13:14           ` Philippe Gerum
2021-01-11 13:18             ` Jan Kiszka
2021-01-16 15:41               ` Philippe Gerum

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.