From: Jan Kiszka <jan.kiszka@siemens.com>
To: Philippe Gerum <rpm@xenomai.org>,
Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Cc: Xenomai <xenomai@xenomai.org>
Subject: Re: [Xenomai] RTDM syscalls & switching
Date: Tue, 14 Jun 2016 17:09:41 +0200
Message-ID: <57601E35.3010101@siemens.com>
In-Reply-To: <5735F39A.8050204@siemens.com>

On 2016-05-13 17:32, Jan Kiszka wrote:
> On 2016-05-13 15:38, Philippe Gerum wrote:
>> On 05/13/2016 07:54 AM, Jan Kiszka wrote:
>>> On 2016-05-13 00:26, Philippe Gerum wrote:
>>>> On 05/12/2016 09:27 PM, Jan Kiszka wrote:
>>>>> On 2016-05-12 21:08, Philippe Gerum wrote:
>>>>>> On 05/12/2016 08:42 PM, Jan Kiszka wrote:
>>>>>>> On 2016-05-12 20:35, Philippe Gerum wrote:
>>>>>>>> On 05/12/2016 08:24 PM, Jan Kiszka wrote:
>>>>>>>>> On 2016-05-12 20:20, Gilles Chanteperdrix wrote:
>>>>>>>>>> On Thu, May 12, 2016 at 07:17:15PM +0200, Jan Kiszka wrote:
>>>>>>>>>>> On 2016-05-12 19:12, Gilles Chanteperdrix wrote:
>>>>>>>>>>>> On Thu, May 12, 2016 at 06:59:04PM +0200, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>> On Thu, May 12, 2016 at 06:50:03PM +0200, Jan Kiszka wrote:
>>>>>>>>>>>>>> On 2016-05-12 18:31, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>> On Thu, May 12, 2016 at 06:06:16PM +0200, Jan Kiszka wrote:
>>>>>>>>>>>>>>>> Gilles,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> regarding commit bec5d0dd42 (rtdm: make syscalls conforming rather than
>>>>>>>>>>>>>>>> current) - I remember a discussion on that topic, but I do not find its
>>>>>>>>>>>>>>>> traces any more. Do you have a pointer?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In any case, I'm confronted with a use case for the old (Xenomai 2),
>>>>>>>>>>>>>>>> lazy switching behaviour: lightweight, performance sensitive IOCTL
>>>>>>>>>>>>>>>> services that can (and should) be called without any switching from both
>>>>>>>>>>>>>>>> domains.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Why not use a plain Linux driver? ioctl_nrt callbacks are
>>>>>>>>>>>>>>> redundant with plain Linux drivers.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Because that forces the calling layer to either call the same service
>>>>>>>>>>>>>> via a plain Linux device if the calling thread is currently relaxed or
>>>>>>>>>>>>>> go for the RT device if the caller is in primary. Doable, but I would
>>>>>>>>>>>>>> really like to avoid this pain for the users.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What were the arguments in favour of migrating threads to real-time first?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I currently see the real need only for IOCTLs, but the question is then
>>>>>>>>>>>>>>>> if we shouldn't go back to "__xn_exec_current" in all RTDM cases to
>>>>>>>>>>>>>>>> avoid unwanted migration costs (which are significantly higher than
>>>>>>>>>>>>>>>> syscall restarts).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I do not find commit bec5d0dd42 in xenomai-2.6 git tree, and I do
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Xenomai 2 is still following the lazy scheme - we reverted that commit
>>>>>>>>>>>>>> later on in 7df0c1d96b. Xenomai 3 changed it again with the commit above.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> not remember merging this. However I find commit
>>>>>>>>>>>>>>> 13bfdd477ab880499d2e8f3b82c49ef4d2cccff0 from 2010 which seems to
>>>>>>>>>>>>>>> explain the reason pretty clearly.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At the time of the discussion we had concluded that it was the way
>>>>>>>>>>>>>>> to go. With __xn_exec_current you may enter the ioctl_rt callback
>>>>>>>>>>>>>>> from secondary domain, which is counter-intuitive, error-prone, and
>>>>>>>>>>>>>>> forces you to cripple driver code with checks for the current domain.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nope, normal drivers are not affected as they just implement those
>>>>>>>>>>>>>> services in the respective mode they want to support there and have a
>>>>>>>>>>>>>> simple -ENOSYS for the rest (explicitly in IOCTLs or implicitly by
>>>>>>>>>>>>>> leaving out the implementation of the counterpart handler).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, I got mixed up trying to remember. I think the crux of the
>>>>>>>>>>>>> problem is that if a thread running in primary mode gets
>>>>>>>>>>>>> (temporarily) switched to secondary mode by gdb, the ioctl_nrt
>>>>>>>>>>>>> handler gets invoked, which is almost certainly the wrong thing to
>>>>>>>>>>>>> do. You want the thread to migrate to primary mode to execute
>>>>>>>>>>>>> ioctl_rt, which __xn_exec_conforming achieves. Otherwise running an
>>>>>>>>>>>>> application in gdb causes the application to behave differently.
>>>>>>>>>>>>
>>>>>>>>>>>> And trying to avoid this issue indeed cripples code with checks
>>>>>>>>>>>> for rtdm_in_rt_context:
>>>>>>>>>>>> https://git.xenomai.org/xenomai-2.6.git/tree/ksrc/drivers/analogy/rtdm_interface.c#n194
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I don't remember details here, but this is a special case: The driver
>>>>>>>>>>> provides also read_nrt - is that really useful for Analogy?
>>>>>>>>>>>
>>>>>>>>>>> In most cases, you are fine with not providing the nrt (or rt) handler,
>>>>>>>>>>> or with a simple
>>>>>>>>>>>
>>>>>>>>>>>     default:
>>>>>>>>>>>         return -ENOSYS;
>>>>>>>>>>>
>>>>>>>>>>> in your ioctl dispatcher.
>>>>>>>>>>
>>>>>>>>>> You are missing the point: if you enter read_nrt, there are two
>>>>>>>>>> cases:
>>>>>>>>>> - either the thread is real-time capable and has been relaxed by gdb
>>>>>>>>>> and you want to switch to read_rt for the reasons I already
>>>>>>>>>> explained, in that case, you must return -ENOSYS;
>>>>>>>>>> - or the thread is not real-time capable and the nrt handler
>>>>>>>>>> applies.
>>>>>>>>>>
>>>>>>>>>> So, you need at least
>>>>>>>>>>
>>>>>>>>>> read_nrt()
>>>>>>>>>> {
>>>>>>>>>>     if (rt_capable)
>>>>>>>>>>         return -ENOSYS;
>>>>>>>>>>
>>>>>>>>>>     /* Do the normal case here */
>>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Now tell me how many drivers have read_nrt, write_nrt? 1 in-tree.
>>>>>>>>> recvmsg_nrt, sendmsg_nrt? 0 in-tree. Analogy is special (still like to
>>>>>>>>> understand why, though). And having some special code in the exceptional
>>>>>>>>> case is probably better than the side effects we get from eagerly
>>>>>>>>> switching now.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sorry, that is exactly the opposite: your use case is exceptional and I
>>>>>>>> believe is wrong. The normal use case is the one that does not ask the
>>>>>>>> user to track the current mode for knowing what any random driver would
>>>>>>>> eventually do depending on the calling context.
>>>>>>>
>>>>>>> You still miss the point that this is not required in 99% of the cases.
>>>>>>> There is no such problem. There only Analogy.
>>>>>>>
>>>>>>
>>>>>> I'm not discussing Analogy at all, those drivers are still biased by the
>>>>>> legacy 2.x logic for dealing with modes and need fixing. I have never
>>>>>> been convinced by the reasoning behind rtdm_in_rt_context(), which
>>>>>> perfectly illustrates why messing with the call mode is not the
>>>>>> application's business.
>>>>>
>>>>> You still need rtdm_in_rt_context() for the (rare) case of having the
>>>>> same handler for both service_rt and service_nrt. That didn't change
>>>>> with any switching strategy adjustment. It can't as long as there are
>>>>> services behind a syscall that may handle any mode, thus that syscall is
>>>>> unable to filter for the service in the background. We really need to
>>>>> differentiate here.
>>>>>
>>>>>>
>>>>>>> Every driver must ensure that a service is only exposed to users in the
>>>>>>> right mode. That is a functional requirement, and drivers that fail to
>>>>>>> do so only work by chance (thus with the restricted workload they are
>>>>>>> tested against). If that is fulfilled, it doesn't matter to the driver
>>>>>>> when the switch happens. It's pure optimization.
>>>>>>>
>>>>>>
>>>>>> You don't seem to get my point either. Let's proceed differently, please
>>>>>> sketch the application code that would require __xn_exec_current for
>>>>>> RTDM calls.
>>>>>
>>>>> You cut the more interesting case (migration ping-pong when calling
>>>>> non-RT drivers from relaxed threads), and I hope you will not forget to
>>>>> answer this.
>>>>>
>>>>
>>>> I'm not ignoring the question, I have been postponing the answer until I
>>>> understand why the application could be put in a situation making this
>>>> migration a problem, and whether another approach would exist for
>>>> solving that problem within the current scheme.
>>>
>>> These two scenarios are unrelated: this migration issue would still be
>>> there even if we solved the one below via a different application/driver
>>> design.
>>>
>>
>> Which starts to be an issue only because the caller is a Cobalt shadow
>> undergoing the SCHED_WEAK policy, calling a RTDM driver for a non-rt
>> operation very frequently. For this reason, those two scenarii are very
>> much related.
>
> Not SCHED_WEAK, but being a shadow in the first place. Unless you
> enforce non-shadow thread creation, all are shadowed in a Xenomai
> application, thus are affected. However, asking our users to use
> __real_pthread_create extensively may not lead to the desired portable
> designs.
>
>>
>>>>
>>>>> But let's go to our case:
>>>>>
>>>>> We have a non-blocking service in the driver, the classic case of
>>>>> accessing a privileged resource that userspace can't or shouldn't touch
>>>>> directly. Think of some kind of register access that requires low-level
>>>>> synchronization with other threads and interrupt handlers. That service
>>>>> is called by both RT and non-RT threads (SCHED_WEAK) at higher frequency
>>>>> (some thousand times per second). The RT threads are obviously on the
>>>>> time critical path, must not migrate, and that can be achieved perfectly
>>>>> already by providing that service under ioctl_rt. The non-RT threads
>>>>> could be migrated to RT, but then they would pay an unneeded price,
>>>>> contributing to a higher system load, in the worst case overload.
>>>>> Therefore, the very same service shall be provided under ioctl_nrt as
>>>>> well. Makes sense?
>>>>>
>>>>
>>>> I understand the conflict with the "rt-always-has-precedence" rule
>>>> implemented by the conforming state, then I have another question:
>>>>
>>>> assuming the nrt thread undergoes the SCHED_WEAK policy because it is
>>>> mainly operating from the Linux space but still needs to synchronize
>>>> with the rt side at some point, which kind of high frequency interaction
>>>> with the rt side is this?
>>>>
>>>> Sharing some resource requiring mutual exclusion via a Cobalt synchro,
>>>> waiting for rt events, something else?
>>>>
>>>
>>> The synchronization need is first of all only on the hardware access
>>> (thus inside the driver), not necessarily at application level. In fact,
>>> there are even scenarios where you only want to exploit the driver as
>>> permission checker on privileged resource accesses (userspace shall only
>>> access certain MMIO registers in a page, thus the driver acts as
>>> gatekeeper). Then there could be no synchronization at all but still the
>>> need to provide migration-free accesses.
>>>
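The gatekeeper idea quoted above could look roughly like the sketch below.
It is a user-space model, not RTDM code: gate_check(), gate_read32() and
the allowed_offsets whitelist are hypothetical names and values, and a
plain array stands in for the mapped MMIO page. The point is only that
such a check neither blocks nor sleeps, so the same function could in
principle back both an ioctl_rt and an ioctl_nrt handler.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical whitelist: byte offsets inside one MMIO page that
 * user space is allowed to read. */
static const uint32_t allowed_offsets[] = { 0x10, 0x14, 0x40 };

/* Return 0 if the offset may be accessed, a negative errno otherwise. */
int gate_check(uint32_t offset)
{
    size_t i;

    if (offset & 0x3)
        return -EINVAL; /* only aligned 32-bit accesses in this model */

    for (i = 0; i < sizeof(allowed_offsets) / sizeof(allowed_offsets[0]); i++)
        if (allowed_offsets[i] == offset)
            return 0;

    return -EPERM;
}

/* Non-blocking register read guarded by the whitelist. */
int gate_read32(const volatile uint32_t *page, uint32_t offset, uint32_t *val)
{
    int ret = gate_check(offset);

    if (ret)
        return ret;

    *val = page[offset / sizeof(uint32_t)];
    return 0;
}
```

A real driver would add whatever locking the hardware needs around the
access; that part is left out here on purpose.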
>>
>> I get the idea of the resource gatekeeper, which does make a lot of sense.
>>
>> However I still don't get which benefit your caller has in undergoing
>> the SCHED_WEAK policy - which implies that it has to share
>> synchronization points with Cobalt - compared to running as a regular
>> (glibc) thread, under whichever policy that could fit?
>
> See above: it's additional, non-portable instrumentation of your code to
> tag non-shadowed threads. And then you may easily run into trouble in
> larger, layered application designs where a non-shadowed thread will
> still need a blocking Xenomai service, e.g. via some hidden dependency.
>
>>
>> Leaving the non-RT ioctl call aside, which are those Cobalt calls the
>> SCHED_WEAK thread needs to invoke for synchronizing with rt threads?
>
> I don't have these details at hand, but let's consider a large layered
> application that also does significant work against Linux APIs during
> runtime. You can't always enforce the complete separation. Because if
> you can, you could also move the non-RT part into a separate process
> that has nothing to do with Xenomai.
>
> We promote the transparency of the Xenomai POSIX interface, and that
> should not make the usage of non-Xenomai services needlessly expensive
> or require extensive non-portable tagging via __real_ prefixes.
>

Ping on this still open topic (will now have to introduce a local patch
that restores the original behaviour). Can we resolve the issue upstream
as well?

Thanks,
Jan

--
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux
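
For readers joining the thread here, the two switching policies under
discussion can be compared with a toy user-space model. This is a sketch
under assumptions, not Cobalt code: every name below (dispatch_ioctl,
struct thread, EXEC_CURRENT, EXEC_CONFORMING, gate_service) is
hypothetical, and the model only encodes what the thread describes: the
"current" policy runs the handler for the caller's present mode and
retries in the other mode on -ENOSYS, while the "conforming" policy first
moves a shadow thread to primary mode.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in Xenomai. */
enum mode { SECONDARY = 0, PRIMARY = 1 };
enum policy { EXEC_CURRENT, EXEC_CONFORMING };

struct thread {
    bool shadow;     /* true for a Cobalt (rt-capable) thread */
    enum mode mode;  /* domain the thread currently runs in */
    int migrations;  /* number of domain switches performed */
};

typedef int (*handler_t)(void);

struct driver {
    handler_t ioctl_rt;  /* null pointer means "not provided" */
    handler_t ioctl_nrt;
};

static void migrate(struct thread *t, enum mode target)
{
    if (t->mode != target) {
        t->mode = target;
        t->migrations++;
    }
}

static int call_handler(const struct driver *d, enum mode m)
{
    handler_t h = (m == PRIMARY) ? d->ioctl_rt : d->ioctl_nrt;

    return h ? h() : -ENOSYS;
}

int dispatch_ioctl(struct thread *t, const struct driver *d, enum policy p)
{
    int ret;

    assert(t && d);

    /* Conforming: a shadow thread is moved to primary before the call. */
    if (p == EXEC_CONFORMING && t->shadow)
        migrate(t, PRIMARY);

    ret = call_handler(d, t->mode);
    if (ret != -ENOSYS)
        return ret;

    /* No handler for the current mode: retry in the other mode.
     * Only shadow threads can enter primary mode. */
    if (t->mode == SECONDARY && !t->shadow)
        return -ENOSYS;

    migrate(t, t->mode == PRIMARY ? SECONDARY : PRIMARY);
    return call_handler(d, t->mode);
}

/* Stand-in for a lightweight, non-blocking service. */
int gate_service(void)
{
    return 0;
}
```

In this model, a relaxed shadow thread calling a driver that offers the
service in both modes performs no migration under EXEC_CURRENT and one
under EXEC_CONFORMING. Against a driver that only provides ioctl_nrt, the
conforming policy costs two migrations per call, which is the ping-pong
mentioned earlier in the thread.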