Re: AW: [Xenomai-core] Ipipe hook at system call exit

From: Philippe Gerum <rpm@xenomai.org>
To: "Krause, Karl-Heinz" <karl-heinz.krause@domain.hid>
Cc: xenomai@xenomai.org
Subject: Re: AW: [Xenomai-core] Ipipe hook at system call exit
Date: Wed, 07 Jun 2006 18:53:09 +0200
Message-ID: <44870475.8030705@domain.hid>
In-Reply-To: <88AEA5AC18A141439A0D954EB037B0D30439F329@domain.hid>

Krause, Karl-Heinz wrote:
> Thanks Philippe for your quick reply.
> 
> Maybe a few additional remarks will clear up some remaining misunderstandings. Compared to Xenomai, the basic differences are:
> -- there are no shadow threads
> -- we use the standard glibc with futex-based synchronization/communication
>    working transparently across the domain boundary.
> This transparency is demonstrated by having the process binary run on a Linux that is ipipe-patched only. After loading the realtime module, the same binary runs, but the responding threads now provide the realtime guarantees.
> Now let's go beyond marketing:
> First, the glibc implementation is not completely realtime capable. This concerns two functions:
> - the implementation of the SIGEV_THREAD notification
> - the implementation of the spinlock() function
> For both we do have realtime-capable implementations. Since these issues also hold for a natively realtime-capable Linux (PREEMPT_RT), they must be solved there as well, and the chances of having one standard implementation are quite good. The only difference may be spinlock(): here the protection needed for static priorities can be done
> differently. For a two-kernel solution the protection must work across domains (interrupt disable).
> 
> I guess the transparency also explains why we cannot rely on lazy migration back.
> If, for example, a thread which should provide realtime response does an
> open() or an mmap() during its setup phase and then a sigwait() for
> responding, then the sigwait() call has to be executed in the realtime
> domain from the very beginning. Having the check for migration back to
> realtime at the system call epilogue of Linux is the most convenient way;
> otherwise we would need hooks in every system call function which can be
> propagated.

But, intercepting the SYSCALL event for all domains is equivalent to 
having such hook.

> 
> Concerning the futex function. Currently we intercept at the system call exit and call the corresponding rt-function when the number of requested 
> wakeups could not be performed.

You mean that if Linux fails to identify one of its own futexes during a 
get/release operation, then the handling is passed to the RT side?

> This provides excellent filtering
> but works for regular mutexes only. If we want to preserve the exact semantics for PI mutexes, we have to call the rt-function up front.
> For mutexes with priority ceiling, the migration check at system call exit is sufficient. For priority inheritance we would need to use the scheduler hook.
> 
> Concerning the mlock stuff, we consider it insufficient: if somebody does a malloc() and sets up preallocated structures, the pages are not necessarily touched.
> 

I still don't get the point here. mlocking the data segment should cause 
all pages included in this segment to be touched by the mm during the 
fixup, basically by forcing the invocation of the page fault handler for 
each page found in the associated VMAs. So there is no way the 
underlying physical memory could remain uncommitted after mlock.
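
A minimal user-space sketch of that claim, assuming a Linux system where mlock(2) and mincore(2) are available and RLIMIT_MEMLOCK allows locking a few pages. It maps anonymous memory without touching it, locks it, and counts how many pages are resident afterwards. The helper names count_resident() and resident_after_mlock() are made up for this illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count resident pages in [addr, addr + len) via mincore(2); -1 on error. */
static long count_resident(void *addr, size_t len, size_t page)
{
    size_t npages;
    unsigned char *vec;
    long resident = 0;

    assert(page > 0 && len % page == 0);
    npages = len / page;
    vec = malloc(npages);
    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < npages; i++)
        resident += vec[i] & 1;   /* bit 0 set: page is in core */
    free(vec);
    return resident;
}

/*
 * Hypothetical helper: map npages of anonymous memory WITHOUT touching
 * them, mlock() the range, and return the number of resident pages.
 * Returns -1 on any failure (e.g. RLIMIT_MEMLOCK too low).
 */
long resident_after_mlock(size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len;
    void *p;
    long resident;

    if (page <= 0 || npages == 0)
        return -1;
    len = npages * (size_t)page;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* No store to the mapping happens here: only mlock() can fault it in. */
    if (mlock(p, len) != 0) {
        munmap(p, len);
        return -1;
    }

    resident = count_resident(p, len, (size_t)page);
    munlock(p, len);
    munmap(p, len);
    return resident;
}
```

When locking succeeds, resident_after_mlock(4) is expected to report all four pages as resident although nothing ever wrote to them. mlockall(2) with MCL_CURRENT | MCL_FUTURE should extend the same treatment to mappings created later by brk() or mmap().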

> Concerning the performance issues and your remark that you still have work to do:
> for us, treating system calls as what they really are, namely synchronous
> exceptions which should be handled by the causing domain only, would be a
> perfect fit and would be faster (as an option).
> 

IPIPE_EVENT_SELF does it for recent I-pipe patches. It's a modifier 
telling Adeos to send the event only to the causing domain's handler.
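
As a rough illustration of the effect only: the following is a toy user-space model, not Adeos code. struct model_domain, model_dispatch() and the self_only flag are invented names, and propagation control through a handler's return value is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a domain in the pipeline (hypothetical type). */
struct model_domain {
    const char *name;
    int self_only;   /* models a handler registered with a "self" modifier */
    int calls;       /* how many times this domain's handler ran */
};

/*
 * Deliver one syscall event raised by pipeline[origin].
 * Index 0 is the highest-priority domain. A domain whose handler is
 * marked self_only is skipped unless it is the one that caused the event.
 * Returns the number of handlers that ran.
 */
int model_dispatch(struct model_domain *pipeline, size_t n, size_t origin)
{
    int ran = 0;

    for (size_t i = 0; i < n; i++) {
        if (pipeline[i].self_only && i != origin)
            continue;            /* event was not caused by this domain */
        pipeline[i].calls++;     /* "run" the handler */
        ran++;
    }
    return ran;
}
```

With both domains marked self_only, a syscall issued from the Linux domain runs exactly one handler, the Linux one, so the real-time domain's handler never sits on that path.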

> Hopefully this clears up the issues somewhat.

Well, yes and no. Talking about the syscall exit hook, I don't get why 
it is absolutely required since the co-kernel can control the migrations 
as part of a preamble and/or postamble code surrounding the syscall 
demux, given that all syscalls from any domain can be filtered through 
it by Adeos. I do understand that changing the existing and working code 
might not be the preferred solution though, but additions to the 
critical path must enable a mandatory feature which could not be 
obtained by other means.
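
To make the preamble/postamble idea concrete, here is a small user-space model of such a demux. Everything in it is hypothetical (the model_* names, the two example syscalls, and the reading of the priority threshold 68 from the first message as "strictly above"); it only shows that an eager migration back can be done from the co-kernel's own epilogue.

```c
#include <assert.h>

enum model_domain_id { MODEL_LINUX, MODEL_RT };
enum model_syscall   { MODEL_SYS_OPEN, MODEL_SYS_SIGWAIT };

/* Example threshold taken from the first message; semantics assumed. */
#define MODEL_RT_PRIO_THRESHOLD 68

struct model_thread {
    int prio;                     /* static priority */
    enum model_domain_id domain;  /* scheduler currently in charge */
    int migrations;               /* number of domain switches so far */
};

static void model_migrate(struct model_thread *t, enum model_domain_id to)
{
    if (t->domain != to) {
        t->domain = to;
        t->migrations++;
    }
}

/*
 * Model of the syscall demux. Returns the domain the call ran in.
 * Preamble: setup-type calls are pushed to Linux.
 * Postamble: a thread above the threshold is eagerly moved back to RT.
 */
enum model_domain_id model_syscall_demux(struct model_thread *t,
                                         enum model_syscall nr)
{
    enum model_domain_id ran_in;

    if (nr == MODEL_SYS_OPEN)          /* preamble */
        model_migrate(t, MODEL_LINUX);

    ran_in = t->domain;                /* the real handler would run here */

    if (t->prio > MODEL_RT_PRIO_THRESHOLD)   /* postamble */
        model_migrate(t, MODEL_RT);

    return ran_in;
}
```

In this model, an open() followed by a sigwait() from a priority-80 thread runs the first call under Linux and the second one in the real-time domain from the start, without any hook in the Linux syscall return path.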

> 
> 
> Karl-Heinz
>   
> 
> -----Original Message-----
> From: Philippe Gerum [mailto:rpm@xenomai.org]
> Sent: Wednesday, 7 June 2006 15:21
> To: Krause, Karl-Heinz
> Cc: xenomai@xenomai.org
> Subject: Re: [Xenomai-core] Ipipe hook at system call exit
> 
> 
> Hello,
> 
> Krause, Karl-Heinz wrote:
> 
>>Hello Philippe
>>
>> 
>>
>>Jan Kiszka referred me to you discussing our problem with a missing 
>>Ipipe hook at system call exit.
>>
>>We at Siemens A&D do have a Linux realtime approach which is based on a 
>>previous ADEOS version. When trying to port an improved version to the 
>>Ipipe version for kernel 2.6.15.4 we ran into the problem of not having 
>>an event hook at system call exit. Let me explain the need for it by 
>>briefly outlining our approach.
>>
>>It is a two-kernel approach based on the model of a multithreaded process 
>>(meaning a 2.6 kernel) where the threads above a certain static priority 
>>level, e.g. 68, are scheduled by the scheduler of the realtime kernel. 
>>The realtime kernel maintains exactly the same system call interface as 
>>the Linux kernel. The entire process works uniformly with the glibc. 
>>The glibc isn't aware of which scheduler the current thread is 
>>executing under. To make this happen, and to have both schedulers work with 
>>the same struct task_struct, we had to put some restrictions on the 
>>signalling for the realtime domain (restrictions which make sense for 
>>the realtime arena anyway). Because of that transparency, this approach 
>>somehow combines the advantages of a separate realtime kernel with the 
>>user convenience of PREEMPT_RT (user convenience was the driving 
>>requirement for our approach).
>>
> 
> 
> There seems to be quite a lot of commonality with the way Xenomai deals 
> with shadow threads to enable realtime processing in user-space, while 
> providing a seamless integration with Linux. One difference might be the 
> way your system deals with Linux syscalls fired on behalf of a thread 
> controlled by the real-time scheduler; Xenomai migrates the thread to 
> the Linux scheduler transparently, but I did not figure out yet if this 
> was a relevant issue in your system. Anyway, I think that I now roughly 
> understand the general dynamics of it, thanks for the explanations.
> 
> 
>> 
>>
>>Now to the question of why we need a hook at system call exit.
>>
>>The hook at system call entry branches to the system call handling of the 
>>realtime kernel, which is also entered via a system call table. The 
>>handling can be grouped into three classes:
>>
>>-         complete handling in the realtime domain e.g. timer_settime(), 
>>sigwait()
>>
>>-         only migration of the thread to the Linux scheduler. Basically 
>>all calls needed for setup e.g. open(), mmap(), pthread_create().  The 
>>migration is transparent for the ipipe code, the thread continues 
>>execution in the Linux domain with the call of the Linux system call 
>>table (the priority hasn't changed).
>>
>>-         handling in the realtime domain and migration to the Linux 
>>domain if the thread priority has dropped under the boundary (e.g. 
>>releasing a mutex with priority ceiling)
>>
>> 
>>
>>In particular for the second case, a check needs to be done at system call 
>>exit as to whether the thread has to migrate (back) to the realtime 
>>scheduler. But this is also needed when a call issued in the Linux 
>>domain raises the priority above the threshold. A third reason for the hook is 
>>to touch the corresponding pages after a brk() or mmap() call to 
>>make them resident.
>>
>> Note:
>>
>>The migration only takes place for threads of a process marked as realtime.
>>
>>Currently we allow only for one realtime process. First it is sufficient 
>>for us and second it allows us to maintain the futex queue (each domain 
>>maintains a local queue) of the realtime domain with virtual addresses 
>>(no mm_lock).  
>>
> 
> 
> Does this mean that you specifically intercept futex ops to process them 
> in real-time mode when fired over the real-time context? Which would in 
> turn allow you to traverse most of the glibc code and get it 
> synchronized with the plain Linux threads?
> 
> 
>> 
>>
>>So this hook at system call exit is a necessity for us. Of course we 
>>could do a private patch, but do you see a possibility to have it in the 
>>standard Ipipe-patch?
>>
> 
> 
> Basically, I removed the sysexit hook from the I-pipe patch because it 
> added a non-negligible overhead to each syscall. Even the sysenter hook 
> needs some work to reduce its CPU footprint and I've planned to tackle 
> the issue soon. For this reason, the current Xeno implementation only 
> relies on the sysenter (IPIPE_EVENT_SYSCALL) hook to deal with 
> migrations between the Linux and Xenomai schedulers, usually enforcing a 
> lazy migration scheme, i.e. the syscall prologue added by the RT 
> extension switches the caller to the proper domain before running the 
> system call handler, but does not eagerly switch back to the originating 
> domain (well, there are exceptions to this, but that's the usual way 
> things are handled).
> 
> Reading your description, a few questions came to my mind:
> 
> - why do you force a switch back to the originating domain? IOW, are 
> eager transitions absolutely required in your design, since your RT 
> thread is underlaid by a regular Linux task anyway, so it could continue 
> its processing and switch back to the RT side only when needed?
> 
> - would not it be possible to intercept the IPIPE_EVENT_SETSCHED 
> notifications, which are fired by the I-pipe when a Linux task is about 
> to have its priority changed? It's a direct hook from the kernel's 
> sched_setscheduler(), which is given the task_struct pointer of the 
> altered task, right after its priority field has been updated, but still 
> before the Linux runqueue is reordered.
> 
> - would mlocking the data segment of your application be enough/possible 
> to ensure that brk() and mmap()ed segments get committed to physical 
> memory automatically, and as such spare you the need for touching those 
> areas explicitly? AFAIK, mlocked pages are going to be fixed up this 
> way by the mm layer during the mlocking call.
> 
> - generally speaking, since you control the prologue and epilogue of all 
> system calls (Linux or real-time) which go through your own syscall 
> demux by means of the IPIPE_EVENT_SYSCALL hook, it should be possible to 
> handle the whole migration issue (be it eager or lazy in this case) from 
> your code, instead of relying on a hook inserted in Linux's syscall 
> return path. Or am I missing something?
> 
> 
>> 
>>
>> 
>>
>>Karl-Heinz Krause
>>
>>Siemens A&D
>>
>>Nbg.-Moorenbrunn
>>


-- 

Philippe.
