public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Raj Rajkumar <raj@ece.cmu.edu>
To: linux-kernel@vger.kernel.org
Cc: Raj Rajkumar <raj@ece.cmu.edu>
Subject: Re: RFC for a new Scheduling policy/class in the Linux-kernel
Date: Thu, 16 Jul 2009 13:54:16 -0400	[thread overview]
Message-ID: <4A5F6948.5010107@ece.cmu.edu> (raw)

Dear Jim, Ted and others:

Let me first introduce myself.  I work on real-time systems at Carnegie 
Mellon Univ and am one of the authors on the papers that introduced the 
priority inheritance and priority ceiling protocols, and some of their 
follow-ons.  I was also a co-founder of TimeSys in 1996.  My PhD 
student, Karthik Lakshmanan, and an SEI Sr. Staff Member, Dr. Dionisio 
de Niz, and I have begun to revisit multiprocessor scheduling and 
synchronization thanks in part to the surging interest in multicore 
processors.

I just came to know about this message thread.  I hope, as Jim did, that 
I can add something to the ongoing discussion - if not, please bear with me.

First of all, let me agree completely with Jim that simplicity goes a 
very long way.  Two relevant examples follow.   First, fixed-priority 
scheduling has dominated most real-time systems over dynamic-priority 
scheduling) since (a) the practical performance differences are not 
significant (b) overheads are lower and (c) perhaps above all, less 
information is required.  Secondly, priority ceiling/ceiling emulation 
protocols have much better properties than basic priority inheritance 
protocols but the latter are more widely used.  The reason appears to be 
that PCP requires additional information which can also change at 
run-time.   Overall, simplicity does win.   

Secondly, let me fully agree with Ted in that Linux-RT extensions should 
support bandwidth control & isolation.   My group's work on "reserves" 
and "resource kernels" looked at isolating both temporal and spatial 
resources (please see 
http://www.ece.cmu.edu/~raj/publications.html#ResourceKernels) in the 
context of Linux.  Karthik's work extended Linux/RK to distributed 
systems (Distributed Linux/RK).

Thirdly, I also agree with Jim that non-preemptive locking and FIFO 
queueing are fine for mutexes in the kernel.  The primary reason is that 
critical sections in the kernel are written by experts and tend to be 
short.  And as it turns out, this is exactly how Linux SMP support has 
been for quite a few years.  

It looks to me like Jim and Bjoern name the kernel-mutex locking scheme 
(of non-preemption and FIFO queueing) as FMLP and advocate it for 
user-level mutexes.  Jim: Please correct me if my interpretation is 
incorrect.

I would like to propose a solution with 2 components.  First, priority 
inheritance protocols not just prevent (potentially) unbounded priority 
inversion but offer less blocking for higher-priority tasks with 
typically tighter timing constraints.   It is also well known that 
non-preemptive execution is an extreme/simplified form of the priority 
ceiling protocol, where every task is assumed to access every shared 
resource/mutex and hence every priority ceiling is the highest priority 
in the system.    There are cases when this is fine (e.g. when all 
critical sections are *relatively* small as in the Linux kernel) and 
there are cases when this is not (e.g. when some user-level critical 
sections accessed only by lower-criticality tasks are *relatively* long 
compared to the timing constraints of higher priority tasks.

Component 1: Let a priority ceiling be associated with each user-level 
mutex.   When the mutex is locked, the corresponding critical section is 
executed at the priority ceiling value.   The developer can choose this 
to be the highest priority in the system in which case it becomes a 
non-preemptive critical section.    In addition, we could allow mutexes 
to either pick basic priority inheritance (desirable for local mutexes?) 
or the priority ceiling version (desirable for global mutexes shared 
across processors/cores).

Component 2: The queueing discipline for tasks pending on locked mutexes 
is the second policy under discussion.   Jim argues that it should be 
FIFO.  Imagine a bunch of lower-priority tasks stuck in front of a 
higher-priority task, and the "priority inversion" time can be 
considerable.  (Some including me would argue that it goes against the 
grain of using priority-based scheduling for real-time systems in the 
first place.  Why not just use FIFO scheduling?).    However, a 
practical compromise is easy to reach as well.  Let the queueing policy 
(FIFO or priority-based) on mutexes be an option.  FIFO for a "local" 
mutex would not be very helpful.  And priority-queueing for a "global" 
mutex would help tasks with tight timing constraints.

Proposal Summary

   1. Associate a priority ceiling (priority at which critical sections 
are executed) with each (global) mutex.   A macro HIGHEST_CEILING could 
represent a value that is higher than any other application-level priority.
   2. Allow the developer to specify the queueing discipline on a 
(global) mutex.   MUTEX_FIFO_QUEUEING and MUTEX_PRIORITY_QUEUEING are 
allowed options.

I would appreciate any comments.   Thanks for reading through a lot (if 
you reached this far :-)

---
Raj
http://www.ece.cmu.edu/~raj


P.S.  Some relevant references from a couple of decades back.

[1] Rajkumar, R., "Real-Time Synchronization Protocols for Shared Memory 
Multiprocessors". The Tenth International Conference on Distributed 
Computing Systems, 1990.
[2] Rajkumar, R., Sha, L., and Lehoczky J.P. "Real-Time Synchronization 
Protocols for Multiprocessors". Proceedings of the
IEEE Real-Time Systems Symposium, December 1988, pp. 259-269.

Global locking is costly - global critical sections could be moved to a 
"global synchronization processor" (and is described in one of the 
articles above).


             reply	other threads:[~2009-07-16 17:54 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-16 17:54 Raj Rajkumar [this message]
  -- strict thread matches above, loose matches on Subject: below --
2009-07-10 21:50 RFC for a new Scheduling policy/class in the Linux-kernel Henrik Austad
2009-07-11 18:28 ` Peter Zijlstra
2009-07-12  2:40   ` Douglas Niehaus
2009-07-12 15:31     ` Peter Zijlstra
2009-07-13 15:44       ` Raistlin
2009-07-13 16:33         ` Chris Friesen
2009-07-14 10:47           ` Raistlin
2009-07-14 11:03             ` Peter Zijlstra
2009-07-14 18:19               ` Raistlin
2009-07-14 14:48             ` Chris Friesen
2009-07-14 15:19               ` James H. Anderson
2009-07-14 16:31                 ` Peter Zijlstra
2009-07-14 16:54                   ` James H. Anderson
2009-07-14 19:28                     ` Henrik Austad
2009-07-14 19:33                       ` James H. Anderson
2009-07-15 21:53                       ` Ted Baker
2009-07-17  7:40                         ` Henrik Austad
2009-07-17 13:37                           ` Ted Baker
2009-07-15  4:25                     ` Bjoern B. Brandenburg
2009-07-15 20:55                     ` Ted Baker
2009-07-15 21:53                       ` Chris Friesen
2009-07-15 22:34                         ` Ted Baker
2009-07-15 22:39                           ` Dhaval Giani
2009-07-15 23:16                             ` Ted Baker
2009-07-16  8:58                               ` Peter Zijlstra
2009-07-16  9:11                                 ` Peter Zijlstra
2009-07-17  0:32                                 ` Raistlin
2009-07-17  0:43                                 ` Raistlin
2009-07-16 12:17                               ` Raistlin
2009-07-16 23:29                       ` Raistlin
2009-07-18 20:12                         ` Michal Sojka
2009-07-14 17:16                   ` James H. Anderson
2009-07-15 21:19                     ` Ted Baker
2009-07-14 19:54                   ` Raistlin
2009-07-14 16:48               ` Raistlin
2009-07-14 18:24                 ` Chris Friesen
2009-07-14 19:14                   ` Raistlin
2009-07-15 22:14                   ` Ted Baker
2009-07-16  7:17                     ` Henrik Austad
2009-07-16 23:13                       ` Ted Baker
2009-07-17  0:19                         ` Raistlin
2009-07-17  7:31                         ` Henrik Austad
2009-07-16 14:46                     ` Chris Friesen
2009-07-16 22:34                       ` Ted Baker
2009-07-16 23:07                         ` Raistlin
2009-07-15 21:45               ` Ted Baker
2009-07-15 22:12                 ` Chris Friesen
2009-07-15 22:52                   ` Ted Baker
2009-07-17 13:35             ` Giuseppe Lipari
2009-07-13 17:25         ` Peter Zijlstra
2009-07-13 18:14           ` Noah Watkins
2009-07-13 20:13             ` Ted Baker
2009-07-13 21:45               ` Chris Friesen
2009-07-14 11:16                 ` Raistlin
2009-07-15 23:11                 ` Ted Baker
2009-07-16  7:58                   ` Peter Zijlstra
2009-07-16  8:52                     ` Thomas Gleixner
2009-07-16 12:17                     ` Raistlin
2009-07-16 12:59                       ` James H. Anderson
2009-07-16 13:37                         ` Peter Zijlstra
2009-07-16 22:15                     ` Ted Baker
2009-07-16 22:34                       ` Karthik Singaram Lakshmanan
2009-07-16 23:38                         ` Ted Baker
2009-07-17  1:44                           ` Karthik Singaram Lakshmanan
2009-07-16 15:17                   ` Chris Friesen
2009-07-16 21:26                     ` Ted Baker
2009-07-16 22:08                       ` Chris Friesen
2009-07-16 23:54                         ` Ted Baker
2009-07-14  9:15             ` Peter Zijlstra
2009-07-14 19:07               ` Raistlin
2009-07-13 17:28         ` Peter Zijlstra
2009-07-14 19:47           ` Raistlin
     [not found]     ` <002301ca0403$47f9d9d0$d7ed8d70$@tlh@comcast.net>
2009-07-13 23:47       ` Douglas Niehaus
2009-07-14  7:27         ` Chris Friesen
2009-07-14  7:44           ` Douglas Niehaus
2009-07-12  6:17   ` Henrik Austad
2009-07-13  9:55   ` Raistlin
2009-07-13 10:14     ` Peter Zijlstra
2009-07-13 16:06       ` Raistlin
2009-07-14  8:42         ` Peter Zijlstra
2009-07-14  9:36           ` Raistlin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A5F6948.5010107@ece.cmu.edu \
    --to=raj@ece.cmu.edu \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox