Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]

linux-kernel.vger.kernel.org archive mirror

* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
@ 2010-04-26 11:56 Ted Baker
  2010-04-26 18:29 ` Joerg Roedel
  2010-05-03 14:41 ` Peter Zijlstra
  0 siblings, 2 replies; 10+ messages in thread
From: Ted Baker @ 2010-04-26 11:56 UTC (permalink / raw)
  To: raj, jayhawk, a.p.zijlstra, raistlin, niehaus, henrik,
	linux-kernel, mingo, billh, linux-rt-users, fabio, anderson, tglx,
	dhaval.giani, cucinotta, lipari
  Cc: baker.tlh

I have not seen any more e-mail on this.  How is it going?  Is there any
chance of rolling in some corrections for the SCHED_SPORADIC treatment?  In
particular, could we have a DO_NOT_RUN priority, that is guaranteed to
prevent a task from running at all?

For more detail, see http://ww2.cs.fsu.edu/~stanovic/papers/rtas10.pdf .

Ted


* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
  2010-04-26 11:56 [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel] Ted Baker
@ 2010-04-26 18:29 ` Joerg Roedel
  2010-04-26 18:37   ` Doug Niehaus
  2010-05-03 14:41 ` Peter Zijlstra
  1 sibling, 1 reply; 10+ messages in thread
From: Joerg Roedel @ 2010-04-26 18:29 UTC (permalink / raw)
  To: Ted Baker
  Cc: raj, jayhawk, a.p.zijlstra, raistlin, niehaus, henrik,
	linux-kernel, mingo, billh, linux-rt-users, fabio, anderson, tglx,
	dhaval.giani, cucinotta, lipari, baker.tlh

On Mon, Apr 26, 2010 at 07:56:58AM -0400, Ted Baker wrote:
> I have not seen any more e-mail on this.  How is it going?  Is there any
> chance of rolling in some corrections for the SCHED_SPORADIC treatment?  In
> particular, could we have a DO_NOT_RUN priority, that is guaranteed to
> prevent a task from running at all?

Sorry for asking a maybe stupid question, but what is this good for and
what is the benefit over SIGSTOP?

	Joerg



* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
  2010-04-26 18:29 ` Joerg Roedel
@ 2010-04-26 18:37   ` Doug Niehaus
  0 siblings, 0 replies; 10+ messages in thread
From: Doug Niehaus @ 2010-04-26 18:37 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Ted Baker, raj, jayhawk, a.p.zijlstra, raistlin, henrik,
	linux-kernel, mingo, billh, linux-rt-users, fabio, anderson, tglx,
	dhaval.giani, cucinotta, lipari, baker.tlh

One limitation of SIGSTOP is that the last time I instrumented it in 
detail, which admittedly was several years ago, every process you send 
the message to has to run long enough to receive the message.

We were in the position of sending it to fairly large groups of 
processes, partly through laziness in code structure, but when I looked 
at the time lines I was appalled to see a large number of context 
switches for *very* short execution intervals. All were associated with 
receiving the SIGSTOP and SIGCONT.

I found a rather painful humor in the fact that we were running 
processes in order to keep them from running.

So, a way to change state of a process that does not cost a context 
switch has some appeal.
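
For reference, here is a minimal sketch of the SIGSTOP/SIGCONT mechanism being discussed, written in Python for a POSIX system. The helper name pause_and_resume and the sleeping child command are illustrative assumptions, not anything from this thread. The sketch only shows the signalling and the stop notification; it does not measure the context switches described above.

```python
import os
import signal
import subprocess
import sys


def pause_and_resume(child):
    """Stop a child with SIGSTOP, confirm it stopped, then resume it."""
    os.kill(child.pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid report a stopped child, not only a dead one.
    _, status = os.waitpid(child.pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status) and os.WSTOPSIG(status) == signal.SIGSTOP
    os.kill(child.pid, signal.SIGCONT)
    return stopped


# Hypothetical workload: a child that just sleeps.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
was_stopped = pause_and_resume(child)
child.kill()
child.wait()
```

Doing this for a large group means one signal per process in each direction, which is the per-process cost the message above is concerned with.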

Doug

On 04/26/2010 01:29 PM, Joerg Roedel wrote:
> On Mon, Apr 26, 2010 at 07:56:58AM -0400, Ted Baker wrote:
>    
>> I have not seen any more e-mail on this.  How is it going?  Is there any
>> chance of rolling in some corrections for the SCHED_SPORADIC treatment?  In
>> particular, could we have a DO_NOT_RUN priority, that is guaranteed to
>> prevent a task from running at all?
>>      
> Sorry for asking a maybe stupid question, but what is this good for and
> what is the benefit over SIGSTOP?
>
> 	Joerg
>
>    


* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
  2010-04-26 11:56 [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel] Ted Baker
  2010-04-26 18:29 ` Joerg Roedel
@ 2010-05-03 14:41 ` Peter Zijlstra
  2010-05-03 15:54   ` Ted Baker
  2010-05-03 16:13   ` Ted Baker
  1 sibling, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2010-05-03 14:41 UTC (permalink / raw)
  To: Ted Baker
  Cc: raj, jayhawk, raistlin, niehaus, henrik, linux-kernel, mingo,
	billh, linux-rt-users, fabio, anderson, tglx, dhaval.giani,
	cucinotta, lipari, baker.tlh

Hi Ted,

On Mon, 2010-04-26 at 07:56 -0400, Ted Baker wrote:
> I have not seen any more e-mail on this.  How is it going?  Is there any
> chance of rolling in some corrections for the SCHED_SPORADIC treatment?  In
> particular, could we have a DO_NOT_RUN priority, that is guaranteed to
> prevent a task from running at all?

Without having fully read the referenced paper, we're currently looking
to support the sporadic task model through SCHED_DEADLINE (by our SSSUP
friends):

  http://lkml.org/lkml/2010/2/28/107

This work aims to implement a full sporadic task scheduler [initially
(g)EDF], SCHED_SPORADIC would have been a better name, but since POSIX
stole that from us we took SCHED_DEADLINE to indicate it's a deadline
scheduler.

Along with this work comes the full Deadline-inheritance (which should
be but a small change from our current Priority-inheritance code), and
also Bandwidth-inheritance (more work). Esp. the latter would also be
required for your proposed SCHED_SPORADIC since it does aim to be a
'strict' bandwidth enforcing scheduler.

[Does the proposed 'fixed' SCHED_SPORADIC deal with admission control?]

But as it stands, this work would provide much more complete sporadic
task support than the fixed SCHED_SPORADIC would.
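
To make the terms concrete, here is a rough sketch of the two ideas mentioned above: an EDF dispatch rule and a utilization-based admission test. It assumes a single CPU and tasks whose relative deadline equals their period, where the classical EDF condition is that total utilization must not exceed 1. The function names and task sets are made up for illustration and are not taken from the SCHED_DEADLINE patches.

```python
from fractions import Fraction


def admit(tasks):
    """Utilization-based admission test for EDF on one CPU.

    tasks: list of (runtime, period) pairs, deadline assumed equal to period.
    Returns True when the total utilization does not exceed 1.
    """
    return sum(Fraction(runtime, period) for runtime, period in tasks) <= 1


def edf_pick(ready):
    """Return the name of the ready job with the earliest absolute deadline."""
    if not ready:
        return None
    return min(ready, key=lambda job: job[1])[0]


tasks = [(1, 4), (2, 6), (3, 12)]            # utilization 1/4 + 1/3 + 1/4 = 5/6
ready = [("a", 40), ("b", 25), ("c", 31)]    # (name, absolute deadline)
```

With these values the set is admitted, adding another (1, 4) task pushes utilization to 13/12 and is rejected, and job "b" is dispatched first.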


* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
  2010-05-03 14:41 ` Peter Zijlstra
@ 2010-05-03 15:54   ` Ted Baker
  2010-05-03 16:13   ` Ted Baker
  1 sibling, 0 replies; 10+ messages in thread
From: Ted Baker @ 2010-05-03 15:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: raj, jayhawk, raistlin, niehaus, henrik, linux-kernel, mingo,
	billh, linux-rt-users, fabio, anderson, tglx, dhaval.giani,
	cucinotta, lipari, baker.tlh

Dear Peter,

Thanks for the message.

The reference to lkml is not very helpful, as it refers to 5
months of e-mail, broken down by days, with no (apparent) search
capability.  However, I did manage to find some references to the
deadline scheduling work via the search capability on the
marc.info site.

In any case, deadline scheduling and sporadic server scheduling
are two quite different things. SCHED_SPORADIC belongs to the
fixed priority scheduling domain.  So, the proposed SCHED_DEADLINE
will not meet the same user requirement.
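
For readers unfamiliar with the model, a heavily simplified sketch of sporadic-server budget accounting follows. The class and field names are hypothetical; they loosely correspond to the POSIX parameters sched_ss_init_budget, sched_ss_repl_period and sched_ss_low_priority. The sketch assumes each chunk of execution is replenished one period after the chunk started, and it ignores the limit on pending replenishments and the corner cases that the referenced paper is about.

```python
from dataclasses import dataclass, field


@dataclass
class SporadicServer:
    budget: int        # remaining execution capacity, in time units
    period: int        # replenishment period
    high_prio: int     # priority while budget remains
    low_prio: int      # priority once the budget is exhausted
    pending: list = field(default_factory=list)   # (due_time, amount)

    def run(self, start, duration):
        """Consume up to `duration` units from `start`; return units used."""
        used = min(duration, self.budget)
        if used:
            self.budget -= used
            # Simplification: give the capacity back one period later.
            self.pending.append((start + self.period, used))
        return used

    def replenish(self, now):
        """Apply every replenishment that is due at time `now`."""
        self.budget += sum(amt for due, amt in self.pending if due <= now)
        self.pending = [(due, amt) for due, amt in self.pending if due > now]

    def priority(self):
        return self.high_prio if self.budget > 0 else self.low_prio


server = SporadicServer(budget=3, period=10, high_prio=50, low_prio=1)
ran = server.run(start=0, duration=5)   # only 3 units of budget exist
```

A DO_NOT_RUN level, as requested at the top of this thread, would take the place of low_prio here, so that an exhausted server cannot run at all until it is replenished.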

SCHED_SPORADIC has a well-defined POSIX name and a well-defined
API.  The semantics are not so badly broken that they can't be
fixed by an "interpretation" of the existing standard.  I intend
to submit a request for such an interpretation to the Austin
Group, to get them to relax their specification for
SCHED_SPORADIC, so that it can be implemented with reasonable
semantics.  It would help to convince the Austin Group to do this
if there were "existing practice" in Linux for this, though.

While I appreciate the enthusiasm of the SSSUP folks for deadline
scheduling, the debates over the virtues of deadline-based vs.
fixed-priority scheduling, as well as over partitioned versus
global scheduling schemes, are likely to continue without a clear
resolution for a long time.  In the meantime, existing
applications use fixed priorities.  So, I'd hope that wherever
Linux goes with respect to adding support for deadline scheduling
it maintains the option for applications to use fixed-priority
scheduling.

BTW, I've found that deadline and fixed-priority scheduling can be
implemented and used together in a reasonable way, using a mapping
of all system priorities to values of a 128-bit time type.  The
internal scheduler is entirely based on deadline values, but fixed
priorities above and below normal deadlines are accommodated by
mapping:

a. The largest possible time value (2^128) is reserved for
DO_NOT_RUN.  Each processor's idle task is given a priority
level higher (earlier in time) than this.

b. Fixed priority values that are intended to be lower in priority
(later) than deadline-scheduled tasks are given "deadlines" in the
unreachable future.  A range of very large time values, just
short of DO_NOT_RUN can be reserved for this purpose.

c. Most time values are treated as normal deadlines.

d. The range of deadline values that would always be in the past,
either negative values or values very close to zero, is reserved
for fixed priority scheduling of higher priority (earlier) than
all deadline-scheduled tasks.

In this fashion, it is easy for one scheduler, using one domain
of priority values with a consistent interpretation, to implement
both fixed-priority and deadline scheduling.
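
The mapping in (a) through (d) could be sketched roughly as follows. This is only an illustration under stated assumptions: an unsigned 128-bit key, so the reserved maximum is 2^128 - 1 and the "always in the past" band sits just above zero rather than in negative values; 100 fixed-priority levels per band; and a larger fixed priority number meaning more urgent. All names are hypothetical.

```python
TIME_BITS = 128
DO_NOT_RUN = (1 << TIME_BITS) - 1    # (a) largest key: never selected
IDLE_KEY = DO_NOT_RUN - 1            # (a) idle task, just earlier than that
NUM_PRIO = 100                       # assumed number of fixed levels per band
LOW_BASE = IDLE_KEY - NUM_PRIO       # (b) start of the "unreachable future"


def high_fixed_key(prio):
    """(d) Fixed priority above all deadlines: keys 0 .. NUM_PRIO-1."""
    return NUM_PRIO - 1 - prio


def low_fixed_key(prio):
    """(b) Fixed priority below all deadlines, but above the idle task."""
    return LOW_BASE + (NUM_PRIO - 1 - prio)


def deadline_key(abs_deadline):
    """(c) Ordinary absolute deadlines use the time value itself."""
    if not NUM_PRIO <= abs_deadline < LOW_BASE:
        raise ValueError("deadline falls in a reserved band")
    return abs_deadline


def pick(keys):
    """Run the task with the smallest key, skipping DO_NOT_RUN tasks."""
    runnable = {name: key for name, key in keys.items() if key != DO_NOT_RUN}
    return min(runnable, key=runnable.get)


keys = {
    "idle": IDLE_KEY,
    "parked": DO_NOT_RUN,
    "background": low_fixed_key(5),
    "edf_job": deadline_key(10**9),
    "urgent": high_fixed_key(99),
}
```

One comparison on a single key type then orders every task, and a task parked at DO_NOT_RUN is never chosen because the idle task always sorts ahead of it.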

Ted

On Mon, May 03, 2010 at 04:41:22PM +0200, Peter Zijlstra wrote:
> Hi Ted,
> 
> On Mon, 2010-04-26 at 07:56 -0400, Ted Baker wrote:
> > I have not seen any more e-mail on this.  How is it going?  Is there any
> > chance of rolling in some corrections for the SCHED_SPORADIC treatment?  In
> > particular, could we have a DO_NOT_RUN priority, that is guaranteed to
> > prevent a task from running at all?
> 
> Without having fully read the referenced paper, we're currently looking
> to support the sporadic task model through SCHED_DEADLINE (by our SSSUP
> friends):
> 
>   http://lkml.org/lkml/2010/2/28/107
> 
> This work aims to implement a full sporadic task scheduler [initially
> (g)EDF], SCHED_SPORADIC would have been a better name, but since POSIX
> stole that from us we took SCHED_DEADLINE to indicate it's a deadline
> scheduler.
> 
> Along with this work comes the full Deadline-inheritance (which should
> be but a small change from our current Priority-inheritance code), and
> also Bandwidth-inheritance (more work). Esp. the latter would also be
> required for your proposed SCHED_SPORADIC since it does aim to be a
> 'strict' bandwidth enforcing scheduler.
> 
> [Does the proposed 'fixed' SCHED_SPORADIC deal with admission control?]
> 
> But as it stands, this work would provide much more complete sporadic
> task support than the fixed SCHED_SPORADIC would.


* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
  2010-05-03 14:41 ` Peter Zijlstra
  2010-05-03 15:54   ` Ted Baker
@ 2010-05-03 16:13   ` Ted Baker
  1 sibling, 0 replies; 10+ messages in thread
From: Ted Baker @ 2010-05-03 16:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: raj, jayhawk, raistlin, niehaus, henrik, linux-kernel, mingo,
	billh, linux-rt-users, fabio, anderson, tglx, dhaval.giani,
	cucinotta, lipari, baker.tlh

Sorry for the complaint about the lkml.org reference.
Somehow, the "2/28/107" portion of the URL got cut off when
I pasted it into my browser. :-}

However, the rest of my comments still apply.

--Ted

On Mon, May 03, 2010 at 04:41:22PM +0200, Peter Zijlstra wrote:
> Hi Ted,
> 
> On Mon, 2010-04-26 at 07:56 -0400, Ted Baker wrote:
> > I have not seen any more e-mail on this.  How is it going?  Is there any
> > chance of rolling in some corrections for the SCHED_SPORADIC treatment?  In
> > particular, could we have a DO_NOT_RUN priority, that is guaranteed to
> > prevent a task from running at all?
> 
> Without having fully read the referenced paper, we're currently looking
> to support the sporadic task model through SCHED_DEADLINE (by our SSSUP
> friends):
> 
>   http://lkml.org/lkml/2010/2/28/107
> 
> This work aims to implement a full sporadic task scheduler [initially
> (g)EDF], SCHED_SPORADIC would have been a better name, but since POSIX
> stole that from us we took SCHED_DEADLINE to indicate it's a deadline
> scheduler.
> 
> Along with this work comes the full Deadline-inheritance (which should
> be but a small change from our current Priority-inheritance code), and
> also Bandwidth-inheritance (more work). Esp. the latter would also be
> required for your proposed SCHED_SPORADIC since it does aim to be a
> 'strict' bandwidth enforcing scheduler.
> 
> [Does the proposed 'fixed' SCHED_SPORADIC deal with admission control?]
> 
> But as it stands, this work would provide much more complete sporadic
> task support than the fixed SCHED_SPORADIC would.


* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
@ 2009-07-16 19:41 Raj Rajkumar
  0 siblings, 0 replies; 10+ messages in thread
From: Raj Rajkumar @ 2009-07-16 19:41 UTC (permalink / raw)
  To: LKML

James H. Anderson wrote:
 >
 > Hi Raj,
 >
 > On Thu, 16 Jul 2009, Raj Rajkumar wrote:
 >
 >> non-preemptive critical section.    In addition, we could allow
 >> mutexes to either pick basic priority inheritance (desirable for local
 >> mutexes?) or the priority ceiling version (desirable for global mutexes
 >> shared across processors/cores).
 >
 > This discussion when I entered it was about using global scheduling
 > in Linux (not partitioning), so that's what I thought the focus of the
 > discussion was.  What's the definition of a local mutex in that case?
 > And how do you use ceilings under global scheduling?
 >
 > Thanks.
 >
 > -Jim

Hi Jim:   I was not aware of the global scheduling constraint from the 
earlier discussions - thanks for the clarification.   Two thoughts on 
global scheduling:

   1. I presume you and others have pointed out the anomalies and low 
processor utilization that can result from global scheduling (the Dhall 
& Liu analysis being the most famous).   In addition, there are run-time 
performance implications as the caches keep getting cold as processes 
migrate.   The Linux notion of processor affinity needs to be put to 
good use!
   2. The definition of a priority ceiling (the priority of the highest 
priority task that can access a shared resource/mutex) holds independent 
of partitioning (static binding) or global scheduling (dynamic 
binding).   The following issue still remains.  If there are m 
processors, consider m low-priority tasks sharing m mutexes to execute 
VERY long critical sections.  These mutexes are only shared with m (or 
fewer) other lower priority tasks.   If these tasks each grab a mutex on 
each processor and execute these long critical sections, higher priority 
tasks waiting to execute will be starved/delayed.  With the ceiling 
notion in place, these critical sections will be executing at a lower 
ceiling priority and can therefore be preempted.
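
The ceiling definition in point 2 can be written down as a small sketch. The task names, priorities and mutex usage below are invented for illustration, and a larger number is assumed to mean a higher priority.

```python
def priority_ceilings(task_prio, uses):
    """Ceiling of a mutex = highest priority among the tasks that use it.

    task_prio: task name -> priority (larger number = higher priority)
    uses:      task name -> list of mutexes that task may lock
    """
    ceilings = {}
    for task, mutexes in uses.items():
        for mutex in mutexes:
            current = ceilings.get(mutex, task_prio[task])
            ceilings[mutex] = max(current, task_prio[task])
    return ceilings


task_prio = {"hi": 30, "mid": 20, "lo": 10}
uses = {"hi": ["A"], "mid": ["A", "B"], "lo": ["B", "C"]}
ceilings = priority_ceilings(task_prio, uses)
```

Here mutex C is used only by the low-priority task, so its ceiling stays at 10 and a long critical section under C can still be preempted by "mid" and "hi".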

Combining the two comments above, I would suspect that in practice, 
tasks with tight timing constraints would be bound to specific 
processors/cores (they can be spread out so that they do not compete with 
each other, and hence each/many/most can get very good response times on 
their processors) and my prior comments would apply with processor 
affinities in place.   Tasks with less tight timing constraints and 
perhaps targeting other functions with their own shared mutexes will use 
ceiling execution for critical sections, without affecting the response 
times of the tighter real-time tasks.
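
As a small illustration of binding a task to one processor, the sketch below uses the Linux-only affinity calls exposed by Python's os module. The helper name pin_to_cpu and the choice of CPU are assumptions made for the example; a real application would choose the CPU from its own partitioning plan.

```python
import os


def pin_to_cpu(cpu):
    """Bind the calling process to one CPU; return the resulting CPU set."""
    os.sched_setaffinity(0, {cpu})      # pid 0 means the calling process
    return os.sched_getaffinity(0)


original = os.sched_getaffinity(0)      # CPUs we are currently allowed to use
target = min(original)                  # arbitrary choice among allowed CPUs
pinned = pin_to_cpu(target)
os.sched_setaffinity(0, original)       # restore the previous affinity
```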

Best,

---
Raj


[parent not found: <4A5F7254.3020809@ece.cmu.edu>]

* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
       [not found] <4A5F7254.3020809@ece.cmu.edu>
@ 2009-07-16 19:18 ` James H. Anderson
       [not found]   ` <4A5F806D.6040701@ece.cmu.edu>
  0 siblings, 1 reply; 10+ messages in thread
From: James H. Anderson @ 2009-07-16 19:18 UTC (permalink / raw)
  To: Raj Rajkumar
  Cc: Ted Baker, Noah Watkins, Peter Zijlstra, Raistlin,
	Douglas Niehaus, Henrik Austad, LKML, Ingo Molnar, Bill Huey,
	Linux RT <linux-rt-users@vger.kernel.org>, Fabio Checconi,
	Thomas Gleixner, Dhaval Giani, Tommaso Cucinotta, Giuseppe Lipari,
	Bjoern Brandenburg


Hi Raj,

On Thu, 16 Jul 2009, Raj Rajkumar wrote:

> non-preemptive critical section.    In addition, we could allow mutexes 
> to either pick basic priority inheritance (desirable for local mutexes?) 
> or the priority ceiling version (desirable for global mutexes shared 
> across processors/cores).

This discussion when I entered it was about using global scheduling
in Linux (not partitioning), so that's what I thought the focus of the
discussion was.  What's the definition of a local mutex in that case?
And how do you use ceilings under global scheduling?

Thanks.

-Jim


[parent not found: <4A5F806D.6040701@ece.cmu.edu>]

* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
       [not found]   ` <4A5F806D.6040701@ece.cmu.edu>
@ 2009-07-16 19:46     ` James H. Anderson
  2009-07-16 20:47       ` Raj Rajkumar
  0 siblings, 1 reply; 10+ messages in thread
From: James H. Anderson @ 2009-07-16 19:46 UTC (permalink / raw)
  To: Raj Rajkumar
  Cc: Ted Baker, Noah Watkins, Peter Zijlstra, Raistlin,
	Douglas Niehaus, Henrik Austad, LKML, Ingo Molnar, Bill Huey,
	Linux RT <linux-rt-users@vger.kernel.org>, Fabio Checconi,
	Thomas Gleixner, Dhaval Giani, Tommaso Cucinotta, Giuseppe Lipari,
	Bjoern Brandenburg, Karthik Singaram Lakshmanan, Dionisio de Niz

> It looks to me like Jim and Bjoern name the kernel-mutex locking scheme 
> (of non-preemption and FIFO queueing) as FMLP and advocate it for 
> user-level mutexes.  Jim: Please correct me if my interpretation is 
> incorrect.

I should have addressed this, sorry.

Actually, I don't advocate for anything.  :-)  As I said in my very
first email in this thread, in the LITMUS^RT project, changing Linux
is not one of our goals.  I leave that to other people who are way
smarter than me.

But to the point you raise, please note that the long version of the
FMLP is a little more than combining non-preemption with FIFO waiting
since waiting is via suspension.  And as I said in an earlier email,
we designed it for a real-time (only) environment.  However, I think
a user-level variant that could be used in a more general environment
would certainly be possible.
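
To illustrate just the "FIFO waiting via suspension" ingredient mentioned above, here is a toy model of a lock that queues blocked tasks in arrival order and hands the lock to the head of the queue on release. It is not the FMLP as published: it leaves out non-preemption, priorities and the short/long resource distinction, and the class name is invented for this sketch.

```python
from collections import deque


class FifoLock:
    """Toy lock model: blocked tasks suspend and are served in FIFO order."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()

    def acquire(self, task):
        """Return True if `task` got the lock, False if it must suspend."""
        if self.holder is None:
            self.holder = task
            return True
        self.waiters.append(task)       # suspended until handed the lock
        return False

    def release(self):
        """Hand the lock to the longest-waiting task, if any."""
        self.holder = self.waiters.popleft() if self.waiters else None
        return self.holder


lock = FifoLock()
got_first = lock.acquire("t1")
got_second = lock.acquire("t3")     # arrives before t2, so it is served first
got_third = lock.acquire("t2")
```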

-Jim

P.S. We didn't talk about the low processor utilization (Dhall effect)
mentioned in your last email.  However, that applies to hard real-time
workloads, not soft real-time workloads.  This discussion has been
touching on both.


* Re: [Fwd: Re: RFC for a new Scheduling policy/class in the Linux-kernel]
  2009-07-16 19:46     ` James H. Anderson
@ 2009-07-16 20:47       ` Raj Rajkumar
  0 siblings, 0 replies; 10+ messages in thread
From: Raj Rajkumar @ 2009-07-16 20:47 UTC (permalink / raw)
  To: James H. Anderson
  Cc: Ted Baker, Noah Watkins, Peter Zijlstra, Raistlin,
	Douglas Niehaus, Henrik Austad, LKML, Ingo Molnar, Bill Huey,
	Linux RT <linux-rt-users@vger.kernel.org>, Fabio Checconi,
	Thomas Gleixner, Dhaval Giani, Tommaso Cucinotta, Giuseppe Lipari,
	Bjoern Brandenburg, Karthik Singaram Lakshmanan, Dionisio de Niz

Jim:

Good discussion.   Thanks for taking the time to educate me on past 
exchanges.

 > We didn't talk about the low processor utilization (Dhall effect)
 > mentioned in your last email.  However, that applies to hard real-time
 > workloads, not soft real-time workloads.  This discussion has been
 > touching on both.

For hard real-time workloads, partitioning (static binding to specific 
processors) works well, with developer control over where tasks run and 
their contenders.  For soft real-time workloads, global scheduling 
(dynamic binding to available processors) should do well.   The 
situation is analogous to what we see in banks and airports.  There is a 
common global queue serviced by multiple counters for "soft" real-time 
customers, and for those (business or first-class/special) customers 
needing higher QoS,  separate queues are provided.  In the OS context, 
we need to ensure that the two queues/servers do not interfere.  
Ceilings would help even when the hard and soft real-time tasks use the 
same processors.

However, the question of dealing with mutexes shared by processes 
allocated to different processors remains.  As Ted has pointed out, 
avoiding them would be best!  In practice, moving them to a 
synchronization processor (as was pointed out by Peter? and also 
discussed in one of my earlier papers on synchronization on 
multi-processors) ought to be considered.  I think the first-order 
improvements are from

(1) ensuring that task waiting times are bounded as a function of 
critical sections only (i.e. avoiding the "unbounded" priority inversion 
problem) - this is accomplished by having critical sections execute at 
ceiling priorities (or higher) in the multiprocessor case,
(2) avoiding the wait from very long critical sections used only by 
lower-priority tasks - using priority ceilings instead of higher 
priority values for long critical sections mitigates this problem.
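
Points (1) and (2) can be illustrated with the classical uniprocessor priority-ceiling blocking bound: a task can be blocked at most by the longest critical section that belongs to a lower-priority task and whose mutex has a ceiling at or above the task's own priority. The sketch below assumes that single-processor rule, with a larger number meaning a higher priority; the multiprocessor case discussed in this thread needs more than this, and all names and numbers are made up.

```python
def blocking_bound(task, prio, sections):
    """Worst-case blocking of `task` under the uniprocessor ceiling rule.

    prio:     task name -> priority (larger number = higher priority)
    sections: list of (owner_task, mutex, critical_section_length)
    """
    ceiling = {}
    for owner, mutex, _ in sections:
        ceiling[mutex] = max(ceiling.get(mutex, prio[owner]), prio[owner])
    candidates = [
        length
        for owner, mutex, length in sections
        if prio[owner] < prio[task] and ceiling[mutex] >= prio[task]
    ]
    return max(candidates, default=0)


prio = {"hi": 30, "mid": 20, "lo": 10}
sections = [
    ("hi", "A", 2),
    ("lo", "A", 5),
    ("mid", "B", 3),
    ("lo", "B", 4),
    ("lo", "C", 500),   # very long, but used only by the low-priority task
]
```

The 500-unit section under mutex C never enters the bound for "hi" or "mid" because C's ceiling is only 10, which is the effect described in point (2).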

---
Raj

