Re: OSDL Bug 3770 - Nick Piggin

From: Nick Piggin <piggin@cyberone.com.au>
To: Loic Domaigne <loicWorks@gmx.net>
Cc: nptl@bullopensource.org,
	Linux Kernel Mailing List <Linux-Kernel@Vger.Kernel.ORG>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: OSDL Bug 3770
Date: Sat, 18 Dec 2004 20:43:38 +1100
Message-ID: <41C3FBCA.1020707@cyberone.com.au>
In-Reply-To: <41C3F4BB.2050102@gmx.net>

Loic Domaigne wrote:

> Hello Nick!
> Hello NPTL Mailing List!
>

Hello Loic! Thanks for the interesting mail.

I'm CCing lkml and Ingo on this, because I wouldn't feel comfortable
vetoing this myself.

lkml: We're discussing the fact that on SMP machines, our realtime
scheduling policies are per-CPU only. This caused a problem where a
high priority task on one CPU caused all lower priority tasks on that
CPU to be starved, while tasks on another CPU with the same low
priority were able to run.
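
To make the scenario concrete, here is a toy model of it. It is only a
sketch under stated assumptions, not the kernel's scheduler: the
function name, the tick-based loop and the task tuples are all made up
for illustration. Each CPU always runs the highest-priority task on its
own queue, tasks never migrate, and a task with work=None spins forever.

```python
def simulate_per_cpu(runqueues, ticks):
    """Toy model: strict priority per CPU, no migration between CPUs.

    runqueues maps cpu -> list of (name, priority, work) tuples, where
    work is a number of ticks, or None for a task that spins forever.
    Returns the names of tasks that finished, in completion order.
    """
    queues = {cpu: [list(t) for t in tasks] for cpu, tasks in runqueues.items()}
    finished = []
    for _ in range(ticks):
        for cpu in sorted(queues):
            tasks = queues[cpu]
            if not tasks:
                continue
            # Highest priority wins; max() keeps the first of equal
            # priorities, which gives FIFO order among ties.
            current = max(tasks, key=lambda t: t[1])
            if current[2] is None:
                continue  # spinning task keeps this CPU busy forever
            current[2] -= 1
            if current[2] == 0:
                tasks.remove(current)
                finished.append(current[0])
    return finished


rq = {
    0: [("driver", 10, None), ("worker-a", 5, 3)],
    1: [("worker-b", 5, 3)],
}
print(simulate_per_cpu(rq, ticks=10))
```

In this model worker-b finishes on CPU 1 while worker-a, at the same
priority, never runs because it shares CPU 0 with the spinning driver.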

>>> Ah, the problem is that the driver thread has a higher
>>> priority than the worker threads, so when the driver goes into an
>>> infinite loop waiting, the workers on its CPU are starved. Workers
>>> on other CPUs are able to schedule, however.
>>
>
> Although POSIX legally permits such an implementation of the realtime
> policies on SMP machines, this implementation is clearly *NOT*
> REASONABLE.
>

Well, I haven't done much in the realtime area... but nobody has
complained till now.

> The reason is extremely simple: the application *CANNOT* necessarily
> know that it is stuck behind a higher-priority thread (though it
> could have run on another CPU if the scheduler had decided otherwise).
> It is *NOT* feasible to program in a deterministic fashion in such a
> "realtime" environment.
>

You could use CPU binding. I'd argue that this may be nearly a
requirement for any realtime system of significant complexity on an
SMP system.
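
As a rough illustration of what CPU binding looks like from userspace,
here is a minimal sketch using Python's os.sched_setaffinity and
os.sched_getaffinity, which are available on Linux only. The helper
name pin_to_cpu is hypothetical, and a C program would use the
equivalent affinity calls instead.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict a task to a single CPU (pid=0 means the caller).

    Returns the affinity set that is in effect afterwards.
    """
    allowed = os.sched_getaffinity(pid)
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    first = min(os.sched_getaffinity(0))
    print("now bound to:", pin_to_cpu(first))
```

With every realtime thread bound explicitly, the application rather
than the scheduler decides which threads share a CPU, so it knows
which lower-priority threads a spinning thread can starve.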

*But* notice that it is not the case that the program in question ran
on UP and randomly failed on SMP; rather, it would not work on a
single processor AT ALL.

> [
> "Realtime" is put in quotes. I am speaking here of soft realtime,
> that is, an environment whose task scheduling follows a specific
> deterministic order. I am not speaking about hard realtime, which has
> additional timing constraints. Following that definition, we can say
> that Linux offers (soft) "realtime".
> ]
>
>
>>> The driver really needs to sleep, use a mutex, use a lower priority,
>>> or something in order for it to work.
>>
>
> NO! It is not the responsibility of the application to fix that
> behavior! We can in our case because 'we know', but some applications
> don't!!!
>

That's a bit hand-wavy ;) but I don't dismiss it out of hand because,
as I said, I'm not so familiar with this area. I would be interested
in an example of some application where this matters, and which
absolutely can't use any synchronisation primitives.
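
For reference, the "sleep or use a mutex" option quoted above could
look roughly like the following. This is a sketch only: the function
name and the squaring workload are invented for illustration, and
Python's threading module stands in for pthreads. The driver blocks on
a condition variable instead of spinning, so it gives up its CPU until
the workers are done.

```python
import threading


def run_driver_and_workers(n_workers):
    """Driver waits on a condition variable rather than busy-looping."""
    done = threading.Condition()
    remaining = n_workers
    results = []

    def worker(i):
        nonlocal remaining
        with done:
            results.append(i * i)   # stand-in for real work
            remaining -= 1
            done.notify()           # wake the sleeping driver

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()

    with done:
        # The driver sleeps here; it does not hold a CPU while waiting.
        done.wait_for(lambda: remaining == 0)

    for t in threads:
        t.join()
    return sorted(results)


print(run_driver_and_workers(4))
```

Because the driver is asleep while it waits, its priority relative to
the workers no longer decides whether they get to run.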

>
> The mistake made here is interesting. When you have a pool of servers,
> you can proceed in two ways to serve the clients:
>
>     (1) make a FIFO queue for each server. When a client arrives, it
>         chooses the queue that is the shortest.
>
>     (2) make a single FIFO queue for all servers. All clients are
>         queued, and when a server is done it takes the first client
>         waiting on that big queue.
>
> Queuing theory proves that (2) is better, for exactly the reason we
> have here. With (1), the guys in the queue might get stuck if the
> corresponding server is blocked by a client. With (2), when a server
> is blocked by a client, it doesn't prevent the other clients from
> being served by other servers.
>
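
The two models in the quoted text can be compared with a small
simulation. This is a sketch under simplifying assumptions that are
mine, not the original poster's: all clients arrive up front, there is
no time limit, and a service time of None marks a client that blocks
its server forever. The function names are hypothetical.

```python
def served_per_server_queues(service_times, n_servers):
    """Model (1): each arriving client joins the shortest queue."""
    queues = [[] for _ in range(n_servers)]
    for client, t in enumerate(service_times):
        min(queues, key=len).append((client, t))
    served = []
    for queue in queues:
        for client, t in queue:
            if t is None:
                break  # everyone behind this client is stuck
            served.append(client)
    return sorted(served)


def served_single_queue(service_times, n_servers):
    """Model (2): one shared FIFO; a free server takes the next client."""
    free_servers = n_servers
    served = []
    for client, t in enumerate(service_times):
        if free_servers == 0:
            break  # every server is blocked, the rest wait forever
        if t is None:
            free_servers -= 1  # this client ties up one server for good
        else:
            served.append(client)
    return served


times = [None, 2, 1, 3, 1, 2]  # client 0 never finishes
print(served_per_server_queues(times, n_servers=2))
print(served_single_queue(times, n_servers=2))
```

With two servers, model (1) strands the clients who happened to queue
behind client 0, while model (2) still serves everyone else. What the
simulation does not model is the cost of sharing one queue between all
servers.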

But that model is flawed for SMP scheduling. If it were that easy, we
might have a single queue for _all_ tasks.

The main problem is the cost of synchronisation and cacheline sharing.
A secondary problem is that of CPU affinities - moving a task to
another CPU nearly always has some non-zero cost in terms of cache
(and in case of NUMA, memory) efficiency.

Our global queue scheduler was basically crap for more than 4 CPUs. We
could give RT tasks a global queue with little impact on non-RT
workloads (in fact, I think early iterations of the 2.6 scheduler
trialed this)... but let's not cripple the RT apps that do the right
thing (and need scalability).

Another problem is that scheduling may not be O(1) anymore, if you
have CPU affinity bindings in place.
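
One way to see the O(1) concern is with a toy global FIFO in which each
task carries a set of CPUs it may run on. This is a sketch with a
hypothetical pick_next helper, not kernel code: a CPU looking for work
may have to walk past any number of tasks bound elsewhere before it
finds one it is allowed to run.

```python
def pick_next(global_queue, cpu):
    """Take the first task in the shared queue that may run on cpu.

    global_queue is a list of (name, allowed_cpus) pairs in FIFO order.
    Returns (name, entries_scanned), or (None, entries_scanned) if no
    task in the queue is allowed on this CPU.
    """
    for scanned, (name, allowed) in enumerate(global_queue, start=1):
        if cpu in allowed:
            del global_queue[scanned - 1]
            return name, scanned
    return None, len(global_queue)


queue = [("a", {0}), ("b", {0}), ("c", {1})]
print(pick_next(queue, cpu=1))
```

Here CPU 1 inspects every entry before finding its only eligible task,
so the cost of a pick grows with the number of tasks bound to other
CPUs.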

To summarise, I believe that if per-CPU RT queues are allowed within
POSIX, then we want to go with the sanest possible implementation, and
force any broken apps to fix themselves... let's not cave in now :)

Nick

> [
> A historical note: the USA implemented (2) in offices, supermarkets
> and such long before Europe, because in Europe customers were
> convinced that model (2) took more time, since the queue was longer.
> ]
>
>
