* Re: Scheduler ( was: Just a second ) ... [not found] <Pine.LNX.4.33.0112181508001.3410-100000@penguin.transmeta.com> @ 2001-12-20 3:50 ` Rik van Riel 2001-12-20 4:04 ` Ryan Cumming ` (3 more replies) 0 siblings, 4 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-20 3:50 UTC (permalink / raw) To: Linus Torvalds Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > The thing is, I'm personally very suspicious of the "features for that > exclusive 0.1%" mentality. Then why do we have sendfile(), or that idiotic sys_readahead() ? (is there _any_ use for sys_readahead() ? at all ?) cheers, Rik -- Shortwave goes a long way: irc.starchat.net #swl http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
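[Editor's note: for readers wondering what the sys_readahead() being questioned above actually does, here is a minimal user-space sketch. It assumes a modern glibc that exposes the readahead() wrapper — at the time of this thread the raw syscall had to be invoked via syscall(2) — and the function name prefetch_file is illustrative, not from the thread.]

```c
/* Sketch: ask the kernel to pre-populate the page cache for a file
 * before sequential reads, using readahead(2).  Assumes a glibc
 * new enough to provide the readahead() wrapper. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if the file could be opened (the readahead hint itself
 * is best-effort), -1 otherwise. */
int prefetch_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == 0)
        /* Hint: read the whole file into the page cache now. */
        readahead(fd, 0, st.st_size);

    close(fd);
    return 0;
}
```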
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel @ 2001-12-20 4:04 ` Ryan Cumming 2001-12-20 5:39 ` David S. Miller ` (2 subsequent siblings) 3 siblings, 0 replies; 87+ messages in thread From: Ryan Cumming @ 2001-12-20 4:04 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel, torvalds On December 19, 2001 19:50, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? Damn straight. sendfile(2) had an opportunity to be a real extension of the Unix philosophy. If it was called something like "copy" (to match "read" and "write"), and worked on all fds (even if it didn't do zerocopy, it should still just work), it'd fit in a lot more nicely than even BSD sockets. Alas, as it is, it's more of a wart than an extension. Now, sys_readahead() is pretty much the stupidest thing I've ever heard. If we had a copy(2) syscall, we could do the same thing by: copy(sourcefile, /dev/null, count). I don't think sys_readahead() even qualifies as a wart. -Ryan ^ permalink raw reply [flat|nested] 87+ messages in thread
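[Editor's note: the copy(2) syscall Ryan proposes never existed under that name. The following is a hypothetical user-space sketch of the semantics he describes — move count bytes from any source fd to any destination fd via a plain read/write loop; a real kernel implementation could take zero-copy fast paths where the fd types allow and fall back to something like this otherwise. The name copy_fd is invented for illustration.]

```c
/* Hypothetical sketch of Ryan's generic copy(): transfer "count"
 * bytes from any readable fd to any writable fd.  No such syscall
 * exists; this is the trivial user-space fallback semantics. */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns bytes copied (may be short on EOF), or -1 on error. */
ssize_t copy_fd(int src_fd, int dst_fd, size_t count)
{
    char buf[65536];
    size_t done = 0;

    while (done < count) {
        size_t want = count - done;
        if (want > sizeof(buf))
            want = sizeof(buf);

        ssize_t n = read(src_fd, buf, want);
        if (n == 0)
            break;              /* EOF on the source */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        /* Push everything we read, handling short writes. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(dst_fd, buf + off, n - off);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        done += n;
    }
    return (ssize_t)done;
}
```

Ryan's readahead trick would then be copy_fd(file_fd, devnull_fd, count): reading the file to /dev/null still pulls its pages into the page cache as a side effect.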
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel 2001-12-20 4:04 ` Ryan Cumming @ 2001-12-20 5:39 ` David S. Miller 2001-12-20 5:58 ` Linus Torvalds 2001-12-20 11:29 ` Rik van Riel 2001-12-20 5:52 ` Linus Torvalds 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell 3 siblings, 2 replies; 87+ messages in thread From: David S. Miller @ 2001-12-20 5:39 UTC (permalink / raw) To: riel; +Cc: torvalds, bcrl, alan, davidel, linux-kernel From: Rik van Riel <riel@conectiva.com.br> Date: Thu, 20 Dec 2001 01:50:36 -0200 (BRST) On Tue, 18 Dec 2001, Linus Torvalds wrote: > The thing is, I'm personally very suspicious of the "features for that > exclusive 0.1%" mentality. Then why do we have sendfile(), or that idiotic sys_readahead() ? Sending files over sockets is 99% of what most network servers are actually doing today; it is much more than 0.1% :-) ^ permalink raw reply [flat|nested] 87+ messages in thread
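[Editor's note: the pattern David refers to — a server pushing an on-disk file straight down a socket without bouncing it through user space — looks roughly like this. A sketch only: serve_file is an invented name, out_fd is assumed to be a connected socket (kernels since 2.6.33 accept any writable fd as the destination).]

```c
/* Sketch of the sendfile(2) fast path a network server uses: the
 * kernel copies data directly from the page cache to the socket,
 * with no user-space buffer in between. */
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Send the whole file at "path" down out_fd.  Returns total bytes
 * sent, or -1 on error. */
ssize_t serve_file(int out_fd, const char *path)
{
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    ssize_t total = -1;
    if (fstat(in_fd, &st) == 0) {
        off_t off = 0;
        total = 0;
        while (off < st.st_size) {
            /* The kernel advances "off" for us as data moves. */
            ssize_t n = sendfile(out_fd, in_fd, &off,
                                 st.st_size - off);
            if (n <= 0) {
                total = -1;
                break;
            }
            total += n;
        }
    }
    close(in_fd);
    return total;
}
```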
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 5:39 ` David S. Miller @ 2001-12-20 5:58 ` Linus Torvalds 2001-12-20 6:01 ` David S. Miller 2001-12-20 11:29 ` Rik van Riel 1 sibling, 1 reply; 87+ messages in thread From: Linus Torvalds @ 2001-12-20 5:58 UTC (permalink / raw) To: David S. Miller; +Cc: riel, bcrl, alan, davidel, linux-kernel On Wed, 19 Dec 2001, David S. Miller wrote: > > Then why do we have sendfile(), or that idiotic sys_readahead() ? > > Sending files over sockets are %99 of what most network servers are > actually doing today, it is much more than 0.1% :-) Well, that was true when the thing was written, but whether anybody _uses_ it any more, I don't know. Tux gets the same effect on its own, and I don't know if Apache defaults to using sendfile or not. readahead was just a personal 5-minute experiment, we can certainly remove that ;) Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 5:58 ` Linus Torvalds @ 2001-12-20 6:01 ` David S. Miller 2001-12-20 22:40 ` Troels Walsted Hansen 0 siblings, 1 reply; 87+ messages in thread From: David S. Miller @ 2001-12-20 6:01 UTC (permalink / raw) To: torvalds; +Cc: riel, bcrl, alan, davidel, linux-kernel From: Linus Torvalds <torvalds@transmeta.com> Date: Wed, 19 Dec 2001 21:58:41 -0800 (PST) Well, that was true when the thing was written, but whether anybody _uses_ it any more, I don't know. Tux gets the same effect on its own, and I don't know if Apache defaults to using sendfile or not. Samba uses it by default, that I know for sure :-) ^ permalink raw reply [flat|nested] 87+ messages in thread
* RE: Scheduler ( was: Just a second ) ... 2001-12-20 6:01 ` David S. Miller @ 2001-12-20 22:40 ` Troels Walsted Hansen 2001-12-20 23:55 ` Chris Ricker 0 siblings, 1 reply; 87+ messages in thread From: Troels Walsted Hansen @ 2001-12-20 22:40 UTC (permalink / raw) To: 'David S. Miller'; +Cc: linux-kernel >From: David S. Miller > From: Linus Torvalds <torvalds@transmeta.com> > Well, that was true when the thing was written, but whether anybody _uses_ > it any more, I don't know. Tux gets the same effect on its own, and I > don't know if Apache defaults to using sendfile or not. > >Samba uses it by default, that I know for sure :-) I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes the word "sendfile" in the source at least. :( Wonder why the sendfile patches were never merged... -- Troels Walsted Hansen ^ permalink raw reply [flat|nested] 87+ messages in thread
* RE: Scheduler ( was: Just a second ) ... 2001-12-20 22:40 ` Troels Walsted Hansen @ 2001-12-20 23:55 ` Chris Ricker 2001-12-20 23:59 ` CaT 2001-12-21 0:06 ` Davide Libenzi 0 siblings, 2 replies; 87+ messages in thread From: Chris Ricker @ 2001-12-20 23:55 UTC (permalink / raw) To: Troels Walsted Hansen; +Cc: 'David S. Miller', World Domination Now! On Thu, 20 Dec 2001, Troels Walsted Hansen wrote: > >From: David S. Miller > > From: Linus Torvalds <torvalds@transmeta.com> > > Well, that was true when the thing was written, but whether anybody > _uses_ > > it any more, I don't know. Tux gets the same effect on its own, and > I > > don't know if Apache defaults to using sendfile or not. > > > >Samba uses it by default, that I know for sure :-) > > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes > the word "sendfile" in the source at least. :( Wonder why the sendfile > patches where never merged... The only real-world source I've noticed actually using sendfile() are some of the better ftp daemons (such as vsftpd). later, chris -- Chris Ricker kaboom@gatech.edu This is a dare to the Bush administration. -- Thurston Moore ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 23:55 ` Chris Ricker @ 2001-12-20 23:59 ` CaT 2001-12-21 0:06 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: CaT @ 2001-12-20 23:59 UTC (permalink / raw) To: Chris Ricker Cc: Troels Walsted Hansen, 'David S. Miller', World Domination Now! On Thu, Dec 20, 2001 at 04:55:55PM -0700, Chris Ricker wrote: > > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes > > the word "sendfile" in the source at least. :( Wonder why the sendfile > > patches where never merged... > > The only real-world source I've noticed actually using sendfile() are some > of the better ftp daemons (such as vsftpd). proftpd uses it also. -- CaT - A high level of technology does not a civilisation make. ^ permalink raw reply [flat|nested] 87+ messages in thread
* RE: Scheduler ( was: Just a second ) ... 2001-12-20 23:55 ` Chris Ricker 2001-12-20 23:59 ` CaT @ 2001-12-21 0:06 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-21 0:06 UTC (permalink / raw) To: Chris Ricker Cc: Troels Walsted Hansen, 'David S. Miller', World Domination Now! On Thu, 20 Dec 2001, Chris Ricker wrote: > On Thu, 20 Dec 2001, Troels Walsted Hansen wrote: > > > >From: David S. Miller > > > From: Linus Torvalds <torvalds@transmeta.com> > > > Well, that was true when the thing was written, but whether anybody > > _uses_ > > > it any more, I don't know. Tux gets the same effect on its own, and > > I > > > don't know if Apache defaults to using sendfile or not. > > > > > >Samba uses it by default, that I know for sure :-) > > > > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes > > the word "sendfile" in the source at least. :( Wonder why the sendfile > > patches where never merged... > > The only real-world source I've noticed actually using sendfile() are some > of the better ftp daemons (such as vsftpd). And XMail :) - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 5:39 ` David S. Miller 2001-12-20 5:58 ` Linus Torvalds @ 2001-12-20 11:29 ` Rik van Riel 2001-12-20 11:34 ` David S. Miller 1 sibling, 1 reply; 87+ messages in thread From: Rik van Riel @ 2001-12-20 11:29 UTC (permalink / raw) To: David S. Miller; +Cc: torvalds, bcrl, alan, davidel, linux-kernel On Wed, 19 Dec 2001, David S. Miller wrote: > From: Rik van Riel <riel@conectiva.com.br> > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? > > Sending files over sockets are %99 of what most network servers are > actually doing today, it is much more than 0.1% :-) The same could be said for AIO, there are a _lot_ of server programs which are heavily overthreaded because of a lack of AIO... cheers, Rik -- Shortwave goes a long way: irc.starchat.net #swl http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
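[Editor's note: the "overthreading" Rik describes comes from parking one blocking thread per outstanding I/O. A sketch of what AIO buys instead, using the POSIX aio_* interface — which glibc itself emulates with threads (older glibc needs -lrt to link it); avoiding exactly that emulation was the point of Ben LaHaise's in-kernel AIO patches under discussion. read_async is an invented helper name.]

```c
/* Sketch: issue a read and keep doing other work instead of
 * blocking a whole thread in read(2).  POSIX AIO interface. */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns bytes read, or -1 on error. */
ssize_t read_async(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    ssize_t n = -1;
    if (aio_read(&cb) == 0) {
        /* A real server would service other clients here and pick
         * up the completion later; we just wait for it. */
        const struct aiocb *const list[1] = { &cb };
        while (aio_error(&cb) == EINPROGRESS)
            aio_suspend(list, 1, NULL);
        if (aio_error(&cb) == 0)
            n = aio_return(&cb);
    }
    close(fd);
    return n;
}
```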
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 11:29 ` Rik van Riel @ 2001-12-20 11:34 ` David S. Miller 0 siblings, 0 replies; 87+ messages in thread From: David S. Miller @ 2001-12-20 11:34 UTC (permalink / raw) To: riel; +Cc: torvalds, bcrl, alan, davidel, linux-kernel From: Rik van Riel <riel@conectiva.com.br> Date: Thu, 20 Dec 2001 09:29:28 -0200 (BRST) On Wed, 19 Dec 2001, David S. Miller wrote: > Sending files over sockets are %99 of what most network servers are > actually doing today, it is much more than 0.1% :-) The same could be said for AIO, there are a _lot_ of server programs which are heavily overthreaded because of a lack of AIO... If you read my most recent responses to Ingo's postings, you'll see that I'm starting to completely agree with you :-) ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel 2001-12-20 4:04 ` Ryan Cumming 2001-12-20 5:39 ` David S. Miller @ 2001-12-20 5:52 ` Linus Torvalds 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell 3 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-20 5:52 UTC (permalink / raw) To: Rik van Riel Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Thu, 20 Dec 2001, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? Hey, I expect others to do things in their tree, and I live by the same rules: I do my stuff openly in my tree. The Apache people actually seemed quite interested in sendfile. Of course, that was before apache seemed to stop worrying about trying to beat others at performance (rightly or wrongly - I think they are right from a pragmatic viewpoint, and wrong from a PR one). And hey, the same way I encourage others to experiment openly with their trees, I experiment with mine. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Scheduler, Can we save some juice ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel ` (2 preceding siblings ...) 2001-12-20 5:52 ` Linus Torvalds @ 2001-12-20 6:33 ` Timothy Covell 2001-12-20 6:50 ` Ryan Cumming 2001-12-20 6:52 ` Robert Love 3 siblings, 2 replies; 87+ messages in thread From: Timothy Covell @ 2001-12-20 6:33 UTC (permalink / raw) To: Rik van Riel, Linus Torvalds Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Wednesday 19 December 2001 21:50, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? > > (is there _any_ use for sys_readahead() ? at all ?) > > cheers, > > Rik OK, here's another 0.1% for you. Considering how Linux SMP doesn't have high CPU affinity, would it be possible to make a patch such that the additional CPUs remain in deep sleep/HALT mode until the first CPU hits a high-water mark of say 90% utilization? I've started doing this by hand with the (x)pulse application. My goal is to save electricity and cut down on excess heat when I'm just browsing the web and not compiling or seti@home'ing. -- timothy.covell@ashavan.org. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler, Can we save some juice ... 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell @ 2001-12-20 6:50 ` Ryan Cumming 2001-12-20 6:52 ` Robert Love 1 sibling, 0 replies; 87+ messages in thread From: Ryan Cumming @ 2001-12-20 6:50 UTC (permalink / raw) To: timothy.covell; +Cc: Kernel Mailing List On December 19, 2001 22:33, Timothy Covell wrote: > OK, here's another 0.1% for you. Considering how Linux SMP > doesn't have high CPU affinity, would it be possible to make a > patch such that the additional CPUs remain in deep sleep/HALT > mode until the first CPU hits a high-water mark of say 90% > utilization? I've started doing this by hand with the (x)pulse > application. My goal is to save electricity and cut down on > excess heat when I'm just browsing the web and not compiling > or seti@home'ing. I seriously doubt there would be a noticeable power consumption or heat difference between two CPUs running HLT half the time, and one CPU running HLT all the time. And I'm downright certain it isn't worth the code complexity even if there were; there is very little (read: no) intersection between the SMP and low-power user base. -Ryan ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler, Can we save some juice ... 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell 2001-12-20 6:50 ` Ryan Cumming @ 2001-12-20 6:52 ` Robert Love 2001-12-20 17:39 ` Timothy Covell 1 sibling, 1 reply; 87+ messages in thread From: Robert Love @ 2001-12-20 6:52 UTC (permalink / raw) To: timothy.covell Cc: Rik van Riel, Linus Torvalds, Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Thu, 2001-12-20 at 01:33, Timothy Covell wrote: > OK, here's another 0.1% for you. Considering how Linux SMP > doesn't have high CPU affinity, would it be possible to make a > patch such that the additional CPUs remain in deep sleep/HALT > mode until the first CPU hits a high-water mark of say 90% > utilization? I've started doing this by hand with the (x)pulse > application. My goal is to save electricity and cut down on > excess heat when I'm just browsing the web and not compiling > or seti@home'ing. You'd probably be better off working against load and not CPU usage, since a single app can hit you at 100% CPU. Load average is the sort of metric you want, since if there is more than 1 task waiting to run on average, you will benefit from multiple CPUs. That said, this would be easy to do in user space using the hotplug CPU patch. Monitor load average (just like any X applet does) and when it crosses over the threshold: "echo 1 > /proc/sys/cpu/2/online" Another solution would be to use CPU affinity to lock init (and thus all tasks) to 0x00000001 or whatever and then start allowing 0x00000002 or whatever when load gets too high. My point: it is awful easy in user space. Robert Love ^ permalink raw reply [flat|nested] 87+ messages in thread
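[Editor's note: Robert's user-space approach can be sketched as a small daemon loop. Note that the /proc/sys/cpu/N/online path comes from the hotplug CPU patch he mentions and is not in a stock kernel of this era (mainline later settled on /sys/devices/system/cpu/cpuN/online); the helper names below are invented for illustration.]

```c
/* Sketch of Robert's suggestion: watch the load average and bring
 * a second CPU online when it crosses a threshold.  The online
 * file path is from the (out-of-tree) hotplug CPU patch. */
#include <stdio.h>

/* Parse the 1-minute load average from a /proc/loadavg line. */
double parse_load1(const char *line)
{
    double load1 = 0.0;
    sscanf(line, "%lf", &load1);
    return load1;
}

/* Write 0 or 1 to a hotplug "online" control file. */
int set_cpu_online(const char *path, int online)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    fprintf(f, "%d\n", online);
    fclose(f);
    return 0;
}

/* One pass of the monitor: more than ~1 runnable task on average
 * means a second CPU would actually be used. */
int balance_once(double threshold)
{
    char line[128];
    FILE *f = fopen("/proc/loadavg", "r");
    if (!f || !fgets(line, sizeof(line), f)) {
        if (f)
            fclose(f);
        return -1;
    }
    fclose(f);

    int want_online = parse_load1(line) > threshold;
    return set_cpu_online("/proc/sys/cpu/2/online", want_online);
}
```

Run in a loop with a sleep between passes, this is the load-average version of Robert's "echo 1 > /proc/sys/cpu/2/online" one-liner.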
* Re: Scheduler, Can we save some juice ... 2001-12-20 6:52 ` Robert Love @ 2001-12-20 17:39 ` Timothy Covell 0 siblings, 0 replies; 87+ messages in thread From: Timothy Covell @ 2001-12-20 17:39 UTC (permalink / raw) To: Robert Love; +Cc: linux-kernel On Thursday 20 December 2001 00:52, Robert Love wrote: > On Thu, 2001-12-20 at 01:33, Timothy Covell wrote: > > OK, here's another 0.1% for you. Considering how Linux SMP > > doesn't have high CPU affinity, would it be possible to make a > > patch such that the additional CPUs remain in deep sleep/HALT > > mode until the first CPU hits a high-water mark of say 90% > > utilization? I've started doing this by hand with the (x)pulse > > application. My goal is to save electricity and cut down on > > excess heat when I'm just browsing the web and not compiling > > or seti@home'ing. > > You'd probably be better off working against load and not CPU usage, > since a single app can hit you at 100% CPU. Load average is the sort of > metric you want, since if there is more than 1 task waiting to run on > average, you will benefit from multiple CPUs. > > That said, this would be easy to do in user space using the hotplug CPU > patch. Monitor load average (just like any X applet does) and when it > crosses over the threshold: "echo 1 > /proc/sys/cpu/2/online" > > Another solution would be to use CPU affinity to lock init (and thus all > tasks) to 0x00000001 or whatever and then start allowing 0x00000002 or > whatever when load gets too high. > > My point: it is awful easy in user space. > > Robert Love > You make good points. I'll try the hotplug CPU patch to automate things more than with my simple use of Xpulse, (whose code I could have used if I wanted to get off my butt and write a useful C application.) -- timothy.covell@ashavan.org. ^ permalink raw reply [flat|nested] 87+ messages in thread
[parent not found: <20011218020456.A11541@redhat.com>]
* Re: Scheduler ( was: Just a second ) ... [not found] <20011218020456.A11541@redhat.com> @ 2001-12-18 16:50 ` Linus Torvalds 2001-12-18 16:56 ` Rik van Riel ` (2 more replies) 0 siblings, 3 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 16:50 UTC (permalink / raw) To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Benjamin LaHaise wrote: > On Mon, Dec 17, 2001 at 10:10:30PM -0800, Linus Torvalds wrote: > > > Well, we've got serious chicken and egg problems then. > > > > Why? > > The code can't go into glibc without syscall numbers being reserved. It sure as hell can. And I'll bet $5 USD that glibc wouldn't take the patches anyway before the kernel interfaces are _tested_. > I've posted the code, there are people playing with it. I can't make them > comment. Well, if people aren't interested, then it doesn't _ever_ go in. Remember: we do not add features just because we can. Quite frankly, I don't think you've told that many people. I haven't seen any discussion about the aio stuff on linux-kernel, which may be because you posted several announcements and nobody cared, or it may be that you've only mentioned it fleetingly and people didn't notice. Take a look at how long it took for ext3 to be "standard" - I put them in my tree when I started getting real feedback that it was used and people liked using it. I simply do not like applying patches "just to get users". Not even reservations - because I reserve the right to _never_ apply something if critical review ends up saying that "that doesn't make sense". Quite frankly, the fact that it is being tested out at places like Oracle etc is secondary - those people will use anything. That's proven by history. That doesn't mean that _I_ accept anything. 
Now, the fact that I like the interfaces is actually secondary - it does make me much more likely to include it even in a half-baked thing, but it does NOT mean that I trust my own taste so much that I'd do it "under the covers" with little open discussion, use and modification. Where _is_ the discussion on linux-kernel? Where are the negative comments from Al? (Al _always_ has negative comments and suggestions for improvements, don't try to say that he also liked it unconditionally ;) Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds @ 2001-12-18 16:56 ` Rik van Riel 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 17:55 ` Davide Libenzi 2001-12-18 19:43 ` Alexander Viro 2 siblings, 1 reply; 87+ messages in thread From: Rik van Riel @ 2001-12-18 16:56 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > Where _is_ the discussion on linux-kernel? Which mailing lists do you want to be subscribed to ? ;) Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:56 ` Rik van Riel @ 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 19:04 ` Alan Cox ` (2 more replies) 0 siblings, 3 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:18 UTC (permalink / raw) To: Rik van Riel; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > > Where _is_ the discussion on linux-kernel? > > Which mailing lists do you want to be subscribed to ? ;) I'm not subscribed to any, thank you very much. I read them through a news gateway, which gives me access to the common ones. And if the discussion wasn't on the common ones, then it wasn't an open discussion. And no, I don't think IRC counts either, sorry. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:18 ` Linus Torvalds @ 2001-12-18 19:04 ` Alan Cox 2001-12-18 21:02 ` Larry McVoy 2001-12-19 16:50 ` Daniel Phillips 2001-12-18 19:11 ` Mike Galbraith 2001-12-18 19:15 ` Rik van Riel 2 siblings, 2 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 19:04 UTC (permalink / raw) To: Linus Torvalds Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List > I'm not subscribed to any, thank you very much. I read them through a news > gateway, which gives me access to the common ones. > > And if the discussion wasn't on the common ones, then it wasn't an open > discussion. If the discussion was on the l/k list then most kernel developers aren't going to read it, because they don't have time to wade through all the crap that doesn't matter to them. > And no, I don't think IRC counts either, sorry. IRC is where most stuff, especially cross-vendor stuff, is initially discussed nowadays, along with kernelnewbies where most of the intro stuff is - but that's discussed rather than formally proposed and studied ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 19:04 ` Alan Cox @ 2001-12-18 21:02 ` Larry McVoy 2001-12-18 21:14 ` David S. Miller 2001-12-18 21:18 ` Rik van Riel 2001-12-19 16:50 ` Daniel Phillips 1 sibling, 2 replies; 87+ messages in thread From: Larry McVoy @ 2001-12-18 21:02 UTC (permalink / raw) To: Alan Cox Cc: Linus Torvalds, Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List Maybe I'm an old stick in the mud, but IRC seems like a big waste of time to me. It's perfect for off the cuff answers and fairly useless for thoughtful answers. We used to write well thought out papers and specifications for OS work. These days if you can't do it in a paragraph on IRC it must not be worth doing, eh? On Tue, Dec 18, 2001 at 07:04:59PM +0000, Alan Cox wrote: > > I'm not subscribed to any, thank you very much. I read them through a news > > gateway, which gives me access to the common ones. > > > > And if the discussion wasn't on the common ones, then it wasn't an open > > discussion. > > If the discussion was on the l/k list then most kernel developers arent > going to read it because tey dont have time to wade through all the crap > that doesnt matter to them. > > > And no, I don't think IRC counts either, sorry. > > IRC is where most stuff, especially cross vendor stuff is initially > discussed nowdays, along with kernelnewbies where most of the intro > stuff is - but thats disussed rather than formally proposed and studied > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:02 ` Larry McVoy @ 2001-12-18 21:14 ` David S. Miller 2001-12-18 21:17 ` Larry McVoy 2001-12-18 21:18 ` Rik van Riel 1 sibling, 1 reply; 87+ messages in thread From: David S. Miller @ 2001-12-18 21:14 UTC (permalink / raw) To: lm; +Cc: alan, torvalds, riel, bcrl, davidel, linux-kernel From: Larry McVoy <lm@bitmover.com> Date: Tue, 18 Dec 2001 13:02:28 -0800 Maybe I'm an old stick in the mud, but IRC seems like a big waste of time to me. It's like being at a Linux conference all the time. :-) It does kind of make sense given that people are so scattered across the planet. Sometimes I want to just grill someone on something, and email would be too much back and forth, IRC is one way to accomplish that. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:14 ` David S. Miller @ 2001-12-18 21:17 ` Larry McVoy 2001-12-18 21:19 ` Rik van Riel 2001-12-18 21:30 ` David S. Miller 0 siblings, 2 replies; 87+ messages in thread From: Larry McVoy @ 2001-12-18 21:17 UTC (permalink / raw) To: David S. Miller; +Cc: lm, alan, torvalds, riel, bcrl, davidel, linux-kernel On Tue, Dec 18, 2001 at 01:14:20PM -0800, David S. Miller wrote: > From: Larry McVoy <lm@bitmover.com> > Date: Tue, 18 Dec 2001 13:02:28 -0800 > > Maybe I'm an old stick in the mud, but IRC seems like a big waste of > time to me. > > It's like being at a Linux conference all the time. :-) > > It does kind of make sense given that people are so scattered across > the planet. Sometimes I want to just grill someone on something, and > email would be too much back and forth, IRC is one way to accomplish > that. Let me introduce you to this neat invention called a telephone. It's the black thing next to your desk, it rings, has buttons. If you push the right buttons, well, it's magic... :-) -- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:17 ` Larry McVoy @ 2001-12-18 21:19 ` Rik van Riel 2001-12-18 21:30 ` David S. Miller 1 sibling, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 21:19 UTC (permalink / raw) To: Larry McVoy; +Cc: David S. Miller, alan, torvalds, bcrl, davidel, linux-kernel On Tue, 18 Dec 2001, Larry McVoy wrote: > On Tue, Dec 18, 2001 at 01:14:20PM -0800, David S. Miller wrote: > > From: Larry McVoy <lm@bitmover.com> > > Date: Tue, 18 Dec 2001 13:02:28 -0800 > > > > Maybe I'm an old stick in the mud, but IRC seems like a big waste of > > time to me. > > > > It's like being at a Linux conference all the time. :-) > > Let me introduce you to this neat invention called a telephone. It's > the black thing next to your desk, it rings, has buttons. If you push > the right buttons, well, it's magic... Yeah, but you can't scroll up a page on the phone... (also, talking with multiple people at the same time is kind of annoying in audio, while it's ok on irc) Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:17 ` Larry McVoy 2001-12-18 21:19 ` Rik van Riel @ 2001-12-18 21:30 ` David S. Miller 1 sibling, 0 replies; 87+ messages in thread From: David S. Miller @ 2001-12-18 21:30 UTC (permalink / raw) To: lm; +Cc: alan, torvalds, riel, bcrl, davidel, linux-kernel From: Larry McVoy <lm@bitmover.com> Date: Tue, 18 Dec 2001 13:17:13 -0800 Let me introduce you to this neat invention called a telephone. It's the black thing next to your desk, it rings, has buttons. If you push the right buttons, well, it's magic... I'm not calling Holland every time I want to poke Jens about something in a patch we're working on :-) I hate telephones for technical stuff, because people can call the fucking thing when I am not behind my computer or even worse when I AM behind my computer and I want to concentrate on the code on my screen without being disturbed. With IRC it is MY CHOICE to get involved in the discussion, I can choose to respond or not respond to someone, I can choose to be available or not available at any given time. It's just a real-time version of email. And the "passive, I can ignore you" part is what I like about it. Telephones frankly suck for discussing technical topics. I can't cut and paste pieces of code from my other editor buffer to show you over the phone, as another example as to why. A lot of people like to use telephones specifically because it does not give the other party the option of ignoring you once they pick up the phone. I value the ability to make the choice to ignore people because a lot of ideas I don't give a crap about come under my nose. In fact that may be one of the best parts about Linux development compared to doing stuff at a company, one isn't required to listen to someone's idea or to even read it. If today I don't give a crap about Joe's filesystem idea, hey guess what I'm not going to read any of his emails about the thing. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:02 ` Larry McVoy 2001-12-18 21:14 ` David S. Miller @ 2001-12-18 21:18 ` Rik van Riel 1 sibling, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 21:18 UTC (permalink / raw) To: Larry McVoy Cc: Alan Cox, Linus Torvalds, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Larry McVoy wrote: > Maybe I'm an old stick in the mud, but IRC seems like a big waste of > time to me. It's perfect for off the cuff answers and fairly useless > for thoughtful answers. We used to write well thought out papers and > specifications for OS work. These days if you can't do it in a > paragraph on IRC it must not be worth doing, eh? Actually, we tend to use multiple media at the same time. It happens very often that because of some discussion on IRC we end up writing up a few paragraphs and sending it to people by email. For other things, email is clearly too slow, so stuff is done on IRC (eg. walking somebody through a piece of code to identify and agree on a bug). cheers, Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 19:04 ` Alan Cox 2001-12-18 21:02 ` Larry McVoy @ 2001-12-19 16:50 ` Daniel Phillips 1 sibling, 0 replies; 87+ messages in thread From: Daniel Phillips @ 2001-12-19 16:50 UTC (permalink / raw) To: Alan Cox, Linus Torvalds Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On December 18, 2001 08:04 pm, Alan Cox wrote: > > I'm not subscribed to any, thank you very much. I read them through a news > > gateway, which gives me access to the common ones. > > > > And if the discussion wasn't on the common ones, then it wasn't an open > > discussion. > > If the discussion was on the l/k list then most kernel developers arent > going to read it because tey dont have time to wade through all the crap > that doesnt matter to them. Hi Alan, It's AIO we're talking about, right? AIO is interesting to quite a few people. I'd read the thread. I'd also read any background material that Ben would be so kind as to supply. -- Daniel ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 19:04 ` Alan Cox @ 2001-12-18 19:11 ` Mike Galbraith 2001-12-18 19:15 ` Rik van Riel 2 siblings, 0 replies; 87+ messages in thread From: Mike Galbraith @ 2001-12-18 19:11 UTC (permalink / raw) To: Linus Torvalds Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > And no, I don't think IRC counts either, sorry. Well yeah.. it's synchronous IO :) -Mike ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 19:04 ` Alan Cox 2001-12-18 19:11 ` Mike Galbraith @ 2001-12-18 19:15 ` Rik van Riel 2 siblings, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 19:15 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > And no, I don't think IRC counts either, sorry. Whether you think it counts or not, IRC is where most stuff is happening nowadays. cheers, Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds 2001-12-18 16:56 ` Rik van Riel @ 2001-12-18 17:55 ` Davide Libenzi 2001-12-18 19:43 ` Alexander Viro 2 siblings, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 17:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > Quite frankly, I don't think you've told that many people. I haven't seen > any discussion about the aio stuff on linux-kernel, which may be because > you posted several announcements and nobody cared, or it may be that > you've only mentioned it fleetingly and people didn't notice. This is not to ask for the inclusion of /dev/epoll in the kernel ( it can be easily merged by users that want to use it ), but I've found its users prefer talking about it off the mailing list. Maybe because they're scared of being eaten by some gurus when asking easy questions :) - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds 2001-12-18 16:56 ` Rik van Riel 2001-12-18 17:55 ` Davide Libenzi @ 2001-12-18 19:43 ` Alexander Viro 2 siblings, 0 replies; 87+ messages in thread From: Alexander Viro @ 2001-12-18 19:43 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > Where are the negative comments from Al? (Al _always_ has negative > comments and suggestions for improvements, don't try to say that he also > liked it unconditionally ;) Heh. Aside from a _big_ problem with exposing an async API to userland (for a lot of reasons, including the usual quality of async code in general and event-drivel one in particular) there is a more specific one - Ben's long-promised full-async writepage() and friends. I'll believe it when I see it and so far it hasn't appeared. So for the time being I'm staying the fsck out of that - I don't like it, but I'm sick and tired of this sort of religious war. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... @ 2001-12-18 5:59 V Ganesh 0 siblings, 0 replies; 87+ messages in thread From: V Ganesh @ 2001-12-18 5:59 UTC (permalink / raw) To: linux-kernel; +Cc: wli In article <20011217205547.C821@holomorphy.com> you wrote: : On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: :> The most likely cause is simply waking up after each sound interrupt: you :> also have a _lot_ of time handling interrupts. Quite frankly, web surfing :> and mp3 playing simply shouldn't use any noticeable amounts of CPU. : I think we have a winner: : /proc/interrupts : ------------------------------------------------ : CPU0 : 0: 17321824 XT-PIC timer : 1: 4 XT-PIC keyboard : 2: 0 XT-PIC cascade : 5: 46490271 XT-PIC soundblaster : 9: 400232 XT-PIC usb-ohci, eth0, eth1 : 11: 939150 XT-PIC aic7xxx, aic7xxx : 14: 13 XT-PIC ide0 : Approximately 4 times more often than the timer interrupt. : That's not nice... a bit offtopic, but the reason why there are so many interrupts is that there's probably something like esd running. I've observed that idle esd manages to generate tons of interrupts, although an strace of esd reveals it stuck in a select(). probably one of the ioctls it issued earlier is causing the driver to continuously read/write to the device. the interrupts stop as soon as you kill esd. : SoundBlaster 16 : A change of hardware should help verify this. it happens even with cs4232 (redhat 7.2, 2.4.7-10smp), so I doubt it's a soundblaster issue. ganesh ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ...
@ 2001-12-18 5:11 Thierry Forveille
2001-12-17 21:41 ` John Heil
2001-12-18 14:31 ` Alan Cox
0 siblings, 2 replies; 87+ messages in thread
From: Thierry Forveille @ 2001-12-18 5:11 UTC (permalink / raw)
To: linux-kernel
Linus Torvalds (torvalds@transmeta.com) writes
> On Mon, 17 Dec 2001, Rik van Riel wrote:
> >
> > Try readprofile some day, chances are schedule() is pretty
> > near the top of the list.
>
> Ehh.. Of course I do readprofile.
>
> But did you ever compare readprofile output to _total_ cycles spent?
>
I have a feeling that this discussion got sidetracked: CPU cycles burnt
in the scheduler are indeed a non-issue, but big tasks being needlessly
moved around on SMPs are worth tackling.
^ permalink raw reply [flat|nested] 87+ messages in thread* Re: Scheduler ( was: Just a second ) ... 2001-12-18 5:11 Thierry Forveille @ 2001-12-17 21:41 ` John Heil 2001-12-18 14:31 ` Alan Cox 1 sibling, 0 replies; 87+ messages in thread From: John Heil @ 2001-12-17 21:41 UTC (permalink / raw) To: Thierry Forveille; +Cc: linux-kernel On Mon, 17 Dec 2001, Thierry Forveille wrote: > Date: Mon, 17 Dec 2001 19:11:10 -1000 (HST) > From: Thierry Forveille <forveill@cfht.hawaii.edu> > To: linux-kernel@vger.kernel.org > Subject: Re: Scheduler ( was: Just a second ) ... > > Linus Torvalds (torvalds@transmeta.com) writes > > On Mon, 17 Dec 2001, Rik van Riel wrote: > > > > > > Try readprofile some day, chances are schedule() is pretty > > > near the top of the list. > > > > Ehh.. Of course I do readprofile. > > > > But did you ever compare readprofile output to _total_ cycles spent? > > > I have a feeling that this discussion got sidetracked: cpu cycles burnt > in the scheduler indeed is non-issue, but big tasks being needlessly moved > around on SMPs is worth tackling. Given a cpu affinity facility, policy mgmt would belong in user space. CPU affinity would be pretty simple and I think the effort is already in flight IIRC. Johnh - ----------------------------------------------------------------- John Heil South Coast Software Custom systems software for UNIX and IBM MVS mainframes 1-714-774-6952 johnhscs@sc-software.com http://www.sc-software.com ----------------------------------------------------------------- ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 5:11 Thierry Forveille 2001-12-17 21:41 ` John Heil @ 2001-12-18 14:31 ` Alan Cox 1 sibling, 0 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 14:31 UTC (permalink / raw) To: Thierry Forveille; +Cc: linux-kernel > I have a feeling that this discussion got sidetracked: cpu cycles burnt > in the scheduler indeed is non-issue, but big tasks being needlessly moved > around on SMPs is worth tackling. It's not a non-issue - 40% of an 8-way box is a lot of lost CPU. Fixing the CPU bounce-around problem also matters a lot - Ingo's speedups seen just by improving that on the current scheduler show it's worth the work. ^ permalink raw reply [flat|nested] 87+ messages in thread
[parent not found: <20011217200946.D753@holomorphy.com>]
* Re: Scheduler ( was: Just a second ) ... [not found] <20011217200946.D753@holomorphy.com> @ 2001-12-18 4:27 ` Linus Torvalds 2001-12-18 4:55 ` William Lee Irwin III 2001-12-18 18:13 ` Davide Libenzi 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 4:27 UTC (permalink / raw) To: William Lee Irwin III; +Cc: Kernel Mailing List [ cc'd back to Linux kernel, in case somebody wants to take a look whether there is something wrong in the sound drivers, for example ] On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > This is no benchmark. This is my home machine it's taking a bite out of. > I'm trying to websurf and play mp3's and read email here. No forkbombs. > No databases. No made-up benchmarks. I don't know what it's doing (or > trying to do) in there but I'd like the CPU cycles back. > > From a recent /proc/profile dump on 2.4.17-pre1 (no patches), my top 5 > (excluding default_idle) are: > -------------------------------------------------------- > 22420 total 0.0168 > 4624 default_idle 96.3333 > 1280 schedule 0.6202 > 1130 handle_IRQ_event 11.7708 > 929 file_read_actor 9.6771 > 843 fast_clear_page 7.5268 The most likely cause is simply waking up after each sound interrupt: you also have a _lot_ of time handling interrupts. Quite frankly, web surfing and mp3 playing simply shouldn't use any noticeable amounts of CPU. The point being that I really doubt it's the scheduler proper, it's probably how it is _used_. And I'd suspect your sound driver (or user) conspires to keep scheduling stuff. For example (and this is _purely_ an example, I don't know if this is your particular case), this sounds like a classic case of "bad buffering". What bad buffering would do is: - you have a sound buffer that the mp3 player tries to keep full - your sound buffer is, let's pick a random number, 64 entries of 1024 bytes each. - the sound card gives an interrupt every time it has emptied a buffer. 
- the mp3 player is waiting on "free space" - we wake up the mp3 player for _every_ sound fragment filled. Do you see what this leads to? We schedule the mp3 task (which gets a high priority because it tends to run for a really short time, filling just 1 small buffer each time) _every_ time a single buffer empties. Even though we have 63 other full buffers. The classic fix for these kinds of things is _not_ to make the scheduler faster. Sure, that would help, but that's not really the problem. The _real_ fix is to use water-marks, and make the sound driver wake up the writing process only when (say) half the buffers have emptied. Now the mp3 player can fill 32 of the buffers at a time, and gets scheduled an order of magnitude less. It doesn't end up waking up every time. Which sound driver are you using, just in case this _is_ the reason? Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
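The water-mark fix Linus describes can be sketched in a few lines of C. This is purely an illustration of the idea with made-up names (NFRAGS, should_wake_writer) — it is not the actual dmabuf.c code, which is more involved:

```c
/* Illustrative sketch of the water-mark idea described above: with 64
 * fragments, wake the writing process only once half of them have
 * drained, instead of on every single DMA-complete interrupt.
 * NFRAGS and should_wake_writer() are hypothetical names, not taken
 * from the real driver. */
#define NFRAGS 64

/* Called from the DMA-complete interrupt path with the current number
 * of empty fragments; returns nonzero when the writer should run. */
int should_wake_writer(int free_frags)
{
    return free_frags >= NFRAGS / 2;
}
```

With per-fragment wakeups the player gets scheduled 64 times per buffer cycle; with the water-mark it runs roughly twice, filling about 32 fragments each time.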
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:27 ` Linus Torvalds @ 2001-12-18 4:55 ` William Lee Irwin III 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 14:21 ` Adam Schrotenboer 2001-12-18 18:13 ` Davide Libenzi 1 sibling, 2 replies; 87+ messages in thread From: William Lee Irwin III @ 2001-12-18 4:55 UTC (permalink / raw) To: Kernel Mailing List; +Cc: torvalds On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > The most likely cause is simply waking up after each sound interrupt: you > also have a _lot_ of time handling interrupts. Quite frankly, web surfing > and mp3 playing simply shouldn't use any noticeable amounts of CPU. I think we have a winner: /proc/interrupts ------------------------------------------------ CPU0 0: 17321824 XT-PIC timer 1: 4 XT-PIC keyboard 2: 0 XT-PIC cascade 5: 46490271 XT-PIC soundblaster 9: 400232 XT-PIC usb-ohci, eth0, eth1 11: 939150 XT-PIC aic7xxx, aic7xxx 14: 13 XT-PIC ide0 Approximately 4 times more often than the timer interrupt. That's not nice... On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > Which sound driver are you using, just in case this _is_ the reason? SoundBlaster 16 A change of hardware should help verify this. Cheers, Bill ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:55 ` William Lee Irwin III @ 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 6:34 ` Jeff Garzik ` (6 more replies) 2001-12-18 14:21 ` Adam Schrotenboer 1 sibling, 7 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 6:09 UTC (permalink / raw) To: William Lee Irwin III; +Cc: Kernel Mailing List, Jeff Garzik On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > 5: 46490271 XT-PIC soundblaster > > Approximately 4 times more often than the timer interrupt. > That's not nice... Yeah. Well, looking at the issue, the problem is probably not just in the sb driver: the soundblaster driver shares the output buffer code with a number of other drivers (there's some horrible "dmabuf.c" code in common). And yes, the dmabuf code will wake up the writer on every single DMA-complete interrupt. Considering that you seem to have them at least 400 times a second (and probably more, unless you've literally had sound going since the machine was booted), I think we know why your setup spends time in the scheduler. > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > Which sound driver are you using, just in case this _is_ the reason? > > SoundBlaster 16 > A change of hardware should help verify this. A number of sound drivers will use the same logic. You may be able to change this more easily some other way, by using a larger fragment size for example. That's up to the sw that actually feeds the sound stream, so it might be your decoder that selects a small fragment size. Quite frankly I don't know the sound infrastructure well enough to make any more intelligent suggestions about other decoders or similar to try, at this point I just start blathering. But yes, I bet you'll also see much less impact of this if you were to switch to more modern hardware. grep grep grep.. Oh, before you do that, how about changing "min_fragment" in sb_audio.c from 5 to something bigger like 9 or 10? 
That audio_devs[devc->dev]->min_fragment = 5; literally means that your minimum fragment size seems to be a rather pathetic 32 bytes (which doesn't mean that your sound will be set to that, but it _might_ be). That sounds totally ridiculous, but maybe I've misunderstood the code. Jeff, you've worked on the sb code at some point - does it really do 32-byte sound fragments? Why? That sounds truly insane if I really parsed that code correctly. That's thousands of separate DMA transfers and interrupts per second.. Raising that min_fragment thing from 5 to 10 would make the minimum DMA buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties in less than 1/100th of a second, but at least it should be < 200 irqs/sec rather than >400). Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
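The arithmetic behind these numbers is easy to check: min_fragment is a log2 value, so 5 means 2^5 = 32-byte fragments and 10 means 1 kB, and one DMA-complete interrupt fires per fragment. A small sketch of the calculation at CD-quality stereo (44100 Hz * 2 channels * 2 bytes = 176400 bytes/sec); the helper names are illustrative, not from the driver:

```c
/* min_fragment is the log2 of the fragment size in bytes:
 * 5 -> 32 bytes, 10 -> 1024 bytes.  One DMA-complete interrupt fires
 * per fragment, so the interrupt rate is the audio byte rate divided
 * by the fragment size. */
long frag_bytes(int min_fragment)
{
    return 1L << min_fragment;
}

long irqs_per_sec(long byte_rate, long frag)
{
    return byte_rate / frag;
}
```

At 176400 bytes/sec, 32-byte fragments mean roughly 5512 interrupts/sec, while 1 kB fragments give roughly 172 - consistent with the "< 200 irqs/sec rather than >400" figure quoted here.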
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds @ 2001-12-18 6:34 ` Jeff Garzik 2001-12-18 12:23 ` Rik van Riel ` (5 subsequent siblings) 6 siblings, 0 replies; 87+ messages in thread From: Jeff Garzik @ 2001-12-18 6:34 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List Linus Torvalds wrote: > Jeff, you've worked on the sb code at some point - does it really do > 32-byte sound fragments? Why? That sounds truly insane if I really parsed > that code correctly. That's thousands of separate DMA transfers > and interrupts per second.. I do not see a hardware minimum fragment size in the HW docs... The default hardware reset frag size is 2048 bytes. So, yes, 32 bytes is pretty small for today's rate. But... I wonder if the fault lies more with the application setting a too-small fragment size and the driver actually allows it to do so, or, the code following this comment in reorganize_buffers in drivers/sound/audio.c needs to be revisited: /* Compute the fragment size using the default algorithm */ Remember this code is from ancient times... probably written way before 44 Khz was common at all. Jeff -- Jeff Garzik | Only so many songs can be sung Building 1024 | with two lips, two lungs, and one tongue. MandrakeSoft | - nomeansno ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 6:34 ` Jeff Garzik @ 2001-12-18 12:23 ` Rik van Riel 2001-12-18 14:29 ` Alan Cox ` (4 subsequent siblings) 6 siblings, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 12:23 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Mon, 17 Dec 2001, Linus Torvalds wrote: > On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > > > 5: 46490271 XT-PIC soundblaster > > > > Approximately 4 times more often than the timer interrupt. > > That's not nice... That's not nearly as much as your typical server system runs in network packets and wakeups of the samba/database/http daemons, though ... > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). So you fixed it for the sound driver, nice. We still have the issue that the scheduler can take up lots of time on busy server systems, though. (though I suspect on those systems it probably spends more time recalculating than selecting processes) regards, Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 6:34 ` Jeff Garzik 2001-12-18 12:23 ` Rik van Riel @ 2001-12-18 14:29 ` Alan Cox 2001-12-18 17:07 ` Linus Torvalds 2001-12-18 15:51 ` Martin Josefsson ` (3 subsequent siblings) 6 siblings, 1 reply; 87+ messages in thread From: Alan Cox @ 2001-12-18 14:29 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). The sb driver is fine > A number of sound drivers will use the same logic. Most hardware does > Quite frankly I don't know the sound infrastructure well enough to make > any more intelligent suggestions about other decoders or similar to try, > at this point I just start blathering. some of the sound stuff uses very short fragments to get accurate audio/video synchronization. Some apps also do it gratuitously when they should be using other API's. Its also used sensibly for things like gnome-meeting where its worth trading CPU for latency because 1K of buffering starts giving you earth<->moon type conversations > But yes, I bet you'll also see much less impact of this if you were to > switch to more modern hardware. Not really - the app asked for an event every 32 bytes. This is an app not kernel problem. > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). With a few exceptions the applications tend to use 4K or larger DMA chunks anyway. Very few need tiny chunks. Alan ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:29 ` Alan Cox @ 2001-12-18 17:07 ` Linus Torvalds 0 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:07 UTC (permalink / raw) To: Alan Cox; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, Alan Cox wrote: > > > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > > rather than >400). > > With a few exceptions the applications tend to use 4K or larger DMA chunks > anyway. Very few need tiny chunks. Doing another grep seems to imply that none of the other drivers even allow as small chunks as the sb driver does, 32 byte "events" is just ridiculous. At simple 2-channel, 16-bits, CD-quality sound, that's a DMA event every 0.18 msec (5500 times a second, 181 _micro_seconds apart). I obviously agree that the app shouldn't even ask for small chunks: whether an mp3 player reacts within 1/10th or 1/1000th of a second of the user asking it to switch tracks, nobody can even tell. So an mp3 player should probably use a big fragment size on the order of 4kB or similar (that still gives max fragment latency of 0.022 seconds, faster than humans can react). So it sounds like player silliness, but I don't think the driver should even allow such waste of resources, considering that no other driver allows it either.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
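The latency figures in this message can be checked the same way: a fragment's drain time is its size divided by the audio byte rate. A quick sketch (hypothetical helper name; 64-bit math to avoid overflow on 32-bit longs):

```c
/* Time, in microseconds, for the hardware to drain one fragment.
 * At 44.1 kHz 16-bit stereo the byte rate is 176400 bytes/sec, so a
 * 32-byte fragment drains in ~181 usec (~5500 irqs/sec) while a 4 kB
 * fragment takes ~23 msec - below human reaction time. */
long long frag_drain_usec(long long frag_bytes, long long byte_rate)
{
    return frag_bytes * 1000000LL / byte_rate;
}
```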
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (2 preceding siblings ...) 2001-12-18 14:29 ` Alan Cox @ 2001-12-18 15:51 ` Martin Josefsson 2001-12-18 17:08 ` Linus Torvalds 2001-12-18 16:16 ` Roger Larsson ` (2 subsequent siblings) 6 siblings, 1 reply; 87+ messages in thread From: Martin Josefsson @ 2001-12-18 15:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Mon, 17 Dec 2001, Linus Torvalds wrote: > > On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > > > 5: 46490271 XT-PIC soundblaster > > > > Approximately 4 times more often than the timer interrupt. > > That's not nice... 0: 24867181 XT-PIC timer 5: 9070614 XT-PIC soundblaster After I boot up I start X and then xmms, and then my system plays mp3's almost all the time. > > > Which sound driver are you using, just in case this _is_ the reason? > > > > SoundBlaster 16 I have an old ISA SoundBlaster 16 > Raising that min_fragment thing from 5 to 10 would make the minimum DMA > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). After watching /proc/interrupts at 30-second intervals I see that I only get 43 interrupts/second when playing 16-bit 44.1kHz stereo. And according to vmstat I have 153-158 interrupts/second in total (it's probably the network traffic that increases it a little above 143). /Martin ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 15:51 ` Martin Josefsson @ 2001-12-18 17:08 ` Linus Torvalds 0 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:08 UTC (permalink / raw) To: Martin Josefsson; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, Martin Josefsson wrote: > > After watchning /proc/interrupts with 30 second intervals I see that I > only get 43 interrupts/second when playing 16bit 44.1kHz stereo. That's _exactly_ what you get with a 4kB fragment size. You have a sane player that asks for a sane fragment size. While whatever William uses seems to ask for a really small one.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (3 preceding siblings ...) 2001-12-18 15:51 ` Martin Josefsson @ 2001-12-18 16:16 ` Roger Larsson 2001-12-18 17:16 ` Herman Oosthuysen 2001-12-18 17:16 ` Linus Torvalds 2001-12-18 17:21 ` David Mansfield 2001-12-18 18:25 ` William Lee Irwin III 6 siblings, 2 replies; 87+ messages in thread From: Roger Larsson @ 2001-12-18 16:16 UTC (permalink / raw) To: Linus Torvalds, William Lee Irwin III Cc: Kernel Mailing List, linux-audio-dev, Jeff Garzik This might be of interest on linux-audio-dev too... On Tuesday den 18 December 2001 07.09, Linus Torvalds wrote: > On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > 5: 46490271 XT-PIC soundblaster > > > > Approximately 4 times more often than the timer interrupt. > > That's not nice... > > Yeah. > > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). > > And yes, the dmabuf code will wake up the writer on every single DMA > complete interrupt. Considering that you seem to have them at least 400 > times a second (and probably more, unless you've literally had sound going > since the machine was booted), I think we know why your setup spends time > in the scheduler. > > > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > > Which sound driver are you using, just in case this _is_ the reason? > > > > SoundBlaster 16 > > A change of hardware should help verify this. > > A number of sound drivers will use the same logic. > > You may be able to change this more easily some other way, by using a > larger fragment size for example. That's up to the sw that actually feeds > the sound stream, so it might be your decoder that selects a small > fragment size. 
> > Quite frankly I don't know the sound infrastructure well enough to make > any more intelligent suggestions about other decoders or similar to try, > at this point I just start blathering. > > But yes, I bet you'll also see much less impact of this if you were to > switch to more modern hardware. > > grep grep grep.. Oh, before you do that, how about changing "min_fragment" > in sb_audio.c from 5 to something bigger like 9 or 10? > > That > > audio_devs[devc->dev]->min_fragment = 5; > > literally means that your minimum fragment size seems to be a rather > pathetic 32 bytes (which doesn't mean that your sound will be set to that, > but it _might_ be). That sounds totally ridiculous, but maybe I've > misunderstood the code. I think it really is 32 samples, yes that is little - but too small? It depends on the used sample frequency... Paul Davis wrote this on linux-audio-dev 2001-12-05 "in doing lots of testing on JACK, i've noticed that although the trident driver now works (there were some patches from jaroslav and myself), in general i still get xruns with the lowest possible latency setting for that card (1.3msec per interrupt, 2.6msec buffer). with the same settings on my hammerfall, i don't get xruns, even with substantial system load." > > Jeff, you've worked on the sb code at some point - does it really do > 32-byte sound fragments? Why? That sounds truly insane if I really parsed > that code correctly. That's thousands of separate DMA transfers > and interrupts per second.. > Lets see: we have >1 GHz CPU and interrupts at >1000 Hz => 1 Mcycle / interrupt - is that insane? If the hardware can support it? Why not let it? It is really up to the applications/user to decide... 
> Raising that min_fragment thing from 5 to 10 would make the minimum DMA > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). > Yes, it is probably more reasonable - but if the soundcard can support it? (I have a vision of lots of linux-audio-dev folks pulling out their new soundcard and replacing it with their since long forgotten SB16...) /RogerL -- Roger Larsson Skellefteå Sweden ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:16 ` Roger Larsson @ 2001-12-18 17:16 ` Herman Oosthuysen 2001-12-18 17:16 ` Linus Torvalds 1 sibling, 0 replies; 87+ messages in thread From: Herman Oosthuysen @ 2001-12-18 17:16 UTC (permalink / raw) To: Kernel Mailing List, linux-audio-dev My tuppence worth from a real-time embedded perspective: A shorter time slice and other real-time improvements to the scheduler will certainly improve life to the embedded crowd. Bear in mind that 90% of processors are used for embedded apps. Shorter time slices etc. means smaller buffers, less RAM and lower cost. I don't know what the current distribution is for Linux regarding embedded vs data processing, but the embedded use of Linux is certainly growing rapidly - we expect to make a million thingummyjigs running Linux next year and there are many other companies doing the same. Within the next few years, I expect embedded use of Linux to overshadow data use by a large margin. Since embedded processors are 'invisible' and never in the news, I would be very happy if Linus and others will keep us poor boys in mind... -- Herman Oosthuysen Herman@WirelessNetworksInc.com Suite 300, #3016, 5th Ave NE, Calgary, Alberta, T2A 6K4, Canada Phone: (403) 569-5688, Fax: (403) 235-3965 ----- Original Message ----- > > Lets see: we have >1 GHz CPU and interrupts at >1000 Hz > => 1 Mcycle / interrupt - is that insane? > > If the hardware can support it? Why not let it? It is really up to the > applications/user to decide... > > > Raising that min_fragment thing from 5 to 10 would make the minimum DMA > > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > > rather than >400). > > > > /RogerL > > -- > Roger Larsson > Skellefteå > Sweden ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:16 ` Roger Larsson 2001-12-18 17:16 ` Herman Oosthuysen @ 2001-12-18 17:16 ` Linus Torvalds 1 sibling, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:16 UTC (permalink / raw) To: Roger Larsson Cc: William Lee Irwin III, Kernel Mailing List, linux-audio-dev, Jeff Garzik On Tue, 18 Dec 2001, Roger Larsson wrote: > > Lets see: we have >1 GHz CPU and interrupts at >1000 Hz > => 1 Mcycle / interrupt - is that insane? Ehh.. First off, the CPU may be 1GHz, but the memory subsystem, and the PCI subsystem definitely are _not_. Most PCI cards still run at a (comparatively) leisurely 33MHz, and when we're talking about audio, we're talking about actually having to _access_ that audio device. Yes. At 33MHz, not at 1GHz. Also, at 32-byte fragments, the frequency is actually 5.5kHz, not 1kHz. Now, I seriously doubt the mp3-player actually used 32-byte fragments (it probably just asked for something small, and got it), but let's say it asked for something in the kHz range (ie 256-512 byte frags). That does _not_ equate to "1 Mcycle". It equates to 33 _kilocycles_ in PCI-land, and a PCI read will take several cycles. > If the hardware can support it? Why not let it? It is really up to the > applications/user to decide... Well, this particular user was unhappy with the CPU spending a noticeably amount of time on just web-surfing and mp3-playing. So clearly the _user_ didn't ask for it. And I suspect that the app writer just didn't even realize what he did. He may have used another sound card that didn't even allow small fragments. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (4 preceding siblings ...) 2001-12-18 16:16 ` Roger Larsson @ 2001-12-18 17:21 ` David Mansfield 2001-12-18 17:27 ` Linus Torvalds 2001-12-18 18:25 ` William Lee Irwin III 6 siblings, 1 reply; 87+ messages in thread From: David Mansfield @ 2001-12-18 17:21 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik > > audio_devs[devc->dev]->min_fragment = 5; > Generally speaking, you want to be able to specify about a 1ms fragment, speaking as a realtime audio programmer (no offense Victor...). However, 1ms is 128 bytes at 16bit stereo, but only 32 bytes at 8bit mono. Nobody does 8bit mono, but that's probably why it's there. A lot of drivers seem to have 128 byte as minimum fragment size. Even the high end stuff like the RME hammerfall only go down to 64 byte fragment PER CHANNEL, which is the same as 128 bytes for stereo in the SB 16. > Raising that min_fragment thing from 5 to 10 would make the minimum DMA > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). Note that the ALSA drivers allow the app to set watermarks for wakeup, while allowing flexibility in fragment size and number. You can essentially say, wake me up when there are at least n fragments empty, and put me to sleep if m fragments are full. David -- /==============================\ | David Mansfield | | david@cobite.com | \==============================/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:21 ` David Mansfield @ 2001-12-18 17:27 ` Linus Torvalds 2001-12-18 17:54 ` Andreas Dilger 2001-12-18 18:58 ` Alan Cox 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:27 UTC (permalink / raw) To: David Mansfield; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, David Mansfield wrote: > > > > audio_devs[devc->dev]->min_fragment = 5; > > > > Generally speaking, you want to be able to specify about a 1ms fragment, > speaking as a realtime audio programmer (no offense Victor...). However, > 1ms is 128 bytes at 16bit stereo, but only 32 bytes at 8bit mono. Nobody > does 8bit mono, but that's probably why it's there. A lot of drivers seem > to have 128 byte as minimum fragment size. Good point. Somebody should really look at "dma_set_fragment", and see whether we can make "min_fragment" be really just a hardware minimum chunk size, but use other heuristics like frequency to cut off the minimum size (ie just do something like /* We want to limit it to 1024 Hz */ min_bytes = freq*channel*bytes_per_channel >> 10; Although I'm not sure we _have_ the frequency at that point: somebody might set the fragment size first, and the frequency later. Maybe the best thing to do is to educate the people who write the sound apps for Linux (somebody was complaining about "esd" triggering this, for example). Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:27 ` Linus Torvalds @ 2001-12-18 17:54 ` Andreas Dilger 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:35 ` Linus Torvalds 2001-12-18 18:58 ` Alan Cox 1 sibling, 2 replies; 87+ messages in thread From: Andreas Dilger @ 2001-12-18 17:54 UTC (permalink / raw) To: Linus Torvalds Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Dec 18, 2001 09:27 -0800, Linus Torvalds wrote: > Maybe the best thing to do is to educate the people who write the sound > apps for Linux (somebody was complaining about "esd" triggering this, for > example). Yes, esd is an interrupt hog, it seems. When reading this thread, I checked, and sure enough I was getting 190 interrupts/sec on the sound card while not playing any sound. I killed esd (which I don't use anyways), and interrupts went to 0/sec when not playing sound. Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) sound card. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:54 ` Andreas Dilger @ 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:52 ` Andreas Dilger ` (3 more replies) 2001-12-18 18:35 ` Linus Torvalds 1 sibling, 4 replies; 87+ messages in thread From: Doug Ledford @ 2001-12-18 18:27 UTC (permalink / raw) To: Andreas Dilger; +Cc: Kernel Mailing List Andreas Dilger wrote: > On Dec 18, 2001 09:27 -0800, Linus Torvalds wrote: > >>Maybe the best thing to do is to educate the people who write the sound >>apps for Linux (somebody was complaining about "esd" triggering this, for >>example). >> > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > checked, and sure enough I was getting 190 interrupts/sec on the > sound card while not playing any sound. I killed esd (which I don't > use anyways), and interrupts went to 0/sec when not playing sound. > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > sound card. Well, evidently esd and artsd both do this (well, I assume esd does now, it didn't do this in the past). Basically, they both transmit silence over the sound chip when nothing else is going on. So even though you don't hear anything, the same sound output DMA is taking place. That avoids things like nasty pops when you start up the sound hardware for a beep and that sort of thing. It also maintains state, whereas dropping output entirely could result in things like module auto unloading and then reloading on the next beep, etc. Personally, the interrupt count and overhead annoyed me enough that when I started hacking on the i810 sound driver one of my primary goals was to get overhead and interrupt count down. I think I succeeded quite well.
On my current workstation: Context switches per second not playing any sound: 8300 - 8800 Context switches per second playing an MP3: 9200 - 9900 Interrupts per second from sound device: 86 %CPU used when not playing MP3: 0 - 3% (magicdev is a CPU pig once every 2 seconds) %CPU used when playing MP3s: 0 - 4% In any case, it might be worth the original poster's time in figuring out just how much of his lost CPU is because of playing sound and how much is actually caused by the windowing system and all the associated bloat that comes with it nowadays. -- Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford Please check my web site for aic7xxx updates/answers before e-mailing me about problems ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford @ 2001-12-18 18:52 ` Andreas Dilger 2001-12-18 19:03 ` Doug Ledford 2001-12-19 9:19 ` Peter Wächtler ` (2 subsequent siblings) 3 siblings, 1 reply; 87+ messages in thread From: Andreas Dilger @ 2001-12-18 18:52 UTC (permalink / raw) To: Doug Ledford; +Cc: Kernel Mailing List On Dec 18, 2001 13:27 -0500, Doug Ledford wrote: > Andreas Dilger wrote: > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > > checked, and sure enough I was getting 190 interrupts/sec on the > > sound card while not playing any sound. I killed esd (which I don't > > use anyways), and interrupts went to 0/sec when not playing sound. > > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > > sound card. > > Well, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over the > sound chip when nothing else is going on. So even though you don't hear > anything, the same sound output DMA is taking place. That avoids things > like nasty pops when you start up the sound hardware for a beep and that > sort of thing. Hmm, I _do_ notice a pop when the sound hardware is first initialized at boot time, but not when mpg123 starts/stops (without esd running) so I personally don't get any benefit from "the sound of silence". That said, aside from the 190 interrupts/sec from esd, it doesn't appear to use any measurable CPU time by itself. > Context switches per second not playing any sound: 8300 - 8800 > Context switches per second playing an MP3: 9200 - 9900 Hmm, something seems very strange there. On an idle system, I get about 100 context switches/sec, and about 150/sec when playing sound (up to 400/sec when moving the mouse between windows). 9000 cswitches/sec is _very_ high. This is with a text-only player which has no screen output (other than the ID3 info from the currently played song).
Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:52 ` Andreas Dilger @ 2001-12-18 19:03 ` Doug Ledford 0 siblings, 0 replies; 87+ messages in thread From: Doug Ledford @ 2001-12-18 19:03 UTC (permalink / raw) To: Andreas Dilger; +Cc: Kernel Mailing List Andreas Dilger wrote: > Hmm, I _do_ notice a pop when the sound hardware is first initialized at > boot time, but not when mpg123 starts/stops (without esd running) so I > personally don't get any benefit from "the sound of silence". That said, > aside from the 190 interrupts/sec from esd, it doesn't appear to use any > measurable CPU time by itself. > > >>Context switches per second not playing any sound: 8300 - 8800 >>Context switches per second playing an MP3: 9200 - 9900 >> > > Hmm, something seems very strange there. On an idle system, I get about > 100 context switches/sec, and about 150/sec when playing sound (up to 400/sec > when moving the mouse between windows). 9000 cswitches/sec is _very_ high. > This is with a text-only player which has no screen output (other than the > ID3 info from the currently played song). I haven't taken the time to track down what's causing all the context switches, but on my system they are indeed "normal". I suspect large numbers of them are a result of interactions between gnome, nautilus, X, xmms, esd, and gnome-xmms. However, I did just track down one reason for it. It's not 8300 - 8800, it's 830 - 880. There appears to be a bug in the procinfo -n1 mode that results in an extra digit getting tacked onto the end of the context switch line. So, take my original numbers and lop off the last digit from the context switch numbers and that's more like what the machine is actually doing. -- Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford Please check my web site for aic7xxx updates/answers before e-mailing me about problems ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:52 ` Andreas Dilger @ 2001-12-19 9:19 ` Peter Wächtler 2001-12-19 11:05 ` Helge Hafting 2001-12-21 20:23 ` Rob Landley 3 siblings, 0 replies; 87+ messages in thread From: Peter Wächtler @ 2001-12-19 9:19 UTC (permalink / raw) To: Doug Ledford; +Cc: Kernel Mailing List Doug Ledford wrote: > > Andreas Dilger wrote: > > > On Dec 18, 2001 09:27 -0800, Linus Torvalds wrote: > > > >>Maybe the best thing to do is to educate the people who write the sound > >>apps for Linux (somebody was complaining about "esd" triggering this, for > >>example). > >> > > > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > > checked, and sure enough I was getting 190 interrupts/sec on the > > sound card while not playing any sound. I killed esd (which I don't > > use anyways), and interrupts went to 0/sec when not playing sound. > > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > > sound card. > > Well, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over the > sound chip when nothing else is going on. So even though you don't hear > anything, the same sound output DMA is taking place. That avoids things > like nasty pops when you start up the sound hardware for a beep and that > sort of thing. It also maintains state, whereas dropping output entirely > could result in things like module auto unloading and then reloading on the > next beep, etc. Personally, the interrupt count and overhead annoyed me > enough that when I started hacking on the i810 sound driver one of my > primary goals was to get overhead and interrupt count down. I think I > succeeded quite well.
On my current workstation: > > Context switches per second not playing any sound: 8300 - 8800 > Context switches per second playing an MP3: 9200 - 9900 > Interrupts per second from sound device: 86 > %CPU used when not playing MP3: 0 - 3% (magicdev is a CPU pig once every 2 > seconds) > %CPU used when playing MP3s: 0 - 4% > > In any case, it might be worth the original poster's time in figuring out > just how much of his lost CPU is because of playing sound and how much is > actually caused by the windowing system and all the associated bloat that > comes with it nowadays. > Do you really think 8000 context switches are sane? pippin:/var/log # vmstat 1
 procs                  memory    swap        io    system        cpu
 r b w  swpd  free  buff  cache  si so   bi bo   in  cs  us sy id
 2 0 0 100728 4424 121572 27800   0  1    6  6   61  77  98  2  0
 2 0 0 100728 5448 121572 27800   0  0    0 68  112 811  93  7  0
 2 0 0 100728 5448 121572 27800   0  0    0  0  101 776  95  5  0
 3 0 0 100728 4928 121572 27800   0  0    0  0  101 794  92  8  0
having a load ~2.1 (2 seti@home) ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:52 ` Andreas Dilger 2001-12-19 9:19 ` Peter Wächtler @ 2001-12-19 11:05 ` Helge Hafting 2001-12-21 20:23 ` Rob Landley 3 siblings, 0 replies; 87+ messages in thread From: Helge Hafting @ 2001-12-19 11:05 UTC (permalink / raw) To: Doug Ledford, linux-kernel Doug Ledford wrote: > Weel, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over the > sound chip when nothing else is going on. So even though you don't hear > anything, the same sound output DMA is taking place. Uuurgh. :-( > That avoids things > like nasty pops when you start up the sound hardware for a beep and that Yuk, bad hardware. Pops when you start or stop writing? You don't even have to turn the volume off or something to get a pop? Toss it. > sort of thing. It also maintains state where as dropping output entirely > could result in things like module auto unloading and then reloading on the > next beep, etc. Much better solved by having the device open, but not writing anything. Open devices don't unload. Helge Hafting ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford ` (2 preceding siblings ...) 2001-12-19 11:05 ` Helge Hafting @ 2001-12-21 20:23 ` Rob Landley 3 siblings, 0 replies; 87+ messages in thread From: Rob Landley @ 2001-12-21 20:23 UTC (permalink / raw) To: Doug Ledford, Andreas Dilger; +Cc: Kernel Mailing List On Tuesday 18 December 2001 01:27 pm, Doug Ledford wrote: > Well, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over > the sound chip when nothing else is going on. So even though you don't > hear anything, the same sound output DMA is taking place. That avoids THAT explains it. My Dell Inspiron 3500 laptop's built-in sound (NeoMagic MagicMedia 256 AV, uses ad1848 module) works fine when I first boot the sucker, but loses its marbles after an APM suspend and stops receiving interrupts. (Extensive poking around with setpci has so far failed to get it working again, but on a shutdown and restart the bios sets it up fine. Not a clue what's up there. The bios and module agree it's using IRQ 7, but lspci insists it's IRQ 11, both before and after apm suspend. Boggle.) I was confused for a while about how exactly it was failing because KDE and mpg123 from the command line fail in different ways. mpg123 will play the same half-second clip in a loop (ahah! no interrupt!), but sound in kde just vanishes and I get silence and hung apps whenever I try to launch anything. The clue is that it doesn't always fail when I suspend it without having X up. Translation: maybe the sound card's getting hosed by being open and in use on APM shutdown! Hmmm... I should poke at this over the weekend... (Nope, not a new problem. My laptop's sound has been like this since at least 2.4.4, which I think was the first version I installed on the box. But it's still annoying, I can go weeks without a true reboot 'cause I have a zillion konqueror windows and such open.
I have to clear my desktop to get sound working again for a few hours. Obnoxious...) Rob ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:54 ` Andreas Dilger 2001-12-18 18:27 ` Doug Ledford @ 2001-12-18 18:35 ` Linus Torvalds 1 sibling, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 18:35 UTC (permalink / raw) To: Andreas Dilger Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, Andreas Dilger wrote: > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > checked, and sure enough I was getting 190 interrupts/sec on the > sound card while not playing any sound. I killed esd (which I don't > use anyways), and interrupts went to 0/sec when not playing sound. > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > sound card. 190 interrupts / sec sounds excessive, but not wildly so. The interrupt per se is not going to be a CPU hog unless the sound card does programmed IO to fill the data queues, and while that is not unheard of, I don't think such a card has been made in the last five years. Obviously getting 190 irq's per second even when not actually _doing_ anything is a total waste of CPU, and is bad form. There may be some reason why esd does it, most probably for good synchronization between sound events and to avoid popping when the sound is shut down (many sound drivers seem to pop a bit on open/close, possibly due to driver bugs, but possibly because of some hard-to-avoid-programmatically hardware glitch when powering down the logic). So waiting a while with the driver active may actually be a reasonable thing to do, although I suspect that after long sequences of silence "esd" should really shut down for a while (and "long" here is probably on the order of seconds, not minutes).
What _really_ ends up hurting performance is probably not the interrupt per se (although it is noticeable), but the fact that we wake up and cause a schedule - which often blows any CPU caches, making the _next_ interrupt also be more expensive than it would possibly need to be. The code for that (in the case of drivers that use the generic "dmabuf.c" infrastructure) seems to be in "finish_output_interrupt()", and I suspect that it could be improved with something like dmap = adev->dmap_out; lim = dmap->nbufs; if (lim < 2) lim = 2; if (dmap->qlen <= lim/2) { ... } around the current unconditional wakeups. Yeah, yeah, untested, stupid example, the idea being that we only wake up if we have at least half the frags free now, instead of waking up for _every_ fragment that frees up. The above is just as a suggestion for some testing, if somebody actually feels like trying it out. It probably won't be good as-is, but as a starting point.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:27 ` Linus Torvalds 2001-12-18 17:54 ` Andreas Dilger @ 2001-12-18 18:58 ` Alan Cox 2001-12-18 19:31 ` Gerd Knorr 1 sibling, 1 reply; 87+ messages in thread From: Alan Cox @ 2001-12-18 18:58 UTC (permalink / raw) To: Linus Torvalds Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List, Jeff Garzik > Maybe the best thing to do is to educate the people who write the sound > apps for Linux (somebody was complaining about "esd" triggering this, for > example). esd is a culprit, and artsd to an extent. esd is scheduled to die so artsd is the big one to tidy. Kernel side OSS is dead so it's a matter for ALSA ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:58 ` Alan Cox @ 2001-12-18 19:31 ` Gerd Knorr 0 siblings, 0 replies; 87+ messages in thread From: Gerd Knorr @ 2001-12-18 19:31 UTC (permalink / raw) To: linux-kernel > Kernel side OSS is dead What do you mean by "Kernel side OSS"? Only Hannu's OSS/free drivers? Or all current kernel drivers which support the OSS API, including most (all?) PCI sound drivers which don't use any old OSS/free code? Gerd -- #define ENOCLUE 125 /* userland programmer induced race condition */ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (5 preceding siblings ...) 2001-12-18 17:21 ` David Mansfield @ 2001-12-18 18:25 ` William Lee Irwin III 6 siblings, 0 replies; 87+ messages in thread From: William Lee Irwin III @ 2001-12-18 18:25 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Jeff Garzik On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote: > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). > And yes, the dmabuf code will wake up the writer on every single DMA > complete interrupt. Considering that you seem to have them at least 400 > times a second (and probably more, unless you've literally had sound going > since the machine was booted), I think we know why your setup spends time > in the scheduler. > A number of sound drivers will use the same logic. I've chucked the sb32 and plugged in the emu10k1 I had been planning to install for a while, to good effect. It's not an ISA sb16, but it apparently uses the same driver. I'm getting an overall 1% reduction in system load, and the following "top 5" profile: 53374 total 0.0400 11430 default_idle 238.1250 8820 handle_IRQ_event 91.8750 2186 do_softirq 10.5096 1984 schedule 1.2525 1612 number 1.4816 1473 __generic_copy_to_user 18.4125 Oddly, I'm getting even more interrupts than I saw with the sb32... 0: 2752924 XT-PIC timer 9: 14223905 XT-PIC EMU10K1, eth1 (eth1 generates orders of magnitude fewer interrupts than the timer) On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote: > You may be able to change this more easily some other way, by using a > larger fragment size for example. That's up to the sw that actually feeds > the sound stream, so it might be your decoder that selects a small > fragment size. 
> Quite frankly I don't know the sound infrastructure well enough to make > any more intelligent suggestions about other decoders or similar to try, > at this point I just start blathering. Already more insight into the problem I was experiencing than I had before, and I must confess to those such as myself this lead certainly seems "plucked out of the air". Good work! =) On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote: > But yes, I bet you'll also see much less impact of this if you were to > switch to more modern hardware. I hear from elsewhere the emu10k1 has a bad reputation as source of excessive interrupts. Looks like I bought the wrong sound card(s). Maybe I should go shopping. =) Thanks a bunch! Bill ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:55 ` William Lee Irwin III 2001-12-18 6:09 ` Linus Torvalds @ 2001-12-18 14:21 ` Adam Schrotenboer 1 sibling, 0 replies; 87+ messages in thread From: Adam Schrotenboer @ 2001-12-18 14:21 UTC (permalink / raw) To: Kernel Mailing List On Monday 17 December 2001 23:55, William Lee Irwin III wrote: > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > The most likely cause is simply waking up after each sound interrupt: you > > also have a _lot_ of time handling interrupts. Quite frankly, web surfing > > and mp3 playing simply shouldn't use any noticeable amounts of CPU. > > I think we have a winner: > /proc/interrupts > ------------------------------------------------ > CPU0 > 0: 17321824 XT-PIC timer > 1: 4 XT-PIC keyboard > 2: 0 XT-PIC cascade > 5: 46490271 XT-PIC soundblaster > 9: 400232 XT-PIC usb-ohci, eth0, eth1 > 11: 939150 XT-PIC aic7xxx, aic7xxx > 14: 13 XT-PIC ide0 > > Approximately 4 times more often than the timer interrupt. > That's not nice... FWIW, I have an ES1371 based sound card, and mpg123 drives it at 172 interrupts/sec (calculated in procinfo). But that _is_ only when playing. And (my slightly hacked) timidity drives my card w/ only 23(@48kHz sample rate; 21 @ 44.1kHz) interrupts/sec Is this 172 figure right? (Not through esd either. I almost always turn it off, and so recompiled mpg123 to use the std OSS driver) > > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > Which sound driver are you using, just in case this _is_ the reason? > > SoundBlaster 16 > A change of hardware should help verify this. > > > Cheers, > Bill > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:27 ` Linus Torvalds 2001-12-18 4:55 ` William Lee Irwin III @ 2001-12-18 18:13 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 18:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > The most likely cause is simply waking up after each sound interrupt: you > also have a _lot_ of time handling interrupts. Quite frankly, web surfing > and mp3 playing simply shouldn't use any noticeable amounts of CPU. It must be noted that waking up a task is going to take two lock operations ( and two unlock ), one in try_to_wakeup() and the other one in schedule(). This doubles the frequency seen by the lock. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Just a second ...
@ 2001-12-16 0:13 Linus Torvalds
2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi
0 siblings, 1 reply; 87+ messages in thread
From: Linus Torvalds @ 2001-12-16 0:13 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Kernel Mailing List
On Sat, 15 Dec 2001, Davide Libenzi wrote:
>
> when you find 10 secs free in your spare time I really would like to know
> the reason ( if any ) of your abstention from any scheduler discussion.
> No hurry, just a few lines out of lkml.
I just don't find it very interesting. The scheduler is about 100 lines
out of however-many-million (3.8 at last count), and doesn't even impact
most normal performance very much.
We'll clearly do per-CPU runqueues or something some day. And that worries
me not one whit, compared to things like VM and block device layer ;)
I know a lot of people think schedulers are important, and the operating
system theory about them is overflowing - it's one of those things that
people can argue about forever, yet is conceptually simple enough that
people aren't afraid of it. I just personally never found it to be a major
issue.
Let's face it - the current scheduler has the same old basic structure
that it did almost 10 years ago, and yes, it's not optimal, but there
really aren't that many real-world loads where people really care. I'm
sorry, but it's true.
And you have to realize that there are not very many things that have
aged as well as the scheduler. Which is just another proof that scheduling
is easy.
We've rewritten the VM several times in the last ten years, and I expect
it will be changed several more times in the next few years. Within five
years we'll almost certainly have to make the current three-level page
tables be four levels etc.
In comparison to those kinds of issues, I suspect that making the
scheduler use per-CPU queues together with some inter-CPU load balancing
logic is probably _trivial_. Patches already exist, and I don't feel that
people can screw up the few hundred lines too badly.
Linus
^ permalink raw reply [flat|nested] 87+ messages in thread* Scheduler ( was: Just a second ) ... 2001-12-16 0:13 Just a second Linus Torvalds @ 2001-12-17 22:48 ` Davide Libenzi 2001-12-17 22:53 ` Linus Torvalds 0 siblings, 1 reply; 87+ messages in thread From: Davide Libenzi @ 2001-12-17 22:48 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List On Sat, 15 Dec 2001, Linus Torvalds wrote: > I just don't find it very interesting. The scheduler is about 100 lines > out of however-many-million (3.8 at last count), and doesn't even impact > most normal performance very much. Linus, sharing queue and lock between CPUs for a "thing" accessed at high frequency ( schedule()s + wakeup()s ) like the scheduler is quite ugly and it's not that much fun. And it's not only performance wise, it's more design wise. > We'll clearly do per-CPU runqueues or something some day. And that worries > me not one whit, compared to things like VM and block device layer ;) Why not 2.5.x ? > I know a lot of people think schedulers are important, and the operating > system theory about them is overflowing - ... It's no more important than anything else, it's just one of the remaining scalability/design issues. No, it's not more important than VM but there're enough people working on VM. And the hope is to get the scheduler right with an ETA of less than 10 years. > it's one of those things that people can argue about forever, ... Yes, I suppose that if something is not addressed, it'll come up again and again. > yet is conceptually simple enough that people aren't afraid of it. ^^^^^^^^^^^^^^^^^^^ 1, ... > Let's face it - the current scheduler has the same old basic structure > that it did almost 10 years ago, and yes, it's not optimal, but there > really aren't that many real-world loads where people really care. I'm > sorry, but it's true. Moving to 4, 8, 16 CPUs, the run queue load that would be thought insane for UP systems starts to matter.
Just to leave out cache line effects. Just to leave out the way the current scheduler moves tasks around CPUs. Linus, it's not only about performance benchmarks with 2451 processes jumping on the run queue, that I could not care less about, it's just a sum of sucky "things" that make an issue. You can look at it like a cosmetic/design patch more than a strict performance patch if you like. > And you have to realize that there are not very many things that have > aged as well as the scheduler. Which is just another proof that > scheduling is easy. ^^^^^^^^^^^^^^^^^^ ..., 2, ... > We've rewritten the VM several times in the last ten years, and I expect > it will be changed several more times in the next few years. Within five > years we'll almost certainly have to make the current three-level page > tables be four levels etc. > > In comparison to those kinds of issues, I suspect that making the > scheduler use per-CPU queues together with some inter-CPU load balancing > logic is probably _trivial_. ^^^^^^^^^ ... 3, there should be a subliminal message inside but I'm not able to get it ;) I would not call selecting the right task to run in an SMP system trivial. The difference between selecting the right task to run and selecting the right page to swap is that if you screw up with the task the system impact is lower. But, if you screw up, your design will suck in both cases. Anyway, given that 1) real men do VM ( I thought they didn't eat quiche ) and easy-coders do scheduling 2) the scheduler is easy/trivial and you do not seem interested in working on it 3) whoever is doing the scheduler cannot screw up things, why don't you give the responsibility for example to Alan or Ingo so that a discussion ( obviously easy ) about the future of the scheduler can be started w/out hurting real men doing VM ?
I'm talking about, you know, that kind of discussion where people bring solutions, code and numbers, they talk about the good and bad of certain approaches and they finally come up ( after some sane fight ) with a more or less widely approved solution. The scheduler, besides the real men crap, is one of the basic components of an OS, and having a public debate, I'm not saying every month and neither every year, but at least once every four years ( this is the last I remember ) could be a nice thing. And no, if you do not give to someone that you trust the "power" to redesign the scheduler, no scheduler discussions will start simply because people don't like the result of a debate to be dumped to /dev/null. > Patches already exist, and I don't feel that people can screw up the few > hundred lines too badly. Can you point me to a Linux patch that implements _real_independent_ ( queue and locking ) CPU schedulers with global balancing policy ? I searched very badly but I did not find anything. Yours faithfully, Jimmy Scheduler ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi @ 2001-12-17 22:53 ` Linus Torvalds 2001-12-17 23:15 ` Davide Libenzi 2001-12-18 1:54 ` Rik van Riel 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-17 22:53 UTC (permalink / raw) To: Davide Libenzi; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Davide Libenzi wrote: > On Sat, 15 Dec 2001, Linus Torvalds wrote: > > > I just don't find it very interesting. The scheduler is about 100 lines > > out of however-many-million (3.8 at last count), and doesn't even impact > > most normal performance very much. > > Linus, sharing queue and lock between CPUs for a "thing" as highly frequently > ( schedule()s + wakeup()s ) accessed as the scheduler is quite ugly > and it's not that much fun. And it's not only performance wise, it's > more design wise. "Design wise" is highly overrated. Simplicity is _much_ more important, if something is commonly only done a few hundred times a second. Locking overhead is basically zero for that case. > > We'll clearly do per-CPU runqueues or something some day. And that worries > > me not one whit, compared to things like VM and block device layer ;) > > Why not 2.5.x ? Maybe. But read the rest of the sentence. There are issues that are about a million times more important. > Moving to 4, 8, 16 CPUs the run queue load, that would be thought insane > for UP systems, starts to matter. 4 cpu's are "high end" today. We can probably point to tens of thousands of UP machines for each 4-way out there. The ratio gets even worse for 8, and 16 CPU's is basically a rounding error. You have to prioritize. Scheduling overhead is way down the list. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 22:53 ` Linus Torvalds @ 2001-12-17 23:15 ` Davide Libenzi 2001-12-17 23:18 ` Linus Torvalds 2001-12-18 1:54 ` Rik van Riel 1 sibling, 1 reply; 87+ messages in thread From: Davide Libenzi @ 2001-12-17 23:15 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > > On Mon, 17 Dec 2001, Davide Libenzi wrote: > > > On Sat, 15 Dec 2001, Linus Torvalds wrote: > > > > > I just don't find it very interesting. The scheduler is about 100 lines > > > out of however-many-million (3.8 at last count), and doesn't even impact > > > most normal performance very much. > > > > Linus, sharing queue and lock between CPUs for a "thing" as highly frequently > > ( schedule()s + wakeup()s ) accessed as the scheduler is quite ugly > > and it's not that much fun. And it's not only performance wise, it's > > more design wise. > > "Design wise" is highly overrated. > > Simplicity is _much_ more important, if something is commonly only done a > few hundred times a second. Locking overhead is basically zero for that > case. Few hundred is a nice definition because you can basically range from 0 to infinity. Anyway I agree that we can spend days debating about what this "few hundred" translates to, and I do not really want to. > 4 cpu's are "high end" today. We can probably point to tens of thousands > of UP machines for each 4-way out there. The ratio gets even worse for 8, > and 16 CPU's is basically a rounding error. > > You have to prioritize. Scheduling overhead is way down the list. You don't really have to serialize/prioritize, old Latins used to say "Divide Et Impera" ;) - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:15 ` Davide Libenzi @ 2001-12-17 23:18 ` Linus Torvalds 2001-12-17 23:39 ` Davide Libenzi 2001-12-17 23:52 ` Benjamin LaHaise 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-17 23:18 UTC (permalink / raw) To: Davide Libenzi; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Davide Libenzi wrote: > > > > You have to prioritize. Scheduling overhead is way down the list. > > You don't really have to serialize/prioritize, old Latins used to say > "Divide Et Impera" ;) Well, you explicitly _asked_ me why I had been silent on the issue. I told you. I also told you that I thought it wasn't that big of a deal, and that patches already exist. So I'm letting the patches fight it out among the people who _do_ care. Then, eventually, I'll do something about it, when we have a winner. If that isn't "Divide et Impera", I don't know _what_ is. Remember: the romans didn't much care for their subjects. They just wanted the glory, and the taxes. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:18 ` Linus Torvalds @ 2001-12-17 23:39 ` Davide Libenzi 2001-12-17 23:52 ` Benjamin LaHaise 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-17 23:39 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > So I'm letting the patches fight it out among the people who _do_ care. > > Then, eventually, I'll do something about it, when we have a winner. > > If that isn't "Divide et Impera", I don't know _what_ is. Remember: the > romans didn't much care for their subjects. They just wanted the glory, > and the taxes. Just like today, everyone I talk to wants glory, and everyone I talk to wants to _not_ pay taxes. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:18 ` Linus Torvalds 2001-12-17 23:39 ` Davide Libenzi @ 2001-12-17 23:52 ` Benjamin LaHaise 2001-12-18 1:11 ` Linus Torvalds 1 sibling, 1 reply; 87+ messages in thread From: Benjamin LaHaise @ 2001-12-17 23:52 UTC (permalink / raw) To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote: > Well, you explicitly _asked_ me why I had been silent on the issue. I told > you. Well, what about those of us who need syscall numbers assigned for which you are the only official assigned number registry? -ben ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:52 ` Benjamin LaHaise @ 2001-12-18 1:11 ` Linus Torvalds 2001-12-18 1:46 ` H. Peter Anvin 2001-12-18 5:54 ` Benjamin LaHaise 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 1:11 UTC (permalink / raw) To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List On Mon, 17 Dec 2001, Benjamin LaHaise wrote: > On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote: > > Well, you explicitly _asked_ me why I had been silent on the issue. I told > > you. > > Well, what about those of us who need syscall numbers assigned for which > you are the only official assigned number registry? I've told you a number of times that I'd like to see the preliminary implementation publicly discussed and some uses outside of private companies that I have no insight into.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 1:11 ` Linus Torvalds @ 2001-12-18 1:46 ` H. Peter Anvin 2001-12-18 5:54 ` Benjamin LaHaise 1 sibling, 0 replies; 87+ messages in thread From: H. Peter Anvin @ 2001-12-18 1:46 UTC (permalink / raw) To: linux-kernel Followup to: <Pine.LNX.4.33.0112171710160.2035-100000@penguin.transmeta.com> By author: Linus Torvalds <torvalds@transmeta.com> In newsgroup: linux.dev.kernel > > I've told you a number of times that I'd like to see the preliminary > implementation publicly discussed and some uses outside of private > companies that I have no insight into.. > There was a group at IBM who presented on an alternate SMP scheduler at this year's OLS; it generated quite a bit of good discussion. -hpa -- <hpa@transmeta.com> at work, <hpa@zytor.com> in private! "Unix gives you enough rope to shoot yourself in the foot." http://www.zytor.com/~hpa/puzzle.txt <amsp@zytor.com> ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 1:11 ` Linus Torvalds 2001-12-18 1:46 ` H. Peter Anvin @ 2001-12-18 5:54 ` Benjamin LaHaise 2001-12-18 6:10 ` Linus Torvalds 1 sibling, 1 reply; 87+ messages in thread From: Benjamin LaHaise @ 2001-12-18 5:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List On Mon, Dec 17, 2001 at 05:11:09PM -0800, Linus Torvalds wrote: > I've told you a number of times that I'd like to see the preliminary > implementation publicly discussed and some uses outside of private > companies that I have no insight into.. Well, we've got serious chicken and egg problems then. -ben -- Fish. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 5:54 ` Benjamin LaHaise @ 2001-12-18 6:10 ` Linus Torvalds 0 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 6:10 UTC (permalink / raw) To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Benjamin LaHaise wrote: > On Mon, Dec 17, 2001 at 05:11:09PM -0800, Linus Torvalds wrote: > > I've told you a number of times that I'd like to see the preliminary > > implementation publicly discussed and some uses outside of private > > companies that I have no insight into.. > > Well, we've got serious chicken and egg problems then. Why? I'd rather have people playing around with new system calls and _test_ them, and then have to recompile their apps if the system calls move later, than introduce new system calls that haven't gotten any public testing at all.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 22:53 ` Linus Torvalds 2001-12-17 23:15 ` Davide Libenzi @ 2001-12-18 1:54 ` Rik van Riel 2001-12-18 2:35 ` Linus Torvalds 1 sibling, 1 reply; 87+ messages in thread From: Rik van Riel @ 2001-12-18 1:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > You have to prioritize. Scheduling overhead is way down the list. That's not what the profiling on my UP machine indicates, let alone on SMP machines. Try readprofile some day, chances are schedule() is pretty near the top of the list. regards, Rik -- Shortwave goes a long way: irc.starchat.net #swl http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 1:54 ` Rik van Riel @ 2001-12-18 2:35 ` Linus Torvalds 2001-12-18 2:51 ` David Lang ` (2 more replies) 0 siblings, 3 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 2:35 UTC (permalink / raw) To: Rik van Riel; +Cc: Davide Libenzi, Kernel Mailing List On Mon, 17 Dec 2001, Rik van Riel wrote: > > Try readprofile some day, chances are schedule() is pretty > near the top of the list. Ehh.. Of course I do readprofile. But did you ever compare readprofile output to _total_ cycles spent? The fact is, it's not even noticeable under any normal loads, and _definitely_ not on UP except with totally made up benchmarks that just pass tokens around or yield all the time. Because we spend 95-99% in user space or idle. Which is as it should be. There are _very_ few loads that are kernel-intensive, and in fact the best way to get high system times is to do either lots of fork/exec/wait with everything cached, or do lots of open/read/write/close with everything cached. Of the remaining 1-5% of time, schedule() shows up as one fairly high thing, but on most profiles I've seen of real work it shows up long after things like "clear_page()" and "copy_page()". And look closely at the profile, and you'll notice that it tends to be a _loong_ tail of stuff. Quite frankly, I'd be a _lot_ more interested in making the scheduling slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a 100Hz one, _despite_ the fact that it will increase scheduling load even more. Because it improves interactive feel, and sometimes even performance (ie being able to sleep for shorter sequences of time allows some things that want "almost realtime" behaviour to avoid busy-looping for those short waits - improving performance exactly _because_ they put more load on the scheduler). 
The benchmark that is just about _the_ worst on the scheduler is actually something like "lmbench", and if you look at profiles for that you'll notice that system call entry and exit together with the read/write path ends up being more of a performance issue. And you know what? From a user standpoint, improving disk latency is again a _lot_ more noticeable than scheduler overhead. And even more important than performance is being able to read and write to CD-RW disks without having to know about things like "ide-scsi" etc, and do it sanely over different bus architectures etc. The scheduler simply isn't that important. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 2:35 ` Linus Torvalds @ 2001-12-18 2:51 ` David Lang 2001-12-18 3:08 ` Davide Libenzi 2001-12-18 14:09 ` Alan Cox 2 siblings, 0 replies; 87+ messages in thread From: David Lang @ 2001-12-18 2:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List One problem the current scheduler has on SMP machines (even 2 CPU ones) is that if the system is running one big process it will bounce from CPU to CPU and actually finish considerably slower than if you are running two CPU intensive tasks (with less cpu hopping). I saw this a few months ago as I was doing something as simple as gunzip on a large file, I got a 30% speed increase by running setiathome at the same time. I'm not trying to say that it should be the top priority, but there are definite weaknesses showing in the current implementation. David Lang On Mon, 17 Dec 2001, Linus Torvalds wrote: > Date: Mon, 17 Dec 2001 18:35:54 -0800 (PST) > From: Linus Torvalds <torvalds@transmeta.com> > To: Rik van Riel <riel@conectiva.com.br> > Cc: Davide Libenzi <davidel@xmailserver.org>, > Kernel Mailing List <linux-kernel@vger.kernel.org> > Subject: Re: Scheduler ( was: Just a second ) ... > > > On Mon, 17 Dec 2001, Rik van Riel wrote: > > > > Try readprofile some day, chances are schedule() is pretty > > near the top of the list. > > Ehh.. Of course I do readprofile. > > But did you ever compare readprofile output to _total_ cycles spent? > > The fact is, it's not even noticeable under any normal loads, and > _definitely_ not on UP except with totally made up benchmarks that just > pass tokens around or yield all the time. > > Because we spend 95-99% in user space or idle. Which is as it should be. > There are _very_ few loads that are kernel-intensive, and in fact the best > way to get high system times is to do either lots of fork/exec/wait with > everything cached, or do lots of open/read/write/close with everything > cached. 
> > Of the remaining 1-5% of time, schedule() shows up as one fairly high > thing, but on most profiles I've seen of real work it shows up long after > things like "clear_page()" and "copy_page()". > > And look closely at the profile, and you'll notice that it tends to be a > _loong_ tail of stuff. > > Quite frankly, I'd be a _lot_ more interested in making the scheduling > slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a > 100Hz one, _despite_ the fact that it will increase scheduling load even > more. Because it improves interactive feel, and sometimes even performance > (ie being able to sleep for shorter sequences of time allows some things > that want "almost realtime" behaviour to avoid busy-looping for those > short waits - improving performance exactly _because_ they put more load on > the scheduler). > > The benchmark that is just about _the_ worst on the scheduler is actually > something like "lmbench", and if you look at profiles for that you'll > notice that system call entry and exit together with the read/write path > ends up being more of a performance issue. > > And you know what? From a user standpoint, improving disk latency is again > a _lot_ more noticeable than scheduler overhead. > > And even more important than performance is being able to read and write > to CD-RW disks without having to know about things like "ide-scsi" etc, > and do it sanely over different bus architectures etc. > > The scheduler simply isn't that important. > > Linus > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 2:35 ` Linus Torvalds 2001-12-18 2:51 ` David Lang @ 2001-12-18 3:08 ` Davide Libenzi 2001-12-18 3:19 ` Davide Libenzi 2001-12-18 14:09 ` Alan Cox 2 siblings, 1 reply; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 3:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: Rik van Riel, Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > Quite frankly, I'd be a _lot_ more interested in making the scheduling > slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a > 100Hz one, _despite_ the fact that it will increase scheduling load even > more. Because it improves interactive feel, and sometimes even performance > (ie being able to sleep for shorter sequences of time allows some things > that want "almost realtime" behaviour to avoid busy-looping for those > short waits - improving performance exactly _because_ they put more load on > the scheduler). I'm ok with increasing HZ but not so ok with decreasing time slices. When you switch a task you have a fixed cost ( tlb, cache image, ... ) that, if you decrease the time slice, gets spread over a shorter run time, raising its percentage impact. A more interactive feel can be achieved by using a real BVT implementation :

- p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+ p->counter += NICE_TO_TICKS(p->nice);

The only problem with this is that, with certain task run patterns, processes can run a long time ( having a high dynamic priority ) before they get scheduled. What I was thinking was something like, in timer.c :

if (p->counter > decay_ticks)
	--p->counter;
else if (++p->timer_ticks >= MAX_RUN_TIME) {
	p->counter -= p->timer_ticks;
	p->timer_ticks = 0;
	p->need_resched = 1;
}

Having MAX_RUN_TIME ~= NICE_TO_TICKS(0) In this way I/O bound tasks can run with high priority, giving a better interactive feel, without running so long that they freeze the system when exiting from a quite long I/O wait. 
- Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 3:08 ` Davide Libenzi @ 2001-12-18 3:19 ` Davide Libenzi 0 siblings, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 3:19 UTC (permalink / raw) To: Davide Libenzi; +Cc: Linus Torvalds, Rik van Riel, Kernel Mailing List On Mon, 17 Dec 2001, Davide Libenzi wrote: > What I was thinking was something like, in timer.c :
>
> if (p->counter > decay_ticks)
> 	--p->counter;
> else if (++p->timer_ticks >= MAX_RUN_TIME) {
> 	p->counter -= p->timer_ticks;
> 	p->timer_ticks = 0;
> 	p->need_resched = 1;
> }

Obviously that code doesn't work :) but the idea is to not permit the task to run more than a maximum time consecutively. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 2:35 ` Linus Torvalds 2001-12-18 2:51 ` David Lang 2001-12-18 3:08 ` Davide Libenzi @ 2001-12-18 14:09 ` Alan Cox 2001-12-18 9:12 ` John Heil ` (3 more replies) 2 siblings, 4 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 14:09 UTC (permalink / raw) To: Linus Torvalds; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List > to CD-RW disks without having to know about things like "ide-scsi" etc, > and do it sanely over different bus architectures etc. > > The scheduler simply isn't that important. The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. That isn't going to go away by sticking heads in sand. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox @ 2001-12-18 9:12 ` John Heil 2001-12-18 15:34 ` degger ` (2 subsequent siblings) 3 siblings, 0 replies; 87+ messages in thread From: John Heil @ 2001-12-18 9:12 UTC (permalink / raw) To: Alan Cox Cc: Linus Torvalds, Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Alan Cox wrote: > Date: Tue, 18 Dec 2001 14:09:16 +0000 (GMT) > From: Alan Cox <alan@lxorguk.ukuu.org.uk> > To: Linus Torvalds <torvalds@transmeta.com> > Cc: Rik van Riel <riel@conectiva.com.br>, > Davide Libenzi <davidel@xmailserver.org>, > Kernel Mailing List <linux-kernel@vger.kernel.org> > Subject: Re: Scheduler ( was: Just a second ) ... > > > to CD-RW disks without having to know about things like "ide-scsi" etc, > > and do it sanely over different bus architectures etc. > > > > The scheduler simply isn't that important. > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > That isn't going to go away by sticking heads in sand. What % of a std 2 cpu do you think it eats? > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - ----------------------------------------------------------------- John Heil South Coast Software Custom systems software for UNIX and IBM MVS mainframes 1-714-774-6952 johnhscs@sc-software.com http://www.sc-software.com ----------------------------------------------------------------- ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox 2001-12-18 9:12 ` John Heil @ 2001-12-18 15:34 ` degger 2001-12-18 18:35 ` Mike Kravetz 2001-12-18 18:48 ` Davide Libenzi 2001-12-18 16:50 ` Mike Kravetz 2001-12-18 17:00 ` Linus Torvalds 3 siblings, 2 replies; 87+ messages in thread From: degger @ 2001-12-18 15:34 UTC (permalink / raw) To: alan; +Cc: linux-kernel On 18 Dec, Alan Cox wrote: > The scheduler is eating 40-60% of the machine on real world 8 cpu > workloads. That isn't going to go away by sticking heads in sand. What about a CONFIG_8WAY which, if set, activates a scheduler that performs better on such nontypical machines? I see and understand both sides' arguments yet I fail to see where the real problem is with having a scheduler that just kicks in _iff_ we're running the kernel on a nontypical kind of machine. This would keep the straightforward scheduler Linus is defending for the single processor machines while providing more performance to heavy SMP machines by having a more complex scheduler better suited for this task. -- Servus, Daniel ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 15:34 ` degger @ 2001-12-18 18:35 ` Mike Kravetz 2001-12-18 18:48 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Mike Kravetz @ 2001-12-18 18:35 UTC (permalink / raw) To: degger; +Cc: alan, linux-kernel On Tue, Dec 18, 2001 at 04:34:57PM +0100, degger@fhm.edu wrote: > What about a CONFIG_8WAY which, if set, activates a scheduler that > performs better on such nontypical machines? I'm pretty sure that we can create a scheduler that works well on an 8-way, and works just as well as the current scheduler on a UP machine. There is already a CONFIG_SMP which is all that should be necessary to distinguish between the two. What may be of more concern is support for different architectures such as HMT and NUMA. What about better scheduler support for people working in the RT embedded space? Each of these seem to have different scheduling requirements. Do people working on these 'non-typical' machines need to create their own scheduler patches? OR is there some 'clean' way to incorporate them into the source tree? -- Mike ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 15:34 ` degger 2001-12-18 18:35 ` Mike Kravetz @ 2001-12-18 18:48 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 18:48 UTC (permalink / raw) To: degger; +Cc: Alan Cox, lkml On Tue, 18 Dec 2001 degger@fhm.edu wrote: > On 18 Dec, Alan Cox wrote: > > > The scheduler is eating 40-60% of the machine on real world 8 cpu > > workloads. That isn't going to go away by sticking heads in sand. > > What about a CONFIG_8WAY which, if set, activates a scheduler that > performs better on such nontypical machines? I see and understand > boths sides arguments yet I fail to see where the real problem is > with having a scheduler that just kicks in _iff_ we're running the > kernel on a nontypical kind of machine. > This would keep the straigtforward scheduler Linus is defending > for the single processor machines while providing more performance > to heavy SMP machines by having a more complex scheduler better suited > for this task. By using a multi queue scheduler with global balancing policy you can keep the core scheduler as is and have the balancing code take care of distributing the load. Obviously that code is under CONFIG_SMP, so it's not even compiled in on UP. In this way you have the same scheduler code running independently, with a lower load on the run queue and a high locality of locking. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox 2001-12-18 9:12 ` John Heil 2001-12-18 15:34 ` degger @ 2001-12-18 16:50 ` Mike Kravetz 2001-12-18 17:22 ` Linus Torvalds 2001-12-18 17:00 ` Linus Torvalds 3 siblings, 1 reply; 87+ messages in thread From: Mike Kravetz @ 2001-12-18 16:50 UTC (permalink / raw) To: Alan Cox Cc: Linus Torvalds, Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote: > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > That isn't going to go away by sticking heads in sand. Can you be more specific as to the workload you are referring to? As someone who has been playing with the scheduler for a while, I am interested in all such workloads. -- Mike ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Mike Kravetz @ 2001-12-18 17:22 ` Linus Torvalds 2001-12-18 17:50 ` Davide Libenzi 0 siblings, 1 reply; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:22 UTC (permalink / raw) To: Mike Kravetz; +Cc: Alan Cox, Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Mike Kravetz wrote: > On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote: > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > > That isn't going to go away by sticking heads in sand. > > Can you be more specific as to the workload you are referring to? > As someone who has been playing with the scheduler for a while, > I am interested in all such workloads. Well, careful: depending on what "%" means, a 8-cpu machine has either "100% max" or "800% max". So are we talking about "we spend 40-60% of all CPU cycles in the scheduler" or are we talking about "we spend 40-60% of the CPU power of _one_ CPU out of 8 in the scheduler". Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates to "The worst scheduler case I've ever seen under a real load spent 5-8% of the machine CPU resources on scheduling". And let's face it, 5-8% is bad, but we're not talking "half the CPU power" here. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:22 ` Linus Torvalds @ 2001-12-18 17:50 ` Davide Libenzi 0 siblings, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 17:50 UTC (permalink / raw) To: Linus Torvalds; +Cc: Mike Kravetz, Alan Cox, Rik van Riel, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > > On Tue, 18 Dec 2001, Mike Kravetz wrote: > > On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote: > > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > > > That isn't going to go away by sticking heads in sand. > > > > Can you be more specific as to the workload you are referring to? > > As someone who has been playing with the scheduler for a while, > > I am interested in all such workloads. > > Well, careful: depending on what "%" means, a 8-cpu machine has either > "100% max" or "800% max". > > So are we talking about "we spend 40-60% of all CPU cycles in the > scheduler" or are we talking about "we spend 40-60% of the CPU power of > _one_ CPU out of 8 in the scheduler". > > Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the > scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates > to "The worst scheduler case I've ever seen under a real load spent 5-8% > of the machine CPU resources on scheduling". > > And let's face it, 5-8% is bad, but we're not talking "half the CPU power" > here. Linus, you're plain right that we can spend days debating about the scheduler load. You have to agree that sharing a single lock/queue between multiple CPUs is, let's say, quite crappy. You agreed that the scheduler is easy and the fix should not take that much time. You said that you're going to accept the solution that is coming out from the mailing list. Why don't we start talking about some solution and code ? Starting from a basic architecture down to the implementation. Alan and Rik are quite "unloaded" now, what do you think ? 
- Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox ` (2 preceding siblings ...) 2001-12-18 16:50 ` Mike Kravetz @ 2001-12-18 17:00 ` Linus Torvalds 2001-12-18 19:17 ` Alan Cox 3 siblings, 1 reply; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:00 UTC (permalink / raw) To: Alan Cox; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Alan Cox wrote: > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > That isn't going to go away by sticking heads in sand. Did you _read_ what I said? We _have_ patches. You apparently have your own set. Fight it out. Don't involve me, because I don't think it's even a challenging thing. I wrote what is _still_ largely the algorithm in 1991, and it's damn near the only piece of code from back then that even _has_ some similarity to the original code still. All the "recompute count when everybody has gone down to zero" was there pretty much from day 1 (*). Which makes me say: "oh, a quick hack from 1991 works on most machines in 2001, so how hard a problem can it be?" Fight it out. People asked whether I was interested, and I said "no". Take a clue: do benchmarks on all the competing patches, and try to create the best one, and present it to me as a done deal. Linus (*) The single biggest change from day 1 is that it used to iterate over a global array of process slots, and for scalability reasons (not CPU scalability, but "max nr of processes in the system" scalability) the array was gotten rid of, giving the current doubly linked list. Everything else that any scheduler person complains about was pretty much there otherwise ;) ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:00 ` Linus Torvalds @ 2001-12-18 19:17 ` Alan Cox 0 siblings, 0 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 19:17 UTC (permalink / raw) To: Linus Torvalds Cc: Alan Cox, Rik van Riel, Davide Libenzi, Kernel Mailing List > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > > That isn't going to go away by sticking heads in sand. > > Did you _read_ what I said? > > We _have_ patches. You apparently have your own set. I did read that mail - but somewhat later. Right now I'm scanning l/k every few days, no more. As to my stuff - everything I propose different to ibm/davide is about cost/speed of ordering or minor optimisations. I don't plan to compete and duplicate work ^ permalink raw reply [flat|nested] 87+ messages in thread
Thread overview: 87+ messages
[not found] <Pine.LNX.4.33.0112181508001.3410-100000@penguin.transmeta.com>
2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel
2001-12-20 4:04 ` Ryan Cumming
2001-12-20 5:39 ` David S. Miller
2001-12-20 5:58 ` Linus Torvalds
2001-12-20 6:01 ` David S. Miller
2001-12-20 22:40 ` Troels Walsted Hansen
2001-12-20 23:55 ` Chris Ricker
2001-12-20 23:59 ` CaT
2001-12-21 0:06 ` Davide Libenzi
2001-12-20 11:29 ` Rik van Riel
2001-12-20 11:34 ` David S. Miller
2001-12-20 5:52 ` Linus Torvalds
2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell
2001-12-20 6:50 ` Ryan Cumming
2001-12-20 6:52 ` Robert Love
2001-12-20 17:39 ` Timothy Covell
[not found] <20011218020456.A11541@redhat.com>
2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds
2001-12-18 16:56 ` Rik van Riel
2001-12-18 17:18 ` Linus Torvalds
2001-12-18 19:04 ` Alan Cox
2001-12-18 21:02 ` Larry McVoy
2001-12-18 21:14 ` David S. Miller
2001-12-18 21:17 ` Larry McVoy
2001-12-18 21:19 ` Rik van Riel
2001-12-18 21:30 ` David S. Miller
2001-12-18 21:18 ` Rik van Riel
2001-12-19 16:50 ` Daniel Phillips
2001-12-18 19:11 ` Mike Galbraith
2001-12-18 19:15 ` Rik van Riel
2001-12-18 17:55 ` Davide Libenzi
2001-12-18 19:43 ` Alexander Viro
2001-12-18 5:59 V Ganesh
-- strict thread matches above, loose matches on Subject: below --
2001-12-18 5:11 Thierry Forveille
2001-12-17 21:41 ` John Heil
2001-12-18 14:31 ` Alan Cox
[not found] <20011217200946.D753@holomorphy.com>
2001-12-18 4:27 ` Linus Torvalds
2001-12-18 4:55 ` William Lee Irwin III
2001-12-18 6:09 ` Linus Torvalds
2001-12-18 6:34 ` Jeff Garzik
2001-12-18 12:23 ` Rik van Riel
2001-12-18 14:29 ` Alan Cox
2001-12-18 17:07 ` Linus Torvalds
2001-12-18 15:51 ` Martin Josefsson
2001-12-18 17:08 ` Linus Torvalds
2001-12-18 16:16 ` Roger Larsson
2001-12-18 17:16 ` Herman Oosthuysen
2001-12-18 17:16 ` Linus Torvalds
2001-12-18 17:21 ` David Mansfield
2001-12-18 17:27 ` Linus Torvalds
2001-12-18 17:54 ` Andreas Dilger
2001-12-18 18:27 ` Doug Ledford
2001-12-18 18:52 ` Andreas Dilger
2001-12-18 19:03 ` Doug Ledford
2001-12-19 9:19 ` Peter Wächtler
2001-12-19 11:05 ` Helge Hafting
2001-12-21 20:23 ` Rob Landley
2001-12-18 18:35 ` Linus Torvalds
2001-12-18 18:58 ` Alan Cox
2001-12-18 19:31 ` Gerd Knorr
2001-12-18 18:25 ` William Lee Irwin III
2001-12-18 14:21 ` Adam Schrotenboer
2001-12-18 18:13 ` Davide Libenzi
2001-12-16 0:13 Just a second Linus Torvalds
2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi
2001-12-17 22:53 ` Linus Torvalds
2001-12-17 23:15 ` Davide Libenzi
2001-12-17 23:18 ` Linus Torvalds
2001-12-17 23:39 ` Davide Libenzi
2001-12-17 23:52 ` Benjamin LaHaise
2001-12-18 1:11 ` Linus Torvalds
2001-12-18 1:46 ` H. Peter Anvin
2001-12-18 5:54 ` Benjamin LaHaise
2001-12-18 6:10 ` Linus Torvalds
2001-12-18 1:54 ` Rik van Riel
2001-12-18 2:35 ` Linus Torvalds
2001-12-18 2:51 ` David Lang
2001-12-18 3:08 ` Davide Libenzi
2001-12-18 3:19 ` Davide Libenzi
2001-12-18 14:09 ` Alan Cox
2001-12-18 9:12 ` John Heil
2001-12-18 15:34 ` degger
2001-12-18 18:35 ` Mike Kravetz
2001-12-18 18:48 ` Davide Libenzi
2001-12-18 16:50 ` Mike Kravetz
2001-12-18 17:22 ` Linus Torvalds
2001-12-18 17:50 ` Davide Libenzi
2001-12-18 17:00 ` Linus Torvalds
2001-12-18 19:17 ` Alan Cox