* Re: Scheduler ( was: Just a second ) ... [not found] <Pine.LNX.4.33.0112181508001.3410-100000@penguin.transmeta.com> @ 2001-12-20 3:50 ` Rik van Riel 2001-12-20 4:04 ` Ryan Cumming ` (3 more replies) 0 siblings, 4 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-20 3:50 UTC (permalink / raw) To: Linus Torvalds Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > The thing is, I'm personally very suspicious of the "features for that > exclusive 0.1%" mentality. Then why do we have sendfile(), or that idiotic sys_readahead() ? (is there _any_ use for sys_readahead() ? at all ?) cheers, Rik -- Shortwave goes a long way: irc.starchat.net #swl http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
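[Editor's note: for readers wondering what the sys_readahead() being questioned above actually does, here is a minimal user-space sketch. It assumes a modern glibc that exposes the readahead() wrapper — at the time of this thread the raw syscall had to be invoked via syscall(2) — and the function name prefetch_file is illustrative, not from the thread.]

```c
/* Sketch: ask the kernel to pre-populate the page cache for a file
 * before sequential reads, using readahead(2).  Assumes a glibc
 * new enough to provide the readahead() wrapper. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 if the file could be opened (the readahead hint itself
 * is best-effort), -1 otherwise. */
int prefetch_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) == 0)
        /* Hint: read the whole file into the page cache now. */
        readahead(fd, 0, st.st_size);

    close(fd);
    return 0;
}
```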
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel @ 2001-12-20 4:04 ` Ryan Cumming 2001-12-20 5:39 ` David S. Miller ` (2 subsequent siblings) 3 siblings, 0 replies; 87+ messages in thread From: Ryan Cumming @ 2001-12-20 4:04 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel, torvalds On December 19, 2001 19:50, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? Damn straight. sendfile(2) had an opportunity to be a real extension of the Unix philosophy. If it was called something like "copy" (to match "read" and "write"), and worked on all fds (even if it didn't do zerocopy, it should still just work), it'd fit in a lot more nicely than even BSD sockets. Alas, as it is, it's more of a wart than an extension. Now, sys_readahead() is pretty much the stupidest thing I've ever heard. If we had a copy(2) syscall, we could do the same thing by: copy(sourcefile, /dev/null, count). I don't think sys_readahead() even qualifies as a wart. -Ryan ^ permalink raw reply [flat|nested] 87+ messages in thread
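[Editor's note: the copy(2) syscall Ryan proposes never existed under that name. The following is a hypothetical user-space sketch of the semantics he describes — move count bytes from any source fd to any destination fd via a plain read/write loop; a real kernel implementation could take zero-copy fast paths where the fd types allow and fall back to something like this otherwise. The name copy_fd is invented for illustration.]

```c
/* Hypothetical sketch of Ryan's generic copy(): transfer "count"
 * bytes from any readable fd to any writable fd.  No such syscall
 * exists; this is the trivial user-space fallback semantics. */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns bytes copied (may be short on EOF), or -1 on error. */
ssize_t copy_fd(int src_fd, int dst_fd, size_t count)
{
    char buf[65536];
    size_t done = 0;

    while (done < count) {
        size_t want = count - done;
        if (want > sizeof(buf))
            want = sizeof(buf);

        ssize_t n = read(src_fd, buf, want);
        if (n == 0)
            break;              /* EOF on the source */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }

        /* Push everything we read, handling short writes. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(dst_fd, buf + off, n - off);
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        done += n;
    }
    return (ssize_t)done;
}
```

Ryan's readahead trick would then be copy_fd(file_fd, devnull_fd, count): reading the file to /dev/null still pulls its pages into the page cache as a side effect.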
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel 2001-12-20 4:04 ` Ryan Cumming @ 2001-12-20 5:39 ` David S. Miller 2001-12-20 5:58 ` Linus Torvalds 2001-12-20 11:29 ` Rik van Riel 2001-12-20 5:52 ` Linus Torvalds 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell 3 siblings, 2 replies; 87+ messages in thread From: David S. Miller @ 2001-12-20 5:39 UTC (permalink / raw) To: riel; +Cc: torvalds, bcrl, alan, davidel, linux-kernel From: Rik van Riel <riel@conectiva.com.br> Date: Thu, 20 Dec 2001 01:50:36 -0200 (BRST) On Tue, 18 Dec 2001, Linus Torvalds wrote: > The thing is, I'm personally very suspicious of the "features for that > exclusive 0.1%" mentality. Then why do we have sendfile(), or that idiotic sys_readahead() ? Sending files over sockets is 99% of what most network servers are actually doing today; it is much more than 0.1% :-) ^ permalink raw reply [flat|nested] 87+ messages in thread
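[Editor's note: the pattern David refers to — a server pushing an on-disk file straight down a socket without bouncing it through user space — looks roughly like this. A sketch only: serve_file is an invented name, out_fd is assumed to be a connected socket (kernels since 2.6.33 accept any writable fd as the destination).]

```c
/* Sketch of the sendfile(2) fast path a network server uses: the
 * kernel copies data directly from the page cache to the socket,
 * with no user-space buffer in between. */
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Send the whole file at "path" down out_fd.  Returns total bytes
 * sent, or -1 on error. */
ssize_t serve_file(int out_fd, const char *path)
{
    int in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;

    struct stat st;
    ssize_t total = -1;
    if (fstat(in_fd, &st) == 0) {
        off_t off = 0;
        total = 0;
        while (off < st.st_size) {
            /* The kernel advances "off" for us as data moves. */
            ssize_t n = sendfile(out_fd, in_fd, &off,
                                 st.st_size - off);
            if (n <= 0) {
                total = -1;
                break;
            }
            total += n;
        }
    }
    close(in_fd);
    return total;
}
```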
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 5:39 ` David S. Miller @ 2001-12-20 5:58 ` Linus Torvalds 2001-12-20 6:01 ` David S. Miller 2001-12-20 11:29 ` Rik van Riel 1 sibling, 1 reply; 87+ messages in thread From: Linus Torvalds @ 2001-12-20 5:58 UTC (permalink / raw) To: David S. Miller; +Cc: riel, bcrl, alan, davidel, linux-kernel On Wed, 19 Dec 2001, David S. Miller wrote: > > Then why do we have sendfile(), or that idiotic sys_readahead() ? > > Sending files over sockets are %99 of what most network servers are > actually doing today, it is much more than 0.1% :-) Well, that was true when the thing was written, but whether anybody _uses_ it any more, I don't know. Tux gets the same effect on its own, and I don't know if Apache defaults to using sendfile or not. readahead was just a personal 5-minute experiment, we can certainly remove that ;) Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 5:58 ` Linus Torvalds @ 2001-12-20 6:01 ` David S. Miller 2001-12-20 22:40 ` Troels Walsted Hansen 0 siblings, 1 reply; 87+ messages in thread From: David S. Miller @ 2001-12-20 6:01 UTC (permalink / raw) To: torvalds; +Cc: riel, bcrl, alan, davidel, linux-kernel From: Linus Torvalds <torvalds@transmeta.com> Date: Wed, 19 Dec 2001 21:58:41 -0800 (PST) Well, that was true when the thing was written, but whether anybody _uses_ it any more, I don't know. Tux gets the same effect on its own, and I don't know if Apache defaults to using sendfile or not. Samba uses it by default, that I know for sure :-) ^ permalink raw reply [flat|nested] 87+ messages in thread
* RE: Scheduler ( was: Just a second ) ... 2001-12-20 6:01 ` David S. Miller @ 2001-12-20 22:40 ` Troels Walsted Hansen 2001-12-20 23:55 ` Chris Ricker 0 siblings, 1 reply; 87+ messages in thread From: Troels Walsted Hansen @ 2001-12-20 22:40 UTC (permalink / raw) To: 'David S. Miller'; +Cc: linux-kernel >From: David S. Miller > From: Linus Torvalds <torvalds@transmeta.com> > Well, that was true when the thing was written, but whether anybody _uses_ > it any more, I don't know. Tux gets the same effect on its own, and I > don't know if Apache defaults to using sendfile or not. > >Samba uses it by default, that I know for sure :-) I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes the word "sendfile" in the source at least. :( Wonder why the sendfile patches were never merged... -- Troels Walsted Hansen ^ permalink raw reply [flat|nested] 87+ messages in thread
* RE: Scheduler ( was: Just a second ) ... 2001-12-20 22:40 ` Troels Walsted Hansen @ 2001-12-20 23:55 ` Chris Ricker 2001-12-20 23:59 ` CaT 2001-12-21 0:06 ` Davide Libenzi 0 siblings, 2 replies; 87+ messages in thread From: Chris Ricker @ 2001-12-20 23:55 UTC (permalink / raw) To: Troels Walsted Hansen; +Cc: 'David S. Miller', World Domination Now! On Thu, 20 Dec 2001, Troels Walsted Hansen wrote: > >From: David S. Miller > > From: Linus Torvalds <torvalds@transmeta.com> > > Well, that was true when the thing was written, but whether anybody > _uses_ > > it any more, I don't know. Tux gets the same effect on its own, and > I > > don't know if Apache defaults to using sendfile or not. > > > >Samba uses it by default, that I know for sure :-) > > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes > the word "sendfile" in the source at least. :( Wonder why the sendfile > patches where never merged... The only real-world source I've noticed actually using sendfile() are some of the better ftp daemons (such as vsftpd). later, chris -- Chris Ricker kaboom@gatech.edu This is a dare to the Bush administration. -- Thurston Moore ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 23:55 ` Chris Ricker @ 2001-12-20 23:59 ` CaT 2001-12-21 0:06 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: CaT @ 2001-12-20 23:59 UTC (permalink / raw) To: Chris Ricker Cc: Troels Walsted Hansen, 'David S. Miller', World Domination Now! On Thu, Dec 20, 2001 at 04:55:55PM -0700, Chris Ricker wrote: > > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes > > the word "sendfile" in the source at least. :( Wonder why the sendfile > > patches where never merged... > > The only real-world source I've noticed actually using sendfile() are some > of the better ftp daemons (such as vsftpd). proftpd uses it also. -- CaT - A high level of technology does not a civilisation make. ^ permalink raw reply [flat|nested] 87+ messages in thread
* RE: Scheduler ( was: Just a second ) ... 2001-12-20 23:55 ` Chris Ricker 2001-12-20 23:59 ` CaT @ 2001-12-21 0:06 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-21 0:06 UTC (permalink / raw) To: Chris Ricker Cc: Troels Walsted Hansen, 'David S. Miller', World Domination Now! On Thu, 20 Dec 2001, Chris Ricker wrote: > On Thu, 20 Dec 2001, Troels Walsted Hansen wrote: > > > >From: David S. Miller > > > From: Linus Torvalds <torvalds@transmeta.com> > > > Well, that was true when the thing was written, but whether anybody > > _uses_ > > > it any more, I don't know. Tux gets the same effect on its own, and > > I > > > don't know if Apache defaults to using sendfile or not. > > > > > >Samba uses it by default, that I know for sure :-) > > > > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes > > the word "sendfile" in the source at least. :( Wonder why the sendfile > > patches where never merged... > > The only real-world source I've noticed actually using sendfile() are some > of the better ftp daemons (such as vsftpd). And XMail :) - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 5:39 ` David S. Miller 2001-12-20 5:58 ` Linus Torvalds @ 2001-12-20 11:29 ` Rik van Riel 2001-12-20 11:34 ` David S. Miller 1 sibling, 1 reply; 87+ messages in thread From: Rik van Riel @ 2001-12-20 11:29 UTC (permalink / raw) To: David S. Miller; +Cc: torvalds, bcrl, alan, davidel, linux-kernel On Wed, 19 Dec 2001, David S. Miller wrote: > From: Rik van Riel <riel@conectiva.com.br> > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? > > Sending files over sockets are %99 of what most network servers are > actually doing today, it is much more than 0.1% :-) The same could be said for AIO, there are a _lot_ of server programs which are heavily overthreaded because of a lack of AIO... cheers, Rik -- Shortwave goes a long way: irc.starchat.net #swl http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
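[Editor's note: the "overthreading" Rik describes comes from parking one blocking thread per outstanding I/O. A sketch of what AIO buys instead, using the POSIX aio_* interface — which glibc itself emulates with threads (older glibc needs -lrt to link it); avoiding exactly that emulation was the point of Ben LaHaise's in-kernel AIO patches under discussion. read_async is an invented helper name.]

```c
/* Sketch: issue a read and keep doing other work instead of
 * blocking a whole thread in read(2).  POSIX AIO interface. */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns bytes read, or -1 on error. */
ssize_t read_async(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    ssize_t n = -1;
    if (aio_read(&cb) == 0) {
        /* A real server would service other clients here and pick
         * up the completion later; we just wait for it. */
        const struct aiocb *const list[1] = { &cb };
        while (aio_error(&cb) == EINPROGRESS)
            aio_suspend(list, 1, NULL);
        if (aio_error(&cb) == 0)
            n = aio_return(&cb);
    }
    close(fd);
    return n;
}
```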
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 11:29 ` Rik van Riel @ 2001-12-20 11:34 ` David S. Miller 0 siblings, 0 replies; 87+ messages in thread From: David S. Miller @ 2001-12-20 11:34 UTC (permalink / raw) To: riel; +Cc: torvalds, bcrl, alan, davidel, linux-kernel From: Rik van Riel <riel@conectiva.com.br> Date: Thu, 20 Dec 2001 09:29:28 -0200 (BRST) On Wed, 19 Dec 2001, David S. Miller wrote: > Sending files over sockets are %99 of what most network servers are > actually doing today, it is much more than 0.1% :-) The same could be said for AIO, there are a _lot_ of server programs which are heavily overthreaded because of a lack of AIO... If you read my most recent responses to Ingo's postings, you'll see that I'm starting to completely agree with you :-) ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel 2001-12-20 4:04 ` Ryan Cumming 2001-12-20 5:39 ` David S. Miller @ 2001-12-20 5:52 ` Linus Torvalds 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell 3 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-20 5:52 UTC (permalink / raw) To: Rik van Riel Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Thu, 20 Dec 2001, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? Hey, I expect others to do things in their tree, and I live by the same rules: I do my stuff openly in my tree. The Apache people actually seemed quite interested in sendfile. Of course, that was before apache seemed to stop worrying about trying to beat others at performance (rightly or wrongly - I think they are right from a pragmatic viewpoint, and wrong from a PR one). And hey, the same way I encourage others to experiment openly with their trees, I experiment with mine. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Scheduler, Can we save some juice ... 2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel ` (2 preceding siblings ...) 2001-12-20 5:52 ` Linus Torvalds @ 2001-12-20 6:33 ` Timothy Covell 2001-12-20 6:50 ` Ryan Cumming 2001-12-20 6:52 ` Robert Love 3 siblings, 2 replies; 87+ messages in thread From: Timothy Covell @ 2001-12-20 6:33 UTC (permalink / raw) To: Rik van Riel, Linus Torvalds Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Wednesday 19 December 2001 21:50, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > The thing is, I'm personally very suspicious of the "features for that > > exclusive 0.1%" mentality. > > Then why do we have sendfile(), or that idiotic sys_readahead() ? > > (is there _any_ use for sys_readahead() ? at all ?) > > cheers, > > Rik OK, here's another 0.1% for you. Considering how Linux SMP doesn't have high CPU affinity, would it be possible to make a patch such that the additional CPUs remain in deep sleep/HALT mode until the first CPU hits a high-water mark of say 90% utilization? I've started doing this by hand with the (x)pulse application. My goal is to save electricity and cut down on excess heat when I'm just browsing the web and not compiling or seti@home'ing. -- timothy.covell@ashavan.org. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler, Can we save some juice ... 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell @ 2001-12-20 6:50 ` Ryan Cumming 2001-12-20 6:52 ` Robert Love 1 sibling, 0 replies; 87+ messages in thread From: Ryan Cumming @ 2001-12-20 6:50 UTC (permalink / raw) To: timothy.covell; +Cc: Kernel Mailing List On December 19, 2001 22:33, Timothy Covell wrote: > OK, here's another 0.1% for you. Considering how Linux SMP > doesn't have high CPU affinity, would it be possible to make a > patch such that the additional CPUs remain in deep sleep/HALT > mode until the first CPU hits a high-water mark of say 90% > utilization? I've started doing this by hand with the (x)pulse > application. My goal is to save electricity and cut down on > excess heat when I'm just browsing the web and not compiling > or seti@home'ing. I seriously doubt there would be a noticeable power consumption or heat difference between two CPUs running HLT half the time, and one CPU running HLT all the time. And I'm downright certain it isn't worth the code complexity even if there were; there is very little (read: no) intersection between the SMP and low-power user base. -Ryan ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler, Can we save some juice ... 2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell 2001-12-20 6:50 ` Ryan Cumming @ 2001-12-20 6:52 ` Robert Love 2001-12-20 17:39 ` Timothy Covell 1 sibling, 1 reply; 87+ messages in thread From: Robert Love @ 2001-12-20 6:52 UTC (permalink / raw) To: timothy.covell Cc: Rik van Riel, Linus Torvalds, Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List On Thu, 2001-12-20 at 01:33, Timothy Covell wrote: > OK, here's another 0.1% for you. Considering how Linux SMP > doesn't have high CPU affinity, would it be possible to make a > patch such that the additional CPUs remain in deep sleep/HALT > mode until the first CPU hits a high-water mark of say 90% > utilization? I've started doing this by hand with the (x)pulse > application. My goal is to save electricity and cut down on > excess heat when I'm just browsing the web and not compiling > or seti@home'ing. You'd probably be better off working against load and not CPU usage, since a single app can hit you at 100% CPU. Load average is the sort of metric you want, since if there is more than 1 task waiting to run on average, you will benefit from multiple CPUs. That said, this would be easy to do in user space using the hotplug CPU patch. Monitor load average (just like any X applet does) and when it crosses over the threshold: "echo 1 > /proc/sys/cpu/2/online" Another solution would be to use CPU affinity to lock init (and thus all tasks) to 0x00000001 or whatever and then start allowing 0x00000002 or whatever when load gets too high. My point: it is awful easy in user space. Robert Love ^ permalink raw reply [flat|nested] 87+ messages in thread
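[Editor's note: Robert's user-space approach can be sketched as a small daemon loop. Note that the /proc/sys/cpu/N/online path comes from the hotplug CPU patch he mentions and is not in a stock kernel of this era (mainline later settled on /sys/devices/system/cpu/cpuN/online); the helper names below are invented for illustration.]

```c
/* Sketch of Robert's suggestion: watch the load average and bring
 * a second CPU online when it crosses a threshold.  The online
 * file path is from the (out-of-tree) hotplug CPU patch. */
#include <stdio.h>

/* Parse the 1-minute load average from a /proc/loadavg line. */
double parse_load1(const char *line)
{
    double load1 = 0.0;
    sscanf(line, "%lf", &load1);
    return load1;
}

/* Write 0 or 1 to a hotplug "online" control file. */
int set_cpu_online(const char *path, int online)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;
    fprintf(f, "%d\n", online);
    fclose(f);
    return 0;
}

/* One pass of the monitor: more than ~1 runnable task on average
 * means a second CPU would actually be used. */
int balance_once(double threshold)
{
    char line[128];
    FILE *f = fopen("/proc/loadavg", "r");
    if (!f || !fgets(line, sizeof(line), f)) {
        if (f)
            fclose(f);
        return -1;
    }
    fclose(f);

    int want_online = parse_load1(line) > threshold;
    return set_cpu_online("/proc/sys/cpu/2/online", want_online);
}
```

Run in a loop with a sleep between passes, this is the load-average version of Robert's "echo 1 > /proc/sys/cpu/2/online" one-liner.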
* Re: Scheduler, Can we save some juice ... 2001-12-20 6:52 ` Robert Love @ 2001-12-20 17:39 ` Timothy Covell 0 siblings, 0 replies; 87+ messages in thread From: Timothy Covell @ 2001-12-20 17:39 UTC (permalink / raw) To: Robert Love; +Cc: linux-kernel On Thursday 20 December 2001 00:52, Robert Love wrote: > On Thu, 2001-12-20 at 01:33, Timothy Covell wrote: > > OK, here's another 0.1% for you. Considering how Linux SMP > > doesn't have high CPU affinity, would it be possible to make a > > patch such that the additional CPUs remain in deep sleep/HALT > > mode until the first CPU hits a high-water mark of say 90% > > utilization? I've started doing this by hand with the (x)pulse > > application. My goal is to save electricity and cut down on > > excess heat when I'm just browsing the web and not compiling > > or seti@home'ing. > > You'd probably be better off working against load and not CPU usage, > since a single app can hit you at 100% CPU. Load average is the sort of > metric you want, since if there is more than 1 task waiting to run on > average, you will benefit from multiple CPUs. > > That said, this would be easy to do in user space using the hotplug CPU > patch. Monitor load average (just like any X applet does) and when it > crosses over the threshold: "echo 1 > /proc/sys/cpu/2/online" > > Another solution would be to use CPU affinity to lock init (and thus all > tasks) to 0x00000001 or whatever and then start allowing 0x00000002 or > whatever when load gets too high. > > My point: it is awful easy in user space. > > Robert Love > You make good points. I'll try the hotplug CPU patch to automate things more than with my simple use of Xpulse, (whose code I could have used if I wanted to get off my butt and write a useful C application.) -- timothy.covell@ashavan.org. ^ permalink raw reply [flat|nested] 87+ messages in thread
[parent not found: <20011218020456.A11541@redhat.com>]
* Re: Scheduler ( was: Just a second ) ... [not found] <20011218020456.A11541@redhat.com> @ 2001-12-18 16:50 ` Linus Torvalds 2001-12-18 16:56 ` Rik van Riel ` (2 more replies) 0 siblings, 3 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 16:50 UTC (permalink / raw) To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Benjamin LaHaise wrote: > On Mon, Dec 17, 2001 at 10:10:30PM -0800, Linus Torvalds wrote: > > > Well, we've got serious chicken and egg problems then. > > > > Why? > > The code can't go into glibc without syscall numbers being reserved. It sure as hell can. And I'll bet $5 USD that glibc wouldn't take the patches anyway before the kernel interfaces are _tested_. > I've posted the code, there are people playing with it. I can't make them > comment. Well, if people aren't interested, then it doesn't _ever_ go in. Remember: we do not add features just because we can. Quite frankly, I don't think you've told that many people. I haven't seen any discussion about the aio stuff on linux-kernel, which may be because you posted several announcements and nobody cared, or it may be that you've only mentioned it fleetingly and people didn't notice. Take a look at how long it took for ext3 to be "standard" - I put them in my tree when I started getting real feedback that it was used and people liked using it. I simply do not like applying patches "just to get users". Not even reservations - because I reserve the right to _never_ apply something if critical review ends up saying that "that doesn't make sense". Quite frankly, the fact that it is being tested out at places like Oracle etc is secondary - those people will use anything. That's proven by history. That doesn't mean that _I_ accept anything. 
Now, the fact that I like the interfaces is actually secondary - it does make me much more likely to include it even in a half-baked thing, but it does NOT mean that I trust my own taste so much that I'd do it "under the covers" with little open discussion, use and modification. Where _is_ the discussion on linux-kernel? Where are the negative comments from Al? (Al _always_ has negative comments and suggestions for improvements, don't try to say that he also liked it unconditionally ;) Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds @ 2001-12-18 16:56 ` Rik van Riel 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 17:55 ` Davide Libenzi 2001-12-18 19:43 ` Alexander Viro 2 siblings, 1 reply; 87+ messages in thread From: Rik van Riel @ 2001-12-18 16:56 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > Where _is_ the discussion on linux-kernel? Which mailing lists do you want to be subscribed to ? ;) Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:56 ` Rik van Riel @ 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 19:04 ` Alan Cox ` (2 more replies) 0 siblings, 3 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:18 UTC (permalink / raw) To: Rik van Riel; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Rik van Riel wrote: > On Tue, 18 Dec 2001, Linus Torvalds wrote: > > > Where _is_ the discussion on linux-kernel? > > Which mailing lists do you want to be subscribed to ? ;) I'm not subscribed to any, thank you very much. I read them through a news gateway, which gives me access to the common ones. And if the discussion wasn't on the common ones, then it wasn't an open discussion. And no, I don't think IRC counts either, sorry. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:18 ` Linus Torvalds @ 2001-12-18 19:04 ` Alan Cox 2001-12-18 21:02 ` Larry McVoy 2001-12-19 16:50 ` Daniel Phillips 2001-12-18 19:11 ` Mike Galbraith 2001-12-18 19:15 ` Rik van Riel 2 siblings, 2 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 19:04 UTC (permalink / raw) To: Linus Torvalds Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List > I'm not subscribed to any, thank you very much. I read them through a news > gateway, which gives me access to the common ones. > > And if the discussion wasn't on the common ones, then it wasn't an open > discussion. If the discussion was on the l/k list then most kernel developers aren't going to read it, because they don't have time to wade through all the crap that doesn't matter to them. > And no, I don't think IRC counts either, sorry. IRC is where most stuff, especially cross-vendor stuff, is initially discussed nowadays, along with kernelnewbies where most of the intro stuff is - but that's discussed rather than formally proposed and studied ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 19:04 ` Alan Cox @ 2001-12-18 21:02 ` Larry McVoy 2001-12-18 21:14 ` David S. Miller 2001-12-18 21:18 ` Rik van Riel 2001-12-19 16:50 ` Daniel Phillips 1 sibling, 2 replies; 87+ messages in thread From: Larry McVoy @ 2001-12-18 21:02 UTC (permalink / raw) To: Alan Cox Cc: Linus Torvalds, Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List Maybe I'm an old stick in the mud, but IRC seems like a big waste of time to me. It's perfect for off the cuff answers and fairly useless for thoughtful answers. We used to write well thought out papers and specifications for OS work. These days if you can't do it in a paragraph on IRC it must not be worth doing, eh? On Tue, Dec 18, 2001 at 07:04:59PM +0000, Alan Cox wrote: > > I'm not subscribed to any, thank you very much. I read them through a news > > gateway, which gives me access to the common ones. > > > > And if the discussion wasn't on the common ones, then it wasn't an open > > discussion. > > If the discussion was on the l/k list then most kernel developers arent > going to read it because tey dont have time to wade through all the crap > that doesnt matter to them. > > > And no, I don't think IRC counts either, sorry. > > IRC is where most stuff, especially cross vendor stuff is initially > discussed nowdays, along with kernelnewbies where most of the intro > stuff is - but thats disussed rather than formally proposed and studied > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:02 ` Larry McVoy @ 2001-12-18 21:14 ` David S. Miller 2001-12-18 21:17 ` Larry McVoy 2001-12-18 21:18 ` Rik van Riel 1 sibling, 1 reply; 87+ messages in thread From: David S. Miller @ 2001-12-18 21:14 UTC (permalink / raw) To: lm; +Cc: alan, torvalds, riel, bcrl, davidel, linux-kernel From: Larry McVoy <lm@bitmover.com> Date: Tue, 18 Dec 2001 13:02:28 -0800 Maybe I'm an old stick in the mud, but IRC seems like a big waste of time to me. It's like being at a Linux conference all the time. :-) It does kind of make sense given that people are so scattered across the planet. Sometimes I want to just grill someone on something, and email would be too much back and forth, IRC is one way to accomplish that. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:14 ` David S. Miller @ 2001-12-18 21:17 ` Larry McVoy 2001-12-18 21:19 ` Rik van Riel 2001-12-18 21:30 ` David S. Miller 0 siblings, 2 replies; 87+ messages in thread From: Larry McVoy @ 2001-12-18 21:17 UTC (permalink / raw) To: David S. Miller; +Cc: lm, alan, torvalds, riel, bcrl, davidel, linux-kernel On Tue, Dec 18, 2001 at 01:14:20PM -0800, David S. Miller wrote: > From: Larry McVoy <lm@bitmover.com> > Date: Tue, 18 Dec 2001 13:02:28 -0800 > > Maybe I'm an old stick in the mud, but IRC seems like a big waste of > time to me. > > It's like being at a Linux conference all the time. :-) > > It does kind of make sense given that people are so scattered across > the planet. Sometimes I want to just grill someone on something, and > email would be too much back and forth, IRC is one way to accomplish > that. Let me introduce you to this neat invention called a telephone. It's the black thing next to your desk, it rings, has buttons. If you push the right buttons, well, it's magic... :-) -- --- Larry McVoy lm at bitmover.com http://www.bitmover.com/lm ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:17 ` Larry McVoy @ 2001-12-18 21:19 ` Rik van Riel 2001-12-18 21:30 ` David S. Miller 1 sibling, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 21:19 UTC (permalink / raw) To: Larry McVoy; +Cc: David S. Miller, alan, torvalds, bcrl, davidel, linux-kernel On Tue, 18 Dec 2001, Larry McVoy wrote: > On Tue, Dec 18, 2001 at 01:14:20PM -0800, David S. Miller wrote: > > From: Larry McVoy <lm@bitmover.com> > > Date: Tue, 18 Dec 2001 13:02:28 -0800 > > > > Maybe I'm an old stick in the mud, but IRC seems like a big waste of > > time to me. > > > > It's like being at a Linux conference all the time. :-) > > Let me introduce you to this neat invention called a telephone. It's > the black thing next to your desk, it rings, has buttons. If you push > the right buttons, well, it's magic... Yeah, but you can't scroll up a page on the phone... (also, talking with multiple people at the same time is kind of annoying in audio, while it's ok on irc) Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:17 ` Larry McVoy 2001-12-18 21:19 ` Rik van Riel @ 2001-12-18 21:30 ` David S. Miller 1 sibling, 0 replies; 87+ messages in thread From: David S. Miller @ 2001-12-18 21:30 UTC (permalink / raw) To: lm; +Cc: alan, torvalds, riel, bcrl, davidel, linux-kernel From: Larry McVoy <lm@bitmover.com> Date: Tue, 18 Dec 2001 13:17:13 -0800 Let me introduce you to this neat invention called a telephone. It's the black thing next to your desk, it rings, has buttons. If you push the right buttons, well, it's magic... I'm not calling Holland every time I want to poke Jens about something in a patch we're working on :-) I hate telephones for technical stuff, because people can call the fucking thing when I am not behind my computer or even worse when I AM behind my computer and I want to concentrate on the code on my screen without being disturbed. With IRC it is MY CHOICE to get involved in the discussion, I can choose to respond or not respond to someone, I can choose to be available or not available at any given time. It's just a real-time version of email. And the "passive, I can ignore you" part is what I like about it. Telephones frankly suck for discussing technical topics. I can't cut and paste pieces of code from my other editor buffer to show you over the phone, as another example as to why. A lot of people like to use telephones specifically because it does not give the other party the option of ignoring you once they pick up the phone. I value the ability to make the choice to ignore people because a lot of ideas I don't give a crap about come under my nose. In fact that may be one of the best parts about Linux development compared to doing stuff at a company, one isn't required to listen to someone's idea or to even read it. If today I don't give a crap about Joe's filesystem idea, hey guess what I'm not going to read any of his emails about the thing. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 21:02 ` Larry McVoy 2001-12-18 21:14 ` David S. Miller @ 2001-12-18 21:18 ` Rik van Riel 1 sibling, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 21:18 UTC (permalink / raw) To: Larry McVoy Cc: Alan Cox, Linus Torvalds, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Larry McVoy wrote: > Maybe I'm an old stick in the mud, but IRC seems like a big waste of > time to me. It's perfect for off the cuff answers and fairly useless > for thoughtful answers. We used to write well thought out papers and > specifications for OS work. These days if you can't do it in a > paragraph on IRC it must not be worth doing, eh? Actually, we tend to use multiple media at the same time. It happens very often that because of some discussion on IRC we end up writing up a few paragraphs and sending it to people by email. For other things, email is clearly too slow, so stuff is done on IRC (eg. walking somebody through a piece of code to identify and agree on a bug). cheers, Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 19:04 ` Alan Cox 2001-12-18 21:02 ` Larry McVoy @ 2001-12-19 16:50 ` Daniel Phillips 1 sibling, 0 replies; 87+ messages in thread From: Daniel Phillips @ 2001-12-19 16:50 UTC (permalink / raw) To: Alan Cox, Linus Torvalds Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On December 18, 2001 08:04 pm, Alan Cox wrote: > > I'm not subscribed to any, thank you very much. I read them through a news > > gateway, which gives me access to the common ones. > > > > And if the discussion wasn't on the common ones, then it wasn't an open > > discussion. > > If the discussion was on the l/k list then most kernel developers arent > going to read it because tey dont have time to wade through all the crap > that doesnt matter to them. Hi Alan, It's AIO we're talking about, right? AIO is interesting to quite a few people. I'd read the thread. I'd also read any background material that Ben would be so kind as to supply. -- Daniel ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 19:04 ` Alan Cox @ 2001-12-18 19:11 ` Mike Galbraith 2001-12-18 19:15 ` Rik van Riel 2 siblings, 0 replies; 87+ messages in thread From: Mike Galbraith @ 2001-12-18 19:11 UTC (permalink / raw) To: Linus Torvalds Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > And no, I don't think IRC counts either, sorry. Well yeah.. it's synchronous IO :) -Mike ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:18 ` Linus Torvalds 2001-12-18 19:04 ` Alan Cox 2001-12-18 19:11 ` Mike Galbraith @ 2001-12-18 19:15 ` Rik van Riel 2 siblings, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 19:15 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > And no, I don't think IRC counts either, sorry. Whether you think it counts or not, IRC is where most stuff is happening nowadays. cheers, Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds 2001-12-18 16:56 ` Rik van Riel @ 2001-12-18 17:55 ` Davide Libenzi 2001-12-18 19:43 ` Alexander Viro 2 siblings, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 17:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > Quite frankly, I don't think you've told that many people. I haven't seen > any discussion about the aio stuff on linux-kernel, which may be because > you posted several announcements and nobody cared, or it may be that > you've only mentioned it fleetingly and people didn't notice. This is not to ask for the inclusion of /dev/epoll in the kernel ( it can be easily merged by users that want to use it ), but I've found its users prefer talking about it off the mailing list. Maybe because they're scared of being eaten by some gurus when asking easy questions :) - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds 2001-12-18 16:56 ` Rik van Riel 2001-12-18 17:55 ` Davide Libenzi @ 2001-12-18 19:43 ` Alexander Viro 2 siblings, 0 replies; 87+ messages in thread From: Alexander Viro @ 2001-12-18 19:43 UTC (permalink / raw) To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > Where are the negative comments from Al? (Al _always_ has negative > comments and suggestions for improvements, don't try to say that he also > liked it unconditionally ;) Heh. Aside from a _big_ problem with exposing an async API to userland (for a lot of reasons, including the usual quality of async code in general and event-drivel one in particular) there is a more specific one - Ben's long-promised full-async writepage() and friends. I'll believe it when I see it and so far it hasn't appeared. So for the time being I'm staying the fsck out of that - I don't like it, but I'm sick and tired of this sort of religious war. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... @ 2001-12-18 5:59 V Ganesh 0 siblings, 0 replies; 87+ messages in thread From: V Ganesh @ 2001-12-18 5:59 UTC (permalink / raw) To: linux-kernel; +Cc: wli In article <20011217205547.C821@holomorphy.com> you wrote: : On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: :> The most likely cause is simply waking up after each sound interrupt: you :> also have a _lot_ of time handling interrupts. Quite frankly, web surfing :> and mp3 playing simply shouldn't use any noticeable amounts of CPU. : I think we have a winner: : /proc/interrupts : ------------------------------------------------ : CPU0 : 0: 17321824 XT-PIC timer : 1: 4 XT-PIC keyboard : 2: 0 XT-PIC cascade : 5: 46490271 XT-PIC soundblaster : 9: 400232 XT-PIC usb-ohci, eth0, eth1 : 11: 939150 XT-PIC aic7xxx, aic7xxx : 14: 13 XT-PIC ide0 : Approximately 4 times more often than the timer interrupt. : That's not nice... a bit offtopic, but the reason why there are so many interrupts is that there's probably something like esd running. I've observed that idle esd manages to generate tons of interrupts, although an strace of esd reveals it stuck in a select(). probably one of the ioctls it issued earlier is causing the driver to continuously read/write to the device. the interrupts stop as soon as you kill esd. : SoundBlaster 16 : A change of hardware should help verify this. it happens even with cs4232 (redhat 7.2, 2.4.7-10smp), so I doubt it's a soundblaster issue. ganesh ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ...
@ 2001-12-18 5:11 Thierry Forveille
2001-12-17 21:41 ` John Heil
2001-12-18 14:31 ` Alan Cox
0 siblings, 2 replies; 87+ messages in thread
From: Thierry Forveille @ 2001-12-18 5:11 UTC (permalink / raw)
To: linux-kernel
Linus Torvalds (torvalds@transmeta.com) writes
> On Mon, 17 Dec 2001, Rik van Riel wrote:
> >
> > Try readprofile some day, chances are schedule() is pretty
> > near the top of the list.
>
> Ehh.. Of course I do readprofile.
>
> But did you ever compare readprofile output to _total_ cycles spent?
>
I have a feeling that this discussion got sidetracked: CPU cycles burnt
in the scheduler are indeed a non-issue, but big tasks being needlessly
moved around on SMPs are worth tackling.
^ permalink raw reply [flat|nested] 87+ messages in thread* Re: Scheduler ( was: Just a second ) ... 2001-12-18 5:11 Thierry Forveille @ 2001-12-17 21:41 ` John Heil 2001-12-18 14:31 ` Alan Cox 1 sibling, 0 replies; 87+ messages in thread From: John Heil @ 2001-12-17 21:41 UTC (permalink / raw) To: Thierry Forveille; +Cc: linux-kernel On Mon, 17 Dec 2001, Thierry Forveille wrote: > Date: Mon, 17 Dec 2001 19:11:10 -1000 (HST) > From: Thierry Forveille <forveill@cfht.hawaii.edu> > To: linux-kernel@vger.kernel.org > Subject: Re: Scheduler ( was: Just a second ) ... > > Linus Torvalds (torvalds@transmeta.com) writes > > On Mon, 17 Dec 2001, Rik van Riel wrote: > > > > > > Try readprofile some day, chances are schedule() is pretty > > > near the top of the list. > > > > Ehh.. Of course I do readprofile. > > > > But did you ever compare readprofile output to _total_ cycles spent? > > > I have a feeling that this discussion got sidetracked: cpu cycles burnt > in the scheduler indeed is non-issue, but big tasks being needlessly moved > around on SMPs is worth tackling. Given a cpu affinity facility, policy mgmt would belong in user space. CPU affinity would be pretty simple and I think the effort is already in flight IIRC. Johnh - ----------------------------------------------------------------- John Heil South Coast Software Custom systems software for UNIX and IBM MVS mainframes 1-714-774-6952 johnhscs@sc-software.com http://www.sc-software.com ----------------------------------------------------------------- ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 5:11 Thierry Forveille 2001-12-17 21:41 ` John Heil @ 2001-12-18 14:31 ` Alan Cox 1 sibling, 0 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 14:31 UTC (permalink / raw) To: Thierry Forveille; +Cc: linux-kernel > I have a feeling that this discussion got sidetracked: cpu cycles burnt > in the scheduler indeed is non-issue, but big tasks being needlessly moved > around on SMPs is worth tackling. It's not a non-issue - 40% of an 8-way box is a lot of lost CPU. Fixing the CPU bounce-around problem also matters a lot - Ingo's speedups seen just by improving that on the current scheduler show it's worth the work. ^ permalink raw reply [flat|nested] 87+ messages in thread
[parent not found: <20011217200946.D753@holomorphy.com>]
* Re: Scheduler ( was: Just a second ) ... [not found] <20011217200946.D753@holomorphy.com> @ 2001-12-18 4:27 ` Linus Torvalds 2001-12-18 4:55 ` William Lee Irwin III 2001-12-18 18:13 ` Davide Libenzi 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 4:27 UTC (permalink / raw) To: William Lee Irwin III; +Cc: Kernel Mailing List [ cc'd back to Linux kernel, in case somebody wants to take a look whether there is something wrong in the sound drivers, for example ] On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > This is no benchmark. This is my home machine it's taking a bite out of. > I'm trying to websurf and play mp3's and read email here. No forkbombs. > No databases. No made-up benchmarks. I don't know what it's doing (or > trying to do) in there but I'd like the CPU cycles back. > > From a recent /proc/profile dump on 2.4.17-pre1 (no patches), my top 5 > (excluding default_idle) are: > -------------------------------------------------------- > 22420 total 0.0168 > 4624 default_idle 96.3333 > 1280 schedule 0.6202 > 1130 handle_IRQ_event 11.7708 > 929 file_read_actor 9.6771 > 843 fast_clear_page 7.5268 The most likely cause is simply waking up after each sound interrupt: you also have a _lot_ of time handling interrupts. Quite frankly, web surfing and mp3 playing simply shouldn't use any noticeable amounts of CPU. The point being that I really doubt it's the scheduler proper, it's probably how it is _used_. And I'd suspect your sound driver (or user) conspires to keep scheduling stuff. For example (and this is _purely_ an example, I don't know if this is your particular case), this sounds like a classic case of "bad buffering". What bad buffering would do is: - you have a sound buffer that the mp3 player tries to keep full - your sound buffer is, let's pick a random number, 64 entries of 1024 bytes each. - the sound card gives an interrupt every time it has emptied a buffer. 
- the mp3 player is waiting on "free space" - we wake up the mp3 player for _every_ sound fragment filled. Do you see what this leads to? We schedule the mp3 task (which gets a high priority because it tends to run for a really short time, filling just 1 small buffer each time) _every_ time a single buffer empties. Even though we have 63 other full buffers. The classic fix for these kinds of things is _not_ to make the scheduler faster. Sure, that would help, but that's not really the problem. The _real_ fix is to use water-marks, and make the sound driver wake up the writing process only when (say) half the buffers have emptied. Now the mp3 player can fill 32 of the buffers at a time, and gets scheduled an order of magnitude less. It doesn't end up waking up every time. Which sound driver are you using, just in case this _is_ the reason? Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
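The water-mark fix Linus describes can be sketched in a few lines of C. This is purely an illustration of the idea with made-up names (NFRAGS, should_wake_writer) — it is not the actual dmabuf.c code, which is more involved:

```c
/* Illustrative sketch of the water-mark idea described above: with 64
 * fragments, wake the writing process only once half of them have
 * drained, instead of on every single DMA-complete interrupt.
 * NFRAGS and should_wake_writer() are hypothetical names, not taken
 * from the real driver. */
#define NFRAGS 64

/* Called from the DMA-complete interrupt path with the current number
 * of empty fragments; returns nonzero when the writer should run. */
int should_wake_writer(int free_frags)
{
    return free_frags >= NFRAGS / 2;
}
```

With per-fragment wakeups the player gets scheduled 64 times per buffer cycle; with the water-mark it runs roughly twice, filling about 32 fragments each time.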
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:27 ` Linus Torvalds @ 2001-12-18 4:55 ` William Lee Irwin III 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 14:21 ` Adam Schrotenboer 2001-12-18 18:13 ` Davide Libenzi 1 sibling, 2 replies; 87+ messages in thread From: William Lee Irwin III @ 2001-12-18 4:55 UTC (permalink / raw) To: Kernel Mailing List; +Cc: torvalds On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > The most likely cause is simply waking up after each sound interrupt: you > also have a _lot_ of time handling interrupts. Quite frankly, web surfing > and mp3 playing simply shouldn't use any noticeable amounts of CPU. I think we have a winner: /proc/interrupts ------------------------------------------------ CPU0 0: 17321824 XT-PIC timer 1: 4 XT-PIC keyboard 2: 0 XT-PIC cascade 5: 46490271 XT-PIC soundblaster 9: 400232 XT-PIC usb-ohci, eth0, eth1 11: 939150 XT-PIC aic7xxx, aic7xxx 14: 13 XT-PIC ide0 Approximately 4 times more often than the timer interrupt. That's not nice... On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > Which sound driver are you using, just in case this _is_ the reason? SoundBlaster 16 A change of hardware should help verify this. Cheers, Bill ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:55 ` William Lee Irwin III @ 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 6:34 ` Jeff Garzik ` (6 more replies) 2001-12-18 14:21 ` Adam Schrotenboer 1 sibling, 7 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 6:09 UTC (permalink / raw) To: William Lee Irwin III; +Cc: Kernel Mailing List, Jeff Garzik On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > 5: 46490271 XT-PIC soundblaster > > Approximately 4 times more often than the timer interrupt. > That's not nice... Yeah. Well, looking at the issue, the problem is probably not just in the sb driver: the soundblaster driver shares the output buffer code with a number of other drivers (there's some horrible "dmabuf.c" code in common). And yes, the dmabuf code will wake up the writer on every single DMA-complete interrupt. Considering that you seem to have them at least 400 times a second (and probably more, unless you've literally had sound going since the machine was booted), I think we know why your setup spends time in the scheduler. > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > Which sound driver are you using, just in case this _is_ the reason? > > SoundBlaster 16 > A change of hardware should help verify this. A number of sound drivers will use the same logic. You may be able to change this more easily some other way, by using a larger fragment size for example. That's up to the sw that actually feeds the sound stream, so it might be your decoder that selects a small fragment size. Quite frankly I don't know the sound infrastructure well enough to make any more intelligent suggestions about other decoders or similar to try, at this point I just start blathering. But yes, I bet you'll also see much less impact of this if you were to switch to more modern hardware. grep grep grep.. Oh, before you do that, how about changing "min_fragment" in sb_audio.c from 5 to something bigger like 9 or 10? 
That audio_devs[devc->dev]->min_fragment = 5; literally means that your minimum fragment size seems to be a rather pathetic 32 bytes (which doesn't mean that your sound will be set to that, but it _might_ be). That sounds totally ridiculous, but maybe I've misunderstood the code. Jeff, you've worked on the sb code at some point - does it really do 32-byte sound fragments? Why? That sounds truly insane if I really parsed that code correctly. That's thousands of separate DMA transfers and interrupts per second.. Raising that min_fragment thing from 5 to 10 would make the minimum DMA buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties in less than 1/100th of a second, but at least it should be < 200 irqs/sec rather than >400). Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
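The arithmetic behind these numbers is easy to check: min_fragment is a log2 value, so 5 means 2^5 = 32-byte fragments and 10 means 1 kB, and one DMA-complete interrupt fires per fragment. A small sketch of the calculation at CD-quality stereo (44100 Hz * 2 channels * 2 bytes = 176400 bytes/sec); the helper names are illustrative, not from the driver:

```c
/* min_fragment is the log2 of the fragment size in bytes:
 * 5 -> 32 bytes, 10 -> 1024 bytes.  One DMA-complete interrupt fires
 * per fragment, so the interrupt rate is the audio byte rate divided
 * by the fragment size. */
long frag_bytes(int min_fragment)
{
    return 1L << min_fragment;
}

long irqs_per_sec(long byte_rate, long frag)
{
    return byte_rate / frag;
}
```

At 176400 bytes/sec, 32-byte fragments mean roughly 5512 interrupts/sec, while 1 kB fragments give roughly 172 - consistent with the "< 200 irqs/sec rather than >400" figure quoted here.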
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds @ 2001-12-18 6:34 ` Jeff Garzik 2001-12-18 12:23 ` Rik van Riel ` (5 subsequent siblings) 6 siblings, 0 replies; 87+ messages in thread From: Jeff Garzik @ 2001-12-18 6:34 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List Linus Torvalds wrote: > Jeff, you've worked on the sb code at some point - does it really do > 32-byte sound fragments? Why? That sounds truly insane if I really parsed > that code correctly. That's thousands of separate DMA transfers > and interrupts per second.. I do not see a hardware minimum fragment size in the HW docs... The default hardware reset frag size is 2048 bytes. So, yes, 32 bytes is pretty small for today's rate. But... I wonder if the fault lies more with the application setting a too-small fragment size and the driver actually allows it to do so, or, the code following this comment in reorganize_buffers in drivers/sound/audio.c needs to be revisited: /* Compute the fragment size using the default algorithm */ Remember this code is from ancient times... probably written way before 44 Khz was common at all. Jeff -- Jeff Garzik | Only so many songs can be sung Building 1024 | with two lips, two lungs, and one tongue. MandrakeSoft | - nomeansno ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 6:34 ` Jeff Garzik @ 2001-12-18 12:23 ` Rik van Riel 2001-12-18 14:29 ` Alan Cox ` (4 subsequent siblings) 6 siblings, 0 replies; 87+ messages in thread From: Rik van Riel @ 2001-12-18 12:23 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Mon, 17 Dec 2001, Linus Torvalds wrote: > On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > > > 5: 46490271 XT-PIC soundblaster > > > > Approximately 4 times more often than the timer interrupt. > > That's not nice... That's not nearly as much as your typical server system runs in network packets and wakeups of the samba/database/http daemons, though ... > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). So you fixed it for the sound driver, nice. We still have the issue that the scheduler can take up lots of time on busy server systems, though. (though I suspect on those systems it probably spends more time recalculating than selecting processes) regards, Rik -- DMCA, SSSCA, W3C? Who cares? http://thefreeworld.net/ http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds 2001-12-18 6:34 ` Jeff Garzik 2001-12-18 12:23 ` Rik van Riel @ 2001-12-18 14:29 ` Alan Cox 2001-12-18 17:07 ` Linus Torvalds 2001-12-18 15:51 ` Martin Josefsson ` (3 subsequent siblings) 6 siblings, 1 reply; 87+ messages in thread From: Alan Cox @ 2001-12-18 14:29 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). The sb driver is fine > A number of sound drivers will use the same logic. Most hardware does > Quite frankly I don't know the sound infrastructure well enough to make > any more intelligent suggestions about other decoders or similar to try, > at this point I just start blathering. some of the sound stuff uses very short fragments to get accurate audio/video synchronization. Some apps also do it gratuitously when they should be using other API's. Its also used sensibly for things like gnome-meeting where its worth trading CPU for latency because 1K of buffering starts giving you earth<->moon type conversations > But yes, I bet you'll also see much less impact of this if you were to > switch to more modern hardware. Not really - the app asked for an event every 32 bytes. This is an app not kernel problem. > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). With a few exceptions the applications tend to use 4K or larger DMA chunks anyway. Very few need tiny chunks. Alan ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:29 ` Alan Cox @ 2001-12-18 17:07 ` Linus Torvalds 0 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:07 UTC (permalink / raw) To: Alan Cox; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, Alan Cox wrote: > > > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > > rather than >400). > > With a few exceptions the applications tend to use 4K or larger DMA chunks > anyway. Very few need tiny chunks. Doing another grep seems to imply that none of the other drivers even allow as small chunks as the sb driver does, 32 byte "events" is just ridiculous. At simple 2-channel, 16-bits, CD-quality sound, that's a DMA event every 0.18 msec (5500 times a second, 181 _micro_seconds apart). I obviously agree that the app shouldn't even ask for small chunks: whether an mp3 player reacts within 1/10th or 1/1000th of a second of the user asking it to switch tracks, nobody can even tell. So an mp3 player should probably use a big fragment size on the order of 4kB or similar (that still gives max fragment latency of 0.022 seconds, faster than humans can react). So it sounds like player silliness, but I don't think the driver should even allow such waste of resources, considering that no other driver allows it either.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
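The latency figures in this message can be checked the same way: a fragment's drain time is its size divided by the audio byte rate. A quick sketch (hypothetical helper name; 64-bit math to avoid overflow on 32-bit longs):

```c
/* Time, in microseconds, for the hardware to drain one fragment.
 * At 44.1 kHz 16-bit stereo the byte rate is 176400 bytes/sec, so a
 * 32-byte fragment drains in ~181 usec (~5500 irqs/sec) while a 4 kB
 * fragment takes ~23 msec - below human reaction time. */
long long frag_drain_usec(long long frag_bytes, long long byte_rate)
{
    return frag_bytes * 1000000LL / byte_rate;
}
```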
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (2 preceding siblings ...) 2001-12-18 14:29 ` Alan Cox @ 2001-12-18 15:51 ` Martin Josefsson 2001-12-18 17:08 ` Linus Torvalds 2001-12-18 16:16 ` Roger Larsson ` (2 subsequent siblings) 6 siblings, 1 reply; 87+ messages in thread From: Martin Josefsson @ 2001-12-18 15:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Mon, 17 Dec 2001, Linus Torvalds wrote: > > On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > > > 5: 46490271 XT-PIC soundblaster > > > > Approximately 4 times more often than the timer interrupt. > > That's not nice... 0: 24867181 XT-PIC timer 5: 9070614 XT-PIC soundblaster After I boot up I start X and then xmms, and then my system plays mp3's almost all the time. > > > Which sound driver are you using, just in case this _is_ the reason? > > > > SoundBlaster 16 I have an old ISA SoundBlaster 16 > Raising that min_fragment thing from 5 to 10 would make the minimum DMA > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). After watching /proc/interrupts at 30-second intervals I see that I only get 43 interrupts/second when playing 16-bit 44.1kHz stereo. And according to vmstat I have 153-158 interrupts/second in total (it's probably the network traffic that increases it a little above 143). /Martin ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 15:51 ` Martin Josefsson @ 2001-12-18 17:08 ` Linus Torvalds 0 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:08 UTC (permalink / raw) To: Martin Josefsson; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, Martin Josefsson wrote: > > After watchning /proc/interrupts with 30 second intervals I see that I > only get 43 interrupts/second when playing 16bit 44.1kHz stereo. That's _exactly_ what you get with a 4kB fragment size. You have a sane player that asks for a sane fragment size. While whatever William uses seems to ask for a really small one.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (3 preceding siblings ...) 2001-12-18 15:51 ` Martin Josefsson @ 2001-12-18 16:16 ` Roger Larsson 2001-12-18 17:16 ` Herman Oosthuysen 2001-12-18 17:16 ` Linus Torvalds 2001-12-18 17:21 ` David Mansfield 2001-12-18 18:25 ` William Lee Irwin III 6 siblings, 2 replies; 87+ messages in thread From: Roger Larsson @ 2001-12-18 16:16 UTC (permalink / raw) To: Linus Torvalds, William Lee Irwin III Cc: Kernel Mailing List, linux-audio-dev, Jeff Garzik This might be of interest on linux-audio-dev too... On Tuesday den 18 December 2001 07.09, Linus Torvalds wrote: > On Mon, 17 Dec 2001, William Lee Irwin III wrote: > > 5: 46490271 XT-PIC soundblaster > > > > Approximately 4 times more often than the timer interrupt. > > That's not nice... > > Yeah. > > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). > > And yes, the dmabuf code will wake up the writer on every single DMA > complete interrupt. Considering that you seem to have them at least 400 > times a second (and probably more, unless you've literally had sound going > since the machine was booted), I think we know why your setup spends time > in the scheduler. > > > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > > Which sound driver are you using, just in case this _is_ the reason? > > > > SoundBlaster 16 > > A change of hardware should help verify this. > > A number of sound drivers will use the same logic. > > You may be able to change this more easily some other way, by using a > larger fragment size for example. That's up to the sw that actually feeds > the sound stream, so it might be your decoder that selects a small > fragment size. 
> > Quite frankly I don't know the sound infrastructure well enough to make > any more intelligent suggestions about other decoders or similar to try, > at this point I just start blathering. > > But yes, I bet you'll also see much less impact of this if you were to > switch to more modern hardware. > > grep grep grep.. Oh, before you do that, how about changing "min_fragment" > in sb_audio.c from 5 to something bigger like 9 or 10? > > That > > audio_devs[devc->dev]->min_fragment = 5; > > literally means that your minimum fragment size seems to be a rather > pathetic 32 bytes (which doesn't mean that your sound will be set to that, > but it _might_ be). That sounds totally ridiculous, but maybe I've > misunderstood the code. I think it really is 32 samples, yes that is little - but too small? It depends on the used sample frequency... Paul Davis wrote this on linux-audio-dev 2001-12-05 "in doing lots of testing on JACK, i've noticed that although the trident driver now works (there were some patches from jaroslav and myself), in general i still get xruns with the lowest possible latency setting for that card (1.3msec per interrupt, 2.6msec buffer). with the same settings on my hammerfall, i don't get xruns, even with substantial system load." > > Jeff, you've worked on the sb code at some point - does it really do > 32-byte sound fragments? Why? That sounds truly insane if I really parsed > that code correctly. That's thousands of separate DMA transfers > and interrupts per second.. > Lets see: we have >1 GHz CPU and interrupts at >1000 Hz => 1 Mcycle / interrupt - is that insane? If the hardware can support it? Why not let it? It is really up to the applications/user to decide... 
> Raising that min_fragment thing from 5 to 10 would make the minimum DMA > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). > Yes, it is probably more reasonable - but if the soundcard can support it? (I have a vision of lots of linux-audio-dev folks pulling out their new soundcard and replacing it with their since long forgotten SB16...) /RogerL -- Roger Larsson Skellefteå Sweden ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:16 ` Roger Larsson @ 2001-12-18 17:16 ` Herman Oosthuysen 2001-12-18 17:16 ` Linus Torvalds 1 sibling, 0 replies; 87+ messages in thread From: Herman Oosthuysen @ 2001-12-18 17:16 UTC (permalink / raw) To: Kernel Mailing List, linux-audio-dev My tuppence worth from a real-time embedded perspective: A shorter time slice and other real-time improvements to the scheduler will certainly improve life to the embedded crowd. Bear in mind that 90% of processors are used for embedded apps. Shorter time slices etc. means smaller buffers, less RAM and lower cost. I don't know what the current distribution is for Linux regarding embedded vs data processing, but the embedded use of Linux is certainly growing rapidly - we expect to make a million thingummyjigs running Linux next year and there are many other companies doing the same. Within the next few years, I expect embedded use of Linux to overshadow data use by a large margin. Since embedded processors are 'invisible' and never in the news, I would be very happy if Linus and others will keep us poor boys in mind... -- Herman Oosthuysen Herman@WirelessNetworksInc.com Suite 300, #3016, 5th Ave NE, Calgary, Alberta, T2A 6K4, Canada Phone: (403) 569-5688, Fax: (403) 235-3965 ----- Original Message ----- > > Lets see: we have >1 GHz CPU and interrupts at >1000 Hz > => 1 Mcycle / interrupt - is that insane? > > If the hardware can support it? Why not let it? It is really up to the > applications/user to decide... > > > Raising that min_fragment thing from 5 to 10 would make the minimum DMA > > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > > rather than >400). > > > > /RogerL > > -- > Roger Larsson > Skellefteå > Sweden ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:16 ` Roger Larsson 2001-12-18 17:16 ` Herman Oosthuysen @ 2001-12-18 17:16 ` Linus Torvalds 1 sibling, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:16 UTC (permalink / raw) To: Roger Larsson Cc: William Lee Irwin III, Kernel Mailing List, linux-audio-dev, Jeff Garzik On Tue, 18 Dec 2001, Roger Larsson wrote: > > Lets see: we have >1 GHz CPU and interrupts at >1000 Hz > => 1 Mcycle / interrupt - is that insane? Ehh.. First off, the CPU may be 1GHz, but the memory subsystem, and the PCI subsystem definitely are _not_. Most PCI cards still run at a (comparatively) leisurely 33MHz, and when we're talking about audio, we're talking about actually having to _access_ that audio device. Yes. At 33MHz, not at 1GHz. Also, at 32-byte fragments, the frequency is actually 5.5kHz, not 1kHz. Now, I seriously doubt the mp3-player actually used 32-byte fragments (it probably just asked for something small, and got it), but let's say it asked for something in the kHz range (ie 256-512 byte frags). That does _not_ equate to "1 Mcycle". It equates to 33 _kilocycles_ in PCI-land, and a PCI read will take several cycles. > If the hardware can support it? Why not let it? It is really up to the > applications/user to decide... Well, this particular user was unhappy with the CPU spending a noticeably amount of time on just web-surfing and mp3-playing. So clearly the _user_ didn't ask for it. And I suspect that the app writer just didn't even realize what he did. He may have used another sound card that didn't even allow small fragments. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (4 preceding siblings ...) 2001-12-18 16:16 ` Roger Larsson @ 2001-12-18 17:21 ` David Mansfield 2001-12-18 17:27 ` Linus Torvalds 2001-12-18 18:25 ` William Lee Irwin III 6 siblings, 1 reply; 87+ messages in thread From: David Mansfield @ 2001-12-18 17:21 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik > > audio_devs[devc->dev]->min_fragment = 5; > Generally speaking, you want to be able to specify about a 1ms fragment, speaking as a realtime audio programmer (no offense Victor...). However, 1ms is 128 bytes at 16bit stereo, but only 32 bytes at 8bit mono. Nobody does 8bit mono, but that's probably why it's there. A lot of drivers seem to have 128 byte as minimum fragment size. Even the high end stuff like the RME hammerfall only go down to 64 byte fragment PER CHANNEL, which is the same as 128 bytes for stereo in the SB 16. > Raising that min_fragment thing from 5 to 10 would make the minimum DMA > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what, > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties > in less than 1/100th of a second, but at least it should be < 200 irqs/sec > rather than >400). Note that the ALSA drivers allow the app to set watermarks for wakeup, while allowing flexibility in fragment size and number. You can essentially say, wake me up when there are at least n fragments empty, and put me to sleep if m fragments are full. David -- /==============================\ | David Mansfield | | david@cobite.com | \==============================/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:21 ` David Mansfield @ 2001-12-18 17:27 ` Linus Torvalds 2001-12-18 17:54 ` Andreas Dilger 2001-12-18 18:58 ` Alan Cox 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:27 UTC (permalink / raw) To: David Mansfield; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, David Mansfield wrote: > > > > audio_devs[devc->dev]->min_fragment = 5; > > > > Generally speaking, you want to be able to specify about a 1ms fragment, > speaking as a realtime audio programmer (no offense Victor...). However, > 1ms is 128 bytes at 16bit stereo, but only 32 bytes at 8bit mono. Nobody > does 8bit mono, but that's probably why it's there. A lot of drivers seem > to have 128 byte as minimum fragment size. Good point. Somebody should really look at "dma_set_fragment", and see whether we can make "min_fragment" be really just a hardware minimum chunk size, but use other heuristics like frequency to cut off the minimum size (ie just do something like /* We want to limit it to 1024 Hz */ min_bytes = freq*channel*bytes_per_channel >> 10; Although I'm not sure we _have_ the frequency at that point: somebody might set the fragment size first, and the frequency later. Maybe the best thing to do is to educate the people who write the sound apps for Linux (somebody was complaining about "esd" triggering this, for example). Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:27 ` Linus Torvalds @ 2001-12-18 17:54 ` Andreas Dilger 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:35 ` Linus Torvalds 2001-12-18 18:58 ` Alan Cox 1 sibling, 2 replies; 87+ messages in thread From: Andreas Dilger @ 2001-12-18 17:54 UTC (permalink / raw) To: Linus Torvalds Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Dec 18, 2001 09:27 -0800, Linus Torvalds wrote: > Maybe the best thing to do is to educate the people who write the sound > apps for Linux (somebody was complaining about "esd" triggering this, for > example). Yes, esd is an interrupt hog, it seems. When reading this thread, I checked, and sure enough I was getting 190 interrupts/sec on the sound card while not playing any sound. I killed esd (which I don't use anyways), and interrupts went to 0/sec when not playing sound. Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) sound card. Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:54 ` Andreas Dilger @ 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:52 ` Andreas Dilger ` (3 more replies) 2001-12-18 18:35 ` Linus Torvalds 1 sibling, 4 replies; 87+ messages in thread From: Doug Ledford @ 2001-12-18 18:27 UTC (permalink / raw) To: Andreas Dilger; +Cc: Kernel Mailing List Andreas Dilger wrote: > On Dec 18, 2001 09:27 -0800, Linus Torvalds wrote: > >>Maybe the best thing to do is to educate the people who write the sound >>apps for Linux (somebody was complaining about "esd" triggering this, for >>example). >> > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > checked, and sure enough I was getting 190 interrupts/sec on the > sound card while not playing any sound. I killed esd (which I don't > use anyways), and interrupts went to 0/sec when not playing sound. > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > sound card. Well, evidently esd and artsd both do this (well, I assume esd does now, it didn't do this in the past). Basically, they both transmit silence over the sound chip when nothing else is going on. So even though you don't hear anything, the same sound output DMA is taking place. That avoids things like nasty pops when you start up the sound hardware for a beep and that sort of thing. It also maintains state, whereas dropping output entirely could result in things like module auto unloading and then reloading on the next beep, etc. Personally, the interrupt count and overhead annoyed me enough that when I started hacking on the i810 sound driver one of my primary goals was to get overhead and interrupt count down. I think I succeeded quite well.
On my current workstation: Context switches per second not playing any sound: 8300 - 8800 Context switches per second playing an MP3: 9200 - 9900 Interrupts per second from sound device: 86 %CPU used when not playing MP3: 0 - 3% (magicdev is a CPU pig once every 2 seconds) %CPU used when playing MP3s: 0 - 4% In any case, it might be worth the original poster's time in figuring out just how much of his lost CPU is because of playing sound and how much is actually caused by the windowing system and all the associated bloat that comes with it nowadays. -- Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford Please check my web site for aic7xxx updates/answers before e-mailing me about problems ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford @ 2001-12-18 18:52 ` Andreas Dilger 2001-12-18 19:03 ` Doug Ledford 2001-12-19 9:19 ` Peter Wächtler ` (2 subsequent siblings) 3 siblings, 1 reply; 87+ messages in thread From: Andreas Dilger @ 2001-12-18 18:52 UTC (permalink / raw) To: Doug Ledford; +Cc: Kernel Mailing List On Dec 18, 2001 13:27 -0500, Doug Ledford wrote: > Andreas Dilger wrote: > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > > checked, and sure enough I was getting 190 interrupts/sec on the > > sound card while not playing any sound. I killed esd (which I don't > > use anyways), and interrupts went to 0/sec when not playing sound. > > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > > sound card. > > Well, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over the > sound chip when nothing else is going on. So even though you don't hear > anything, the same sound output DMA is taking place. That avoids things > like nasty pops when you start up the sound hardware for a beep and that > sort of thing. Hmm, I _do_ notice a pop when the sound hardware is first initialized at boot time, but not when mpg123 starts/stops (without esd running) so I personally don't get any benefit from "the sound of silence". That said, aside from the 190 interrupts/sec from esd, it doesn't appear to use any measurable CPU time by itself. > Context switches per second not playing any sound: 8300 - 8800 > Context switches per second playing an MP3: 9200 - 9900 Hmm, something seems very strange there. On an idle system, I get about 100 context switches/sec, and about 150/sec when playing sound (up to 400/sec when moving the mouse between windows). 9000 cswitches/sec is _very_ high. This is with a text-only player which has no screen output (other than the ID3 info from the currently played song).
Cheers, Andreas -- Andreas Dilger http://sourceforge.net/projects/ext2resize/ http://www-mddsp.enel.ucalgary.ca/People/adilger/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:52 ` Andreas Dilger @ 2001-12-18 19:03 ` Doug Ledford 0 siblings, 0 replies; 87+ messages in thread From: Doug Ledford @ 2001-12-18 19:03 UTC (permalink / raw) To: Andreas Dilger; +Cc: Kernel Mailing List Andreas Dilger wrote: > Hmm, I _do_ notice a pop when the sound hardware is first initialized at > boot time, but not when mpg123 starts/stops (without esd running) so I > personally don't get any benefit from "the sound of silence". That said, > aside from the 190 interrupts/sec from esd, it doesn't appear to use any > measurable CPU time by itself. > > >>Context switches per second not playing any sound: 8300 - 8800 >>Context switches per second playing an MP3: 9200 - 9900 >> > > Hmm, something seems very strange there. On an idle system, I get about > 100 context switches/sec, and about 150/sec when playing sound (up to 400/sec > when moving the mouse between windows). 9000 cswitches/sec is _very_ high. > This is with a text-only player which has no screen output (other than the > ID3 info from the currently played song). I haven't taken the time to track down what's causing all the context switches, but on my system they are indeed "normal". I suspect large numbers of them are a result of interactions between gnome, nautilus, X, xmms, esd, and gnome-xmms. However, I did just track down one reason for it. It's not 8300 - 8800, it's 830 - 880. There appears to be a bug in the procinfo -n1 mode that results in an extra digit getting tacked onto the end of the context switch line. So, take my original numbers and lop off the last digit from the context switch numbers and that's more like what the machine is actually doing. -- Doug Ledford <dledford@redhat.com> http://people.redhat.com/dledford Please check my web site for aic7xxx updates/answers before e-mailing me about problems ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:52 ` Andreas Dilger @ 2001-12-19 9:19 ` Peter Wächtler 2001-12-19 11:05 ` Helge Hafting 2001-12-21 20:23 ` Rob Landley 3 siblings, 0 replies; 87+ messages in thread From: Peter Wächtler @ 2001-12-19 9:19 UTC (permalink / raw) To: Doug Ledford; +Cc: Kernel Mailing List Doug Ledford wrote: > > Andreas Dilger wrote: > > > On Dec 18, 2001 09:27 -0800, Linus Torvalds wrote: > > > >>Maybe the best thing to do is to educate the people who write the sound > >>apps for Linux (somebody was complaining about "esd" triggering this, for > >>example). > >> > > > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > > checked, and sure enough I was getting 190 interrupts/sec on the > > sound card while not playing any sound. I killed esd (which I don't > > use anyways), and interrupts went to 0/sec when not playing sound. > > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > > sound card. > > Well, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over the > sound chip when nothing else is going on. So even though you don't hear > anything, the same sound output DMA is taking place. That avoids things > like nasty pops when you start up the sound hardware for a beep and that > sort of thing. It also maintains state, whereas dropping output entirely > could result in things like module auto unloading and then reloading on the > next beep, etc. Personally, the interrupt count and overhead annoyed me > enough that when I started hacking on the i810 sound driver one of my > primary goals was to get overhead and interrupt count down. I think I > succeeded quite well.
On my current workstation: > > Context switches per second not playing any sound: 8300 - 8800 > Context switches per second playing an MP3: 9200 - 9900 > Interrupts per second from sound device: 86 > %CPU used when not playing MP3: 0 - 3% (magicdev is a CPU pig once every 2 > seconds) > %CPU used when playing MP3s: 0 - 4% > > In any case, it might be worth the original poster's time in figuring out > just how much of his lost CPU is because of playing sound and how much is > actually caused by the windowing system and all the associated bloat that > comes with it nowadays. > Do you really think 8000 context switches are sane? pippin:/var/log # vmstat 1
 procs                  memory    swap        io    system        cpu
 r b w  swpd  free  buff  cache  si so   bi bo   in  cs  us sy id
 2 0 0 100728 4424 121572 27800   0  1    6  6   61  77  98  2  0
 2 0 0 100728 5448 121572 27800   0  0    0 68  112 811  93  7  0
 2 0 0 100728 5448 121572 27800   0  0    0  0  101 776  95  5  0
 3 0 0 100728 4928 121572 27800   0  0    0  0  101 794  92  8  0
having a load ~2.1 (2 seti@home) ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford 2001-12-18 18:52 ` Andreas Dilger 2001-12-19 9:19 ` Peter Wächtler @ 2001-12-19 11:05 ` Helge Hafting 2001-12-21 20:23 ` Rob Landley 3 siblings, 0 replies; 87+ messages in thread From: Helge Hafting @ 2001-12-19 11:05 UTC (permalink / raw) To: Doug Ledford, linux-kernel Doug Ledford wrote: > Weel, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over the > sound chip when nothing else is going on. So even though you don't hear > anything, the same sound output DMA is taking place. Uuurgh. :-( > That avoids things > like nasty pops when you start up the sound hardware for a beep and that Yuk, bad hardware. Pops when you start or stop writing? You don't even have to turn the volume off or something to get a pop? Toss it. > sort of thing. It also maintains state where as dropping output entirely > could result in things like module auto unloading and then reloading on the > next beep, etc. Much better solved by having the device open, but not writing anything. Open devices don't unload. Helge Hafting ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:27 ` Doug Ledford ` (2 preceding siblings ...) 2001-12-19 11:05 ` Helge Hafting @ 2001-12-21 20:23 ` Rob Landley 3 siblings, 0 replies; 87+ messages in thread From: Rob Landley @ 2001-12-21 20:23 UTC (permalink / raw) To: Doug Ledford, Andreas Dilger; +Cc: Kernel Mailing List On Tuesday 18 December 2001 01:27 pm, Doug Ledford wrote: > Well, evidently esd and artsd both do this (well, I assume esd does now, it > didn't do this in the past). Basically, they both transmit silence over > the sound chip when nothing else is going on. So even though you don't > hear anything, the same sound output DMA is taking place. That avoids THAT explains it. My Dell Inspiron 3500 laptop's built-in sound (NeoMagic MagicMedia 256 AV, uses ad1848 module) works fine when I first boot the sucker, but loses its marbles after an APM suspend and stops receiving interrupts. (Extensive poking around with setpci has so far failed to get it working again, but on a shutdown and restart the bios sets it up fine. Not a clue what's up there. The bios and module agree it's using IRQ 7, but lspci insists it's IRQ 11, both before and after apm suspend. Boggle.) I was confused for a while about how exactly it was failing because KDE and mpg123 from the command line fail in different ways. mpg123 will play the same half-second clip in a loop (ahah! no interrupt!), but sound in kde just vanishes and I get silence and hung apps whenever I try to launch anything. The clue is that it doesn't always fail when I suspend it without having X up. Translation: maybe the sound card's getting hosed by being open and in use on APM shutdown! Hmmm... I should poke at this over the weekend... (Nope, not a new problem. My laptop's sound has been like this since at least 2.4.4, which I think was the first version I installed on the box. But it's still annoying, I can go weeks without a true reboot 'cause I have a zillion konqueror windows and such open.
I have to clear my desktop to get sound working again for a few hours. Obnoxious...) Rob ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:54 ` Andreas Dilger 2001-12-18 18:27 ` Doug Ledford @ 2001-12-18 18:35 ` Linus Torvalds 1 sibling, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 18:35 UTC (permalink / raw) To: Andreas Dilger Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List, Jeff Garzik On Tue, 18 Dec 2001, Andreas Dilger wrote: > > Yes, esd is an interrupt hog, it seems. When reading this thread, I > checked, and sure enough I was getting 190 interrupts/sec on the > sound card while not playing any sound. I killed esd (which I don't > use anyways), and interrupts went to 0/sec when not playing sound. > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S) > sound card. 190 interrupts / sec sounds excessive, but not wildly so. The interrupt per se is not going to be a CPU hog unless the sound card does programmed IO to fill the data queues, and while that is not unheard of, I don't think such a card has been made in the last five years. Obviously getting 190 irq's per second even when not actually _doing_ anything is a total waste of CPU, and is bad form. There may be some reason why esd does it, most probably for good synchronization between sound events and to avoid popping when the sound is shut down (many sound drivers seem to pop a bit on open/close, possibly due to driver bugs, but possibly because of some hard-to-avoid-programmatically hardware glitch when powering down the logic). So waiting a while with the driver active may actually be a reasonable thing to do, although I suspect that after long sequences of silence "esd" should really shut down for a while (and "long" here is probably on the order of seconds, not minutes).
What _really_ ends up hurting performance is probably not the interrupt per se (although it is noticeable), but the fact that we wake up and cause a schedule - which often blows any CPU caches, making the _next_ interrupt also be more expensive than it would possibly need to be. The code for that (in the case of drivers that use the generic "dmabuf.c" infrastructure) seems to be in "finish_output_interrupt()", and I suspect that it could be improved with something like dmap = adev->dmap_out; lim = dmap->nbufs; if (lim < 2) lim = 2; if (dmap->qlen <= lim/2) { ... } around the current unconditional wakeups. Yeah, yeah, untested, stupid example, the idea being that we only wake up if we have at least half the frags free now, instead of waking up for _every_ fragment that frees up. The above is just as a suggestion for some testing, if somebody actually feels like trying it out. It probably won't be good as-is, but as a starting point.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:27 ` Linus Torvalds 2001-12-18 17:54 ` Andreas Dilger @ 2001-12-18 18:58 ` Alan Cox 2001-12-18 19:31 ` Gerd Knorr 1 sibling, 1 reply; 87+ messages in thread From: Alan Cox @ 2001-12-18 18:58 UTC (permalink / raw) To: Linus Torvalds Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List, Jeff Garzik > Maybe the best thing to do is to educate the people who write the sound > apps for Linux (somebody was complaining about "esd" triggering this, for > example). esd is a culprit, and artsd to an extent. esd is scheduled to die so artsd is the big one to tidy. Kernel side OSS is dead so it's a matter for ALSA ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 18:58 ` Alan Cox @ 2001-12-18 19:31 ` Gerd Knorr 0 siblings, 0 replies; 87+ messages in thread From: Gerd Knorr @ 2001-12-18 19:31 UTC (permalink / raw) To: linux-kernel > Kernel side OSS is dead What do you mean by "Kernel side OSS"? Only Hannu's OSS/free drivers? Or all current kernel drivers which support the OSS API, including most (all?) PCI sound drivers which don't use any old OSS/free code? Gerd -- #define ENOCLUE 125 /* userland programmer induced race condition */ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 6:09 ` Linus Torvalds ` (5 preceding siblings ...) 2001-12-18 17:21 ` David Mansfield @ 2001-12-18 18:25 ` William Lee Irwin III 6 siblings, 0 replies; 87+ messages in thread From: William Lee Irwin III @ 2001-12-18 18:25 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Jeff Garzik On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote: > Well, looking at the issue, the problem is probably not just in the sb > driver: the soundblaster driver shares the output buffer code with a > number of other drivers (there's some horrible "dmabuf.c" code in common). > And yes, the dmabuf code will wake up the writer on every single DMA > complete interrupt. Considering that you seem to have them at least 400 > times a second (and probably more, unless you've literally had sound going > since the machine was booted), I think we know why your setup spends time > in the scheduler. > A number of sound drivers will use the same logic. I've chucked the sb32 and plugged in the emu10k1 I had been planning to install for a while, to good effect. It's not an ISA sb16, but it apparently uses the same driver. I'm getting an overall 1% reduction in system load, and the following "top 5" profile: 53374 total 0.0400 11430 default_idle 238.1250 8820 handle_IRQ_event 91.8750 2186 do_softirq 10.5096 1984 schedule 1.2525 1612 number 1.4816 1473 __generic_copy_to_user 18.4125 Oddly, I'm getting even more interrupts than I saw with the sb32... 0: 2752924 XT-PIC timer 9: 14223905 XT-PIC EMU10K1, eth1 (eth1 generates orders of magnitude fewer interrupts than the timer) On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote: > You may be able to change this more easily some other way, by using a > larger fragment size for example. That's up to the sw that actually feeds > the sound stream, so it might be your decoder that selects a small > fragment size. 
> Quite frankly I don't know the sound infrastructure well enough to make > any more intelligent suggestions about other decoders or similar to try, > at this point I just start blathering. Already more insight into the problem I was experiencing than I had before, and I must confess to those such as myself this lead certainly seems "plucked out of the air". Good work! =) On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote: > But yes, I bet you'll also see much less impact of this if you were to > switch to more modern hardware. I hear from elsewhere the emu10k1 has a bad reputation as source of excessive interrupts. Looks like I bought the wrong sound card(s). Maybe I should go shopping. =) Thanks a bunch! Bill ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:55 ` William Lee Irwin III 2001-12-18 6:09 ` Linus Torvalds @ 2001-12-18 14:21 ` Adam Schrotenboer 1 sibling, 0 replies; 87+ messages in thread From: Adam Schrotenboer @ 2001-12-18 14:21 UTC (permalink / raw) To: Kernel Mailing List On Monday 17 December 2001 23:55, William Lee Irwin III wrote: > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > The most likely cause is simply waking up after each sound interrupt: you > > also have a _lot_ of time handling interrupts. Quite frankly, web surfing > > and mp3 playing simply shouldn't use any noticeable amounts of CPU. > > I think we have a winner: > /proc/interrupts > ------------------------------------------------ > CPU0 > 0: 17321824 XT-PIC timer > 1: 4 XT-PIC keyboard > 2: 0 XT-PIC cascade > 5: 46490271 XT-PIC soundblaster > 9: 400232 XT-PIC usb-ohci, eth0, eth1 > 11: 939150 XT-PIC aic7xxx, aic7xxx > 14: 13 XT-PIC ide0 > > Approximately 4 times more often than the timer interrupt. > That's not nice... FWIW, I have an ES1371 based sound card, and mpg123 drives it at 172 interrupts/sec (calculated in procinfo). But that _is_ only when playing. And (my slightly hacked) timidity drives my card w/ only 23(@48kHz sample rate; 21 @ 44.1kHz) interrupts/sec Is this 172 figure right? (Not through esd either. I almost always turn it off, and so recompiled mpg123 to use the std OSS driver) > > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote: > > Which sound driver are you using, just in case this _is_ the reason? > > SoundBlaster 16 > A change of hardware should help verify this. > > > Cheers, > Bill > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 4:27 ` Linus Torvalds 2001-12-18 4:55 ` William Lee Irwin III @ 2001-12-18 18:13 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 18:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > The most likely cause is simply waking up after each sound interrupt: you > also have a _lot_ of time handling interrupts. Quite frankly, web surfing > and mp3 playing simply shouldn't use any noticeable amounts of CPU. It must be noted that waking up a task is going to take two lock operations ( and two unlock ), one in try_to_wakeup() and the other one in schedule(). This doubles the frequency seen by the lock. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Just a second ...
@ 2001-12-16 0:13 Linus Torvalds
2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi
0 siblings, 1 reply; 87+ messages in thread
From: Linus Torvalds @ 2001-12-16 0:13 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Kernel Mailing List
On Sat, 15 Dec 2001, Davide Libenzi wrote:
>
> when you find 10 secs free in your spare time I really would like to know
> the reason ( if any ) of your abstention from any scheduler discussion.
> No hurry, just a few lines out of lkml.
I just don't find it very interesting. The scheduler is about 100 lines
out of however-many-million (3.8 at last count), and doesn't even impact
most normal performance very much.
We'll clearly do per-CPU runqueues or something some day. And that worries
me not one whit, compared to things like VM and block device layer ;)
I know a lot of people think schedulers are important, and the operating
system theory about them is overflowing - it's one of those things that
people can argue about forever, yet is conceptually simple enough that
people aren't afraid of it. I just personally never found it to be a major
issue.
Let's face it - the current scheduler has the same old basic structure
that it did almost 10 years ago, and yes, it's not optimal, but there
really aren't that many real-world loads where people really care. I'm
sorry, but it's true.
And you have to realize that there are not very many things that have
aged as well as the scheduler. Which is just another proof that scheduling
is easy.
We've rewritten the VM several times in the last ten years, and I expect
it will be changed several more times in the next few years. Within five
years we'll almost certainly have to make the current three-level page
tables be four levels etc.
In comparison to those kinds of issues, I suspect that making the
scheduler use per-CPU queues together with some inter-CPU load balancing
logic is probably _trivial_. Patches already exist, and I don't feel that
people can screw up the few hundred lines too badly.
Linus
^ permalink raw reply [flat|nested] 87+ messages in thread* Scheduler ( was: Just a second ) ... 2001-12-16 0:13 Just a second Linus Torvalds @ 2001-12-17 22:48 ` Davide Libenzi 2001-12-17 22:53 ` Linus Torvalds 0 siblings, 1 reply; 87+ messages in thread From: Davide Libenzi @ 2001-12-17 22:48 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List On Sat, 15 Dec 2001, Linus Torvalds wrote: > I just don't find it very interesting. The scheduler is about 100 lines > out of however-many-million (3.8 at last count), and doesn't even impact > most normal performance very much. Linus, sharing queue and lock between CPUs for a "thing" accessed at high frequency ( schedule()s + wakeup()s ) like the scheduler is quite ugly and it's not that much fun. And it's not only performance wise, it's more design wise. > We'll clearly do per-CPU runqueues or something some day. And that worries > me not one whit, compared to things like VM and block device layer ;) Why not 2.5.x ? > I know a lot of people think schedulers are important, and the operating > system theory about them is overflowing - ... It's no more important than anything else, it's just one of the remaining scalability/design issues. No, it's not more important than VM but there're enough people working on VM. And the hope is to get the scheduler right with an ETA of less than 10 years. > it's one of those things that people can argue about forever, ... Yes, I suppose that if something is not addressed, it'll come up again and again. > yet is conceptually simple enough that people aren't afraid of it. ^^^^^^^^^^^^^^^^^^^ 1, ... > Let's face it - the current scheduler has the same old basic structure > that it did almost 10 years ago, and yes, it's not optimal, but there > really aren't that many real-world loads where people really care. I'm > sorry, but it's true. Moving to 4, 8, 16 CPUs, the run queue load that would be thought insane for UP systems starts to matter.
Just to leave out cache line effects. Just to leave out the way the current scheduler moves tasks around CPUs. Linus, it's not only about performance benchmarks with 2451 processes jumping on the run queue, that I could not care less about, it's just a sum of sucky "things" that make an issue. You can look at it like a cosmetic/design patch more than a strict performance patch if you like. > And you have to realize that there are not very many things that have > aged as well as the scheduler. Which is just another proof that > scheduling is easy. ^^^^^^^^^^^^^^^^^^ ..., 2, ... > We've rewritten the VM several times in the last ten years, and I expect > it will be changed several more times in the next few years. Within five > years we'll almost certainly have to make the current three-level page > tables be four levels etc. > > In comparison to those kinds of issues, I suspect that making the > scheduler use per-CPU queues together with some inter-CPU load balancing > logic is probably _trivial_. ^^^^^^^^^ ... 3, there should be a subliminal message inside but I'm not able to get it ;) I would not call selecting the right task to run in an SMP system trivial. The difference between selecting the right task to run and selecting the right page to swap is that if you screw up with the task the system impact is lower. But, if you screw up, your design will suck in both cases. Anyway, given that 1) real men do VM ( I thought they didn't eat quiche ) and easy-coders do scheduling 2) the scheduler is easy/trivial and you do not seem interested in working on it 3) whoever is doing the scheduler cannot screw up things, why don't you give the responsibility for example to Alan or Ingo so that a discussion ( obviously easy ) about the future of the scheduler can be started w/out hurting real men doing VM ?
I'm talking about, you know, that kind of discussion where people bring solutions, code and numbers, they talk about the good and bad of certain approaches and they finally come up ( after some sane fight ) with a more or less widely approved solution. The scheduler, besides the real men crap, is one of the basic components of an OS, and having a public debate, I'm not saying every month and neither every year, but at least once every four years ( this is the last I remember ) could be a nice thing. And no, if you do not give to someone that you trust the "power" to redesign the scheduler, no scheduler discussions will start simply because people don't like the result of a debate to be dumped to /dev/null. > Patches already exist, and I don't feel that people can screw up the few > hundred lines too badly. Can you point me to a Linux patch that implements _real_independent_ ( queue and locking ) CPU schedulers with global balancing policy ? I searched very badly but I did not find anything. Yours faithfully, Jimmy Scheduler ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi @ 2001-12-17 22:53 ` Linus Torvalds 2001-12-17 23:15 ` Davide Libenzi 2001-12-18 1:54 ` Rik van Riel 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-17 22:53 UTC (permalink / raw) To: Davide Libenzi; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Davide Libenzi wrote: > On Sat, 15 Dec 2001, Linus Torvalds wrote: > > > I just don't find it very interesting. The scheduler is about 100 lines > > out of however-many-million (3.8 at last count), and doesn't even impact > > most normal performance very much. > > Linus, sharing queue and lock between CPUs for a "thing" as highly frequently > ( schedule()s + wakeup()s ) accessed as the scheduler is quite ugly > and it's not that much fun. And it's not only performance wise, it's > more design wise. "Design wise" is highly overrated. Simplicity is _much_ more important, if something is commonly only done a few hundred times a second. Locking overhead is basically zero for that case. > > We'll clearly do per-CPU runqueues or something some day. And that worries > > me not one whit, compared to things like VM and block device layer ;) > > Why not 2.5.x ? Maybe. But read the rest of the sentence. There are issues that are about a million times more important. > Moving to 4, 8, 16 CPUs the run queue load, that would be thought insane > for UP systems, starts to matter. 4 cpu's are "high end" today. We can probably point to tens of thousands of UP machines for each 4-way out there. The ratio gets even worse for 8, and 16 CPU's is basically a rounding error. You have to prioritize. Scheduling overhead is way down the list. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 22:53 ` Linus Torvalds @ 2001-12-17 23:15 ` Davide Libenzi 2001-12-17 23:18 ` Linus Torvalds 2001-12-18 1:54 ` Rik van Riel 1 sibling, 1 reply; 87+ messages in thread From: Davide Libenzi @ 2001-12-17 23:15 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > > On Mon, 17 Dec 2001, Davide Libenzi wrote: > > > On Sat, 15 Dec 2001, Linus Torvalds wrote: > > > > > I just don't find it very interesting. The scheduler is about 100 lines > > > out of however-many-million (3.8 at last count), and doesn't even impact > > > most normal performance very much. > > > > Linus, sharing queue and lock between CPUs for a "thing" as highly frequently > > ( schedule()s + wakeup()s ) accessed as the scheduler is quite ugly > > and it's not that much fun. And it's not only performance wise, it's > > more design wise. > > "Design wise" is highly overrated. > > Simplicity is _much_ more important, if something is commonly only done a > few hundred times a second. Locking overhead is basically zero for that > case. Few hundred is a nice definition because you can basically range from 0 to infinity. Anyway I agree that we can spend days debating about what this "few hundred" translates to, and I do not really want to. > 4 cpu's are "high end" today. We can probably point to tens of thousands > of UP machines for each 4-way out there. The ratio gets even worse for 8, > and 16 CPU's is basically a rounding error. > > You have to prioritize. Scheduling overhead is way down the list. You don't really have to serialize/prioritize, old Latins used to say "Divide Et Impera" ;) - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:15 ` Davide Libenzi @ 2001-12-17 23:18 ` Linus Torvalds 2001-12-17 23:39 ` Davide Libenzi 2001-12-17 23:52 ` Benjamin LaHaise 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-17 23:18 UTC (permalink / raw) To: Davide Libenzi; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Davide Libenzi wrote: > > > > You have to prioritize. Scheduling overhead is way down the list. > > You don't really have to serialize/prioritize, old Latins used to say > "Divide Et Impera" ;) Well, you explicitly _asked_ me why I had been silent on the issue. I told you. I also told you that I thought it wasn't that big of a deal, and that patches already exist. So I'm letting the patches fight it out among the people who _do_ care. Then, eventually, I'll do something about it, when we have a winner. If that isn't "Divide et Impera", I don't know _what_ is. Remember: the romans didn't much care for their subjects. They just wanted the glory, and the taxes. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:18 ` Linus Torvalds @ 2001-12-17 23:39 ` Davide Libenzi 2001-12-17 23:52 ` Benjamin LaHaise 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-17 23:39 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > So I'm letting the patches fight it out among the people who _do_ care. > > Then, eventually, I'll do something about it, when we have a winner. > > If that isn't "Divide et Impera", I don't know _what_ is. Remember: the > romans didn't much care for their subjects. They just wanted the glory, > and the taxes. Just like today, everyone I talk to wants glory, and everyone I talk to wants to _not_ pay taxes. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:18 ` Linus Torvalds 2001-12-17 23:39 ` Davide Libenzi @ 2001-12-17 23:52 ` Benjamin LaHaise 2001-12-18 1:11 ` Linus Torvalds 1 sibling, 1 reply; 87+ messages in thread From: Benjamin LaHaise @ 2001-12-17 23:52 UTC (permalink / raw) To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote: > Well, you explicitly _asked_ me why I had been silent on the issue. I told > you. Well, what about those of us who need syscall numbers assigned for which you are the only official assigned number registry? -ben ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 23:52 ` Benjamin LaHaise @ 2001-12-18 1:11 ` Linus Torvalds 2001-12-18 1:46 ` H. Peter Anvin 2001-12-18 5:54 ` Benjamin LaHaise 0 siblings, 2 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 1:11 UTC (permalink / raw) To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List On Mon, 17 Dec 2001, Benjamin LaHaise wrote: > On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote: > > Well, you explicitly _asked_ me why I had been silent on the issue. I told > > you. > > Well, what about those of us who need syscall numbers assigned for which > you are the only official assigned number registry? I've told you a number of times that I'd like to see the preliminary implementation publicly discussed and some uses outside of private companies that I have no insight into.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 1:11 ` Linus Torvalds @ 2001-12-18 1:46 ` H. Peter Anvin 2001-12-18 5:54 ` Benjamin LaHaise 1 sibling, 0 replies; 87+ messages in thread From: H. Peter Anvin @ 2001-12-18 1:46 UTC (permalink / raw) To: linux-kernel Followup to: <Pine.LNX.4.33.0112171710160.2035-100000@penguin.transmeta.com> By author: Linus Torvalds <torvalds@transmeta.com> In newsgroup: linux.dev.kernel > > I've told you a number of times that I'd like to see the preliminary > implementation publicly discussed and some uses outside of private > companies that I have no insight into.. > There was a group at IBM who presented on an alternate SMP scheduler at this year's OLS; it generated quite a bit of good discussion. -hpa -- <hpa@transmeta.com> at work, <hpa@zytor.com> in private! "Unix gives you enough rope to shoot yourself in the foot." http://www.zytor.com/~hpa/puzzle.txt <amsp@zytor.com> ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 1:11 ` Linus Torvalds 2001-12-18 1:46 ` H. Peter Anvin @ 2001-12-18 5:54 ` Benjamin LaHaise 2001-12-18 6:10 ` Linus Torvalds 1 sibling, 1 reply; 87+ messages in thread From: Benjamin LaHaise @ 2001-12-18 5:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List On Mon, Dec 17, 2001 at 05:11:09PM -0800, Linus Torvalds wrote: > I've told you a number of times that I'd like to see the preliminary > implementation publicly discussed and some uses outside of private > companies that I have no insight into.. Well, we've got serious chicken and egg problems then. -ben -- Fish. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 5:54 ` Benjamin LaHaise @ 2001-12-18 6:10 ` Linus Torvalds 0 siblings, 0 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 6:10 UTC (permalink / raw) To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Benjamin LaHaise wrote: > On Mon, Dec 17, 2001 at 05:11:09PM -0800, Linus Torvalds wrote: > > I've told you a number of times that I'd like to see the preliminary > > implementation publicly discussed and some uses outside of private > > companies that I have no insight into.. > > Well, we've got serious chicken and egg problems then. Why? I'd rather have people playing around with new system calls and _test_ them, and then have to recompile their apps if the system calls move later, than introduce new system calls that haven't gotten any public testing at all.. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-17 22:53 ` Linus Torvalds 2001-12-17 23:15 ` Davide Libenzi @ 2001-12-18 1:54 ` Rik van Riel 2001-12-18 2:35 ` Linus Torvalds 1 sibling, 1 reply; 87+ messages in thread From: Rik van Riel @ 2001-12-18 1:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > You have to prioritize. Scheduling overhead is way down the list. That's not what the profiling on my UP machine indicates, let alone on SMP machines. Try readprofile some day, chances are schedule() is pretty near the top of the list. regards, Rik -- Shortwave goes a long way: irc.starchat.net #swl http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 1:54 ` Rik van Riel @ 2001-12-18 2:35 ` Linus Torvalds 2001-12-18 2:51 ` David Lang ` (2 more replies) 0 siblings, 3 replies; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 2:35 UTC (permalink / raw) To: Rik van Riel; +Cc: Davide Libenzi, Kernel Mailing List On Mon, 17 Dec 2001, Rik van Riel wrote: > > Try readprofile some day, chances are schedule() is pretty > near the top of the list. Ehh.. Of course I do readprofile. But did you ever compare readprofile output to _total_ cycles spent? The fact is, it's not even noticeable under any normal loads, and _definitely_ not on UP except with totally made up benchmarks that just pass tokens around or yield all the time. Because we spend 95-99% in user space or idle. Which is as it should be. There are _very_ few loads that are kernel-intensive, and in fact the best way to get high system times is to do either lots of fork/exec/wait with everything cached, or do lots of open/read/write/close with everything cached. Of the remaining 1-5% of time, schedule() shows up as one fairly high thing, but on most profiles I've seen of real work it shows up long after things like "clear_page()" and "copy_page()". And look closely at the profile, and you'll notice that it tends to be a _loong_ tail of stuff. Quite frankly, I'd be a _lot_ more interested in making the scheduling slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a 100Hz one, _despite_ the fact that it will increase scheduling load even more. Because it improves interactive feel, and sometimes even performance (ie being able to sleep for shorter sequences of time allows some things that want "almost realtime" behaviour to avoid busy-looping for those short waits - improving performance exactly _because_ they put more load on the scheduler). 
The benchmark that is just about _the_ worst on the scheduler is actually something like "lmbench", and if you look at profiles for that you'll notice that system call entry and exit together with the read/write path ends up being more of a performance issue. And you know what? From a user standpoint, improving disk latency is again a _lot_ more noticeable than scheduler overhead. And even more important than performance is being able to read and write to CD-RW disks without having to know about things like "ide-scsi" etc, and do it sanely over different bus architectures etc. The scheduler simply isn't that important. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 2:35 ` Linus Torvalds @ 2001-12-18 2:51 ` David Lang 2001-12-18 3:08 ` Davide Libenzi 2001-12-18 14:09 ` Alan Cox 2 siblings, 0 replies; 87+ messages in thread From: David Lang @ 2001-12-18 2:51 UTC (permalink / raw) To: Linus Torvalds; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List One problem the current scheduler has on SMP machines (even 2 CPU ones) is that if the system is running one big process it will bounce from CPU to CPU and actually finish considerably slower than if you are running two CPU intensive tasks (with less cpu hopping). I saw this a few months ago as I was doing something as simple as gunzip on a large file, I got a 30% speed increase by running setiathome at the same time. I'm not trying to say that it should be the top priority, but there are definite weaknesses showing in the current implementation. David Lang On Mon, 17 Dec 2001, Linus Torvalds wrote: > Date: Mon, 17 Dec 2001 18:35:54 -0800 (PST) > From: Linus Torvalds <torvalds@transmeta.com> > To: Rik van Riel <riel@conectiva.com.br> > Cc: Davide Libenzi <davidel@xmailserver.org>, > Kernel Mailing List <linux-kernel@vger.kernel.org> > Subject: Re: Scheduler ( was: Just a second ) ... > > > On Mon, 17 Dec 2001, Rik van Riel wrote: > > > > Try readprofile some day, chances are schedule() is pretty > > near the top of the list. > > Ehh.. Of course I do readprofile. > > But did you ever compare readprofile output to _total_ cycles spent? > > The fact is, it's not even noticeable under any normal loads, and > _definitely_ not on UP except with totally made up benchmarks that just > pass tokens around or yield all the time. > > Because we spend 95-99% in user space or idle. Which is as it should be. > There are _very_ few loads that are kernel-intensive, and in fact the best > way to get high system times is to do either lots of fork/exec/wait with > everything cached, or do lots of open/read/write/close with everything > cached. 
> > Of the remaining 1-5% of time, schedule() shows up as one fairly high > thing, but on most profiles I've seen of real work it shows up long after > things like "clear_page()" and "copy_page()". > > And look closely at the profile, and you'll notice that it tends to be a > _loong_ tail of stuff. > > Quite frankly, I'd be a _lot_ more interested in making the scheduling > slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a > 100Hz one, _despite_ the fact that it will increase scheduling load even > more. Because it improves interactive feel, and sometimes even performance > (ie being able to sleep for shorter sequences of time allows some things > that want "almost realtime" behaviour to avoid busy-looping for those > short waits - improving performance exactly _because_ they put more load on > the scheduler). > > The benchmark that is just about _the_ worst on the scheduler is actually > something like "lmbench", and if you look at profiles for that you'll > notice that system call entry and exit together with the read/write path > ends up being more of a performance issue. > > And you know what? From a user standpoint, improving disk latency is again > a _lot_ more noticeable than scheduler overhead. > > And even more important than performance is being able to read and write > to CD-RW disks without having to know about things like "ide-scsi" etc, > and do it sanely over different bus architectures etc. > > The scheduler simply isn't that important. > > Linus > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 2:35 ` Linus Torvalds 2001-12-18 2:51 ` David Lang @ 2001-12-18 3:08 ` Davide Libenzi 2001-12-18 3:19 ` Davide Libenzi 2001-12-18 14:09 ` Alan Cox 2 siblings, 1 reply; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 3:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: Rik van Riel, Kernel Mailing List On Mon, 17 Dec 2001, Linus Torvalds wrote: > Quite frankly, I'd be a _lot_ more interested in making the scheduling > slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a > 100Hz one, _despite_ the fact that it will increase scheduling load even > more. Because it improves interactive feel, and sometimes even performance > (ie being able to sleep for shorter sequences of time allows some things > that want "almost realtime" behaviour to avoid busy-looping for those > short waits - improving performance exactly _because_ they put more load on > the scheduler). I'm ok with increasing HZ but not so ok with decreasing time slices. When you switch a task you have a fixed cost ( tlb, cache image, ... ) that, if you decrease the time slice, gets spread over a shorter run time, raising its percentage impact. A more interactive feel can be achieved by using a real BVT implementation :

- p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+ p->counter += NICE_TO_TICKS(p->nice);

The only problem with this is that, with certain task run patterns, processes can run a long time ( having a high dynamic priority ) before they get scheduled. What I was thinking was something like, in timer.c :

if (p->counter > decay_ticks)
	--p->counter;
else if (++p->timer_ticks >= MAX_RUN_TIME) {
	p->counter -= p->timer_ticks;
	p->timer_ticks = 0;
	p->need_resched = 1;
}

Having MAX_RUN_TIME ~= NICE_TO_TICKS(0) In this way I/O bound tasks can run with high priority, giving a better interactive feel, without running so long that they freeze the system when exiting from a quite long I/O wait. 
- Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 3:08 ` Davide Libenzi @ 2001-12-18 3:19 ` Davide Libenzi 0 siblings, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 3:19 UTC (permalink / raw) To: Davide Libenzi; +Cc: Linus Torvalds, Rik van Riel, Kernel Mailing List On Mon, 17 Dec 2001, Davide Libenzi wrote: > What I was thinking was something like, in timer.c :
>
> if (p->counter > decay_ticks)
> 	--p->counter;
> else if (++p->timer_ticks >= MAX_RUN_TIME) {
> 	p->counter -= p->timer_ticks;
> 	p->timer_ticks = 0;
> 	p->need_resched = 1;
> }

Obviously that code doesn't work :) but the idea is to not permit the task to run more than a maximum time consecutively. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 2:35 ` Linus Torvalds 2001-12-18 2:51 ` David Lang 2001-12-18 3:08 ` Davide Libenzi @ 2001-12-18 14:09 ` Alan Cox 2001-12-18 9:12 ` John Heil ` (3 more replies) 2 siblings, 4 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 14:09 UTC (permalink / raw) To: Linus Torvalds; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List > to CD-RW disks without having to know about things like "ide-scsi" etc, > and do it sanely over different bus architectures etc. > > The scheduler simply isn't that important. The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. That isn't going to go away by sticking heads in sand. ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox @ 2001-12-18 9:12 ` John Heil 2001-12-18 15:34 ` degger ` (2 subsequent siblings) 3 siblings, 0 replies; 87+ messages in thread From: John Heil @ 2001-12-18 9:12 UTC (permalink / raw) To: Alan Cox Cc: Linus Torvalds, Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Alan Cox wrote: > Date: Tue, 18 Dec 2001 14:09:16 +0000 (GMT) > From: Alan Cox <alan@lxorguk.ukuu.org.uk> > To: Linus Torvalds <torvalds@transmeta.com> > Cc: Rik van Riel <riel@conectiva.com.br>, > Davide Libenzi <davidel@xmailserver.org>, > Kernel Mailing List <linux-kernel@vger.kernel.org> > Subject: Re: Scheduler ( was: Just a second ) ... > > > to CD-RW disks without having to know about things like "ide-scsi" etc, > > and do it sanely over different bus architectures etc. > > > > The scheduler simply isn't that important. > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > That isn't going to go away by sticking heads in sand. What % of a std 2 cpu do you think it eats? > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - ----------------------------------------------------------------- John Heil South Coast Software Custom systems software for UNIX and IBM MVS mainframes 1-714-774-6952 johnhscs@sc-software.com http://www.sc-software.com ----------------------------------------------------------------- ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox 2001-12-18 9:12 ` John Heil @ 2001-12-18 15:34 ` degger 2001-12-18 18:35 ` Mike Kravetz 2001-12-18 18:48 ` Davide Libenzi 2001-12-18 16:50 ` Mike Kravetz 2001-12-18 17:00 ` Linus Torvalds 3 siblings, 2 replies; 87+ messages in thread From: degger @ 2001-12-18 15:34 UTC (permalink / raw) To: alan; +Cc: linux-kernel On 18 Dec, Alan Cox wrote: > The scheduler is eating 40-60% of the machine on real world 8 cpu > workloads. That isn't going to go away by sticking heads in sand. What about a CONFIG_8WAY which, if set, activates a scheduler that performs better on such nontypical machines? I see and understand both sides' arguments yet I fail to see where the real problem is with having a scheduler that just kicks in _iff_ we're running the kernel on a nontypical kind of machine. This would keep the straightforward scheduler Linus is defending for the single processor machines while providing more performance to heavy SMP machines by having a more complex scheduler better suited for this task. -- Servus, Daniel ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 15:34 ` degger @ 2001-12-18 18:35 ` Mike Kravetz 2001-12-18 18:48 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Mike Kravetz @ 2001-12-18 18:35 UTC (permalink / raw) To: degger; +Cc: alan, linux-kernel On Tue, Dec 18, 2001 at 04:34:57PM +0100, degger@fhm.edu wrote: > What about a CONFIG_8WAY which, if set, activates a scheduler that > performs better on such nontypical machines? I'm pretty sure that we can create a scheduler that works well on an 8-way, and works just as well as the current scheduler on a UP machine. There is already a CONFIG_SMP which is all that should be necessary to distinguish between the two. What may be of more concern is support for different architectures such as HMT and NUMA. What about better scheduler support for people working in the RT embedded space? Each of these seem to have different scheduling requirements. Do people working on these 'non-typical' machines need to create their own scheduler patches? OR is there some 'clean' way to incorporate them into the source tree? -- Mike ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 15:34 ` degger 2001-12-18 18:35 ` Mike Kravetz @ 2001-12-18 18:48 ` Davide Libenzi 1 sibling, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 18:48 UTC (permalink / raw) To: degger; +Cc: Alan Cox, lkml On Tue, 18 Dec 2001 degger@fhm.edu wrote: > On 18 Dec, Alan Cox wrote: > > > The scheduler is eating 40-60% of the machine on real world 8 cpu > > workloads. That isn't going to go away by sticking heads in sand. > > What about a CONFIG_8WAY which, if set, activates a scheduler that > performs better on such nontypical machines? I see and understand > boths sides arguments yet I fail to see where the real problem is > with having a scheduler that just kicks in _iff_ we're running the > kernel on a nontypical kind of machine. > This would keep the straigtforward scheduler Linus is defending > for the single processor machines while providing more performance > to heavy SMP machines by having a more complex scheduler better suited > for this task. By using a multi queue scheduler with global balancing policy you can keep the core scheduler as is and have the balancing code take care of distributing the load. Obviously that code is under CONFIG_SMP, so it's not even compiled in on UP. In this way you have the same scheduler code running independently, with a lower load on the run queue and a high locality of locking. - Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox 2001-12-18 9:12 ` John Heil 2001-12-18 15:34 ` degger @ 2001-12-18 16:50 ` Mike Kravetz 2001-12-18 17:22 ` Linus Torvalds 2001-12-18 17:00 ` Linus Torvalds 3 siblings, 1 reply; 87+ messages in thread From: Mike Kravetz @ 2001-12-18 16:50 UTC (permalink / raw) To: Alan Cox Cc: Linus Torvalds, Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote: > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > That isn't going to go away by sticking heads in sand. Can you be more specific as to the workload you are referring to? As someone who has been playing with the scheduler for a while, I am interested in all such workloads. -- Mike ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 16:50 ` Mike Kravetz @ 2001-12-18 17:22 ` Linus Torvalds 2001-12-18 17:50 ` Davide Libenzi 0 siblings, 1 reply; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:22 UTC (permalink / raw) To: Mike Kravetz; +Cc: Alan Cox, Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Mike Kravetz wrote: > On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote: > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > > That isn't going to go away by sticking heads in sand. > > Can you be more specific as to the workload you are referring to? > As someone who has been playing with the scheduler for a while, > I am interested in all such workloads. Well, careful: depending on what "%" means, a 8-cpu machine has either "100% max" or "800% max". So are we talking about "we spend 40-60% of all CPU cycles in the scheduler" or are we talking about "we spend 40-60% of the CPU power of _one_ CPU out of 8 in the scheduler". Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates to "The worst scheduler case I've ever seen under a real load spent 5-8% of the machine CPU resources on scheduling". And let's face it, 5-8% is bad, but we're not talking "half the CPU power" here. Linus ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:22 ` Linus Torvalds @ 2001-12-18 17:50 ` Davide Libenzi 0 siblings, 0 replies; 87+ messages in thread From: Davide Libenzi @ 2001-12-18 17:50 UTC (permalink / raw) To: Linus Torvalds; +Cc: Mike Kravetz, Alan Cox, Rik van Riel, Kernel Mailing List On Tue, 18 Dec 2001, Linus Torvalds wrote: > > On Tue, 18 Dec 2001, Mike Kravetz wrote: > > On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote: > > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > > > That isn't going to go away by sticking heads in sand. > > > > Can you be more specific as to the workload you are referring to? > > As someone who has been playing with the scheduler for a while, > > I am interested in all such workloads. > > Well, careful: depending on what "%" means, a 8-cpu machine has either > "100% max" or "800% max". > > So are we talking about "we spend 40-60% of all CPU cycles in the > scheduler" or are we talking about "we spend 40-60% of the CPU power of > _one_ CPU out of 8 in the scheduler". > > Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the > scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates > to "The worst scheduler case I've ever seen under a real load spent 5-8% > of the machine CPU resources on scheduling". > > And let's face it, 5-8% is bad, but we're not talking "half the CPU power" > here. Linus, you're plain right that we can spend days debating about the scheduler load. You have to agree that sharing a single lock/queue between multiple CPUs is, let's say, quite crappy. You agreed that the scheduler is easy and the fix should not take that much time. You said that you're going to accept the solution that is coming out from the mailing list. Why don't we start talking about some solution and code ? Starting from a basic architecture down to the implementation. Alan and Rik are quite "unloaded" now, what do you think ? 
- Davide ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 14:09 ` Alan Cox ` (2 preceding siblings ...) 2001-12-18 16:50 ` Mike Kravetz @ 2001-12-18 17:00 ` Linus Torvalds 2001-12-18 19:17 ` Alan Cox 3 siblings, 1 reply; 87+ messages in thread From: Linus Torvalds @ 2001-12-18 17:00 UTC (permalink / raw) To: Alan Cox; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List On Tue, 18 Dec 2001, Alan Cox wrote: > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > That isn't going to go away by sticking heads in sand. Did you _read_ what I said? We _have_ patches. You apparently have your own set. Fight it out. Don't involve me, because I don't think it's even a challenging thing. I wrote what is _still_ largely the algorithm in 1991, and it's damn near the only piece of code from back then that even _has_ some similarity to the original code still. All the "recompute count when everybody has gone down to zero" was there pretty much from day 1 (*). Which makes me say: "oh, a quick hack from 1991 works on most machines in 2001, so how hard a problem can it be?" Fight it out. People asked whether I was interested, and I said "no". Take a clue: do benchmarks on all the competing patches, and try to create the best one, and present it to me as a done deal. Linus (*) The single biggest change from day 1 is that it used to iterate over a global array of process slots, and for scalability reasons (not CPU scalability, but "max nr of processes in the system" scalability) the array was gotten rid of, giving the current doubly linked list. Everything else that any scheduler person complains about was pretty much there otherwise ;) ^ permalink raw reply [flat|nested] 87+ messages in thread
* Re: Scheduler ( was: Just a second ) ... 2001-12-18 17:00 ` Linus Torvalds @ 2001-12-18 19:17 ` Alan Cox 0 siblings, 0 replies; 87+ messages in thread From: Alan Cox @ 2001-12-18 19:17 UTC (permalink / raw) To: Linus Torvalds Cc: Alan Cox, Rik van Riel, Davide Libenzi, Kernel Mailing List > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads. > > That isn't going to go away by sticking heads in sand. > > Did you _read_ what I said? > > We _have_ patches. You apparently have your own set. I did read that mail - but somewhat later. Right now I'm scanning l/k every few days, no more. As to my stuff - everything I propose different to ibm/davide is about cost/speed of ordering or minor optimisations. I don't plan to compete and duplicate work ^ permalink raw reply [flat|nested] 87+ messages in thread
Thread overview: 87+ messages
[not found] <Pine.LNX.4.33.0112181508001.3410-100000@penguin.transmeta.com>
2001-12-20 3:50 ` Scheduler ( was: Just a second ) Rik van Riel
2001-12-20 4:04 ` Ryan Cumming
2001-12-20 5:39 ` David S. Miller
2001-12-20 5:58 ` Linus Torvalds
2001-12-20 6:01 ` David S. Miller
2001-12-20 22:40 ` Troels Walsted Hansen
2001-12-20 23:55 ` Chris Ricker
2001-12-20 23:59 ` CaT
2001-12-21 0:06 ` Davide Libenzi
2001-12-20 11:29 ` Rik van Riel
2001-12-20 11:34 ` David S. Miller
2001-12-20 5:52 ` Linus Torvalds
2001-12-20 6:33 ` Scheduler, Can we save some juice Timothy Covell
2001-12-20 6:50 ` Ryan Cumming
2001-12-20 6:52 ` Robert Love
2001-12-20 17:39 ` Timothy Covell
[not found] <20011218020456.A11541@redhat.com>
2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds
2001-12-18 16:56 ` Rik van Riel
2001-12-18 17:18 ` Linus Torvalds
2001-12-18 19:04 ` Alan Cox
2001-12-18 21:02 ` Larry McVoy
2001-12-18 21:14 ` David S. Miller
2001-12-18 21:17 ` Larry McVoy
2001-12-18 21:19 ` Rik van Riel
2001-12-18 21:30 ` David S. Miller
2001-12-18 21:18 ` Rik van Riel
2001-12-19 16:50 ` Daniel Phillips
2001-12-18 19:11 ` Mike Galbraith
2001-12-18 19:15 ` Rik van Riel
2001-12-18 17:55 ` Davide Libenzi
2001-12-18 19:43 ` Alexander Viro
2001-12-18 5:59 V Ganesh
-- strict thread matches above, loose matches on Subject: below --
2001-12-18 5:11 Thierry Forveille
2001-12-17 21:41 ` John Heil
2001-12-18 14:31 ` Alan Cox
[not found] <20011217200946.D753@holomorphy.com>
2001-12-18 4:27 ` Linus Torvalds
2001-12-18 4:55 ` William Lee Irwin III
2001-12-18 6:09 ` Linus Torvalds
2001-12-18 6:34 ` Jeff Garzik
2001-12-18 12:23 ` Rik van Riel
2001-12-18 14:29 ` Alan Cox
2001-12-18 17:07 ` Linus Torvalds
2001-12-18 15:51 ` Martin Josefsson
2001-12-18 17:08 ` Linus Torvalds
2001-12-18 16:16 ` Roger Larsson
2001-12-18 17:16 ` Herman Oosthuysen
2001-12-18 17:16 ` Linus Torvalds
2001-12-18 17:21 ` David Mansfield
2001-12-18 17:27 ` Linus Torvalds
2001-12-18 17:54 ` Andreas Dilger
2001-12-18 18:27 ` Doug Ledford
2001-12-18 18:52 ` Andreas Dilger
2001-12-18 19:03 ` Doug Ledford
2001-12-19 9:19 ` Peter Wächtler
2001-12-19 11:05 ` Helge Hafting
2001-12-21 20:23 ` Rob Landley
2001-12-18 18:35 ` Linus Torvalds
2001-12-18 18:58 ` Alan Cox
2001-12-18 19:31 ` Gerd Knorr
2001-12-18 18:25 ` William Lee Irwin III
2001-12-18 14:21 ` Adam Schrotenboer
2001-12-18 18:13 ` Davide Libenzi
2001-12-16 0:13 Just a second Linus Torvalds
2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi
2001-12-17 22:53 ` Linus Torvalds
2001-12-17 23:15 ` Davide Libenzi
2001-12-17 23:18 ` Linus Torvalds
2001-12-17 23:39 ` Davide Libenzi
2001-12-17 23:52 ` Benjamin LaHaise
2001-12-18 1:11 ` Linus Torvalds
2001-12-18 1:46 ` H. Peter Anvin
2001-12-18 5:54 ` Benjamin LaHaise
2001-12-18 6:10 ` Linus Torvalds
2001-12-18 1:54 ` Rik van Riel
2001-12-18 2:35 ` Linus Torvalds
2001-12-18 2:51 ` David Lang
2001-12-18 3:08 ` Davide Libenzi
2001-12-18 3:19 ` Davide Libenzi
2001-12-18 14:09 ` Alan Cox
2001-12-18 9:12 ` John Heil
2001-12-18 15:34 ` degger
2001-12-18 18:35 ` Mike Kravetz
2001-12-18 18:48 ` Davide Libenzi
2001-12-18 16:50 ` Mike Kravetz
2001-12-18 17:22 ` Linus Torvalds
2001-12-18 17:50 ` Davide Libenzi
2001-12-18 17:00 ` Linus Torvalds
2001-12-18 19:17 ` Alan Cox