* [OKS] O(1) scheduler in 2.4
@ 2002-07-01 17:52 Bill Davidsen
  2002-07-01 18:12 ` Tom Rini
  2002-07-01 18:49 ` Ingo Molnar
  0 siblings, 2 replies; 28+ messages in thread
From: Bill Davidsen @ 2002-07-01 17:52 UTC
To: Linux-Kernel Mailing List

What's the issue? The most popular trees have been using it without issue
for six months or so, and I know of no cases of bad behaviour. I know there
are people who don't believe in the preempt patch, but the new scheduler
seems to work better under both desktop and server load.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

* Re: [OKS] O(1) scheduler in 2.4
From: Tom Rini @ 2002-07-01 18:12 UTC
To: Bill Davidsen; +Cc: Linux-Kernel Mailing List

On Mon, Jul 01, 2002 at 01:52:54PM -0400, Bill Davidsen wrote:

> What's the issue?

a) We're at 2.4.19-rc1 right now. It would be horribly
counterproductive to put O(1) in right now.
b) 2.4 is the _stable_ tree. If every big change in 2.5 got backported
to 2.4, it'd be just like 2.5 :)
c) I also suspect that it hasn't been as widely tested on !x86 as the
stuff currently in 2.4. And again, 2.4 is the stable tree.

-- 
Tom Rini (TR1265)
http://gate.crashing.org/~trini/

* Re: [OKS] O(1) scheduler in 2.4
From: J.A. Magallon @ 2002-07-01 23:44 UTC
To: Tom Rini; +Cc: Bill Davidsen, Linux-Kernel Mailing List

On 2002.07.01 Tom Rini wrote:
>On Mon, Jul 01, 2002 at 01:52:54PM -0400, Bill Davidsen wrote:
>
>> What's the issue?
>
>a) We're at 2.4.19-rc1 right now. It would be horribly
>counterproductive to put O(1) in right now.

.20-pre1 would be a good start, but my hope is that this is reserved for
the vm updates from -aa ;).

>b) 2.4 is the _stable_ tree. If every big change in 2.5 got backported
>to 2.4, it'd be just like 2.5 :)

So you want to wait till 2.6.40 to be able to use an O(1) scheduler on a
kernel that does not eat up your drives? (say, next year by this same month...)

>c) I also suspect that it hasn't been as widely tested on !x86 as the
>stuff currently in 2.4. And again, 2.4 is the stable tree.
>

I know it is not a priority for 2.4, but say it will never happen...

-- 
J.A. Magallon             \   Software is like sex: It's better when it's free
mailto:jamagallon@able.es  \                    -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc1-jam1, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.7mdk)

* Re: [OKS] O(1) scheduler in 2.4
From: Tom Rini @ 2002-07-02 2:48 UTC
To: J.A. Magallon; +Cc: Bill Davidsen, Linux-Kernel Mailing List

On Tue, Jul 02, 2002 at 01:44:32AM +0200, J.A. Magallon wrote:
> 
> On 2002.07.01 Tom Rini wrote:
> >On Mon, Jul 01, 2002 at 01:52:54PM -0400, Bill Davidsen wrote:
> >
> >> What's the issue?
> >
> >b) 2.4 is the _stable_ tree. If every big change in 2.5 got backported
> >to 2.4, it'd be just like 2.5 :)
> 
> So you want to wait till 2.6.40 to be able to use an O(1) scheduler on a
> kernel that does not eat up your drives? (say, next year by this same month...)

I assume you mean 2.4.60 here, and no, I don't think the O(1) scheduler
should go into 2.4 ever. We're aiming for a _stable_ series here. Let me
stress that again, _stable_. I'd hope that 2.4.60 is as slow in coming
as 2.0.40 is.

> >c) I also suspect that it hasn't been as widely tested on !x86 as the
> >stuff currently in 2.4. And again, 2.4 is the stable tree.
> 
> I know it is not a priority for 2.4, but say it will never happen...

I won't say it will never happen, just that I don't think it should.
It's a rather invasive thing (and as Ingo said, it's just now getting
stable).

-- 
Tom Rini (TR1265)
http://gate.crashing.org/~trini/

* Re: [OKS] O(1) scheduler in 2.4
From: Rob Landley @ 2002-07-03 1:11 UTC
To: Tom Rini, J.A. Magallon; +Cc: Bill Davidsen, Linux-Kernel Mailing List

On Monday 01 July 2002 10:48 pm, Tom Rini wrote:
> On Tue, Jul 02, 2002 at 01:44:32AM +0200, J.A. Magallon wrote:
> > On 2002.07.01 Tom Rini wrote:
> > >On Mon, Jul 01, 2002 at 01:52:54PM -0400, Bill Davidsen wrote:
> > >> What's the issue?
> > >
> > >b) 2.4 is the _stable_ tree. If every big change in 2.5 got backported
> > >to 2.4, it'd be just like 2.5 :)
> >
> > So you want to wait till 2.6.40 to be able to use an O(1) scheduler on a
> > kernel that does not eat up your drives? (say, next year by this same
> > month...)
>
> I assume you mean 2.4.60 here, and no, I don't think the O(1) scheduler
> should go into 2.4 ever. We're aiming for a _stable_ series here. Let me

Ah, Monday morning virtue, overcompensating for 2.4.10. It's the hangover
speaking...

"We upgrade our kernel on a production machine without testing it first, and
we get mad if anything actually CHANGED. We want that upgrade to be a NOP,
darn it! We want it to be as if we never did it in the first place, that's
why we do it..."

If you want stone tablet stability, why the heck are you upgrading your
kernel? Downloading the new version off of kernel.org generally means you're
mucking about with a working box, making changes that are not 100% required.

If a security vulnerability comes out, you have the source and can patch the
specific bug in your version. (If you're not up to that, you're probably
using a vendor kernel, which is a whole 'nother can of worms.) If you install
new hardware or software, and it going "boing" would be a bad thing, you try
it on a scratch box first. If you don't, you deserve what you get.

I'm under the impression 2.4.19 is introducing chunks of Andre Hedrick's new
IDE code. So it's ok to upgrade something that can, in case of a bug, eat
your data silently in a way that journaling won't detect. Why? LBA-48 and
ATA-133, of course.

But scheduling, which is SUPPOSED to be non-deterministic from day one and
could theoretically be brain-dead round robin without affecting anything but
performance... That's not safe to upgrade. Right. If you have a race
condition in your code that a new scheduler triggers, ANYTHING could trigger
it.

2.4.18 behaves horribly under load: try md5sum on an iso image and then pull
up an xterm and su to another user. It can take 30 seconds. (Yeah, that's
mostly IO starvation rather than the scheduler, but still, how is the new
scheduler going to do WORSE than this?)

The argument here is basically "don't change anything". It's not exactly a
series then, is it? If you want trailing edge, 2.0 is still being maintained,
let alone 2.2. Those have a great excuse for not accepting anything new
beyond a really obvious bugfix. 2.4 does not, because 2.6 isn't out yet.
Backporting of some things from 2.5 to 2.4 will occur until then, and O(1)
is an obvious eventual candidate.

> stress that again, _stable_. I'd hope that 2.4.60 is as slow in coming
> as 2.0.40 is.

So the fact that it's in Alan Cox's kernel (meaning Red Hat is shipping it in
2.4.18-5.55, meaning that if more people aren't actually USING it yet than
Marcelo's 2.4, they will be soon), and Andrea's kernel (meaning new VM
development is being done with it in mind)... It may not be "sufficiently
tested" yet but it's GETTING a lot of testing. You use anything EXCEPT a
stock vanilla 2.4, you're probably getting O(1) at this point.

If the vendors are starting to ship the thing already, what is the DOWN side
to integrating it? The down side to NEVER integrating it is eventually fewer
people using the kernel off of kernel.org.

Does this remind anybody else of the 0.90 software raid stuff? At some point
it makes more sense to keep the OLD one around as a patch for the 5% of the
community that doesn't want to upgrade. We're not there on the scheduler yet,
but "should not happen" without a qualifier means "never"...

> > >c) I also suspect that it hasn't been as widely tested on !x86 as the
> > >stuff currently in 2.4. And again, 2.4 is the stable tree.
> >
> > I know it is not a priority for 2.4, but say it will never happen...
>
> I won't say it will never happen, just that I don't think it should.
> It's a rather invasive thing (and as Ingo said, it's just now getting
> stable).

Ingo's main objection was that the patch is only 6 months old, and that 2.4
is only now stabilizing and that bug squeezing and smoothing should be given
a little longer to ensure that people have the option of NOT upgrading, and
that those upgrading want improvements rather than critical "this just
doesn't work" fixes.

And that's a fine argument. But 2.6 isn't going to be out this year. It's
not even having its first freeze until October. Traditionally, we've been
running a year and a half between stable releases (and another six months to
actually get the new one battle-tested to where the distros and at least 50%
of the production boxes upgrade.) We've got a year to eighteen months left
on that cycle.

Are the distros going to hold off adding it to 2.4 for a year to 18 months?

The real question is, how much MORE conservative than the distros should the
mainline kernels be?

Rob

* Re: [OKS] O(1) scheduler in 2.4
From: Adrian Bunk @ 2002-07-03 7:30 UTC
To: Rob Landley; +Cc: Linux-Kernel Mailing List

On Tue, 2 Jul 2002, Rob Landley wrote:

>...
> The real question is, how much MORE conservative than the distros should the
> mainline kernels be?

Your "the distros" are only a subset of all Linux distributions?

E.g. the 2.4 kernel images in Debian (that will be in the next release of
Debian) are plain ftp.kernel.org kernels (no -ac or -aa kernels) with only
very few patches (read: bug fixes) applied.

> Rob

cu
Adrian

-- 
You only think this is a free country. Like the US the UK spends a lot of
time explaining its a free country because its a police state.
                                                Alan Cox

* Re: [OKS] O(1) scheduler in 2.4
From: Ingo Molnar @ 2002-07-03 8:35 UTC
To: Rob Landley
Cc: Tom Rini, J.A. Magallon, Bill Davidsen, Linux-Kernel Mailing List

On Tue, 2 Jul 2002, Rob Landley wrote:

> If you want stone tablet stability, why the heck are you upgrading your
> kernel? [...]

to get security and stability fixes.

> The argument here is basically "don't change anything". It's not
> exactly a series then, is it? If you want trailing edge, 2.0 is still
> being maintained, let alone 2.2. Those have a great excuse for not
> accepting anything new beyond a really obvious bugfix. 2.4 does not,
> because 2.6 isn't out yet. Backporting of some things from 2.5 to 2.4
> will occur until then, and O(1) is an obvious eventual candidate.

it might be a candidate for inclusion once it has _proven_ stability and
robustness (in terms of tester and developer exposure), on the same order
of magnitude as the 2.4 kernel - but that needs time and exposure in trees
like the -ac tree and vendor trees. It might not happen at all, during the
lifetime of 2.4.

Note that the O(1) scheduler isn't a security or stability fix, neither is
it a driver backport. It isn't a feature backport that enables hardware
that couldn't be used in 2.4 before. The VM was a special case because most
people agreed that it truly sucked, and even though people keep
disagreeing about that decision, the VM is in a pretty good shape now -
and we still have good correlation between the VM in 2.5, and the VM in
2.4. The 2.4 scheduler on the other hand doesn't suck for 99% of the
people, so our hands are not forced in any way - we have the choice of a
'proven-rock-solid good scheduler' vs. an 'even better, but still young
scheduler'.

if say 90% of Linux users on the planet adopt the O(1) scheduler, and in a
year or two there won't be a bigger distro (including Debian of course)
without the O(1) scheduler in it [which, admittedly, is happening
already], then it can and should perhaps be merged into 2.4. But right now
i think that the majority of 2.4 users are running the stock 2.4
scheduler.

> So the fact that it's in Alan Cox's kernel (meaning Red Hat is shipping
> it in 2.4.18-5.55, meaning that if more people aren't actually USING it
> yet than marcelo's 2.4, they will be soon), and andrea's kernel (meaning
> new VM development is being done with it in mind)... It may not be
> "sufficiently tested" yet but it's GETTING a lot of testing. You use
> anything EXCEPT a stock vanilla 2.4, you're probably getting O(1) at
> this point.

things like migration to a new kernel happen on a slightly slower scale
than the 6 months this patch has existed. I'd say in 1 year what you say
might be true. 70% of the Linux users are not running the 'very latest'
release.

also note that the O(1) scheduler patch in the Red Hat kernel rpm was a
stability fork done months ago, with stability fixes backported into it.
The 2.4 O(1) patches being distributed now are more like direct backports
of the 2.5 scheduler - this way we can get testing and feedback even from
those people who do not want to (or cannot) run a 2.5 kernel due to the
massive IO changes being underway.

i do not say that the O(1) scheduler has bugs (if i knew about any i'd
have fixed it already :), i am simply saying that to be able to say to
Marcelo "it does not have bugs and does not introduce problems" it needs
more exposure. [ And if the author of a given piece of code says things
like this then it usually does not get merged ;-) ]

> not there on the scheduler yet, but "should not happen" without a
> qualifier means "never"...

we agree here.

> The real question is, how much MORE conservative than the distros should
> the mainline kernels be?

There's a natural 'feature race' between distros, so the distros can act
as an additional (and pretty powerful) testing tool for various kernel
features - and for which the distros are willing to spend resources and
take risks as well. In fact they also act as a 'user demand' filter, for
kernel features as well. And if all distros pick up a given feature, and
it's been in for more than 6 months, (instead of 'more than 6 months since
first patch') then Marcelo will have a much easier decision :-)

	Ingo

* Re: [OKS] O(1) scheduler in 2.4
From: Bill Davidsen @ 2002-07-04 3:36 UTC
To: Ingo Molnar
Cc: Rob Landley, Tom Rini, J.A. Magallon, Linux-Kernel Mailing List

> it might be a candidate for inclusion once it has _proven_ stability and
> robustness (in terms of tester and developer exposure), on the same order
> of magnitude as the 2.4 kernel - but that needs time and exposure in trees
> like the -ac tree and vendor trees. It might not happen at all, during the
> lifetime of 2.4.

It has already proven to be stable and robust in the sense that it isn't
worse than the stock scheduler on typical loads and is vastly better on
some.

> 
> Note that the O(1) scheduler isn't a security or stability fix, neither is
> it a driver backport. It isn't a feature backport that enables hardware
> that couldn't be used in 2.4 before. The VM was a special case because most
> people agreed that it truly sucked, and even though people keep
> disagreeing about that decision, the VM is in a pretty good shape now -
> and we still have good correlation between the VM in 2.5, and the VM in
> 2.4. The 2.4 scheduler on the other hand doesn't suck for 99% of the
> people, so our hands are not forced in any way - we have the choice of a
> 'proven-rock-solid good scheduler' vs. an 'even better, but still young
> scheduler'.

Here I disagree. Sure behaves like a stability fix to me. On a system
with a mix of interactive and cpu-bound processes, including processes
with hundreds of threads, you just can't get reasonable performance
balancing with nice() because it is totally impractical to keep tuning a
thread which changes from hog to disk io to socket waits with a human in
the loop. The new scheduler notices this stuff and makes it work. I
don't even know for sure (as in tried it) if you can have different nice
on threads of the same process.

This is not some neat feature to buy a few percent better this or that,
this is roughly 50% more users on the server before it falls over, and
no total bogs when many threads change to hog mode at once.

You will not hear me saying this about preempt, or low-latency, and I
bet that after I try lock-break this weekend I won't feel that I have to
have that either. The O(1) scheduler is self defense against badly
behaved processes, and the reason it should go in mainline is so it
won't depend on someone finding the time to backport the fun stuff from
2.5 as a patch every time.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

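[Editor's illustration] The "notices this stuff" behaviour described above
can be pictured as a priority bonus driven by how much a task sleeps. What
follows is only a toy model, loosely patterned on the sleep-average idea in
the 2.5 scheduler; the constants PRIO_BONUS_RATIO and MAX_SLEEP_AVG_MS and
the function names are assumptions chosen for illustration, not values
taken from any particular kernel release.

```python
# Toy model of a sleep-based dynamic priority (lower number = runs sooner).
# All constants below are illustrative assumptions, not kernel facts.
MAX_RT_PRIO = 100          # priorities 0..99 are reserved for real-time tasks
MAX_PRIO = 140             # priorities 100..139 cover nice -20..+19
MAX_USER_PRIO = MAX_PRIO - MAX_RT_PRIO
PRIO_BONUS_RATIO = 25      # assumed: bonus range is 25% of the user range
MAX_SLEEP_AVG_MS = 2000    # assumed cap on the tracked sleep average


def static_prio(nice):
    """Map a nice value (-20..19) onto the 100..139 priority range."""
    return MAX_RT_PRIO + 20 + nice


def effective_prio(nice, sleep_avg_ms):
    """Return the dynamic priority for a task with the given sleep average.

    A task that sleeps a lot (interactive, I/O bound) earns a bonus;
    a task that never sleeps (a CPU hog) gets a penalty.
    """
    sleep_avg_ms = max(0, min(sleep_avg_ms, MAX_SLEEP_AVG_MS))
    span = MAX_USER_PRIO * PRIO_BONUS_RATIO // 100        # 10 levels
    bonus = span * sleep_avg_ms // MAX_SLEEP_AVG_MS - span // 2  # -5..+5
    prio = static_prio(nice) - bonus
    # Never leak into the real-time range or past the lowest priority.
    return max(MAX_RT_PRIO, min(prio, MAX_PRIO - 1))


if __name__ == "__main__":
    for label, sleep in (("cpu hog", 0), ("mixed", 1000), ("interactive", 2000)):
        print(label, effective_prio(0, sleep))
```

In this model a thread that turns from socket waits into a CPU hog drifts
from the bonus end of the range to the penalty end as its sleep average
drains, with nobody having to renice it by hand.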
* Re: [OKS] O(1) scheduler in 2.4
From: Ingo Molnar @ 2002-07-04 6:56 UTC
To: Bill Davidsen
Cc: Rob Landley, Tom Rini, J.A. Magallon, Linux-Kernel Mailing List

On Wed, 3 Jul 2002, Bill Davidsen wrote:

> > it might be a candidate for inclusion once it has _proven_ stability and
> > robustness (in terms of tester and developer exposure), on the same order
> > of magnitude as the 2.4 kernel - but that needs time and exposure in trees
> > like the -ac tree and vendor trees. It might not happen at all, during the
> > lifetime of 2.4.
>
> It has already proven to be stable and robust in the sense that it isn't
> worse than the stock scheduler on typical loads and is vastly better on
> some.

this is your experience, and i'm happy about that. Whether it's the same
experience for 90% of Linux users, time will tell.

> > Note that the O(1) scheduler isn't a security or stability fix, neither is
> > it a driver backport. It isn't a feature backport that enables hardware
> > that couldn't be used in 2.4 before. The VM was a special case because most
> > people agreed that it truly sucked, and even though people keep
> > disagreeing about that decision, the VM is in a pretty good shape now -
> > and we still have good correlation between the VM in 2.5, and the VM in
> > 2.4. The 2.4 scheduler on the other hand doesn't suck for 99% of the
> > people, so our hands are not forced in any way - we have the choice of a
> > 'proven-rock-solid good scheduler' vs. an 'even better, but still young
> > scheduler'.
>
> Here I disagree. Sure behaves like a stability fix to me. On a system
> with a mix of interactive and cpu-bound processes, including processes
> with hundreds of threads, you just can't get reasonable performance
> balancing with nice() because it is totally impractical to keep tuning a
> thread which changes from hog to disk io to socket waits with a human in
> the loop. The new scheduler notices this stuff and makes it work. I
> don't even know for sure (as in tried it) if you can have different nice
> on threads of the same process.

(yes, it's possible to nice() individual threads.)

> This is not some neat feature to buy a few percent better this or that,
> this is roughly 50% more users on the server before it falls over, and
> no total bogs when many threads change to hog mode at once.

are these hard numbers? I haven't seen much hard data yet from real-life
servers using the O(1) scheduler. There was lots of feedback from
desktop-class systems that behave better, but servers used to be pretty
good with the previous scheduler as well.

> You will not hear me saying this about preempt, or low-latency, and I
> bet that after I try lock-break this weekend I won't feel that I have to
> have that either. The O(1) scheduler is self defense against badly
> behaved processes, and the reason it should go in mainline is so it
> won't depend on someone finding the time to backport the fun stuff from
> 2.5 as a patch every time.

well, the O(1) scheduler indeed tries to put up as much defense against
'badly behaved' processes as possible. In fact you should try to start up
your admin shells via nice -20, that gives much more priority than it used
to under the previous scheduler - it's very close to the RT priorities,
but without the risks. This works in the other direction as well: nice +19
has a much stronger meaning (in terms of preemption and timeslice
distribution) than it used to.

	Ingo

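[Editor's illustration] A minimal sketch of the "nice() individual threads"
point above, assuming Linux, where the nice value is in practice a
per-thread attribute and setpriority() with PRIO_PROCESS and a kernel
thread ID affects only that thread. The helper names are hypothetical, and
threading.get_native_id() needs Python 3.8 or later. Only raising the nice
value (lowering priority) is shown, since that needs no privileges.

```python
import os
import threading


def renice_current_thread(increment):
    """Raise the calling thread's nice value by `increment` (capped at 19).

    Linux-specific assumption: PRIO_PROCESS with a thread ID targets just
    that one thread rather than the whole process.
    """
    tid = threading.get_native_id()          # kernel thread ID, not the pid
    current = os.getpriority(os.PRIO_PROCESS, tid)
    os.setpriority(os.PRIO_PROCESS, tid, min(current + increment, 19))
    return os.getpriority(os.PRIO_PROCESS, tid)


def nice_main_and_worker():
    """Renice one worker thread and report both threads' nice values."""
    results = {}

    def worker():
        # The worker starts with the nice value inherited from its creator.
        results["worker"] = renice_current_thread(5)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    # The main thread's kernel thread ID equals the process ID.
    results["main"] = os.getpriority(os.PRIO_PROCESS, threading.get_native_id())
    return results


if __name__ == "__main__":
    print(nice_main_and_worker())
```

The worker ends up with a higher nice value than the main thread of the
same process, which is the per-thread tuning Bill was unsure about.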
* Re: [OKS] O(1) scheduler in 2.4 2002-07-04 6:56 ` Ingo Molnar @ 2002-07-04 7:36 ` J Sloan 2002-07-05 6:18 ` Andrew Rodland 2002-07-05 9:12 ` William Lee Irwin III 2 siblings, 0 replies; 28+ messages in thread From: J Sloan @ 2002-07-04 7:36 UTC (permalink / raw) To: Ingo Molnar Cc: Bill Davidsen, Rob Landley, Tom Rini, J.A. Magallon, Linux-Kernel Mailing List Ingo, it's apparent you are refraining from pushing this O(1) scheduler - that's admirable, but don't swing too far in the other direction. The fact is, it's working well in 2.5, it's working well in the 2.4-ac tree, it's working well in the 2.4-aa tree, and Red Hat has been shipping it. It will soon be the case that most Linux users are using O(1) - thus any poor clown who downloads the standard src from kernel.org has a large task ahead of him if he wants similar functionality to the majority of linux users. This divergence may not be a good thing... ;-) Joe Ingo Molnar wrote: >On Wed, 3 Jul 2002, Bill Davidsen wrote: > > > >>>it might be a candidate for inclusion once it has _proven_ stability and >>>robustness (in terms of tester and developer exposion), on the same order >>>of magnitude as the 2.4 kernel - but that needs time and exposure in trees >>>like the -ac tree and vendor trees. It might not happen at all, during the >>>lifetime of 2.4. >>> >>> >>It has already proven to be stable and robust in the sense that it isn't >>worse than the stock scheduler on typical loads and is vastly better on >>some. >> >> > >this is your experience, and i'm happy about that. Whether it's the same >experience for 90% of Linux users, time will tell. > > > >>>Note that the O(1) scheduler isnt a security or stability fix, neither is >>>it a driver backport. It isnt a feature backport that enables hardware >>>that couldnt be used in 2.4 before. 
The VM was a special case because most >>>people agreed that it truly sucked, and even though people keep >>>disagreeing about that decision, the VM is in a pretty good shape now - >>>and we still have good correlation between the VM in 2.5, and the VM in >>>2.4. The 2.4 scheduler on the other hand doesnt suck for 99% of the >>>people, so our hands are not forced in any way - we have the choice of a >>>'proven-rock-solid good scheduler' vs. an 'even better, but still young >>>scheduler'. >>> >>> >>Here I disagree. Sure behaves like a stability fix to me. On a system >>with a mix of interractive and cpu-bound processes, including processes >>with hundreds of threads, you just can't get reasonable performance >>balancing with nice() because it is totally impractical to keep tuning a >>thread which changes from hog to disk io to socket waits with a human in >>the loop. The new scheduler notices this stuff and makes it work, I >>don't even know for sure (as in tried it) if you can have different nice >>on threads of the same process. >> >> > >(yes, it's possible to nice() individual threads.) > > > >>This is not some neat feature to buy a few percent better this or that, >>this is roughly 50% more users on the server before it falls over, and >>no total bogs when many threads change to hog mode at once. >> >> > >are these hard numbers? I havent seen much hard data yet from real-life >servers using the O(1) scheduler. There was lots of feedback from >desktop-class systems that behave better, but servers used to be pretty >good with the previous scheduler as well. > > > >>You will not hear me saying this about preempt, or low-latency, and I >>bet that after I try lock-break this weekend I won't fell that I have to >>have that either. The O(1) scheduler is self defense against badly >>behaved processes, and the reason it should go in mainline is so it >>won't depend on someone finding the time to backport the fun stuff from >>2.5 as a patch every time.
>> >> > >well, the O(1) scheduler indeed tries to put up as much defense against >'badly behaved' processes as possible. In fact you should try to start up >your admin shells via nice -20, that gives much more priority than it used >to under the previous scheduler - it's very close to the RT priorities, >but without the risks. This works in the other direction as well: nice +19 >has a much stronger meaning (in terms of preemption and timeslice >distribution) than it used to. > > Ingo > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-04 6:56 ` Ingo Molnar 2002-07-04 7:36 ` J Sloan @ 2002-07-05 6:18 ` Andrew Rodland 2002-07-05 6:56 ` Adrian Bunk 2002-07-05 9:12 ` William Lee Irwin III 2 siblings, 1 reply; 28+ messages in thread From: Andrew Rodland @ 2002-07-05 6:18 UTC (permalink / raw) To: linux-kernel; +Cc: Ingo Molnar On Thu, 4 Jul 2002 08:56:01 +0200 (CEST) Ingo Molnar <mingo@elte.hu> wrote: > > On Wed, 3 Jul 2002, Bill Davidsen wrote: > > > > it might be a candidate for inclusion once it has _proven_ > > > stability and robustness (in terms of tester and developer > > > exposion), on the same order of magnitude as the 2.4 kernel - but > > > that needs time and exposure in trees like the -ac tree and vendor > > > trees. It might not happen at all, during the lifetime of 2.4. > > > > It has already proven to be stable and robust in the sense that it > > isn't worse than the stock scheduler on typical loads and is vastly > > better on some. > > this is your experience, and i'm happy about that. Whether it's the > same experience for 90% of Linux users, time will tell. > > > > Note that the O(1) scheduler isnt a security or stability fix, > > > neither is it a driver backport. It isnt a feature backport that > > > enables hardware that couldnt be used in 2.4 before. The VM was a > > > special case because most people agreed that it truly sucked, and > > > even though people keep disagreeing about that decision, the VM is > > > in a pretty good shape now - and we still have good correlation > > > between the VM in 2.5, and the VM in 2.4. The 2.4 scheduler on the > > > other hand doesnt suck for 99% of the people, so our hands are not > > > forced in any way - we have the choice of a'proven-rock-solid good > > > scheduler' vs. an 'even better, but still young scheduler'. > > > > Here I disagree. Sure behaves like a stability fix to me. 
On a > > system with a mix of interractive and cpu-bound processes, including > > processes with hundreds of threads, you just can't get reasonable > > performance balancing with nice() because it is totally impractical > > to keep tuning a thread which changes from hog to disk io to socket > > waits with a human in the loop. The new scheduler notices this stuff > > and makes it work, I don't even know for sure (as in tried it) if > > you can have different nice on threads of the same process. > > (yes, it's possible to nice() individual threads.) > > > This is not some neat feature to buy a few percent better this or > > that, this is roughly 50% more users on the server before it falls > > over, and no total bogs when many threads change to hog mode at > > once. > > are these hard numbers? I havent seen much hard data yet from > real-life servers using the O(1) scheduler. There was lots of feedback > from desktop-class systems that behave better, but servers used to be > pretty good with the previous scheduler as well. > > > You will not hear me saying this about preempt, or low-latency, and > > I bet that after I try lock-break this weekend I won't fell that I > > have to have that either. The O(1) scheduler is self defense against > > badly behaved processes, and the reason it should go in mainline is > > so it won't depend on someone finding the time to backport the fun > > stuff from 2.5 as a patch every time. > > well, the O(1) scheduler indeed tries to put up as much defense > against'badly behaved' processes as possible. In fact you should try > to start up your admin shells via nice -20, that gives much more > priority than it used to under the previous scheduler - it's very > close to the RT priorities, but without the risks. This works in the > other direction as well: nice +19 has a much stronger meaning (in > terms of preemption and timeslice distribution) than it used to. 
Very nearly off topic, but I've had a few people on IRC tell me that they love O(1) specifically because it has a 'nice that actually does something'. As a matter of fact, I've had to change my X startup scripts to make X a bit less selfish; the defaults are just plain silly now. I had thought before that I had a complaint about processes that spawn a large number of children and then reap them all at once, but it turns out that I was just running myself out of memory while conducting the test, and that if I avoid swapping, I don't run into any problems. I'm running 2.4.19-pre10-ac2 + preempt + some little things, on a 400 MHz laptop, and it's just about as smooth as I could ask for. As for O(1) in mainline, I think that it's better than what we've got. But as for me, as long as O(1)-sched keeps moving, and AC keeps cranking out the patches, I'll be happy. >:) > > Ingo ^ permalink raw reply [flat|nested] 28+ messages in thread
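[Editor's illustration] One way to see why nice "actually does something"
under the new scheduler is to map nice values linearly onto timeslice
lengths. The sketch below assumes a 10 ms minimum and a 300 ms maximum
timeslice, which is how the 2.5-era scheduler is commonly described; treat
both constants and the function name as assumptions, not as a statement of
what any given 2.4 backport uses.

```python
# Linear nice-to-timeslice mapping; the two bounds are assumed values.
MIN_TIMESLICE_MS = 10      # assumed timeslice at nice +19
MAX_TIMESLICE_MS = 300     # assumed timeslice at nice -20
MAX_USER_PRIO = 40         # number of nice levels (-20..+19)


def timeslice_ms(nice):
    """Return the timeslice in milliseconds for a given nice value."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be in the range -20..19")
    user_prio = nice + 20                      # 0 (nice -20) .. 39 (nice +19)
    span = MAX_TIMESLICE_MS - MIN_TIMESLICE_MS
    return MIN_TIMESLICE_MS + span * (MAX_USER_PRIO - 1 - user_prio) // (MAX_USER_PRIO - 1)


if __name__ == "__main__":
    for nice in (-20, -10, -3, 0, 19):
        print(nice, timeslice_ms(nice))
```

Under these assumptions nice -20 gets 300 ms, nice 0 gets 151 ms and
nice +19 gets 10 ms, so an X server left at -10 (225 ms) holds the CPU
noticeably longer per turn than one moved to -3 (173 ms).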
* Re: [OKS] O(1) scheduler in 2.4
From: Adrian Bunk @ 2002-07-05 6:56 UTC
To: Andrew Rodland; +Cc: linux-kernel

On Fri, 5 Jul 2002, Andrew Rodland wrote:

>...
> Very nearly off topic, but I've had a few people on IRC tell me that
> they love O(1) specifically because it has a 'nice that actually does
> something'. As a matter of fact, I've had to change my X startup
> scripts, to make it a bit less selfish; the defaults are just plain
> silly, now.
>...

This is exactly a reason why O(1) shouldn't go into 2.4:

E.g. my X is, as suggested by the installation routine of my distribution
(Debian unstable/testing), niced to -10. It would be a bad surprise for
_many_ people if they upgrade their 2.4 kernel because of other security
and/or stability fixes and such a setting is then wrong.

cu
Adrian

-- 
You only think this is a free country. Like the US the UK spends a lot of
time explaining its a free country because its a police state.
                                                Alan Cox

* Re: [OKS] O(1) scheduler in 2.4 2002-07-05 6:56 ` Adrian Bunk @ 2002-07-05 7:02 ` Andrew Rodland 0 siblings, 0 replies; 28+ messages in thread From: Andrew Rodland @ 2002-07-05 7:02 UTC (permalink / raw) To: linux-kernel; +Cc: Adrian Bunk On Fri, 5 Jul 2002 08:56:59 +0200 (CEST) Adrian Bunk <bunk@fs.tum.de> wrote: > On Fri, 5 Jul 2002, Andrew Rodland wrote: > > >... > > Very nearly off topic, but I've had a few people on IRC tell me that > > they love O(1) specifically because it has a 'nice that actually > > does something'. As a matter of fact, I've had to change my X > > startup scripts, to make it a bit less selfish; the defaults are > > just plain silly, now. > >... > > This is exactly a reason why O(1) shouldn't go into 2.4: > > E.g. my X is, as suggested by the installation routine of my > distribution (Debian unstable/testing), niced to -10. It would be a bad > surprise for _many_ people if they upgrade their 2.4 kernel because of > other security and/or stability fixes and such a setting is then > wrong. > Same setup, actually -- I changed it to -3 and it seems nicer. As for it going into 2.4, well, I'm not incredibly strongly for it, but I do get a feeling that most of the distros (especially the ones famous for patching their kernels beyond recognizability) will start jumping on this particular wagon soon. Does the kernel want to be like debian ("Well, yeah, the releases are horribly out of date, but normal human beings don't actually _use_ the releases") ? P.S. Do not suppose from this message that I do not love debian immensely. :) ^ permalink raw reply [flat|nested] 28+ messages in thread
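For readers who want to experiment with the nice levels discussed above (X at -10 versus -3), here is a small sketch. It only builds a `nice` command line and reads the calling process's own niceness; the helper names `nice_command` and `current_niceness` are made up for illustration, the -20..19 range is the usual Linux one, and negative values normally need root.

```python
import os


def nice_command(niceness, argv):
    """Return an argv list that starts `argv` under nice(1) at `niceness`."""
    # Linux accepts nice values from -20 (most favoured) to 19 (least).
    if not -20 <= niceness <= 19:
        raise ValueError("niceness must be between -20 and 19")
    return ["nice", "-n", str(niceness)] + list(argv)


def current_niceness():
    """Return the nice value of the calling process (Unix only)."""
    return os.getpriority(os.PRIO_PROCESS, 0)


if __name__ == "__main__":
    # Example: the command an X startup script might use at -3 instead of -10.
    print(" ".join(nice_command(-3, ["X", ":0"])))
    print("this process runs at nice", current_niceness())
```

Whether -3 or -10 is the better value depends on the scheduler in use, which is exactly the point being argued in the thread; the sketch takes no position on that.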
* Re: [OKS] O(1) scheduler in 2.4 2002-07-04 6:56 ` Ingo Molnar 2002-07-04 7:36 ` J Sloan 2002-07-05 6:18 ` Andrew Rodland @ 2002-07-05 9:12 ` William Lee Irwin III 2 siblings, 0 replies; 28+ messages in thread From: William Lee Irwin III @ 2002-07-05 9:12 UTC (permalink / raw) To: Ingo Molnar Cc: Bill Davidsen, Rob Landley, Tom Rini, J.A. Magallon, Linux-Kernel Mailing List On Thu, Jul 04, 2002 at 08:56:01AM +0200, Ingo Molnar wrote: > are these hard numbers? I haven't seen much hard data yet from real-life > servers using the O(1) scheduler. There was lots of feedback from > desktop-class systems that behave better, but servers used to be pretty > good with the previous scheduler as well. I seem to recall some testing having been done demonstrating such differences. I'll ask around when I get back from vacation, though I'll confess it's far afield from my usual interests. Cheers, Bill ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-04 3:36 ` Bill Davidsen 2002-07-04 6:56 ` Ingo Molnar @ 2002-07-04 18:08 ` Rob Landley 2002-07-05 11:17 ` Bill Davidsen 1 sibling, 1 reply; 28+ messages in thread From: Rob Landley @ 2002-07-04 18:08 UTC (permalink / raw) To: Bill Davidsen, Ingo Molnar Cc: Tom Rini, J.A. Magallon, Linux-Kernel Mailing List On Wednesday 03 July 2002 11:36 pm, Bill Davidsen wrote: > This is not some neat feature to buy a few percent better this or that, > this is roughly 50% more users on the server before it falls over, and no > total bogs when many threads change to hog mode at once. > > You will not hear me saying this about preempt, or low-latency, and I bet > that after I try lock-break this weekend I won't feel that I have to have > that either. The O(1) scheduler is self defense against badly behaved > processes, and the reason it should go in mainline is so it won't depend > on someone finding the time to backport the fun stuff from 2.5 as a patch > every time. I've got a similar setup. At work I'm doing a simple ssh based vpn: connections to the vpn address range outside the local subnet are intercepted by port forwarding to a tiny daemon (700 lines of C source, mostly comments), that shells out to ssh (forwarding stdin and stdout back to the net connection) to connect to the appropriate remote gateway, where it runs netcat to complete the connection. So each tcp/ip stream is individually wrapped in its own ssh process, which exits automatically when the connection closes. No mess, no fuss, and scalability is based on active connections rather than the number of systems in the VPN. Unfortunately, some of these VPN gateways are behind existing firewalls (cisco, etc). If I can get a port forwarded to my vpn gateway from that firewall, life is good (it's a little more work for the daemon to figure out where to ssh to, but that's part of the 700 lines). 
But when I can't get that, the machine has to dial out to a known public machine (the "star server") and have its incoming data bounced off of that machine. (Evil, but only incoming connections to those trapped machines need to use the star server. Everybody else can still dial direct, and the trapped machines can still dial out direct.) The star server tends to be running LOTS of ssh processes (four for each connection: one instance of sshd for each incoming connection, plus the netbounce processes that sshd instance runs, which talk to one another through named pipes. I could get that down to two processes by modifying sshd to integrate the netbounce functionality, but it hasn't been a bottleneck. Netbounce doesn't eat much, sshd is the real cpu hog. And it's not as easy to rewrite netbounce to be one central process with a poll loop as you'd think: sshd wants to run SOMETHING. So far I'm using standard sshd code, I'd prefer not to make special purpose modifications to the thing if I can help it.) The bottleneck is that with thirty big data transfers going through sixty sshd processes (which are real CPU hogs decrypting incoming data and encrypting outgoing data), a 700 mhz athlon goes catatonic. The existing bulk data shoveling connections have their data shoveled fine, but new incoming connections (even for short lived "fetch me 10k of web data of the remote box" type connections) are Not Happy. The existing scheduler's getting confused by the fact that the sshd sessions DO sometimes block to get/send their data, and isn't so good at keeping a running average to spot the CPU hogs and the sessions that are more interactive or simply short lived. That's why I'm playing with the O(1) scheduler. I may need to put rate limiting in netbounce anyway, but the problem I'm HITTING is that the existing scheduler is melting down so badly that past a fairly low saturation level, fresh connection attempts through the star server are timing out. 
(This hardware seems like it should be able to handle around 100 simultaneous connections, and it's currently melting down around 30.) Yeah, I'm beating the CPU to death encrypting and decrypting data. Yeah, I could throw more hardware at the problem (and will). I could take another stab at redesigning the star server to consolidate all the netbounce processes into a single poll loop (which would require modifying sshd), but netbounce isn't the problem: the two sshd processes per connection are. (I could merge all the connections to and from each box into a single sshd process per gateway, but that clashes with the way the rest of the VPN works, which is simple and surprisingly reliable, and there would still be at least one per box anyway. And what that really MEANS is that I'd be bypassing the process scheduler and doing my own manual scheduling.) This is a real-world situation of a pure scheduling problem. The star server has a quarter gigabyte of ram and isn't going anywhere near swap. The scheduler has plenty of hints about CPU usage, blocking for I/O, and freshly spawned processes needing to start at a higher priority than entrenched saturation level data shovelers. Hence putting "play with O(1)" on my to-do list... Rob ^ permalink raw reply [flat|nested] 28+ messages in thread
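The per-connection relay described in the message above (one ssh process per TCP stream, running netcat on the remote gateway) can be sketched roughly as follows. This is not Rob's daemon, which is C and not shown in the thread; the function names, the `vpn` user, and the use of `BatchMode=yes` are assumptions made for illustration.

```python
import subprocess


def build_ssh_command(gateway, target_host, target_port, user="vpn"):
    """Command line: ssh to the gateway, then netcat to the final target."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt; fail instead (assumed policy)
        f"{user}@{gateway}",        # hypothetical account on the remote gateway
        "nc", target_host, str(target_port),
    ]


def relay(conn, gateway, target_host, target_port):
    """Hand an accepted TCP socket to ssh as its stdin and stdout."""
    cmd = build_ssh_command(gateway, target_host, target_port)
    # The child inherits the socket descriptor, so ssh reads from and writes
    # to the network connection directly and exits when the stream closes.
    proc = subprocess.Popen(cmd, stdin=conn.fileno(), stdout=conn.fileno())
    conn.close()  # the parent no longer needs its copy of the descriptor
    return proc
```

Because every stream gets its own ssh (and, on the far side, its own sshd), the number of CPU-hungry processes grows with the number of active connections, which is why scheduler fairness matters so much for this design.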
* Re: [OKS] O(1) scheduler in 2.4 2002-07-04 18:08 ` Rob Landley @ 2002-07-05 11:17 ` Bill Davidsen 2002-07-05 15:09 ` Rob Landley 0 siblings, 1 reply; 28+ messages in thread From: Bill Davidsen @ 2002-07-05 11:17 UTC (permalink / raw) To: Rob Landley Cc: Ingo Molnar, Tom Rini, J.A. Magallon, Linux-Kernel Mailing List Rob, while I'm sure O(1) would help, you have designed this network to have a high overhead. I'll send you some notes on how to easily reduce the overhead to max one sshd per machine connected to the bounce machine. And give you an option to move the crypt overhead to the machines at the end points. On Thu, 4 Jul 2002, Rob Landley wrote: [...snip...]> > The bottleneck is that with thirty big data transfers going through sixty > sshd processes (which are real CPU hogs decrypting incoming data and > encrypting outgoing data), a 700 mhz athlon goes catatonic. The existing > bulk data shoveling connections have their data shoveled fine, but new > incoming connections (even for short lived "fetch me 10k of web data of the > remote box" type connections) are Not Happy. The existing scheduler's > getting confused by the fact that the sshd sessions DO sometimes block to > get/send their data, and isn't so good at keeping a running average to spot > the CPU hogs and the sessions that are more interactive or simply short lived. > > That's why I'm playing with the O(1) scheduler. I may need to put rate > limiting in netbounce anyway, but the problem I'm HITTING is that the > existing scheduler is melting down so badly that past a fairly low saturation > level, fresh connection attempts through the star server are timing out. > (This hardware seems like it should be able to handle around 100 simultaneous > connections, and it's currently melting down around 30.) > > Yeah, I'm beating the CPU to death encrypting and decrypting data. Yeah, I > could throw more hardware at the problem (and will). 
I could take another > stab at redesigning the star server to consolidate all the netbounce > processes into a single poll loop (which would require modifying sshd), but > netbounce isn't the problem: the two sshd processes per connection are. (I > could merge all the connections to and from each box into a single sshd > process per gateway, but that clashes with the way the rest of the VPN works, > which is simple and surprisingly reliable, and there would still be at least > one per box anyway. And what that really MEANS is that I'd be bypassing the > process scheduler and doing my own manual scheduling.) > > This is a real-world situation of a pure scheduling problem. The star server > has a quarter gigabyte of ram and isn't going anywhere near swap. The > scheduler has plenty of hints about CPU usage, blocking for I/O, and freshly > spawned processes needing to start at a higher priority than entrenched > saturation level data shovelers. [...snip...] -- bill davidsen <davidsen@tmr.com> CTO, TMR Associates, Inc Doing interesting things with little computers since 1979. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-05 11:17 ` Bill Davidsen @ 2002-07-05 15:09 ` Rob Landley 2002-07-06 4:31 ` Bill Davidsen 0 siblings, 1 reply; 28+ messages in thread From: Rob Landley @ 2002-07-05 15:09 UTC (permalink / raw) To: Bill Davidsen; +Cc: Linux-Kernel Mailing List On Friday 05 July 2002 07:17 am, Bill Davidsen wrote: > Rob, while I'm sure O(1) would help, you have designed this network to > have a high overhead. The star server design has high overhead. The direct point to point does not. I'm trying to make the star server work the way the rest of the network works, WITHOUT extensively redesigning the parts of the network that don't need to use the star server. The star server's inherently a kludge, and I know it, and I'm trying to minimize its use. It's inherently a single point of failure, and a bottleneck on an otherwise distributed and scalable system, and it potentially doubles bandwidth usage and generally MORE than doubles the bandwidth bill because fast connections are expensive and often metered. No redesign will fix those fundamental problems. The star server really exists for political reasons: some people think that having a process behind the firewall go out and fetch incoming connections is more secure than forwarding a port. Either way, a way in exists by definition, or you haven't got a VPN. On a technical level, I really do suggest just forwarding the port. > I'll send you some notes on how to easily reduce the > overhead to max one sshd per machine connected to the bounce machine. And > give you an option to move the crypt overhead to the machines at the end > points. Thanks for your suggestions. I mentioned last time that I could redesign my existing code in a number of ways to get around the old flawed scheduler, yes. And there are easier ones than you suggested (I REALLY don't want to over-complicate what is currently a very simple design). 
And redoing it probably seems like a much easier problem to tackle when you don't know what the full set of design requirements are. (Among other things, nodes move dynamically. They go down, their IP address changes...) I did stop and reconsider your suggestion about removing the star server's redundant decrypt/re-encrypt step. It could be done without introducing a ppp layer (which has several of the aforementioned design requirements problems I won't go into here). Unfortunately, if I did that, the initial handshaking a client box does with the star server (to identify itself and the type of connection it wants to make, etc) wouldn't be encrypted or cryptographically verified either (unless I did it myself, and right now all the encryption is neatly handled by ssh, which I already mentioned not wanting to modify). As I said, if O(1) doesn't work I have options. (I have a long to-do list to get through first but I hope to be able to try it on a stress-testable server sometime after the weekend.) The other thing is that CPU usage should scale with bandwidth shoveled, and that should be mostly true whether it's one process or 100. (Yeah, modulo cache flushing, but it's the same process and that data it works on is use-once stream no matter how you look at it.) The star server is hooked up to the internet, not a LAN. If it's got a faster than 10 megabit connection somebody's putting a LOT of money behind it, they could definitely afford to throw SMP CPU time at the problem, and in that case having multiple processes makes scaling easier. Having CPU usage be a limiting factor was acceptable in the initial design, but the behavior under load of the old scheduler is a bit... unexpected at times. And at THIS point, the question is whether to redesign the app or fix the scheduler. (I expect the multi-threaded people hit this all the time. :) Rob ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-05 15:09 ` Rob Landley @ 2002-07-06 4:31 ` Bill Davidsen 2002-07-06 23:10 ` Rob Landley 0 siblings, 1 reply; 28+ messages in thread From: Bill Davidsen @ 2002-07-06 4:31 UTC (permalink / raw) To: Rob Landley; +Cc: Linux-Kernel Mailing List On Fri, 5 Jul 2002, Rob Landley wrote: > I did stop and reconsider your suggestion about removing the star server's > redundant decrypt/re-encrypt step. It could be done without introducing a > ppp layer (which has several of the aforementioned design requirements > problems I won't go into here). Unfortunately, if I did that, the initial > handshaking a client box does with the star server (to identify itself and > the type of connection it wants to make, etc) wouldn't be encrypted or > cryptographically verified either (unless I did it myself, and right now all > the encryption is neatly handled by ssh, which I already mentioned not > wanting to modify). That's not correct... if you set the encryption type to none the connection and port forwarding are not encrypted, but the handshake still is, using password, host key, or requiring both. You can make a fully authenticated non-encrypted connection. I like running the popular "sleep" program as the main command, and using port forwarding for what you do, since you reject running ppp over ssh. I'm running 19-pre10ac2+smp patches, as I recall ac4 or 5 are out, I just stopped upgrading when I got stability. If you run uni you should be able to drop in the new kernel, push the encryption overhead to the endpoints, and have nearly no work on the star server. -- bill davidsen <davidsen@tmr.com> CTO, TMR Associates, Inc Doing interesting things with little computers since 1979. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-06 4:31 ` Bill Davidsen @ 2002-07-06 23:10 ` Rob Landley 2002-07-07 10:55 ` Bill Davidsen 0 siblings, 1 reply; 28+ messages in thread From: Rob Landley @ 2002-07-06 23:10 UTC (permalink / raw) To: Bill Davidsen; +Cc: Linux-Kernel Mailing List On Saturday 06 July 2002 12:31 am, Bill Davidsen wrote: > That's not correct... if you set the encryption type to none the > connection and port forwarding are not encrypted, but the handshake still > is, using password, host key, or requiring both. You can make a fully > authenticated non-encrypted connection. I like running the popular "sleep" > program as the main command, and using port forwarding for what you do, > since you reject running ppp over ssh. I'm sending data through the connection to tell the star server who we are (which sshd may know but the star server doesn't) which I don't want to have snooped. Box identification keys and such that could be used to forge access. Yes, I could redesign the entire handshaking protocol to be based on an md5 sum with a timestamp, and redo the key distribution of all the other boxes in my network to try to get ssh to inform us of which box is connecting in, but I don't want to. The star server is a kludge. Period. Moving CPU usage won't change that. The problem you're trying to solve ONLY affects the star server. The SIMPLE solution is to put rate limiting into netbounce, which I specced out last week but thought O(1) would be a better solution. It's not the load that's the problem (a T1 line, DSL connection, or 10baseT can't saturate it). It's the unfairness, which was found in laboratory stress testing with a 100baseT network connection and a 700 mhz processor. (If you can afford a 100baseT connection to the internet which you can keep saturated for long periods of time, you can usually afford more than a 700 mhz processor. Really.) 
Moving the limiting factor from CPU time (which is cheap) to network bandwidth (which is expensive) makes the unfairness harder to fix anyway. The O(1) scheduler gives the behavior we want, randomly dropping packets because your network connection is saturated (which is the normal case in the field anyway today) does not. If sheer bandwidth saturation causes a similar problem in the second round of stress testing, I may have to put rate limiting into netbounce anyway, although I'm hoping a combination of processing latency and O(1) will do that for me even when the star server is not CPU-limited but bandwidth limited by real-world internet connections. (I'm not holding my breath, but dealing with the current problem means I can hold off for a while...) You keep trying to find a way to fix the wrong problem: it's not the bandwidth, it's the unfairness. O(1) is a quick and easy fix that might address this (on tuesday), and so far it's only a problem when a kludge I'm trying to minimize the use of is stress tested under laboratory conditions. I only mentioned it in the first place as a real-world application of O(1) to an existing problem, yes one that could be solved in other ways. I could get this thing to work under DOS if I wanted to, without any scheduler at all, I just really don't consider it a good use of time. I would happily have the star server do twice the work if it avoids adding complexity and overhead to the nodes that are NOT using the star server. Doing otherwise is optimizing the wrong thing: the star server is a bad idea requested by management for customers who want to have a VPN without configuring their firewalls to work with it. It -CAN'T- be efficient, it's a single bottleneck for the entire network that's sending every packet through the same interface twice, the only question is how inefficient will it be. 
I'm not rewriting the way the rest of the nodes work to coddle the star server unless I have no choice, and I had about three alternatives lined up to try before I decided that O(1) would be a cleaner and easier thing to try. > I'm running 19-pre10ac2+smp patches, as I recall ac4 or 5 are out, I just > stopped upgrading when I got stability. If you run uni you should be able > to drop in the new kernel, push the encryption overhead to the endpoints, > and have nearly no work on the star server. What the heck does the new kernel have to do with rewriting my app so that ssh is used in a different manner? I could do that on 2.4.18 just fine, and I've repeatedly said I don't want to, and going truly in-depth as to the reasons WHY is off-topic here. I got the O(1) patch for 19-rc1 and will be testing it on my laptop in a few hours. I really don't want to continue this thread. Rob ^ permalink raw reply [flat|nested] 28+ messages in thread
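The "rate limiting in netbounce" alternative mentioned in the message above is not spelled out in the thread. One common way to do it is a per-connection token bucket, sketched below as an assumption rather than as what netbounce does or would do; the class name, the byte-based accounting, and the injectable clock are all illustrative choices.

```python
import time


class TokenBucket:
    """Byte-based token bucket: refills at `rate` bytes/s up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock            # injectable so the bucket can be tested
        self.tokens = float(capacity)  # start full: a new stream may burst
        self.last = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def try_consume(self, nbytes):
        """Spend tokens and return True if nbytes may be forwarded now."""
        self._refill()
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

    def wait_time(self, nbytes):
        """Seconds until nbytes could be forwarded (0.0 if possible now)."""
        self._refill()
        return max(0.0, (nbytes - self.tokens) / self.rate)
```

A relay loop would call `try_consume(len(chunk))` before forwarding each chunk and sleep for `wait_time(...)` otherwise. Starting each bucket full lets short-lived connections finish quickly while long bulk transfers are held to the sustained rate, which is roughly the fairness property the thread hopes to get from the scheduler instead.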
* Re: [OKS] O(1) scheduler in 2.4 2002-07-06 23:10 ` Rob Landley @ 2002-07-07 10:55 ` Bill Davidsen 0 siblings, 0 replies; 28+ messages in thread From: Bill Davidsen @ 2002-07-07 10:55 UTC (permalink / raw) To: Rob Landley; +Cc: Linux-Kernel Mailing List On Sat, 6 Jul 2002, Rob Landley wrote: > I got the O(1) patch for 19-rc1 and will be testing it on my laptop in a few > hours. I really don't want to continue this thread. Agreed, I haven't been able to communicate the idea to you in several tries or you would not be talking about rewriting your application (unless changing router IP requires that). You appear to be looking for the best way to do something which doesn't need to be done, and I wish you joy at it. -- bill davidsen <davidsen@tmr.com> CTO, TMR Associates, Inc Doing interesting things with little computers since 1979. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-01 23:44 ` J.A. Magallon 2002-07-02 2:48 ` Tom Rini @ 2002-07-02 16:05 ` venom 2002-07-02 16:53 ` Tomas Szepe 1 sibling, 1 reply; 28+ messages in thread From: venom @ 2002-07-02 16:05 UTC (permalink / raw) To: J.A. Magallon; +Cc: Tom Rini, Bill Davidsen, Linux-Kernel Mailing List On Tue, 2 Jul 2002, J.A. Magallon wrote: > Date: Tue, 2 Jul 2002 01:44:32 +0200 > From: J.A. Magallon <jamagallon@able.es> > To: Tom Rini <trini@kernel.crashing.org> > Cc: Bill Davidsen <davidsen@tmr.com>, > Linux-Kernel Mailing List <linux-kernel@vger.kernel.org> > Subject: Re: [OKS] O(1) scheduler in 2.4 > > > On 2002.07.01 Tom Rini wrote: > >On Mon, Jul 01, 2002 at 01:52:54PM -0400, Bill Davidsen wrote: > > > >> What's the issue? > > > >a) We're at 2.4.19-rc1 right now. It would be horribly > >counterproductive to put O(1) in right now. > > .20-pre1 would be a good start, but my hope is that this reserved for > the vm updates from -aa ;). If I am not wrong in the AA tree the O(1) scheduler has been merged, so there is an opportunity do update booth ;). > > >c) I also suspect that it hasn't been as widely tested on !x86 as the > >stuff currently in 2.4. And again, 2.4 is the stable tree. > > > Well, I think it has been subjected to high-quality testing, even if the tester base was quite small... ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-02 16:05 ` venom @ 2002-07-02 16:53 ` Tomas Szepe 0 siblings, 0 replies; 28+ messages in thread From: Tomas Szepe @ 2002-07-02 16:53 UTC (permalink / raw) To: linux-kernel > > >> What's the issue? > > > > > >a) We're at 2.4.19-rc1 right now. It would be horribly > > >counterproductive to put O(1) in right now. > > > > .20-pre1 would be a good start, but my hope is that this reserved for > > the vm updates from -aa ;). > > If I am not wrong in the AA tree the O(1) scheduler has been merged, so > there is an opportunity do update booth ;). ... and then hope the thing doesn't turn into a suicide booth. T. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-01 18:12 ` Tom Rini 2002-07-01 23:44 ` J.A. Magallon @ 2002-07-02 14:46 ` Bill Davidsen 2002-07-02 15:12 ` Tom Rini 1 sibling, 1 reply; 28+ messages in thread From: Bill Davidsen @ 2002-07-02 14:46 UTC (permalink / raw) To: Tom Rini; +Cc: Linux-Kernel Mailing List On Mon, 1 Jul 2002, Tom Rini wrote: > On Mon, Jul 01, 2002 at 01:52:54PM -0400, Bill Davidsen wrote: > > > What's the issue? > > a) We're at 2.4.19-rc1 right now. It would be horribly > counterproductive to put O(1) in right now. > b) 2.4 is the _stable_ tree. If every big change in 2.5 got back ported > to 2.4, it'd be just like 2.5 :) > c) I also suspect that it hasn't been as widely tested on !x86 as the > stuff currently in 2.4. And again, 2.4 is the stable tree. Since 2.5 feature freeze isn't planned until fall, I think you can assume there will be releases after 2.4.19... Since it has been as heavily tested as any feature not in a stable release kernel can be, there seems little reason to put it off for a year, assuming 2.6 releases within six months of feature freeze. Stable doesn't mean moribund, we are working Andrea's VM stuff in, and that's a LOT more likely to behave differently on hardware with other word length. Keeping inferior performance for another year and then trying to separate 2.5 other unintended features from any possible scheduler issues seems like a reduction in stability for 2.6. -- bill davidsen <davidsen@tmr.com> CTO, TMR Associates, Inc Doing interesting things with little computers since 1979. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-02 14:46 ` Bill Davidsen @ 2002-07-02 15:12 ` Tom Rini 2002-07-04 4:02 ` Bill Davidsen 0 siblings, 1 reply; 28+ messages in thread From: Tom Rini @ 2002-07-02 15:12 UTC (permalink / raw) To: Bill Davidsen; +Cc: Linux-Kernel Mailing List On Tue, Jul 02, 2002 at 10:46:56AM -0400, Bill Davidsen wrote: > On Mon, 1 Jul 2002, Tom Rini wrote: > > > On Mon, Jul 01, 2002 at 01:52:54PM -0400, Bill Davidsen wrote: > > > > > What's the issue? > > > > a) We're at 2.4.19-rc1 right now. It would be horribly > > counterproductive to put O(1) in right now. > > b) 2.4 is the _stable_ tree. If every big change in 2.5 got back ported > > to 2.4, it'd be just like 2.5 :) > > c) I also suspect that it hasn't been as widely tested on !x86 as the > > stuff currently in 2.4. And again, 2.4 is the stable tree. > > Since 2.5 feature freeze isn't planned until fall, I think you can assume > there will be releases after 2.4.19... I sure hope so, I've got a whole bunch of PPC stuff that's been around for ages now that just might make it into 2.4.20 :) > Since it has been as heavily tested > as any feature not in a stable release kernel can be, there seems little > reason to put it off for a year, assuming 2.6 releases within six months > of feature freeze. Sure there is. It's called stopping feature creep. O(1) is a nice feature, but so is the bio stuff, the initcall levels, and other things in 2.5 as well. But should we back port all of these to 2.4 as well? > Stable doesn't mean moribund, we are working Andrea's VM stuff in, and > that's a LOT more likely to behave differently on hardware with other word > length. Being someone who actually works on !x86 hardware all of the time, I'm slightly wary of Andrea's VM work as well. But it's also something which has been split into numerous small chunks, so hopefully problems will be spotted. 
> Keeping inferior performance for another year and then trying to > separate 2.5 other unintended features from any possible scheduler issues > seems like a reduction in stability for 2.6. It's no more of a reduction in stability than not back porting everything else. And making things stable is why eventually Linus says 'enough' and kicks out 2.stable.0-test1. Anyhow, since this isn't a subsystem backport, but part of the core kernel, I would think that you could only get limited use out of the testing (I remember reading some of the O(1) announcements for 2.4.then-current and reading about small bugs that weren't in the 2.5 version). -- Tom Rini (TR1265) http://gate.crashing.org/~trini/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-02 15:12 ` Tom Rini @ 2002-07-04 4:02 ` Bill Davidsen 2002-07-04 4:17 ` Tom Rini 0 siblings, 1 reply; 28+ messages in thread From: Bill Davidsen @ 2002-07-04 4:02 UTC (permalink / raw) To: Tom Rini; +Cc: Linux-Kernel Mailing List On Tue, 2 Jul 2002, Tom Rini wrote: > Sure there is. It's called stopping feature creep. O(1) is a nice > feature, but so is the bio stuff, the initcall levels, and other things > in 2.5 as well. But should we back port all of these to 2.4 as well? None of the other stuff is (a) a solution for any current problem I've seen (it's NEW capability), or (b) has a functional and widely exposed port to 2.4 already. The only other feature with which I'm familiar which even remotely fits those two characteristics is rmap, and with the VM changes Andrea has made I certainly don't hit really bad VM behaviour on my machines. On some rmap is a tad better, but compared to 2.4.16 or so 19-preX-aa is acceptable. > > Stable doesn't mean moribund, we are working Andrea's VM stuff in, and > > that's a LOT more likely to behave differently on hardware with other word > > length. > > Being someone who actually works on !x86 hardware all of the time, I'm > slightly wary of Andrea's VM work as well. But it's also something > which has been split into numerous small chunks, so hopefully problems > will be spotted. > > > Keeping inferior performance for another year and then trying to > > separate 2.5 other unintended features from any possible scheduler issues > > seems like a reduction in stability for 2.6. > > It's no more of a reduction in stability than not back porting > everything else. And making things stable is why eventually Linus says > 'enough' and kicks out 2.stable.0-test1. 
Anyhow, since this isn't a > subsystem backport, but part of the core kernel, I would think that you > could only get limited use out of the testing (I remember reading some > of the O(1) announcements for 2.4.then-current and reading about small > bugs that weren't in the 2.5 version). The current scheduler has one big bug; it gives the processor to the wrong process under some load conditions to the point where the system appears hung for seconds (or longer). There are two issues, one is best or acceptable performance, and one is "best worst-case performance." The O(1) simply doesn't have or hasn't shown me the jackpot case when load changes on a machine. To me that justifies O(1). Even if it was not faster than the current scheduler for normal load, the worst case is what needs a fix. And as I mentioned to Ingo, I don't feel that way about low-latency or preempt, even though they help a little they don't really fix anything broken, and I don't argue for inclusion. The current scheduler does behave very badly in some cases, and should be fixed now, not in 18 months. -- bill davidsen <davidsen@tmr.com> CTO, TMR Associates, Inc Doing interesting things with little computers since 1979. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4 2002-07-04 4:02 ` Bill Davidsen @ 2002-07-04 4:17 ` Tom Rini 0 siblings, 0 replies; 28+ messages in thread From: Tom Rini @ 2002-07-04 4:17 UTC (permalink / raw) To: Bill Davidsen; +Cc: Linux-Kernel Mailing List On Thu, Jul 04, 2002 at 12:02:20AM -0400, Bill Davidsen wrote: > On Tue, 2 Jul 2002, Tom Rini wrote: > > > Sure there is. It's called stopping feature creep. O(1) is a nice > > feature, but so is the bio stuff, the initcall levels, and other things > > in 2.5 as well. But should we back port all of these to 2.4 as well? > > None of the other stuff is (a) a solution for any current problem I've > seen (it's NEW capability), or (b) has a functional and widely exposed port > to 2.4 already. I believe (b), but bio is attempting to solve some of the underlying block device issues. And all of the IDE stuff is trying to make a good IDE subsystem. And so on and so forth. > The only other feature with which I'm familiar which even remotely fits > those two characteristics is rmap, and with the VM changes Andrea has made > I certainly don't hit really bad VM behaviour on my machines. On some rmap > is a tad better, but compared to 2.4.16 or so 19-preX-aa is acceptable. And rmap isn't in 2.4. And I don't think it will be, nor IMHO some parts of -aa. > > It's no more of a reduction in stability than not back porting > > everything else. And making things stable is why eventually Linus says > > 'enough' and kicks out 2.stable.0-test1. Anyhow, since this isn't a > > subsystem backport, but part of the core kernel, I would think that you > > could only get limited use out of the testing (I remember reading some > > of the O(1) announcements for 2.4.then-current and reading about small > > bugs that weren't in the 2.5 version). > > The current scheduler has one big bug; it gives the processor to the wrong > process under some load conditions to the point where the system appears > hung for seconds (or longer). 
So, in some corner cases it sucks. The VM has issues for corner cases as well, which is why distros include lots of other patches in their kernels. > And as I mentioned to Ingo, I don't feel that way about low-latency or > preempt, even though they help a little they don't really fix anything > broken, and I don't argue for inclusion. The current scheduler does behave > very badly in some cases, and should be fixed now, not in 18 months. I don't think the low-latency, preempt or O(1) should make it into 2.4. And since Ingo, who wrote this, doesn't think it should go into 2.4 right now, it hopefully won't. Just because some corner cases can be fixed by massive rewrites doesn't mean the fix should be backported. It seems I can't stress this enough, 2.4 is supposed to be _stable_. And by stable I mean doesn't crash, lock up, or panic. Less than ideal VM usage or CPU usage generally isn't solvable in small easily verifiable patches like fixing crashes, lock ups and panics are. I'm not saying people shouldn't use O(1) (or preempt or low-latency or a half dozen other things not in 2.4 proper), just that they shouldn't go into 2.4.<current>. Vendors should decide if they want to add them on top of a stable base. Users should decide if they want to add them on top of a stable base. -- Tom Rini (TR1265) http://gate.crashing.org/~trini/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4
  2002-07-01 17:52 [OKS] O(1) scheduler in 2.4 Bill Davidsen
  2002-07-01 18:12 ` Tom Rini
@ 2002-07-01 18:49 ` Ingo Molnar
  2002-07-02 15:07 ` Bill Davidsen
  1 sibling, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2002-07-01 18:49 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux-Kernel Mailing List

On Mon, 1 Jul 2002, Bill Davidsen wrote:

> What's the issue? The most popular trees have been using it without
> issue for six months or so, and I know of no cases of bad behaviour.
> [...]

well, the patch is barely 6 months old. A new scheduler changes the
'heart' of the kernel and something like that should not be done for the
stable branch, especially since it has finally started to converge towards
a state that can be called stable ...

> [...] I know there are people who don't believe in the preempt patch,
> but the new scheduler seems to work better under both desktop and server
> load.

well, the preempt patch is rather for RT-type workloads where milliseconds
matter, which improvements are not a matter of belief, but a matter of
hard latencies. Mere mortals should hardly notice its effects under normal
loads - perhaps a bit more 'snappiness'. But such effects do accumulate
up, and people are seeing visible improvements with combo-patches of
lowlat-lockbreak+preempt+O(1).

	Ingo

^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [OKS] O(1) scheduler in 2.4
  2002-07-01 18:49 ` Ingo Molnar
@ 2002-07-02 15:07 ` Bill Davidsen
  0 siblings, 0 replies; 28+ messages in thread
From: Bill Davidsen @ 2002-07-02 15:07 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Linux-Kernel Mailing List

On Mon, 1 Jul 2002, Ingo Molnar wrote:

>
> On Mon, 1 Jul 2002, Bill Davidsen wrote:
>
> > What's the issue? The most popular trees have been using it without
> > issue for six months or so, and I know of no cases of bad behaviour.
> > [...]
>
> well, the patch is barely 6 months old. A new scheduler changes the
> 'heart' of the kernel and something like that should not be done for the
> stable branch, especially since it has finally started to converge towards
> a state that can be called stable ...

As I noted, the VM changes which are going in without objection are more
likely to be a cause of problems caused by word length, memory
organization, etc. And they work fine, at least for Intel and SPARC.

O(1) has been as tested as any feature can be, certainly -ac kernels are
run by more people than 2.5 kernels, and running the best process is less
likely to be hardware dependent.

There is a big win with this scheduler, it keeps the system running far
better on mixed loads, and does it without hours of playing with nice() to
get things balanced.

> > [...] I know there are people who don't believe in the preempt patch,
> > but the new scheduler seems to work better under both desktop and server
> > load.
>
> well, the preempt patch is rather for RT-type workloads where milliseconds
> matter, which improvements are not a matter of belief, but a matter of
> hard latencies. Mere mortals should hardly notice its effects under normal
> loads - perhaps a bit more 'snappiness'. But such effects do accumulate
> up, and people are seeing visible improvements with combo-patches of
> lowlat-lockbreak+preempt+O(1).

Last time I tried that, I used all but lockbreak, and the only place I saw
anything for my loads was slightly lower latency for a slow machine
playing router. But I'm running news and dns servers, and O(1) seems to
drop the load average by about 15% (as much as you can measure on a
machine with 400% swings in the demand ;-)

Thanks for the input, I just don't see that there will ever be a better
time to put it in, the 2.5 kernel is very lightly used and tested, and has
enough other things happening to mask anything short of a disaster. And
2.6 will be another stable kernel, at least numerically, initially with
much less testing.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply [flat|nested] 28+ messages in thread