* Re: 2.4.18pre4aa1 @ 2002-01-24 5:23 rwhron 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:11 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 2 replies; 28+ messages in thread From: rwhron @ 2002-01-24 5:23 UTC (permalink / raw) To: linux-kernel Changelog with history at: http://home.earthlink.net/~rwhron/kernel/2.4.18pre4aa1.html Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: http://home.earthlink.net/~rwhron/kernel/k6-2-475.html -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 5:23 2.4.18pre4aa1 rwhron @ 2002-01-24 6:27 ` Daniel Phillips 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron 2002-01-25 0:11 ` 2.4.18pre4aa1 Andrea Arcangeli 1 sibling, 2 replies; 28+ messages in thread From: Daniel Phillips @ 2002-01-24 6:27 UTC (permalink / raw) To: rwhron, linux-kernel On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O benchmark, but it does create a high load, and may put some pressure on the cpu and i/o schedulers. Each dbench process creates about 21 megabytes worth of files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB for the dbench runs. Big enough so the tests cannot run from the buffer/page caches on this box." Thanks kindly for the testing, but please don't use dbench any more for benchmarks. If you are testing stability, fine, but dbench throughput numbers are not good for much more than wild goose chases. Even when mostly uncached, dbench still produces flaky results. -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-25 0:09 ` Andrea Arcangeli 2002-01-28 9:53 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron 1 sibling, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-25 0:09 UTC (permalink / raw) To: Daniel Phillips; +Cc: rwhron, linux-kernel On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O benchmark, > but it does create a high load, and may put some pressure on the cpu and > i/o schedulers. Each dbench process creates about 21 megabytes worth of > files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB for the dbench runs. Big > enough so the tests cannot run from the buffer/page caches on this box." > > Thanks kindly for the testing, but please don't use dbench any more for > benchmarks. If you are testing stability, fine, but dbench throughput > numbers are not good for much more than wild goose chases. > > Even when mostly uncached, dbench still produces flaky results. this is not entirely true. dbench has value. the only problem with dbench is that you can trivially cheat and change the kernel in a broken way, optimal _only_ for dbench, just to get stellar dbench numbers. but that is definitely not the case with the -aa tree: -aa is definitely not optimized for dbench. in fact the recent improvement most probably comes from the dyn-sched and bdflush hysteresis changes, not from vm changes at all (in fact there were no recent significant vm changes in the page replacement and aging algorithms).
rmap instead sucks in most of the benchmarks because of the noticeable overhead of maintaining those reverse maps, which start to help only by the time you need to swap/pageout (totally useless, pure overhead, for number crunching, database self-caching etc.). This is the only issue with the rmap design and you can definitely see it in the numbers. Here I'm only speaking about the design; I never checked the current implementation. Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-28 9:53 ` Daniel Phillips 2002-01-28 15:29 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 1 reply; 28+ messages in thread From: Daniel Phillips @ 2002-01-28 9:53 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: rwhron, linux-kernel On January 25, 2002 01:09 am, Andrea Arcangeli wrote: > On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > > > "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O > > benchmark, but it does create a high load, and may put some pressure on > > the cpu and i/o schedulers. Each dbench process creates about 21 > > megabytes worth of files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB > > for the dbench runs. Big enough so the tests cannot run from the > > buffer/page caches on this box." > > > > Thanks kindly for the testing, but please don't use dbench any more for > > benchmarks. If you are testing stability, fine, but dbench throughput > > numbers are not good for much more than wild goose chases. > > > > Even when mostly uncached, dbench still produces flaky results. > > this is not enterely true. dbench has a value. Yes, but not for benchmarks. It has value only as a stability test - while it may in some cases provide some general indication of performance, its variance is far too large, even under controlled conditions, for it to have any value as a benchmark. I'm surprised you'd even suggest this. Andrea, please, if we want good benchmarks let's at least be clear on what tools benchmarkers should/should not be using. > the only problem with > dbench is that you can trivially cheat and change the kernel in a broken > way, but optimal _only_ for dbench, just to get stellar dbench numbers, No, this is not the only problem. DBench is just plain *flaky*. 
You don't appear to be clear on why. In short, dbench has two main flaws: - It's extremely sensitive to scheduling. If one process happens to make progress then it gets more heavily cached and its progress becomes even greater. The benchmark completes much more quickly in this case, whereas if all processes progress at nearly the same rate (by chance) it runs more slowly. - It can happen (again by chance) that dbench files get deleted while still in cache, and this process completes in a fraction of the time that real disk IO would require. I've seen successive runs of dbench *under identical conditions* (that is, from a clean reboot etc.) vary by as much as 30%. Others report even greater variance. Can we please agree that dbench is useless for benchmarks? -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
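[Editor's aside: Daniel's 30% figure is easy to check against any set of repeated runs. A minimal sketch; the throughput numbers below are hypothetical, chosen only to illustrate a dbench-sized spread, not taken from the thread.]

```python
import statistics

def spread(samples):
    """Return the mean and the max-min spread as a fraction of the mean.

    A benchmark whose identical-condition runs spread by ~30% of the
    mean cannot resolve the 10-20% differences kernels are usually
    compared on."""
    mean = statistics.mean(samples)
    return mean, (max(samples) - min(samples)) / mean

# Hypothetical throughputs (MB/sec) from repeated identical runs:
runs = [25.2, 19.8, 22.4, 17.9, 24.1]
mean, frac = spread(runs)
print(f"mean {mean:.1f} MB/sec, spread {frac:.0%} of the mean")
```

With a spread that wide, a single run per kernel says little; at minimum a report needs repeated runs and the variation between them, which is what Randy's duplicated entries later in the thread provide.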
* Re: 2.4.18pre4aa1 2002-01-28 9:53 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-28 15:29 ` Andrea Arcangeli 2002-01-28 20:28 ` 2.4.18pre4aa1 Daniel Phillips 0 siblings, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-28 15:29 UTC (permalink / raw) To: Daniel Phillips; +Cc: rwhron, linux-kernel On Mon, Jan 28, 2002 at 10:53:25AM +0100, Daniel Phillips wrote: > On January 25, 2002 01:09 am, Andrea Arcangeli wrote: > > On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > > > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > > > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > > > > > "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O > > > benchmark, but it does create a high load, and may put some pressure on > > > the cpu and i/o schedulers. Each dbench process creates about 21 > > > megabytes worth of files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB > > > for the dbench runs. Big enough so the tests cannot run from the > > > buffer/page caches on this box." > > > > > > Thanks kindly for the testing, but please don't use dbench any more for > > > benchmarks. If you are testing stability, fine, but dbench throughput > > > numbers are not good for much more than wild goose chases. > > > > > > Even when mostly uncached, dbench still produces flaky results. > > > > this is not enterely true. dbench has a value. > > Yes, but not for benchmarks. It has value only as a stability test - while > it may in some cases provide some general indication of performance, its > variance is far too large, even under controlled conditions, for it to have > any value as a benchmark. I'm surprised you'd even suggest this. > > Andrea, please, if we want good benchmarks let's at least be clear on what > tools benchmarkers should/should not be using. 
> > > the only problem with > > dbench is that you can trivially cheat and change the kernel in a broken > > way, but optimal _only_ for dbench, just to get stellar dbench numbers, > > No, this is not the only problem. DBench is just plain *flaky*. You don't > appear to be clear on why. In short, dbench has two main flaws: > > - It's extremely sensitive to scheduling. If one process happens to make > progress then it gets more heavily cached and its progress becomes even > greater. The benchmark completes much more quickly in this case, whereas > if all processes progress at nearly the same rate (by chance) it runs > more slowly. > > - It can happen (again by chance) that dbench files get deleted while still > in cache, and this process completes in a fraction of the time that real > disk IO would require. > > I've seen successsive runs of dbench *under identical conditions* (that is, > from a clean reboot etc.) vary by as much as 30%. Others report even greater > variance. Can we please agree that dbench is useless for benchmarks? I've never seen it vary 30% on the same kernel. Anyways dbench mostly tells you about the elevator etc... it's a good test to check the elevator is working properly: the ++ must be mixed with the dots etc... if the elevator is aggressive enough. Of course that means the elevator is not perfectly fair, but that's the whole point of having an elevator. It is also an interesting test for page replacement, but with page replacement it would be possible to write a broken algorithm that produces good numbers; that's the thing I believe to be bad about dbench (oh, like tiotest fake numbers too of course).
Other than this it just shows rmap12a has an elevator not aggressive enough, which is probably true; I doubt it has anything to do with the VM changes in rmap (of course the significant overhead of the rmap design is helping to slow it down too), more likely the bomb_segments logic from Andrew that Rik has included. In fact the broken page replacement that leaves old stuff in cache might if anything generate more unfairness, which should generate faster dbench numbers for rmap, but on this last bit I'm not 100% sure (AFAIK to get a fast dbench by cheating with the vm you need to make sure to cache lots of the readahead as well, including the readahead not used yet, but I'm not 100% sure about the effect of leaving old pollution in cache rather than recycling it; I never attempted it). Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-28 15:29 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-28 20:28 ` Daniel Phillips 2002-01-28 23:40 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 1 reply; 28+ messages in thread From: Daniel Phillips @ 2002-01-28 20:28 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: rwhron, linux-kernel On January 28, 2002 04:29 pm, Andrea Arcangeli wrote: > On Mon, Jan 28, 2002 at 10:53:25AM +0100, Daniel Phillips wrote: > > On January 25, 2002 01:09 am, Andrea Arcangeli wrote: > > > On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > > > > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > > > Even when mostly uncached, dbench still produces flaky results. > > [...] > > > the only problem with > > > dbench is that you can trivially cheat and change the kernel in a broken > > > way, but optimal _only_ for dbench, just to get stellar dbench numbers, > > > > No, this is not the only problem. DBench is just plain *flaky*. You don't > > appear to be clear on why. In short, dbench has two main flaws: > > > > - It's extremely sensitive to scheduling. If one process happens to make > > progress then it gets more heavily cached and its progress becomes even > > greater. The benchmark completes much more quickly in this case, whereas > > if all processes progress at nearly the same rate (by chance) it runs > > more slowly. > > > > - It can happen (again by chance) that dbench files get deleted while still > > in cache, and this process completes in a fraction of the time that real > > disk IO would require. > > > > I've seen successsive runs of dbench *under identical conditions* (that is, > > from a clean reboot etc.) vary by as much as 30%. Others report even greater > > variance. Can we please agree that dbench is useless for benchmarks? > > I never seen it to vary 30% on the same kernel. Just ask around. Marcelo or Andrew Morton would be a good place to start. > Anyways dbench tells you mostly about elevator etc... 
it's a good test > to check the elevator is working properly, the ++ must be mixed with the > dots etc... if the elevator is aggressive enough. Of course that means > the elevator is not perfectly fair but that's the whole point about > having an elevator. It is also an interesting test for page replacement, > but with page replacement it would be possible to write a broken > algorithm that produces good numbers, that's the thing I believe to be > bad about dbench (oh, like tiotest fake numbers too of course). Other > than this it just shows rmap12a has an elevator not aggressive enough > which is probably true, I doubt it has anything to do with the VM > changes in rmap (of course rmap design significant overhead is helping > to slow it down too though), more likely the bomb_segments logic from > Andrew that Rik has included, infact the broken page replacement that > lefts old stuff in cache if something might generate more unfairness > that should generate faster dbench numbers for rmap, but on this last > bit I'm not 100% sure (AFIK to get a fast dbench by cheating with the vm > you need to make sure to cache lots of the readahead as well (also the > one not used yet), but I'm not 100% sure on the effect of lefting old > pollution in cache rather than recycling it, I never attempted it). Interesting analysis. It's a hint at how hard the elevator problem really is. Fairness as in 'equal load distribution' is not the best policy under heavy load, just as it is not the best policy under heavy swapping. Exactly what kind of unfairness is best, though, is a deep, difficult question. I'll bet it doesn't get seriously addressed even in this kernel cycle, or at best, very late in the cycle after the big infrastructure changes settle down. -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-28 20:28 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-28 23:40 ` Andrea Arcangeli 2002-01-29 0:15 ` 2.4.18pre4aa1 Daniel Phillips 0 siblings, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-28 23:40 UTC (permalink / raw) To: Daniel Phillips; +Cc: rwhron, linux-kernel On Mon, Jan 28, 2002 at 09:28:24PM +0100, Daniel Phillips wrote: > Just ask around. Marcelo or Andrew Morton would be a good place to start. ah, btw, if you test with a broken page replacement (kind of random) it's normal to get huge variations. But with my -aa tree you should never get a significant difference (no matter if it's Marcelo or Andrew running the benchmark). I should also say I always mke2fs first when I run my benchmarks, so I don't factor possible filesystem layout differences into the equation, but I doubt (unless you're running a corner case like running out of space or stuff like that) that it will make a significant difference either. > > Anyways dbench tells you mostly about elevator etc... it's a good test > > to check the elevator is working properly, the ++ must be mixed with the > > dots etc... if the elevator is aggressive enough. Of course that means > > the elevator is not perfectly fair but that's the whole point about > > having an elevator. It is also an interesting test for page replacement, > > but with page replacement it would be possible to write a broken > > algorithm that produces good numbers, that's the thing I believe to be > > bad about dbench (oh, like tiotest fake numbers too of course).
Other > > than this it just shows rmap12a has an elevator not aggressive enough > > which is probably true, I doubt it has anything to do with the VM > > changes in rmap (of course rmap design significant overhead is helping > > to slow it down too though), more likely the bomb_segments logic from > > Andrew that Rik has included, infact the broken page replacement that > > lefts old stuff in cache if something might generate more unfairness > > that should generate faster dbench numbers for rmap, but on this last > > bit I'm not 100% sure (AFIK to get a fast dbench by cheating with the vm > > you need to make sure to cache lots of the readahead as well (also the > > one not used yet), but I'm not 100% sure on the effect of lefting old > > pollution in cache rather than recycling it, I never attempted it). > > Interesting analysis. It's a hint at how hard the elevator problem really > is. Fairness as in 'equal load distribution' is not the best policy under > heavy load, just as it is not the best policy under heavy swapping. Exactly as always it depends whether the objective is throughput or latency; for dbench, throughput is the objective. Also the function between throughput and latency is not linear, and it depends on too many factors to find an elevator algorithm that works well on paper. So, with that in mind, one vapourware idea I had while reading your email is to use the feedback from the generated output throughput to know when it's worthwhile to decrease or increase the latency. If decreasing latency doesn't decrease the final throughput generated, that means we're ok to decrease latency even more. As soon as the throughput decreases (despite people waiting on the submit_bh pipeline), we know we'd better not decrease latency further, unless we want to hurt performance. The current elevator (not rmap) is always very permissive, so throughput is ok in dbench (and anything seeking as hard as dbench), but latency often sucks (actually in -aa I decreased the read latency so it's acceptable, not like in mainline, but it's still far from being very reactive under a write flood). The feedback from the output channel to control the latency parameters in a dynamic manner may help to decrease latency when possible (not unconditionally with elvtune). One of the things I love about analog electronics is the operational amplifier chips; a feedback loop solves so many difficult problems so easily. Software can often do similar things. Anyways this is just vapourware (probably quite complex to implement in a generic manner), but fixed algorithms are not likely to give us a solution (we'll be either too permissive or too slow in dbench), while this kind of feedback sounds like something that may solve the problem dynamically, or maybe I'm simply just dreaming :). > what kind of unfairness is best, though, is a deep, difficult question. I'll > bet it doesn't get seriously addressed even in this kernel cycle, or at best, > very late in the cycle after the big infrastructure changes settle down. > > -- > Daniel Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-28 23:40 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-29 0:15 ` Daniel Phillips 2002-01-29 13:05 ` 2.4.18pre4aa1 Pavel Machek 0 siblings, 1 reply; 28+ messages in thread From: Daniel Phillips @ 2002-01-29 0:15 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: rwhron, linux-kernel On January 29, 2002 12:40 am, Andrea Arcangeli wrote: > On Mon, Jan 28, 2002 at 09:28:24PM +0100, Daniel Phillips wrote: > > Just ask around. Marcelo or Andrew Morton would be a good place to start. > > ah, btw, if you test with a broken page replacement (kind of random) > it's normal you get huge variations. > > But with my -aa tree, you should never get a significant difference (no > matter if it's Marcelo or Andrew to run the benchmark). Oh, that's interesting, and actually I can see why that might be (feedback in your VM is quite predictable, so it isn't prone to oscillation). It's not just the VM that affects dbench's running pattern though, it's also scheduling. > I've also to say I always mke2fs first when I run my benchmarks, Yes, and it would be nice if we had an operation to squeeze cache down to its minimum size (whatever that means) just for running benchmarks accurately without rebooting. > so I don't consider > possible filesystem layout differences into the equation but I doubt > (unless you're running with a corner case like running out of space or > stuff like that), that it will make a significant difference either. > > > Anyways dbench tells you mostly about elevator etc... it's a good test > > > to check the elevator is working properly, the ++ must be mixed with the > > > dots etc... if the elevator is aggressive enough. Of course that means > > > the elevator is not perfectly fair but that's the whole point about > > > having an elevator. 
It is also an interesting test for page replacement, > > > but with page replacement it would be possible to write a broken > > > algorithm that produces good numbers, that's the thing I believe to be > > > bad about dbench (oh, like tiotest fake numbers too of course). Other > > > than this it just shows rmap12a has an elevator not aggressive enough > > > which is probably true, I doubt it has anything to do with the VM > > > changes in rmap (of course rmap design significant overhead is helping > > > to slow it down too though), more likely the bomb_segments logic from > > > Andrew that Rik has included, infact the broken page replacement that > > > lefts old stuff in cache if something might generate more unfairness > > > that should generate faster dbench numbers for rmap, but on this last > > > bit I'm not 100% sure (AFIK to get a fast dbench by cheating with the vm > > > you need to make sure to cache lots of the readahead as well (also the > > > one not used yet), but I'm not 100% sure on the effect of lefting old > > > pollution in cache rather than recycling it, I never attempted it). > > > > Interesting analysis. It's a hint at how hard the elevator problem really > > is. Fairness as in 'equal load distribution' is not the best policy under > > heavy load, just as it is not the best policy under heavy swapping. Exactly > > as always it depends if the object is throughput or latency, for dbench > that's the object. > > Also the function between throughtput and latency is not linear and it > depends on too many factors to find an elevator algorithm that works > well on the paper. > > So, in function of that, one vapourware idea I had while reading your > email is to use the feedback from the output througput generated to know > when it's worthwhile to decrease or increase the latency. If decreasing > latency doesn't decrease the final throughput generated, that means > we're ok to decrease latency even more. 
As soon as the throughput > decreases (despite of people waiting on the submit_bh pipeline), we know > we'd better not decrease latency further, unless we want to hurt > performance. But what is the knob by which you control latency? > The current elevator (not rmap) is always very permissive, so throughput > is ok in dbench (and anything seeking as hard as dbench), but latency > often sucks (actually in -aa I decreased the read latency so it's > acceptable, not like in mainline, but still it's far from being very > reactive under a write flood). The feedback from the output channel to > control the latency parameters in a dynamic manner may help to decrease > latency when possible (not unconditionally with elvtune). One of the > thing I love about the analog electronics are the operational chips, a > feedback loop solves so much difficult problems so easily. Software can > do similar things lots of times. Oh yes, that's exactly the way I think of these things, and I did experiment with a similar idea, my 'early flush with bandwidth estimation', earlier this year. What I found is, it's very hard to get a good 'signal' by tracking kernel statistics. By the time I averaged the disk bandwidth enough to get a smooth signal, the lag was way too high to be useful. The statistics just aren't very continuous, so they tend to resist analysis by analog methods. Note: they resist analysis, they don't defy it. > Anyways this is just vapourware > (probably quite complex to implement in a generic manner) but fixed > algorithms are not likely to give us a solution (we'll be either too > permissive or too slow in dbench), while this kind of feedback sounds > like something that may solve the problem dynamically, or maybe I'm > simply just dreaming :). Well I'm dreaming the same dreams, and by coincidence it's the reason I was complaining earlier today on lkml about the lack of good muldiv operations with double-wide intermediate results in the kernel.
Such operators are needed to do the filtering calculations and so on with enough precision - and by this, I mean 'enough choices of divisor' more than 'enough bits' - so the algorithms don't choke on their own noise. But before you can do signal processing, feedback, or whatever, you have to have a good signal. -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
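[Editor's aside: Daniel's smoothing-versus-lag problem is the classic filter trade-off. A sketch in fixed-point arithmetic; this is a hypothetical illustration, and Python's unbounded integers hide the double-wide-intermediate problem C kernel code actually faces.]

```python
def ema_fixed(samples, num=1, den=8, scale=1 << 16):
    """Exponential moving average kept in a scaled integer accumulator,
    avg += (sample - avg) * num / den, the way code without floating
    point would do it.  A small num/den ratio smooths the signal but
    lags it; a large one tracks quickly but stays noisy."""
    avg = samples[0] * scale
    out = []
    for s in samples:
        avg += (s * scale - avg) * num // den
        out.append(avg / scale)
    return out

# A step in measured "disk bandwidth": with 1/8 weighting the average
# needs many samples to approach the new level -- Daniel's lag problem.
step = [10] * 4 + [30] * 12
print([round(x, 1) for x in ema_fixed(step)])
```

Raising num/den shortens the lag at the cost of letting more run-to-run noise through, which is why having enough choices of divisor matters more than raw bit width.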
* Re: 2.4.18pre4aa1 2002-01-29 0:15 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-29 13:05 ` Pavel Machek 0 siblings, 0 replies; 28+ messages in thread From: Pavel Machek @ 2002-01-29 13:05 UTC (permalink / raw) To: Daniel Phillips; +Cc: Andrea Arcangeli, rwhron, linux-kernel Hi! > > I've also to say I always mke2fs first when I run my benchmarks, > > Yes, and it would be nice if we had an operation to squeeze cache down to > its minimum size (whatever that means) just for running benchmarks > accurately without rebooting. Take a look at swsusp -- it frees as much memory as possible before doing anything. Pavel -- Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt, details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-25 0:19 ` rwhron 2002-01-25 0:29 ` 2.4.18pre4aa1 Rik van Riel 1 sibling, 1 reply; 28+ messages in thread From: rwhron @ 2002-01-25 0:19 UTC (permalink / raw) To: Daniel Phillips; +Cc: linux-kernel > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > Even when mostly uncached, dbench still produces flaky results. dbench results are not perfectly repeatable. I agree that dbench results that vary by 20% or so may not be meaningful. I think dbench is of some value though. In some cases the difference between kernels is 200% or more. Below are results from a couple of aa releases, and a few rmap releases. Some of the tests were run twice. You can see that there is some variation between "identical" runs. You can see that aa kernels do extremely well with large numbers of processes, and as the number of processes increases from 64 -> 128 -> 192, the throughput drops in a predictable way. rmap, when compared with most other kernels, does well with 64 processes. At 192, rmap doesn't do as well. That may be useful information for the people developing rmap.
dbench 64 processes
2.4.18pre4aa1     ************************************************** 25.2 MB/sec
2.4.18pre2aa2     ******************************************** 22.2 MB/sec
2.4.17rmap11a     **************************** 14.2 MB/sec
2.4.17rmap11a     *************************** 13.9 MB/sec
2.4.17rmap12a     *************************** 13.7 MB/sec
2.4.18pre3rmap11b ********************** 11.4 MB/sec
2.4.17rmap11c     ********************* 10.8 MB/sec
2.4.17rmap11c     ********************* 10.6 MB/sec
2.4.17rmap11b     ******************* 9.7 MB/sec

dbench 128 processes
2.4.18pre4aa1     ******************************** 16.4 MB/sec
2.4.18pre2aa2     ******************************** 16.3 MB/sec
2.4.18pre2aa2     ***************************** 14.9 MB/sec
2.4.17rmap11a     ************ 6.1 MB/sec
2.4.17rmap11a     ************ 6.1 MB/sec
2.4.18pre3rmap11b ********** 5.1 MB/sec
2.4.17rmap11b     ********* 5.0 MB/sec
2.4.17rmap12a     ********* 4.5 MB/sec
2.4.17rmap11c     ******** 4.2 MB/sec
2.4.17rmap11c     ******** 4.2 MB/sec

dbench 192 processes
2.4.18pre2aa2     ***************** 8.8 MB/sec
2.4.18pre4aa1     **************** 8.2 MB/sec
2.4.18pre2aa2     *************** 7.7 MB/sec
2.4.17rmap11a     ******** 4.4 MB/sec
2.4.17rmap11a     ******** 4.3 MB/sec
2.4.18pre3rmap11b ******* 3.8 MB/sec
2.4.17rmap11b     ******* 3.8 MB/sec
2.4.17rmap12a     ****** 3.1 MB/sec
2.4.17rmap11c     ***** 3.0 MB/sec
2.4.17rmap11c     ***** 2.9 MB/sec

On the other hand, rmap does very well with sequential reads on tiobench, which is running a lot fewer processes than dbench.
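[Editor's aside: Randy's point that -aa degrades "in a predictable way" while rmap falls off a cliff at 128 processes can be made concrete. A small sketch using numbers read off the charts above (first listed run of each kernel):]

```python
# Throughput (MB/sec) at 64, 128 and 192 dbench processes, taken from
# the charts above (first listed run of each kernel).
results = {
    "2.4.18pre4aa1": (25.2, 16.4, 8.2),
    "2.4.17rmap12a": (13.7, 4.5, 3.1),
}

for kernel, (t64, t128, t192) in results.items():
    print(f"{kernel}: 64->128 keeps {t128 / t64:.0%}, "
          f"128->192 keeps {t192 / t128:.0%} of throughput")
```

The -aa tree loses throughput gradually at each step, while rmap keeps only about a third of its throughput going from 64 to 128 processes, which is the cliff the thread goes on to discuss.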
Read, Write, and Seeks are MB/sec

                   Num  Seq Read       Rand Read    Seq Write      Rand Write
                   Thr  Rate  (CPU%)   Rate (CPU%)  Rate  (CPU%)   Rate (CPU%)
                   ---  -------------  -----------  -------------  -----------
2.4.17rmap12a       1   22.85  32.2%   1.15  2.2%   13.10  83.5%   0.71  1.6%
2.4.18pre2aa2       1   11.96  23.1%   2.24  3.2%   12.90  76.8%   0.71  1.6%
2.4.18pre4aa1       1   11.23  21.3%   3.12  4.8%   11.92  66.1%   0.66  1.3%
2.4.17rmap12a       2   22.07  32.1%   1.20  2.2%   12.84  80.4%   0.71  1.6%
2.4.18pre2aa2       2   11.09  22.0%   2.57  3.2%   13.10  76.3%   0.70  1.6%
2.4.18pre4aa1       2   10.68  20.9%   3.39  4.4%   12.14  67.9%   0.67  1.3%
2.4.17rmap12a       4   21.75  32.0%   1.20  2.2%   12.69  78.5%   0.71  1.6%
2.4.18pre2aa2       4   10.52  21.1%   2.82  3.6%   12.84  73.9%   0.69  1.5%
2.4.18pre4aa1       4   10.48  20.4%   3.56  4.2%   12.28  69.0%   0.67  1.4%
2.4.17rmap12a       8   21.34  31.8%   1.23  2.3%   12.57  77.3%   0.71  1.7%
2.4.18pre2aa2       8   10.24  19.5%   3.01  4.0%   12.94  74.1%   0.70  1.6%
2.4.18pre4aa1       8   10.08  18.9%   3.63  4.5%   12.24  68.8%   0.67  1.4%

I added bonnie++ to the list of tests a day or so ago. I'll begin putting those results up in the near future. -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 0:29 ` Rik van Riel 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron 0 siblings, 1 reply; 28+ messages in thread From: Rik van Riel @ 2002-01-25 0:29 UTC (permalink / raw) To: rwhron; +Cc: Daniel Phillips, linux-kernel On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > > > Even when mostly uncached, dbench still produces flaky results. > Below are results from a couple of aa releases, and a few rmap > releases. [snip results: -aa twice as fast as -rmap for dbench, -rmap twice as fast as -aa for tiobench] What would be interesting here are the dbench dots, where a '+' indicates that a program exits. It's possible that under one of the kernels the programs are getting throttled differently and some of the dbench processes exit _way_ earlier than the others, leaving a much lighter load on the rest of the system for the second part of the test. It would be interesting to see the dbench dots from both -aa and -rmap ;) regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
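[Editor's aside: Rik's reading of the dots can be automated. A sketch with a made-up progress string; per his message, a '+' marks a dbench process exiting, and the sample below is hypothetical, not captured output.]

```python
def exit_points(dots):
    """Return, for each '+' in a dbench progress string, how far through
    the run (as a fraction of all output characters) it appeared.
    Early '+' marks mean some processes finished long before the rest,
    leaving a much lighter load for the tail of the test."""
    total = len(dots)
    return [i / total for i, c in enumerate(dots) if c == "+"]

# Hypothetical output: one process exits halfway, one at three quarters,
# the last two only at the very end.
sample = "." * 49 + "+" + "." * 24 + "+" + "." * 23 + "++"
print([round(p, 2) for p in exit_points(sample)])
```

Applied to the real dots files Randy links below, clustering of '+' marks near the end would indicate the smoother-throttling behaviour Rik describes for rmap, and early '+' marks the unfair-but-fast behaviour of -aa.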
* Re: 2.4.18pre4aa1 2002-01-25 0:29 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 3:23 ` rwhron 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel 2002-01-28 0:37 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 2 replies; 28+ messages in thread From: rwhron @ 2002-01-25 3:23 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel > [snip results: -aa twice as fast as -rmap for dbench, > -rmap twice as fast as -aa for tiobench] Look closely at all the numbers:

dbench 64 128 192 on ext completed in 4500 seconds on 2.4.18pre4aa1
dbench 64 128 192 on ext completed in 12471 seconds on 2.4.17rmap12a

2.4.18pre4aa1 completed the three dbenches in about a third of the time (2.77x as fast).

For tiobench: Tiobench is interesting because it has the CPU% column. I mentioned sequential reads because it's a bench where 2.4.17rmap12a was faster. Someone else might say 2.4.18pre4aa1 was 2.7x as fast at random reads. Let's analyze CPU efficiency (rate divided by CPU fraction) where threads = 1:

                   Num  Seq Read       Rand Read    Seq Write      Rand Write
                   Thr  Rate  (CPU%)   Rate (CPU%)  Rate  (CPU%)   Rate (CPU%)
                   ---  -------------  -----------  -------------  -----------
2.4.17rmap12a       1   22.85  32.2%   1.15  2.2%   13.10  83.5%   0.71  1.6%
2.4.18pre4aa1       1   11.23  21.3%   3.12  4.8%   11.92  66.1%   0.66  1.3%

Sequential Read CPU Efficiency
2.4.18pre4aa1  11.23 / .213 = 52.72
2.4.17rmap12a  22.85 / .322 = 70.96
2.4.17rmap12a was 35% more CPU efficient.

Random Read CPU Efficiency
2.4.18pre4aa1  3.12 / .048 = 65.00
2.4.17rmap12a  1.15 / .022 = 52.27
2.4.18pre4aa1 was 24% more CPU efficient.

Sequential Write CPU Efficiency
2.4.18pre4aa1  11.92 / .661 = 18.03
2.4.17rmap12a  13.10 / .835 = 15.69
2.4.18pre4aa1 was 15% more CPU efficient.

Random Write CPU Efficiency
2.4.18pre4aa1  0.66 / .013 = 50.77
2.4.17rmap12a  0.71 / .016 = 44.38
2.4.18pre4aa1 was 14% more CPU efficient.

> It would be interesting to see the dbench dots from both > -aa and -rmap ;) All the dots are at: http://home.earthlink.net/~rwhron/kernel/dots/ -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
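[Editor's aside: the efficiency arithmetic above reduces to rate divided by CPU fraction. A short sketch that reproduces Randy's percentages from the single-thread tiobench rows; the dict layout is editorial, the numbers are his.]

```python
# (rate MB/sec, CPU fraction) from the single-thread tiobench rows above.
rows = {
    "2.4.17rmap12a": {"seq read": (22.85, 0.322), "rand read": (1.15, 0.022),
                      "seq write": (13.10, 0.835), "rand write": (0.71, 0.016)},
    "2.4.18pre4aa1": {"seq read": (11.23, 0.213), "rand read": (3.12, 0.048),
                      "seq write": (11.92, 0.661), "rand write": (0.66, 0.013)},
}

for test in ["seq read", "rand read", "seq write", "rand write"]:
    # MB/sec delivered per unit of CPU consumed:
    eff = {k: v[test][0] / v[test][1] for k, v in rows.items()}
    winner = max(eff, key=eff.get)
    loser = min(eff, key=eff.get)
    gain = eff[winner] / eff[loser] - 1
    print(f"{test:10s}: {winner} is {gain:.0%} more CPU efficient")
```

This reproduces the 35% / 24% / 15% / 14% figures in the message, with rmap winning only the sequential-read column.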
* Re: 2.4.18pre4aa1 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 3:35 ` Rik van Riel 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones 2002-01-28 0:37 ` 2.4.18pre4aa1 Andrea Arcangeli 1 sibling, 2 replies; 28+ messages in thread From: Rik van Riel @ 2002-01-25 3:35 UTC (permalink / raw) To: rwhron; +Cc: linux-kernel On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > It would be interesting to see the dbench dots from both > > -aa and -rmap ;) > > All the dots are at: > http://home.earthlink.net/~rwhron/kernel/dots/ I think we have an explanation here. With dbench 192 on -aa the first processes exit around halfway through the dbench test, and by the end only a few processes are left. With rmap the write throttling is a bit smoother, but this results in all processes running to about 70% of the test and many more processes running in the last part of the test, exiting simultaneously. Considering the possible bad consequences for real workloads, I'm not sure I want to make the system more unfair just to better accommodate dbench ;) regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 4:56 ` rwhron 2002-01-25 4:57 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones 1 sibling, 1 reply; 28+ messages in thread From: rwhron @ 2002-01-25 4:56 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel > workloads, I'm not sure I want to make the system more > unfair just to better accomodate dbench ;) I'm wondering if rmap is a little too aggressive on read-ahead, and if that has a negative impact on a complex workload. -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 4:57 ` Rik van Riel 2002-01-25 5:18 ` 2.4.18pre4aa1 David Weinehall 0 siblings, 1 reply; 28+ messages in thread From: Rik van Riel @ 2002-01-25 4:57 UTC (permalink / raw) To: rwhron; +Cc: linux-kernel On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > workloads, I'm not sure I want to make the system more > > unfair just to better accomodate dbench ;) > > I'm wondering if rmap is a little too aggressive on > read-ahead, and if that has a negative impact on > a complex workload. I haven't changed the readahead code one bit compared to 2.4 mainline, but I'm wondering the same. Fixing readahead window sizing has been on my TODO list for quite a while already. regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 4:57 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 5:18 ` David Weinehall 2002-01-25 17:03 ` 2.4.18pre4aa1 Rik van Riel 0 siblings, 1 reply; 28+ messages in thread From: David Weinehall @ 2002-01-25 5:18 UTC (permalink / raw) To: Rik van Riel; +Cc: rwhron, linux-kernel On Fri, Jan 25, 2002 at 02:57:02AM -0200, Rik van Riel wrote: > On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > > > workloads, I'm not sure I want to make the system more > > > unfair just to better accommodate dbench ;) > > > > I'm wondering if rmap is a little too aggressive on > > read-ahead, and if that has a negative impact on > > a complex workload. > > I haven't changed the readahead code one bit compared > to 2.4 mainline, but I'm wondering the same. > > Fixing readahead window sizing has been on my TODO list > for quite a while already. One thing that struck me about this: don't both the rmap-patches and the aa-patches contain changes other than merely changes to the VM? If so, couldn't these changes tip the result in an unfair direction? After all, what we want is a VM-to-VM shoot-out, not a VM-to-VM+whatever shoot-out. And one would assume that the non VM-related changes would be merged to the kernel no matter what VM is used, right? Then again, maybe I just ate the blue pill and returned to a world of illusions not knowing what's best for me. Regards: David Weinehall _ _ // David Weinehall <tao@acc.umu.se> /> Northern lights wander \\ // Maintainer of the v2.0 kernel // Dance across the winter sky // \> http://www.acc.umu.se/~tao/ </ Full colour fire </ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 5:18 ` 2.4.18pre4aa1 David Weinehall @ 2002-01-25 17:03 ` Rik van Riel 2002-01-25 17:29 ` 2.4.18pre4aa1 Dave Jones 0 siblings, 1 reply; 28+ messages in thread From: Rik van Riel @ 2002-01-25 17:03 UTC (permalink / raw) To: David Weinehall; +Cc: rwhron, linux-kernel On Fri, 25 Jan 2002, David Weinehall wrote: > One thing that struck me about this; doesn't both the rmap-patches and > the aa-patches contain other changes than merely changes to the VM? If > so, couldn't these changes tip the result in an unfair direction?! After > all, what we want is a VM-to-VM shoot-out, not a VM-to-VM+whatever > shoot-out. After all, one would assume that the non VM-related changes > would be merged to the kernel no matter what VM is used, right? The -aa kernel seems to contain patches to a few dozen subsystems. The -rmap patch is pretty much only VM changes. You're right that this is not a strict VM vs VM comparison... kind regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 17:03 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 17:29 ` Dave Jones 0 siblings, 0 replies; 28+ messages in thread From: Dave Jones @ 2002-01-25 17:29 UTC (permalink / raw) To: Rik van Riel; +Cc: David Weinehall, rwhron, linux-kernel On Fri, Jan 25, 2002 at 03:03:16PM -0200, Rik van Riel wrote: > The -aa kernel seems to contain patches to a few dozen subsystems. > The -rmap patch is pretty much only VM changes. > You're right that this is not a strict VM vs VM comparison... Agreed. Andrea's tree seemed to gain quite a bit of a lead when bits of the lowlat patches were applied, for example. Just taking 00_vm_?? from ../people/andrea/.. would give a better comparison for a head-to-head vm pissing contest. -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 12:26 ` Dave Jones 2002-01-25 14:57 ` 2.4.18pre4aa1 rwhron 1 sibling, 1 reply; 28+ messages in thread From: Dave Jones @ 2002-01-25 12:26 UTC (permalink / raw) To: Rik van Riel; +Cc: rwhron, linux-kernel On Fri, Jan 25, 2002 at 01:35:08AM -0200, Rik van Riel wrote: > Considering the possible bad consequences for real > workloads, I'm not sure I want to make the system more > unfair just to better accommodate dbench ;) It may be useful if Randy could throw a real world test into the benchmarking, to get a better comparison of the various systems. The obvious one that springs to mind would be something like compilation of a large source tree kernel/mozilla/etc.. (same version, same config options every time). Though, as compilation is largely compute bound instead of IO bound, the more small files that need to be read/generated the better. Or maybe timing an updatedb. It's real-world enough in that it's a daily task and generates lots of IO.. -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones @ 2002-01-25 14:57 ` rwhron 0 siblings, 0 replies; 28+ messages in thread From: rwhron @ 2002-01-25 14:57 UTC (permalink / raw) To: Dave Jones, linux-kernel > it may be useful if Randy can throw a real world test > into the benchmarking, to get a better comparison of > the various systems. The obvious one that springs to mind > would be something like compilation of a large source tree Thanks for the feedback. 2.5.2-dj5 wins the lucky "first-timer" award on the new tests. Extract/configure/make/check autoconf-2.52: Executes over 100000 processes and creates a lot of small temporary files. Won't hit the disk much on this box. Extract/Configure/make/test perl-5.6.1: For perl, "make test" is executed 5 times. "make test" is about 75% system and 25% user, which may provide more variation between kernel versions. > Or maybe timing an updatedb. Its realworld enough in that its > a daily task, generates lots of IO.. I'll time updatedb too. updatedb may vary over time, depending on how many src trees are extracted. I'll make an effort to keep that variable consistent. -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-28 0:37 ` Andrea Arcangeli 1 sibling, 0 replies; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-28 0:37 UTC (permalink / raw) To: rwhron; +Cc: Rik van Riel, linux-kernel On Thu, Jan 24, 2002 at 10:23:57PM -0500, rwhron@earthlink.net wrote: > > [snip results: -aa twice as fast as -rmap for dbench, > > -rmap twice as fast as -aa for tiobench] > > Look closely at all the numbers: > > dbench 64 128 192 on ext completed in 4500 seconds on 2.4.18pre4aa1 > dbench 64 128 192 on ext completed in 12471 seconds on 2.4.17rmap12a > > 2.4.18pre4aa1 completed the three dbenches 277% faster. > > For tiobench: > > Tiobench is interesting because it has the CPU% column. I mentioned > sequential reads because it's a bench where 2.4.17rmap12a was faster. > Someone else might say 2.4.18pre4aa1 was 271% faster at random reads. > Let's analyze CPU efficiency where threads = 1: > > Num Seq Read Rand Read Seq Write Rand Write > Thr Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%) > --- ------------- ----------- ------------- ----------- > 2.4.17rmap12a 1 22.85 32.2% 1.15 2.2% 13.10 83.5% 0.71 1.6% > 2.4.18pre4aa1 1 11.23 21.3% 3.12 4.8% 11.92 66.1% 0.66 1.3% Those weird numbers generated by rmap12a on tiobench show that the page replacement algorithm in rmap is not able to detect cache pollution: it leaves the pollution in cache rather than discarding it, so later reads end up being served from cache instead of from disk. Since tiobench is an I/O benchmark, the above is a completely fake result; seq read I/O is not going to be faster with rmap. If you change tiobench to remount the fs where the output files have been generated between the "random write" and the "seq read" tests, you should get comparable numbers. 
I don't consider it goodness that rmap12a leaves old pollution in the caches; that seems to prove it will do the wrong thing when the most recently used data is part of the working set (for example, after you do the first cvs checkout you want the second checkout not to hit the disk, but the page replacement in rmap12a would hit the disk the second time too). In some ways tiobench has the same problems as dbench: a broken page replacement algorithm can generate stellar numbers in both benchmarks. Furthermore, running the 'seq read' after the 'random write' (tiobench does that) adds even more randomness to the output of the 'seq read', because the 'random read' and 'random write' tests are not comparable in the first place either: the random seed is always set up differently. Also, to make a real 'seq read' test, the 'seq read' should be run after the 'seq write', not after the 'random write' (even assuming the random seed is always initialized to the same value). Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 5:23 2.4.18pre4aa1 rwhron 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-25 0:11 ` Andrea Arcangeli 1 sibling, 0 replies; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-25 0:11 UTC (permalink / raw) To: rwhron; +Cc: linux-kernel On Thu, Jan 24, 2002 at 12:23:42AM -0500, rwhron@earthlink.net wrote: > Changelog with history at: > http://home.earthlink.net/~rwhron/kernel/2.4.18pre4aa1.html > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html Randy, I will reiterate the obvious, but your reliable and impartial performance feedback is extremely helpful. Thanks, Keep up the good work :), Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* 2.4.18pre4aa1 @ 2002-01-22 6:48 Andrea Arcangeli 2002-01-22 6:58 ` 2.4.18pre4aa1 Robert Love 0 siblings, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-22 6:48 UTC (permalink / raw) To: linux-kernel This is the first release moving the pagetables into highmem. It only compiles on x86 and it is still a bit experimental, though I couldn't reproduce any problems yet. The new pte-highmem patch can be downloaded from here: ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre4aa1/20_pte-highmem-6 Next relevant things to do are the non-x86 archs compilation, and I'd like to sort out the vary-IO for rawio and the hardblocksize-O_DIRECT patch. URL: ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre4aa1.bz2 ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre4aa1/ Diff between 2.4.18pre2aa2 and 2.4.18pre4aa1 follows: Only in 2.4.18pre2aa2: 00_3.5G-address-space-2 Only in 2.4.18pre4aa1/: 00_3.5G-address-space-3 Merge 1-2-3 GB option. Only in 2.4.18pre4aa1/: 00_access_process_vm-1 Fix oops in access_process_vm (get_area_pages will set the page pointer to NULL on non-ram maps). Only in 2.4.18pre4aa1/: 00_allow_mixed_b_size-1 This is the groundwork for the O_DIRECT-hardblocksize patch, and for the IOvary patch for rawio. In short this prevents the merging of different b_size in the same request at the blkdev layer. After I mentioned this, Jens immediately sent me a patch, and here it is. So now I'd suggest dropping the varyIO thing, which shouldn't be necessary any longer, and porting the rawio-large-bsize and O_DIRECT-hardblocksize patches on top of my current tree. I'd like to include both. Of course the O_DIRECT-hardblocksize patch can also take advantage of the large-b_size improvement to brw_kiovec during large hardblocksize-aligned requests. At least unless we want to change the alignment requirements, in which case the varyIO info would still be valuable. 
About the O_DIRECT-hardblocksize patch there's also another problem though: if get_block says that the buffer is new(), then the whole "soft" block must be cleared out, if not completely written implicitly by the write. I just fixed similar bugs in the presence of I/O errors or ENOSPC with O_DIRECT, and I don't want to reintroduce the very same problem while adding a new feature. The buffer_new() path is a very slow path from the DB usage point of view, so it's perfectly fine there to just write out the zero page (or something like that) on the blocks around in a synchronous manner etc.. Only in 2.4.18pre4aa1/: 00_icmp-offset-1 Remote security fix from Andi (see bugtraq). Only in 2.4.18pre4aa1/: 00_init-blk-freelist-1 The request cmd field wasn't initialized when first queued into the blkdev layer, so if a request was dequeued and then re-enqueued without being used, it could get unbalanced. Now it is always initialized during get_request, so it certainly works right. Only in 2.4.18pre2aa2: 00_msync-ret-1 Only in 2.4.18pre2aa2: 00_page-cache-release-1 Only in 2.4.18pre2aa2: 00_ramdisk-buffercache-2 Only in 2.4.18pre2aa2: 00_truncate-garbage-1 Merged in mainline. Only in 2.4.18pre2aa2: 00_vmalloc-tlb-flush-1 Merged into mainline (modulo Jeff having implemented pagetable walking/tlb misses into uml that doesn't assume the tlb flush [ouch, right Andrew, tlb invalidate :) ] comes first). Only in 2.4.18pre2aa2: 00_nfs-2.4.17-cto-1 Only in 2.4.18pre4aa1/: 00_nfs-2.4.17-cto-2 Only in 2.4.18pre2aa2: 00_nfs-bkl-1 Only in 2.4.18pre4aa1/: 00_nfs-bkl-2 Only in 2.4.18pre2aa2: 00_nfs-rdplus-1 Only in 2.4.18pre4aa1/: 00_nfs-rdplus-2 Only in 2.4.18pre2aa2: 00_nfs-svc_tcp-1 Only in 2.4.18pre4aa1/: 00_nfs-svc_tcp-2 Only in 2.4.18pre2aa2: 00_nfs-tcp-tweaks-1 Only in 2.4.18pre4aa1/: 00_nfs-tcp-tweaks-2 Only in 2.4.18pre4aa1/: 10_nfs-o_direct-1 New NFS updates from Trond. 
Only in 2.4.18pre2aa2: 00_rwsem-fair-25 Only in 2.4.18pre2aa2: 00_rwsem-fair-25-recursive-7 Only in 2.4.18pre4aa1/: 00_rwsem-fair-26 Only in 2.4.18pre4aa1/: 00_rwsem-fair-26-recursive-7 Rediffed. Only in 2.4.18pre4aa1/: 00_waitfor-one-page-1 Export complaining symbol. Only in 2.4.18pre2aa2: 10_vm-22 Only in 2.4.18pre4aa1/: 10_vm-23 Minor changes (try to always do some relevant work during the refiling). Only in 2.4.18pre4aa1/: 20_pte-highmem-6 First "working" version of the pte-highmem patch; this fixes (or at least "should fix" :) lots of bugs. pte_offset_lowmem is still there because kmap doesn't yet work by the time pte_offset_lowmem is recalled. Lots of fixes, special thanks to Hugh, Linus and others for the review and the feedback! All drivers should be updated. Works for me so far. Only in 2.4.18pre2aa2: 30_dyn-sched-2 Only in 2.4.18pre4aa1/: 30_dyn-sched-3 Minor changes; volatile would be needed only to avoid confusing gcc, but nobody cares about variables changing under gcc anyway, so let's remove it and it will be a little faster. Only in 2.4.18pre2aa2: 50_uml-patch-2.4.17-4.bz2 Only in 2.4.18pre4aa1/: 50_uml-patch-2.4.17-7.bz2 Latest update from Jeff (hopefully vmalloc works even though it doesn't start with the tlb invalidate). Only in 2.4.18pre4aa1/: 60_show-stack-1 Export symbol, so CONFIG_TUX_DEBUG has a chance to generate a loadable kernel module. Only in 2.4.18pre2aa2: 60_tux-vfs-4 Only in 2.4.18pre4aa1/: 60_tux-vfs-5 Rediffed. Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
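The buffer_new() requirement described in the changelog message above (when get_block reports a freshly allocated block, everything in the "soft" block not covered by the write must be cleared, or stale on-disk contents would leak out on a later read) boils down to an invariant that is easy to model outside the kernel. The block size, offsets, and fill bytes below are arbitrary illustration values, not kernel constants:

```python
BLOCK_SIZE = 4096

def fill_new_block(block, is_new, data, off):
    """Lay a sub-block write into a filesystem block.  If the block was
    reported as new, zero everything outside the written range first."""
    assert len(block) == BLOCK_SIZE and off + len(data) <= BLOCK_SIZE
    if is_new:
        block[:] = bytes(BLOCK_SIZE)        # clear the whole soft block
    block[off:off + len(data)] = data       # then lay down the write

block = bytearray(b"\xaa" * BLOCK_SIZE)     # pretend: stale on-disk junk
fill_new_block(block, True, b"\x55" * 512, 1024)
print(block[0], block[1024], block[2048])   # -> 0 85 0
```

Skipping the `is_new` memset is exactly the bug class Andrea describes: the write itself would land correctly, but the 0xAA "junk" outside [1024, 1536) would survive and be visible to a later reader.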
* Re: 2.4.18pre4aa1 2002-01-22 6:48 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-22 6:58 ` Robert Love 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen 0 siblings, 1 reply; 28+ messages in thread From: Robert Love @ 2002-01-22 6:58 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel On Tue, 2002-01-22 at 01:48, Andrea Arcangeli wrote: > Only in 2.4.18pre4aa1/: 00_icmp-offset-1 > > Remote security fix from Andi (see bugtraq). Are we sure this works? I thought I saw someone (IRC perhaps?) who had weird anomalies with this fix (although it does certainly fix the hole). > Only in 2.4.18pre2aa2: 10_vm-22 > Only in 2.4.18pre4aa1/: 10_vm-23 > > Minor changes (try to always do some relevant work during the > refiling). When will we see this in 2.4 stock? ;-) I know you have said you are busy, but it would be great to get the bits pushed to Marcelo in reasonable documented chunks so he can merge them... Also, these should be pushed to Linus, too. Same VM in 2.5, after all. Robert Love ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 6:58 ` 2.4.18pre4aa1 Robert Love @ 2002-01-22 7:37 ` Dan Chen 2002-01-22 7:43 ` 2.4.18pre4aa1 Robert Love 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King 0 siblings, 2 replies; 28+ messages in thread From: Dan Chen @ 2002-01-22 7:37 UTC (permalink / raw) To: Robert Love; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 715 bytes --] No weird anomalies here. I believe the ones you refer to were a result of ipv6 bits not being updated as well. Russell posted two patches for those. http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428323&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428401&w=2 On Tue, Jan 22, 2002 at 01:58:58AM -0500, Robert Love wrote: > > Only in 2.4.18pre4aa1/: 00_icmp-offset-1 > > > > Remote security fix from Andi (see bugtraq). > > Are we sure this works? I thought I saw someone (IRC perhaps?) who had > weird anomalies with this fix (although it does certainly fix the hole). -- Dan Chen crimsun@email.unc.edu GPG key: www.unc.edu/~crimsun/pubkey.gpg.asc [-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen @ 2002-01-22 7:43 ` Robert Love 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King 1 sibling, 0 replies; 28+ messages in thread From: Robert Love @ 2002-01-22 7:43 UTC (permalink / raw) To: Dan Chen; +Cc: linux-kernel On Tue, 2002-01-22 at 02:37, Dan Chen wrote: > No weird anomalies here. I believe the ones you refer to were a result > of ipv6 bits not being updated as well. Russell posted two patches for > those. > > http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428323&w=2 > http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428401&w=2 Maybe, although I seem to recall odd ICMP behavior being the problem. Although I don't think the above is in -aa. Andrea, perhaps this too should be merged? Ideally this will all show up in 2.4-proper soon, anyhow. Robert Love ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen 2002-01-22 7:43 ` 2.4.18pre4aa1 Robert Love @ 2002-01-22 10:02 ` Russell King 2002-01-22 10:12 ` 2.4.18pre4aa1 Robert Love 1 sibling, 1 reply; 28+ messages in thread From: Russell King @ 2002-01-22 10:02 UTC (permalink / raw) To: Dan Chen; +Cc: Robert Love, linux-kernel On Tue, Jan 22, 2002 at 02:37:42AM -0500, Dan Chen wrote: > No weird anomalies here. I believe the ones you refer to were a result > of ipv6 bits not being updated as well. Russell posted two patches for > those. No - I do see weirdness in ipv4 as well: bash-2.04# uptime 10:00am up 18:57, 1 user, load average: 0.02, 0.03, 0.00 bash-2.04# dmesg|grep 'broad' 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. Only one of these happened on boot. The rest randomly pop up over time. I'm going to try tcpdumping lo to see if I can work out what's causing them. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King @ 2002-01-22 10:12 ` Robert Love 0 siblings, 0 replies; 28+ messages in thread From: Robert Love @ 2002-01-22 10:12 UTC (permalink / raw) To: Russell King; +Cc: Dan Chen, linux-kernel, andrea On Tue, 2002-01-22 at 05:02, Russell King wrote: > On Tue, Jan 22, 2002 at 02:37:42AM -0500, Dan Chen wrote: > > No weird anomalies here. I believe the ones you refer to were a result > > of ipv6 bits not being updated as well. Russell posted two patches for > > those. > > No - I do see weirdness in ipv4 as well: OK, this is the anomaly I spoke of. Weird ICMP errors. I've seen others with this problem. I don't think we have a proper solution here. > bash-2.04# uptime > 10:00am up 18:57, 1 user, load average: 0.02, 0.03, 0.00 > bash-2.04# dmesg|grep 'broad' > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > > Only one of these happened on boot. The rest randomly pop up over time. > I'm going to try tcpdumping lo to see if I can work out what's causing > them. Robert Love ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2002-01-31 20:37 UTC | newest] Thread overview: 28+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-01-24 5:23 2.4.18pre4aa1 rwhron 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-28 9:53 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-28 15:29 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-28 20:28 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-28 23:40 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-29 0:15 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-29 13:05 ` 2.4.18pre4aa1 Pavel Machek 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron 2002-01-25 0:29 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron 2002-01-25 4:57 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 5:18 ` 2.4.18pre4aa1 David Weinehall 2002-01-25 17:03 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 17:29 ` 2.4.18pre4aa1 Dave Jones 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones 2002-01-25 14:57 ` 2.4.18pre4aa1 rwhron 2002-01-28 0:37 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-25 0:11 ` 2.4.18pre4aa1 Andrea Arcangeli -- strict thread matches above, loose matches on Subject: below -- 2002-01-22 6:48 2.4.18pre4aa1 Andrea Arcangeli 2002-01-22 6:58 ` 2.4.18pre4aa1 Robert Love 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen 2002-01-22 7:43 ` 2.4.18pre4aa1 Robert Love 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King 2002-01-22 10:12 ` 2.4.18pre4aa1 Robert Love