* Re: 2.4.18pre4aa1 @ 2002-01-24 5:23 rwhron 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:11 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 2 replies; 28+ messages in thread From: rwhron @ 2002-01-24 5:23 UTC (permalink / raw) To: linux-kernel Changelog with history at: http://home.earthlink.net/~rwhron/kernel/2.4.18pre4aa1.html Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: http://home.earthlink.net/~rwhron/kernel/k6-2-475.html -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 5:23 2.4.18pre4aa1 rwhron @ 2002-01-24 6:27 ` Daniel Phillips 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron 2002-01-25 0:11 ` 2.4.18pre4aa1 Andrea Arcangeli 1 sibling, 2 replies; 28+ messages in thread From: Daniel Phillips @ 2002-01-24 6:27 UTC (permalink / raw) To: rwhron, linux-kernel On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O benchmark, but it does create a high load, and may put some pressure on the cpu and i/o schedulers. Each dbench process creates about 21 megabytes worth of files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB for the dbench runs. Big enough so the tests cannot run from the buffer/page caches on this box." Thanks kindly for the testing, but please don't use dbench any more for benchmarks. If you are testing stability, fine, but dbench throughput numbers are not good for much more than wild goose chases. Even when mostly uncached, dbench still produces flaky results. -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-25 0:09 ` Andrea Arcangeli 2002-01-28 9:53 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron 1 sibling, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-25 0:09 UTC (permalink / raw) To: Daniel Phillips; +Cc: rwhron, linux-kernel On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O benchmark, > but it does create a high load, and may put some pressure on the cpu and > i/o schedulers. Each dbench process creates about 21 megabytes worth of > files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB for the dbench runs. Big > enough so the tests cannot run from the buffer/page caches on this box." > > Thanks kindly for the testing, but please don't use dbench any more for > benchmarks. If you are testing stability, fine, but dbench throughput > numbers are not good for much more than wild goose chases. > > Even when mostly uncached, dbench still produces flaky results. this is not entirely true. dbench has value. the only problem with dbench is that you can trivially cheat and change the kernel in a broken way, optimal _only_ for dbench, just to get stellar dbench numbers. but that is definitely not the case with the -aa tree: -aa is definitely not optimized for dbench. in fact the recent improvement most probably comes from the dyn-sched and bdflush hysteresis changes, not from vm changes at all (in fact there were no recent significant vm changes in the page replacement and aging algorithms).
rmap instead sucks in most of the benchmarks because of the noticeable overhead of maintaining those reverse maps, which start to help only by the time you need to swap/pageout (totally useless, pure overhead, for number crunching, database self-caching etc.). This is the only issue with the rmap design and you can definitely see it in the numbers. Here I'm only speaking about the design; I never checked the current implementation. Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-28 9:53 ` Daniel Phillips 2002-01-28 15:29 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 1 reply; 28+ messages in thread From: Daniel Phillips @ 2002-01-28 9:53 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: rwhron, linux-kernel On January 25, 2002 01:09 am, Andrea Arcangeli wrote: > On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > > > "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O > > benchmark, but it does create a high load, and may put some pressure on > > the cpu and i/o schedulers. Each dbench process creates about 21 > > megabytes worth of files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB > > for the dbench runs. Big enough so the tests cannot run from the > > buffer/page caches on this box." > > > > Thanks kindly for the testing, but please don't use dbench any more for > > benchmarks. If you are testing stability, fine, but dbench throughput > > numbers are not good for much more than wild goose chases. > > > > Even when mostly uncached, dbench still produces flaky results. > > this is not enterely true. dbench has a value. Yes, but not for benchmarks. It has value only as a stability test - while it may in some cases provide some general indication of performance, its variance is far too large, even under controlled conditions, for it to have any value as a benchmark. I'm surprised you'd even suggest this. Andrea, please, if we want good benchmarks let's at least be clear on what tools benchmarkers should/should not be using. > the only problem with > dbench is that you can trivially cheat and change the kernel in a broken > way, but optimal _only_ for dbench, just to get stellar dbench numbers, No, this is not the only problem. DBench is just plain *flaky*. 
You don't appear to be clear on why. In short, dbench has two main flaws: - It's extremely sensitive to scheduling. If one process happens to make progress then it gets more heavily cached and its progress becomes even greater. The benchmark completes much more quickly in this case, whereas if all processes progress at nearly the same rate (by chance) it runs more slowly. - It can happen (again by chance) that dbench files get deleted while still in cache, and this process completes in a fraction of the time that real disk IO would require. I've seen successive runs of dbench *under identical conditions* (that is, from a clean reboot etc.) vary by as much as 30%. Others report even greater variance. Can we please agree that dbench is useless for benchmarks? -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
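[Editor's aside: Daniel's 30% figure is easy to check against any set of repeated runs. A minimal sketch; the throughput numbers below are hypothetical, chosen only to illustrate a dbench-sized spread, not taken from the thread.]

```python
import statistics

def spread(samples):
    """Return the mean and the max-min spread as a fraction of the mean.

    A benchmark whose identical-condition runs spread by ~30% of the
    mean cannot resolve the 10-20% differences kernels are usually
    compared on."""
    mean = statistics.mean(samples)
    return mean, (max(samples) - min(samples)) / mean

# Hypothetical throughputs (MB/sec) from repeated identical runs:
runs = [25.2, 19.8, 22.4, 17.9, 24.1]
mean, frac = spread(runs)
print(f"mean {mean:.1f} MB/sec, spread {frac:.0%} of the mean")
```

With a spread that wide, a single run per kernel says little; at minimum a report needs repeated runs and the variation between them, which is what Randy's duplicated entries later in the thread provide.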
* Re: 2.4.18pre4aa1 2002-01-28 9:53 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-28 15:29 ` Andrea Arcangeli 2002-01-28 20:28 ` 2.4.18pre4aa1 Daniel Phillips 0 siblings, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-28 15:29 UTC (permalink / raw) To: Daniel Phillips; +Cc: rwhron, linux-kernel On Mon, Jan 28, 2002 at 10:53:25AM +0100, Daniel Phillips wrote: > On January 25, 2002 01:09 am, Andrea Arcangeli wrote: > > On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > > > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > > > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > > > > > "dbench 64, 128, 192 on ext2fs. dbench may not be the best I/O > > > benchmark, but it does create a high load, and may put some pressure on > > > the cpu and i/o schedulers. Each dbench process creates about 21 > > > megabytes worth of files, so disk usage is 1.3 GB, 2.6 GB and 4.0 GB > > > for the dbench runs. Big enough so the tests cannot run from the > > > buffer/page caches on this box." > > > > > > Thanks kindly for the testing, but please don't use dbench any more for > > > benchmarks. If you are testing stability, fine, but dbench throughput > > > numbers are not good for much more than wild goose chases. > > > > > > Even when mostly uncached, dbench still produces flaky results. > > > > this is not enterely true. dbench has a value. > > Yes, but not for benchmarks. It has value only as a stability test - while > it may in some cases provide some general indication of performance, its > variance is far too large, even under controlled conditions, for it to have > any value as a benchmark. I'm surprised you'd even suggest this. > > Andrea, please, if we want good benchmarks let's at least be clear on what > tools benchmarkers should/should not be using. 
> > > the only problem with > > dbench is that you can trivially cheat and change the kernel in a broken > > way, but optimal _only_ for dbench, just to get stellar dbench numbers, > > No, this is not the only problem. DBench is just plain *flaky*. You don't > appear to be clear on why. In short, dbench has two main flaws: > > - It's extremely sensitive to scheduling. If one process happens to make > progress then it gets more heavily cached and its progress becomes even > greater. The benchmark completes much more quickly in this case, whereas > if all processes progress at nearly the same rate (by chance) it runs > more slowly. > > - It can happen (again by chance) that dbench files get deleted while still > in cache, and this process completes in a fraction of the time that real > disk IO would require. > > I've seen successsive runs of dbench *under identical conditions* (that is, > from a clean reboot etc.) vary by as much as 30%. Others report even greater > variance. Can we please agree that dbench is useless for benchmarks? I've never seen it vary 30% on the same kernel. Anyways dbench mostly tells you about the elevator etc... it's a good test to check the elevator is working properly: the ++ must be mixed with the dots etc... if the elevator is aggressive enough. Of course that means the elevator is not perfectly fair, but that's the whole point of having an elevator. It is also an interesting test for page replacement, but with page replacement it would be possible to write a broken algorithm that produces good numbers; that's the thing I believe to be bad about dbench (oh, like tiotest fake numbers too of course).
Other than this it just shows rmap12a has an elevator not aggressive enough, which is probably true; I doubt it has anything to do with the VM changes in rmap (of course the significant overhead of the rmap design is helping to slow it down too), more likely the bomb_segments logic from Andrew that Rik has included. In fact the broken page replacement that leaves old stuff in cache might if anything generate more unfairness, which should generate faster dbench numbers for rmap, but on this last bit I'm not 100% sure (AFAIK to get a fast dbench by cheating with the vm you need to make sure to cache lots of the readahead as well, including the readahead not used yet, but I'm not 100% sure about the effect of leaving old pollution in cache rather than recycling it; I never attempted it). Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-28 15:29 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-28 20:28 ` Daniel Phillips 2002-01-28 23:40 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 1 reply; 28+ messages in thread From: Daniel Phillips @ 2002-01-28 20:28 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: rwhron, linux-kernel On January 28, 2002 04:29 pm, Andrea Arcangeli wrote: > On Mon, Jan 28, 2002 at 10:53:25AM +0100, Daniel Phillips wrote: > > On January 25, 2002 01:09 am, Andrea Arcangeli wrote: > > > On Thu, Jan 24, 2002 at 07:27:43AM +0100, Daniel Phillips wrote: > > > > On January 24, 2002 06:23 am, rwhron@earthlink.net wrote: > > > > Even when mostly uncached, dbench still produces flaky results. > > [...] > > > the only problem with > > > dbench is that you can trivially cheat and change the kernel in a broken > > > way, but optimal _only_ for dbench, just to get stellar dbench numbers, > > > > No, this is not the only problem. DBench is just plain *flaky*. You don't > > appear to be clear on why. In short, dbench has two main flaws: > > > > - It's extremely sensitive to scheduling. If one process happens to make > > progress then it gets more heavily cached and its progress becomes even > > greater. The benchmark completes much more quickly in this case, whereas > > if all processes progress at nearly the same rate (by chance) it runs > > more slowly. > > > > - It can happen (again by chance) that dbench files get deleted while still > > in cache, and this process completes in a fraction of the time that real > > disk IO would require. > > > > I've seen successsive runs of dbench *under identical conditions* (that is, > > from a clean reboot etc.) vary by as much as 30%. Others report even greater > > variance. Can we please agree that dbench is useless for benchmarks? > > I never seen it to vary 30% on the same kernel. Just ask around. Marcelo or Andrew Morton would be a good place to start. > Anyways dbench tells you mostly about elevator etc... 
it's a good test > to check the elevator is working properly, the ++ must be mixed with the > dots etc... if the elevator is aggressive enough. Of course that means > the elevator is not perfectly fair but that's the whole point about > having an elevator. It is also an interesting test for page replacement, > but with page replacement it would be possible to write a broken > algorithm that produces good numbers, that's the thing I believe to be > bad about dbench (oh, like tiotest fake numbers too of course). Other > than this it just shows rmap12a has an elevator not aggressive enough > which is probably true, I doubt it has anything to do with the VM > changes in rmap (of course rmap design significant overhead is helping > to slow it down too though), more likely the bomb_segments logic from > Andrew that Rik has included, infact the broken page replacement that > lefts old stuff in cache if something might generate more unfairness > that should generate faster dbench numbers for rmap, but on this last > bit I'm not 100% sure (AFIK to get a fast dbench by cheating with the vm > you need to make sure to cache lots of the readahead as well (also the > one not used yet), but I'm not 100% sure on the effect of lefting old > pollution in cache rather than recycling it, I never attempted it). Interesting analysis. It's a hint at how hard the elevator problem really is. Fairness as in 'equal load distribution' is not the best policy under heavy load, just as it is not the best policy under heavy swapping. Exactly what kind of unfairness is best, though, is a deep, difficult question. I'll bet it doesn't get seriously addressed even in this kernel cycle, or at best, very late in the cycle after the big infrastructure changes settle down. -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-28 20:28 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-28 23:40 ` Andrea Arcangeli 2002-01-29 0:15 ` 2.4.18pre4aa1 Daniel Phillips 0 siblings, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-28 23:40 UTC (permalink / raw) To: Daniel Phillips; +Cc: rwhron, linux-kernel On Mon, Jan 28, 2002 at 09:28:24PM +0100, Daniel Phillips wrote: > Just ask around. Marcelo or Andrew Morton would be a good place to start. ah, btw, if you test with a broken page replacement (kind of random) it's normal to get huge variations. But with my -aa tree you should never get a significant difference (no matter if it's Marcelo or Andrew running the benchmark). I should also say I always mke2fs first when I run my benchmarks, so I don't factor possible filesystem layout differences into the equation, but I doubt (unless you're running a corner case like running out of space or stuff like that) that it will make a significant difference either. > > Anyways dbench tells you mostly about elevator etc... it's a good test > > to check the elevator is working properly, the ++ must be mixed with the > > dots etc... if the elevator is aggressive enough. Of course that means > > the elevator is not perfectly fair but that's the whole point about > > having an elevator. It is also an interesting test for page replacement, > > but with page replacement it would be possible to write a broken > > algorithm that produces good numbers, that's the thing I believe to be > > bad about dbench (oh, like tiotest fake numbers too of course).
Other > > than this it just shows rmap12a has an elevator not aggressive enough > > which is probably true, I doubt it has anything to do with the VM > > changes in rmap (of course rmap design significant overhead is helping > > to slow it down too though), more likely the bomb_segments logic from > > Andrew that Rik has included, infact the broken page replacement that > > lefts old stuff in cache if something might generate more unfairness > > that should generate faster dbench numbers for rmap, but on this last > > bit I'm not 100% sure (AFIK to get a fast dbench by cheating with the vm > > you need to make sure to cache lots of the readahead as well (also the > > one not used yet), but I'm not 100% sure on the effect of lefting old > > pollution in cache rather than recycling it, I never attempted it). > > Interesting analysis. It's a hint at how hard the elevator problem really > is. Fairness as in 'equal load distribution' is not the best policy under > heavy load, just as it is not the best policy under heavy swapping. Exactly as always it depends whether the objective is throughput or latency; for dbench, throughput is the objective. Also the function between throughput and latency is not linear, and it depends on too many factors to find an elevator algorithm that works well on paper. So, with that in mind, one vapourware idea I had while reading your email is to use the feedback from the generated output throughput to know when it's worthwhile to decrease or increase the latency. If decreasing latency doesn't decrease the final throughput generated, that means we're ok to decrease latency even more. As soon as the throughput decreases (despite people waiting on the submit_bh pipeline), we know we'd better not decrease latency further, unless we want to hurt performance. The current elevator (not rmap) is always very permissive, so throughput is ok in dbench (and anything seeking as hard as dbench), but latency often sucks (actually in -aa I decreased the read latency so it's acceptable, not like in mainline, but it's still far from being very reactive under a write flood). The feedback from the output channel to control the latency parameters in a dynamic manner may help to decrease latency when possible (not unconditionally with elvtune). One of the things I love about analog electronics is the operational amplifier chips; a feedback loop solves so many difficult problems so easily. Software can often do similar things. Anyways this is just vapourware (probably quite complex to implement in a generic manner), but fixed algorithms are not likely to give us a solution (we'll be either too permissive or too slow in dbench), while this kind of feedback sounds like something that may solve the problem dynamically, or maybe I'm simply just dreaming :). > what kind of unfairness is best, though, is a deep, difficult question. I'll > bet it doesn't get seriously addressed even in this kernel cycle, or at best, > very late in the cycle after the big infrastructure changes settle down. > > -- > Daniel Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-28 23:40 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-29 0:15 ` Daniel Phillips 2002-01-29 13:05 ` 2.4.18pre4aa1 Pavel Machek 0 siblings, 1 reply; 28+ messages in thread From: Daniel Phillips @ 2002-01-29 0:15 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: rwhron, linux-kernel On January 29, 2002 12:40 am, Andrea Arcangeli wrote: > On Mon, Jan 28, 2002 at 09:28:24PM +0100, Daniel Phillips wrote: > > Just ask around. Marcelo or Andrew Morton would be a good place to start. > > ah, btw, if you test with a broken page replacement (kind of random) > it's normal you get huge variations. > > But with my -aa tree, you should never get a significant difference (no > matter if it's Marcelo or Andrew to run the benchmark). Oh, that's interesting, and actually I can see why that might be (feedback in your VM is quite predictable, so it isn't prone to oscillation). It's not just the VM that affects dbench's running pattern though, it's also scheduling. > I've also to say I always mke2fs first when I run my benchmarks, Yes, and it would be nice if we had an operation to squeeze cache down to its minimum size (whatever that means) just for running benchmarks accurately without rebooting. > so I don't consider > possible filesystem layout differences into the equation but I doubt > (unless you're running with a corner case like running out of space or > stuff like that), that it will make a significant difference either. > > > Anyways dbench tells you mostly about elevator etc... it's a good test > > > to check the elevator is working properly, the ++ must be mixed with the > > > dots etc... if the elevator is aggressive enough. Of course that means > > > the elevator is not perfectly fair but that's the whole point about > > > having an elevator. 
It is also an interesting test for page replacement, > > > but with page replacement it would be possible to write a broken > > > algorithm that produces good numbers, that's the thing I believe to be > > > bad about dbench (oh, like tiotest fake numbers too of course). Other > > > than this it just shows rmap12a has an elevator not aggressive enough > > > which is probably true, I doubt it has anything to do with the VM > > > changes in rmap (of course rmap design significant overhead is helping > > > to slow it down too though), more likely the bomb_segments logic from > > > Andrew that Rik has included, infact the broken page replacement that > > > lefts old stuff in cache if something might generate more unfairness > > > that should generate faster dbench numbers for rmap, but on this last > > > bit I'm not 100% sure (AFIK to get a fast dbench by cheating with the vm > > > you need to make sure to cache lots of the readahead as well (also the > > > one not used yet), but I'm not 100% sure on the effect of lefting old > > > pollution in cache rather than recycling it, I never attempted it). > > > > Interesting analysis. It's a hint at how hard the elevator problem really > > is. Fairness as in 'equal load distribution' is not the best policy under > > heavy load, just as it is not the best policy under heavy swapping. Exactly > > as always it depends if the object is throughput or latency, for dbench > that's the object. > > Also the function between throughtput and latency is not linear and it > depends on too many factors to find an elevator algorithm that works > well on the paper. > > So, in function of that, one vapourware idea I had while reading your > email is to use the feedback from the output througput generated to know > when it's worthwhile to decrease or increase the latency. If decreasing > latency doesn't decrease the final throughput generated, that means > we're ok to decrease latency even more. 
As soon as the throughput > decreases (despite of people waiting on the submit_bh pipeline), we know > we'd better not decrease latency further, unless we want to hurt > performance. But what is the knob by which you control latency? > The current elevator (not rmap) is always very permissive, so throughput > is ok in dbench (and anything seeking as hard as dbench), but latency > often sucks (actually in -aa I decreased the read latency so it's > acceptable, not like in mainline, but still it's far from being very > reactive under a write flood). The feedback from the output channel to > control the latency parameters in a dynamic manner may help to decrease > latency when possible (not unconditionally with elvtune). One of the > thing I love about the analog electronics are the operational chips, a > feedback loop solves so much difficult problems so easily. Software can > do similar things lots of times. Oh yes, that's exactly the way I think of these things, and I did experiment with a similar idea, my 'early flush with bandwidth estimation', earlier this year. What I found is, it's very hard to get a good 'signal' by tracking kernel statistics. By the time I averaged the disk bandwidth enough to get a smooth signal, the lag was way too high to be useful. The statistics just aren't very continuous, so they tend to resist analysis by analog methods. Note: they resist analysis, they don't defy it. > Anyways this is just vapourware > (probably quite complex to implement in a generic manner) but fixed > algorithms are not likely to give us a solution (we'll be either too > permissive or too slow in dbench), while this kind of feedback sounds > like something that may solve the problem dynamically, or maybe I'm > simply just dreaming :). Well I'm dreaming the same dreams, and by coincidence it's the reason I was complaining earlier today on lkml about the lack of good muldiv operations with double-wide intermediate results in the kernel.
Such operators are needed to do the filtering calculations and so on with enough precision - and by this, I mean 'enough choices of divisor' more than 'enough bits' - so the algorithms don't choke on their own noise. But before you can do signal processing, feedback, or whatever, you have to have a good signal. -- Daniel ^ permalink raw reply [flat|nested] 28+ messages in thread
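[Editor's aside: Daniel's smoothing-versus-lag problem is the classic filter trade-off. A sketch in fixed-point arithmetic; this is a hypothetical illustration, and Python's unbounded integers hide the double-wide-intermediate problem C kernel code actually faces.]

```python
def ema_fixed(samples, num=1, den=8, scale=1 << 16):
    """Exponential moving average kept in a scaled integer accumulator,
    avg += (sample - avg) * num / den, the way code without floating
    point would do it.  A small num/den ratio smooths the signal but
    lags it; a large one tracks quickly but stays noisy."""
    avg = samples[0] * scale
    out = []
    for s in samples:
        avg += (s * scale - avg) * num // den
        out.append(avg / scale)
    return out

# A step in measured "disk bandwidth": with 1/8 weighting the average
# needs many samples to approach the new level -- Daniel's lag problem.
step = [10] * 4 + [30] * 12
print([round(x, 1) for x in ema_fixed(step)])
```

Raising num/den shortens the lag at the cost of letting more run-to-run noise through, which is why having enough choices of divisor matters more than raw bit width.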
* Re: 2.4.18pre4aa1 2002-01-29 0:15 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-29 13:05 ` Pavel Machek 0 siblings, 0 replies; 28+ messages in thread From: Pavel Machek @ 2002-01-29 13:05 UTC (permalink / raw) To: Daniel Phillips; +Cc: Andrea Arcangeli, rwhron, linux-kernel Hi! > > I've also to say I always mke2fs first when I run my benchmarks, > > Yes, and it would be nice if we had an operation to squeeze cache down to > its minimum size (whatever that means) just for running benchmarks > accurately without rebooting. Take a look at swsusp -- it frees as much memory as possible before doing anything. Pavel -- Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt, details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-25 0:19 ` rwhron 2002-01-25 0:29 ` 2.4.18pre4aa1 Rik van Riel 1 sibling, 1 reply; 28+ messages in thread From: rwhron @ 2002-01-25 0:19 UTC (permalink / raw) To: Daniel Phillips; +Cc: linux-kernel > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > Even when mostly uncached, dbench still produces flaky results. dbench results are not perfectly repeatable. I agree that dbench results that vary by 20% or so may not be meaningful. I think dbench is of some value though. In some cases the difference between kernels is 200% or more. Below are results from a couple of aa releases, and a few rmap releases. Some of the tests were run twice. You can see that there is some variation between "identical" runs. You can see that aa kernels do extremely well with large numbers of processes, and as the number of processes increases from 64 -> 128 -> 192, the throughput drops in a predictable way. rmap, when compared with most other kernels, does well with 64 processes. At 192, rmap doesn't do as well. That may be useful information for the people developing rmap.
dbench 64 processes
2.4.18pre4aa1     ************************************************** 25.2 MB/sec
2.4.18pre2aa2     ******************************************** 22.2 MB/sec
2.4.17rmap11a     **************************** 14.2 MB/sec
2.4.17rmap11a     *************************** 13.9 MB/sec
2.4.17rmap12a     *************************** 13.7 MB/sec
2.4.18pre3rmap11b ********************** 11.4 MB/sec
2.4.17rmap11c     ********************* 10.8 MB/sec
2.4.17rmap11c     ********************* 10.6 MB/sec
2.4.17rmap11b     ******************* 9.7 MB/sec

dbench 128 processes
2.4.18pre4aa1     ******************************** 16.4 MB/sec
2.4.18pre2aa2     ******************************** 16.3 MB/sec
2.4.18pre2aa2     ***************************** 14.9 MB/sec
2.4.17rmap11a     ************ 6.1 MB/sec
2.4.17rmap11a     ************ 6.1 MB/sec
2.4.18pre3rmap11b ********** 5.1 MB/sec
2.4.17rmap11b     ********* 5.0 MB/sec
2.4.17rmap12a     ********* 4.5 MB/sec
2.4.17rmap11c     ******** 4.2 MB/sec
2.4.17rmap11c     ******** 4.2 MB/sec

dbench 192 processes
2.4.18pre2aa2     ***************** 8.8 MB/sec
2.4.18pre4aa1     **************** 8.2 MB/sec
2.4.18pre2aa2     *************** 7.7 MB/sec
2.4.17rmap11a     ******** 4.4 MB/sec
2.4.17rmap11a     ******** 4.3 MB/sec
2.4.18pre3rmap11b ******* 3.8 MB/sec
2.4.17rmap11b     ******* 3.8 MB/sec
2.4.17rmap12a     ****** 3.1 MB/sec
2.4.17rmap11c     ***** 3.0 MB/sec
2.4.17rmap11c     ***** 2.9 MB/sec

On the other hand, rmap does very well with sequential reads on tiobench, which is running a lot fewer processes than dbench.
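[Editor's aside: Randy's point that -aa degrades "in a predictable way" while rmap falls off a cliff at 128 processes can be made concrete. A small sketch using numbers read off the charts above (first listed run of each kernel):]

```python
# Throughput (MB/sec) at 64, 128 and 192 dbench processes, taken from
# the charts above (first listed run of each kernel).
results = {
    "2.4.18pre4aa1": (25.2, 16.4, 8.2),
    "2.4.17rmap12a": (13.7, 4.5, 3.1),
}

for kernel, (t64, t128, t192) in results.items():
    print(f"{kernel}: 64->128 keeps {t128 / t64:.0%}, "
          f"128->192 keeps {t192 / t128:.0%} of throughput")
```

The -aa tree loses throughput gradually at each step, while rmap keeps only about a third of its throughput going from 64 to 128 processes, which is the cliff the thread goes on to discuss.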
Read, Write, and Seeks are MB/sec

                   Num  Seq Read       Rand Read    Seq Write      Rand Write
                   Thr  Rate  (CPU%)   Rate (CPU%)  Rate  (CPU%)   Rate (CPU%)
                   ---  -------------  -----------  -------------  -----------
2.4.17rmap12a       1   22.85  32.2%   1.15  2.2%   13.10  83.5%   0.71  1.6%
2.4.18pre2aa2       1   11.96  23.1%   2.24  3.2%   12.90  76.8%   0.71  1.6%
2.4.18pre4aa1       1   11.23  21.3%   3.12  4.8%   11.92  66.1%   0.66  1.3%
2.4.17rmap12a       2   22.07  32.1%   1.20  2.2%   12.84  80.4%   0.71  1.6%
2.4.18pre2aa2       2   11.09  22.0%   2.57  3.2%   13.10  76.3%   0.70  1.6%
2.4.18pre4aa1       2   10.68  20.9%   3.39  4.4%   12.14  67.9%   0.67  1.3%
2.4.17rmap12a       4   21.75  32.0%   1.20  2.2%   12.69  78.5%   0.71  1.6%
2.4.18pre2aa2       4   10.52  21.1%   2.82  3.6%   12.84  73.9%   0.69  1.5%
2.4.18pre4aa1       4   10.48  20.4%   3.56  4.2%   12.28  69.0%   0.67  1.4%
2.4.17rmap12a       8   21.34  31.8%   1.23  2.3%   12.57  77.3%   0.71  1.7%
2.4.18pre2aa2       8   10.24  19.5%   3.01  4.0%   12.94  74.1%   0.70  1.6%
2.4.18pre4aa1       8   10.08  18.9%   3.63  4.5%   12.24  68.8%   0.67  1.4%

I added bonnie++ to the list of tests a day or so ago. I'll begin putting those results up in the near future. -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 0:29 ` Rik van Riel 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron 0 siblings, 1 reply; 28+ messages in thread From: Rik van Riel @ 2002-01-25 0:29 UTC (permalink / raw) To: rwhron; +Cc: Daniel Phillips, linux-kernel On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html > > > > Even when mostly uncached, dbench still produces flaky results. > Below are results from a couple of aa releases, and a few rmap > releases. [snip results: -aa twice as fast as -rmap for dbench, -rmap twice as fast as -aa for tiobench] What would be interesting here are the dbench dots, where a '+' indicates that a program exits. It's possible that under one of the kernels the programs are getting throttled differently and some of the dbench processes exit _way_ earlier than the others, leaving a much lighter load on the rest of the system for the second part of the test. It would be interesting to see the dbench dots from both -aa and -rmap ;) regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
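[Editor's aside: Rik's reading of the dots can be automated. A sketch with a made-up progress string; per his message, a '+' marks a dbench process exiting, and the sample below is hypothetical, not captured output.]

```python
def exit_points(dots):
    """Return, for each '+' in a dbench progress string, how far through
    the run (as a fraction of all output characters) it appeared.
    Early '+' marks mean some processes finished long before the rest,
    leaving a much lighter load for the tail of the test."""
    total = len(dots)
    return [i / total for i, c in enumerate(dots) if c == "+"]

# Hypothetical output: one process exits halfway, one at three quarters,
# the last two only at the very end.
sample = "." * 49 + "+" + "." * 24 + "+" + "." * 23 + "++"
print([round(p, 2) for p in exit_points(sample)])
```

Applied to the real dots files Randy links below, clustering of '+' marks near the end would indicate the smoother-throttling behaviour Rik describes for rmap, and early '+' marks the unfair-but-fast behaviour of -aa.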
* Re: 2.4.18pre4aa1 2002-01-25 0:29 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 3:23 ` rwhron 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel 2002-01-28 0:37 ` 2.4.18pre4aa1 Andrea Arcangeli 0 siblings, 2 replies; 28+ messages in thread From: rwhron @ 2002-01-25 3:23 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel > [snip results: -aa twice as fast as -rmap for dbench, > -rmap twice as fast as -aa for tiobench] Look closely at all the numbers:

dbench 64 128 192 on ext completed in 4500 seconds on 2.4.18pre4aa1
dbench 64 128 192 on ext completed in 12471 seconds on 2.4.17rmap12a

2.4.18pre4aa1 completed the three dbenches in about a third of the time (2.77x as fast).

For tiobench: Tiobench is interesting because it has the CPU% column. I mentioned sequential reads because it's a bench where 2.4.17rmap12a was faster. Someone else might say 2.4.18pre4aa1 was 2.7x as fast at random reads. Let's analyze CPU efficiency (rate divided by CPU fraction) where threads = 1:

                   Num  Seq Read       Rand Read    Seq Write      Rand Write
                   Thr  Rate  (CPU%)   Rate (CPU%)  Rate  (CPU%)   Rate (CPU%)
                   ---  -------------  -----------  -------------  -----------
2.4.17rmap12a       1   22.85  32.2%   1.15  2.2%   13.10  83.5%   0.71  1.6%
2.4.18pre4aa1       1   11.23  21.3%   3.12  4.8%   11.92  66.1%   0.66  1.3%

Sequential Read CPU Efficiency
2.4.18pre4aa1  11.23 / .213 = 52.72
2.4.17rmap12a  22.85 / .322 = 70.96
2.4.17rmap12a was 35% more CPU efficient.

Random Read CPU Efficiency
2.4.18pre4aa1  3.12 / .048 = 65.00
2.4.17rmap12a  1.15 / .022 = 52.27
2.4.18pre4aa1 was 24% more CPU efficient.

Sequential Write CPU Efficiency
2.4.18pre4aa1  11.92 / .661 = 18.03
2.4.17rmap12a  13.10 / .835 = 15.69
2.4.18pre4aa1 was 15% more CPU efficient.

Random Write CPU Efficiency
2.4.18pre4aa1  0.66 / .013 = 50.77
2.4.17rmap12a  0.71 / .016 = 44.38
2.4.18pre4aa1 was 14% more CPU efficient.

> It would be interesting to see the dbench dots from both > -aa and -rmap ;) All the dots are at: http://home.earthlink.net/~rwhron/kernel/dots/ -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
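[Editor's aside: the efficiency arithmetic above reduces to rate divided by CPU fraction. A short sketch that reproduces Randy's percentages from the single-thread tiobench rows; the dict layout is editorial, the numbers are his.]

```python
# (rate MB/sec, CPU fraction) from the single-thread tiobench rows above.
rows = {
    "2.4.17rmap12a": {"seq read": (22.85, 0.322), "rand read": (1.15, 0.022),
                      "seq write": (13.10, 0.835), "rand write": (0.71, 0.016)},
    "2.4.18pre4aa1": {"seq read": (11.23, 0.213), "rand read": (3.12, 0.048),
                      "seq write": (11.92, 0.661), "rand write": (0.66, 0.013)},
}

for test in ["seq read", "rand read", "seq write", "rand write"]:
    # MB/sec delivered per unit of CPU consumed:
    eff = {k: v[test][0] / v[test][1] for k, v in rows.items()}
    winner = max(eff, key=eff.get)
    loser = min(eff, key=eff.get)
    gain = eff[winner] / eff[loser] - 1
    print(f"{test:10s}: {winner} is {gain:.0%} more CPU efficient")
```

This reproduces the 35% / 24% / 15% / 14% figures in the message, with rmap winning only the sequential-read column.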
* Re: 2.4.18pre4aa1 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 3:35 ` Rik van Riel 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones 2002-01-28 0:37 ` 2.4.18pre4aa1 Andrea Arcangeli 1 sibling, 2 replies; 28+ messages in thread From: Rik van Riel @ 2002-01-25 3:35 UTC (permalink / raw) To: rwhron; +Cc: linux-kernel On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > It would be interesting to see the dbench dots from both > > -aa and -rmap ;) > > All the dots are at: > http://home.earthlink.net/~rwhron/kernel/dots/ I think we have an explanation here. With dbench 192 on -aa the first processes exit around halfway through the dbench test, and by the end only a few processes are left. With rmap the write throttling is a bit smoother, but this results in all processes running to about 70% of the test and many more processes running in the last part of the test, exiting simultaneously. Considering the possible bad consequences for real workloads, I'm not sure I want to make the system more unfair just to better accommodate dbench ;) regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 4:56 ` rwhron 2002-01-25 4:57 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones 1 sibling, 1 reply; 28+ messages in thread From: rwhron @ 2002-01-25 4:56 UTC (permalink / raw) To: Rik van Riel; +Cc: linux-kernel > workloads, I'm not sure I want to make the system more > unfair just to better accomodate dbench ;) I'm wondering if rmap is a little too aggressive on read-ahead, and if that has a negative impact on a complex workload. -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 4:57 ` Rik van Riel 2002-01-25 5:18 ` 2.4.18pre4aa1 David Weinehall 0 siblings, 1 reply; 28+ messages in thread From: Rik van Riel @ 2002-01-25 4:57 UTC (permalink / raw) To: rwhron; +Cc: linux-kernel On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > workloads, I'm not sure I want to make the system more > > unfair just to better accomodate dbench ;) > > I'm wondering if rmap is a little too aggressive on > read-ahead, and if that has a negative impact on > a complex workload. I haven't changed the readahead code one bit compared to 2.4 mainline, but I'm wondering the same. Fixing readahead window sizing has been on my TODO list for quite a while already. regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 4:57 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 5:18 ` David Weinehall 2002-01-25 17:03 ` 2.4.18pre4aa1 Rik van Riel 0 siblings, 1 reply; 28+ messages in thread From: David Weinehall @ 2002-01-25 5:18 UTC (permalink / raw) To: Rik van Riel; +Cc: rwhron, linux-kernel On Fri, Jan 25, 2002 at 02:57:02AM -0200, Rik van Riel wrote: > On Thu, 24 Jan 2002 rwhron@earthlink.net wrote: > > > > workloads, I'm not sure I want to make the system more > > > unfair just to better accommodate dbench ;) > > > > I'm wondering if rmap is a little too aggressive on > > read-ahead, and if that has a negative impact on > > a complex workload. > > I haven't changed the readahead code one bit compared > to 2.4 mainline, but I'm wondering the same. > > Fixing readahead window sizing has been on my TODO list > for quite a while already. One thing that struck me about this: don't both the rmap-patches and the aa-patches contain changes other than merely changes to the VM? If so, couldn't these changes tip the result in an unfair direction? After all, what we want is a VM-to-VM shoot-out, not a VM-to-VM+whatever shoot-out. And one would assume that the non VM-related changes would be merged to the kernel no matter what VM is used, right? Then again, maybe I just ate the blue pill and returned to a world of illusions not knowing what's best for me. Regards: David Weinehall _ _ // David Weinehall <tao@acc.umu.se> /> Northern lights wander \\ // Maintainer of the v2.0 kernel // Dance across the winter sky // \> http://www.acc.umu.se/~tao/ </ Full colour fire </ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 5:18 ` 2.4.18pre4aa1 David Weinehall @ 2002-01-25 17:03 ` Rik van Riel 2002-01-25 17:29 ` 2.4.18pre4aa1 Dave Jones 0 siblings, 1 reply; 28+ messages in thread From: Rik van Riel @ 2002-01-25 17:03 UTC (permalink / raw) To: David Weinehall; +Cc: rwhron, linux-kernel On Fri, 25 Jan 2002, David Weinehall wrote: > One thing that struck me about this; doesn't both the rmap-patches and > the aa-patches contain other changes than merely changes to the VM? If > so, couldn't these changes tip the result in an unfair direction?! After > all, what we want is a VM-to-VM shoot-out, not a VM-to-VM+whatever > shoot-out. After all, one would assume that the non VM-related changes > would be merged to the kernel no matter what VM is used, right? The -aa kernel seems to contain patches to a few dozen subsystems. The -rmap patch is pretty much only VM changes. You're right that this is not a strict VM vs VM comparison... kind regards, Rik -- "Linux holds advantages over the single-vendor commercial OS" -- Microsoft's "Competing with Linux" document http://www.surriel.com/ http://distro.conectiva.com/ ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 17:03 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-25 17:29 ` Dave Jones 0 siblings, 0 replies; 28+ messages in thread From: Dave Jones @ 2002-01-25 17:29 UTC (permalink / raw) To: Rik van Riel; +Cc: David Weinehall, rwhron, linux-kernel On Fri, Jan 25, 2002 at 03:03:16PM -0200, Rik van Riel wrote: > The -aa kernel seems to contain patches to a few dozen subsystems. > The -rmap patch is pretty much only VM changes. > You're right that this is not a strict VM vs VM comparison... Agreed. Andrea's tree seemed to gain quite a bit of a lead when bits of the lowlat patches were applied, for example. Just taking 00_vm_?? from ../people/andrea/.. would give a better comparison for a head-to-head vm pissing contest. -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron @ 2002-01-25 12:26 ` Dave Jones 2002-01-25 14:57 ` 2.4.18pre4aa1 rwhron 1 sibling, 1 reply; 28+ messages in thread From: Dave Jones @ 2002-01-25 12:26 UTC (permalink / raw) To: Rik van Riel; +Cc: rwhron, linux-kernel On Fri, Jan 25, 2002 at 01:35:08AM -0200, Rik van Riel wrote: > Considering the possible bad consequences for real > workloads, I'm not sure I want to make the system more > unfair just to better accommodate dbench ;) It may be useful if Randy could throw a real world test into the benchmarking, to get a better comparison of the various systems. The obvious one that springs to mind would be something like compilation of a large source tree kernel/mozilla/etc.. (same version, same config options every time). Though, as compilation is largely compute bound instead of IO bound, the more small files that need to be read/generated the better. Or maybe timing an updatedb. It's real-world enough in that it's a daily task and generates lots of IO.. -- | Dave Jones. http://www.codemonkey.org.uk | SuSE Labs ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones @ 2002-01-25 14:57 ` rwhron 0 siblings, 0 replies; 28+ messages in thread From: rwhron @ 2002-01-25 14:57 UTC (permalink / raw) To: Dave Jones, linux-kernel > it may be useful if Randy can throw a real world test > into the benchmarking, to get a better comparison of > the various systems. The obvious one that springs to mind > would be something like compilation of a large source tree Thanks for the feedback. 2.5.2-dj5 wins the lucky "first-timer" award on the new tests. Extract/configure/make/check autoconf-2.52: Executes over 100000 processes and creates a lot of small temporary files. Won't hit the disk much on this box. Extract/Configure/make/test perl-5.6.1: For perl, "make test" is executed 5 times. "make test" is about 75% system and 25% user, which may provide more variation between kernel versions. > Or maybe timing an updatedb. Its realworld enough in that its > a daily task, generates lots of IO.. I'll time updatedb too. updatedb may vary over time, depending on how many src trees are extracted. I'll make an effort to keep that variable consistent. -- Randy Hron ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel @ 2002-01-28 0:37 ` Andrea Arcangeli 1 sibling, 0 replies; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-28 0:37 UTC (permalink / raw) To: rwhron; +Cc: Rik van Riel, linux-kernel On Thu, Jan 24, 2002 at 10:23:57PM -0500, rwhron@earthlink.net wrote: > > [snip results: -aa twice as fast as -rmap for dbench, > > -rmap twice as fast as -aa for tiobench] > > Look closely at all the numbers: > > dbench 64 128 192 on ext completed in 4500 seconds on 2.4.18pre4aa1 > dbench 64 128 192 on ext completed in 12471 seconds on 2.4.17rmap12a > > 2.4.18pre4aa1 completed the three dbenches 277% faster. > > For tiobench: > > Tiobench is interesting because it has the CPU% column. I mentioned > sequential reads because it's a bench where 2.4.17rmap12a was faster. > Someone else might say 2.4.18pre4aa1 was 271% faster at random reads. > Let's analyze CPU efficiency where threads = 1: > > Num Seq Read Rand Read Seq Write Rand Write > Thr Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%) > --- ------------- ----------- ------------- ----------- > 2.4.17rmap12a 1 22.85 32.2% 1.15 2.2% 13.10 83.5% 0.71 1.6% > 2.4.18pre4aa1 1 11.23 21.3% 3.12 4.8% 11.92 66.1% 0.66 1.3% Those weird numbers generated by rmap12a on tiobench show that the page replacement algorithm in rmap is not able to detect cache pollution: it leaves the pollution in cache rather than discarding it, so later reads end up being served from cache instead of from disk. Since tiobench is an I/O benchmark, the above is a completely fake result; seq read I/O is not going to be faster with rmap. If you change tiobench to remount the fs where the output files have been generated between the "random write" and the "seq read" tests, you should get comparable numbers. 
I don't consider it goodness that rmap12a leaves old pollution in the caches; that seems to prove it will do the wrong thing when the most recently used data is part of the working set (for example, after you do the first cvs checkout you want the second checkout not to hit the disk, but the page replacement in rmap12a would hit the disk the second time too). In some ways tiobench has the same problems as dbench: a broken page replacement algorithm can generate stellar numbers in both benchmarks. Furthermore, running the 'seq read' after the 'random write' (tiobench does that) adds even more randomness to the output of the 'seq read', because the 'random read' and 'random write' tests are not comparable in the first place either: the random seed is always set up differently. Also, to make a real 'seq read' test, the 'seq read' should be run after the 'seq write', not after the 'random write' (even assuming the random seed is always initialized to the same value). Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-24 5:23 2.4.18pre4aa1 rwhron 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips @ 2002-01-25 0:11 ` Andrea Arcangeli 1 sibling, 0 replies; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-25 0:11 UTC (permalink / raw) To: rwhron; +Cc: linux-kernel On Thu, Jan 24, 2002 at 12:23:42AM -0500, rwhron@earthlink.net wrote: > Changelog with history at: > http://home.earthlink.net/~rwhron/kernel/2.4.18pre4aa1.html > > Benchmarks on 2.4.18pre4aa1 and lots of other kernels at: > http://home.earthlink.net/~rwhron/kernel/k6-2-475.html Randy, I will reiterate the obvious, but your reliable and impartial performance feedback is extremely helpful. Thanks, Keep up the good work :), Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
* 2.4.18pre4aa1 @ 2002-01-22 6:48 Andrea Arcangeli 2002-01-22 6:58 ` 2.4.18pre4aa1 Robert Love 0 siblings, 1 reply; 28+ messages in thread From: Andrea Arcangeli @ 2002-01-22 6:48 UTC (permalink / raw) To: linux-kernel This is the first release moving the pagetables into highmem. It only compiles on x86 and it is still a bit experimental, though I couldn't reproduce any problems yet. The new pte-highmem patch can be downloaded from here: ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre4aa1/20_pte-highmem-6 Next relevant things to do are the non-x86 archs compilation, and I'd like to sort out the vary-IO for rawio and the hardblocksize-O_DIRECT patch. URL: ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre4aa1.bz2 ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.18pre4aa1/ Diff between 2.4.18pre2aa2 and 2.4.18pre4aa1 follows: Only in 2.4.18pre2aa2: 00_3.5G-address-space-2 Only in 2.4.18pre4aa1/: 00_3.5G-address-space-3 Merge 1-2-3 GB option. Only in 2.4.18pre4aa1/: 00_access_process_vm-1 Fix oops in access_process_vm (get_area_pages will set the page pointer to NULL on non-ram maps). Only in 2.4.18pre4aa1/: 00_allow_mixed_b_size-1 This is the groundwork for the O_DIRECT-hardblocksize patch, and for the IOvary patch for rawio. In short this prevents the merging of different b_size in the same request at the blkdev layer. After I mentioned this, Jens immediately sent me a patch, and here it is. So now I'd suggest dropping the varyIO thing, which shouldn't be necessary any longer, and porting the rawio-large-bsize and O_DIRECT-hardblocksize patches on top of my current tree. I'd like to include both. Of course the O_DIRECT-hardblocksize patch can also take advantage of the large-b_size improvement to brw_kiovec during large hardblocksize-aligned requests. At least unless we want to change the alignment requirements, in which case the varyIO info would still be valuable. 
About the O_DIRECT-hardblocksize patch there's also another problem though: if get_block says that the buffer is new(), then the whole "soft" block must be cleared out, if not completely written implicitly by the write. I just fixed similar bugs in the presence of I/O errors or ENOSPC with O_DIRECT, and I don't want to reintroduce the very same problem while adding a new feature. The buffer_new() path is a very slow path from the DB usage point of view, so it's perfectly fine there to just write out the zero page (or something like that) on the blocks around in a synchronous manner etc.. Only in 2.4.18pre4aa1/: 00_icmp-offset-1 Remote security fix from Andi (see bugtraq). Only in 2.4.18pre4aa1/: 00_init-blk-freelist-1 The request cmd field wasn't initialized when first queued into the blkdev layer, so if a request was dequeued and then re-enqueued without being used, it could get unbalanced. Now it is always initialized during get_request, so it certainly works right. Only in 2.4.18pre2aa2: 00_msync-ret-1 Only in 2.4.18pre2aa2: 00_page-cache-release-1 Only in 2.4.18pre2aa2: 00_ramdisk-buffercache-2 Only in 2.4.18pre2aa2: 00_truncate-garbage-1 Merged in mainline. Only in 2.4.18pre2aa2: 00_vmalloc-tlb-flush-1 Merged into mainline (modulo Jeff having implemented pagetable walking/tlb misses into uml that doesn't assume the tlb flush [ouch, right Andrew, tlb invalidate :) ] comes first). Only in 2.4.18pre2aa2: 00_nfs-2.4.17-cto-1 Only in 2.4.18pre4aa1/: 00_nfs-2.4.17-cto-2 Only in 2.4.18pre2aa2: 00_nfs-bkl-1 Only in 2.4.18pre4aa1/: 00_nfs-bkl-2 Only in 2.4.18pre2aa2: 00_nfs-rdplus-1 Only in 2.4.18pre4aa1/: 00_nfs-rdplus-2 Only in 2.4.18pre2aa2: 00_nfs-svc_tcp-1 Only in 2.4.18pre4aa1/: 00_nfs-svc_tcp-2 Only in 2.4.18pre2aa2: 00_nfs-tcp-tweaks-1 Only in 2.4.18pre4aa1/: 00_nfs-tcp-tweaks-2 Only in 2.4.18pre4aa1/: 10_nfs-o_direct-1 New NFS updates from Trond. 
Only in 2.4.18pre2aa2: 00_rwsem-fair-25 Only in 2.4.18pre2aa2: 00_rwsem-fair-25-recursive-7 Only in 2.4.18pre4aa1/: 00_rwsem-fair-26 Only in 2.4.18pre4aa1/: 00_rwsem-fair-26-recursive-7 Rediffed. Only in 2.4.18pre4aa1/: 00_waitfor-one-page-1 Export complaining symbol. Only in 2.4.18pre2aa2: 10_vm-22 Only in 2.4.18pre4aa1/: 10_vm-23 Minor changes (try to always do some relevant work during the refiling). Only in 2.4.18pre4aa1/: 20_pte-highmem-6 First "working" version of the pte-highmem patch; this fixes (or at least "should fix" :) lots of bugs. pte_offset_lowmem is still there because kmap doesn't yet work by the time pte_offset_lowmem is recalled. Lots of fixes, special thanks to Hugh, Linus and others for the review and the feedback! All drivers should be updated. Works for me so far. Only in 2.4.18pre2aa2: 30_dyn-sched-2 Only in 2.4.18pre4aa1/: 30_dyn-sched-3 Minor changes; volatile would be needed only to avoid confusing gcc, but nobody cares about variables changing under gcc anyway, so let's remove it and it will be a little faster. Only in 2.4.18pre2aa2: 50_uml-patch-2.4.17-4.bz2 Only in 2.4.18pre4aa1/: 50_uml-patch-2.4.17-7.bz2 Latest update from Jeff (hopefully vmalloc works even though it doesn't start with the tlb invalidate). Only in 2.4.18pre4aa1/: 60_show-stack-1 Export symbol, so CONFIG_TUX_DEBUG has a chance to generate a loadable kernel module. Only in 2.4.18pre2aa2: 60_tux-vfs-4 Only in 2.4.18pre4aa1/: 60_tux-vfs-5 Rediffed. Andrea ^ permalink raw reply [flat|nested] 28+ messages in thread
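The buffer_new() requirement described in the changelog message above (when get_block reports a freshly allocated block, everything in the "soft" block not covered by the write must be cleared, or stale on-disk contents would leak out on a later read) boils down to an invariant that is easy to model outside the kernel. The block size, offsets, and fill bytes below are arbitrary illustration values, not kernel constants:

```python
BLOCK_SIZE = 4096

def fill_new_block(block, is_new, data, off):
    """Lay a sub-block write into a filesystem block.  If the block was
    reported as new, zero everything outside the written range first."""
    assert len(block) == BLOCK_SIZE and off + len(data) <= BLOCK_SIZE
    if is_new:
        block[:] = bytes(BLOCK_SIZE)        # clear the whole soft block
    block[off:off + len(data)] = data       # then lay down the write

block = bytearray(b"\xaa" * BLOCK_SIZE)     # pretend: stale on-disk junk
fill_new_block(block, True, b"\x55" * 512, 1024)
print(block[0], block[1024], block[2048])   # -> 0 85 0
```

Skipping the `is_new` memset is exactly the bug class Andrea describes: the write itself would land correctly, but the 0xAA "junk" outside [1024, 1536) would survive and be visible to a later reader.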
* Re: 2.4.18pre4aa1 2002-01-22 6:48 2.4.18pre4aa1 Andrea Arcangeli @ 2002-01-22 6:58 ` Robert Love 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen 0 siblings, 1 reply; 28+ messages in thread From: Robert Love @ 2002-01-22 6:58 UTC (permalink / raw) To: Andrea Arcangeli; +Cc: linux-kernel On Tue, 2002-01-22 at 01:48, Andrea Arcangeli wrote: > Only in 2.4.18pre4aa1/: 00_icmp-offset-1 > > Remote security fix from Andi (see bugtraq). Are we sure this works? I thought I saw someone (IRC perhaps?) who had weird anomalies with this fix (although it does certainly fix the hole). > Only in 2.4.18pre2aa2: 10_vm-22 > Only in 2.4.18pre4aa1/: 10_vm-23 > > Minor changes (try to always do some relevant work during the > refiling). When will we see this in 2.4 stock? ;-) I know you have said you are busy, but it would be great to get the bits pushed to Marcelo in reasonable documented chunks so he can merge them... Also, these should be pushed to Linus, too. Same VM in 2.5, after all. Robert Love ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 6:58 ` 2.4.18pre4aa1 Robert Love @ 2002-01-22 7:37 ` Dan Chen 2002-01-22 7:43 ` 2.4.18pre4aa1 Robert Love 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King 0 siblings, 2 replies; 28+ messages in thread From: Dan Chen @ 2002-01-22 7:37 UTC (permalink / raw) To: Robert Love; +Cc: linux-kernel [-- Attachment #1: Type: text/plain, Size: 715 bytes --] No weird anomalies here. I believe the ones you refer to were a result of ipv6 bits not being updated as well. Russell posted two patches for those. http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428323&w=2 http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428401&w=2 On Tue, Jan 22, 2002 at 01:58:58AM -0500, Robert Love wrote: > > Only in 2.4.18pre4aa1/: 00_icmp-offset-1 > > > > Remote security fix from Andi (see bugtraq). > > Are we sure this works? I thought I saw someone (IRC perhaps?) who had > weird anomalies with this fix (although it does certainly fix the hole). -- Dan Chen crimsun@email.unc.edu GPG key: www.unc.edu/~crimsun/pubkey.gpg.asc [-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --] ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen @ 2002-01-22 7:43 ` Robert Love 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King 1 sibling, 0 replies; 28+ messages in thread From: Robert Love @ 2002-01-22 7:43 UTC (permalink / raw) To: Dan Chen; +Cc: linux-kernel On Tue, 2002-01-22 at 02:37, Dan Chen wrote: > No weird anomalies here. I believe the ones you refer to were a result > of ipv6 bits not being updated as well. Russell posted two patches for > those. > > http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428323&w=2 > http://marc.theaimsgroup.com/?l=linux-kernel&m=101164602428401&w=2 Maybe, although I seem to recall odd ICMP behavior being the problem. Although I don't think the above is in -aa. Andrea, perhaps this too should be merged? Ideally this will all show up in 2.4-proper soon, anyhow. Robert Love ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen 2002-01-22 7:43 ` 2.4.18pre4aa1 Robert Love @ 2002-01-22 10:02 ` Russell King 2002-01-22 10:12 ` 2.4.18pre4aa1 Robert Love 1 sibling, 1 reply; 28+ messages in thread From: Russell King @ 2002-01-22 10:02 UTC (permalink / raw) To: Dan Chen; +Cc: Robert Love, linux-kernel On Tue, Jan 22, 2002 at 02:37:42AM -0500, Dan Chen wrote: > No weird anomalies here. I believe the ones you refer to were a result > of ipv6 bits not being updated as well. Russell posted two patches for > those. No - I do see weirdness in ipv4 as well: bash-2.04# uptime 10:00am up 18:57, 1 user, load average: 0.02, 0.03, 0.00 bash-2.04# dmesg|grep 'broad' 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. 127.0.0.1 sent an invalid ICMP error to a broadcast. Only one of these happened on boot. The rest randomly pop up over time. I'm going to try tcpdumping lo to see if I can work out what's causing them. -- Russell King (rmk@arm.linux.org.uk) The developer of ARM Linux http://www.arm.linux.org.uk/personal/aboutme.html ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: 2.4.18pre4aa1 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King @ 2002-01-22 10:12 ` Robert Love 0 siblings, 0 replies; 28+ messages in thread From: Robert Love @ 2002-01-22 10:12 UTC (permalink / raw) To: Russell King; +Cc: Dan Chen, linux-kernel, andrea On Tue, 2002-01-22 at 05:02, Russell King wrote: > On Tue, Jan 22, 2002 at 02:37:42AM -0500, Dan Chen wrote: > > No weird anomalies here. I believe the ones you refer to were a result > > of ipv6 bits not being updated as well. Russell posted two patches for > > those. > > No - I do see weirdness in ipv4 as well: OK, this is the anomaly I spoke of. Weird ICMP errors. I've seen others with this problem. I don't think we have a proper solution here. > bash-2.04# uptime > 10:00am up 18:57, 1 user, load average: 0.02, 0.03, 0.00 > bash-2.04# dmesg|grep 'broad' > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > 127.0.0.1 sent an invalid ICMP error to a broadcast. > > Only one of these happened on boot. The rest randomly pop up over time. > I'm going to try tcpdumping lo to see if I can work out what's causing > them. Robert Love ^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2002-01-31 20:37 UTC | newest] Thread overview: 28+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-01-24 5:23 2.4.18pre4aa1 rwhron 2002-01-24 6:27 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-25 0:09 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-28 9:53 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-28 15:29 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-28 20:28 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-28 23:40 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-29 0:15 ` 2.4.18pre4aa1 Daniel Phillips 2002-01-29 13:05 ` 2.4.18pre4aa1 Pavel Machek 2002-01-25 0:19 ` 2.4.18pre4aa1 rwhron 2002-01-25 0:29 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 3:23 ` 2.4.18pre4aa1 rwhron 2002-01-25 3:35 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 4:56 ` 2.4.18pre4aa1 rwhron 2002-01-25 4:57 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 5:18 ` 2.4.18pre4aa1 David Weinehall 2002-01-25 17:03 ` 2.4.18pre4aa1 Rik van Riel 2002-01-25 17:29 ` 2.4.18pre4aa1 Dave Jones 2002-01-25 12:26 ` 2.4.18pre4aa1 Dave Jones 2002-01-25 14:57 ` 2.4.18pre4aa1 rwhron 2002-01-28 0:37 ` 2.4.18pre4aa1 Andrea Arcangeli 2002-01-25 0:11 ` 2.4.18pre4aa1 Andrea Arcangeli -- strict thread matches above, loose matches on Subject: below -- 2002-01-22 6:48 2.4.18pre4aa1 Andrea Arcangeli 2002-01-22 6:58 ` 2.4.18pre4aa1 Robert Love 2002-01-22 7:37 ` 2.4.18pre4aa1 Dan Chen 2002-01-22 7:43 ` 2.4.18pre4aa1 Robert Love 2002-01-22 10:02 ` 2.4.18pre4aa1 Russell King 2002-01-22 10:12 ` 2.4.18pre4aa1 Robert Love