* Re: Minutes from Feb 21 LSE Call
2003-02-22 23:15 ` Larry McVoy
@ 2003-02-22 23:23 ` Christoph Hellwig
2003-02-22 23:54 ` Mark Hahn
2003-02-22 23:44 ` Martin J. Bligh
` (2 subsequent siblings)
3 siblings, 1 reply; 157+ messages in thread
From: Christoph Hellwig @ 2003-02-22 23:23 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, Mark Hahn,
David S. Miller, linux-kernel
On Sat, Feb 22, 2003 at 03:15:52PM -0800, Larry McVoy wrote:
> Show me one OS which scales to 32 CPUs on an I/O load and run lmbench
> on a single CPU. Then take that same CPU and stuff it into a uniprocessor
motherboard and run the same benchmarks under Linux. The Linux one
> will blow away the multi threaded one. Come on, prove me wrong, show
> me the data.
I could ask the SGI Eagan folks to do that with an Altix and an IA64
Whitebox - oh wait, both OSes would be Linux..
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-22 23:23 ` Christoph Hellwig
@ 2003-02-22 23:54 ` Mark Hahn
0 siblings, 0 replies; 157+ messages in thread
From: Mark Hahn @ 2003-02-22 23:54 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-kernel
> I could ask the SGI Eagan folks to do that with an Altix and an IA64
> Whitebox - oh wait, both OSes would be Linux..
The only public info I've seen is "round-trip in as little as 40ns",
which is too vague to be useful, and sounds WAY optimistic - perhaps
that's just between two CPUs in a single brick. Remember that
LMBench shows memory latencies of O(100ns) for even fast uniprocessors.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-22 23:15 ` Larry McVoy
2003-02-22 23:23 ` Christoph Hellwig
@ 2003-02-22 23:44 ` Martin J. Bligh
2003-02-24 4:56 ` Larry McVoy
2003-02-22 23:57 ` Jeff Garzik
2003-02-23 23:57 ` Bill Davidsen
3 siblings, 1 reply; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-22 23:44 UTC (permalink / raw)
To: Larry McVoy; +Cc: Mark Hahn, David S. Miller, linux-kernel
>> OK, so now you've slid from talking about PCs to 2-way to 4-way ...
>> perhaps because your original argument was fatally flawed.
>
> Nice attempt at deflection but it won't work.
On your part or mine? Seemingly yours.
> Your position is that
> there is no money in PC's, only in big iron. Last I checked, "big iron"
> doesn't include $25K 4 way machines, now does it?
I would call 4x a "big machine" which is what I originally said.
> You claimed that
> Dell was making the majority of their profits from servers.
I think that's probably true (nobody can be certain, as we don't have the
numbers).
> To refresh
> your memory: "I bet they still make more money on servers than desktops
> and notebooks combined". Are you still claiming that?
Yup.
> If so, please
> provide some data to back it up because, as Mark and others have pointed
> out, the bulk of their servers are headless desktop machines in tower
> or rackmount cases.
So what? They're still servers. I can no more provide data to back it up
than you can to contradict it, because they don't release those figures.
Note my sentence began "I bet", not "I have cast iron evidence".
> Let's get back to your position. You want to shovel stuff in the kernel
> for the benefit of the 32 way / 64 way etc boxes.
Actually, I'm focussed on 16-way at the moment, and have never run on,
or published numbers for anything higher. If you need to exaggerate
to make your point, then go ahead, but it's pretty transparent.
> I don't see that as wise. You could prove me wrong.
> Here's how you do it: go get oprofile
> or whatever that tool is which lets you run apps and count cache misses.
> Start including before/after runs of each microbench in lmbench and
> some time sharing loads with and without your changes. When you can do
> that and you don't add any more bus traffic, you're a genius and
> I'll shut up.
I don't feel the need to do that to prove my point, but if you feel the
need to do it to prove yours, go ahead.
> But that's a false promise because by definition, fine grained threading
> adds more bus traffic. It's kind of hard to not have that happen, the
> caches have to stay coherent somehow.
Adding more bus traffic is fine if you increase throughput. Focussing
on just one tiny aspect of performance is ludicrous. Look at the big
picture. Run some non-micro benchmarks. Analyse the results. Compare
2.4 vs 2.5 (or any set of patches I've put into the kernel of your choice)
On UP, 2P or whatever you care about.
You seem to think the maintainers are morons that we can just slide crap
straight by ... give them a little more credit than that.
> Tell it to Google. That's probably one of the largest applications in
> the world; I was the 4th engineer there, and I didn't think that the
> cluster added complexity at all. On the contrary, it made things go
> one hell of a lot faster.
As I've explained to you many times before, it depends on the system.
Some things split easily, some don't.
>> You don't believe we can make it scale without screwing up the low end,
>> I do believe we can do that.
>
> I'd like a little more than "I think I can, I think I can, I think I can".
> The people who are saying "no you can't, no you can't, no you can't" have
> seen this sort of work done before and there is no data which shows that
> it is possible and all sorts of data which shows that it is not.
The only data that's relevant is what we've done to Linux. If you want
to run the numbers, and show some useful metric on a semi-realistic
benchmark, I'd love to see them.
> Show me one OS which scales to 32 CPUs on an I/O load and run lmbench
> on a single CPU. Then take that same CPU and stuff it into a uniprocessor
> motherboard and run the same benchmarks under Linux. The Linux one
> will blow away the multi threaded one.
Nobody has ever really focussed before on an OS that scales across the
board from UP to big iron ... a closed development system is bad at
resolving that sort of thing. The really interesting comparison is UP
or 2x SMP on Linux with and without the scalability changes that have
made it into the tree.
> Come on, prove me wrong, show me the data.
I don't have to *prove* you wrong. I'm happy in my own personal knowledge
that you're wrong, and things seem to be going along just fine, thanks.
If you want to change the attitude of the maintainers, I suggest you
generate the data yourself.
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-22 23:44 ` Martin J. Bligh
@ 2003-02-24 4:56 ` Larry McVoy
2003-02-24 5:06 ` William Lee Irwin III
2003-02-24 5:16 ` Martin J. Bligh
0 siblings, 2 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-24 4:56 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, Mark Hahn, David S. Miller, linux-kernel
> > Your position is that
> > there is no money in PC's, only in big iron. Last I checked, "big iron"
> > doesn't include $25K 4 way machines, now does it?
>
> I would call 4x a "big machine" which is what I originally said.
Nonsense. You were talking about 16/32/64 way boxes, go read your own mail.
In fact, you said so in this message.
Furthermore, I can prove that isn't what you are talking about. Show me
the performance gains you are getting on 4way systems from your changes.
Last I checked, things scaled pretty nicely on 4 ways.
> > You claimed that
> > Dell was making the majority of their profits from servers.
>
> I think that's probably true (nobody can be certain, as we don't have the
> numbers).
Yes, we do. You just don't like what the numbers are saying. You can
work backward from the size of the server market and the percentages
claimed by Sun, HP, IBM, etc. If you do that, you'll see that even
if Dell was making 100% margins on every server they sold, that still
wouldn't be 51% of their profits.
It's not "probably true"; it's not physically possible that it is true,
and if you don't know that, you are simply waving your hands and not
doing any math.
> > To refresh
> > your memory: "I bet they still make more money on servers than desktops
> > and notebooks combined". Are you still claiming that?
>
> Yup.
Well, you are flat out 100% wrong.
> > If so, please
> > provide some data to back it up because, as Mark and others have pointed
> > out, the bulk of their servers are headless desktop machines in tower
> > or rackmount cases.
>
> So what? They're still servers. I can no more provide data to back it up
> than you can to contradict it, because they don't release those figures.
Read the mail I've posted on topic, the data is there. Or better yet,
don't trust me, go work it out for yourself, it isn't hard.
> > I don't see that as wise. You could prove me wrong.
> > Here's how you do it: go get oprofile
> > or whatever that tool is which lets you run apps and count cache misses.
> > Start including before/after runs of each microbench in lmbench and
> > some time sharing loads with and without your changes. When you can do
> > that and you don't add any more bus traffic, you're a genius and
> > I'll shut up.
>
> I don't feel the need to do that to prove my point, but if you feel the
> need to do it to prove yours, go ahead.
Ahh, now we're getting somewhere. As soon as we get anywhere near real
numbers, you don't want anything to do with it. Why is that?
> You seem to think the maintainers are morons that we can just slide crap
> straight by ... give them a little more credit than that.
It happens all the time.
> > Come on, prove me wrong, show me the data.
>
> I don't have to *prove* you wrong. I'm happy in my own personal knowledge
> that you're wrong, and things seem to be going along just fine, thanks.
Wow. Compelling. "It is so because I say it is so". Jeez, forgive me
if I'm not falling all over myself to have that sort of engineering being
the basis for scaling work.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 4:56 ` Larry McVoy
@ 2003-02-24 5:06 ` William Lee Irwin III
2003-02-24 6:00 ` Mark Hahn
2003-02-24 15:06 ` Alan Cox
2003-02-24 5:16 ` Martin J. Bligh
1 sibling, 2 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-24 5:06 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, Mark Hahn,
David S. Miller, linux-kernel
On Sun, Feb 23, 2003 at 08:56:16PM -0800, Larry McVoy wrote:
> Furthermore, I can prove that isn't what you are talking about. Show me
> the performance gains you are getting on 4way systems from your changes.
> Last I checked, things scaled pretty nicely on 4 ways.
Try 4 or 8 mkfs's in parallel on a 4x box running virgin 2.4.x.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 5:06 ` William Lee Irwin III
@ 2003-02-24 6:00 ` Mark Hahn
2003-02-24 6:02 ` William Lee Irwin III
2003-02-24 15:06 ` Alan Cox
1 sibling, 1 reply; 157+ messages in thread
From: Mark Hahn @ 2003-02-24 6:00 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Larry McVoy, linux-kernel
> > Last I checked, things scaled pretty nicely on 4 ways.
>
> Try 4 or 8 mkfs's in parallel on a 4x box running virgin 2.4.x.
"Doctor, it hurts..."
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 6:00 ` Mark Hahn
@ 2003-02-24 6:02 ` William Lee Irwin III
0 siblings, 0 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-24 6:02 UTC (permalink / raw)
To: Mark Hahn; +Cc: Larry McVoy, linux-kernel
At some point in the past, Larry McVoy wrote:
>>> Last I checked, things scaled pretty nicely on 4 ways.
At some point in the past, I wrote:
>> Try 4 or 8 mkfs's in parallel on a 4x box running virgin 2.4.x.
On Mon, Feb 24, 2003 at 01:00:22AM -0500, Mark Hahn wrote:
> "Doctor, it hurts..."
Doing disk io is supposed to hurt? I'll file this in the "sick and
wrong" category along with RBJ and Hohensee.
In the meantime, compare to 2.5.x.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 5:06 ` William Lee Irwin III
2003-02-24 6:00 ` Mark Hahn
@ 2003-02-24 15:06 ` Alan Cox
2003-02-24 23:18 ` William Lee Irwin III
1 sibling, 1 reply; 157+ messages in thread
From: Alan Cox @ 2003-02-24 15:06 UTC (permalink / raw)
To: William Lee Irwin III
Cc: Larry McVoy, Martin J. Bligh, Larry McVoy, Mark Hahn,
David S. Miller, Linux Kernel Mailing List
On Mon, 2003-02-24 at 05:06, William Lee Irwin III wrote:
> On Sun, Feb 23, 2003 at 08:56:16PM -0800, Larry McVoy wrote:
> > Furthermore, I can prove that isn't what you are talking about. Show me
> > the performance gains you are getting on 4way systems from your changes.
> > Last I checked, things scaled pretty nicely on 4 ways.
>
> Try 4 or 8 mkfs's in parallel on a 4x box running virgin 2.4.x.
You have strange ideas of typical workloads. The parallel mkfs one is a good
one though, because it's also a lot better on one CPU in 2.5
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 15:06 ` Alan Cox
@ 2003-02-24 23:18 ` William Lee Irwin III
0 siblings, 0 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-24 23:18 UTC (permalink / raw)
To: Alan Cox
Cc: Larry McVoy, Martin J. Bligh, Larry McVoy, Mark Hahn,
David S. Miller, Linux Kernel Mailing List
On Mon, 2003-02-24 at 05:06, William Lee Irwin III wrote:
>> Try 4 or 8 mkfs's in parallel on a 4x box running virgin 2.4.x.
On Mon, Feb 24, 2003 at 03:06:53PM +0000, Alan Cox wrote:
> You have strange ideas of typical workloads. The parallel mkfs one is a good
> one though, because it's also a lot better on one CPU in 2.5
The results I saw were that this did not affect 2.5 in any interesting
way and 2.4 behaved "very badly".
It's a simple way to get lots of disk io going without a complex
benchmark. There are good reasons and real workloads why things were
done to fix this.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 4:56 ` Larry McVoy
2003-02-24 5:06 ` William Lee Irwin III
@ 2003-02-24 5:16 ` Martin J. Bligh
2003-02-24 6:58 ` Larry McVoy
1 sibling, 1 reply; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-24 5:16 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel
> Nonsense. You were talking about 16/32/64 way boxes, go read your own
> mail. In fact, you said so in this message.
Where? I never mentioned 32 / 64 way boxes, for starters ...
> Furthermore, I can prove that isn't what you are talking about. Show me
> the performance gains you are getting on 4way systems from your changes.
> Last I checked, things scaled pretty nicely on 4 ways.
Depends what you mean by "your changes". If you do a before and after
comparison on a 4x machine on the scalability changes IBM LTC has made, I
think you'd find a dramatic difference. Of course, it depends to some
extent on what tests you run. Maybe running bitkeeper (or whatever you're
testing) just eats cpu, and doesn't do much interprocess communication or
disk IO (compared to the CPU load), in which case it'll scale pretty well on
anything as long as it's multithreaded enough. If you're just worried about
one particular app, yes of course you could tweak the system to go faster
for it ... but that's not what a general purpose OS is about.
> Yes, we do. You just don't like what the numbers are saying. You can
> work backward from the size of the server market and the percentages
> claimed by Sun, HP, IBM, etc. If you do that, you'll see that even
> if Dell was making 100% margins on every server they sold, that still
> wouldn't be 51% of their profits.
Ummm ... now go back to what we were actually talking about. Linux margins.
You think a significant percentage of the desktops they sell run Linux?
>> > To refresh
>> > your memory: "I bet they still make more money on servers than desktops
>> > and notebooks combined". Are you still claiming that?
>>
>> Yup.
>
> Well, you are flat out 100% wrong.
In the context we were talking about (Linux), I seriously doubt it.
Apologies if I didn't feel the need to continuously restate the context in
every email to stop you from trying to twist the argument.
> Ahh, now we're getting somewhere. As soon as we get anywhere near real
> numbers, you don't want anything to do with it. Why is that?
Because I don't see why I should waste my time running benchmarks just to
prove you wrong. I don't respect you that much, and it seems the
maintainers don't either. When you become somebody with the stature in the
Linux community of, say, Linus or Andrew, I'd be prepared to spend a lot
more time running benchmarks on any concerns you might have.
>> I don't have to *prove* you wrong. I'm happy in my own personal knowledge
>> that you're wrong, and things seem to be going along just fine, thanks.
>
> Wow. Compelling. "It is so because I say it is so". Jeez, forgive me
> if I'm not falling all over myself to have that sort of engineering being
> the basis for scaling work.
Ummm ... and your argument is different because of what? You've run some
tiny little microfocused benchmark, seen a couple of bus cycles, and
projected the results out? Not very impressive, really, is it? Go run a
real benchmark and prove it makes a difference if you want to sway people's
opinions. Until then, I suspect the current status quo will continue in
terms of us getting patches accepted.
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 5:16 ` Martin J. Bligh
@ 2003-02-24 6:58 ` Larry McVoy
2003-02-24 7:39 ` Martin J. Bligh
` (3 more replies)
0 siblings, 4 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-24 6:58 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
On Sun, Feb 23, 2003 at 09:16:38PM -0800, Martin J. Bligh wrote:
> Ummm ... now go back to what we were actually talking about. Linux margins.
> You think a significant percentage of the desktops they sell run Linux?
The real discussion was the justification for scaling work beyond the
small SMPs. You tried to make the point that there is no money in PC's so
any work to scale Linux up would help hardware companies stay financially
healthy. I and others pointed out that there is indeed a pile of money
in PC's; that's the vast majority of the hardware Dell sells. They don't
sell anything bigger than an 8 way and they only have one of those.
We went on to do the digging to figure out that it's impossible that
Dell makes a substantial portion of their profits from the big servers.
The point being that there is a company generating $32B/year in sales and
almost all of that is in uniprocessors. Directly countering your statement
that there is no margin in PC's. They are making $2B/year in profits, QED.
Which brings us back to the point. If the world is not heading towards
an 8 way on every desk then it is really questionable to make a lot of
changes to the kernel to make it work really well on 8-ways. Yeah, I'm
sure it makes you feel good, but it's more of an intellectual exercise than
anything which really benefits the vast majority of the kernel user base.
> > Ahh, now we're getting somewhere. As soon as we get anywhere near real
> > numbers, you don't want anything to do with it. Why is that?
>
> Because I don't see why I should waste my time running benchmarks just to
> prove you wrong. I don't respect you that much, and it seems the
> maintainers don't either. When you become somebody with the stature in the
> Linux community of, say, Linus or Andrew I'd be prepared to spend a lot
> more time running benchmarks on any concerns you might have.
Who cares if you respect me, what does that have to do with proper
engineering? Do you think that I'm the only person who wants to see
numbers? You think Linus doesn't care about this? Maybe you missed
the whole IA32 vs IA64 instruction cache thread. It sure sounded like
he cares. How about Alan? He stepped up and pointed out that less
is more. How about Mark? He knows a thing or two about the topic?
In fact, I think you'd be hard pressed to find anyone who wouldn't be
interested in seeing the cache effects of a patch.
People care about performance, both scaling up and scaling down. A lot of
performance changes are measured poorly, in a way that makes the changes
look good but doesn't expose the hidden costs of the change. What I'm
saying is that those sorts of measurements screwed over performance in
the past, why are you trying to repeat old mistakes?
> > Wow. Compelling. "It is so because I say it is so". Jeez, forgive me
> > if I'm not falling all over myself to have that sort of engineering being
> > the basis for scaling work.
>
> Ummm ... and your argument is different because of what? You've run some
> tiny little microfocused benchmark, seen a couple of bus cycles, and
> projected the results out?
My argument is different because every effort which has gone in the
direction you are going has ended up with a kernel that worked well on
big boxes and sucked rocks on little boxes. And all of them started
with kernels which performed quite nicely on uniprocessors.
If I was waving my hands and saying "I'm an old fart and I think this
won't work" and that was it, you'd have every right to tell me to piss
off. I'd tell me to piss off. But that's not what is going on here.
What's going on is that a pile of smart people have tried over and over
to do what you claim you will do and they all failed. They all ended up
with kernels that gave up lots of uniprocessor performance and justified
it by throwing more processors at that problem. You haven't said a
single thing to refute that and when challenged to measure the parts
which lead to those results you respond with "nah, nah, I don't respect
you so I don't have to measure it". Come on, *you* should want to know
if what I'm saying is true. You're an engineer, not a marketing drone,
of course you should want to know, why wouldn't you?
Linux is a really fast system right now. The code paths are short and
it is possible to use the OS almost as if it were a library, the cost is
so little that you really can mmap stuff in as you need, something that
people have wanted since Multics. There will always be many more uses
of Linux in small systems than large, simply because there will always
be more small systems. Keeping Linux working well on small systems is
going to have a dramatically larger positive benefit for the world than
scaling it to 64 processors. So who do you want to help? An elite
few or everyone?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 6:58 ` Larry McVoy
@ 2003-02-24 7:39 ` Martin J. Bligh
2003-02-24 16:17 ` Larry McVoy
2003-02-24 7:51 ` William Lee Irwin III
` (2 subsequent siblings)
3 siblings, 1 reply; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-24 7:39 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel
>> Ummm ... now go back to what we were actually talking about. Linux
>> margins. You think a significant percentage of the desktops they sell
>> run Linux?
>
> The real discussion was the justification for scaling work beyond the
> small SMPs. You tried to make the point that there is no money in PC's so
> any work to scale Linux up would help hardware companies stay financially
> healthy.
More or less, yes.
> The point being that there is a company generating $32B/year in sales and
> almost all of that is in uniprocessors. Directly countering your
> statement that there is no margin in PC's. They are making $2B/year in
> profits, QED.
Which is totally irrelevant. It's the *LINUX* market that matters. What
part of that do you find so hard to understand?
> Which brings us back to the point. If the world is not heading towards
> an 8 way on every desk then it is really questionable to make a lot of
> changes to the kernel to make it work really well on 8-ways. Yeah, I'm
> sure it makes you feel good, but it's more of an intellectual exercise than
> anything which really benefits the vast majority of the kernel user base.
It makes IBM money, ergo they pay me. I enjoy doing it, ergo I work for
them. Most of the work benefits smaller systems as well, ergo we get our
patches accepted. So everyone's happy, apart from you, who keeps whining.
>> Because I don't see why I should waste my time running benchmarks just to
>> prove you wrong. I don't respect you that much, and it seems the
>> maintainers don't either. When you become somebody with the stature in
>> the Linux community of, say, Linus or Andrew I'd be prepared to spend a
>> lot more time running benchmarks on any concerns you might have.
>
> Who cares if you respect me, what does that have to do with proper
> engineering? Do you think that I'm the only person who wants to see
> numbers? You think Linus doesn't care about this? Maybe you missed
> the whole IA32 vs IA64 instruction cache thread. It sure sounded like
> he cares. How about Alan? He stepped up and pointed out that less
> is more. How about Mark? He knows a thing or two about the topic?
> In fact, I think you'd be hard pressed to find anyone who wouldn't be
> interested in seeing the cache effects of a patch.
So now we've slid from talking about bus traffic from fine-grained locking,
which is mostly just you whining in ignorance of the big picture, to cache
effects, which are obviously important. Nice try at twisting the
conversation. Again.
> People care about performance, both scaling up and scaling down. A lot of
> performance changes are measured poorly, in a way that makes the changes
> look good but doesn't expose the hidden costs of the change. What I'm
> saying is that those sorts of measurements screwed over performance in
> the past, why are you trying to repeat old mistakes?
One way to measure those changes poorly would be to do what you were
advocating earlier - look at one tiny metric of a microbenchmark, rather
than the actual throughput of the machine. So pardon me if I take your
concerns, and file them in the appropriate place.
> My argument is different because every effort which has gone in the
> direction you are going has ended up with a kernel that worked well on
> big boxes and sucked rocks on little boxes. And all of them started
> with kernels which performed quite nicely on uniprocessors.
So you're trying to say that fine-grained locking ruins uniprocessor
performance now? Or did you have some other change in mind?
> If I was waving my hands and saying "I'm an old fart and I think this
> won't work" and that was it, you'd have every right to tell me to piss
> off. I'd tell me to piss off. But that's not what is going on here.
> What's going on is that a pile of smart people have tried over and over
> to do what you claim you will do and they all failed. They all ended up
> with kernels that gave up lots of uniprocessor performance and justified
> it by throwing more processors at that problem. You haven't said a
> single thing to refute that and when challenged to measure the parts
> which lead to those results you respond with "nah, nah, I don't respect
> you so I don't have to measure it". Come on, *you* should want to know
> if what I'm saying is true. You're an engineer, not a marketing drone,
> of course you should want to know, why wouldn't you?
You just don't get it, do you? Your head is so vastly inflated that you
think everyone should run around researching whatever *you* happen to think
is interesting. Do your own benchmarking if you think it's a problem.
You're the one whining about this.
> Linux is a really fast system right now. The code paths are short and
> it is possible to use the OS almost as if it were a library, the cost is
> so little that you really can mmap stuff in as you need, something that
> people have wanted since Multics. There will always be many more uses
> of Linux in small systems than large, simply because there will always
> be more small systems. Keeping Linux working well on small systems is
> going to have a dramatically larger positive benefit for the world than
> scaling it to 64 processors. So who do you want to help? An elite
> few or everyone?
Everyone. And we can do that, and make large systems work at the same time.
Despite the fact you don't believe me. And despite the fact that you can't
grasp the difference between the number 16 and the number 64.
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 7:39 ` Martin J. Bligh
@ 2003-02-24 16:17 ` Larry McVoy
2003-02-24 16:49 ` Martin J. Bligh
2003-02-24 18:22 ` Minutes from Feb 21 LSE Call John W. M. Stevens
0 siblings, 2 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-24 16:17 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
On Sun, Feb 23, 2003 at 11:39:34PM -0800, Martin J. Bligh wrote:
> > The point being that there is a company generating $32B/year in sales and
> > almost all of that is in uniprocessors. Directly countering your
> > statement that there is no margin in PC's. They are making $2B/year in
> > profits, QED.
>
> Which is totally irrelevant. It's the *LINUX* market that matters. What
> part of that do you find so hard to understand?
OK, so you can't handle the reality that the server market overall doesn't
make your point so you retreat to the Linux market. OK, fine. All the
data anyone has ever seen has Linux running on *smaller* servers, not
larger. Show me all the cases where people replaced 4 CPU NT boxes with
8 CPU Linux boxes.
The point being that if in the overall market place, big iron isn't
dominating, you have one hell of a tough time making the case that the
Linux market place is somehow profoundly different and needs larger
boxes to do the same job.
In fact, the opposite is true. Linux squeezes substantially more
performance out of the same hardware than the commercial OS offerings,
NT or Unix. So where is the market force which says "oh, switching to
Linux? Better get more CPUs".
> It makes IBM money, ergo they pay me. I enjoy doing it, ergo I work for
> them. Most of the work benefits smaller systems as well, ergo we get our
> patches accepted. So everyone's happy, apart from you, who keeps whining.
Indeed I do, I'm good at it. You're about to find out how good. It's
quite effective to simply focus attention on a problem area. Here's
my promise to you: there will be a ton of attention focussed on the
scaling patches until you and anyone else doing them start showing
up with cache miss counters as part of the submission process.
> So now we've slid from talking about bus traffic from fine-grained locking,
> which is mostly just you whining in ignorance of the big picture, to cache
> effects, which are obviously important. Nice try at twisting the
> conversation. Again.
You need to take a deep breath and try and understand that the focus of
the conversation is Linux, not your ego or mine. Getting mad at me just
wastes energy; stay focussed on the real issue, Linux.
> > People care about performance, both scaling up and scaling down. A lot of
> > performance changes are measured poorly, in a way that makes the changes
> > look good but doesn't expose the hidden costs of the change. What I'm
> > saying is that those sorts of measurements screwed over performance in
> > the past, why are you trying to repeat old mistakes?
>
> One way to measure those changes poorly would be to do what you were
> advocating earlier - look at one tiny metric of a microbenchmark, rather
> than the actual throughput of the machine. So pardon me if I take your
> concerns, and file them in the appropriate place.
You apparently missed the point where I have said (a bunch of times):
run the benchmarks you want and report cache miss counters from before
and after the patch for the same runs. Microbenchmarks would be
a really bad way to do that; you really want to run a real application
because you need it fighting for the cache.
> > My argument is different because every effort which has gone in the
> > direction you are going has ended up with a kernel that worked well on
> > big boxes and sucked rocks on little boxes. And all of them started
> > with kernels which performed quite nicely on uniprocessors.
>
> So you're trying to say that fine-grained locking ruins uniprocessor
> performance now?
I've been saying that for almost 10 years, check the archives.
> You just don't get it, do you? Your head is so vastly inflated that you
> think everyone should run around researching whatever *you* happen to think
> is interesting. Do your own benchmarking if you think it's a problem.
That's exactly what I'll do if you don't learn how to do it yourself. I'm
astounded that any competent engineer wouldn't want to know the effects of
their changes, I think you actually do but are just too pissed right now
to see it.
> > Linux is a really fast system right now. [etc]
>
> Everyone. And we can do that, and make large systems work at the same time.
> Despite the fact you don't believe me. And despite the fact that you can't
> grasp the difference between the number 16 and the number 64.
See other postings on this one. All engineers in your position have said
"we're just trying to get to N cpus where N = ~2x where we are today and
it won't hurt uniprocessor performance". They *all* say that. And they
all end up with a slow uniprocessor OS. Unlike security and a number of
other invasive features, the SMP stuff can't be configed out or you end
up with an #ifdef-ed mess like IRIX.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 16:17 ` Larry McVoy
@ 2003-02-24 16:49 ` Martin J. Bligh
2003-02-25 0:41 ` Server shipments [was Re: Minutes from Feb 21 LSE Call] Larry McVoy
2003-02-24 18:22 ` Minutes from Feb 21 LSE Call John W. M. Stevens
1 sibling, 1 reply; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-24 16:49 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel
>> > The point being that there is a company generating $32B/year in sales
>> > and almost all of that is in uniprocessors. Directly countering your
>> > statement that there is no margin in PC's. They are making $2B/year in
>> > profits, QED.
>>
>> Which is totally irrelevant. It's the *LINUX* market that matters. What
>> part of that do you find so hard to understand?
>
> OK, so you can't handle the reality that the server market overall doesn't
> make your point so you retreat to the Linux market. OK, fine. All the
Errm, no. That was the conversation all along - you just took some remarks
out of context.
> The point being that if in the overall market place, big iron isn't
> dominating, you have one hell of a tough time making the case that the
> Linux market place is somehow profoundly different and needs larger
> boxes to do the same job.
Dominating in terms of volume? No. My position is that hardware companies
selling Linux make more money on servers than on desktops. We're working
on scalability ... that means CPUs, memory, disk IO, networking,
everything. That improves the efficiency of servers ... "large
machines" (which your original message defined as, and I quote, "4 or more CPU
SMP machines"), 2x boxes, and even larger 1x machines. If you're being more
specific about things like NUMA changes, please point to examples of
patches you think degrade performance on UP / 2x or whatever.
> Indeed I do, I'm good at it. You're about to find out how good. It's
> quite effective to simply focus attention on a problem area. Here's
> my promise to you: there will be a ton of attention focussed on the
> scaling patches until you and anyone else doing them starts showing
> up with cache miss counters as part of the submission process.
Here's my promise to you: people listen to you far less than you think, and
our patches will continue to go into the kernel.
>> So now we've slid from talking about bus traffic from fine-grained
>> locking, which is mostly just you whining in ignorance of the big
>> picture, to cache effects, which are obviously important. Nice try at
>> twisting the conversation. Again.
>
> You need to take a deep breath and try and understand that the focus of
> the conversation is Linux, not your ego or mine. Getting mad at me just
> wastes energy, stay focussed on the real issue, Linux.
So exactly what do you think is the problem? It seems to keep shifting
mysteriously. Name some patches that got accepted into mainline ... if
they're broken, that'll give us some clues about what's bad for the future,
and we can fix them.
>> One way to measure those changes poorly would be to do what you were
>> advocating earlier - look at one tiny metric of a microbenchmark, rather
>> than the actual throughput of the machine. So pardon me if I take your
>> concerns, and file them in the appropriate place.
>
> You apparently missed the point where I have said (a bunch of times):
> run the benchmarks you want and report cache miss counters from before
> and after the patch for the same runs. Microbenchmarks would be
> a really bad way to do that; you really want to run a real application
> because you need it fighting for the cache.
One statistic (e.g. cache miss counters) isn't the big picture. If throughput
goes up or remains the same on all machines, that's what's important.
>> So you're trying to say that fine-grained locking ruins uniprocessor
>> performance now?
>
> I've been saying that for almost 10 years, check the archives.
And you haven't worked out that locks compile away to nothing on UP yet? I
think you might be better off pulling your head out of where it's currently
residing, and pointing it at the source code.
>> You just don't get it, do you? Your head is so vastly inflated that you
>> think everyone should run around researching whatever *you* happen to
>> think is interesting. Do your own benchmarking if you think it's a
>> problem.
>
> That's exactly what I'll do if you don't learn how to do it yourself. I'm
> astounded that any competent engineer wouldn't want to know the effects of
> their changes, I think you actually do but are just too pissed right now
> to see it.
Cool, I'd love to see some benchmarks ... and real throughput numbers from
them, not just microstatistics.
> See other postings on this one. All engineers in your position have said
> "we're just trying to get to N cpus where N = ~2x where we are today and
> it won't hurt uniprocessor performance". They *all* say that. And they
> all end up with a slow uniprocessor OS. Unlike security and a number of
> other invasive features, the SMP stuff can't be configed out or you end
> up with an #ifdef-ed mess like IRIX.
Try looking up "abstraction" in a dictionary. Linus doesn't take #ifdef's
in the main code.
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-24 16:49 ` Martin J. Bligh
@ 2003-02-25 0:41 ` Larry McVoy
2003-02-25 0:41 ` Martin J. Bligh
0 siblings, 1 reply; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 0:41 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
More data from news.com.
Dell has 19% of the server market with $531M/quarter in sales[1] over
212,750 machines per quarter[2].
That means that the average sale price for a server from Dell was $2495.
The average sale price of all servers from all companies is $9347.
I still don't see the big profits touted by the scaling fanatics, anyone
care to explain it?
[1] http://news.com.com/2100-1001-983892.html
[2] http://news.com.com/2100-1001-982004.html
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 0:41 ` Server shipments [was Re: Minutes from Feb 21 LSE Call] Larry McVoy
@ 2003-02-25 0:41 ` Martin J. Bligh
2003-02-25 0:54 ` Larry McVoy
2003-02-25 1:09 ` David Lang
0 siblings, 2 replies; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 0:41 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel
> More data from news.com.
>
> Dell has 19% of the server market with $531M/quarter in sales[1] over
> 212,750 machines per quarter[2].
>
> That means that the average sale price for a server from Dell was $2495.
>
> The average sale price of all servers from all companies is $9347.
>
> I still don't see the big profits touted by the scaling fanatics, anyone
> care to explain it?
Sigh. If you're so convinced that there's no money in larger systems,
why don't you write to Sam Palmisano and explain to him the error of
his ways? I'm sure IBM has absolutely no market data to go on ...
If only he could receive an explanation of the error of his ways from
Larry McVoy, I'm sure he'd turn the ship around, for you obviously have
all the facts, figures, and experience of the server market to make this
kind of decision. I await the email from our CEO that tells us how
much he respects you, and has taken this decision at your bidding.
M.
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 0:41 ` Martin J. Bligh
@ 2003-02-25 0:54 ` Larry McVoy
2003-02-25 2:00 ` Tupshin Harper
2003-02-25 3:00 ` Martin J. Bligh
2003-02-25 1:09 ` David Lang
1 sibling, 2 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 0:54 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
On Mon, Feb 24, 2003 at 04:41:04PM -0800, Martin J. Bligh wrote:
> > More data from news.com.
> >
> > Dell has 19% of the server market with $531M/quarter in sales[1] over
> > 212,750 machines per quarter[2].
> >
> > That means that the average sale price for a server from Dell was $2495.
> >
> > The average sale price of all servers from all companies is $9347.
> >
> > I still don't see the big profits touted by the scaling fanatics, anyone
> > care to explain it?
>
> Sigh. If you're so convinced that there's no money in larger systems,
> why don't you write to Sam Palmisano and explain to him the error of
> his ways? I'm sure IBM has absolutely no market data to go on ...
Numbers talk, bullshit walks. Got shoes?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 0:54 ` Larry McVoy
@ 2003-02-25 2:00 ` Tupshin Harper
2003-02-25 3:54 ` Martin J. Bligh
2003-02-25 3:00 ` Martin J. Bligh
1 sibling, 1 reply; 157+ messages in thread
From: Tupshin Harper @ 2003-02-25 2:00 UTC (permalink / raw)
To: linux-kernel
This conversation has not only gotten out of hand, it's gotten quite
silly. People are arguing semantics and relative economic value where a
few simple assertions should do:
1) There is a significant interest from developers and users in having
Linux run efficiently on *small* platforms.
2) There is a significant interest from developers and users in having
Linux run efficiently on *large* platforms.
3) There is disagreement on whether it is possible to accomplish 1 and 2
simultaneously.
4) There is disagreement on whether adequate testing is taking place to
make sure 2 doesn't degrade 1 (or vice versa).
This leads to two choices:
a) Fork. Obviously to be avoided at all reasonable costs.
b) Identify reasonable improvements to the testing methodology so that
any design conflicts are identified immediately instead of gradually
accumulating and degrading performance over time.
I vote b (surprise, surprise); however, this just changes the debate to
"what is reasonable testing methodology?" That, however, is a debate much
more worth having than "who ships more of what" and "who said what when".
Given that a fairly thorough performance testing suite is already in
place, it would seem to be up to the advocates for the "threatened"
computing environment (large or small) to convince the "testers that be"
that certain tests should be added. It is inherently unreasonable to
expect the developer of a feature/change to be unbiased and neutral with
respect to that feature, therefore it is unreasonable to expect them to
prove beyond a reasonable doubt that their feature has no negative
impact. The best that they can do is convince themselves that the
feature passes the really deep sniff test. The rest is up to the
community. The ability of a third party to critique code changes is a
large part of why the bazaar nature of linux development is so valuable.
-Tupshin
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 2:00 ` Tupshin Harper
@ 2003-02-25 3:54 ` Martin J. Bligh
0 siblings, 0 replies; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 3:54 UTC (permalink / raw)
To: Tupshin Harper, linux-kernel
> Given that a fairly thorough performance testing suite is already in
> place, it would seem to be up to the advocates for the "threatened"
> computing environment (large or small) to convince the "testers that be"
> that certain tests should be added. It is inherently unreasonable to
> expect the developer of a feature/change to be unbiased and neutral with
> respect to that feature, therefore it is unreasonable to expect them to
> prove beyond a reasonable doubt that their feature has no negative
> impact. The best that they can do is convince themselves that the
> feature passes the really deep sniff test. The rest is up to the
> community. The ability of a third party to critique code changes is a
> large part of why the bazaar nature of linux development is so valuable.
An excellent and well thought out summary, and exactly why I welcome
Larry's proposal to do some testing and produce specific numbers on
specific patches instead of hand-waving and spreading FUD. This kind of
arrangement is exactly why the open development model will allow Linux to
win out in the long term.
M.
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 0:54 ` Larry McVoy
2003-02-25 2:00 ` Tupshin Harper
@ 2003-02-25 3:00 ` Martin J. Bligh
2003-02-25 3:13 ` Larry McVoy
2003-02-25 17:37 ` Andrea Arcangeli
1 sibling, 2 replies; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 3:00 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel
>> > More data from news.com.
>> >
>> > Dell has 19% of the server market with $531M/quarter in sales[1] over
>> > 212,750 machines per quarter[2].
>> >
>> > That means that the average sale price for a server from Dell was
>> > $2495.
>> >
>> > The average sale price of all servers from all companies is $9347.
>> >
>> > I still don't see the big profits touted by the scaling fanatics,
>> > anyone care to explain it?
>>
>> Sigh. If you're so convinced that there's no money in larger systems,
>> why don't you write to Sam Palmisano and explain to him the error of
>> his ways? I'm sure IBM has absolutely no market data to go on ...
>
> Numbers talk, bullshit walks. Got shoes?
Bullshit numbers walk too. Remember the context? Linux.
Linux servers vs. Linux desktops. If you think the Linux desktop market is
large, I'd like some of whatever you're smoking, as it's obviously good
stuff.
I think there's money in big iron, you don't seem to. That's fine, you're
not paying my salary (thank $deity).
Perhaps a person with the slightest understanding of basic arithmetic would
see that this:
>> > That means that the average sale price for a server from Dell was
>> > $2495.
>> >
>> > The average sale price of all servers from all companies is $9347.
means that somebody other than Dell is making the money on the big servers.
As Dell is a PC company, no real surprise there.
By the way ... you remember when I said that Linux could scale upwards
without hurting the low end? And that the reason we'd succeed in that where
Solaris et al failed was because the development model was different?
When you said you'd go run some UP benchmarks, that's *exactly* where the
development model is different. It's open enough that you can go do that
sort of thing, and if errors are made, you can point them out. I honestly
welcome the benchmark results you provide ... it's the strength of the
system.
M.
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 3:00 ` Martin J. Bligh
@ 2003-02-25 3:13 ` Larry McVoy
2003-02-25 4:11 ` Martin J. Bligh
2003-02-25 17:37 ` Andrea Arcangeli
1 sibling, 1 reply; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 3:13 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
On Mon, Feb 24, 2003 at 07:00:42PM -0800, Martin J. Bligh wrote:
> >> > That means that the average sale price for a server from Dell was
> >> > $2495.
> >> >
> >> > The average sale price of all servers from all companies is $9347.
>
> means that somebody other than Dell is making the money on the big servers.
What part of "all servers from all companies" did you not understand?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 3:13 ` Larry McVoy
@ 2003-02-25 4:11 ` Martin J. Bligh
2003-02-25 4:17 ` Larry McVoy
0 siblings, 1 reply; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 4:11 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel
>> >> > That means that the average sale price for a server from Dell was
>> >> > $2495.
>> >> >
>> >> > The average sale price of all servers from all companies is $9347.
>>
>> means that somebody other than Dell is making the money on the big
>> servers.
>
> What part of "all servers from all companies" did you not understand?
Average price from Dell: $2495
Average price overall: $9347
Conclusion ... Dell makes cheaper servers than average, presumably smaller.
M.
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 4:11 ` Martin J. Bligh
@ 2003-02-25 4:17 ` Larry McVoy
2003-02-25 4:21 ` Martin J. Bligh
2003-02-25 22:02 ` Gerrit Huizenga
0 siblings, 2 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 4:17 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
On Mon, Feb 24, 2003 at 08:11:21PM -0800, Martin J. Bligh wrote:
> > What part of "all servers from all companies" did you not understand?
>
> Average price from Dell: $2495
> Average price overall: $9347
>
> Conclusion ... Dell makes cheaper servers than average, presumably smaller.
So how many CPUs do you think you get in a $9K server?
Better yet, since you work for IBM, how many servers do they ship in a year
with 16 CPUs?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 4:17 ` Larry McVoy
@ 2003-02-25 4:21 ` Martin J. Bligh
2003-02-25 4:37 ` Larry McVoy
2003-02-25 22:02 ` Gerrit Huizenga
1 sibling, 1 reply; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 4:21 UTC (permalink / raw)
To: Larry McVoy; +Cc: linux-kernel
>> > What part of "all servers from all companies" did you not understand?
>>
>> Average price from Dell: $2495
>> Average price overall: $9347
>>
>> Conclusion ... Dell makes cheaper servers than average, presumably
>> smaller.
>
> So how many CPUs do you think you get in a $9K server?
Not sure. Average by price is probably 4 or a little over.
> Better yet, since you work for IBM, how many servers do they ship in a
> year with 16 CPUs?
Will look. If I can find that data, and it's releasable, I'll send it out.
What's more interesting is how much money they make on machines with, say,
more than 4 CPUs. But I doubt I'll be allowed to release that info ;-)
M.
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 4:21 ` Martin J. Bligh
@ 2003-02-25 4:37 ` Larry McVoy
0 siblings, 0 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 4:37 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
On Mon, Feb 24, 2003 at 08:21:57PM -0800, Martin J. Bligh wrote:
> > So how many CPUs do you think you get in a $9K server?
>
> Not sure. Average by price is probably 4 or a little over.
Nope. For $12K you can get
4x 1.9Ghz
512MB
No networking
1 disk
No operating system
That's as cheap as it gets. And I don't know about you, but I have a
tough time believing that anyone buys a 4 CPU box without an OS, without
networking, with 0.5GB of RAM, and with one disk.
If you think you are getting a realistic 4 CPU server for $9K from
a vendor, you're dreaming.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 4:17 ` Larry McVoy
2003-02-25 4:21 ` Martin J. Bligh
@ 2003-02-25 22:02 ` Gerrit Huizenga
2003-02-25 23:19 ` Larry McVoy
1 sibling, 1 reply; 157+ messages in thread
From: Gerrit Huizenga @ 2003-02-25 22:02 UTC (permalink / raw)
To: Larry McVoy; +Cc: Martin J. Bligh, linux-kernel
On Mon, 24 Feb 2003 20:17:01 PST, Larry McVoy wrote:
> On Mon, Feb 24, 2003 at 08:11:21PM -0800, Martin J. Bligh wrote:
> > > What part of "all servers from all companies" did you not understand?
> >
> > Average price from Dell: $2495
> > Average price overall: $9347
> >
> > Conclusion ... Dell makes cheaper servers than average, presumably smaller.
>
> So how many CPUs do you think you get in a $9K server?
Did the numbers track add-on prices, as opposed to base servers? Most
servers are sold with one CPU and lots of extra slots. Need to dig
down to the add-on data to find upgrades to more CPUs and more memory
in the field (and more disk drives).
> Better yet, since you work for IBM, how many servers do they ship in a year
> with 16 CPUs?
Proprietary data, unfortunately. And I'm not sure if even internally
the totals are rolled up as pSeries, xSeries, zSeries, iSeries, etc. and
broken down by linux/aix/NT/VM/etc. Nor do most big companies efficiently
track the size of a machine at a customer site after hardware upgrades
(I know for Sequent this in particular was a painful problem - sold a
two-way and supported an 18-way machine later).
gerrit
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 22:02 ` Gerrit Huizenga
@ 2003-02-25 23:19 ` Larry McVoy
2003-02-25 23:46 ` Gerhard Mack
0 siblings, 1 reply; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 23:19 UTC (permalink / raw)
To: Gerrit Huizenga; +Cc: Larry McVoy, Martin J. Bligh, linux-kernel
On Tue, Feb 25, 2003 at 02:02:28PM -0800, Gerrit Huizenga wrote:
> On Mon, 24 Feb 2003 20:17:01 PST, Larry McVoy wrote:
> > On Mon, Feb 24, 2003 at 08:11:21PM -0800, Martin J. Bligh wrote:
> > > > What part of "all servers from all companies" did you not understand?
> > >
> > > Average price from Dell: $2495
> > > Average price overall: $9347
> > >
> > > Conclusion ... Dell makes cheaper servers than average, presumably smaller.
> >
> > So how many CPUs do you think you get in a $9K server?
>
> Did the numbers track add-on prices, as opposed to base server? Most
> servers are sold with one CPU and lots of extra slots. Need to dig
> down to the add-on data to find upgrades to more CPUs and more memory
> in the field (and more disk drives).
I included the URLs so you could check for yourself, but I arrived at
those numbers by taking the worldwide revenue associated with servers
and dividing by the number of units shipped. I would expect that would
include the add-on stuff.
I'm sure IBM makes money on their high-end stuff, but I'd suspect that
it is more bragging rights than what keeps the lights on.
I think the point which was missed in this whole thread is that even if
IBM has fantastic margins today on big iron, it's unlikely to stay that
way. The world is catching up. I can buy a dual 1.8GHz AMD box for
about $1500. 4-ways are more, maybe $10K or so. So you have the cheapo
white boxes coming at you from the low end.
On the high end, go look at what customers want. They are mostly taking
those big boxes and partitioning them. Sooner or later some bright boy
is going to realize that they could put 4 4 way boxes in one rack and
call it a 16 way box with 4 way partitioning "pre-installed".
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 23:19 ` Larry McVoy
@ 2003-02-25 23:46 ` Gerhard Mack
2003-02-26 4:23 ` Jesse Pollard
0 siblings, 1 reply; 157+ messages in thread
From: Gerhard Mack @ 2003-02-25 23:46 UTC (permalink / raw)
To: Larry McVoy; +Cc: Gerrit Huizenga, Martin J. Bligh, linux-kernel
On Tue, 25 Feb 2003, Larry McVoy wrote:
[snip]
> On the high end, go look at what customers want. They are mostly taking
> those big boxes and partitioning them. Sooner or later some bright boy
> is going to realize that they could put 4 4 way boxes in one rack and
> call it a 16 way box with 4 way partitioning "pre-installed".
Er, you mean like what racksaver.com does with their 2 dual-CPU servers in
a box?
Gerhard
--
Gerhard Mack
gmack@innerfire.net
<>< As a computer I find your faith in technology amusing.
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 23:46 ` Gerhard Mack
@ 2003-02-26 4:23 ` Jesse Pollard
2003-02-26 5:05 ` William Lee Irwin III
2003-02-26 5:27 ` Bernd Eckenfels
0 siblings, 2 replies; 157+ messages in thread
From: Jesse Pollard @ 2003-02-26 4:23 UTC (permalink / raw)
To: Gerhard Mack, Larry McVoy; +Cc: Gerrit Huizenga, Martin J. Bligh, linux-kernel
On Tuesday 25 February 2003 17:46, Gerhard Mack wrote:
> On Tue, 25 Feb 2003, Larry McVoy wrote:
[snip]
> > On the high end, go look at what customers want. They are mostly taking
> > those big boxes and partitioning them. Sooner or later some bright boy
> > is going to realize that they could put 4 4 way boxes in one rack and
> > call it a 16 way box with 4 way partitioning "pre-installed".
>
> er you mean like what racksaver.com does with their 2 dual CPU servers in
> a box?
And that is not "Big Iron".
Sorry - Big Iron is a 1-5 TFlop single system image, shared memory, with a
streaming vector processor...
Something like a Cray X1, single processor, for instance.
Or a 1024-processor Cray T3, again single system image, even if it doesn't
have a streaming vector processor.
I don't see that any of the current cluster systems provide the throughput
of such a system. Not even IBM's SP series. Aggregate measures of theoretical
throughput just don't add up. Practical throughput is almost always only 80%
of the theoretical (i.e. the advertised) throughput. Most cannot handle the data
I/O requirement, much less the IPC latency.
Sure, 3 microseconds sounds nice for Myrinet, but nothing beats 17 clock ticks,
where each tick is 4 ns, for the first 64-bit word of data... followed by the
next word in 4 ns per bus. (And that is on a slow processor....)
The output is fed to memory on every clock tick. (Most Cray processors have 4
memory buses for each processor - two for input data, one for output data,
and one for the instruction stream; and each has the same cycle time... Now
go to 4/8/16/32 processors without reducing that timing. That requires some
CAREFUL hardware design.)
And you better believe that there are big margins on such a system. You only
have to sell 8 to 16 units to exceed the yearly profit of most computer
companies. Do I have hard numbers on the units? no. I don't work for Cray.
I have used their systems for the last 12 years, and until the Earth Simulator
came on line, there was nothing that came close to their throughput for
weather modeling, finite element analysis, or other large problem types.
None of the microprocessors (possibly excepting the Power4) can come close -
when you look at the processor internals, they all have only a single memory
bus, running approximately 1-2 GB/second to cache.
Look at the cray this way: ALL of main memory is cache... with 4 ports to
it... for EACH processor...
Would I like to see Linux running on these? Yes. Can I pay for it? No. I'm
not in such a position where I could buy one. Would customers buy one?
Perhaps - if the price were right or the need great enough. Would having
Linux on it save the vendor money? I don't know. I hope that it would.
Unfortunately, there are too many things missing from Linux for it to be
considered:
job and process checkpoint/restart (with files/pipes/sockets intact)
batch job processors (REAL batch jobs ... not just cron)
resource accounting and resource allocation control
compartmented mode security support
truly large filesystem support (10 TB online, 300+ TB nearline in one fs)
large file support (100-300 GB in one file at least)
large process support
(10GB processes, 10-1000 threads... I can dream, can't I? :-)
automatic hardware failover support
hot swap components (disks, tapes, memory, processors)
to make a short list.
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-26 4:23 ` Jesse Pollard
@ 2003-02-26 5:05 ` William Lee Irwin III
2003-02-26 5:27 ` Bernd Eckenfels
1 sibling, 0 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-26 5:05 UTC (permalink / raw)
To: Jesse Pollard
Cc: Gerhard Mack, Larry McVoy, Gerrit Huizenga, Martin J. Bligh,
linux-kernel
On Tue, Feb 25, 2003 at 10:23:04PM -0600, Jesse Pollard wrote:
> And that is not "Big Iron".
> sorry - Big Iron is a 1-5 TFlop single system image, shared memory, with
> streaming vector processor...
Thank you for putting things in their perspectives.
This is why I call x86en maxed to their architectural limits "midrange",
which is a kind overestimate given their sickeningly enormous deficits.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-26 4:23 ` Jesse Pollard
2003-02-26 5:05 ` William Lee Irwin III
@ 2003-02-26 5:27 ` Bernd Eckenfels
2003-02-26 9:36 ` Eric W. Biederman
2003-02-26 12:09 ` Jesse Pollard
1 sibling, 2 replies; 157+ messages in thread
From: Bernd Eckenfels @ 2003-02-26 5:27 UTC (permalink / raw)
To: linux-kernel
In article <03022522230400.04587@tabby> you wrote:
> Something like a Cray X1, single processor for instance.
> Or a 1024 processor Cray T3, again single system image, even if it doesn't
> have a streaming vector processor.
>
> I don't see that any of the current cluster systems provide the throughput
> of such a system. Not even IBMs' SP series.
This clearly depends on the workload. For most vector processors
partitioning does not make sense. And don't forget, most of those systems are
pure compute servers used for scientific computing.
> The output is fed to memory on every clock tick. (most Cray processors have 4
> memory busses for each processor - two for input data, one for output data
> and one for the instruction stream
The fastest Cray on top500.org is the T3E1200 at rank _22_; the fastest IBM is
ranked _2_ with a Power3 processor. There are 13 IBM systems before the
first (fastest) Cray system. Of course those GFlops are measured for
parallel problems, but there are a lot out there.
And all those numbers are totally uninteresting for DB or Storage Servers.
Even a SAP SD Benchmark would not be fun on a Cray.
> I have used their systems for the last 12 years, and until the Earth Simulator
> came on line, there was nothing that came close to their throughput for
> weather modeling, finite element analysis, or other large problem types.
that's clearly wrong. http://www.top500.org/lists/lists.php?Y=2002&M=06
There are a lot of Power3 and Alpha systems before the first Cray.
Greetings
Bernd
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-26 5:27 ` Bernd Eckenfels
@ 2003-02-26 9:36 ` Eric W. Biederman
2003-02-26 12:09 ` Jesse Pollard
1 sibling, 0 replies; 157+ messages in thread
From: Eric W. Biederman @ 2003-02-26 9:36 UTC (permalink / raw)
To: Bernd Eckenfels; +Cc: linux-kernel
Bernd Eckenfels <ecki@calista.eckenfels.6bone.ka-ip.net> writes:
> In article <03022522230400.04587@tabby> you wrote:
> > The output is fed to memory on every clock tick. (most Cray processors have 4
>
> > memory busses for each processor - two for input data, one for output data
> > and one for the instruction stream
>
> The fastest Cray on top500.org is T3E1200 on rank _22_, the fastest IBM is
> ranked _2_ with a Power3 processor. There are 13 IBM systems before the
> first (fastest) Cray system. Of course those GFlops are measured for
> parallel problems, but there are a lot out there.
And it is especially interesting when you note that among ranks 2-5 the
ratings are so close that a strong breeze can cause an upset. And that #5
is composed of dual-CPU P4 Xeon nodes....
Eric
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-26 5:27 ` Bernd Eckenfels
2003-02-26 9:36 ` Eric W. Biederman
@ 2003-02-26 12:09 ` Jesse Pollard
2003-02-26 16:42 ` Geert Uytterhoeven
1 sibling, 1 reply; 157+ messages in thread
From: Jesse Pollard @ 2003-02-26 12:09 UTC (permalink / raw)
To: Bernd Eckenfels, linux-kernel
On Tuesday 25 February 2003 23:27, Bernd Eckenfels wrote:
> In article <03022522230400.04587@tabby> you wrote:
> > Something like a Cray X1, single processor for instance.
> > Or a 1024 processor Cray T3, again single system image, even if it
> > doesn't have a streaming vector processor.
> >
> > I don't see that any of the current cluster systems provide the
> > throughput of such a system. Not even IBMs' SP series.
>
> This clearly depends on the workload. For most vector processors
> partitioning does not make sense. And dont forget, most of those systems
> are pure compute servers used fr scientific computing.
Not as much as you would expect. I've sat next to (a cubicle over from) some
people doing benchmarking on the IBM SP 3 (a 330-node quad-processor system
and a newer one). Neither could achieve the "advertised" speed on real
problems.
> > The output is fed to memory on every clock tick. (most Cray processors
> > have 4 memory busses for each processor - two for input data, one for
> > output data and one for the instruction stream
>
> The fastest Cray on top500.org is T3E1200 on rank _22_, the fastest IBM is
> ranked _2_ with a Power3 processor. There are 13 IBM systems before the
> first (fastest) Cray system. Of course those GFlops are measured for
> parallel problems, but there are a lot out there.
The T3 achieves its speed based on the torus network. The processors
are only 400 MHz Alphas, 4 to a processing element. The IBM achieves
its speed from a carefully crafted benchmark to show the fastest aggregate
computation possible. It is not a practical usage. Basically the computation
is split into the largest possible chunks, each chunk is run on independent
systems, and the results are merged at the very end of the computation. (I've
used them too and have access to two of them.)
It takes something in the neighborhood of 60-100 processors in a T3 to
equal one Cray-architecture processor (even on a C90). A 32-processor C90
easily kept up with a T3 until you exceeded 900 processors in the T3. (I had
access to each of those too.)
> And all those numbers are totally uninteresting for DB or Storage Servers.
> Even a SAP SD Benchmark would not be fun on a Cray.
The Cray has been known to support 200+ GB filesystems with 300+ TB
nearline storage, with a maximum of 11-second access to data when that
data has been migrated to tape... Admittedly, the time gets longer if the file
exceeds about 100 MB, since it must then access multiple tapes in parallel.
> > I have used their systems for the last 12 years, and until the Earth
> > Simulator came on line, there was nothing that came close to their
> > throughput for weather modeling, finite element analysis, or other large
> > problem types.
>
> that's clearly wrong. http://www.top500.org/lists/lists.php?Y=2002&M=06
What you are actually looking at is a custom benchmark, carefully crafted
to show the fastest aggregate computation possible. It is not a practical
usage. The aggregate Cray system throughput (if you max out an X1 cluster)
exceeds even the Earth Simulator's. Unfortunately, one of these hasn't been
sold yet.
One of the biggest weaknesses in the IBM world is the SP switch. The lack
of a true shared-memory programming model limits the systems to very
coarse-grained parallelism. It really is just a collection of very fast small
servers. There is no "single system image". The OS and all core utilities
must be duplicated on each node or the cluster will not boot.
> There are a lot of Power3 and Alpha systems before the first Cray.
Ah, no. The first Cray came before the Pentium... The company made a profit
off its first sale of one system. There was no Power3 or Alpha chip.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-26 12:09 ` Jesse Pollard
@ 2003-02-26 16:42 ` Geert Uytterhoeven
0 siblings, 0 replies; 157+ messages in thread
From: Geert Uytterhoeven @ 2003-02-26 16:42 UTC (permalink / raw)
To: Jesse Pollard; +Cc: Bernd Eckenfels, Linux Kernel Development
On Wed, 26 Feb 2003, Jesse Pollard wrote:
> On Tuesday 25 February 2003 23:27, Bernd Eckenfels wrote:
> > There are a lot of Power3 and Alpha systems before the first Cray.
>
> Ah no. The first cray was before the Pentium... The company made a profit
> off of its first sale on one system. There was no power 3 or alpha chip.
I think Bernd was speaking about the Top 500, not about a historical timeline.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 3:00 ` Martin J. Bligh
2003-02-25 3:13 ` Larry McVoy
@ 2003-02-25 17:37 ` Andrea Arcangeli
1 sibling, 0 replies; 157+ messages in thread
From: Andrea Arcangeli @ 2003-02-25 17:37 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
On Mon, Feb 24, 2003 at 07:00:42PM -0800, Martin J. Bligh wrote:
> Solaris et al failed was because the development model was different?
Solaris can't be recompiled UP, AFAIK. This whole discussion about UP
performance is almost pointless on Linux, since we have CONFIG_SMP and we
can recompile it.
Especially if what you care about is the desktop (not the UP server), the only
kernel bits that matter for the desktop are the VM, the scheduler, and
I/O latency, and perhaps clear_page too. The rest is all a matter of
X/kde/qt/glibc-dynamiclinking/opengl/memorybloatwithmultiplelibs/etc.
The kernel's raw performance in the fast paths doesn't matter much for the
desktop; even if syscalls were twice as slow, desktop users
wouldn't notice much.
Andrea
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
2003-02-25 0:41 ` Martin J. Bligh
2003-02-25 0:54 ` Larry McVoy
@ 2003-02-25 1:09 ` David Lang
1 sibling, 0 replies; 157+ messages in thread
From: David Lang @ 2003-02-25 1:09 UTC (permalink / raw)
To: Martin J. Bligh; +Cc: Larry McVoy, linux-kernel
If you want to say that sales of Linux servers generate more profits than
sales of Linux desktops, then you have a chance of being right, not because
the server market is so large, but because the desktop market is so small.
However, if the Linux desktop were to get 10% of the market in terms of new
sales (it's already in the 5%-7% range according to some reports, but a
large percentage of that is in repurposed Windows desktops), then the
sales and profits of the desktops would easily outclass the sales and
profits of servers due to the sheer volume.
IBM and Sun make a lot of sales on the theory that their machines (you
know, the ones in fancy PC cases with PC power supplies and IDE drives)
are somehow more reliable than an x86 machine. As people really start
analysing the cost/performance of the machines and implement HA because
they need 24x7 coverage and even the big boys' boxes need to be updated,
people realize that they can buy multiple cheap boxes and get HA for less
than the cost of buying the one 'professional' box (in some cases they can
afford to buy the multiple smaller boxes and replace them every year for
less than the cost of the professional box over 3 years). And as more
folks use Linux on the small(er) machines, it breaks down the risk barrier.
One of the big reasons people have traditionally used small numbers of
large boxes was that the licensing costs have been significant; well, Linux
doesn't have a per-server license cost (unless you really want to pay one),
so that's also no longer an issue.
There are some jobs that require large machines instead of clusters;
databases are still one of them (at least as far as I have been able to
learn), but a lot of other jobs are being moved to multiple smaller boxes
(or to multiple logical boxes on one large box, which is what Larry is
advocating), and in spite of the doomsayers the problems are being worked
out (can you imagine the reaction from telling a sysadmin team managing
one server in 1970 that in 2000 a similar-sized team would be managing
hundreds or thousands of servers a la Google? :-) Yes, it takes planning and
discipline, but it's not nearly as hard as people imagine before they get
started down that path).
David Lang
On Mon, 24 Feb 2003, Martin J. Bligh wrote:
> Date: Mon, 24 Feb 2003 16:41:04 -0800
> From: Martin J. Bligh <mbligh@aracnet.com>
> To: Larry McVoy <lm@bitmover.com>
> Cc: linux-kernel@vger.kernel.org
> Subject: Re: Server shipments [was Re: Minutes from Feb 21 LSE Call]
>
> > More data from news.com.
> >
> > Dell has 19% of the server market with $531M/quarter in sales[1] over
> > 212,750 machines per quarter[2].
> >
> > That means that the average sale price for a server from Dell was $2495.
> >
> > The average sale price of all servers from all companies is $9347.
> >
> > I still don't see the big profits touted by the scaling fanatics, anyone
> > care to explain it?
>
> Sigh. If you're so convinced that there's no money in larger systems,
> why don't you write to Sam Palmisano and explain to him the error of
> his ways? I'm sure IBM has absolutely no market data to go on ...
>
> If only he could receive an explanation of the error of his ways from
> Larry McVoy, I'm sure he'd turn the ship around, for you obviously have
> all the facts, figures, and experience of the server market to make this
> kind of decision. I await the email from the our CEO that tells us how
> much he respects you, and has taken this decision at your bidding.
>
> M.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 16:17 ` Larry McVoy
2003-02-24 16:49 ` Martin J. Bligh
@ 2003-02-24 18:22 ` John W. M. Stevens
1 sibling, 0 replies; 157+ messages in thread
From: John W. M. Stevens @ 2003-02-24 18:22 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, linux-kernel
On Mon, Feb 24, 2003 at 08:17:16AM -0800, Larry McVoy wrote:
> On Sun, Feb 23, 2003 at 11:39:34PM -0800, Martin J. Bligh wrote:
>
> See other postings on this one. All engineers in your position have said
> "we're just trying to get to N cpus where N = ~2x where we are today and
> it won't hurt uniprocessor performance". They *all* say that. And they
> all end up with a slow uniprocessor OS. Unlike security and a number of
> other invasive features, the SMP stuff can't be configed out
Heck, you can't even configure it out on so-called UP systems.
The moment you introduce DMA into a system, you have an (admittedly,
constrained) SMP system.
And of course, simple interruption is another, constrained, kind of
"virtual SMP", yes?
Anybody who's done any USB HC programming is horribly aware of this
fact, trust me! ;-)
> or you end
> up with an #ifdef-ed mess like IRIX.
Why ifdef it everywhere?
#ifdef SMP
#define lock( mutex ) smpLock( mutex )
#else
#define lock( mutex )
#endif
Do that once, use the lock macro, and forget about it (except in
cases where you have to worry about DMA, interruption, or some other
kind of MP, of course).
My (limited, only about 600 machines) experience is that Linux is
inevitably less stable on non-Intel, and on non-UP machines. Before
worrying about scalability, my opinion is that worrying about getting
the simplest (dual processor) machines as stable as UP machines, first,
would be both a better ROI, and a good basis for higher levels of
scalability.
Mind you, there is a perfectly simple reason (for Linux being less
stable on non-Intel, non-UP machines) that this is true: the
Linux development methodology pretty much makes this an emergent
property.
Interesting discussion, though . . . from my experience, the commercial
Unices use fine grained locking.
Luck,
John S.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 6:58 ` Larry McVoy
2003-02-24 7:39 ` Martin J. Bligh
@ 2003-02-24 7:51 ` William Lee Irwin III
2003-02-24 15:47 ` Larry McVoy
2003-02-24 13:28 ` Alan Cox
2003-02-24 18:44 ` Davide Libenzi
3 siblings, 1 reply; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-24 7:51 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, linux-kernel
On Sun, Feb 23, 2003 at 10:58:26PM -0800, Larry McVoy wrote:
> Linux is a really fast system right now. The code paths are short and
> it is possible to use the OS almost as if it were a library, the cost is
> so little that you really can mmap stuff in as you need, something that
> people have wanted since Multics. There will always be many more uses
> of Linux in small systems than large, simply because there will always
> be more small systems. Keeping Linux working well on small systems is
> going to have a dramatically larger positive benefit for the world than
> scaling it to 64 processors. So who do you want to help? An elite
> few or everyone?
I don't know what kind of joke you think I'm trying to play here.
"Scalability" is about making the kernel properly adapt to the size of
the system. This means UP. This means embedded. This means mid-range
x86 bigfathighmem turds. This means SGI Altix. I have _personally_
written patches to decrease the space footprint of pidhashes and other
data structures so that embedded systems function more optimally.
It's not about crapping all over the low end. It's not about degrading
performance on commonly available systems. It's about increasing the
range of systems on which Linux performs well and is useful.
Maintaining the performance of Linux on commonly available systems is
not only deeply ingrained as one of a set of personal standards amongst
all kernel hackers involved with scalability, it's also a prerequisite
for patch acceptance that is rigorously enforced by maintainers. To
further demonstrate this, look at the pgd_ctor patches, which markedly
reduced the overhead of pgd setup and teardown on UP lowmem systems and
were very minor improvements on PAE systems.
Now it's time to turn the question back around on you. Why do you not
want Linux to work well on a broader range of systems than it does now?
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 7:51 ` William Lee Irwin III
@ 2003-02-24 15:47 ` Larry McVoy
2003-02-24 16:00 ` Martin J. Bligh
` (2 more replies)
0 siblings, 3 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-24 15:47 UTC (permalink / raw)
To: William Lee Irwin III, Martin J. Bligh, Larry McVoy, linux-kernel
On Sun, Feb 23, 2003 at 11:51:42PM -0800, William Lee Irwin III wrote:
> Now it's time to turn the question back around on you. Why do you not
> want Linux to work well on a broader range of systems than it does now?
I never said that I didn't. I'm just taking issue with the chosen path,
which has been demonstrated not to work.
"Let's scale Linux by multi threading"
"Err, that really sucked for everyone who has tried it in the past, all
the code paths got long and uniprocessor performance suffered"
"Oh, but we won't do that, that would be bad".
"Great, how about you measure the changes carefully and really show that?"
"We don't need to measure the changes, we know we'll do it right".
And just like every other time this comes up in every other engineering
organization, the focus is on 2x wherever we are today. It is *never*
about getting to 100x or 1000x.
If you were looking at the problem assuming that the same code had to
run on uniprocessor and a 1000 way smp, right now, today, and designing
for it, I doubt very much we'd have anything to argue about. A lot of
what I'm saying starts to become obviously true as you increase the
number of CPUs but engineers are always seduced into making it go 2x
farther than it does today. Unfortunately, each of those 2x increases
comes at some cost and they add up.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 15:47 ` Larry McVoy
@ 2003-02-24 16:00 ` Martin J. Bligh
2003-02-24 16:23 ` Benjamin LaHaise
2003-02-24 23:36 ` William Lee Irwin III
2 siblings, 0 replies; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-24 16:00 UTC (permalink / raw)
To: Larry McVoy, William Lee Irwin III, linux-kernel
> I never said that I didn't. I'm just taking issue with the chosen path,
> which has been demonstrated not to work.
>
> "Let's scale Linux by multi threading"
>
> "Err, that really sucked for everyone who has tried it in the past,
> all the code paths got long and uniprocessor performance suffered"
>
> "Oh, but we won't do that, that would be bad".
>
> "Great, how about you measure the changes carefully and really show
> that?"
>
> "We don't need to measure the changes, we know we'll do it right".
Most of the threading changes have been things like one thread per CPU, which
would seem to scale up and down rather well to me ... could you illustrate
by pointing to an example of something that's changed in that area which
you think is bad? Yes, if Linux started 2000 kernel threads on a UP system,
that would obviously be bad.
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 15:47 ` Larry McVoy
2003-02-24 16:00 ` Martin J. Bligh
@ 2003-02-24 16:23 ` Benjamin LaHaise
2003-02-24 16:25 ` yodaiken
2003-02-24 16:31 ` Minutes from Feb 21 LSE Call Larry McVoy
2003-02-24 23:36 ` William Lee Irwin III
2 siblings, 2 replies; 157+ messages in thread
From: Benjamin LaHaise @ 2003-02-24 16:23 UTC (permalink / raw)
To: Larry McVoy, William Lee Irwin III, Martin J. Bligh, Larry McVoy,
linux-kernel
On Mon, Feb 24, 2003 at 07:47:25AM -0800, Larry McVoy wrote:
> If you were looking at the problem assuming that the same code had to
> run on uniprocessor and a 1000 way smp, right now, today, and designing
> for it, I doubt very much we'd have anything to argue about. A lot of
> what I'm saying starts to become obviously true as you increase the
> number of CPUs but engineers are always seduced into making it go 2x
> farther than it does today. Unfortunately, each of those 2x increases
> comes at some cost and they add up.
Good point. However, we are in a position to compare test results of
older linux kernels against newer, and to recompile code out of the
kernel for specific applications. I'm curious if there is a collection
of lmbench results of hand configured and compiled kernels vs the vendor
module based kernels across 2.0, 2.2, 2.4 and recent 2.5 on the same
uniprocessor and dual-processor configurations. That would really give
us a better idea of the cost difference between a properly tuned kernel
and what people actually use for support reasons, and whether we're
winning or losing.
-ben
--
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 16:23 ` Benjamin LaHaise
@ 2003-02-24 16:25 ` yodaiken
2003-02-24 18:20 ` Gerrit Huizenga
2003-02-24 16:31 ` Minutes from Feb 21 LSE Call Larry McVoy
1 sibling, 1 reply; 157+ messages in thread
From: yodaiken @ 2003-02-24 16:25 UTC (permalink / raw)
To: Benjamin LaHaise
Cc: Larry McVoy, William Lee Irwin III, Martin J. Bligh, Larry McVoy,
linux-kernel
On Mon, Feb 24, 2003 at 11:23:14AM -0500, Benjamin LaHaise wrote:
> Good point. However, we are in a position to compare test results of
> older linux kernels against newer, and to recompile code out of the
> kernel for specific applications. I'm curious if there is a collection
> of lmbench results of hand configured and compiled kernels vs the vendor
> module based kernels across 2.0, 2.2, 2.4 and recent 2.5 on the same
> uniprocessor and dual processor configuration. That would really give
> us a better idea of how a properly tuned kernel vs what people actually
> use for support reasons is costing us, and if we're winning or losing.
It's interesting to me that the people supporting the scale-up do not
carefully do such benchmarks and indeed have a rather cavalier attitude
to testing and benchmarking; or perhaps they don't think it's worth
publishing.
--
---------------------------------------------------------
Victor Yodaiken
Finite State Machine Labs: The RTLinux Company.
www.fsmlabs.com www.rtlinux.com
1+ 505 838 9109
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 16:25 ` yodaiken
@ 2003-02-24 18:20 ` Gerrit Huizenga
2003-02-25 1:51 ` Minutes from Feb 21 LSE Call - publishing performance data Craig Thomas
0 siblings, 1 reply; 157+ messages in thread
From: Gerrit Huizenga @ 2003-02-24 18:20 UTC (permalink / raw)
To: yodaiken
Cc: Benjamin LaHaise, Larry McVoy, William Lee Irwin III,
Martin J. Bligh, Larry McVoy, linux-kernel
On Mon, 24 Feb 2003 09:25:33 MST, yodaiken@fsmlabs.com wrote:
> It's interesting to me that the people supporting the scale up do not
> carefully do such benchmarks and indeed have a rather cavalier attitude
> to testing and benchmarking: or perhaps they don't think it's worth
> publishing.
I'm afraid it is the latter half that is closer to correct. Within
IBM's Linux Technology Center, we have a good sized performance team
and a tightly coupled set of developers who can internally share a
lot of real benchmark data. Unfortunately, the rules of SPEC and TPC
don't allow us to release data unless it is carefully (and time-
consumingly) audited, and IBM has a history of not dumping the output
of a few hundred runs of benchmarks out in the open and then claiming
that it is all valid, without doing a lot of internal validation first.
I'm sure other large companies doing Linux stuff have similar hurdles.
In some cases, ours are probably higher than average (IBM as an
entity has zero interest in pissing off the TPC or SPEC).
We do have a few papers out there, check OLS for the large database
workload one that steps through 2.4 performance changes (stock
2.4 vs. a set of patches we pushed to UL & RHAT) that increase
database performance about, oh, I forget, 5-fold... And there
is occasional other data sent out on web server stuff, some
microbenchmark data (see the continuing stream of data from mbligh,
for instance). Also, the contest data, OSDL data, etc. etc.
shows comparisons and trends for anyone who cares to pay attention.
It *would* be nice if someone could publish a compendium of performance
data, but that would be asking a lot...
gerrit
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call - publishing performance data
2003-02-24 18:20 ` Gerrit Huizenga
@ 2003-02-25 1:51 ` Craig Thomas
0 siblings, 0 replies; 157+ messages in thread
From: Craig Thomas @ 2003-02-25 1:51 UTC (permalink / raw)
To: Gerrit Huizenga
Cc: yodaiken, Benjamin LaHaise, Larry McVoy, William Lee Irwin III,
Martin J. Bligh, Larry McVoy, linux-kernel
On Mon, 2003-02-24 at 10:20, Gerrit Huizenga wrote:
>
> We do have a few papers out there, check OLS for the large database
> workload one that steps through 2.4 performance changes (stock
> 2.4 vs. a set of patches we pushed to UL & RHAT) that increase
> database performance about, oh, I forget, 5-fold... And there
> is occasional other data sent out on web server stuff, some
> microbenchmark data (see the continuing stream of data from mbligh,
> for instance). Also, the contest data, OSDL data, etc. etc.
> shows comparisons and trends for anyone who cares to pay attention.
>
> It *would* be nice if someone could publish a compendium of performance
> data, but that would be asking a lot...
>
> gerrit
> -
OSDL is trying to provide something like this for the 2.5 kernel. It is
an interest of ours to provide this sort of data. We have been building
database workload information and generating test results from our STP
test framework.
We are in the midst of creating content for a Linux Stability Results
web page. http://www.osdl.org/projects/26lnxstblztn/results/ There is
a great desire on our part to share good performance data for the kernel
as it evolves. I would like to ask you guys what you would like to
see on a page like this. I feel that we could create a single site where
anyone can get access to performance and reliability information about
the Linux kernel as we move toward the 2.6 version.
The page is set up now so that anyone can contribute content to the page
by editing an html template file to point to test and performance data.
If anyone is interested in this concept, email me privately or
cliffw@osdl.org
--
Craig Thomas <craiger@osdl.org>
OSDL
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 16:23 ` Benjamin LaHaise
2003-02-24 16:25 ` yodaiken
@ 2003-02-24 16:31 ` Larry McVoy
1 sibling, 0 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-24 16:31 UTC (permalink / raw)
To: Benjamin LaHaise
Cc: Larry McVoy, William Lee Irwin III, Martin J. Bligh, linux-kernel
On Mon, Feb 24, 2003 at 11:23:14AM -0500, Benjamin LaHaise wrote:
> kernel for specific applications. I'm curious if there is a collection
> of lmbench results of hand configured and compiled kernels vs the vendor
> module based kernels across 2.0, 2.2, 2.4 and recent 2.5 on the same
> uniprocessor and dual processor configuration.
If someone were willing to build the init script infrastructure to
reboot to a new kernel, run the test, etc., I'll buy a couple of
machines and just let them run through this. I'd like to do it
with the cache miss counters turned on so if P4's do a nicer job
of counting than Athlons, I'll get those.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 15:47 ` Larry McVoy
2003-02-24 16:00 ` Martin J. Bligh
2003-02-24 16:23 ` Benjamin LaHaise
@ 2003-02-24 23:36 ` William Lee Irwin III
2003-02-25 0:23 ` Larry McVoy
2 siblings, 1 reply; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-24 23:36 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, linux-kernel
On Sun, Feb 23, 2003 at 11:51:42PM -0800, William Lee Irwin III wrote:
>> Now it's time to turn the question back around on you. Why do you not
>> want Linux to work well on a broader range of systems than it does now?
On Mon, Feb 24, 2003 at 07:47:25AM -0800, Larry McVoy wrote:
> I never said that I didn't. I'm just taking issue with the choosen path
> which has been demonstrated to not work.
> "Let's scale Linux by multi threading"
> "Err, that really sucked for everyone who has tried it in the past, all
> the code paths got long and uniprocessor performance suffered"
> "Oh, but we won't do that, that would be bad".
> "Great, how about you measure the changes carefully and really show that?"
> "We don't need to measure the changes, we know we'll do it right".
The changes are getting measured. By and large, if it's slower on UP
it's rejected. There's a dedicated benchmark crew, of which Randy Hron
is an important member, that benchmarks such things very consistently.
Internal benchmarking includes both free and non-free benchmarks; dbench,
tiobench, kernel compiles, contest, and so on are the publishable bits.
Also, code paths are not necessarily getting longer. Single-
threaded efficiency lowers lock hold time and helps small systems too,
and numerous improvements with buffer_heads, task searching, file
truncation, and the like, are of that flavor.
On Mon, Feb 24, 2003 at 07:47:25AM -0800, Larry McVoy wrote:
> And just like every other time this comes up in every other engineering
> organization, the focus is on 2x wherever we are today. It is *never*
> about getting to 100x or 1000x.
> If you were looking at the problem assuming that the same code had to
> run on uniprocessor and a 1000 way smp, right now, today, and designing
> for it, I doubt very much we'd have anything to argue about. A lot of
> what I'm saying starts to become obviously true as you increase the
> number of CPUs but engineers are always seduced into making it go 2x
> farther than it does today. Unfortunately, each of those 2x increases
> comes at some cost and they add up.
Linux is a patchwork kernel. No coherent design will ever shine through.
Scaling the kernel incrementally merely becomes that much more difficult.
The small system performance standards aren't getting lowered.
Also note there are various efforts to scale the kernel _downward_ to
smaller embedded systems, partly by controlling "bloated" hash tables'
sizes and partly by making major subsystems optional and partly by
supporting systems with no MMU. This is not a one-way street, though I
myself am clearly pointed in the upward direction.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 23:36 ` William Lee Irwin III
@ 2003-02-25 0:23 ` Larry McVoy
2003-02-25 2:37 ` Werner Almesberger
2003-02-25 4:42 ` William Lee Irwin III
0 siblings, 2 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 0:23 UTC (permalink / raw)
To: William Lee Irwin III, Martin J. Bligh, Larry McVoy, linux-kernel
> The changes are getting measured. By and large if it's slower on UP
> it's rejected.
Suppose I have an application which has a working set which just exactly
fits in the I+D caches, including the related OS stuff.
Someone makes some change to the OS and the benchmark for that change is
smaller than the I+D caches but the change increased the I+D cache space
needed.
The benchmark will not show any slowdown, correct?
My application no longer fits and will suffer, correct?
The point is that if you are putting SMP changes into the system, you
have to be held to a higher standard for measurement given the past
track record of SMP changes increasing code length and cache footprints.
So "measuring" doesn't mean "it's not slower on XYZ microbenchmark".
It means "under the following work loads the cache misses went down or
stayed the same for before and after tests".
And if you said that all changes should be held to this standard, not
just scaling changes, I'd agree with you. But scaling changes are the
"bad guy" in my mind, they are not to be trusted, so they should be held
to this standard first. If we can get everyone to step up to this bat,
that's all to the good.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 0:23 ` Larry McVoy
@ 2003-02-25 2:37 ` Werner Almesberger
2003-02-25 4:42 ` William Lee Irwin III
1 sibling, 0 replies; 157+ messages in thread
From: Werner Almesberger @ 2003-02-25 2:37 UTC (permalink / raw)
To: William Lee Irwin III, Martin J. Bligh, Larry McVoy, linux-kernel
Larry McVoy wrote:
> The point is that if you are putting SMP changes into the system, you
> have to be held to a higher standard for measurement given the past
> track record of SMP changes increasing code length and cache footprints.
So you probably want to run this benchmark on a synthetic CPU a la
cachegrind. The difficult part would be to come up with a reasonably
understandable additive metric for cache pressure.
(I guess there goes another call to arms to academia :-)
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 0:23 ` Larry McVoy
2003-02-25 2:37 ` Werner Almesberger
@ 2003-02-25 4:42 ` William Lee Irwin III
2003-02-25 4:54 ` Larry McVoy
1 sibling, 1 reply; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-25 4:42 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, linux-kernel
At some point in the past, I wrote:
>> The changes are getting measured. By and large if it's slower on UP
>> it's rejected.
On Mon, Feb 24, 2003 at 04:23:09PM -0800, Larry McVoy wrote:
> Suppose I have an application which has a working set which just exactly
> fits in the I+D caches, including the related OS stuff.
> Someone makes some change to the OS and the benchmark for that change is
> smaller than the I+D caches but the change increased the I+D cache space
> needed.
> The benchmark will not show any slowdown, correct?
> My application no longer fits and will suffer, correct?
Well, it's often clear from the code whether it'll have a larger cache
footprint or not, so it's probably not that large a problem. OTOH it is
a real problem that little cache or TLB profiling is going on. I tried
once or twice and actually came up with a function or two that should
be inlined instead of uninlined in very short order. Much low-hanging
fruit could be gleaned from those kinds of profiles.
It's also worthwhile noting increased cache footprints are actually
very often degradations on SMP and especially NUMA. The notion that
optimizing for SMP and/or NUMA involves increasing cache footprint
on anything doesn't really sound plausible, though I'll admit that
the mistake of trusting microbenchmarks too far on SMP has probably
already been committed at least once. Userspace owns the cache; using
cache for the kernel is "cache pollution", which should be minimized.
Going too far out on the space end of time/space tradeoff curves is
every bit as bad for SMP as UP, and really horrible for NUMA.
On Mon, Feb 24, 2003 at 04:23:09PM -0800, Larry McVoy wrote:
> The point is that if you are putting SMP changes into the system, you
> have to be held to a higher standard for measurement given the past
> track record of SMP changes increasing code length and cache footprints.
> So "measuring" doesn't mean "it's not slower on XYZ microbenchmark".
> It means "under the following work loads the cache misses went down or
> stayed the same for before and after tests".
This kind of measurement is actually relatively unusual. I'm definitely
interested in it, as there appear to be some deficits wrt. locality of
reference that show up as big profile spikes on NUMA boxen. With care
exercised good solutions should also trim down cache misses on UP also.
Cache and TLB miss profile driven development sounds very attractive.
On Mon, Feb 24, 2003 at 04:23:09PM -0800, Larry McVoy wrote:
> And if you said that all changes should be held to this standard, not
> just scaling changes, I'd agree with you. But scaling changes are the
> "bad guy" in my mind, they are not to be trusted, so they should be held
> to this standard first. If we can get everyone to step up to this bat,
> that's all to the good.
Let me put it this way: IBM sells tiny boxen too, from 4x, to UP, to
whatever. And people are simultaneously actively trying to scale
downward to embedded bacteria or whatever. So the small systems are
being neither ignored nor sacrificed for anything else.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 4:42 ` William Lee Irwin III
@ 2003-02-25 4:54 ` Larry McVoy
2003-02-25 6:00 ` William Lee Irwin III
0 siblings, 1 reply; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 4:54 UTC (permalink / raw)
To: William Lee Irwin III, Martin J. Bligh, Larry McVoy, linux-kernel
> Userspace owns the cache; using
> cache for the kernel is "cache pollution", which should be minimized.
> Going too far out on the space end of time/space tradeoff curves is
> every bit as bad for SMP as UP, and really horrible for NUMA.
Cool, I agree 100% with this.
> > So "measuring" doesn't mean "it's not slower on XYZ microbenchmark".
> > It means "under the following work loads the cache misses went down or
> > stayed the same for before and after tests".
>
> This kind of measurement is actually relatively unusual. I'm definitely
> interested in it, as there appear to be some deficits wrt. locality of
> reference that show up as big profile spikes on NUMA boxen. With care
> exercised good solutions should also trim down cache misses on UP also.
> Cache and TLB miss profile driven development sounds very attractive.
Again, I'm with you all the way on this. If the scale up guys can adopt
this as a mantra, I'm a lot less concerned that anything bad will happen.
Tim at OSDL and I have been talking about trying to work out some benchmarks
to test for this. I came up with the idea of adding a "-s XXX" which means
"touch XXX bytes between each iteration" to each LMbench test. One problem
is the lack of page coloring will make the numbers bounce around too much.
We talked that over with Linus and he suggested using the big TLB hack to
get around that. Assuming we can deal with the page coloring, do you think
that there is any merit in taking microbenchmarks, adding an artificial
working set, and running those?
> Let me put it this way: IBM sells tiny boxen too, from 4x, to UP, to
> whatever. And people are simultaneously actively trying to scale
> downward to embedded bacteria or whatever.
That's really great, I know it's a lot less sexy but it's important.
I'd love to see as much attention on making Linux work on tiny embedded
platforms as there is on making it work on big iron. Small is cool too.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 4:54 ` Larry McVoy
@ 2003-02-25 6:00 ` William Lee Irwin III
2003-02-25 7:00 ` Val Henson
0 siblings, 1 reply; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-25 6:00 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, linux-kernel
At some point in the past, I wrote:
>> This kind of measurement is actually relatively unusual. I'm definitely
>> interested in it, as there appear to be some deficits wrt. locality of
>> reference that show up as big profile spikes on NUMA boxen. With care
>> exercised good solutions should also trim down cache misses on UP also.
>> Cache and TLB miss profile driven development sounds very attractive.
On Mon, Feb 24, 2003 at 08:54:04PM -0800, Larry McVoy wrote:
> Again, I'm with you all the way on this. If the scale up guys can adopt
> this as a mantra, I'm a lot less concerned that anything bad will happen.
I don't know about mantras, but we're getting to the point where lock
contention is a non-issue on midrange SMP and straight line efficiency
is beyond the range of "obviously it should be done some other way."
The time to chase cache pollution is certainly coming.
On Mon, Feb 24, 2003 at 08:54:04PM -0800, Larry McVoy wrote:
> Tim at OSDL and I have been talking about trying to work out some benchmarks
> to test for this. I came up with the idea of adding a "-s XXX" which means
> "touch XXX bytes between each iteration" to each LMbench test. One problem
> is the lack of page coloring will make the numbers bounce around too much.
> We talked that over with Linus and he suggested using the big TLB hack to
> get around that. Assuming we can deal with the page coloring, do you think
> that there is any merit in taking microbenchmarks, adding an artificial
> working set, and running those?
Page coloring needs to get into the kernel at some point. Using large
TLB entries will artificially tie this to TLB effects and fragmentation,
in addition to pagetable space conservation (on x86 anyway). So I really
don't see any way to deal with reproducibility issues on this front but
just doing page coloring. Everything else that does it as a side effect
would unduly disturb the results, IMHO.
At some point in the past, I wrote:
>> Let me put it this way: IBM sells tiny boxen too, from 4x, to UP, to
>> whatever. And people are simultaneously actively trying to scale
>> downward to embedded bacteria or whatever.
On Mon, Feb 24, 2003 at 08:54:04PM -0800, Larry McVoy wrote:
> That's really great, I know it's a lot less sexy but it's important.
> I'd love to see as much attention on making Linux work on tiny embedded
> platforms as there is on making it work on big iron. Small is cool too.
There is, but unfortunately the participation of embedded vendors in
the development cycle is not as visible as that of large system vendors.
More direct, frequent, and vocal input from embedded kernel hackers
would be very valuable, as many "corner cases" with automatic kernel
scaling should occur on the small end, not just the large end.
I've had some brief attempts to explain to me the motives and methods
of embedded system vendors and the like, but I've failed to absorb
enough to get a "big picture" or much of any notion as to why embedded
kernel hackers aren't participating as much in the development cycle.
On the large system side, it's very clear that issues in the core VM
and other parts of the kernel must be addressed to achieve the goals,
and hence participation in the development cycle is outright mandatory.
It's not "working effectively". It's a requirement. And part of that
"requirement" bit is we have to work with constraints never enforced
before, including maintaining the scalability curve on the low end.
It's hard, though probably not impossible, and absolutely required.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 6:00 ` William Lee Irwin III
@ 2003-02-25 7:00 ` Val Henson
0 siblings, 0 replies; 157+ messages in thread
From: Val Henson @ 2003-02-25 7:00 UTC (permalink / raw)
To: William Lee Irwin III, linux-kernel
On Mon, Feb 24, 2003 at 10:00:53PM -0800, William Lee Irwin III wrote:
> On Mon, Feb 24, 2003 at 08:54:04PM -0800, Larry McVoy wrote:
> > That's really great, I know it's a lot less sexy but it's important.
> > I'd love to see as much attention on making Linux work on tiny embedded
> > platforms as there is on making it work on big iron. Small is cool too.
>
> There is, but unfortunately the participation of embedded vendors in
> the development cycle is not as visible as that of large system vendors.
> More direct, frequent, and vocal input from embedded kernel hackers
> would be very valuable, as many "corner cases" with automatic kernel
> scaling should occur on the small end, not just the large end.
>
> I've had some brief attempts to explain to me the motives and methods
> of embedded system vendors and the like, but I've failed to absorb
> enough to get a "big picture" or much of any notion as to why embedded
> kernel hackers aren't participating as much in the development cycle.
Speaking as a former Linux developer for an embedded[1] systems
vendor, it's because embedded companies aren't the size of IBM and
don't have money to spend on software development beyond the "make it
work on our boards" point. One of the many reasons I'm a _former_
embedded Linux developer.
-VAL
[1] Okay, our boards had up to 4 processors and 1GB memory. But the
same principles applied.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 6:58 ` Larry McVoy
2003-02-24 7:39 ` Martin J. Bligh
2003-02-24 7:51 ` William Lee Irwin III
@ 2003-02-24 13:28 ` Alan Cox
2003-02-25 5:19 ` Chris Wedgwood
2003-02-24 18:44 ` Davide Libenzi
3 siblings, 1 reply; 157+ messages in thread
From: Alan Cox @ 2003-02-24 13:28 UTC (permalink / raw)
To: Larry McVoy; +Cc: Martin J. Bligh, Linux Kernel Mailing List
On Mon, 2003-02-24 at 06:58, Larry McVoy wrote:
> Which brings us back to the point. If the world is not heading towards
> an 8 way on every desk then it is really questionable to make a lot of
> changes to the kernel to make it work really well on 8-ways.
_If_ it harms performance on small boxes. Otherwise you turn Linux into
Irix and your market doesn't look so hot in 3 or 4 years' time. Featuritis
is a slow creeping death.
The definitive Linux box appears to be $199 from Walmart right now, and
it's not SMP.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 13:28 ` Alan Cox
@ 2003-02-25 5:19 ` Chris Wedgwood
2003-02-25 5:26 ` William Lee Irwin III
` (3 more replies)
0 siblings, 4 replies; 157+ messages in thread
From: Chris Wedgwood @ 2003-02-25 5:19 UTC (permalink / raw)
To: Alan Cox; +Cc: Larry McVoy, Martin J. Bligh, Linux Kernel Mailing List
On Mon, Feb 24, 2003 at 01:28:30PM +0000, Alan Cox wrote:
> _If_ it harms performance on small boxes.
You mean like the general slowdown from 2.4 -> 2.5?
It seems to me that for small boxes, 2.5.x is marginally slower at most
things than 2.4.x.
I'm hoping that as the code solidifies and things are tuned this gap will
go away and 2.5.x will inch ahead... hoping....
> The definitive Linux box appears to be $199 from Walmart right now,
> and its not SMP.
In two years this kind of hardware will probably be SMP (HT or some
variant).
--cw
^ permalink raw reply [flat|nested] 157+ messages in thread* Re: Minutes from Feb 21 LSE Call
2003-02-25 5:19 ` Chris Wedgwood
@ 2003-02-25 5:26 ` William Lee Irwin III
2003-02-25 21:21 ` Chris Wedgwood
2003-02-25 6:17 ` Martin J. Bligh
` (2 subsequent siblings)
3 siblings, 1 reply; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-25 5:26 UTC (permalink / raw)
To: Chris Wedgwood
Cc: Alan Cox, Larry McVoy, Martin J. Bligh, Linux Kernel Mailing List
On Mon, Feb 24, 2003 at 01:28:30PM +0000, Alan Cox wrote:
>> _If_ it harms performance on small boxes.
On Mon, Feb 24, 2003 at 09:19:56PM -0800, Chris Wedgwood wrote:
> You mean like the general slowdown from 2.4 -> 2.5?
> It seems to me that for small boxes, 2.5.x is marginally slower at most
> things than 2.4.x.
> I'm hoping that as the code solidifies and things are tuned this gap will
> go away and 2.5.x will inch ahead... hoping....
Could you help identify the regressions? Profiles? Workload?
On Mon, Feb 24, 2003 at 01:28:30PM +0000, Alan Cox wrote:
>> The definitive Linux box appears to be $199 from Walmart right now,
>> and its not SMP.
On Mon, Feb 24, 2003 at 09:19:56PM -0800, Chris Wedgwood wrote:
> In two years this kind of hardware will probably be SMP (HT or some
I'm a programmer, not an economist (despite utility functions and Nash
equilibria). Don't tell me what's definitive, give me some profiles.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread* Re: Minutes from Feb 21 LSE Call
2003-02-25 5:26 ` William Lee Irwin III
@ 2003-02-25 21:21 ` Chris Wedgwood
2003-02-25 21:14 ` Martin J. Bligh
2003-02-25 21:21 ` William Lee Irwin III
0 siblings, 2 replies; 157+ messages in thread
From: Chris Wedgwood @ 2003-02-25 21:21 UTC (permalink / raw)
To: William Lee Irwin III, Alan Cox, Larry McVoy, Martin J. Bligh,
Linux Kernel Mailing List
On Mon, Feb 24, 2003 at 09:26:02PM -0800, William Lee Irwin III wrote:
> Could you help identify the regressions? Profiles? Workload?
Is the OSDL data that Cliff White pointed out sufficient to work with,
or do you want specific tests run with oprofile outputs?
--cw
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 21:21 ` Chris Wedgwood
@ 2003-02-25 21:14 ` Martin J. Bligh
2003-02-25 21:21 ` William Lee Irwin III
1 sibling, 0 replies; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 21:14 UTC (permalink / raw)
To: Chris Wedgwood, William Lee Irwin III, Linux Kernel Mailing List
>> Could you help identify the regressions? Profiles? Workload?
>
> Is the OSDL data that Cliff White pointed out sufficient to work with,
> or do you want specific tests run with oprofile outputs?
It's a great start, but profiles would really help if you can grab them.
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 21:21 ` Chris Wedgwood
2003-02-25 21:14 ` Martin J. Bligh
@ 2003-02-25 21:21 ` William Lee Irwin III
2003-02-25 22:08 ` Larry McVoy
1 sibling, 1 reply; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-25 21:21 UTC (permalink / raw)
To: Chris Wedgwood
Cc: Alan Cox, Larry McVoy, Martin J. Bligh, Linux Kernel Mailing List
On Mon, Feb 24, 2003 at 09:26:02PM -0800, William Lee Irwin III wrote:
>> Could you help identify the regressions? Profiles? Workload?
On Tue, Feb 25, 2003 at 01:21:15PM -0800, Chris Wedgwood wrote:
> Is the OSDL data that Cliff White pointed out sufficient to work with,
> or do you want specific tests run with oprofile outputs?
oprofile is what's needed. Looks like he's taking care of that too.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 21:21 ` William Lee Irwin III
@ 2003-02-25 22:08 ` Larry McVoy
2003-02-25 22:10 ` William Lee Irwin III
2003-02-25 22:37 ` Chris Wedgwood
0 siblings, 2 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 22:08 UTC (permalink / raw)
To: William Lee Irwin III, Chris Wedgwood, Alan Cox, Larry McVoy,
Martin J. Bligh, Linux Kernel Mailing List
On Tue, Feb 25, 2003 at 01:21:34PM -0800, William Lee Irwin III wrote:
> On Mon, Feb 24, 2003 at 09:26:02PM -0800, William Lee Irwin III wrote:
> >> Could you help identify the regressions? Profiles? Workload?
>
> On Tue, Feb 25, 2003 at 01:21:15PM -0800, Chris Wedgwood wrote:
> > Is the OSDL data that Cliff White pointed out sufficient to work with,
> > or do you want specific tests run with oprofile outputs?
>
> oprofile is what's needed. Looks like he's taking care of that too.
Without doing something about the page coloring problem (and he might be)
the numbers will be fairly meaningless.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 22:08 ` Larry McVoy
@ 2003-02-25 22:10 ` William Lee Irwin III
2003-02-25 22:37 ` Chris Wedgwood
1 sibling, 0 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-25 22:10 UTC (permalink / raw)
To: Larry McVoy, Chris Wedgwood, Alan Cox, Larry McVoy,
Martin J. Bligh, Linux Kernel Mailing List
On Tue, Feb 25, 2003 at 01:21:34PM -0800, William Lee Irwin III wrote:
>> oprofile is what's needed. Looks like he's taking care of that too.
On Tue, Feb 25, 2003 at 02:08:11PM -0800, Larry McVoy wrote:
> Without doing something about the page coloring problem (and he might be)
> the numbers will be fairly meaningless.
Hmm, point. Let's see if we can get Cliff to apply the new patch that
one guy put out yesterday or so.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 22:08 ` Larry McVoy
2003-02-25 22:10 ` William Lee Irwin III
@ 2003-02-25 22:37 ` Chris Wedgwood
2003-02-25 22:58 ` Larry McVoy
1 sibling, 1 reply; 157+ messages in thread
From: Chris Wedgwood @ 2003-02-25 22:37 UTC (permalink / raw)
To: Larry McVoy, William Lee Irwin III, Alan Cox, Larry McVoy,
Martin J. Bligh, Linux Kernel Mailing List
On Tue, Feb 25, 2003 at 02:08:11PM -0800, Larry McVoy wrote:
> Without doing something about the page coloring problem (and he
> might be) the numbers will be fairly meaningless.
page coloring problem?
i was under the impression on anything 8-way-associative or better the
page coloring improvements were negligible for real-world benchmarks
(ie. kernel compiles)
... or is this more an artifact that even though the improvements for
real-world are negligible, micro-benchmarks are susceptible to these
variations, thus making things like the std. dev. larger than it would
otherwise be?
--cw
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 22:37 ` Chris Wedgwood
@ 2003-02-25 22:58 ` Larry McVoy
0 siblings, 0 replies; 157+ messages in thread
From: Larry McVoy @ 2003-02-25 22:58 UTC (permalink / raw)
To: Chris Wedgwood
Cc: Larry McVoy, William Lee Irwin III, Alan Cox, Martin J. Bligh,
Linux Kernel Mailing List
> ... or is this more an artifact that even though the improvements for
> real-world are negligible, micro-benchmarks are susceptible to these
> variations, thus making things like the std. dev. larger than it would
> otherwise be?
Bingo. If you are trying to measure whether something adds cache misses
you really want reproducible runs.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 5:19 ` Chris Wedgwood
2003-02-25 5:26 ` William Lee Irwin III
@ 2003-02-25 6:17 ` Martin J. Bligh
2003-02-25 17:11 ` Cliff White
2003-02-25 21:28 ` William Lee Irwin III
2003-02-25 19:20 ` Alan Cox
2003-02-25 19:59 ` Scott Robert Ladd
3 siblings, 2 replies; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 6:17 UTC (permalink / raw)
To: Chris Wedgwood, Alan Cox; +Cc: Larry McVoy, Linux Kernel Mailing List
>> _If_ it harms performance on small boxes.
>
> You mean like the general slowdown from 2.4 -> 2.5?
>
> It seems to me that for small boxes, 2.5.x is marginally slower at most
> things than 2.4.x.
Can you name a benchmark, or at least do something reproducible between
versions, and produce a 2.4 vs 2.5 profile? Let's at least try to fix it ...
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 6:17 ` Martin J. Bligh
@ 2003-02-25 17:11 ` Cliff White
2003-02-25 17:17 ` William Lee Irwin III
` (2 more replies)
2003-02-25 21:28 ` William Lee Irwin III
1 sibling, 3 replies; 157+ messages in thread
From: Cliff White @ 2003-02-25 17:11 UTC (permalink / raw)
To: Martin J. Bligh
Cc: Chris Wedgwood, Alan Cox, Larry McVoy, Linux Kernel Mailing List,
cliffw
> >> _If_ it harms performance on small boxes.
> >
> > You mean like the general slowdown from 2.4 -> 2.5?
> >
> > It seems to me that for small boxes, 2.5.x is marginally slower at most
> > things than 2.4.x.
>
> Can you name a benchmark, or at least do something reproducible between
> versions, and produce a 2.4 vs 2.5 profile? Let's at least try to fix it ...
>
> M.
Well, here's one bit of data. Easy enough to do if you have a web browser.
LMBench 2.0 on 1-way and 2-way, kernels 2.4.18 and 2.5.60
1-way (stp1-003 stp1-002)
2.4.18 http://khack.osdl.org/stp/7443/
2.5.60 http://khack.osdl.org/stp/265622/
2-way (stp2-003 stp2-000)
2.4.18 http://khack.osdl.org/stp/3165/
2.5.60 http://khack.osdl.org/stp/265643/
Interesting items for me are the fork/exec/sh times and some of the file + VM
numbers
LMBench 2.0 Data ( items selected from total of five runs )
Processor, Processes - times in microseconds - smaller is better
----------------------------------------------------------------
Host OS Mhz null null open selct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
stp2-003. Linux 2.4.18 1000 0.39 0.67 3.89 4.99 30.4 0.93 3.06 344. 1403 4465
stp2-000. Linux 2.5.60 1000 0.41 0.77 4.34 5.57 32.6 1.15 3.59 245. 1406 5795
stp1-003. Linux 2.4.18 1000 0.32 0.46 2.60 3.21 16.6 0.79 2.52 104. 918. 4460
stp1-002. Linux 2.5.60 1000 0.33 0.47 2.83 3.47 16.0 0.94 2.70 143. 1212 5292
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------
Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
--------- ------------- ----- ------ ------ ------ ------ ------- -------
stp2-003. Linux 2.4.18 2.680 6.2100 15.8 7.9400 110.7 26.4 111.1
stp2-000. Linux 2.5.60 1.590 5.0700 17.6 7.5800 79.8 11.0 113.6
stp1-003. Linux 2.4.18 0.590 3.4700 11.1 4.8200 134.3 30.8 131.7
stp1-002. Linux 2.5.60 1.000 3.5400 11.2 4.1400 129.6 30.4 127.8
*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host OS 2p/0K Pipe AF UDP RPC/ TCP RPC/ TCP
ctxsw UNIX UDP TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
stp2-003. Linux 2.4.18 2.680 9.071 17.5 26.9 46.2 34.4 60.0 62.9
stp2-000. Linux 2.5.60 1.590 8.414 13.2 21.2 43.2 28.3 54.1 97.1
stp1-003. Linux 2.4.18 0.590 3.623 6.98 11.7 28.2 17.8 38.4 300K
stp1-002. Linux 2.5.60 1.050 4.591 8.54 14.8 31.8 20.0 41.0 67.1
File & VM system latencies in microseconds - smaller is better
--------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page
Create Delete Create Delete Latency Fault Fault
--------- ------------- ------ ------ ------ ------ ------- ----- -----
stp2-003. Linux 2.4.18 34.6 7.2490 110.9 17.9 2642.0 0.771 3.00000
stp2-000. Linux 2.5.60 40.0 9.2780 113.3 23.3 4592.0 0.543 3.00000
stp1-003. Linux 2.4.18 28.8 4.8890 107.5 11.3 686.0 0.621 2.00000
stp1-002. Linux 2.5.60 32.4 6.4290 112.9 16.2 1455.0 0.465 2.00000
*Local* Communication bandwidths in MB/s - bigger is better
-----------------------------------------------------------
Host OS Pipe AF TCP File Mmap Bcopy Bcopy Mem Mem
UNIX reread reread (libc) (hand) read write
--------- ------------- ---- ---- ---- ------ ------ ------ ------ ---- -----
stp2-003. Linux 2.4.18 563. 277. 263. 437.0 552.8 249.1 180.7 553. 215.2
stp2-000. Linux 2.5.60 603. 516. 151. 436.3 549.0 238.0 171.9 548. 233.7
stp1-003. Linux 2.4.18 1009 820. 404. 414.3 467.0 167.2 154.1 466. 236.2
stp1-002. Linux 2.5.60 806. 584. 69.1 408.0 461.7 161.1 149.1 461. 233.5
Memory latencies in nanoseconds - smaller is better
(WARNING - may not be correct, check graphs)
---------------------------------------------------
Host OS Mhz L1 $ L2 $ Main mem Guesses
--------- ------------- ---- ----- ------ -------- -------
stp2-003. Linux 2.4.18 1000 3.464 8.0820 110.9
stp2-000. Linux 2.5.60 1000 3.545 8.2790 110.6
stp1-003. Linux 2.4.18 1000 2.994 6.9850 121.4
stp1-002. Linux 2.5.60 1000 3.023 7.0530 122.5
------------------
cliffw
^ permalink raw reply [flat|nested] 157+ messages in thread* Re: Minutes from Feb 21 LSE Call
2003-02-25 17:11 ` Cliff White
@ 2003-02-25 17:17 ` William Lee Irwin III
2003-02-25 17:38 ` Linus Torvalds
2003-02-25 19:48 ` Martin J. Bligh
2 siblings, 0 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-25 17:17 UTC (permalink / raw)
To: Cliff White
Cc: Martin J. Bligh, Chris Wedgwood, Alan Cox, Larry McVoy,
Linux Kernel Mailing List
On Tue, Feb 25, 2003 at 09:11:38AM -0800, Cliff White wrote:
> Interesting items for me are the fork/exec/sh times and some of the file + VM
> numbers
> LMBench 2.0 Data ( items selected from total of five runs )
Okay, got profiles for the individual tests you're interested in?
Also, what are the statistical significance cutoffs?
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 17:11 ` Cliff White
2003-02-25 17:17 ` William Lee Irwin III
@ 2003-02-25 17:38 ` Linus Torvalds
2003-02-25 19:54 ` Dave Jones
2003-02-25 19:48 ` Martin J. Bligh
2 siblings, 1 reply; 157+ messages in thread
From: Linus Torvalds @ 2003-02-25 17:38 UTC (permalink / raw)
To: linux-kernel
In article <200302251711.h1PHBct16624@mail.osdl.org>,
Cliff White <cliffw@osdl.org> wrote:
>
>Well, here's one bit of data. Easy enough to do if you have a web browser.
>LMBench 2.0 on 1-way and 2-way, kernels 2.4.18 and 2.5.60
>1-way (stp1-003 stp1-002)
>2.4.18 http://khack.osdl.org/stp/7443/
>2.5.60 http://khack.osdl.org/stp/265622/
>
>2-way (stp2-003 stp2-000)
>2.4.18 http://khack.osdl.org/stp/3165/
>2.5.60 http://khack.osdl.org/stp/265643/
>
>Interesting items for me are the fork/exec/sh times and some of the file + VM
>numbers
>LMBench 2.0 Data ( items selected from total of five runs )
>
>Processor, Processes - times in microseconds - smaller is better
>----------------------------------------------------------------
>Host OS Mhz null null open selct sig sig fork exec sh
> call I/O stat clos TCP inst hndl proc proc proc
>--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
>stp2-003. Linux 2.4.18 1000 0.39 0.67 3.89 4.99 30.4 0.93 3.06 344. 1403 4465
>stp2-000. Linux 2.5.60 1000 0.41 0.77 4.34 5.57 32.6 1.15 3.59 245. 1406 5795
Note that those numbers will look quite different (at least on a P4) if
you use a modern library that uses the "sysenter" stuff. The difference
ends up being something like this:
Host OS Mhz null null open selct sig sig fork exec sh
call I/O stat clos inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ----- ---- ---- ---- ---- ----
i686-linu Linux 2.5.30 2380 0.8 1.1 3 5 0.04K 1.1 3 0.2K 1K 3K
i686-linu Linux 2.5.62 2380 0.2 0.6 3 4 0.04K 0.7 3 0.2K 1K 3K
(Yeah, I've never run a 2.4.x kernel on this machine, so..) In other
words, the system call has been speeded up quite noticeably.
Yes, if you don't take advantage of sysenter, then all the sysenter
support will just make us look worse ;(
I'm surprised by your "sh proc" changes; they are quite big. I guess
it's rmap and highmem that bite us, and yes, we've gotten slower there.
Linus
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 17:38 ` Linus Torvalds
@ 2003-02-25 19:54 ` Dave Jones
2003-02-26 2:04 ` Linus Torvalds
0 siblings, 1 reply; 157+ messages in thread
From: Dave Jones @ 2003-02-25 19:54 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
On Tue, Feb 25, 2003 at 05:38:31PM +0000, Linus Torvalds wrote:
> Yes, if you don't take advantage of sysenter, then all the sysenter
> support will just make us look worse ;(
Andi's patch[1] to remove one of the wrmsr's from the context switch
fast path should win back at least some of the lost microbenchmark
points. (Full info at http://bugzilla.kernel.org/show_bug.cgi?id=350)
Dave
[1] http://bugzilla.kernel.org/attachment.cgi?id=140&action=view
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 19:54 ` Dave Jones
@ 2003-02-26 2:04 ` Linus Torvalds
0 siblings, 0 replies; 157+ messages in thread
From: Linus Torvalds @ 2003-02-26 2:04 UTC (permalink / raw)
To: Dave Jones; +Cc: linux-kernel
On Tue, 25 Feb 2003, Dave Jones wrote:
>
> > Yes, if you don't take advantage of sysenter, then all the sysenter
> > support will just make us look worse ;(
>
> Andi's patch[1] to remove one of the wrmsr's from the context switch
> fast path should win back at least some of the lost microbenchmark
> points.
But the patch is fundamentally broken wrt preemption at least, and it
looks totally unfixable.
It's also overly complex, for no apparent reason. The simple way to avoid
the wrmsr of SYSENTER_CS is to just cache a per-cpu copy in memory,
preferably in some location that is already in the cache at context switch
time for other reasons.
Linus
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 17:11 ` Cliff White
2003-02-25 17:17 ` William Lee Irwin III
2003-02-25 17:38 ` Linus Torvalds
@ 2003-02-25 19:48 ` Martin J. Bligh
2 siblings, 0 replies; 157+ messages in thread
From: Martin J. Bligh @ 2003-02-25 19:48 UTC (permalink / raw)
To: Cliff White
Cc: Chris Wedgwood, Alan Cox, Larry McVoy, Linux Kernel Mailing List
> Interesting items for me are the fork/exec/sh times and some of the file
> + VM numbers
For the ones where you see degradation in fork/exec type stuff, any chance
you could rerun them with 62-mjb3 with the objrmap stuff in it? That should
fix a lot of the overhead.
Thanks,
M.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 6:17 ` Martin J. Bligh
2003-02-25 17:11 ` Cliff White
@ 2003-02-25 21:28 ` William Lee Irwin III
1 sibling, 0 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-25 21:28 UTC (permalink / raw)
To: Martin J. Bligh
Cc: Chris Wedgwood, Alan Cox, Larry McVoy, Linux Kernel Mailing List
At some point in the past, Chris Wedgwood wrote:
>> It seems to me for small boxes, 2.5.x is marginally slower at most
>> things than 2.4.x.
On Mon, Feb 24, 2003 at 10:17:05PM -0800, Martin J. Bligh wrote:
> Can you name a benchmark, or at least do something reproducible between
> versions, and produce a 2.4 vs 2.5 profile? Let's at least try to fix it ...
Looks like Cliff's got some good data.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 5:19 ` Chris Wedgwood
2003-02-25 5:26 ` William Lee Irwin III
2003-02-25 6:17 ` Martin J. Bligh
@ 2003-02-25 19:20 ` Alan Cox
2003-02-25 19:59 ` Scott Robert Ladd
3 siblings, 0 replies; 157+ messages in thread
From: Alan Cox @ 2003-02-25 19:20 UTC (permalink / raw)
To: Chris Wedgwood; +Cc: Larry McVoy, Martin J. Bligh, Linux Kernel Mailing List
On Tue, 2003-02-25 at 05:19, Chris Wedgwood wrote:
> > The definitive Linux box appears to be $199 from Walmart right now,
> > and it's not SMP.
>
> In two years this kind of hardware probably will be SMP (HT or some
> variant).
Not if it costs money. If the cheapest reasonable x86 CPU is one that has
chosen to avoid HT and SMP, it won't have HT and SMP. Think 4xUSB2
connectors, a brick PSU, and no user-adjustable components.
^ permalink raw reply [flat|nested] 157+ messages in thread
* RE: Minutes from Feb 21 LSE Call
2003-02-25 5:19 ` Chris Wedgwood
` (2 preceding siblings ...)
2003-02-25 19:20 ` Alan Cox
@ 2003-02-25 19:59 ` Scott Robert Ladd
2003-02-25 20:18 ` jlnance
2003-02-25 21:19 ` Chris Wedgwood
3 siblings, 2 replies; 157+ messages in thread
From: Scott Robert Ladd @ 2003-02-25 19:59 UTC (permalink / raw)
To: Chris Wedgwood, Alan Cox
Cc: Larry McVoy, Martin J. Bligh, Linux Kernel Mailing List
Chris Wedgwood wrote:
> > The definitive Linux box appears to be $199 from Walmart right now,
> > and it's not SMP.
>
> In two years this kind of hardware probably will be SMP (HT or some
> variant).
HT is not the same thing as SMP; while the chip may appear to be two
processors, it is actually equivalent to 1.1 to 1.3 processors, depending
on the application.
Multicore processors and true SMP systems are unlikely to become mainstream
consumer items, given the premium price charged for such systems.
That given, I see some value in a stripped-down, low-overhead,
consumer-focused Linux that targets uniprocessor and HT systems, to be used
in the typical business or gaming PC. I'm not sure such is achievable with
the current config options; perhaps I should try to see how small a kernel I
can build for a simple ia32 system...
..Scott
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
Professional programming for science and engineering;
Interesting and unusual bits of very free code.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 19:59 ` Scott Robert Ladd
@ 2003-02-25 20:18 ` jlnance
2003-02-25 20:59 ` Scott Robert Ladd
2003-02-25 21:19 ` Chris Wedgwood
1 sibling, 1 reply; 157+ messages in thread
From: jlnance @ 2003-02-25 20:18 UTC (permalink / raw)
To: linux-kernel
On Tue, Feb 25, 2003 at 02:59:05PM -0500, Scott Robert Ladd wrote:
> > In two years this kind of hardware probably will be SMP (HT or some
> > variant).
>
> HT is not the same thing as SMP; while the chip may appear to be two
> processors, it is actually equivalent to 1.1 to 1.3 processors, depending
> on the application.
>
> Multicore processors and true SMP systems are unlikely to become mainstream
> consumer items, given the premium price charged for such systems.
I think the difference between SMP and HT is likely to decrease rather
than increase in the future. Even now people want to put multiple CPUs
on the same piece of silicon. Once you do that, it only makes sense to
start sharing things between them. If you had a system with 2 CPUs
which shared a common L1 cache, is that going to be an HT or an SMP system?
Or you could go further and have 2 CPUs which share an FPU. There are
all sorts of combinations you could come up with. I think designers
will experiment and find the one that gives the most throughput for
the least money.
Jim
^ permalink raw reply [flat|nested] 157+ messages in thread
* RE: Minutes from Feb 21 LSE Call
2003-02-25 20:18 ` jlnance
@ 2003-02-25 20:59 ` Scott Robert Ladd
0 siblings, 0 replies; 157+ messages in thread
From: Scott Robert Ladd @ 2003-02-25 20:59 UTC (permalink / raw)
To: jlnance, linux-kernel
jlnance@unity.ncsu.edu wrote:
> I think the difference between SMP and HT is likely to decrease rather
> than increase in the future. Even now people want to put multiple CPUs
> on the same piece of silicon. Once you do that, it only makes sense to
> start sharing things between them. If you had a system with 2 CPUs
> which shared a common L1 cache, is that going to be an HT or an SMP system?
> Or you could go further and have 2 CPUs which share an FPU. There are
> all sorts of combinations you could come up with. I think designers
> will experiment and find the one that gives the most throughput for
> the least money.
IBM's forthcoming Power5 will have two cores, each with SMT (the generic
term for HyperThreading); it will present itself to the OS as four
processors. Those four processors, however, are not equal; SMT is certainly
valuable, but it can only be as effective as multiple cores if it in effect
*becomes* multiple cores (and, as such, turns into SMP).
I'm writing a chapter on memory architectures in my parallel programming
book; it's giving me a bit of a headache, as the issues you raise are both
important and complex. We have multiple levels of caches, NUMA
architectures, clusters, SMP, HT... the list just goes on and on, infinite
in diversity and combinations. Vendors will continue to experiment; I doubt
very much that any one architecture will take center stage.
I hope Linux handles the brain-sprain better than I am at the moment! ;)
..Scott
Scott Robert Ladd
Coyote Gulch Productions (http://www.coyotegulch.com)
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-25 19:59 ` Scott Robert Ladd
2003-02-25 20:18 ` jlnance
@ 2003-02-25 21:19 ` Chris Wedgwood
2003-02-25 21:38 ` Scott Robert Ladd
1 sibling, 1 reply; 157+ messages in thread
From: Chris Wedgwood @ 2003-02-25 21:19 UTC (permalink / raw)
To: Scott Robert Ladd
Cc: Alan Cox, Larry McVoy, Martin J. Bligh, Linux Kernel Mailing List
On Tue, Feb 25, 2003 at 02:59:05PM -0500, Scott Robert Ladd wrote:
> HT is not the same thing as SMP; while the chip may appear to be two
> processors, it is actually equivalent 1.1 to 1.3 processors,
> depending on the application.
You can't have non-integer numbers of processors. HT is a hack that
makes what appear to be two processors out of common silicon.
The fact it's slower than a real dual-CPU box is irrelevant in some
sense; you still need SMP smarts to deal with it. It's only important
when you want to know why performance increases aren't apparent or you
lose performance in some cases... (i.e. the other virtual CPU thrashing
the cache).
> Multicore processors and true SMP systems are unlikely to become
> mainstream consumer items, given the premium price charged for such
> systems.
I overstated things in thinking SMP/HT would be in low-end hardware
within two years.
As Alan pointed out, since the 'Walmart' class hardware is 'whatever
is cheapest' then perhaps HT/SMT/whatever won't be common place for
super-low end boxes in two years --- but I would be surprised if it
didn't gain considerable market share elsewhere.
> That given, I see some value in a stripped-down, low-overhead,
> consumer-focused Linux that targets uniprocessor and HT systems, to
> be used in the typical business or gaming PC.
UP != HT
HT is SMP with magic requirements. For multiple physical CPUs the
requirements become even more complex; you want to try to group tasks
to physical CPUs, not logical ones, lest you thrash the cache.
Presumably there are other tweaks possible too: cache lines don't
bounce between logical CPUs on a physical CPU, for example, so some locks
and other data structures will be much faster to access than those
which actually do need cache lines to migrate between different
physical CPUs. I'm not sure if this specific property can be
exploited in the general case though.
> I'm not sure such is achievable with the current config options;
> perhaps I should try to see how small a kernel I can build for a
> simple ia32 system...
Present 2.5.x looks like it will have smarts for HT as a subset of
NUMA.
If HT does become more common and similar things abound, I'm not sure
if it even makes sense to have a UP kernel for certain platforms
and/or CPUs --- since a mere BIOS change will affect what is
'virtually' apparent to the OS.
--cw
^ permalink raw reply [flat|nested] 157+ messages in thread
* RE: Minutes from Feb 21 LSE Call
2003-02-25 21:19 ` Chris Wedgwood
@ 2003-02-25 21:38 ` Scott Robert Ladd
0 siblings, 0 replies; 157+ messages in thread
From: Scott Robert Ladd @ 2003-02-25 21:38 UTC (permalink / raw)
To: Chris Wedgwood
Cc: Alan Cox, Larry McVoy, Martin J. Bligh, Linux Kernel Mailing List
Chris Wedgwood wrote:
SRL>HT is not the same thing as SMP; while the chip may appear to be
SRL>two processors, it is actually equivalent 1.1 to 1.3 processors,
SRL>depending on the application.
>
CW> You can't have non-integer numbers of processors. HT is a hack
CW> that makes what appear to be two processors out of common
CW> silicon.
I'm aware of that. ;) I'm well aware of the architecture needed to support
HT.
> The fact it's slower than a real dual-CPU box is irrelevant in some
> sense; you still need SMP smarts to deal with it. It's only important
> when you want to know why performance increases aren't apparent or you
> lose performance in some cases... (i.e. the other virtual CPU thrashing
> the cache).
Performance differences *are* quite relevant when it comes to thread
scheduling; the two virtual CPUs are not necessarily equivalent in
performance.
> As Alan pointed out, since the 'Walmart' class hardware is 'whatever
> is cheapest' then perhaps HT/SMT/whatever won't be common place for
> super-low end boxes in two years --- but I would be surprised if it
> didn't gain considerable market share elsewhere.
I suspect HT/SMT will be common for people who have multimedia systems,
for video editing and high-end gaming.
I doubt we'll see SMT toasters, though.
> UP != HT
An HT system is still a single, physical processor; HT is not equivalent to
a multicore chip, either. Much depends on memory and connection models; a
dual-core chip may be faster or slower than two similar physical SMP
processors, depending on the architecture.
I was speaking in terms of Intel's push to add HT to all of their P4s.
Systems with a single CPU will likely have HT; that still doesn't make them
as powerful as a true dual processor (or dual core CPU) system.
> HT is SMP with magic requirements. For multiple physical CPUs the
> requirements become even more complex; you want to try to group tasks
> to physical CPUs, not logical ones lest you thrash the cache.
Exactly. This is why HT is not the same thing as two physical CPUs. The OS
must be aware of this to effectively schedule jobs. So I think we generally
agree.
> If HT does become more common and similar things abound, I'm not sure
> if it even makes sense to have a UP kernel for certain platforms
> and/or CPUs --- since a mere BIOS change will affect what is
> 'virtually' apparent to the OS.
A good point.
..Scott
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 6:58 ` Larry McVoy
` (2 preceding siblings ...)
2003-02-24 13:28 ` Alan Cox
@ 2003-02-24 18:44 ` Davide Libenzi
3 siblings, 0 replies; 157+ messages in thread
From: Davide Libenzi @ 2003-02-24 18:44 UTC (permalink / raw)
To: Larry McVoy; +Cc: Linux Kernel Mailing List
On Sun, 23 Feb 2003, Larry McVoy wrote:
> > Because I don't see why I should waste my time running benchmarks just to
> > prove you wrong. I don't respect you that much, and it seems the
> > maintainers don't either. When you become somebody with the stature in the
> > Linux community of, say, Linus or Andrew I'd be prepared to spend a lot
> > more time running benchmarks on any concerns you might have.
>
> Who cares if you respect me, what does that have to do with proper
> engineering? Do you think that I'm the only person who wants to see
> numbers? You think Linus doesn't care about this? Maybe you missed
> the whole IA32 vs IA64 instruction cache thread. It sure sounded like
> he cares. How about Alan? He stepped up and pointed out that less
> is more. How about Mark? He knows a thing or two about the topic.
> In fact, I think you'd be hard pressed to find anyone who wouldn't be
> interested in seeing the cache effects of a patch.
>
> People care about performance, both scaling up and scaling down. A lot of
> performance changes are measured poorly, in a way that makes the changes
> look good but doesn't expose the hidden costs of the change. What I'm
> saying is that those sorts of measurements screwed over performance in
> the past, why are you trying to repeat old mistakes?
Larry, how many times has this kind of discussion come up during the last
few years? I think you remember pretty well, because it was always you
on that side of the river pushing back "Barbarians" with your UP sword.
The point is that people ( especially young ones ) like to dig where others
failed; it's normal. It's attractive like honey for bears. Let them try;
many will fail, but chances are that someone will succeed, making it
worth the try. And trust Linus, who is more on your wavelength than on
the huge-scalability one.
- Davide
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-22 23:15 ` Larry McVoy
2003-02-22 23:23 ` Christoph Hellwig
2003-02-22 23:44 ` Martin J. Bligh
@ 2003-02-22 23:57 ` Jeff Garzik
2003-02-23 23:57 ` Bill Davidsen
3 siblings, 0 replies; 157+ messages in thread
From: Jeff Garzik @ 2003-02-22 23:57 UTC (permalink / raw)
To: Larry McVoy, Martin J. Bligh, Larry McVoy, Mark Hahn,
David S. Miller, linux-kernel
On Sat, Feb 22, 2003 at 03:15:52PM -0800, Larry McVoy wrote:
> or rackmount cases. I fail to see how there are better margins on the
> same hardware in a rackmount box for $800 when the desktop costs $750.
> Those rack mount power supplies and cases are not as cheap as the desktop
> ones, so I see no difference in the margins.
Oh, it's definitely different hardware. Maybe the 16550-related portion
of the ASIC is the same :) but just do an lspci to see huge differences in
motherboard chipsets, on-board parts, more complicated BIOS, remote
management bells and whistles, etc. Even the low-end rackmounts.
But the better margins come simply from the mentality, IMO. Desktops
just aren't "as important" to a business compared to servers, so IT
shops are willing to spend more money to not only get better hardware,
but also the support services that accompany it. Selling servers
to enterprise data centers means bigger, more concentrated cash pool.
Jeff
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-22 23:15 ` Larry McVoy
` (2 preceding siblings ...)
2003-02-22 23:57 ` Jeff Garzik
@ 2003-02-23 23:57 ` Bill Davidsen
2003-02-24 6:22 ` Val Henson
3 siblings, 1 reply; 157+ messages in thread
From: Bill Davidsen @ 2003-02-23 23:57 UTC (permalink / raw)
To: Larry McVoy; +Cc: Linux Kernel Mailing List
On Sat, 22 Feb 2003, Larry McVoy wrote:
> > We would never try to propose such a change, and never have.
> > Name a scalability change that's hurt the performance of UP by 5%.
> > There isn't one.
>
> This is *exactly* the reasoning that every OS marketing weenie has used
> for the last 20 years to justify their "feature" of the week.
>
> The road to slow bloated code is paved one cache miss at a time. You
> may quote me on that. In fact, print it out and put it above your
> monitor and look at it every day. One cache miss at a time. How much
> does one cache miss add to any benchmark? .001%? Less.
>
> But your pet features didn't slow the system down. Nope, they just made
> the cache smaller, which you didn't notice because whatever artificial
> benchmark you ran didn't happen to need the whole cache.
Clearly this is the case: the benefit of a change must balance the
negative effects. Making the code paths longer hurts free cache; having
more of them should not. More code is not always slower code, and doesn't
always have more impact on cache use. You identify something which must be
considered, but it's not the only thing to consider. Linux should be
stable, not moribund.
> You need to understand that system resources belong to the user. Not the
> kernel. The goal is to have all of the kernel code running under any
> load be less than 1% of the CPU. Your 5% number up there would pretty
> much double the amount of time we spend in the kernel for most workloads.
Who profits? For most users a bit more system time resulting in better
disk performance would be a win, or at least non-lose. This isn't black
and white.
On Sat, 22 Feb 2003, Larry McVoy wrote:
> Let's get back to your position. You want to shovel stuff in the kernel
> for the benefit of the 32 way / 64 way etc boxes. I don't see that as
> wise. You could prove me wrong. Here's how you do it: go get oprofile
> or whatever that tool is which lets you run apps and count cache misses.
> Start including before/after runs of each microbench in lmbench and
> some time sharing loads with and without your changes. When you can do
> that and you don't add any more bus traffic, you're a genius and
> I'll shut up.
Code only costs when it's executed. Linux is somewhat heading to the place
where a distro has a few useful configs, and then people who care about the
last bit of whatever they see as a bottleneck can build their own from
"make config." So it is possible to add features for big machines without
any impact on the builds which don't use the features. It goes without
saying that this is hard. I would guess that it results in more bugs as
well, if one path or another is "the less-traveled way."
>
> But that's a false promise because by definition, fine grained threading
> adds more bus traffic. It's kind of hard to not have that happen, the
> caches have to stay coherent somehow.
Clearly. And things which require more locking will pay some penalty for
this. But a quick scan of this list on keyword "lockless" will show that
people are thinking about this.
I don't think developers will buy ignoring part of the market to
completely optimize for another. Linux will grow by being ubiquitous, not
by winning some battle and losing the war. It's not a niche-market OS.
--
bill davidsen <davidsen@tmr.com>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-23 23:57 ` Bill Davidsen
@ 2003-02-24 6:22 ` Val Henson
2003-02-24 6:41 ` William Lee Irwin III
0 siblings, 1 reply; 157+ messages in thread
From: Val Henson @ 2003-02-24 6:22 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Larry McVoy, Linux Kernel Mailing List
On Sun, Feb 23, 2003 at 06:57:09PM -0500, Bill Davidsen wrote:
> On Sat, 22 Feb 2003, Larry McVoy wrote:
> >
> > But that's a false promise because by definition, fine grained threading
> > adds more bus traffic. It's kind of hard to not have that happen, the
> > caches have to stay coherent somehow.
>
> Clearly. And things which require more locking will pay some penalty for
> this. But a quick scan of this list on keyword "lockless" will show that
> people are thinking about this.
Lockless algorithms still generate bus traffic when you do the atomic
compare-and-swap or load-linked or whatever hardware instruction you
use to implement your lockless algorithm. Caches still have to stay
coherent, lock or no lock.
-VAL
^ permalink raw reply [flat|nested] 157+ messages in thread
* Re: Minutes from Feb 21 LSE Call
2003-02-24 6:22 ` Val Henson
@ 2003-02-24 6:41 ` William Lee Irwin III
0 siblings, 0 replies; 157+ messages in thread
From: William Lee Irwin III @ 2003-02-24 6:41 UTC (permalink / raw)
To: Val Henson; +Cc: Bill Davidsen, Larry McVoy, Linux Kernel Mailing List
On Sun, Feb 23, 2003 at 06:57:09PM -0500, Bill Davidsen wrote:
>> Clearly. And things which require more locking will pay some penalty for
>> this. But a quick scan of this list on keyword "lockless" will show that
>> people are thinking about this.
On Sun, Feb 23, 2003 at 11:22:30PM -0700, Val Henson wrote:
> Lockless algorithms still generate bus traffic when you do the atomic
> compare-and-swap or load-linked or whatever hardware instruction you
> use to implement your lockless algorithm. Caches still have to stay
> coherent, lock or no lock.
Not all lockless algorithms operate on the "access everything with
atomic operations" principle. RCU, for example, uses no atomic
operations on the read side, which is actually fewer atomic operations
than standard rwlocks use for the read side.
-- wli
^ permalink raw reply [flat|nested] 157+ messages in thread