public inbox for linux-kernel@vger.kernel.org
* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  5:11 Thierry Forveille
@ 2001-12-17 21:41 ` John Heil
  2001-12-18 14:31 ` Alan Cox
  1 sibling, 0 replies; 168+ messages in thread
From: John Heil @ 2001-12-17 21:41 UTC (permalink / raw)
  To: Thierry Forveille; +Cc: linux-kernel

On Mon, 17 Dec 2001, Thierry Forveille wrote:

> Date: Mon, 17 Dec 2001 19:11:10 -1000 (HST)
> From: Thierry Forveille <forveill@cfht.hawaii.edu>
> To: linux-kernel@vger.kernel.org
> Subject: Re: Scheduler ( was: Just a second ) ...
> 
> Linus Torvalds (torvalds@transmeta.com) writes
> > On Mon, 17 Dec 2001, Rik van Riel wrote:
> > >
> > > Try readprofile some day, chances are schedule() is pretty
> > > near the top of the list.
> >
> > Ehh.. Of course I do readprofile.
> >  
> > But did you ever compare readprofile output to _total_ cycles spent?
> >
> I have a feeling that this discussion got sidetracked: CPU cycles burnt
> in the scheduler are indeed a non-issue, but big tasks being needlessly moved
> around on SMPs is worth tackling.

Given a cpu affinity facility, policy mgmt would belong in user space.
CPU affinity would be pretty simple and I think the effort is already
in flight IIRC.

Johnh

-----------------------------------------------------------------
John Heil
South Coast Software
Custom systems software for UNIX and IBM MVS mainframes
1-714-774-6952
johnhscs@sc-software.com
http://www.sc-software.com
-----------------------------------------------------------------


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Scheduler ( was: Just a second ) ...
  2001-12-16  0:13 Just a second Linus Torvalds
@ 2001-12-17 22:48 ` Davide Libenzi
  2001-12-17 22:53   ` Linus Torvalds
  0 siblings, 1 reply; 168+ messages in thread
From: Davide Libenzi @ 2001-12-17 22:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

On Sat, 15 Dec 2001, Linus Torvalds wrote:

> I just don't find it very interesting. The scheduler is about 100 lines
> out of however-many-million (3.8 at last count), and doesn't even impact
> most normal performance very much.

Linus, sharing a queue and lock between CPUs for something as frequently
accessed ( schedule()s + wakeup()s ) as the scheduler is quite ugly,
and it's not much fun. And it's not only performance-wise, it's
more design-wise.


> We'll clearly do per-CPU runqueues or something some day. And that worries
> me not one whit, compared to things like VM and block device layer ;)

Why not 2.5.x ?


> I know a lot of people think schedulers are important, and the operating
> system theory about them is overflowing - ...

It's no more important than anything else; it's just one of the remaining
scalability/design issues. No, it's not more important than VM, but
there are enough people working on VM. And the hope is to get the scheduler
right with an ETA of less than 10 years.


> it's one of those things that people can argue about forever, ...

Yes, I suppose that if something is not addressed, it'll come up again and
again.


> yet is conceptually simple enough that people aren't afraid of it.
         ^^^^^^^^^^^^^^^^^^^

1, ...


> Let's face it - the current scheduler has the same old basic structure
> that it did almost 10 years ago, and yes, it's not optimal, but there
> really aren't that many real-world loads where people really care. I'm
> sorry, but it's true.

Moving to 4, 8, 16 CPUs, a run queue load that would be thought insane
for UP systems starts to matter. And that's leaving out cache line effects,
and leaving out the way the current scheduler moves tasks around CPUs.
Linus, it's not only about performance benchmarks with 2451 processes
jumping on the run queue, which I could not care less about; it's a
sum of sucky "things" that together make an issue. You can look at it as a
cosmetic/design patch more than a strict performance patch if you like.


> And you have to realize that there are not very many things that have
> aged as well as the scheduler. Which is just another proof that
> scheduling is easy.
  ^^^^^^^^^^^^^^^^^^

..., 2, ...


> We've rewritten the VM several times in the last ten years, and I expect
> it will be changed several more times in the next few years. Within five
> years we'll almost certainly have to make the current three-level page
> tables be four levels etc.
>
> In comparison to those kinds of issues, I suspect that making the
> scheduler use per-CPU queues together with some inter-CPU load balancing
> logic is probably _trivial_.
                    ^^^^^^^^^

... 3, there should be a subliminal message inside but I'm not able to
get it ;)
I would not call selecting the right task to run in an SMP system trivial.
The difference between selecting the right task to run and selecting the
right page to swap is that if you screw up with the task the system
impact is lower. But, if you screw up, your design will suck in both cases.
Anyway, given that 1) real men do VM ( I thought they didn't eat quiche )
and easy-coders do scheduling, 2) the scheduler is easy/trivial and you do
not seem interested in working on it, and 3) whoever is doing the scheduler
cannot screw things up, why don't you give the responsibility, for example,
to Alan or Ingo, so that a discussion ( obviously easy ) about the future
of the scheduler can be started without hurting the real men doing VM ?
I'm talking about, you know, the kind of discussion where people bring
solutions, code and numbers, talk about the good and bad of certain
approaches and finally come up ( after some sane fighting ) with a more
or less widely approved solution. The scheduler, the real-men
crap aside, is one of the basic components of an OS, and having a public
debate, I'm not saying every month or even every year, but at least
once every four years ( that's the last one I remember ) could be a nice thing.
And no, if you do not give someone that you trust the "power" to
redesign the scheduler, no scheduler discussions will start, simply
because people don't like the result of a debate to be dumped to /dev/null.


> Patches already exist, and I don't feel that people can screw up the few
> hundred lines too badly.

Can you point me to a Linux patch that implements _real_independent_
( queue and locking ) CPU schedulers with a global balancing policy ?
I searched quite hard but I did not find anything.




Your faithfully,
Jimmy Scheduler







* Re: Scheduler ( was: Just a second ) ...
  2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi
@ 2001-12-17 22:53   ` Linus Torvalds
  2001-12-17 23:15     ` Davide Libenzi
  2001-12-18  1:54     ` Rik van Riel
  0 siblings, 2 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-17 22:53 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Kernel Mailing List


On Mon, 17 Dec 2001, Davide Libenzi wrote:

> On Sat, 15 Dec 2001, Linus Torvalds wrote:
>
> > I just don't find it very interesting. The scheduler is about 100 lines
> > out of however-many-million (3.8 at last count), and doesn't even impact
> > most normal performance very much.
>
> Linus, sharing a queue and lock between CPUs for something as frequently
> accessed ( schedule()s + wakeup()s ) as the scheduler is quite ugly,
> and it's not much fun. And it's not only performance-wise, it's
> more design-wise.

"Design wise" is highly overrated.

Simplicity is _much_ more important, if something commonly is only done a
few hundred times a second. Locking overhead is basically zero for that
case.

> > We'll clearly do per-CPU runqueues or something some day. And that worries
> > me not one whit, compared to things like VM and block device layer ;)
>
> Why not 2.5.x ?

Maybe. But read the rest of the sentence. There are issues that are about
a million times more important.

> Moving to 4, 8, 16 CPUs, a run queue load that would be thought insane
> for UP systems starts to matter.

4 cpu's are "high end" today. We can probably point to tens of thousands
of UP machines for each 4-way out there. The ratio gets even worse for 8,
and 16 CPU's is basically a rounding error.

You have to prioritize. Scheduling overhead is way down the list.

		Linus



* Re: Scheduler ( was: Just a second ) ...
  2001-12-17 22:53   ` Linus Torvalds
@ 2001-12-17 23:15     ` Davide Libenzi
  2001-12-17 23:18       ` Linus Torvalds
  2001-12-18  1:54     ` Rik van Riel
  1 sibling, 1 reply; 168+ messages in thread
From: Davide Libenzi @ 2001-12-17 23:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

On Mon, 17 Dec 2001, Linus Torvalds wrote:

>
> On Mon, 17 Dec 2001, Davide Libenzi wrote:
>
> > On Sat, 15 Dec 2001, Linus Torvalds wrote:
> >
> > > I just don't find it very interesting. The scheduler is about 100 lines
> > > out of however-many-million (3.8 at last count), and doesn't even impact
> > > most normal performance very much.
> >
> > Linus, sharing a queue and lock between CPUs for something as frequently
> > accessed ( schedule()s + wakeup()s ) as the scheduler is quite ugly,
> > and it's not much fun. And it's not only performance-wise, it's
> > more design-wise.
>
> "Design wise" is highly overrated.
>
> Simplicity is _much_ more important, if something commonly is only done a
> few hundred times a second. Locking overhead is basically zero for that
> case.

"Few hundred" is a nice definition because it can basically range from 0
to infinity. Anyway, I agree that we could spend days debating what this
"few hundred" translates to, and I do not really want to.


> 4 cpu's are "high end" today. We can probably point to tens of thousands
> of UP machines for each 4-way out there. The ratio gets even worse for 8,
> and 16 CPU's is basically a rounding error.
>
> You have to prioritize. Scheduling overhead is way down the list.

You don't really have to serialize/prioritize, old Latins used to say
"Divide Et Impera" ;)




- Davide




* Re: Scheduler ( was: Just a second ) ...
  2001-12-17 23:15     ` Davide Libenzi
@ 2001-12-17 23:18       ` Linus Torvalds
  2001-12-17 23:39         ` Davide Libenzi
  2001-12-17 23:52         ` Benjamin LaHaise
  0 siblings, 2 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-17 23:18 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Kernel Mailing List


On Mon, 17 Dec 2001, Davide Libenzi wrote:
> >
> > You have to prioritize. Scheduling overhead is way down the list.
>
> You don't really have to serialize/prioritize, old Latins used to say
> "Divide Et Impera" ;)

Well, you explicitly _asked_ me why I had been silent on the issue. I told
you.

I also told you that I thought it wasn't that big of a deal, and that
patches already exist.

So I'm letting the patches fight it out among the people who _do_ care.

Then, eventually, I'll do something about it, when we have a winner.

If that isn't "Divide et Impera", I don't know _what_ is. Remember: the
romans didn't much care for their subjects. They just wanted the glory,
and the taxes.

		Linus



* Re: Scheduler ( was: Just a second ) ...
  2001-12-17 23:18       ` Linus Torvalds
@ 2001-12-17 23:39         ` Davide Libenzi
  2001-12-17 23:52         ` Benjamin LaHaise
  1 sibling, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-17 23:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> So I'm letting the patches fight it out among the people who _do_ care.
>
> Then, eventually, I'll do something about it, when we have a winner.
>
> If that isn't "Divide et Impera", I don't know _what_ is. Remember: the
> romans didn't much care for their subjects. They just wanted the glory,
> and the taxes.

Just like today, everyone I talk to wants glory, and everyone I talk to
wants to _not_ pay taxes.



- Davide




* Re: Scheduler ( was: Just a second ) ...
  2001-12-17 23:18       ` Linus Torvalds
  2001-12-17 23:39         ` Davide Libenzi
@ 2001-12-17 23:52         ` Benjamin LaHaise
  2001-12-18  1:11           ` Linus Torvalds
  1 sibling, 1 reply; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-17 23:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List

On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote:
> Well, you explicitly _asked_ me why I had been silent on the issue. I told
> you.

Well, what about those of us who need syscall numbers assigned, for which
you are the only official registry of assigned numbers?

		-ben


* Re: Scheduler ( was: Just a second ) ...
  2001-12-17 23:52         ` Benjamin LaHaise
@ 2001-12-18  1:11           ` Linus Torvalds
  2001-12-18  1:46             ` H. Peter Anvin
  2001-12-18  5:54             ` Benjamin LaHaise
  0 siblings, 2 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18  1:11 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List


On Mon, 17 Dec 2001, Benjamin LaHaise wrote:
> On Mon, Dec 17, 2001 at 03:18:14PM -0800, Linus Torvalds wrote:
> > Well, you explicitly _asked_ me why I had been silent on the issue. I told
> > you.
>
> Well, what about those of us who need syscall numbers assigned for which
> you are the only official assigned number registry?

I've told you a number of times that I'd like to see the preliminary
implementation publicly discussed and some uses outside of private
companies that I have no insight into..

		Linus



* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  1:11           ` Linus Torvalds
@ 2001-12-18  1:46             ` H. Peter Anvin
  2001-12-18  5:54             ` Benjamin LaHaise
  1 sibling, 0 replies; 168+ messages in thread
From: H. Peter Anvin @ 2001-12-18  1:46 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <Pine.LNX.4.33.0112171710160.2035-100000@penguin.transmeta.com>
By author:    Linus Torvalds <torvalds@transmeta.com>
In newsgroup: linux.dev.kernel
> 
> I've told you a number of times that I'd like to see the preliminary
> implementation publicly discussed and some uses outside of private
> companies that I have no insight into..
> 

There was a group at IBM who presented on an alternate SMP scheduler
at this year's OLS; it generated quite a bit of good discussion.

	-hpa

-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>


* Re: Scheduler ( was: Just a second ) ...
  2001-12-17 22:53   ` Linus Torvalds
  2001-12-17 23:15     ` Davide Libenzi
@ 2001-12-18  1:54     ` Rik van Riel
  2001-12-18  2:35       ` Linus Torvalds
  1 sibling, 1 reply; 168+ messages in thread
From: Rik van Riel @ 2001-12-18  1:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> You have to prioritize. Scheduling overhead is way down the list.

That's not what the profiling on my UP machine indicates,
let alone on SMP machines.

Try readprofile some day, chances are schedule() is pretty
near the top of the list.

regards,

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/



* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  1:54     ` Rik van Riel
@ 2001-12-18  2:35       ` Linus Torvalds
  2001-12-18  2:51         ` David Lang
                           ` (2 more replies)
  0 siblings, 3 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18  2:35 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Davide Libenzi, Kernel Mailing List


On Mon, 17 Dec 2001, Rik van Riel wrote:
>
> Try readprofile some day, chances are schedule() is pretty
> near the top of the list.

Ehh.. Of course I do readprofile.

But did you ever compare readprofile output to _total_ cycles spent?

The fact is, it's not even noticeable under any normal loads, and
_definitely_ not on UP except with totally made up benchmarks that just
pass tokens around or yield all the time.

Because we spend 95-99% in user space or idle. Which is as it should be.
There are _very_ few loads that are kernel-intensive, and in fact the best
way to get high system times is to do either lots of fork/exec/wait with
everything cached, or do lots of open/read/write/close with everything
cached.

Of the remaining 1-5% of time, schedule() shows up as one fairly high
thing, but on most profiles I've seen of real work it shows up long after
things like "clear_page()" and "copy_page()".

And look closely at the profile, and you'll notice that it tends to be a
_loong_ tail of stuff.

Quite frankly, I'd be a _lot_ more interested in making the scheduling
slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a
100Hz one, _despite_ the fact that it will increase scheduling load even
more. Because it improves interactive feel, and sometimes even performance
(ie being able to sleep for shorter periods of time allows some things
that want "almost realtime" behaviour to avoid busy-looping for those
short waits - improving performance exactly _because_ they put more load on
the scheduler).

The benchmark that is just about _the_ worst on the scheduler is actually
something like "lmbench", and if you look at profiles for that you'll
notice that system call entry and exit together with the read/write path
ends up being more of a performance issue.

And you know what? From a user standpoint, improving disk latency is again
a _lot_ more noticeable than scheduler overhead.

And even more important than performance is being able to read and write
to CD-RW disks without having to know about things like "ide-scsi" etc,
and do it sanely over different bus architectures etc.

The scheduler simply isn't that important.

			Linus



* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  2:35       ` Linus Torvalds
@ 2001-12-18  2:51         ` David Lang
  2001-12-18  3:08         ` Davide Libenzi
  2001-12-18 14:09         ` Alan Cox
  2 siblings, 0 replies; 168+ messages in thread
From: David Lang @ 2001-12-18  2:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List

One problem the current scheduler has on SMP machines (even 2 CPU ones) is
that if the system is running one big process it will bounce from CPU to
CPU and actually finish considerably slower than if you were running two
CPU intensive tasks (with less cpu hopping). I saw this a few months ago
when doing something as simple as gunzip on a large file: I got a 30%
speed increase by running setiathome at the same time.

I'm not trying to say that it should be the top priority, but there are
definite weaknesses showing in the current implementation.

David Lang


 On Mon, 17 Dec 2001, Linus Torvalds wrote:

> Date: Mon, 17 Dec 2001 18:35:54 -0800 (PST)
> From: Linus Torvalds <torvalds@transmeta.com>
> To: Rik van Riel <riel@conectiva.com.br>
> Cc: Davide Libenzi <davidel@xmailserver.org>,
>      Kernel Mailing List <linux-kernel@vger.kernel.org>
> Subject: Re: Scheduler ( was: Just a second ) ...
>
>
> On Mon, 17 Dec 2001, Rik van Riel wrote:
> >
> > Try readprofile some day, chances are schedule() is pretty
> > near the top of the list.
>
> Ehh.. Of course I do readprofile.
>
> But did you ever compare readprofile output to _total_ cycles spent?
>
> The fact is, it's not even noticeable under any normal loads, and
> _definitely_ not on UP except with totally made up benchmarks that just
> pass tokens around or yield all the time.
>
> Because we spend 95-99% in user space or idle. Which is as it should be.
> There are _very_ few loads that are kernel-intensive, and in fact the best
> way to get high system times is to do either lots of fork/exec/wait with
> everything cached, or do lots of open/read/write/close with everything
> cached.
>
> Of the remaining 1-5% of time, schedule() shows up as one fairly high
> thing, but on most profiles I've seen of real work it shows up long after
> things like "clear_page()" and "copy_page()".
>
> And look closely at the profile, and you'll notice that it tends to be a
> _loong_ tail of stuff.
>
> Quite frankly, I'd be a _lot_ more interested in making the scheduling
> slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a
> 100Hz one, _despite_ the fact that it will increase scheduling load even
> more. Because it improves interactive feel, and sometimes even performance
> (ie being able to sleep for shorter periods of time allows some things
> that want "almost realtime" behaviour to avoid busy-looping for those
> short waits - improving performance exactly _because_ they put more load on
> the scheduler).
>
> The benchmark that is just about _the_ worst on the scheduler is actually
> something like "lmbench", and if you look at profiles for that you'll
> notice that system call entry and exit together with the read/write path
> ends up being more of a performance issue.
>
> And you know what? From a user standpoint, improving disk latency is again
> a _lot_ more noticeable than scheduler overhead.
>
> And even more important than performance is being able to read and write
> to CD-RW disks without having to know about things like "ide-scsi" etc,
> and do it sanely over different bus architectures etc.
>
> The scheduler simply isn't that important.
>
> 			Linus
>


* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  2:35       ` Linus Torvalds
  2001-12-18  2:51         ` David Lang
@ 2001-12-18  3:08         ` Davide Libenzi
  2001-12-18  3:19           ` Davide Libenzi
  2001-12-18 14:09         ` Alan Cox
  2 siblings, 1 reply; 168+ messages in thread
From: Davide Libenzi @ 2001-12-18  3:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rik van Riel, Kernel Mailing List

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> Quite frankly, I'd be a _lot_ more interested in making the scheduling
> slices _shorter_ during 2.5.x, and go to a 1kHz clock on x86 instead of a
> 100Hz one, _despite_ the fact that it will increase scheduling load even
> more. Because it improves interactive feel, and sometimes even performance
> (ie being able to sleep for shorter sequences of time allows some things
> that want "almost realtime" behaviour to avoid busy-looping for those
> short waits - improving performace exactly _because_ they put more load on
> the scheduler).

I'm ok with increasing HZ but not so ok with decreasing time slices.
When you switch tasks you pay a fixed cost ( tlb, cache image, ... ) that,
if you decrease the time slice, gets amortized over a shorter run time,
raising its percentage impact.
A more interactive feel can be achieved by using a real BVT
implementation :

-            p->counter = (p->counter >> 1) + NICE_TO_TICKS(p->nice);
+            p->counter += NICE_TO_TICKS(p->nice);

The only problem with this is that, with certain task run patterns,
processes can run a long time ( having an high dynamic priority ) before
they get scheduled.
What i was thinking was something like, in timer.c :

        if (p->counter > decay_ticks)
            --p->counter;
        else if (++p->timer_ticks >= MAX_RUN_TIME) {
            p->counter -= p->timer_ticks;
            p->timer_ticks = 0;
            p->need_resched = 1;
        }

Having MAX_RUN_TIME ~= NICE_TO_TICKS(0)
In this way I/O bound tasks can run at high priority, giving a better
interactive feel, without running so long that they freeze the system when
exiting from a quite long I/O wait.




- Davide




* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  3:08         ` Davide Libenzi
@ 2001-12-18  3:19           ` Davide Libenzi
  0 siblings, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-18  3:19 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Linus Torvalds, Rik van Riel, Kernel Mailing List

On Mon, 17 Dec 2001, Davide Libenzi wrote:

> What i was thinking was something like, in timer.c :
>
>         if (p->counter > decay_ticks)
>             --p->counter;
>         else if (++p->timer_ticks >= MAX_RUN_TIME) {
>             p->counter -= p->timer_ticks;
>             p->timer_ticks = 0;
>             p->need_resched = 1;
>         }

Obviously that code doesn't work :) but the idea is to not let the task
run for more than a maximum time consecutively.



- Davide




* Re: Scheduler ( was: Just a second ) ...
       [not found] <20011217200946.D753@holomorphy.com>
@ 2001-12-18  4:27 ` Linus Torvalds
  2001-12-18  4:55   ` William Lee Irwin III
  2001-12-18 18:13   ` Davide Libenzi
  0 siblings, 2 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18  4:27 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Kernel Mailing List


[ cc'd back to Linux kernel, in case somebody wants to take a look whether
  there is something wrong in the sound drivers, for example ]

On Mon, 17 Dec 2001, William Lee Irwin III wrote:
>
> This is no benchmark. This is my home machine it's taking a bite out of.
> I'm trying to websurf and play mp3's and read email here. No forkbombs.
> No databases. No made-up benchmarks. I don't know what it's doing (or
> trying to do) in there but I'd like the CPU cycles back.
>
> From a recent /proc/profile dump on 2.4.17-pre1 (no patches), my top 5
> (excluding default_idle) are:
> --------------------------------------------------------
>  22420 total                                      0.0168
>   4624 default_idle                              96.3333
>   1280 schedule                                   0.6202
>   1130 handle_IRQ_event                          11.7708
>    929 file_read_actor                            9.6771
>    843 fast_clear_page                            7.5268

The most likely cause is simply waking up after each sound interrupt: you
also have a _lot_ of time handling interrupts. Quite frankly, web surfing
and mp3 playing simply shouldn't use any noticeable amounts of CPU.

The point being that I really doubt it's the scheduler proper, it's
probably how it is _used_. And I'd suspect your sound driver (or user)
conspires to keep scheduling stuff.

For example (and this is _purely_ an example, I don't know if this is
your particular case), this sounds like a classic case of "bad buffering".
What bad buffering would do is:
 - you have a sound buffer that the mp3 player tries to keep full
 - your sound buffer is, let's pick a random number, 64 entries of 1024
   bytes each.
 - the sound card gives an interrupt every time it has emptied a buffer.
 - the mp3 player is waiting on "free space"
 - we wake up the mp3 player for _every_ sound fragment filled.

Do you see what this leads to? We schedule the mp3 task (which gets a high
priority because it tends to run for a really short time, filling just 1
small buffer each time) _every_ time a single buffer empties. Even though
we have 63 other full buffers.

The classic fix for these kinds of things is _not_ to make the scheduler
faster. Sure, that would help, but that's not really the problem. The
_real_ fix is to use water-marks, and make the sound driver wake up the
writing process only when (say) half the buffers have emptied.

Now the mp3 player can fill 32 of the buffers at a time, and gets
scheduled an order of magnitude less. It doesn't end up waking up every
time.

Which sound driver are you using, just in case this _is_ the reason?

		Linus



* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  4:27 ` Linus Torvalds
@ 2001-12-18  4:55   ` William Lee Irwin III
  2001-12-18  6:09     ` Linus Torvalds
  2001-12-18 14:21     ` Adam Schrotenboer
  2001-12-18 18:13   ` Davide Libenzi
  1 sibling, 2 replies; 168+ messages in thread
From: William Lee Irwin III @ 2001-12-18  4:55 UTC (permalink / raw)
  To: Kernel Mailing List; +Cc: torvalds

On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> The most likely cause is simply waking up after each sound interrupt: you
> also have a _lot_ of time handling interrupts. Quite frankly, web surfing
> and mp3 playing simply shouldn't use any noticeable amounts of CPU.

I think we have a winner:
/proc/interrupts
------------------------------------------------
           CPU0       
  0:   17321824          XT-PIC  timer
  1:          4          XT-PIC  keyboard
  2:          0          XT-PIC  cascade
  5:   46490271          XT-PIC  soundblaster
  9:     400232          XT-PIC  usb-ohci, eth0, eth1
 11:     939150          XT-PIC  aic7xxx, aic7xxx
 14:         13          XT-PIC  ide0

Approximately 4 times more often than the timer interrupt.
That's not nice...

On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> Which sound driver are you using, just in case this _is_ the reason?

SoundBlaster 16
A change of hardware should help verify this.


Cheers,
Bill


* Re: Scheduler ( was: Just a second ) ...
@ 2001-12-18  5:11 Thierry Forveille
  2001-12-17 21:41 ` John Heil
  2001-12-18 14:31 ` Alan Cox
  0 siblings, 2 replies; 168+ messages in thread
From: Thierry Forveille @ 2001-12-18  5:11 UTC (permalink / raw)
  To: linux-kernel

Linus Torvalds (torvalds@transmeta.com) writes
> On Mon, 17 Dec 2001, Rik van Riel wrote:
> >
> > Try readprofile some day, chances are schedule() is pretty
> > near the top of the list.
>
> Ehh.. Of course I do readprofile.
>  
> But did you ever compare readprofile output to _total_ cycles spent?
>
I have a feeling that this discussion got sidetracked: CPU cycles burnt
in the scheduler are indeed a non-issue, but big tasks being needlessly moved
around on SMPs is worth tackling.


* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  1:11           ` Linus Torvalds
  2001-12-18  1:46             ` H. Peter Anvin
@ 2001-12-18  5:54             ` Benjamin LaHaise
  2001-12-18  6:10               ` Linus Torvalds
  1 sibling, 1 reply; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-18  5:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Davide Libenzi, Kernel Mailing List

On Mon, Dec 17, 2001 at 05:11:09PM -0800, Linus Torvalds wrote:
> I've told you a number of times that I'd like to see the preliminary
> implementation publicly discussed and some uses outside of private
> companies that I have no insight into..

Well, we've got a serious chicken-and-egg problem then.

		-ben
-- 
Fish.


* Re: Scheduler ( was: Just a second ) ...
@ 2001-12-18  5:59 V Ganesh
  0 siblings, 0 replies; 168+ messages in thread
From: V Ganesh @ 2001-12-18  5:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: wli

In article <20011217205547.C821@holomorphy.com> you wrote:
: On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
:> The most likely cause is simply waking up after each sound interrupt: you
:> also have a _lot_ of time handling interrupts. Quite frankly, web surfing
:> and mp3 playing simply shouldn't use any noticeable amounts of CPU.

: I think we have a winner:
: /proc/interrupts
: ------------------------------------------------
:            CPU0
:   0:   17321824          XT-PIC  timer
:   1:          4          XT-PIC  keyboard
:   2:          0          XT-PIC  cascade
:   5:   46490271          XT-PIC  soundblaster
:   9:     400232          XT-PIC  usb-ohci, eth0, eth1
:  11:     939150          XT-PIC  aic7xxx, aic7xxx
:  14:         13          XT-PIC  ide0

: Approximately 4 times more often than the timer interrupt.
: That's not nice...

a bit offtopic, but the reason there are so many interrupts is that
there's probably something like esd running. I've observed that an idle
esd manages to generate tons of interrupts, although an strace of esd
reveals it stuck in a select(). probably one of the ioctls it issued
earlier is causing the driver to continuously read/write to the device.
the interrupts stop as soon as you kill esd.

: SoundBlaster 16
: A change of hardware should help verify this.

it happens even with cs4232 (redhat 7.2, 2.4.7-10smp), so I doubt it's
a soundblaster issue.

ganesh

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  4:55   ` William Lee Irwin III
@ 2001-12-18  6:09     ` Linus Torvalds
  2001-12-18  6:34       ` Jeff Garzik
                         ` (6 more replies)
  2001-12-18 14:21     ` Adam Schrotenboer
  1 sibling, 7 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18  6:09 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Kernel Mailing List, Jeff Garzik


On Mon, 17 Dec 2001, William Lee Irwin III wrote:
>
>   5:   46490271          XT-PIC  soundblaster
>
> Approximately 4 times more often than the timer interrupt.
> That's not nice...

Yeah.

Well, looking at the issue, the problem is probably not just in the sb
driver: the soundblaster driver shares the output buffer code with a
number of other drivers (there's some horrible "dmabuf.c" code in common).

And yes, the dmabuf code will wake up the writer on every single DMA
complete interrupt. Considering that you seem to have them at least 400
times a second (and probably more, unless you've literally had sound going
since the machine was booted), I think we know why your setup spends time
in the scheduler.

> On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> > Which sound driver are you using, just in case this _is_ the reason?
>
> SoundBlaster 16
> A change of hardware should help verify this.

A number of sound drivers will use the same logic.

You may be able to change this more easily some other way, by using a
larger fragment size for example. That's up to the sw that actually feeds
the sound stream, so it might be your decoder that selects a small
fragment size.

Quite frankly I don't know the sound infrastructure well enough to make
any more intelligent suggestions about other decoders or similar to try,
at this point I just start blathering.

But yes, I bet you'll also see much less impact of this if you were to
switch to more modern hardware.

grep grep grep.. Oh, before you do that, how about changing "min_fragment"
in sb_audio.c from 5 to something bigger like 9 or 10?

That

	audio_devs[devc->dev]->min_fragment = 5;

literally means that your minimum fragment size seems to be a rather
pathetic 32 bytes (which doesn't mean that your sound will be set to that,
but it _might_ be). That sounds totally ridiculous, but maybe I've
misunderstood the code.

Jeff, you've worked on the sb code at some point - does it really do
32-byte sound fragments? Why? That sounds truly insane if I really parsed
that code correctly. That's thousands of separate DMA transfers
and interrupts per second..

Raising that min_fragment thing from 5 to 10 would make the minimum DMA
buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what,
at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
in less than 1/100th of a second, but at least it should be < 200 irqs/sec
rather than >400).

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  5:54             ` Benjamin LaHaise
@ 2001-12-18  6:10               ` Linus Torvalds
  0 siblings, 0 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18  6:10 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List


On Tue, 18 Dec 2001, Benjamin LaHaise wrote:

> On Mon, Dec 17, 2001 at 05:11:09PM -0800, Linus Torvalds wrote:
> > I've told you a number of times that I'd like to see the preliminary
> > implementation publicly discussed and some uses outside of private
> > companies that I have no insight into..
>
> Well, we've got serious chicken and egg problems then.

Why?

I'd rather have people playing around with new system calls and _test_
them, and then have to recompile their apps if the system calls move
later, than introduce new system calls that haven't gotten any public
testing at all..

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  6:09     ` Linus Torvalds
@ 2001-12-18  6:34       ` Jeff Garzik
  2001-12-18 12:23       ` Rik van Riel
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 168+ messages in thread
From: Jeff Garzik @ 2001-12-18  6:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List

Linus Torvalds wrote:
> Jeff, you've worked on the sb code at some point - does it really do
> 32-byte sound fragments? Why? That sounds truly insane if I really parsed
> that code correctly. That's thousands of separate DMA transfers
> and interrupts per second..

I do not see a hardware minimum fragment size in the HW docs...  The
default hardware reset frag size is 2048 bytes.  So, yes, 32 bytes is
pretty small for today's rates.

But... I wonder if the fault lies more with the application setting a
too-small fragment size and the driver actually allows it to do so, or,
the code following this comment in reorganize_buffers in
drivers/sound/audio.c needs to be revisited:
   /* Compute the fragment size using the default algorithm */

Remember this code is from ancient times...  probably written way before
44 kHz was common at all.

	Jeff


-- 
Jeff Garzik      | Only so many songs can be sung
Building 1024    | with two lips, two lungs, and one tongue.
MandrakeSoft     |         - nomeansno

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 14:09         ` Alan Cox
@ 2001-12-18  9:12           ` John Heil
  2001-12-18 15:34           ` degger
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 168+ messages in thread
From: John Heil @ 2001-12-18  9:12 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, Rik van Riel, Davide Libenzi, Kernel Mailing List

On Tue, 18 Dec 2001, Alan Cox wrote:

> Date: Tue, 18 Dec 2001 14:09:16 +0000 (GMT)
> From: Alan Cox <alan@lxorguk.ukuu.org.uk>
> To: Linus Torvalds <torvalds@transmeta.com>
> Cc: Rik van Riel <riel@conectiva.com.br>,
>     Davide Libenzi <davidel@xmailserver.org>,
>     Kernel Mailing List <linux-kernel@vger.kernel.org>
> Subject: Re: Scheduler ( was: Just a second ) ...
> 
> > to CD-RW disks without having to know about things like "ide-scsi" etc,
> > and do it sanely over different bus architectures etc.
> > 
> > The scheduler simply isn't that important.
> 
> The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> That isn't going to go away by sticking heads in sand.

What % of a std 2 cpu do you think it eats?

> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

-
-----------------------------------------------------------------
John Heil
South Coast Software
Custom systems software for UNIX and IBM MVS mainframes
1-714-774-6952
johnhscs@sc-software.com
http://www.sc-software.com
-----------------------------------------------------------------


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  6:09     ` Linus Torvalds
  2001-12-18  6:34       ` Jeff Garzik
@ 2001-12-18 12:23       ` Rik van Riel
  2001-12-18 14:29       ` Alan Cox
                         ` (4 subsequent siblings)
  6 siblings, 0 replies; 168+ messages in thread
From: Rik van Riel @ 2001-12-18 12:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik

On Mon, 17 Dec 2001, Linus Torvalds wrote:
> On Mon, 17 Dec 2001, William Lee Irwin III wrote:
> >
> >   5:   46490271          XT-PIC  soundblaster
> >
> > Approximately 4 times more often than the timer interrupt.
> > That's not nice...

That's not nearly as many as your typical server system handles
in network packets and wakeups of the samba/database/http
daemons, though ...

> Well, looking at the issue, the problem is probably not just in the sb
> driver: the soundblaster driver shares the output buffer code with a
> number of other drivers (there's some horrible "dmabuf.c" code in common).

So you fixed it for the sound driver, nice.  We still have
the issue that the scheduler can take up lots of time on busy
server systems, though.

(though I suspect on those systems it probably spends more
time recalculating than selecting processes)

regards,

Rik
-- 
DMCA, SSSCA, W3C?  Who cares?  http://thefreeworld.net/

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  2:35       ` Linus Torvalds
  2001-12-18  2:51         ` David Lang
  2001-12-18  3:08         ` Davide Libenzi
@ 2001-12-18 14:09         ` Alan Cox
  2001-12-18  9:12           ` John Heil
                             ` (3 more replies)
  2 siblings, 4 replies; 168+ messages in thread
From: Alan Cox @ 2001-12-18 14:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List

> to CD-RW disks without having to know about things like "ide-scsi" etc,
> and do it sanely over different bus architectures etc.
> 
> The scheduler simply isn't that important.

The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
That isn't going to go away by sticking heads in sand.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  4:55   ` William Lee Irwin III
  2001-12-18  6:09     ` Linus Torvalds
@ 2001-12-18 14:21     ` Adam Schrotenboer
  1 sibling, 0 replies; 168+ messages in thread
From: Adam Schrotenboer @ 2001-12-18 14:21 UTC (permalink / raw)
  To: Kernel Mailing List

On Monday 17 December 2001 23:55, William Lee Irwin III wrote:
> On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> > The most likely cause is simply waking up after each sound interrupt: you
> > also have a _lot_ of time handling interrupts. Quite frankly, web surfing
> > and mp3 playing simply shouldn't use any noticeable amounts of CPU.
>
> I think we have a winner:
> /proc/interrupts
> ------------------------------------------------
>            CPU0
>   0:   17321824          XT-PIC  timer
>   1:          4          XT-PIC  keyboard
>   2:          0          XT-PIC  cascade
>   5:   46490271          XT-PIC  soundblaster
>   9:     400232          XT-PIC  usb-ohci, eth0, eth1
>  11:     939150          XT-PIC  aic7xxx, aic7xxx
>  14:         13          XT-PIC  ide0
>
> Approximately 4 times more often than the timer interrupt.
> That's not nice...

FWIW, I have an ES1371-based sound card, and mpg123 drives it at 172 
interrupts/sec (calculated in procinfo). But that _is_ only when playing. And 
(my slightly hacked) timidity drives my card w/ only 23 (@ 48kHz sample rate; 
21 @ 44.1kHz) interrupts/sec.

Is this 172 figure right? (Not through esd either; I almost always turn it 
off, and so recompiled mpg123 to use the std OSS driver)

>
> On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> > Which sound driver are you using, just in case this _is_ the reason?
>
> SoundBlaster 16
> A change of hardware should help verify this.
>
>
> Cheers,
> Bill

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  6:09     ` Linus Torvalds
  2001-12-18  6:34       ` Jeff Garzik
  2001-12-18 12:23       ` Rik van Riel
@ 2001-12-18 14:29       ` Alan Cox
  2001-12-18 17:07         ` Linus Torvalds
  2001-12-18 15:51       ` Martin Josefsson
                         ` (3 subsequent siblings)
  6 siblings, 1 reply; 168+ messages in thread
From: Alan Cox @ 2001-12-18 14:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik

> Well, looking at the issue, the problem is probably not just in the sb
> driver: the soundblaster driver shares the output buffer code with a
> number of other drivers (there's some horrible "dmabuf.c" code in common).

The sb driver is fine

> A number of sound drivers will use the same logic.

Most hardware does

> Quite frankly I don't know the sound infrastructure well enough to make
> any more intelligent suggestions about other decoders or similar to try,
> at this point I just start blathering.

some of the sound stuff uses very short fragments to get accurate 
audio/video synchronization. Some apps also do it gratuitously when they
should be using other APIs. It's also used sensibly for things like
gnome-meeting, where it's worth trading CPU for latency because 1K of
buffering starts giving you earth<->moon type conversations

> But yes, I bet you'll also see much less impact of this if you were to
> switch to more modern hardware.

Not really - the app asked for an event every 32 bytes. This is an app not
kernel problem.

> at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
> in less than 1/100th of a second, but at least it should be < 200 irqs/sec
> rather than >400).

With a few exceptions the applications tend to use 4K or larger DMA chunks
anyway. Very few need tiny chunks.

Alan


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  5:11 Thierry Forveille
  2001-12-17 21:41 ` John Heil
@ 2001-12-18 14:31 ` Alan Cox
  1 sibling, 0 replies; 168+ messages in thread
From: Alan Cox @ 2001-12-18 14:31 UTC (permalink / raw)
  To: Thierry Forveille; +Cc: linux-kernel

> I have a feeling that this discussion got sidetracked: cpu cycles burnt 
> in the scheduler are indeed a non-issue, but big tasks being needlessly moved
> around on SMPs is worth tackling.

It's not a non-issue - 40% of an 8-way box is a lot of lost CPU. Fixing the
CPU bounce-around problem also matters a lot - Ingo's speedups, seen just by 
improving that on the current scheduler, show it's worth the work



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 14:09         ` Alan Cox
  2001-12-18  9:12           ` John Heil
@ 2001-12-18 15:34           ` degger
  2001-12-18 18:35             ` Mike Kravetz
  2001-12-18 18:48             ` Davide Libenzi
  2001-12-18 16:50           ` Mike Kravetz
  2001-12-18 17:00           ` Linus Torvalds
  3 siblings, 2 replies; 168+ messages in thread
From: degger @ 2001-12-18 15:34 UTC (permalink / raw)
  To: alan; +Cc: linux-kernel

On 18 Dec, Alan Cox wrote:

> The scheduler is eating 40-60% of the machine on real world 8 cpu
> workloads. That isn't going to go away by sticking heads in sand.

What about a CONFIG_8WAY which, if set, activates a scheduler that
performs better on such nontypical machines? I see and understand
both sides' arguments, yet I fail to see where the real problem is
with having a scheduler that just kicks in _iff_ we're running the
kernel on a nontypical kind of machine.
This would keep the straightforward scheduler Linus is defending
for single-processor machines while providing more performance
to heavy SMP machines by having a more complex scheduler better suited
for this task.

--
Servus,
       Daniel


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  6:09     ` Linus Torvalds
                         ` (2 preceding siblings ...)
  2001-12-18 14:29       ` Alan Cox
@ 2001-12-18 15:51       ` Martin Josefsson
  2001-12-18 17:08         ` Linus Torvalds
  2001-12-18 16:16       ` Roger Larsson
                         ` (2 subsequent siblings)
  6 siblings, 1 reply; 168+ messages in thread
From: Martin Josefsson @ 2001-12-18 15:51 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> 
> On Mon, 17 Dec 2001, William Lee Irwin III wrote:
> >
> >   5:   46490271          XT-PIC  soundblaster
> >
> > Approximately 4 times more often than the timer interrupt.
> > That's not nice...

  0:   24867181          XT-PIC  timer
  5:    9070614          XT-PIC  soundblaster

After I bootup I start X and then xmms and then my system plays mp3's
almost all the time.

> > > Which sound driver are you using, just in case this _is_ the reason?
> >
> > SoundBlaster 16

I have an old ISA SoundBlaster 16
 
> Raising that min_fragment thing from 5 to 10 would make the minimum DMA
> buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what,
> at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
> in less than 1/100th of a second, but at least it should be < 200 irqs/sec
> rather than >400).

After watching /proc/interrupts at 30-second intervals I see that I
only get 43 interrupts/second when playing 16-bit 44.1kHz stereo.

And according to vmstat I have 153-158 interrupts/second in total
(it's probably the network traffic that increases it a little above 143).

/Martin


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  6:09     ` Linus Torvalds
                         ` (3 preceding siblings ...)
  2001-12-18 15:51       ` Martin Josefsson
@ 2001-12-18 16:16       ` Roger Larsson
  2001-12-18 17:16         ` Herman Oosthuysen
  2001-12-18 17:16         ` Linus Torvalds
  2001-12-18 17:21       ` David Mansfield
  2001-12-18 18:25       ` William Lee Irwin III
  6 siblings, 2 replies; 168+ messages in thread
From: Roger Larsson @ 2001-12-18 16:16 UTC (permalink / raw)
  To: Linus Torvalds, William Lee Irwin III
  Cc: Kernel Mailing List, linux-audio-dev, Jeff Garzik

This might be of interest on linux-audio-dev too...

On Tuesday den 18 December 2001 07.09, Linus Torvalds wrote:
> On Mon, 17 Dec 2001, William Lee Irwin III wrote:
> >   5:   46490271          XT-PIC  soundblaster
> >
> > Approximately 4 times more often than the timer interrupt.
> > That's not nice...
>
> Yeah.
>
> Well, looking at the issue, the problem is probably not just in the sb
> driver: the soundblaster driver shares the output buffer code with a
> number of other drivers (there's some horrible "dmabuf.c" code in common).
>
> And yes, the dmabuf code will wake up the writer on every single DMA
> complete interrupt. Considering that you seem to have them at least 400
> times a second (and probably more, unless you've literally had sound going
> since the machine was booted), I think we know why your setup spends time
> in the scheduler.
>
> > On Mon, Dec 17, 2001 at 08:27:18PM -0800, Linus Torvalds wrote:
> > > Which sound driver are you using, just in case this _is_ the reason?
> >
> > SoundBlaster 16
> > A change of hardware should help verify this.
>
> A number of sound drivers will use the same logic.
>
> You may be able to change this more easily some other way, by using a
> larger fragment size for example. That's up to the sw that actually feeds
> the sound stream, so it might be your decoder that selects a small
> fragment size.
>
> Quite frankly I don't know the sound infrastructure well enough to make
> any more intelligent suggestions about other decoders or similar to try,
> at this point I just start blathering.
>
> But yes, I bet you'll also see much less impact of this if you were to
> switch to more modern hardware.
>
> grep grep grep.. Oh, before you do that, how about changing "min_fragment"
> in sb_audio.c from 5 to something bigger like 9 or 10?
>
> That
>
> 	audio_devs[devc->dev]->min_fragment = 5;
>
> literally means that your minimum fragment size seems to be a rather
> pathetic 32 bytes (which doesn't mean that your sound will be set to that,
> but it _might_ be). That sounds totally ridiculous, but maybe I've
> misunderstood the code.

I think it really is 32 samples - yes, that is little, but is it too small?
It depends on the sample frequency used...

Paul Davis wrote this on linux-audio-dev 2001-12-05
"in doing lots of testing on JACK, i've noticed that although the
trident driver now works (there were some patches from jaroslav and
myself), in general i still get xruns with the lowest possible latency
setting for that card (1.3msec per interrupt, 2.6msec buffer). with
the same settings on my hammerfall, i don't get xruns, even with
substantial system load."

>
> Jeff, you've worked on the sb code at some point - does it really do
> 32-byte sound fragments? Why? That sounds truly insane if I really parsed
> that code correctly. That's thousands of separate DMA transfers
> and interrupts per second..
>

Let's see: we have a >1 GHz CPU and interrupts at >1000 Hz
 => 1 Mcycle / interrupt - is that insane?

If the hardware can support it? Why not let it? It is really up to the 
applications/user to decide...

> Raising that min_fragment thing from 5 to 10 would make the minimum DMA
> buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what,
> at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
> in less than 1/100th of a second, but at least it should be < 200 irqs/sec
> rather than >400).
>

Yes, it is probably more reasonable - but can the soundcard support it?
(I have a vision of lots of linux-audio-dev folks pulling out their new 
soundcard and replacing it with their long-forgotten SB16...)

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
       [not found] <20011218020456.A11541@redhat.com>
@ 2001-12-18 16:50 ` Linus Torvalds
  2001-12-18 16:56   ` Rik van Riel
                     ` (2 more replies)
  0 siblings, 3 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 16:50 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: Davide Libenzi, Kernel Mailing List


On Tue, 18 Dec 2001, Benjamin LaHaise wrote:
> On Mon, Dec 17, 2001 at 10:10:30PM -0800, Linus Torvalds wrote:
> > > Well, we've got serious chicken and egg problems then.
> >
> > Why?
>
> The code can't go into glibc without syscall numbers being reserved.

It sure as hell can.

And I'll bet $5 USD that glibc wouldn't take the patches anyway before
the kernel interfaces are _tested_.

> I've posted the code, there are people playing with it.  I can't make them
> comment.

Well, if people aren't interested, then it doesn't _ever_ go in.

Remember: we do not add features just because we can.

Quite frankly, I don't think you've told that many people. I haven't seen
any discussion about the aio stuff on linux-kernel, which may be because
you posted several announcements and nobody cared, or it may be that
you've only mentioned it fleetingly and people didn't notice.

Take a look at how long it took for ext3 to be "standard" - I put them in
my tree when I started getting real feedback that it was used and people
liked using it. I simply do not like applying patches "just to get users".
Not even reservations - because I reserve the right to _never_ apply
something if critical review ends up saying that "that doesn't make
sense".

Quite frankly, the fact that it is being tested out at places like Oracle
etc is secondary - those people will use anything. That's proven by
history. That doesn't mean that _I_ accept anything.

Now, the fact that I like the interfaces is actually secondary - it does
make me much more likely to include it even in a half-baked thing, but it
does NOT mean that I trust my own taste so much that I'd do it "under the
covers" with little open discussion, use and modification.

Where _is_ the discussion on linux-kernel?

Where are the negative comments from Al? (Al _always_ has negative
comments and suggestions for improvements, don't try to say that he also
liked it unconditionally ;)

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 14:09         ` Alan Cox
  2001-12-18  9:12           ` John Heil
  2001-12-18 15:34           ` degger
@ 2001-12-18 16:50           ` Mike Kravetz
  2001-12-18 17:22             ` Linus Torvalds
  2001-12-18 17:00           ` Linus Torvalds
  3 siblings, 1 reply; 168+ messages in thread
From: Mike Kravetz @ 2001-12-18 16:50 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, Rik van Riel, Davide Libenzi, Kernel Mailing List

On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote:
> The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> That isn't going to go away by sticking heads in sand.

Can you be more specific as to the workload you are referring to?
As someone who has been playing with the scheduler for a while,
I am interested in all such workloads.

-- 
Mike

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds
@ 2001-12-18 16:56   ` Rik van Riel
  2001-12-18 17:18     ` Linus Torvalds
  2001-12-18 17:55   ` Scheduler ( was: Just a second ) Davide Libenzi
  2001-12-18 19:43   ` Alexander Viro
  2 siblings, 1 reply; 168+ messages in thread
From: Rik van Riel @ 2001-12-18 16:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List

On Tue, 18 Dec 2001, Linus Torvalds wrote:

> Where _is_ the discussion on linux-kernel?

Which mailing lists do you want to be subscribed to ? ;)

Rik
-- 
DMCA, SSSCA, W3C?  Who cares?  http://thefreeworld.net/

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 14:09         ` Alan Cox
                             ` (2 preceding siblings ...)
  2001-12-18 16:50           ` Mike Kravetz
@ 2001-12-18 17:00           ` Linus Torvalds
  2001-12-18 19:17             ` Alan Cox
  3 siblings, 1 reply; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 17:00 UTC (permalink / raw)
  To: Alan Cox; +Cc: Rik van Riel, Davide Libenzi, Kernel Mailing List


On Tue, 18 Dec 2001, Alan Cox wrote:
>
> The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> That isn't going to go away by sticking heads in sand.

Did you _read_ what I said?

We _have_ patches. You apparently have your own set.

Fight it out. Don't involve me, because I don't think it's even a
challenging thing. I wrote what is _still_ largely the algorithm in 1991,
and it's damn near the only piece of code from back then that even _has_
some similarity to the original code still. All the "recompute count when
everybody has gone down to zero" was there pretty much from day 1 (*).

Which makes me say: "oh, a quick hack from 1991 works on most machines in
2001, so how hard a problem can it be?"

Fight it out. People asked whether I was interested, and I said "no". Take
a clue: do benchmarks on all the competing patches, and try to create the
best one, and present it to me as a done deal.

		Linus

(*) The single biggest change from day 1 is that it used to iterate over a
global array of process slots, and for scalability reasons (not CPU
scalability, but "max nr of processes in the system" scalability) the
array was gotten rid of, giving the current doubly linked list. Everything
else that any scheduler person complains about was pretty much there
otherwise ;)


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 14:29       ` Alan Cox
@ 2001-12-18 17:07         ` Linus Torvalds
  0 siblings, 0 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 17:07 UTC (permalink / raw)
  To: Alan Cox; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik


On Tue, 18 Dec 2001, Alan Cox wrote:
>
> > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
> > in less than 1/100th of a second, but at least it should be < 200 irqs/sec
> > rather than >400).
>
> With a few exceptions the applications tend to use 4K or larger DMA chunks
> anyway. Very few need tiny chunks.

Doing another grep seems to imply that none of the other drivers even
allow chunks as small as the sb driver does; 32-byte "events" are just
ridiculous. At simple 2-channel, 16-bit, CD-quality sound, that's a DMA
event every 0.18 msec (5500 times a second, 181 _micro_seconds apart).

I obviously agree that the app shouldn't even ask for small chunks:
whether an mp3 player reacts within 1/10th or 1/1000th of a second of the
user asking it to switch tracks, nobody can even tell. So an mp3 player
should probably use a big fragment size on the order of 4kB or similar
(that still gives max fragment latency of 0.022 seconds, faster than
humans can react).

So it sounds like player silliness, but I don't think the driver should
even allow such waste of resources, considering that no other driver
allows it either..

			Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 15:51       ` Martin Josefsson
@ 2001-12-18 17:08         ` Linus Torvalds
  0 siblings, 0 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 17:08 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik


On Tue, 18 Dec 2001, Martin Josefsson wrote:
>
> After watchning /proc/interrupts with 30 second intervals I see that I
> only get 43 interrupts/second when playing 16bit 44.1kHz stereo.

That's _exactly_ what you get with a 4kB fragment size.

You have a sane player that asks for a sane fragment size. While whatever
William uses seems to ask for a really small one..

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 16:16       ` Roger Larsson
@ 2001-12-18 17:16         ` Herman Oosthuysen
  2001-12-18 17:16         ` Linus Torvalds
  1 sibling, 0 replies; 168+ messages in thread
From: Herman Oosthuysen @ 2001-12-18 17:16 UTC (permalink / raw)
  To: Kernel Mailing List, linux-audio-dev

My tuppence worth from a real-time embedded perspective:
A shorter time slice and other real-time improvements to the scheduler will
certainly improve life to the embedded crowd.  Bear in mind that 90% of
processors are used for embedded apps.  Shorter time slices etc. mean
smaller buffers, less RAM and lower cost.

I don't know what the current distribution is for Linux regarding embedded
vs data processing, but the embedded use of Linux is certainly growing
rapidly - we expect to make a million thingummyjigs running Linux next year
and there are many other companies doing the same.  Within the next few
years, I expect embedded use of Linux to overshadow data use by a large
margin.

Since embedded processors are 'invisible' and never in the news, I would be
very happy if Linus and others would keep us poor boys in mind...
--
Herman Oosthuysen
Herman@WirelessNetworksInc.com
Suite 300, #3016, 5th Ave NE,
Calgary, Alberta, T2A 6K4, Canada
Phone: (403) 569-5688, Fax: (403) 235-3965
----- Original Message ----- >
> Lets see: we have >1 GHz CPU and interrupts at >1000 Hz
>  => 1 Mcycle / interrupt - is that insane?
>
> If the hardware can support it? Why not let it? It is really up to the
> applications/user to decide...
>
> > Raising that min_fragment thing from 5 to 10 would make the minimum DMA
> > buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what,
> > at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
> > in less than 1/100th of a second, but at least it should be < 200 irqs/sec
> > rather than >400).
> >
>
> /RogerL
>
> --
> Roger Larsson
> Skellefteå
> Sweden



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 16:16       ` Roger Larsson
  2001-12-18 17:16         ` Herman Oosthuysen
@ 2001-12-18 17:16         ` Linus Torvalds
  1 sibling, 0 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 17:16 UTC (permalink / raw)
  To: Roger Larsson
  Cc: William Lee Irwin III, Kernel Mailing List, linux-audio-dev,
	Jeff Garzik


On Tue, 18 Dec 2001, Roger Larsson wrote:
>
> Lets see: we have >1 GHz CPU and interrupts at >1000 Hz
>  => 1 Mcycle / interrupt - is that insane?

Ehh.. First off, the CPU may be 1GHz, but the memory subsystem, and the
PCI subsystem definitely are _not_. Most PCI cards still run at a
(comparatively) leisurely 33MHz, and when we're talking about audio, we're
talking about actually having to _access_ that audio device.

Yes. At 33MHz, not at 1GHz.

Also, at 32-byte fragments, the frequency is actually 5.5kHz, not 1kHz.
Now, I seriously doubt the mp3-player actually used 32-byte fragments (it
probably just asked for something small, and got it), but let's say it
asked for something in the kHz range (ie 256-512 byte frags). That does
_not_ equate to "1 Mcycle". It equates to 33 _kilocycles_ in PCI-land, and
a PCI read will take several cycles.

> If the hardware can support it? Why not let it? It is really up to the
> applications/user to decide...

Well, this particular user was unhappy with the CPU spending a noticeable
amount of time on just web-surfing and mp3-playing.

So clearly the _user_ didn't ask for it.

And I suspect that the app writer just didn't even realize what he did. He
may have used another sound card that didn't even allow small fragments.

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 16:56   ` Rik van Riel
@ 2001-12-18 17:18     ` Linus Torvalds
  2001-12-18 19:04       ` Alan Cox
                         ` (2 more replies)
  0 siblings, 3 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 17:18 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List


On Tue, 18 Dec 2001, Rik van Riel wrote:
> On Tue, 18 Dec 2001, Linus Torvalds wrote:
>
> > Where _is_ the discussion on linux-kernel?
>
> Which mailing lists do you want to be subscribed to ? ;)

I'm not subscribed to any, thank you very much. I read them through a news
gateway, which gives me access to the common ones.

And if the discussion wasn't on the common ones, then it wasn't an open
discussion.

And no, I don't think IRC counts either, sorry.

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  6:09     ` Linus Torvalds
                         ` (4 preceding siblings ...)
  2001-12-18 16:16       ` Roger Larsson
@ 2001-12-18 17:21       ` David Mansfield
  2001-12-18 17:27         ` Linus Torvalds
  2001-12-18 18:25       ` William Lee Irwin III
  6 siblings, 1 reply; 168+ messages in thread
From: David Mansfield @ 2001-12-18 17:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik

> 
> 	audio_devs[devc->dev]->min_fragment = 5;
> 

Generally speaking, you want to be able to specify about a 1ms fragment,
speaking as a realtime audio programmer (no offense Victor...).  However,
1ms is 128 bytes at 16bit stereo, but only 32 bytes at 8bit mono.  Nobody
does 8bit mono, but that's probably why it's there.  A lot of drivers seem 
to have 128 bytes as the minimum fragment size.  Even the high end stuff like 
the RME Hammerfall only goes down to a 64-byte fragment PER CHANNEL, which is 
the same as 128 bytes for stereo on the SB16.

> Raising that min_fragment thing from 5 to 10 would make the minimum DMA
> buffer go from 32 bytes to 1kB, which is a _lot_ more reasonable (what,
> at 2*2 bytes per sample and 44kHz would mean that a 1kB DMA buffer empties
> in less than 1/100th of a second, but at least it should be < 200 irqs/sec
> rather than >400).

Note that the ALSA drivers allow the app to set watermarks for wakeup, 
while allowing flexibility in fragment size and number.  You can 
essentially say, wake me up when there are at least n fragments empty, and 
put me to sleep if m fragments are full.
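The fragment-size arithmetic running through this thread (43 irqs/sec at 4kB
fragments, ~5.5kHz at 32-byte ones) can be checked with a trivial helper.
This is just the throughput division, nothing from any actual driver:

```c
/* Interrupt rate implied by a given fragment size: one DMA-complete
 * interrupt fires each time one fragment's worth of audio is consumed.
 * For 16-bit stereo at 44.1 kHz, bytes/sec = 44100 * 2 channels * 2 bytes. */
static double irqs_per_sec(unsigned int frag_bytes, unsigned int rate_hz,
                           unsigned int channels, unsigned int bytes_per_sample)
{
    double bytes_per_sec = (double)rate_hz * channels * bytes_per_sample;
    return bytes_per_sec / frag_bytes;
}
```

At 16-bit stereo 44.1 kHz this gives ~5512 irqs/sec for 32-byte fragments,
~172 for 1kB (the "< 200 irqs/sec" above), and ~43 for 4kB, matching
Martin's /proc/interrupts observation.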

David

-- 
/==============================\
| David Mansfield              |
| david@cobite.com             |
\==============================/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 16:50           ` Mike Kravetz
@ 2001-12-18 17:22             ` Linus Torvalds
  2001-12-18 17:50               ` Davide Libenzi
  0 siblings, 1 reply; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 17:22 UTC (permalink / raw)
  To: Mike Kravetz; +Cc: Alan Cox, Rik van Riel, Davide Libenzi, Kernel Mailing List


On Tue, 18 Dec 2001, Mike Kravetz wrote:
> On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote:
> > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> > That isn't going to go away by sticking heads in sand.
>
> Can you be more specific as to the workload you are referring to?
> As someone who has been playing with the scheduler for a while,
> I am interested in all such workloads.

Well, careful: depending on what "%" means, a 8-cpu machine has either
"100% max" or "800% max".

So are we talking about "we spend 40-60% of all CPU cycles in the
scheduler" or are we talking about "we spend 40-60% of the CPU power of
_one_ CPU out of 8 in the scheduler".

Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the
scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates
to "The worst scheduler case I've ever seen under a real load spent 5-8%
of the machine CPU resources on scheduling".

And let's face it, 5-8% is bad, but we're not talking "half the CPU power"
here.

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:21       ` David Mansfield
@ 2001-12-18 17:27         ` Linus Torvalds
  2001-12-18 17:54           ` Andreas Dilger
  2001-12-18 18:58           ` Alan Cox
  0 siblings, 2 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 17:27 UTC (permalink / raw)
  To: David Mansfield; +Cc: William Lee Irwin III, Kernel Mailing List, Jeff Garzik


On Tue, 18 Dec 2001, David Mansfield wrote:
> >
> > 	audio_devs[devc->dev]->min_fragment = 5;
> >
>
> Generally speaking, you want to be able to specify about a 1ms fragment,
> speaking as a realtime audio programmer (no offense Victor...).  However,
> 1ms is 128 bytes at 16bit stereo, but only 32 bytes at 8bit mono.  Nobody
> does 8bit mono, but that's probably why it's there.  A lot of drivers seem
> to have 128 byte as minimum fragment size.

Good point.

Somebody should really look at "dma_set_fragment", and see whether we can
make "min_fragment" be really just a hardware minimum chunk size, but use
other heuristics like frequency to cut off the minimum size (ie just do
something like

	/* We want to limit it to 1024 Hz */
	min_bytes = freq*channel*bytes_per_channel >> 10;

Although I'm not sure we _have_ the frequency at that point: somebody
might set the fragment size first, and the frequency later.
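A minimal sketch of that clamp, treating min_fragment as a pure hardware
floor and refusing fragments that would interrupt much above ~1024 Hz.  The
function and parameter names are illustrative, not the actual OSS dmabuf
interface, and it assumes the frequency is already known at this point:

```c
/* Clamp a requested fragment size to max(hardware minimum, the size
 * that keeps the interrupt rate at or below roughly 1024 Hz).
 * Illustrative only; not the real dma_set_fragment() logic. */
static unsigned int clamp_fragment_bytes(unsigned int requested_bytes,
                                         unsigned int hw_min_bytes,
                                         unsigned int freq,
                                         unsigned int channels,
                                         unsigned int bytes_per_sample)
{
    /* We want to limit the interrupt rate to about 1024 Hz. */
    unsigned int rate_min = (freq * channels * bytes_per_sample) >> 10;
    unsigned int min_bytes = rate_min > hw_min_bytes ? rate_min : hw_min_bytes;

    return requested_bytes < min_bytes ? min_bytes : requested_bytes;
}
```

For 16-bit stereo at 44.1 kHz a 32-byte request would be bumped up to 172
bytes, while a sane 4kB request passes through untouched.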

Maybe the best thing to do is to educate the people who write the sound
apps for Linux (somebody was complaining about "esd" triggering this, for
example).

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:22             ` Linus Torvalds
@ 2001-12-18 17:50               ` Davide Libenzi
  0 siblings, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-18 17:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Mike Kravetz, Alan Cox, Rik van Riel, Kernel Mailing List

On Tue, 18 Dec 2001, Linus Torvalds wrote:

>
> On Tue, 18 Dec 2001, Mike Kravetz wrote:
> > On Tue, Dec 18, 2001 at 02:09:16PM +0000, Alan Cox wrote:
> > > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> > > That isn't going to go away by sticking heads in sand.
> >
> > Can you be more specific as to the workload you are referring to?
> > As someone who has been playing with the scheduler for a while,
> > I am interested in all such workloads.
>
> Well, careful: depending on what "%" means, a 8-cpu machine has either
> "100% max" or "800% max".
>
> So are we talking about "we spend 40-60% of all CPU cycles in the
> scheduler" or are we talking about "we spend 40-60% of the CPU power of
> _one_ CPU out of 8 in the scheduler".
>
> Yes, 40-60% sounds like a lot ("Wow! About half the time is spent in the
> scheduler"), but I bet it's 40-60% of _one_ CPU, which really translates
> to "The worst scheduler case I've ever seen under a real load spent 5-8%
> of the machine CPU resources on scheduling".
>
> And let's face it, 5-8% is bad, but we're not talking "half the CPU power"
> here.

Linus, you're plain right that we can spend days debating the
scheduler load.
You have to agree that sharing a single lock/queue among multiple CPUs is,
let's say, quite crappy.
You agreed that the scheduler is simple and the fix should not take that
much time.
You said that you're going to accept the solution that comes out of
the mailing list.
Why don't we start talking about some solution and code?
Starting from a basic architecture down to the implementation.
Alan and Rik are quite "unloaded" now, what do you think?



- Davide



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:27         ` Linus Torvalds
@ 2001-12-18 17:54           ` Andreas Dilger
  2001-12-18 18:27             ` Doug Ledford
  2001-12-18 18:35             ` Linus Torvalds
  2001-12-18 18:58           ` Alan Cox
  1 sibling, 2 replies; 168+ messages in thread
From: Andreas Dilger @ 2001-12-18 17:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List,
	Jeff Garzik

On Dec 18, 2001  09:27 -0800, Linus Torvalds wrote:
> Maybe the best thing to do is to educate the people who write the sound
> apps for Linux (somebody was complaining about "esd" triggering this, for
> example).

Yes, esd is an interrupt hog, it seems.  When reading this thread, I
checked, and sure enough I was getting 190 interrupts/sec on the
sound card while not playing any sound.  I killed esd (which I don't
use anyways), and interrupts went to 0/sec when not playing sound.
Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S)
sound card.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds
  2001-12-18 16:56   ` Rik van Riel
@ 2001-12-18 17:55   ` Davide Libenzi
  2001-12-18 19:43   ` Alexander Viro
  2 siblings, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-18 17:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List

On Tue, 18 Dec 2001, Linus Torvalds wrote:

> Quite frankly, I don't think you've told that many people. I haven't seen
> any discussion about the aio stuff on linux-kernel, which may be because
> you posted several announcements and nobody cared, or it may be that
> you've only mentioned it fleetingly and people didn't notice.

This is not to ask for the inclusion of /dev/epoll in the kernel ( it can
be easily merged by users that want to use it ), but I've found its users
prefer talking about it off the mailing list. Maybe because they're
scared of being eaten by some guru when asking easy questions :)




- Davide



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  4:27 ` Linus Torvalds
  2001-12-18  4:55   ` William Lee Irwin III
@ 2001-12-18 18:13   ` Davide Libenzi
  1 sibling, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-18 18:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: William Lee Irwin III, Kernel Mailing List

On Mon, 17 Dec 2001, Linus Torvalds wrote:

> The most likely cause is simply waking up after each sound interrupt: you
> also have a _lot_ of time handling interrupts. Quite frankly, web surfing
> and mp3 playing simply shouldn't use any noticeable amounts of CPU.

It must be noted that waking up a task is going to take two lock operations
( and two unlocks ), one in try_to_wake_up() and the other in schedule().
This doubles the frequency seen by the lock.



- Davide



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18  6:09     ` Linus Torvalds
                         ` (5 preceding siblings ...)
  2001-12-18 17:21       ` David Mansfield
@ 2001-12-18 18:25       ` William Lee Irwin III
  6 siblings, 0 replies; 168+ messages in thread
From: William Lee Irwin III @ 2001-12-18 18:25 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List, Jeff Garzik

On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote:
> Well, looking at the issue, the problem is probably not just in the sb
> driver: the soundblaster driver shares the output buffer code with a
> number of other drivers (there's some horrible "dmabuf.c" code in common).
> And yes, the dmabuf code will wake up the writer on every single DMA
> complete interrupt. Considering that you seem to have them at least 400
> times a second (and probably more, unless you've literally had sound going
> since the machine was booted), I think we know why your setup spends time
> in the scheduler.
> A number of sound drivers will use the same logic.

I've chucked the sb32 and plugged in the emu10k1 I had been planning
to install for a while, to good effect. It's not an ISA sb16, but it
apparently uses the same driver.

I'm getting an overall 1% reduction in system load, and the following
"top 5" profile:

 53374 total                                      0.0400
 11430 default_idle                             238.1250
  8820 handle_IRQ_event                          91.8750
  2186 do_softirq                                10.5096
  1984 schedule                                   1.2525
  1612 number                                     1.4816
  1473 __generic_copy_to_user                    18.4125

Oddly, I'm getting even more interrupts than I saw with the sb32...

  0:    2752924          XT-PIC  timer
  9:   14223905          XT-PIC  EMU10K1, eth1

(eth1 generates orders of magnitude fewer interrupts than the timer)

On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote:
> You may be able to change this more easily some other way, by using a
> larger fragment size for example. That's up to the sw that actually feeds
> the sound stream, so it might be your decoder that selects a small
> fragment size.
> Quite frankly I don't know the sound infrastructure well enough to make
> any more intelligent suggestions about other decoders or similar to try,
> at this point I just start blathering.

Already more insight into the problem I was experiencing than I had
before, and I must confess to those such as myself this lead certainly
seems "plucked out of the air". Good work! =)

On Mon, Dec 17, 2001 at 10:09:22PM -0800, Linus Torvalds wrote:
> But yes, I bet you'll also see much less impact of this if you were to
> switch to more modern hardware.

I hear from elsewhere the emu10k1 has a bad reputation as a source of
excessive interrupts. Looks like I bought the wrong sound card(s).
Maybe I should go shopping. =)


Thanks a bunch!
Bill

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:54           ` Andreas Dilger
@ 2001-12-18 18:27             ` Doug Ledford
  2001-12-18 18:52               ` Andreas Dilger
                                 ` (3 more replies)
  2001-12-18 18:35             ` Linus Torvalds
  1 sibling, 4 replies; 168+ messages in thread
From: Doug Ledford @ 2001-12-18 18:27 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Kernel Mailing List

Andreas Dilger wrote:

> On Dec 18, 2001  09:27 -0800, Linus Torvalds wrote:
> 
>>Maybe the best thing to do is to educate the people who write the sound
>>apps for Linux (somebody was complaining about "esd" triggering this, for
>>example).
>>
> 
> Yes, esd is an interrupt hog, it seems.  When reading this thread, I
> checked, and sure enough I was getting 190 interrupts/sec on the
> sound card while not playing any sound.  I killed esd (which I don't
> use anyways), and interrupts went to 0/sec when not playing sound.
> Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S)
> sound card.


Well, evidently esd and artsd both do this (well, I assume esd does now, it 
didn't do this in the past).  Basically, they both transmit silence over the 
sound chip when nothing else is going on.  So even though you don't hear 
anything, the same sound output DMA is taking place.  That avoids things 
like nasty pops when you start up the sound hardware for a beep and that 
sort of thing.  It also maintains state, whereas dropping output entirely 
could result in things like module auto-unloading and then reloading on the 
next beep, etc.  Personally, the interrupt count and overhead annoyed me 
enough that when I started hacking on the i810 sound driver one of my 
primary goals was to get overhead and interrupt count down.  I think I 
succeeded quite well.  On my current workstation:

Context switches per second not playing any sound: 8300 - 8800
Context switches per second playing an MP3: 9200 - 9900
Interrupts per second from sound device: 86
%CPU used when not playing MP3: 0 - 3% (magicdev is a CPU pig once every 2 
seconds)
%CPU used when playing MP3s: 0 - 4%

In any case, it might be worth the original poster's time to figure out 
just how much of his lost CPU is due to playing sound and how much is 
actually caused by the windowing system and all the associated bloat that 
comes with it nowadays.





-- 

  Doug Ledford <dledford@redhat.com>  http://people.redhat.com/dledford
       Please check my web site for aic7xxx updates/answers before
                       e-mailing me about problems


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:54           ` Andreas Dilger
  2001-12-18 18:27             ` Doug Ledford
@ 2001-12-18 18:35             ` Linus Torvalds
  1 sibling, 0 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-18 18:35 UTC (permalink / raw)
  To: Andreas Dilger
  Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List,
	Jeff Garzik


On Tue, 18 Dec 2001, Andreas Dilger wrote:
>
> Yes, esd is an interrupt hog, it seems.  When reading this thread, I
> checked, and sure enough I was getting 190 interrupts/sec on the
> sound card while not playing any sound.  I killed esd (which I don't
> use anyways), and interrupts went to 0/sec when not playing sound.
> Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S)
> sound card.

190 interrupts / sec sounds excessive, but not wildly so. The interrupt
per se is not going to be a CPU hog unless the sound card does programmed
IO to fill the data queues, and while that is not unheard of, I don't
think such a card has been made in the last five years.

Obviously getting 190 irqs per second even when not actually _doing_
anything is a total waste of CPU, and is bad form. There may be some
reason why esd does it, most probably for good synchronization between
sound events and to avoid popping when the sound is shut down (many sound
drivers seem to pop a bit on open/close, possibly due to driver bugs, but
possibly because of some hard-to-avoid-programmatically hardware glitch when
powering down the logic).

So waiting a while with the driver active may actually be a reasonable
thing to do, although I suspect that after long sequences of silence "esd"
should really shut down for a while (and "long" here is probably on the
order of seconds, not minutes).

What _really_ ends up hurting performance is probably not the interrupt
per se (although it is noticeable), but the fact that we wake up
and cause a schedule - which often blows any CPU caches, making the _next_
interrupt also be more expensive than it would possibly need to be.

The code for that (in the case of drivers that use the generic "dmabuf.c"
infrastructure) seems to be in "finish_output_interrupt()", and I suspect
that it could be improved with something like

	dmap = adev->dmap_out;
	lim = dmap->nbufs;
	if (lim < 2) lim = 2;
	if (dmap->qlen <= lim/2) {
		...
	}

around the current unconditional wakeups.

Yeah, yeah, untested, stupid example, the idea being that we only wake up
if we have at least half the frags free now, instead of waking up for
_every_ fragment that free's up.

The above is just as a suggestion for some testing, if somebody actually
feels like trying it out. It probably won't be good as-is, but as a
starting point..
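To see what the half-free watermark buys, here is a standalone simulation
(not driver code; the writer refilling the whole buffer at once is a
deliberate simplification) that counts writer wakeups over a run of
fragment-completion interrupts:

```c
/* Simulate the suggestion above: wake the writer only when at least
 * half the fragments are free, instead of on every single interrupt.
 * nbufs   = number of DMA fragments in the ring
 * irqs    = number of fragment-complete interrupts to simulate
 * watermark = 0 for the unconditional wakeup, 1 for the half-free rule */
static unsigned int count_wakeups(unsigned int nbufs, unsigned int irqs,
                                  int watermark)
{
    unsigned int qlen = nbufs;   /* full fragments queued for DMA */
    unsigned int wakeups = 0, i;
    unsigned int lim = nbufs < 2 ? 2 : nbufs;

    for (i = 0; i < irqs; i++) {
        if (qlen > 0)
            qlen--;              /* one fragment drained per interrupt */
        if (!watermark || qlen <= lim / 2) {
            wakeups++;
            qlen = nbufs;        /* assume the writer refills everything */
        }
    }
    return wakeups;
}
```

With 8 fragments and 400 interrupts, the unconditional policy wakes the
writer 400 times; the half-free watermark wakes it 100 times, one wakeup
per four interrupts.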

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 15:34           ` degger
@ 2001-12-18 18:35             ` Mike Kravetz
  2001-12-18 18:48             ` Davide Libenzi
  1 sibling, 0 replies; 168+ messages in thread
From: Mike Kravetz @ 2001-12-18 18:35 UTC (permalink / raw)
  To: degger; +Cc: alan, linux-kernel

On Tue, Dec 18, 2001 at 04:34:57PM +0100, degger@fhm.edu wrote:
> What about a CONFIG_8WAY which, if set, activates a scheduler that
> performs better on such nontypical machines?

I'm pretty sure that we can create a scheduler that works well on
an 8-way, and works just as well as the current scheduler on a UP
machine.  There is already a CONFIG_SMP which is all that should
be necessary to distinguish between the two.

What may be of more concern is support for different architectures
such as HMT and NUMA.  What about better scheduler support for
people working in the RT embedded space?  Each of these seem to
have different scheduling requirements.  Do people working on these
'non-typical' machines need to create their own scheduler patches?
OR is there some 'clean' way to incorporate them into the source
tree?

-- 
Mike

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 15:34           ` degger
  2001-12-18 18:35             ` Mike Kravetz
@ 2001-12-18 18:48             ` Davide Libenzi
  1 sibling, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-18 18:48 UTC (permalink / raw)
  To: degger; +Cc: Alan Cox, lkml

On Tue, 18 Dec 2001 degger@fhm.edu wrote:

> On 18 Dec, Alan Cox wrote:
>
> > The scheduler is eating 40-60% of the machine on real world 8 cpu
> > workloads. That isn't going to go away by sticking heads in sand.
>
> What about a CONFIG_8WAY which, if set, activates a scheduler that
> performs better on such nontypical machines? I see and understand
> boths sides arguments yet I fail to see where the real problem is
> with having a scheduler that just kicks in _iff_ we're running the
> kernel on a nontypical kind of machine.
> This would keep the straigtforward scheduler Linus is defending
> for the single processor machines while providing more performance
> to heavy SMP machines by having a more complex scheduler better suited
> for this task.

By using a multi-queue scheduler with a global balancing policy you can keep
the core scheduler as is and have the balancing code take care of
distributing the load.
Obviously that code is under CONFIG_SMP, so it's not even compiled in on UP.
In this way you have the same scheduler code running independently, with a
lower load on each run queue and a high locality of locking.
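The multi-queue idea can be sketched in userspace: per-CPU queue lengths
with a greedy balancing pass that migrates one task at a time from the
busiest queue to the idlest until the load is even.  The names and the stop
condition are illustrative only, not the patch being described:

```c
/* Toy model of multi-queue balancing: nr_running[i] is the length of
 * CPU i's private run queue.  In a real kernel each queue would have
 * its own lock, so this pass is the only cross-CPU synchronization. */
#define NCPUS 4

static void balance(unsigned int nr_running[NCPUS])
{
    for (;;) {
        int busiest = 0, idlest = 0, i;

        for (i = 1; i < NCPUS; i++) {
            if (nr_running[i] > nr_running[busiest]) busiest = i;
            if (nr_running[i] < nr_running[idlest]) idlest = i;
        }
        if (nr_running[busiest] - nr_running[idlest] <= 1)
            break;               /* close enough: stop migrating */
        nr_running[busiest]--;   /* migrate one task */
        nr_running[idlest]++;
    }
}
```

Starting from queue lengths {10, 0, 2, 0}, the pass settles at {3, 3, 3, 3};
each migration touches only two queues, which is the locality argument above.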




- Davide



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 18:27             ` Doug Ledford
@ 2001-12-18 18:52               ` Andreas Dilger
  2001-12-18 19:03                 ` Doug Ledford
  2001-12-19  9:19               ` Peter Wächtler
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 168+ messages in thread
From: Andreas Dilger @ 2001-12-18 18:52 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Kernel Mailing List

On Dec 18, 2001  13:27 -0500, Doug Ledford wrote:
> Andreas Dilger wrote:
> > Yes, esd is an interrupt hog, it seems.  When reading this thread, I
> > checked, and sure enough I was getting 190 interrupts/sec on the
> > sound card while not playing any sound.  I killed esd (which I don't
> > use anyways), and interrupts went to 0/sec when not playing sound.
> > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S)
> > sound card.
> 
> Weel, evidently esd and artsd both do this (well, I assume esd does now, it 
> didn't do this in the past).  Basically, they both transmit silence over the 
> sound chip when nothing else is going on.  So even though you don't hear 
> anything, the same sound output DMA is taking place.  That avoids things 
> like nasty pops when you start up the sound hardware for a beep and that 
> sort of thing.

Hmm, I _do_ notice a pop when the sound hardware is first initialized at
boot time, but not when mpg123 starts/stops (without esd running), so I
personally don't get any benefit from "the sound of silence".  That said,
aside from the 190 interrupts/sec from esd, it doesn't appear to use any
measurable CPU time by itself.

> Context switches per second not playing any sound: 8300 - 8800
> Context switches per second playing an MP3: 9200 - 9900

Hmm, something seems very strange there.  On an idle system, I get about
100 context switches/sec, and about 150/sec when playing sound (up to 400/sec
when moving the mouse between windows).  9000 cswitches/sec is _very_ high.
This is with a text-only player which has screen output (other than the
ID3 info from the currently played song).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:27         ` Linus Torvalds
  2001-12-18 17:54           ` Andreas Dilger
@ 2001-12-18 18:58           ` Alan Cox
  2001-12-18 19:31             ` Gerd Knorr
  1 sibling, 1 reply; 168+ messages in thread
From: Alan Cox @ 2001-12-18 18:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Mansfield, William Lee Irwin III, Kernel Mailing List,
	Jeff Garzik

> Maybe the best thing to do is to educate the people who write the sound
> apps for Linux (somebody was complaining about "esd" triggering this, for
> example).

esd is a culprit, and artsd to an extent. esd is scheduled to die, so artsd
is the big one to tidy. Kernel-side OSS is dead, so it's a matter for ALSA.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 18:52               ` Andreas Dilger
@ 2001-12-18 19:03                 ` Doug Ledford
  0 siblings, 0 replies; 168+ messages in thread
From: Doug Ledford @ 2001-12-18 19:03 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Kernel Mailing List

Andreas Dilger wrote:


> Hmm, I _do_ notice a pop when the sound hardware is first initialized at
> boot time, but not when mpg123 starts/stops (without esd running) so I
> personally don't get any benefit from "the sound of silence".  That said,
> asside from the 190 interrupts/sec from esd, it doesn't appear to use any
> measurable CPU time by itself.
> 
> 
>>Context switches per second not playing any sound: 8300 - 8800
>>Context switches per second playing an MP3: 9200 - 9900
>>
> 
> Hmm, something seems very strange there.  On an idle system, I get about
> 100 context switches/sec, and about 150/sec when playing sound (up to 400/sec
> when moving the mouse between windows).  9000 cswitches/sec is _very_ high.
> This is with a text-only player which has screen output (other than the
> ID3 info from the currently played song).


I haven't taken the time to track down what's causing all the context 
switches, but on my system they are indeed "normal".  I suspect large 
numbers of them are a result of interactions between gnome, nautilus, X, 
xmms, esd, and gnome-xmms.  However, I did just track down one reason for 
it.  It's not 8300 - 8800, it's 830 - 880.  There appears to be a bug in the 
procinfo -n1 mode that results in an extra digit getting tacked onto the end 
of the context switch line.  So, take my original numbers and lop off the 
last digit from the context switch numbers and that's more like what the 
machine is actually doing.





-- 

  Doug Ledford <dledford@redhat.com>  http://people.redhat.com/dledford
       Please check my web site for aic7xxx updates/answers before
                       e-mailing me about problems


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:18     ` Linus Torvalds
@ 2001-12-18 19:04       ` Alan Cox
  2001-12-18 21:02         ` Larry McVoy
  2001-12-19 16:50         ` Scheduler ( was: Just a second ) Daniel Phillips
  2001-12-18 19:11       ` Scheduler ( was: Just a second ) Mike Galbraith
  2001-12-18 19:15       ` Rik van Riel
  2 siblings, 2 replies; 168+ messages in thread
From: Alan Cox @ 2001-12-18 19:04 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi,
	Kernel Mailing List

> I'm not subscribed to any, thank you very much. I read them through a news
> gateway, which gives me access to the common ones.
> 
> And if the discussion wasn't on the common ones, then it wasn't an open
> discussion.

If the discussion was on the l/k list then most kernel developers aren't
going to read it because they don't have time to wade through all the crap
that doesn't matter to them.
 
> And no, I don't think IRC counts either, sorry.

IRC is where most stuff, especially cross-vendor stuff, is initially
discussed nowadays, along with kernelnewbies, where most of the intro
stuff is - but that's discussed rather than formally proposed and studied.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:18     ` Linus Torvalds
  2001-12-18 19:04       ` Alan Cox
@ 2001-12-18 19:11       ` Mike Galbraith
  2001-12-18 19:15       ` Rik van Riel
  2 siblings, 0 replies; 168+ messages in thread
From: Mike Galbraith @ 2001-12-18 19:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi,
	Kernel Mailing List

On Tue, 18 Dec 2001, Linus Torvalds wrote:

> And no, I don't think IRC counts either, sorry.

Well yeah.. it's synchronous IO :)

	-Mike


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:18     ` Linus Torvalds
  2001-12-18 19:04       ` Alan Cox
  2001-12-18 19:11       ` Scheduler ( was: Just a second ) Mike Galbraith
@ 2001-12-18 19:15       ` Rik van Riel
  2001-12-18 22:32         ` in defense of the linux-kernel mailing list Ingo Molnar
  2 siblings, 1 reply; 168+ messages in thread
From: Rik van Riel @ 2001-12-18 19:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List

On Tue, 18 Dec 2001, Linus Torvalds wrote:

> And no, I don't think IRC counts either, sorry.

Whether you think it counts or not, IRC is where
most stuff is happening nowadays.

cheers,

Rik
-- 
DMCA, SSSCA, W3C?  Who cares?  http://thefreeworld.net/

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 17:00           ` Linus Torvalds
@ 2001-12-18 19:17             ` Alan Cox
  0 siblings, 0 replies; 168+ messages in thread
From: Alan Cox @ 2001-12-18 19:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Alan Cox, Rik van Riel, Davide Libenzi, Kernel Mailing List

> > The scheduler is eating 40-60% of the machine on real world 8 cpu workloads.
> > That isn't going to go away by sticking heads in sand.
> 
> Did you _read_ what I said?
> 
> We _have_ patches. You apparently have your own set.

I did read that mail - but somewhat later. Right now I'm scanning l/k
every few days, no more.

As to my stuff - everything I propose that differs from the ibm/davide work
is about the cost/speed of ordering, or minor optimisations. I don't plan to
compete and duplicate work

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 18:58           ` Alan Cox
@ 2001-12-18 19:31             ` Gerd Knorr
  0 siblings, 0 replies; 168+ messages in thread
From: Gerd Knorr @ 2001-12-18 19:31 UTC (permalink / raw)
  To: linux-kernel

>  Kernel side OSS is dead

What do you mean by "Kernel side OSS"?  Only Hannu's OSS/free drivers?
Or all current kernel drivers which support the OSS API, including most
(all?) PCI sound drivers which don't use any old OSS/free code?

  Gerd

-- 
#define	ENOCLUE 125 /* userland programmer induced race condition */

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds
  2001-12-18 16:56   ` Rik van Riel
  2001-12-18 17:55   ` Scheduler ( was: Just a second ) Davide Libenzi
@ 2001-12-18 19:43   ` Alexander Viro
  2 siblings, 0 replies; 168+ messages in thread
From: Alexander Viro @ 2001-12-18 19:43 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Benjamin LaHaise, Davide Libenzi, Kernel Mailing List



On Tue, 18 Dec 2001, Linus Torvalds wrote:

> Where are the negative comments from Al? (Al _always_ has negative
> comments and suggestions for improvements, don't try to say that he also
> liked it unconditionally ;)

Heh.

Aside from a _big_ problem with exposing an async API to userland (for a
lot of reasons, including the usual quality of async code in general and
event-drivel one in particular) there is a more specific one - Ben's
long-promised full-async writepage() and friends.  I'll believe it
when I see it, and so far it hasn't appeared.

So for the time being I'm staying the fsck out of that - I don't like
it, but I'm sick and tired of this sort of religious wars.


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 19:04       ` Alan Cox
@ 2001-12-18 21:02         ` Larry McVoy
  2001-12-18 21:14           ` David S. Miller
                             ` (2 more replies)
  2001-12-19 16:50         ` Scheduler ( was: Just a second ) Daniel Phillips
  1 sibling, 3 replies; 168+ messages in thread
From: Larry McVoy @ 2001-12-18 21:02 UTC (permalink / raw)
  To: Alan Cox
  Cc: Linus Torvalds, Rik van Riel, Benjamin LaHaise, Davide Libenzi,
	Kernel Mailing List

Maybe I'm an old stick in the mud, but IRC seems like a big waste of
time to me.  It's perfect for off the cuff answers and fairly useless
for thoughtful answers.  We used to write well thought out papers and
specifications for OS work.  These days if you can't do it in a paragraph
on IRC it must not be worth doing, eh?

On Tue, Dec 18, 2001 at 07:04:59PM +0000, Alan Cox wrote:
> > I'm not subscribed to any, thank you very much. I read them through a news
> > gateway, which gives me access to the common ones.
> > 
> > And if the discussion wasn't on the common ones, then it wasn't an open
> > discussion.
> 
> If the discussion was on the l/k list then most kernel developers aren't
> going to read it because they don't have time to wade through all the crap
> that doesn't matter to them.
>  
> > And no, I don't think IRC counts either, sorry.
> 
> IRC is where most stuff, especially cross-vendor stuff, is initially
> discussed nowadays, along with kernelnewbies where most of the intro
> stuff is - but that's discussed rather than formally proposed and studied

-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 21:02         ` Larry McVoy
@ 2001-12-18 21:14           ` David S. Miller
  2001-12-18 21:17             ` Larry McVoy
  2001-12-18 21:18           ` Rik van Riel
  2001-12-19 17:44           ` IRC (was: Scheduler) Daniel Phillips
  2 siblings, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-18 21:14 UTC (permalink / raw)
  To: lm; +Cc: alan, torvalds, riel, bcrl, davidel, linux-kernel

   From: Larry McVoy <lm@bitmover.com>
   Date: Tue, 18 Dec 2001 13:02:28 -0800

   Maybe I'm an old stick in the mud, but IRC seems like a big waste of
   time to me.

It's like being at a Linux conference all the time. :-)

It does kind of make sense given that people are so scattered across
the planet.  Sometimes I want to just grill someone on something, and
email would be too much back and forth; IRC is one way to accomplish
that.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 21:14           ` David S. Miller
@ 2001-12-18 21:17             ` Larry McVoy
  2001-12-18 21:19               ` Rik van Riel
  2001-12-18 21:30               ` David S. Miller
  0 siblings, 2 replies; 168+ messages in thread
From: Larry McVoy @ 2001-12-18 21:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: lm, alan, torvalds, riel, bcrl, davidel, linux-kernel

On Tue, Dec 18, 2001 at 01:14:20PM -0800, David S. Miller wrote:
>    From: Larry McVoy <lm@bitmover.com>
>    Date: Tue, 18 Dec 2001 13:02:28 -0800
> 
>    Maybe I'm an old stick in the mud, but IRC seems like a big waste of
>    time to me.
> 
> It's like being at a Linux conference all the time. :-)
> 
> It does kind of make sense given that people are so scattered across
> the planet.  Sometimes I want to just grill someone on something, and
> email would be too much back and forth, IRC is one way to accomplish
> that.

Let me introduce you to this neat invention called a telephone.  It's
the black thing next to your desk, it rings, has buttons.  If you push
the right buttons, well, it's magic...

:-)

-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 21:02         ` Larry McVoy
  2001-12-18 21:14           ` David S. Miller
@ 2001-12-18 21:18           ` Rik van Riel
  2001-12-19 17:44           ` IRC (was: Scheduler) Daniel Phillips
  2 siblings, 0 replies; 168+ messages in thread
From: Rik van Riel @ 2001-12-18 21:18 UTC (permalink / raw)
  To: Larry McVoy
  Cc: Alan Cox, Linus Torvalds, Benjamin LaHaise, Davide Libenzi,
	Kernel Mailing List

On Tue, 18 Dec 2001, Larry McVoy wrote:

> Maybe I'm an old stick in the mud, but IRC seems like a big waste of
> time to me.  It's perfect for off the cuff answers and fairly useless
> for thoughtful answers.  We used to write well thought out papers and
> specifications for OS work.  These days if you can't do it in a
> paragraph on IRC it must not be worth doing, eh?

Actually, we tend to use multiple media at the same time.

It happens very often that because of some discussion on
IRC we end up writing up a few paragraphs and sending it
to people by email.

For other things, email is clearly too slow, so stuff is
done on IRC (eg. walking somebody through a piece of code
to identify and agree on a bug).

cheers,

Rik
--
DMCA, SSSCA, W3C?  Who cares?  http://thefreeworld.net/

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 21:17             ` Larry McVoy
@ 2001-12-18 21:19               ` Rik van Riel
  2001-12-18 21:30               ` David S. Miller
  1 sibling, 0 replies; 168+ messages in thread
From: Rik van Riel @ 2001-12-18 21:19 UTC (permalink / raw)
  To: Larry McVoy; +Cc: David S. Miller, alan, torvalds, bcrl, davidel, linux-kernel

On Tue, 18 Dec 2001, Larry McVoy wrote:
> On Tue, Dec 18, 2001 at 01:14:20PM -0800, David S. Miller wrote:
> >    From: Larry McVoy <lm@bitmover.com>
> >    Date: Tue, 18 Dec 2001 13:02:28 -0800
> >
> >    Maybe I'm an old stick in the mud, but IRC seems like a big waste of
> >    time to me.
> >
> > It's like being at a Linux conference all the time. :-)
>
> Let me introduce you to this neat invention called a telephone.  It's
> the black thing next to your desk, it rings, has buttons.  If you push
> the right buttons, well, it's magic...

Yeah, but you can't scroll up a page on the phone...

(also, talking with multiple people at the same time
is kind of annoying in audio, while it's ok on irc)

Rik
--
DMCA, SSSCA, W3C?  Who cares?  http://thefreeworld.net/

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 21:17             ` Larry McVoy
  2001-12-18 21:19               ` Rik van Riel
@ 2001-12-18 21:30               ` David S. Miller
  1 sibling, 0 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-18 21:30 UTC (permalink / raw)
  To: lm; +Cc: alan, torvalds, riel, bcrl, davidel, linux-kernel

   From: Larry McVoy <lm@bitmover.com>
   Date: Tue, 18 Dec 2001 13:17:13 -0800
   
   Let me introduce you to this neat invention called a telephone.  It's
   the black thing next to your desk, it rings, has buttons.  If you push
   the right buttons, well, it's magic...

I'm not calling Holland every time I want to poke Jens about
something in a patch we're working on :-)

I hate telephones for technical stuff, because people can call the
fucking thing when I am not behind my computer or even worse when I AM
behind my computer and I want to concentrate on the code on my screen
without being disturbed.  With IRC it is MY CHOICE to get involved in
the discussion, I can choose to respond or not respond to someone, I
can choose to be available or not available at any given time.  It's
just a real-time version of email.  And the "passive, I can ignore
you" part is what I like about it.

Telephones frankly suck for discussing technical topics.  I can't cut
and paste pieces of code from my other editor buffer to show you over
the phone, as another example of why.

A lot of people like to use telephones specifically because it does
not give the other party the option of ignoring you once they pick up
the phone.  I value the ability to make the choice to ignore people
because a lot of ideas I don't give a crap about come under my nose.

In fact that may be one of the best parts about Linux development
compared to doing stuff at a company: one isn't required to listen to
someone's idea or to even read it.  If today I don't give a crap about
Joe's filesystem idea, hey guess what I'm not going to read any of his
emails about the thing.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: in defense of the linux-kernel mailing list
  2001-12-18 19:15       ` Rik van Riel
@ 2001-12-18 22:32         ` Ingo Molnar
  0 siblings, 0 replies; 168+ messages in thread
From: Ingo Molnar @ 2001-12-18 22:32 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Linus Torvalds, Benjamin LaHaise, Davide Libenzi,
	Kernel Mailing List


On Tue, 18 Dec 2001, Rik van Riel wrote:

> > And no, I don't think IRC counts either, sorry.
>
> Whether you think it counts or not, IRC is where most stuff is
> happening nowadays.

most of the useful traffic on lkml cannot be expressed well on IRC. While
IRC might be useful as an additional form of communication channel, email
lists IMO should still be the main driving force of Linux kernel
development, else we'll only concentrate on those minute ideas that can be
expressed in 1-2 lines on irc and which are simple enough to be understood
before the next message comes. Also, the lack of reliable archiving of IRC
traffic prevents newcomers from reproducing the thought process afterwards.
While IRC might result in the seasoned kernel developer doing the next
super-patch quickly, in the end it will only isolate and alienate
newcomers and will result in an aging, personality-driven elitist
old-boys network and a dying OS.

Regarding the use of IRC as the main development medium for the Linux
kernel - the fast pace of IRC often prevents deeper thought. While this
is definitely the point for many people who use IRC, it cannot result in a
much better kernel. [That said, i'm using irc on a daily basis as well,
so this is not irc-bashing - i just rarely use it for development
purposes.]

It's true that reading off-topic emails on lkml isn't a wise use of
developer powers either, but this has to be taken into account just like
spam - it's the price of having an open forum.

and honestly, many of the complaints about lkml's quality are exaggerated.
What you don't take into account is the fact that while 3 or 5 years ago
you found perhaps every email on lkml exciting and challenging, today you
are an experienced kernel hacker and find perhaps 90% of the traffic
'boring'. I've just done a test - and perhaps i picked the wrong set of
emails - but the majority of lkml traffic is pretty legitimate, and i
would have found most of them 'interesting and exciting' just 5 years ago.
Today i know what they mean and might find them less challenging to
understand - but that is one of the bad side-effects of experience.
Today there are more people on lkml, more bugs get reported, and more
patches are discussed - so keeping up with lkml traffic is harder. Perhaps
it might make sense to separate linux-kernel into two lists:
linux-kernel-bugs and linux-kernel-devel (without moderation), but
otherwise the current form and quality of discussions (knock on wood) is
pretty OK i think.

also, more formal emails match actual source code format better than
informal IRC traffic does. So, by being kind of forced to structure
information into a larger set of ASCII text, one also takes the first
step towards good kernel code.

(on IRC one might be the super-hacker with a well-known nick, entering and
exiting channels, being talked to by newbies. It might boost one's ego.
But it should not cloud one's judgement.)

	Ingo


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 18:27             ` Doug Ledford
  2001-12-18 18:52               ` Andreas Dilger
@ 2001-12-19  9:19               ` Peter Wächtler
  2001-12-19 11:05               ` Helge Hafting
  2001-12-21 20:23               ` Rob Landley
  3 siblings, 0 replies; 168+ messages in thread
From: Peter Wächtler @ 2001-12-19  9:19 UTC (permalink / raw)
  To: Doug Ledford; +Cc: Kernel Mailing List

Doug Ledford schrieb:
> 
> Andreas Dilger wrote:
> 
> > On Dec 18, 2001  09:27 -0800, Linus Torvalds wrote:
> >
> >>Maybe the best thing to do is to educate the people who write the sound
> >>apps for Linux (somebody was complaining about "esd" triggering this, for
> >>example).
> >>
> >
> > Yes, esd is an interrupt hog, it seems.  When reading this thread, I
> > checked, and sure enough I was getting 190 interrupts/sec on the
> > sound card while not playing any sound.  I killed esd (which I don't
> > use anyways), and interrupts went to 0/sec when not playing sound.
> > Still at 190/sec when using mpg123 on my ymfpci (Yamaha YMF744B DS-1S)
> > sound card.
> 
> Well, evidently esd and artsd both do this (well, I assume esd does now, it
> didn't do this in the past).  Basically, they both transmit silence over the
> sound chip when nothing else is going on.  So even though you don't hear
> anything, the same sound output DMA is taking place.  That avoids things
> like nasty pops when you start up the sound hardware for a beep and that
> sort of thing.  It also maintains state, whereas dropping output entirely
> could result in things like module auto unloading and then reloading on the
> next beep, etc.  Personally, the interrupt count and overhead annoyed me
> enough that when I started hacking on the i810 sound driver one of my
> primary goals was to get overhead and interrupt count down.  I think I
> succeeded quite well.  On my current workstation:
> 
> Context switches per second not playing any sound: 8300 - 8800
> Context switches per second playing an MP3: 9200 - 9900
> Interrupts per second from sound device: 86
> %CPU used when not playing MP3: 0 - 3% (magicdev is a CPU pig once every 2
> seconds)
> %CPU used when playing MP3s: 0 - 4%
> 
> In any case, it might be worth the original poster's time in figuring out
> just how much of his lost CPU is because of playing sound and how much is
> actually caused by the windowing system and all the associated bloat that
> comes with it nowadays.
> 

Do you really think 8000 context switches a second are sane?

pippin:/var/log # vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 2  0  0 100728   4424 121572  27800   0   1     6     6   61    77  98   2   0
 2  0  0 100728   5448 121572  27800   0   0     0    68  112   811  93   7   0
 2  0  0 100728   5448 121572  27800   0   0     0     0  101   776  95   5   0
 3  0  0 100728   4928 121572  27800   0   0     0     0  101   794  92   8   0

having a load ~2.1 (2 seti@home)
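(As a cross-check: the `in` and `cs` columns vmstat prints are just per-second
deltas of the running totals in /proc/stat, so the figures can be verified
without vmstat at all. A minimal sketch - Python purely for illustration; it
assumes a Linux /proc where the first number after the `intr` and `ctxt`
keywords is the running total since boot:)

```python
import time

def rates(interval=1.0):
    """Return (interrupts/sec, context switches/sec) sampled from /proc/stat,
    i.e. the numbers behind vmstat's 'in' and 'cs' columns."""
    def snapshot():
        totals = {}
        with open("/proc/stat") as f:
            for line in f:
                fields = line.split()
                if fields and fields[0] in ("intr", "ctxt"):
                    totals[fields[0]] = int(fields[1])  # running total since boot
        return totals
    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    return ((after["intr"] - before["intr"]) / interval,
            (after["ctxt"] - before["ctxt"]) / interval)

if __name__ == "__main__":
    irq, cs = rates()
    print("interrupts/sec: %.0f  context switches/sec: %.0f" % (irq, cs))
```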

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 18:27             ` Doug Ledford
  2001-12-18 18:52               ` Andreas Dilger
  2001-12-19  9:19               ` Peter Wächtler
@ 2001-12-19 11:05               ` Helge Hafting
  2001-12-21 20:23               ` Rob Landley
  3 siblings, 0 replies; 168+ messages in thread
From: Helge Hafting @ 2001-12-19 11:05 UTC (permalink / raw)
  To: Doug Ledford, linux-kernel

Doug Ledford wrote:

> Well, evidently esd and artsd both do this (well, I assume esd does now, it
> didn't do this in the past).  Basically, they both transmit silence over the
> sound chip when nothing else is going on.  So even though you don't hear
> anything, the same sound output DMA is taking place.  

Uuurgh. :-(

> That avoids things
> like nasty pops when you start up the sound hardware for a beep and that

Yuk, bad hardware.  Pops when you start or stop writing?  You don't even
have to turn the volume off or something to get a pop?  Toss it.

> sort of thing.  It also maintains state, whereas dropping output entirely
> could result in things like module auto unloading and then reloading on the
> next beep, etc.  

Much better solved by keeping the device open, but not writing anything.
Open devices don't unload.
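(The alternative above amounts to a few lines: the sound daemon opens the
device once and simply holds the descriptor instead of streaming silence; an
open file keeps the driver's use count nonzero, so module auto-unloading never
fires. A hypothetical sketch - Python for brevity; /dev/dsp is the assumed
OSS device node, and this is an illustration, not what esd/artsd actually do:)

```python
import os

def hold_open(path="/dev/dsp"):
    """Open the (assumed) OSS device and return the descriptor.

    Nothing is ever written: merely holding the descriptor pins the
    driver module, so there is no unload/reload pop on the next beep
    and no silence-DMA interrupt load while idle.
    """
    return os.open(path, os.O_WRONLY)

# fd = hold_open()   # keep for the daemon's lifetime; write() only real audio
# os.close(fd)       # dropping the descriptor releases the use count
```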

Helge Hafting

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 19:04       ` Alan Cox
  2001-12-18 21:02         ` Larry McVoy
@ 2001-12-19 16:50         ` Daniel Phillips
       [not found]           ` <Pine.LNX.4.33.0112190859050.1872-100000@penguin.transmeta.com>
  1 sibling, 1 reply; 168+ messages in thread
From: Daniel Phillips @ 2001-12-19 16:50 UTC (permalink / raw)
  To: Alan Cox, Linus Torvalds
  Cc: Rik van Riel, Benjamin LaHaise, Davide Libenzi,
	Kernel Mailing List

On December 18, 2001 08:04 pm, Alan Cox wrote:
> > I'm not subscribed to any, thank you very much. I read them through a news
> > gateway, which gives me access to the common ones.
> > 
> > And if the discussion wasn't on the common ones, then it wasn't an open
> > discussion.
> 
> If the discussion was on the l/k list then most kernel developers arent
> going to read it because tey dont have time to wade through all the crap
> that doesnt matter to them.

Hi Alan,

It's AIO we're talking about, right?  AIO is interesting to quite a few 
people.  I'd read the thread.  I'd also read any background material that Ben 
would be so kind as to supply.

--
Daniel

^ permalink raw reply	[flat|nested] 168+ messages in thread

* IRC (was: Scheduler)
  2001-12-18 21:02         ` Larry McVoy
  2001-12-18 21:14           ` David S. Miller
  2001-12-18 21:18           ` Rik van Riel
@ 2001-12-19 17:44           ` Daniel Phillips
  2001-12-19 17:51             ` Larry McVoy
  2001-12-19 18:19             ` M. Edward (Ed) Borasky
  2 siblings, 2 replies; 168+ messages in thread
From: Daniel Phillips @ 2001-12-19 17:44 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Kernel Mailing List

On December 18, 2001 10:02 pm, Larry McVoy wrote:
> Maybe I'm an old stick in the mud, but IRC seems like a big waste of
> time to me.  It's perfect for off the cuff answers and fairly useless
> for thoughtful answers.  We used to write well thought out papers and
> specifications for OS work.  These days if you can't do it in a paragraph
> on IRC it must not be worth doing, eh?

Hi Larry,

It's a question of using the right tool for the job.  As you know, email is 
no substitute for a traditional everybody-in-one-room design meeting.  These 
days, with development distributed all over the world, it's just not practical 
for everyone to physically get together more than a few times a year, so what 
can we do?  Right, hang on IRC.

In some ways IRC is more efficient than a face-to-face meeting:

  - You can do other things at the same time without offending anyone
    (usually)

  - Everything is logged for reference

  - You can copy code examples and URLs into the channel

  - It's normal to send/forward emails, perhaps with traditional papers 
    attached, patches, whatever, while talking on the channel, or as a
    result of talking on the channel

  - It's there 24 hours a day

  - You can leave the meeting and do work any time you want to, as opposed to 
    keeping some portion of a group of highly paid engineers bored and idle 
    for hours at a time.

IRC also solves a big problem for distributed companies: how can you be sure 
that your people are actually on the job?  (You ping them on IRC and they 
respond.)

While there's no doubt about IRC's value, there's also a danger:  IRC is 
addictive.  You can easily end up spending all your time there, and doing 
very little design/coding as a result.  That's a matter of self-discipline.

To put this into a more immediate perspective for you, suppose you wanted to 
get some traction under your SMP Clusters proposal?  I'd suggest it's already 
been kicked around as much as it's going to be on lkml, and you already wrote 
your paper, so the next step would be to get together face-to-face with some 
folks who have a clue.  Well, unless you're willing to wait months for the 
right people to show up in the Bay Area, IRC is the way to go.

Come on in, the water's fine ;-)

--
Daniel

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: IRC (was: Scheduler)
  2001-12-19 17:44           ` IRC (was: Scheduler) Daniel Phillips
@ 2001-12-19 17:51             ` Larry McVoy
  2001-12-19 18:24               ` Daniel Phillips
  2001-12-19 18:19             ` M. Edward (Ed) Borasky
  1 sibling, 1 reply; 168+ messages in thread
From: Larry McVoy @ 2001-12-19 17:51 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Larry McVoy, Kernel Mailing List

On Wed, Dec 19, 2001 at 06:44:35PM +0100, Daniel Phillips wrote:
> On December 18, 2001 10:02 pm, Larry McVoy wrote:
> > Maybe I'm an old stick in the mud, but IRC seems like a big waste of
> > time to me.  It's perfect for off the cuff answers and fairly useless
> > for thoughtful answers.  We used to write well thought out papers and
> > specifications for OS work.  These days if you can't do it in a paragraph
> > on IRC it must not be worth doing, eh?
> 
> To put this into a more immediate perspective for you, suppose you wanted to 
> get some traction under your SMP Clusters proposal?  I'd suggest it's already 
> been kicked around as much as it's going to be on lkml, and you already wrote 
> your paper, so the next step would be to get together face-to-face with some 
> folks who have a clue.  Well, unless you're willing to wait months for the 
> right people to show up in the Bay Area, IRC is the way to go.

Actually, I haven't written a paper.  A paper is something which lays out

    goals
    architecture
    milestones
    design details

and should be sufficient to make the project happen should I be hit by a
bus.  That's my main complaint with IRC: it requires me to keep coming
back and explaining the same thing over and over again.

Here's an idea: you go try and get some traction on the OS cluster idea.
I'll give you 6 months and we'll see what happens.  If nothing has
happened, I'll produce a decent paper describing it and then we wait
another 6 months to see what happens.  I'll bet you 10:1 odds I get a
lot more action from a lot more people than you do.  Nope, wait, make
that 100:1 odds.

I've seen how little I manage to get done by talking.  Talk is cheap.
I've also seen how much I get done when I write a paper which other
people can pass around, think about, discuss, and implement.  A senior
guy at Morgan Stanley (hi marc) once told me "if you want to get things
done, write them down".  And in my case, since people tend to like to
argue with me rather than listen to me (yup, it's my fault, my "style"
leaves "room for improvement" translation: sucks rocks), a paper is 
far more effective.  My style is pretty much removed from the equation.

I can just see me on IRC, all I'd be getting is style complaints while
people successfully avoid the real points.  Look at the last 8 years 
of LKML.  I'd say most of the effect was from the LMbench paper and
maybe a few threads on performance which would have been more effective
if I'd written a detailed paper explaining my point of view.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: IRC (was: Scheduler)
  2001-12-19 17:44           ` IRC (was: Scheduler) Daniel Phillips
  2001-12-19 17:51             ` Larry McVoy
@ 2001-12-19 18:19             ` M. Edward (Ed) Borasky
  2001-12-19 18:27               ` Daniel Phillips
  2001-12-19 18:40               ` J Sloan
  1 sibling, 2 replies; 168+ messages in thread
From: M. Edward (Ed) Borasky @ 2001-12-19 18:19 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Kernel Mailing List

On Wed, 19 Dec 2001, Daniel Phillips wrote:

> To put this into a more immediate perspective for you, suppose you wanted to
> get some traction under your SMP Clusters proposal?  I'd suggest it's already
> been kicked around as much as it's going to be on lkml, and you already wrote
> your paper, so the next step would be to get together face-to-face with some
> folks who have a clue.  Well, unless you're willing to wait months for the
> right people to show up in the Bay Area, IRC is the way to go.
>
> Come on in, the water's fine ;-)

I've watched with great interest the discussion of IRC for Linux folk
and have yet to see anyone mention server/network names and channel
names. I've been on IRC for 2.5 years -- I tracked the Y2K transition on
IRC despite all the dire warnings that evil impulses were going to shoot
down the wire and fry the LCD screen on my laptop. So -- just where
exactly *is* this water that is so fine? mIRC 5.91 and I await with
bated breath. (Yes, I do use a Windows IRC client -- wanna make
something of it? :-)
--
Ed Borasky  znmeb@aracnet.com  http://www.borasky-research.net
(sometimes known as znmeb on IRC :-)

How to Stop A Folksinger Cold # 4
"Tie me kangaroo down, sport..."
Tie your own kangaroo down -- and stop calling me "sport"!


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: IRC (was: Scheduler)
  2001-12-19 17:51             ` Larry McVoy
@ 2001-12-19 18:24               ` Daniel Phillips
  0 siblings, 0 replies; 168+ messages in thread
From: Daniel Phillips @ 2001-12-19 18:24 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Larry McVoy, Kernel Mailing List

On December 19, 2001 06:51 pm, Larry McVoy wrote:
> On Wed, Dec 19, 2001 at 06:44:35PM +0100, Daniel Phillips wrote:
> > On December 18, 2001 10:02 pm, Larry McVoy wrote:
> > > Maybe I'm an old stick in the mud, but IRC seems like a big waste of
> > > time to me.  It's perfect for off the cuff answers and fairly useless
> > > for thoughtful answers.  We used to write well thought out papers and
> > > specifications for OS work.  These days if you can't do it in a paragraph
> > > on IRC it must not be worth doing, eh?
> > 
> > To put this into a more immediate perspective for you, suppose you wanted to 
> > get some traction under your SMP Clusters proposal?  I'd suggest it's already 
> > been kicked around as much as it's going to be on lkml, and you already wrote 
> > your paper, so the next step would be to get together face-to-face with some 
> > folks who have a clue.  Well, unless you're willing to wait months for the 
> > right people to show up in the Bay Area, IRC is the way to go.
> 
> Actually, I haven't written a paper.  A paper is something which lays out
> 
>     goals
>     architecture
>     milestones
>     design details
> 
> and should be sufficient to make the project happen should I be hit by a
> bus.  That's my main complaint with IRC: it requires me to keep coming
> back and explaining the same thing over and over again.
> 
> Here's an idea: you go try and get some traction on the OS cluster idea.
> I'll give you 6 months and we'll see what happens.  If nothing has
> happened, I'll produce a decent paper describing it and then we wait
> another 6 months to see what happens.  I'll bet you 10:1 odds I get a
> lot more action from a lot more people than you do.  Nope, wait, make
> that 100:1 odds.

Sorry, reverse psychology doesn't work that well on me ;)

> I've seen how little I manage to get done by talking.  Talk is cheap.
> I've also seen how much I get done when I write a paper which other
> people can pass around, think about, discuss, and implement.  A senior
> guy at Morgan Stanley (hi marc) once told me "if you want to get things
> done, write them down".  And in my case, since people tend to like to
> argue with me rather than listen to me (yup, it's my fault, my "style"
> leaves "room for improvement" translation: sucks rocks), a paper is 
> far more effective.  My style is pretty much removed from the equation.
> 
> I can just see me on IRC, all I'd be getting is style complaints while
> people successfully avoid the real points.  Look at the last 8 years 
> of LKML.  I'd say most of the effect was from the LMbench paper and
> maybe a few threads on performance which would have been more effective
> if I'd written a detailed paper explaining my point of view.

By all means, write the detailed paper if you've got time, then make
sure people have read it before you talk to them.  But trust me, there
are people hanging out on IRC right now who have more than a clue about
exactly the subject you're interested in, who would need no more than
a short note to be up to speed and ready to address the real issues
intelligently.

--
Daniel

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: IRC (was: Scheduler)
  2001-12-19 18:19             ` M. Edward (Ed) Borasky
@ 2001-12-19 18:27               ` Daniel Phillips
  2001-12-19 18:40               ` J Sloan
  1 sibling, 0 replies; 168+ messages in thread
From: Daniel Phillips @ 2001-12-19 18:27 UTC (permalink / raw)
  To: M. Edward (Ed) Borasky; +Cc: Kernel Mailing List

On December 19, 2001 07:19 pm, M. Edward (Ed) Borasky wrote:
> On Wed, 19 Dec 2001, Daniel Phillips wrote:
> I've watched with great interest the discussion of IRC for Linux folk
> and have yet to see anyone mention server/network names and channel
> names. I've been on IRC for 2.5 years -- I tracked the Y2K transition on
> IRC despite all the dire warnings that evil impulses were going to shoot
> down the wire and fry the LCD screen on my laptop. So -- just where
> exactly *is* this water that is so fine? mIRC 5.91 and I await with
> bated breath. (Yes, I do use a Windows IRC client -- wanna make
> something of it? :-)

/server irc.openprojects.net

/list

--
Daniel

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: IRC (was: Scheduler)
  2001-12-19 18:19             ` M. Edward (Ed) Borasky
  2001-12-19 18:27               ` Daniel Phillips
@ 2001-12-19 18:40               ` J Sloan
  1 sibling, 0 replies; 168+ messages in thread
From: J Sloan @ 2001-12-19 18:40 UTC (permalink / raw)
  To: M. Edward (Ed) Borasky; +Cc: Daniel Phillips, Kernel Mailing List

"M. Edward (Ed) Borasky" wrote:

>  mIRC 5.91 and I await with
> bated breath. (Yes, I do use a Windows IRC client -- wanna make
> something of it? :-)

(shrug) whatever turns you on, I guess...

I will mention that there is this cool OS
called Linux, you might have heard of it -

There are a number of very nice irc clients
available for it

;-)

jjs




^ permalink raw reply	[flat|nested] 168+ messages in thread

* aio
       [not found]           ` <Pine.LNX.4.33.0112190859050.1872-100000@penguin.transmeta.com>
@ 2001-12-19 18:57             ` Ben LaHaise
  2001-12-19 19:29               ` aio Dan Kegel
                                 ` (4 more replies)
  0 siblings, 5 replies; 168+ messages in thread
From: Ben LaHaise @ 2001-12-19 18:57 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 09:01:59AM -0800, Linus Torvalds wrote:
> 
> On Wed, 19 Dec 2001, Daniel Phillips wrote:
> >
> > It's AIO we're talking about, right?  AIO is interesting to quite a few
> > people.  I'd read the thread.  I'd also read any background material that Ben
> > would be so kind as to supply.
> 
> Case closed.
> 
> Dan didn't even _know_ of the patches.

He doesn't read l-k apparently.

> Ben: end of discussion. I will _not_ apply any patches for aio if they
> aren't openly discussed. We're not microsoft, and we're not Sun. We're
> "Open Source", not "cram things down peoples throat and spring new
> features on them as a fait accompli".

Discuss them then to your heart's content.  I've posted announcements to 
both l-k and linux-aio which are both on marc.theaimsgroup.com if you're 
too lazy to get your IS to add a new list to the internal news gateway.

> The ghost of "binary compatibility" is not an issue - if Ben or anybody
> else finds a flaw with the design, it's a hell of a lot better to have
> that flaw fixed _before_ it's part of my kernel rather than afterwards.

Thanks for the useful feedback on the userland interface then.  Evidently 
nobody cares within the community about improving functionality on a 
reasonable timescale.  If this doesn't change soon, Linux is doomed.

		-ben

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:45                       ` aio David S. Miller
@ 2001-12-19 18:57                         ` John Heil
  2001-12-20  3:06                           ` aio David S. Miller
  2001-12-20  3:07                         ` aio Bill Huey
                                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 168+ messages in thread
From: John Heil @ 2001-12-19 18:57 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

On Wed, 19 Dec 2001, David S. Miller wrote:

> Date: Wed, 19 Dec 2001 18:45:27 -0800 (PST)
> From: "David S. Miller" <davem@redhat.com>
> To: billh@tierra.ucsd.edu
> Cc: bcrl@redhat.com, torvalds@transmeta.com, linux-kernel@vger.kernel.org,
>     linux-aio@kvack.org
> Subject: Re: aio
> 
>    From: Bill Huey <billh@tierra.ucsd.edu>
>    Date: Wed, 19 Dec 2001 18:26:28 -0800
>    
>    The economic inertia of Java driven server applications should have
>    enough force that it is justifiable to RedHat and other commercial
>    organizations to support it regardless of what your current view is
>    on this topic.
> 
> So they'll get paid to implement and support it, and that is precisely
> what is happening right now.  And the whole point I'm trying to make
is that that is where its realm is right now.
> 
> If AIO was so relevant+sexy we'd be having threads of discussion about
> the AIO implementation instead of threads about how relevant it is or
> is not for the general populace.  Wouldn't you concur?  :-)
> 
> The people doing Java server applets are such a small fraction of the
> Linux user community.

True for now, but if we want to expand linux into the enterprise and the
desktop to a greater degree, then we need to support the Java community to
draw them and their management in, rather than delaying beneficial 
features until their number on lkml reaches critical mass for a design
discussion.


-
-----------------------------------------------------------------
John Heil
South Coast Software
Custom systems software for UNIX and IBM MVS mainframes
1-714-774-6952
johnhscs@sc-software.com
http://www.sc-software.com
-----------------------------------------------------------------


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-19 18:57             ` aio Ben LaHaise
@ 2001-12-19 19:29               ` Dan Kegel
  2001-12-20  4:04                 ` aio Benjamin LaHaise
  2001-12-19 20:09               ` aio Daniel Phillips
                                 ` (3 subsequent siblings)
  4 siblings, 1 reply; 168+ messages in thread
From: Dan Kegel @ 2001-12-19 19:29 UTC (permalink / raw)
  To: Ben LaHaise; +Cc: Linus Torvalds, linux-kernel, linux-aio

Ben LaHaise wrote:
> > Ben: end of discussion. I will _not_ apply any patches for aio if they
> > aren't openly discussed. We're not microsoft, and we're not Sun. We're
> > "Open Source", not "cram things down peoples throat and spring new
> > features on them as a fait accompli".
> 
> Discuss them then to your heart's content.  I've posted announcements to
> both l-k and linux-aio which are both on marc.theaimsgroup.com ...

Ben, I think maybe we need to get people excited about your patches,
and build up a user base, before putting them in the mainline kernel.
The volume on the linux-aio list has been pretty light, and the
visibility of the patches has been pretty low.

I know I volunteered to write some doc for your aio, and haven't delivered;
thus I'm contributing to the problem.  Mea culpa.  But there are some
small things that could be done.  A freshmeat.net entry for the project,
for instance.  Shall I create one, or would you rather do it?
A home page for linux-aio would be great, too.
- Dan

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:06                           ` aio David S. Miller
@ 2001-12-19 19:30                             ` John Heil
  2001-12-20  5:29                               ` aio David S. Miller
  2001-12-20  3:21                             ` aio Bill Huey
  1 sibling, 1 reply; 168+ messages in thread
From: John Heil @ 2001-12-19 19:30 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

On Wed, 19 Dec 2001, David S. Miller wrote:

> Date: Wed, 19 Dec 2001 19:06:29 -0800 (PST)
> From: "David S. Miller" <davem@redhat.com>
> To: kerndev@sc-software.com
> Cc: billh@tierra.ucsd.edu, bcrl@redhat.com, torvalds@transmeta.com,
>     linux-kernel@vger.kernel.org, linux-aio@kvack.org
> Subject: Re: aio
> 
>    From: John Heil <kerndev@sc-software.com>
>    Date: Wed, 19 Dec 2001 18:57:34 +0000 (   )
>    
>    True for now, but if we want to expand linux into the enterprise and the
>    desktop to a greater degree, then we need to support the Java community to
>    draw them and their management in, rather than delaying beneficial 
>    features until their number on lkml reaches critical mass for a design
>    discussion.
> 
> Firstly, you say this as if server java applets do not function at all
> or with acceptable performance today.  That is not true for the vast
> majority of cases.
> 
> If java server applet performance in all cases is dependent upon AIO
> (it is not), that would be pretty sad.  But it wouldn't be the first
> time I've heard crap like that.  There is propaganda out there telling
> people that 64-bit address spaces are needed for good java
> performance.  Guess where that came from?  (hint: they invented java
and are in the business of selling 64-bit RISC processors)
> 

Agreed. However, put your business hat on for a minute. We want increased
market share for linux and a lot of us, you included, live by it. 
If aio, the proposed implementation or some other, can provide an
adequate performance boost for Java (yet to be seen), that at least 
gives the marketing folks one more argument to draw users to linux. 
Do you think the trade mags etc don't watch what we do? A demonstrable
advantage in Java performance is marketable and beneficial to all.
   

-
-----------------------------------------------------------------
John Heil
South Coast Software
Custom systems software for UNIX and IBM MVS mainframes
1-714-774-6952
johnhscs@sc-software.com
http://www.sc-software.com
-----------------------------------------------------------------


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-19 18:57             ` aio Ben LaHaise
  2001-12-19 19:29               ` aio Dan Kegel
@ 2001-12-19 20:09               ` Daniel Phillips
  2001-12-19 20:21               ` aio Davide Libenzi
                                 ` (2 subsequent siblings)
  4 siblings, 0 replies; 168+ messages in thread
From: Daniel Phillips @ 2001-12-19 20:09 UTC (permalink / raw)
  To: Ben LaHaise, Linus Torvalds; +Cc: linux-kernel, linux-aio, Suparna Bhattacharya

On December 19, 2001 07:57 pm, Ben LaHaise wrote:
> On Wed, Dec 19, 2001 at 09:01:59AM -0800, Linus Torvalds wrote:
> > 
> > On Wed, 19 Dec 2001, Daniel Phillips wrote:
> > >
> > > It's AIO we're talking about, right?  AIO is interesting to quite a few
> > > people.  I'd read the thread.  I'd also read any background material 
> > > that Ben would be so kind as to supply.
> > 
> > Case closed.
> > 
> > Dan didn't even _know_ of the patches.
> 
> He doesn't read l-k apparently.

Dan Kegel put it succinctly:

   http://marc.theaimsgroup.com/?l=linux-aio&m=100879005201064&w=2

Your original patch is here, and I do remember the post at the time:

   http://marc.theaimsgroup.com/?l=linux-kernel&m=98114243104171&w=2

This post provides *zero* context.  Ever since, I've been expecting to see 
some explanation of what the goals are, what the design principles are, what 
the historical context is, etc. etc., and that hasn't happened.

I've got a fairly recent version of the patch too; it's a little too long to 
just sit down and read in order to reverse-engineer the above information.  What's 
missing here is some kind of writeup like Suparna did for Jens' bio patch 
(hint, hint).  There's no reason why every single person who might be 
interested should have to take the time to reverse-engineer the patch without 
context.

As Linus points out, the active discussion hasn't happened yet.

--
Daniel

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-19 18:57             ` aio Ben LaHaise
  2001-12-19 19:29               ` aio Dan Kegel
  2001-12-19 20:09               ` aio Daniel Phillips
@ 2001-12-19 20:21               ` Davide Libenzi
       [not found]               ` <mailman.1008792601.3391.linux-kernel2news@redhat.com>
  2001-12-20  0:13               ` aio David S. Miller
  4 siblings, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-19 20:21 UTC (permalink / raw)
  To: Ben LaHaise; +Cc: Linus Torvalds, linux-kernel, linux-aio

On Wed, 19 Dec 2001, Ben LaHaise wrote:

> Thanks for the useful feedback on the userland interface then.  Evidently
> nobody cares within the community about improving functionality on a
> reasonable timescale.  If this doesn't change soon, Linux is doomed.

Ben, maybe it's true, nobody cares :( This could be either bad or good.
On one side it could be good, because it means that everyone is happy
with the kernel's performance level, perhaps because real world loads do
not put their applications under stress. It could be bad because
applications that are under stress probably do exist ( yes ), but their
developers do not understand that by using different interfaces they
could improve their software ( or they simply do not realize that the
application is under stress ). Or maybe application developers are not
on lk. Or maybe they're not willing to rewrite for / experiment with new
APIs. On one side i understand that you have an intrinsic attitude to
push/defend your patch, while on the other side i can agree with the
Linus point about having some kind of broad discussion/adoption first.
But if application developers are not on this list there won't be a broad
discussion, and if the patch does not go into the mainstream kernel,
"external" application developers are not going to use it. The Linus
point could be: "why do i have to merge a new api that has had such a
cold discussion/adoption inside lk ?".
Yes, egg-and-chicken draws the picture very well.



- Davide




^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
       [not found]               ` <mailman.1008792601.3391.linux-kernel2news@redhat.com>
@ 2001-12-19 20:23                 ` Pete Zaitcev
  0 siblings, 0 replies; 168+ messages in thread
From: Pete Zaitcev @ 2001-12-19 20:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-aio

> > > > It's AIO we're talking about, right?  AIO is interesting to quite a few
> > > > people.  I'd read the thread.  I'd also read any background material 
> > > > that Ben would be so kind as to supply.
> > > 
> > > Case closed.
> > > 
> > > Dan didn't even _know_ of the patches.

> I've got a fairly recent version of the patch too, it's a little too long to 
> just sit down and read, to reverse-engineer the above information.

Heh, I agree, in a way. I did that once, did not find any major
objections and documented about 20 small things like functions
that have extra arguments which are never used, etc. Ben saw it
and said "I know about all that, never mind".
Perhaps I should have posted it somewhere?

-- Pete

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-19 18:57             ` aio Ben LaHaise
                                 ` (3 preceding siblings ...)
       [not found]               ` <mailman.1008792601.3391.linux-kernel2news@redhat.com>
@ 2001-12-20  0:13               ` David S. Miller
  2001-12-20  0:21                 ` aio Benjamin LaHaise
  2001-12-20  1:16                 ` aio Bill Huey
  4 siblings, 2 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  0:13 UTC (permalink / raw)
  To: bcrl; +Cc: torvalds, linux-kernel, linux-aio

   From: Ben LaHaise <bcrl@redhat.com>
   Date: Wed, 19 Dec 2001 13:57:08 -0500
   
   Thanks for the useful feedback on the userland interface then.  Evidently 
   nobody cares within the community about improving functionality on a 
   reasonable timescale.  If this doesn't change soon, Linux is doomed.

Maybe it's because the majority of people don't care nor would ever
need to use AIO.  Are you willing to accept this possibility? :-) Linux
is anything but doomed, because you will notice that the things that
actually matter for most people are in fact improved and worked on
within a reasonable timescale.

Only very specialized applications can even benefit from AIO.  This
doesn't make it useless, but it does decrease the amount of interest
(and priority) anyone in the community will have in working on it.

Now, if these few and far between people who are actually interested
in AIO are willing to throw money at the problem to get it worked on,
that is how the "reasonable timescale" will be arrived at.  And if
they aren't willing to toss money at the problem, how important can it
really be to them? :-)

Maybe, just maybe, most people simply do not care one iota about AIO.

Linux caters to the general concerns, not the nooks and crannies, that
is why it is anything but doomed.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  0:13               ` aio David S. Miller
@ 2001-12-20  0:21                 ` Benjamin LaHaise
  2001-12-20  0:36                   ` aio Andrew Morton
  2001-12-20  0:47                   ` aio Davide Libenzi
  2001-12-20  1:16                 ` aio Bill Huey
  1 sibling, 2 replies; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-20  0:21 UTC (permalink / raw)
  To: David S. Miller; +Cc: torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 04:13:59PM -0800, David S. Miller wrote:
> Now, if these few and far between people who are actually interested
> in AIO are willing to throw money at the problem to get it worked on,
> that is how the "reasonable timescale" will be arrived at.  And if
> they aren't willing to toss money at the problem, how important can it
> really be to them? :-)

People are throwing money at the problem.  We're now at a point that in 
order to provide the interested people with something they can use, we 
need some kind of way to protect their applications against calling an 
unsuspecting new mmap syscall instead of the aio syscall specified in 
the kernel they compiled against.

> Maybe, just maybe, most people simply do not care one iota about AIO.
> 
> Linux caters to the general concerns not the nooks and cranies, that
> is why it is anything but doomed.

What I'm saying is that for more people to play with it, it needs to be 
more widely available.  The set of developers that read linux-kernel and 
linux-aio aren't giving much feedback.  I do not expect the code to go 
into 2.5 at this point in time.  All I need is a set of syscall numbers 
that aren't going to change should this implementation stand up to the 
test of time.
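To illustrate the stability property Ben is asking for: an application built against a fixed, reserved syscall number can probe for it at runtime and degrade gracefully when the running kernel lacks it. This is only a sketch; the number used below is fictitious, and only the glibc syscall(2) wrapper and the ENOSYS convention are real.

```c
/* Probe a (hypothetical) reserved syscall slot at runtime.
   Number 9999 is made up for illustration; no real kernel uses it. */
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

#define NR_hypothetical_aio 9999L  /* fictitious reserved number */

/* Returns 1 if the running kernel implements the call, 0 otherwise. */
static int have_hypothetical_aio(void)
{
    long r = syscall(NR_hypothetical_aio, 0, 0, 0);
    return !(r == -1 && errno == ENOSYS);
}
```

A binary compiled against the reserved number keeps working on older kernels: it sees the failure, falls back, and never calls an unrelated syscall that later claimed the slot - which is exactly what a never-reused number buys.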

		-ben
-- 
Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  0:21                 ` aio Benjamin LaHaise
@ 2001-12-20  0:36                   ` Andrew Morton
  2001-12-20  0:55                     ` aio H. Peter Anvin
  2001-12-20  0:47                   ` aio Davide Libenzi
  1 sibling, 1 reply; 168+ messages in thread
From: Andrew Morton @ 2001-12-20  0:36 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: David S. Miller, torvalds, linux-kernel, linux-aio

Benjamin LaHaise wrote:
> 
> All I need is a set of syscall numbers that aren't going to change
> should this implementation stand up to the test of time.

The aio_* functions are part of POSIX and SUS, so merely reserving
system call numbers for them does not seem a completely dumb
thing to do, IMO.
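For reference, the POSIX aio_* interface Andrew mentions is already callable from userland via glibc/librt (implemented there with threads, not dedicated syscalls). A minimal sketch of one async read, round-tripped through a scratch file; the /tmp path is arbitrary, and older glibc needs -lrt at link time:

```c
/* Minimal POSIX AIO round trip: write a scratch file, then read it back
   with aio_read()/aio_suspend()/aio_return(). */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Submit one asynchronous read at offset 0 and wait for completion.
   Returns bytes read, or -1 on error. */
static ssize_t aio_read_all(int fd, char *buf, size_t len)
{
    struct aiocb cb;
    const struct aiocb *list[1];

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (aio_read(&cb) != 0)            /* queue the request */
        return -1;

    list[0] = &cb;
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);    /* block until it completes */

    return aio_return(&cb);            /* reap the final byte count */
}

/* Demo: returns 5 (the length of "hello") on success. */
static ssize_t aio_demo(void)
{
    const char *path = "/tmp/aio_demo_scratch";
    char buf[16];
    ssize_t n;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0 || write(fd, "hello", 5) != 5)
        return -1;
    n = aio_read_all(fd, buf, sizeof(buf));
    close(fd);
    unlink(path);
    return n;
}
```

Reserved syscall numbers would let this library-level API be rewired onto a kernel implementation later without recompiling applications.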

-

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  0:21                 ` aio Benjamin LaHaise
  2001-12-20  0:36                   ` aio Andrew Morton
@ 2001-12-20  0:47                   ` Davide Libenzi
  1 sibling, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-20  0:47 UTC (permalink / raw)
  To: Benjamin LaHaise; +Cc: David S. Miller, Linus Torvalds, lkml, linux-aio

On Wed, 19 Dec 2001, Benjamin LaHaise wrote:

> What I'm saying is that for more people to play with it, it needs to be
> more widely available.  The set of developers that read linux-kernel and
> linux-aio aren't giving much feedback.  I do not expect the code to go
> into 2.5 at this point in time.  All I need is a set of syscall numbers
> that aren't going to change should this implementation stand up to the
> test of time.

It would be nice to have cooperation between glibc and the kernel so
that syscalls are mapped by name, not by number, with the name->number
resolution done by crtbegin.o reading a public kernel table, or calling
a fixed-ID kernel map function, and filling in a map. So if internally
( at the application ) sys_getpid has index 0, sysmap[0] will be filled
with the id retrieved from the kernel by looking up "sys_getpid".
Ate something too spicy today ?
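The name->number idea can be sketched in a few lines of C. Everything below is made up for illustration - the table contents, the numbers, the function names - and a real implementation would read the table from the kernel at startup rather than hardcoding it:

```c
/* Toy userland side of syscall-by-name resolution. A startup object
   (crtbegin.o in the sketch above) would walk a kernel-exported table
   once and fill an application-local sysmap[]. Numbers are fictitious. */
#include <string.h>

struct sysmap_entry {
    const char *name;
    int nr;
};

/* Stand-in for the table the kernel would publish. */
static const struct sysmap_entry kernel_table[] = {
    { "sys_getpid", 20 },
    { "sys_read",    3 },
    { NULL,         -1 },
};

/* Resolve one name; -1 means the running kernel doesn't know it,
   so the application can fail (or fall back) at startup, not later. */
static int resolve_syscall(const char *name)
{
    const struct sysmap_entry *e;

    for (e = kernel_table; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e->nr;
    return -1;
}
```

The application would then index its local map instead of using a compile-time __NR_* constant, so renumbered or newly added syscalls never break old binaries.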




- Davide




^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  0:36                   ` aio Andrew Morton
@ 2001-12-20  0:55                     ` H. Peter Anvin
  0 siblings, 0 replies; 168+ messages in thread
From: H. Peter Anvin @ 2001-12-20  0:55 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <3C213270.966DABFE@zip.com.au>
By author:    Andrew Morton <akpm@zip.com.au>
In newsgroup: linux.dev.kernel
> 
> The aio_* functions are part of POSIX and SUS, so merely reserving
> system call numbers for them does not seems a completely dumb
> thing to do, IMO.
> 

Yes, it is, unless you already have a design for how to map the aio_*
library functions onto system calls.

	-hpa


-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt	<amsp@zytor.com>

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  0:13               ` aio David S. Miller
  2001-12-20  0:21                 ` aio Benjamin LaHaise
@ 2001-12-20  1:16                 ` Bill Huey
  2001-12-20  1:20                   ` aio David S. Miller
  1 sibling, 1 reply; 168+ messages in thread
From: Bill Huey @ 2001-12-20  1:16 UTC (permalink / raw)
  To: David S. Miller; +Cc: bcrl, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 04:13:59PM -0800, David S. Miller wrote:
> Maybe it's because the majority of people don't care nor would ever
> need to use AIO.  Are you willing to accept this possibly? :-) Linux
> is anything but doomed, because you will notice that the things that
> actually matter for most people are in fact improved and worked on
> within a reasonable timescale.
> 
> Only very specialized applications can even benefit from AIO.  This
> doesn't make it useless, but it does decrease the amount of interest
> (and priority) anyone in the community will have in working on it.

Folks doing serious server side Java and runtime internals would
definitely be able to use this stuff, namely me. It'll remove the
abuse of threading used to deal with large IO systems when NIO comes out
for 1.4. And as a JVM engineer for the FreeBSD community I'm drooling
over stuff like that.

> Now, if these few and far between people who are actually interested
> in AIO are willing to throw money at the problem to get it worked on,
> that is how the "reasonable timescale" will be arrived at.  And if
> they aren't willing to toss money at the problem, how important can it
> really be to them? :-)

Like the Java folks ? few and far between ? What you're saying is just
plain outdated, from a previous generation of thinking that has
become irrelevant as the community has grown.

> Maybe, just maybe, most people simply do not care one iota about AIO.
> 
> Linux caters to the general concerns not the nooks and cranies, that
> is why it is anything but doomed.

Again, Linux collectively has outgrown that thinking, and the scope of
what the previous generation of engineers can be responsible for, which is
why it's important that folks like Ben be encouraged to take it to
the next level.

bill


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  1:16                 ` aio Bill Huey
@ 2001-12-20  1:20                   ` David S. Miller
  2001-12-20  2:26                     ` aio Bill Huey
                                       ` (3 more replies)
  0 siblings, 4 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  1:20 UTC (permalink / raw)
  To: billh; +Cc: bcrl, torvalds, linux-kernel, linux-aio

   From: Bill Huey <billh@tierra.ucsd.edu>
   Date: Wed, 19 Dec 2001 17:16:31 -0800
   
   Like the Java folks ? few and far between ?

Precisely, in fact.  Anyone who can say that Java is going to be
relevant in a few years time, with a straight face, is only kidding
themselves.

Java is not something to justify a new kernel feature, that is for
certain.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  1:20                   ` aio David S. Miller
@ 2001-12-20  2:26                     ` Bill Huey
  2001-12-20  2:45                       ` aio David S. Miller
  2001-12-20  2:37                     ` aio Cameron Simpson
                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 168+ messages in thread
From: Bill Huey @ 2001-12-20  2:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 05:20:46PM -0800, David S. Miller wrote:
> Precisely, in fact.  Anyone who can say that Java is going to be
> relevant in a few years time, with a straight face, is only kidding
> themselves.

Oh, give me a coke-shooting, Steely Dan, late-70s bitter kernel
programmer break...

> Java is not something to justify a new kernel feature, that is for
> certain.

Java is here now and used extensively on server side applications.
Simply dismissing it doesn't invalidate the claim that I made before
about how this mentality is outdated.

The economic inertia of Java driven server applications should have
enough force that it is justifiable to RedHat and other commercial
organizations to support it regardless of what your current view is
on this topic.

Even within the BSD/OS group at BSDi/WindRiver (/me former BSD/OS
engineer), some kind of dedicated async IO system inside the kernel was
talked about as highly desirable, and possibly a more direct way
of dealing with VM page/async IO event issues that don't map
cleanly onto a scheduler context.

AIO is good, plain and simple.

bill


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  1:20                   ` aio David S. Miller
  2001-12-20  2:26                     ` aio Bill Huey
@ 2001-12-20  2:37                     ` Cameron Simpson
  2001-12-20  2:47                       ` aio David S. Miller
       [not found]                     ` <mailman.1008816001.10138.linux-kernel2news@redhat.com>
  2001-12-21 17:28                     ` aio Alan Cox
  3 siblings, 1 reply; 168+ messages in thread
From: Cameron Simpson @ 2001-12-20  2:37 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 05:20:46PM -0800, David S. Miller <davem@redhat.com> wrote:
|    From: Bill Huey <billh@tierra.ucsd.edu>
|    Like the Java folks ? few and far between ?
| Precisely, in fact.  Anyone who can say that Java is going to be
| relevant in a few years time, with a straight face, is only kidding
| themselves.

Maybe. I'm good at that.

| Java is not something to justify a new kernel feature, that is for
| certain.

Of itself, maybe. (Though an attitude like yours is a core reason Java is
spreading as slowly as it is - much like Linux desktops...)

However, heavily threaded apps regardless of language are hardly likely
to disappear; threads are the natural way to write many many things. And
if the kernel implements threads as on Linux, then the scheduler will
become much more important to good performance.
-- 
Cameron Simpson, DoD#743        cs@zip.com.au    http://www.zip.com.au/~cs/

Questions are a burden to others,
	Answers, a prison for oneself.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:26                     ` aio Bill Huey
@ 2001-12-20  2:45                       ` David S. Miller
  2001-12-19 18:57                         ` aio John Heil
                                           ` (4 more replies)
  0 siblings, 5 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  2:45 UTC (permalink / raw)
  To: billh; +Cc: bcrl, torvalds, linux-kernel, linux-aio

   From: Bill Huey <billh@tierra.ucsd.edu>
   Date: Wed, 19 Dec 2001 18:26:28 -0800
   
   The economic inertia of Java driven server applications should have
   enough force that it is justifiable to RedHat and other commercial
   organizations to support it regardless of what your current view is
   on this topic.

So they'll get paid to implement and support it, and that is precisely
what is happening right now.  And the whole point I'm trying to make
is that that is where its realm is right now.

If AIO was so relevant+sexy we'd be having threads of discussion about
the AIO implementation instead of threads about how relevant it is or
is not for the general populace.  Wouldn't you concur?  :-)

The people doing Java server applets are such a small fraction of the
Linux user community.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:37                     ` aio Cameron Simpson
@ 2001-12-20  2:47                       ` David S. Miller
  2001-12-20  2:52                         ` aio Cameron Simpson
  0 siblings, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-20  2:47 UTC (permalink / raw)
  To: cs; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

   From: Cameron Simpson <cs@zip.com.au>
   Date: Thu, 20 Dec 2001 13:37:05 +1100
   
   (Though an attitude like yours is a core reason Java is
   spreading as slowly as it is - much like Linux desktops...)
   
It's actually Sun's fault more than anyone else's.

   However, heavily threaded apps regardless of language are hardly likely
   to disappear; threads are the natural way to write many many things. And
   if the kernel implements threads as on Linux, then the scheduler will
   become much more important to good performance.

We are not talking about the scheduler, we are talking about
AIO.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:47                       ` aio David S. Miller
@ 2001-12-20  2:52                         ` Cameron Simpson
  2001-12-20  2:58                           ` aio David S. Miller
  0 siblings, 1 reply; 168+ messages in thread
From: Cameron Simpson @ 2001-12-20  2:52 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 06:47:18PM -0800, David S. Miller <davem@redhat.com> wrote:
|    (Though an attitude like yours is a core reason Java is
|    spreading as slowly as it is - much like Linux desktops...)
| It's actually Sun's fault more than anyone else's.

Debatable. But fortunately off topic.

|    However, heavily threaded apps regardless of language are hardly likely
|    to disappear; threads are the natural way to write many many things. And
|    if the kernel implements threads as on Linux, then the scheduler will
|    become much more important to good performance.
| We are not talking about the scheduler, we are talking about
| AIO.

It was in the same thread - I must have ignored the detail switch. Ignore
me in turn. But while I'm here, tell me why async I/O is important
to Java and not to anything else, which still seems the thrust of your
remarks.
-- 
Cameron Simpson, DoD#743        cs@zip.com.au    http://www.zip.com.au/~cs/

Always code as if the guy who ends up maintaining your code will be a violent
psychopath who knows where you live.
	- Martin Golding, DoD #0236, martin@plaza.ds.adp.com

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:52                         ` aio Cameron Simpson
@ 2001-12-20  2:58                           ` David S. Miller
  2001-12-20  5:47                             ` aio Linus Torvalds
  0 siblings, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-20  2:58 UTC (permalink / raw)
  To: cs; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

   From: Cameron Simpson <cs@zip.com.au>
   Date: Thu, 20 Dec 2001 13:52:21 +1100
   
   tell me why async I/O is important
   to Java and not to anything else, which still seems the thrust of
   your remarks.

Not precisely my thrust, which is that AIO is not important to any
significant population of Linux users, it is "nook and cranny" in
scope.  And that those "nook and cranny" folks who really find it
important can get paid implementation+support of AIO.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-19 18:57                         ` aio John Heil
@ 2001-12-20  3:06                           ` David S. Miller
  2001-12-19 19:30                             ` aio John Heil
  2001-12-20  3:21                             ` aio Bill Huey
  0 siblings, 2 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  3:06 UTC (permalink / raw)
  To: kerndev; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

   From: John Heil <kerndev@sc-software.com>
   Date: Wed, 19 Dec 2001 18:57:34 +0000 (   )
   
   True for now, but if we want to expand linux into the enterprise and the
   desktop to a greater degree, then we need to support the Java community to
   draw them and their management in, rather than delaying beneficial 
   features until their number on lkml reaches critical mass for a design
   discussion.

Firstly, you say this as if server java applets do not function at all
or with acceptable performance today.  That is not true for the vast
majority of cases.

If java server applet performance in all cases is dependent upon AIO
(it is not), that would be pretty sad.  But it wouldn't be the first
time I've heard crap like that.  There is propaganda out there telling
people that 64-bit address spaces are needed for good java
performance.  Guess where that came from?  (hint: they invented java
and are in the buisness of selling 64-bit RISC processors)


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:45                       ` aio David S. Miller
  2001-12-19 18:57                         ` aio John Heil
@ 2001-12-20  3:07                         ` Bill Huey
  2001-12-20  3:13                           ` aio David S. Miller
       [not found]                         ` <mailman.1008817860.10606.linux-kernel2news@redhat.com>
                                           ` (2 subsequent siblings)
  4 siblings, 1 reply; 168+ messages in thread
From: Bill Huey @ 2001-12-20  3:07 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 06:45:27PM -0800, David S. Miller wrote:
> So they'll get paid to implement and support it, and that is precisely
> what is happening right now.  And the whole point I'm trying to make
> is that that is where its realm is right now.
> 
> If AIO was so relevant+sexy we'd be having threads of discussion about
> the AIO implementation instead of threads about how relevant it is or
> is not for the general populace.  Wouldn't you concur?  :-)

I attribute the lack of technical discussion to the least common denominator
culture of the Linux community and not the merits of the actual technical
system itself. That's what linux-aio@ is for...

And using lkml as an AIO forum is probably outside of the scope of this list
and group.

> The people doing Java server applets are such a small fraction of the
> Linux user community.

Yeah, but the overall Unix community probably has something different to say
about that, certainly. Even in BSD/OS, this JVM project I've been working on is
recognized as one of the most important systems second (probably) only to the
kernel itself. And, IMO, they have a more balanced view of this language
system and the value of it economically as a money making platform instead of
showing off to their peers. It's a greatly anticipated project in all of the
BSDs.

That's my semi-obnoxious take on it. ;-)

bill


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:07                         ` aio Bill Huey
@ 2001-12-20  3:13                           ` David S. Miller
  2001-12-20  3:47                             ` aio Benjamin LaHaise
                                               ` (2 more replies)
  0 siblings, 3 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  3:13 UTC (permalink / raw)
  To: billh; +Cc: bcrl, torvalds, linux-kernel, linux-aio

   From: Bill Huey <billh@tierra.ucsd.edu>
   Date: Wed, 19 Dec 2001 19:07:16 -0800
   
   And using lkml as an AIO forum is probably outside of the scope of this list
   and group.

This whole thread exists because Linus wants public general and
technical discussion on lkml of new features to happen before he
considers putting them into the tree, and the fact that they are not
in the tree because he isn't seeing such enthusiastic discussions
happening at all.

I don't think AIO, because of its non-trivial impact on the tree, is
at all outside the scope of this list.  This is in fact the place
where major stuff like AIO is meant to be discussed, not some special
list where only "AIO people" hang out, of course people on that list
will be enthusiastic about AIO!

Frankly, on your other comments, I don't give a rat's ass what BSD/OS
people are doing about, nor how highly they rate, Java.  That is
neither here nor there.  Java is going to be dead in a few years, and
let's just agree to disagree about this particular point, ok?

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:06                           ` aio David S. Miller
  2001-12-19 19:30                             ` aio John Heil
@ 2001-12-20  3:21                             ` Bill Huey
  2001-12-27  9:36                               ` aio Martin Dalecki
  1 sibling, 1 reply; 168+ messages in thread
From: Bill Huey @ 2001-12-20  3:21 UTC (permalink / raw)
  To: David S. Miller; +Cc: kerndev, billh, bcrl, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 07:06:29PM -0800, David S. Miller wrote:
> Firstly, you say this as if server java applets do not function at all
> or with acceptable performance today.  That is not true for the vast
> majority of cases.
> 
> If java server applet performance in all cases is dependent upon AIO
> (it is not), that would be pretty sad.  But it wouldn't be the first

Java is pretty incomplete in this area, which should be addressed to a
great degree in the new NIO API.

The core JVM isn't dependent on this stuff per se for performance, but
it is critical to server side programs that have to deal with highly
scalable IO systems, large numbers of FDs, that go beyond the current
expressiveness of select()/poll().

This is all standard fare in *any* kind of high performance networking
application where some kind of high performance kernel/userspace event
delivery system is needed, kqueue() principally.

> time I've heard crap like that.  There is propaganda out there telling
> people that 64-bit address spaces are needed for good java
> performance.  Guess where that came from?  (hint: they invented java
> and are in the buisness of selling 64-bit RISC processors)

What? Oh god. HotSpot is a pretty amazing compiler and it performs well.
Swing does well now, but the lingering issue in Java is the sheer size
of it and possibly GC issues. It's pretty clear that it's going to get
larger, which is fine since memory is cheap.

bill


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:13                           ` aio David S. Miller
@ 2001-12-20  3:47                             ` Benjamin LaHaise
  2001-12-20  5:39                               ` aio David S. Miller
  2001-12-21 17:24                               ` aio Alan Cox
  2001-12-20 14:38                             ` aio Luigi Genoni
  2001-12-20 17:26                             ` aio Henning Schmiedehausen
  2 siblings, 2 replies; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-20  3:47 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 07:13:54PM -0800, David S. Miller wrote:
> I don't think AIO, because of its non-trivial impact on the tree, is
> at all outside the scope of this list.  This is in fact the place
> where major stuff like AIO is meant to be discussed, not some special
> list where only "AIO people" hang out, of course people on that list
> will be enthusiastic about AIO!

Well maybe yourself and others should make some comments about it then.

> Frankly, on your other comments, I don't give a rat's ass what BSD/OS
> people are doing about, nor how highly they rate, Java.  That is
> neither here nor there.  Java is going to be dead in a few years, and
> let's just agree to disagree about this particular point, ok?

Who cares about Java?  What about high performance LDAP servers or tux-like 
userspace performance?  How about faster select and poll?  An X server that 
doesn't have to make a syscall to find out that more data has arrived?  What 
about nbd or iscsi servers that are in userspace and have all the benefits 
that their kernel side counterparts do?

		-ben
-- 
Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
       [not found] <Pine.LNX.4.33.0112181508001.3410-100000@penguin.transmeta.com>
@ 2001-12-20  3:50 ` Rik van Riel
  2001-12-20  4:04   ` Ryan Cumming
                     ` (2 more replies)
  0 siblings, 3 replies; 168+ messages in thread
From: Rik van Riel @ 2001-12-20  3:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List

On Tue, 18 Dec 2001, Linus Torvalds wrote:

> The thing is, I'm personally very suspicious of the "features for that
> exclusive 0.1%" mentality.

Then why do we have sendfile(), or that idiotic sys_readahead() ?

(is there _any_ use for sys_readahead() ?  at all ?)

cheers,

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20  3:50 ` Rik van Riel
@ 2001-12-20  4:04   ` Ryan Cumming
  2001-12-20  5:39   ` David S. Miller
  2001-12-20  5:52   ` Linus Torvalds
  2 siblings, 0 replies; 168+ messages in thread
From: Ryan Cumming @ 2001-12-20  4:04 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel, torvalds

On December 19, 2001 19:50, Rik van Riel wrote:
> On Tue, 18 Dec 2001, Linus Torvalds wrote:
> > The thing is, I'm personally very suspicious of the "features for that
> > exclusive 0.1%" mentality.
>
> Then why do we have sendfile(), or that idiotic sys_readahead() ?

Damn straights

sendfile(2) had an opportunity to be a real extension of the Unix philosophy. 
If it was called something like "copy" (to match "read" and "write"), and 
worked on all fds (even if it didn't do zerocopy, it should still just work), 
it'd fit in a lot more nicely than even BSD sockets. Alas, as it is, it's 
more of a wart than an extension. 

Now, sys_readahead() is pretty much the stupidest thing I've ever heard. If 
we had a copy(2) syscall, we could do the same thing by: copy(sourcefile, 
/dev/null, count). I don't think sys_readahead() even qualifies as a wart. 

-Ryan

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-19 19:29               ` aio Dan Kegel
@ 2001-12-20  4:04                 ` Benjamin LaHaise
  0 siblings, 0 replies; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-20  4:04 UTC (permalink / raw)
  To: Dan Kegel; +Cc: Linus Torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 11:29:15AM -0800, Dan Kegel wrote:
> I know I volunteered to write some doc for your aio, and haven't delivered;
> thus I'm contributing to the problem.  Mea culpa.  But there are some
> small things that could be done.  A freshmeat.net entry for the project,
> for instance.  Shall I create one, or would you rather do it?
> A home page for linux-aio would be great, too.

I've started writing some web pages, and will graciously accept additional 
docs/text/questions.  Freshmeat will get an entry right after I send this 
out.

		-ben
-- 
Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
       [not found]                     ` <mailman.1008816001.10138.linux-kernel2news@redhat.com>
@ 2001-12-20  5:07                       ` Pete Zaitcev
  2001-12-20  5:10                         ` aio Cameron Simpson
  0 siblings, 1 reply; 168+ messages in thread
From: Pete Zaitcev @ 2001-12-20  5:07 UTC (permalink / raw)
  To: cs, David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

>[...]
> However, heavily threaded apps regardless of language are hardly likely
> to disappear; threads are the natural way to write many many things. And
> if the kernel implements threads as on Linux, then the scheduler will
> become much more important to good performance.

Cameron seems to be arguing with DaveM, but subconsciously he
only supports DaveM's point about AIO: Java cannot make use
of AIO, so that's one (large or small, important or unimportant)
group of applications down from the count.

Just trying to keep on topic :)

-- Pete

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  5:07                       ` aio Pete Zaitcev
@ 2001-12-20  5:10                         ` Cameron Simpson
  0 siblings, 0 replies; 168+ messages in thread
From: Cameron Simpson @ 2001-12-20  5:10 UTC (permalink / raw)
  To: Pete Zaitcev
  Cc: David S. Miller, billh, bcrl, torvalds, linux-kernel, linux-aio

On Thu, Dec 20, 2001 at 12:07:21AM -0500, Pete Zaitcev <zaitcev@redhat.com> wrote:
| >[...]
| > However, heavily threaded apps regardless of language are hardly likely
| > to disappear; threads are the natural way to write many many things. And
| > if the kernel implements threads as on Linux, then the scheduler will
| > become much more important to good performance.
| 
| Cameron seems to be arguing with DaveM,

About the wrong things, but no matter.

| but subconsciously he
| only supports DaveM's point about AIO: Java cannot make use
| of AIO, so that's one (large or small, important or unimportant)
| group of applications down from the count.

You're sure? Java _authors_ can't make use of it, but Java _implementors_
probably have good reason to want it ...

| Just trying to keep on topic :)

Whatever for?
--
Cameron Simpson, DoD#743        cs@zip.com.au    http://www.zip.com.au/~cs/

Reaching consensus in a group often is confused with finding the right
answer.	- Norman Maier

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
       [not found]                         ` <mailman.1008817860.10606.linux-kernel2news@redhat.com>
@ 2001-12-20  5:16                           ` Pete Zaitcev
  0 siblings, 0 replies; 168+ messages in thread
From: Pete Zaitcev @ 2001-12-20  5:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: billh

> I attribute the lack of technical discussion to the least common denominator
> culture of the Linux community and not the merits of the actual technical
> system itself. That's what linux-aio@ is for...
> 
> And using lkml as an AIO forum is probably outside of the scope of this list
> and group.

Bill, who is going to read linux-aio? There are many splinter
lists. Just about a week ago Dave Gilbert hissed at me for
not posting to linux-scsi. OK, I admit, the USB cabal made me
subscribe to linux-usb-devel - only because the subsystem
was so out of whack that I was spending all my time trying to fix
it, and dealing with the broken sourceforge listserver did not make
it much worse. I can make an exception for Ben, out of pure respect.
But then what? Those lists proliferate like cockroaches, every day!
I wish I could subscribe to linux-aio, linux-scsi, linux-nfs,
linux-networking, linux-afs, linux-sound, an OpenGFS list,
and "open" AFS list, linux-s390, linux-on-vaio, linux-usb-user,
linux-infi-devel, Hotplug, and perhaps more.

-- Pete

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-19 19:30                             ` aio John Heil
@ 2001-12-20  5:29                               ` David S. Miller
  0 siblings, 0 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  5:29 UTC (permalink / raw)
  To: kerndev; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

   From: John Heil <kerndev@sc-software.com>
   Date: Wed, 19 Dec 2001 19:30:13 +0000 (   )
   
   Agree. However, put your business hat for a minute. We want increased
   market share for linux and a lot of us, you included, live by it. 

Oh my business hat is certainly on, which is why I keep talking about
the people who need this "paying for implementation and support of AIO
for Linux". :-)

Make no mistake, I do agree with your points though in general.

But those things are not dependent upon "standard Linus Linux" having
AIO first, this is what vendors do for differentiation by shipping
feature X in their kernel before others.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:47                             ` aio Benjamin LaHaise
@ 2001-12-20  5:39                               ` David S. Miller
  2001-12-20  5:58                                 ` aio Benjamin LaHaise
                                                   ` (2 more replies)
  2001-12-21 17:24                               ` aio Alan Cox
  1 sibling, 3 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  5:39 UTC (permalink / raw)
  To: bcrl; +Cc: billh, torvalds, linux-kernel, linux-aio

   From: Benjamin LaHaise <bcrl@redhat.com>
   Date: Wed, 19 Dec 2001 22:47:17 -0500

   Well maybe yourself and others should make some comments about it then.
   
Because, like I keep saying, it is totally uninteresting for most of
us.

   Who cares about Java?

The people telling me on this list how important AIO is for Linux :-)

   What about high performance LDAP servers or tux-like 
   userspace performance?

People have done "faster than TUX" userspace web service with the
current kernel, that is without AIO.  There is no reason you can't
do a fast LDAP server with the current kernel either, any such claim
is simply rubbish.  Why do we need AIO again?

   How about faster select and poll?

You don't need faster select and poll as demonstrated by the
userspace "faster than TUX" example above.

   An X server that doesn't have to make a syscall to find out that
   more data has arrived?

Who really needs this kind of performance improvement?  Like anyone
really cares if their window gets the keyboard focus or a pixel over an
AF_UNIX socket a few nanoseconds faster.  How many people do you think
believe they have unacceptable X performance right now and that
select()/poll() syscalls overhead is the cause?  Please get real.

People who want graphics performance are not pushing their data
through X over a filedescriptor, they are either using direct
rendering in the app itself (ala OpenGL) or they are using shared
memory for the bulk of the data (ala Xshm or Xv extensions).

   What about nbd or iscsi servers that are in userspace and have all
   the benefits  that their kernel side counterparts do?

I do not buy this claim that it is not possible to achieve the
desired performance using existing facilities.

The only example of AIO benefitting performance I see right now are
databases.

Franks a lot,
David S. Miller
davem@redhat.com

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20  3:50 ` Rik van Riel
  2001-12-20  4:04   ` Ryan Cumming
@ 2001-12-20  5:39   ` David S. Miller
  2001-12-20  5:58     ` Linus Torvalds
  2001-12-20 11:29     ` Rik van Riel
  2001-12-20  5:52   ` Linus Torvalds
  2 siblings, 2 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  5:39 UTC (permalink / raw)
  To: riel; +Cc: torvalds, bcrl, alan, davidel, linux-kernel

   From: Rik van Riel <riel@conectiva.com.br>
   Date: Thu, 20 Dec 2001 01:50:36 -0200 (BRST)

   On Tue, 18 Dec 2001, Linus Torvalds wrote:
   
   > The thing is, I'm personally very suspicious of the "features for that
   > exclusive 0.1%" mentality.
   
   Then why do we have sendfile(), or that idiotic sys_readahead() ?

Sending files over sockets is 99% of what most network servers are
actually doing today; it is much more than 0.1% :-)

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:58                           ` aio David S. Miller
@ 2001-12-20  5:47                             ` Linus Torvalds
  2001-12-20  5:57                               ` aio David S. Miller
  0 siblings, 1 reply; 168+ messages in thread
From: Linus Torvalds @ 2001-12-20  5:47 UTC (permalink / raw)
  To: David S. Miller; +Cc: cs, billh, bcrl, linux-kernel, linux-aio


On Wed, 19 Dec 2001, David S. Miller wrote:
>
> Not precisely my thrust, which is that AIO is not important to any
> significant population of Linux users, it is "nook and cranny" in
> scope.  And that those "nook and cranny" folks who really find it
> important can get paid implementation+support of AIO.

I disagree - we can probably make the aio by Ben quite important. Done
right, it becomes a very natural way of doing event handling, and it could
very well be rather useful for many things that use select loops right
now.

So I actually like the thing as it stands now. What I don't like is how
it's been handled, with people inside Oracle etc working with it, but
_not_ people on the kernel mailing list. I don't worry about the code
nearly as much as I worry about people starting to clique together.

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20  3:50 ` Rik van Riel
  2001-12-20  4:04   ` Ryan Cumming
  2001-12-20  5:39   ` David S. Miller
@ 2001-12-20  5:52   ` Linus Torvalds
  2 siblings, 0 replies; 168+ messages in thread
From: Linus Torvalds @ 2001-12-20  5:52 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Benjamin LaHaise, Alan Cox, Davide Libenzi, Kernel Mailing List


On Thu, 20 Dec 2001, Rik van Riel wrote:
> On Tue, 18 Dec 2001, Linus Torvalds wrote:
>
> > The thing is, I'm personally very suspicious of the "features for that
> > exclusive 0.1%" mentality.
>
> Then why do we have sendfile(), or that idiotic sys_readahead() ?

Hey, I expect others to do things in their tree, and I live by the same
rules: I do my stuff openly in my tree.

The Apache people actually seemed quite interested in sendfile. Of course,
that was before apache seemed to stop worrying about trying to beat
others at performance (rightly or wrongly - I think they are right
from a pragmatic viewpoint, and wrong from a PR one).

And hey, the same way I encourage others to experiment openly with their
trees, I experiment with mine.

			Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  5:47                             ` aio Linus Torvalds
@ 2001-12-20  5:57                               ` David S. Miller
  2001-12-20  5:59                                 ` aio Benjamin LaHaise
  0 siblings, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-20  5:57 UTC (permalink / raw)
  To: torvalds; +Cc: cs, billh, bcrl, linux-kernel, linux-aio

   From: Linus Torvalds <torvalds@transmeta.com>
   Date: Wed, 19 Dec 2001 21:47:18 -0800 (PST)
   
   it could very well be rather useful for many things that use select
   loops right now.

Then let us agree to disagree. :-) I think its potential advantages,
and how many things really "require it" for better performance, are
being blown out of proportion.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  5:39                               ` aio David S. Miller
@ 2001-12-20  5:58                                 ` Benjamin LaHaise
  2001-12-20  6:00                                   ` aio David S. Miller
  2001-12-20  7:27                                 ` aio Daniel Phillips
       [not found]                                 ` <Pine.LNX.4.33.0112201127400.2656-100000@localhost.localdomain>
  2 siblings, 1 reply; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-20  5:58 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, torvalds, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 09:39:10PM -0800, David S. Miller wrote:
>    How about faster select and poll?
> 
> You don't need faster select and poll as demonstrated by the
> userspace "faster than TUX" example above.

Step back for a moment.  I know of phttpd and zeus.  They both have 
a serious problem: they fall down when the load on the system exceeds 
the capabilities of the cpu.  If you'd bother to take a look at the 
aio api I'm proposing, it has less overhead under heavy load as events 
get coalesced.  Even then, the overhead under light load is less than 
signals or select or poll.

		-ben
-- 
Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20  5:39   ` David S. Miller
@ 2001-12-20  5:58     ` Linus Torvalds
  2001-12-20  6:01       ` David S. Miller
  2001-12-20 11:29     ` Rik van Riel
  1 sibling, 1 reply; 168+ messages in thread
From: Linus Torvalds @ 2001-12-20  5:58 UTC (permalink / raw)
  To: David S. Miller; +Cc: riel, bcrl, alan, davidel, linux-kernel


On Wed, 19 Dec 2001, David S. Miller wrote:
>
>    Then why do we have sendfile(), or that idiotic sys_readahead() ?
>
> Sending files over sockets is 99% of what most network servers are
> actually doing today; it is much more than 0.1% :-)

Well, that was true when the thing was written, but whether anybody _uses_
it any more, I don't know. Tux gets the same effect on its own, and I
don't know if Apache defaults to using sendfile or not.

readahead was just a personal 5-minute experiment, we can certainly remove
that ;)

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  5:57                               ` aio David S. Miller
@ 2001-12-20  5:59                                 ` Benjamin LaHaise
  2001-12-20  6:02                                   ` aio David S. Miller
  0 siblings, 1 reply; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-20  5:59 UTC (permalink / raw)
  To: David S. Miller; +Cc: torvalds, cs, billh, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 09:57:30PM -0800, David S. Miller wrote:
> Then let us agree to disagree. :-) I think its potential advantages,
> and how many things really "require it" for better performance, are
> being blown out of proportion.

Show me how to make a single process server that can handle 100000 or more 
open tcp sockets that doesn't collapse under load.  I can do it with aio; 
can you do it without?

		-ben
-- 
Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  5:58                                 ` aio Benjamin LaHaise
@ 2001-12-20  6:00                                   ` David S. Miller
  2001-12-20  6:46                                     ` aio Mike Castle
  0 siblings, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-20  6:00 UTC (permalink / raw)
  To: bcrl; +Cc: billh, torvalds, linux-kernel, linux-aio

   From: Benjamin LaHaise <bcrl@redhat.com>
   Date: Thu, 20 Dec 2001 00:58:03 -0500
   
   Step back for a moment.  I know of phttpd and zeus.  They both have 
   a serious problem: they fall down when the load on the system exceeds 
   the capabilities of the cpu.  If you'd bother to take a look at the 
   aio api I'm proposing, it has less overhead under heavy load as events 
   get coalesced.  Even then, the overhead under light load is less than 
   signals or select or poll.

No I'm not talking about phttpd nor zeus, I'm talking about the guy
who did the hacks where he'd put the http headers + content into a
separate file and just sendfile() that to the client.

I forget what his hacks were named, but there certainly was a longish
thread on this list about it about 1 year ago if memory serves.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20  5:58     ` Linus Torvalds
@ 2001-12-20  6:01       ` David S. Miller
  2001-12-20 22:40         ` Troels Walsted Hansen
  0 siblings, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-20  6:01 UTC (permalink / raw)
  To: torvalds; +Cc: riel, bcrl, alan, davidel, linux-kernel

   From: Linus Torvalds <torvalds@transmeta.com>
   Date: Wed, 19 Dec 2001 21:58:41 -0800 (PST)
   
   Well, that was true when the thing was written, but whether anybody _uses_
   it any more, I don't know. Tux gets the same effect on its own, and I
   don't know if Apache defaults to using sendfile or not.
   
Samba uses it by default, that I know for sure :-)

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  5:59                                 ` aio Benjamin LaHaise
@ 2001-12-20  6:02                                   ` David S. Miller
  2001-12-20  6:07                                     ` aio Benjamin LaHaise
  2001-12-20  6:09                                     ` aio Linus Torvalds
  0 siblings, 2 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  6:02 UTC (permalink / raw)
  To: bcrl; +Cc: torvalds, cs, billh, linux-kernel, linux-aio

   From: Benjamin LaHaise <bcrl@redhat.com>
   Date: Thu, 20 Dec 2001 00:59:28 -0500
   
   Show me how to make a single process server that can handle 100000 or more 
   open tcp sockets that doesn't collapse under load.  I can do it with aio; 
   can you do it without?

Why are you limiting me to a single process? :-)  Can I have at least
1 per cpu possibly? :-)))

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:02                                   ` aio David S. Miller
@ 2001-12-20  6:07                                     ` Benjamin LaHaise
  2001-12-20  6:12                                       ` aio David S. Miller
  2001-12-20  6:09                                     ` aio Linus Torvalds
  1 sibling, 1 reply; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-20  6:07 UTC (permalink / raw)
  To: David S. Miller; +Cc: torvalds, cs, billh, linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 10:02:47PM -0800, David S. Miller wrote:
> Why are you limiting me to a single process? :-)  Can I have at least
> 1 per cpu possibly? :-)))

1 process.  1 cpu machine.  1 gige card.  As much ram as you want.  No 
syscalls.  Must exhibit a load curve similar to:

	y
	|  ...............
	| .
	|.
	+----------------x

Where x == requests per second sent to the machine and y is the number 
of responses per second sent out of the machine.  Hint: read the phttpd 
and /dev/poll papers for an idea of the breakdown that happens for larger 
values of x (make the cpu slower to cause the interesting points to move 
lower).  For a third dimension to the graph, make the number of total 
connections the z axis.

		-ben
-- 
Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:02                                   ` aio David S. Miller
  2001-12-20  6:07                                     ` aio Benjamin LaHaise
@ 2001-12-20  6:09                                     ` Linus Torvalds
  2001-12-20 17:28                                       ` aio Suparna Bhattacharya
  1 sibling, 1 reply; 168+ messages in thread
From: Linus Torvalds @ 2001-12-20  6:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: bcrl, cs, billh, linux-kernel, linux-aio


Could we get back on track, and possibly discuss the patches themselves,
ok? We want _constructive_ criticism of the interfaces.

I think it's clear that many people do want to have aio support. At least
as far as I'm concerned, that's not the reason I want to have public
discussion. I want to make sure that the interfaces are good for aio
users, and that the design isn't stupid.

If somebody can point to a better way of doing aio, and giving good
arguments for that, more power to him. But let's not go down the path of
"_I_ don't like aio, so _you_ must be stupid".

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:07                                     ` aio Benjamin LaHaise
@ 2001-12-20  6:12                                       ` David S. Miller
  2001-12-20  6:23                                         ` aio Linus Torvalds
  0 siblings, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-20  6:12 UTC (permalink / raw)
  To: bcrl; +Cc: torvalds, cs, billh, linux-kernel, linux-aio

   From: Benjamin LaHaise <bcrl@redhat.com>
   Date: Thu, 20 Dec 2001 01:07:42 -0500
   
   1 process.  1 cpu machine.  1 gige card.  As much ram as you want.  No 
   syscalls.  Must exhibit a load curve similar to:
   
   	y
   	|  ...............
   	| .
   	|.
   	+----------------x
   
   Where x == requests per second sent to the machine and y is the number 
   of responses per second sent out of the machine.  Hint: read the phttpd 
   and /dev/poll papers for an idea of the breakdown that happens for larger 
   values of x (make the cpu slower to cause the interesting points to move 
   lower).  For a third dimension to the graph, make the number of total 
   connections the z axis.

Ok, TUX can do it.  Now list for me some server that really matters
other than web and ftp?  If you say databases, then I agree with you
but I will also reiterate how the people who need that level of
database performance is "nook and cranny".

I think there is nothing wrong with doing a TUX module for situations
where 1) the server is important for enough people and 2) scaling to
the levels you are talking about is a real issue for that service.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:12                                       ` aio David S. Miller
@ 2001-12-20  6:23                                         ` Linus Torvalds
  2001-12-20 10:18                                           ` aio Ingo Molnar
  0 siblings, 1 reply; 168+ messages in thread
From: Linus Torvalds @ 2001-12-20  6:23 UTC (permalink / raw)
  To: David S. Miller; +Cc: bcrl, cs, billh, linux-kernel, linux-aio


On Wed, 19 Dec 2001, David S. Miller wrote:
>
> Ok, TUX can do it.  Now list for me some server that really matters
> other than web and ftp?

Now now, that's unfair. We should be able to do it in user space.

I think the question you _should_ be lobbying at Ben and the other aio
people is how the aio stuff could do zero-copy from disk cache to the
network, ie do the things that Tux does internally where it does
nonblocking reads from disk and then sends them out non-blocking to the
network without having to copy the data _or_ having to use extremely
expensive TLB mapping tricks to get at it..

Ie tie the "sendfile" and "aio" threads together, and ask Ben if we can do
aio-sendfile and have thousands of asynchronous sendfiles going on at the
same time, like Tux can do. And if not, then why not? Missing or bad
interfaces?

Ben? Doing user-space IO is all well and good, but that extra copy and TLB
stuff kills you. Tell us how to do it ;)

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:00                                   ` aio David S. Miller
@ 2001-12-20  6:46                                     ` Mike Castle
  2001-12-20  6:55                                       ` aio Robert Love
  2001-12-20  7:01                                       ` aio David S. Miller
  0 siblings, 2 replies; 168+ messages in thread
From: Mike Castle @ 2001-12-20  6:46 UTC (permalink / raw)
  To: linux-kernel, linux-aio

On Wed, Dec 19, 2001 at 10:00:40PM -0800, David S. Miller wrote:
> No I'm not talking about phttpd nor zeus, I'm talking about the guy
> who did the hacks where he'd put the http headers + content into a
> seperate file and just sendfile() that to the client.
> 
> I forget what his hacks were named, but there certainly was a longish
> thread on this list about it about 1 year ago if memory serves.


Would that be Fabio Riccardi's X15 stuff?

mrc
-- 
     Mike Castle      dalgoda@ix.netcom.com      www.netcom.com/~dalgoda/
    We are all of us living in the shadow of Manhattan.  -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:46                                     ` aio Mike Castle
@ 2001-12-20  6:55                                       ` Robert Love
  2001-12-20  7:13                                         ` aio Mike Castle
  2001-12-20  7:01                                       ` aio David S. Miller
  1 sibling, 1 reply; 168+ messages in thread
From: Robert Love @ 2001-12-20  6:55 UTC (permalink / raw)
  To: Mike Castle; +Cc: linux-kernel, linux-aio

On Thu, 2001-12-20 at 01:46, Mike Castle wrote:
> On Wed, Dec 19, 2001 at 10:00:40PM -0800, David S. Miller wrote:
> > No I'm not talking about phttpd nor zeus, I'm talking about the guy
> > who did the hacks where he'd put the http headers + content into a
> > seperate file and just sendfile() that to the client.
> > 
> > I forget what his hacks were named, but there certainly was a longish
> > thread on this list about it about 1 year ago if memory serves.
> 
> Would that be Fabio Riccardi's X15 stuff?

Yes.  I was about to reply to this effect.

X15 was a userspace httpd that operated using the Tux-designed
constructs -- sendfile and such.  IIRC, Ingo actually pointed out some
things Fabio did were non-RFC (sending the static headers may have been
one of them, since the timestamp was wrong) and Fabio made a lot of
changes.  X15 seemed promising, especially since it trumpeted that Linux
"worked" without sticking things in kernel-space, but I don't remember
if we ever saw source (let alone a free license)?

	Robert Love


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:46                                     ` aio Mike Castle
  2001-12-20  6:55                                       ` aio Robert Love
@ 2001-12-20  7:01                                       ` David S. Miller
  1 sibling, 0 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20  7:01 UTC (permalink / raw)
  To: dalgoda; +Cc: linux-kernel, linux-aio

   From: Mike Castle <dalgoda@ix.netcom.com>
   Date: Wed, 19 Dec 2001 22:46:52 -0800
   
   Would that be Fabio Riccardi's X15 stuff?

Yes, that sounds like the one.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:55                                       ` aio Robert Love
@ 2001-12-20  7:13                                         ` Mike Castle
  0 siblings, 0 replies; 168+ messages in thread
From: Mike Castle @ 2001-12-20  7:13 UTC (permalink / raw)
  To: linux-kernel, linux-aio

On Thu, Dec 20, 2001 at 01:55:25AM -0500, Robert Love wrote:
> changes.  X15 seemed promising, especially since it trumpeted that Linux
> "worked" without sticking things in kernel-space, but I don't remember
> if we ever saw source (let alone a free license)?

We did, but licensing was personal use only.  It's all in the archives for
those curious.  Though following the URLs Fabio posted just got me to
a login screen.

mrc
-- 
     Mike Castle      dalgoda@ix.netcom.com      www.netcom.com/~dalgoda/
    We are all of us living in the shadow of Manhattan.  -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  5:39                               ` aio David S. Miller
  2001-12-20  5:58                                 ` aio Benjamin LaHaise
@ 2001-12-20  7:27                                 ` Daniel Phillips
       [not found]                                 ` <Pine.LNX.4.33.0112201127400.2656-100000@localhost.localdomain>
  2 siblings, 0 replies; 168+ messages in thread
From: Daniel Phillips @ 2001-12-20  7:27 UTC (permalink / raw)
  To: David S. Miller, bcrl; +Cc: billh, torvalds, linux-kernel, linux-aio

On December 20, 2001 06:39 am, David S. Miller wrote:
>    From: Benjamin LaHaise <bcrl@redhat.com>
>    Date: Wed, 19 Dec 2001 22:47:17 -0500
>    An X server that doesn't have to make a syscall to find out that
>    more data has arrived?
> 
> Who really needs this kind of performance improvement?  Like anyone
> really cares if their window gets the keyboard focus or a pixel over an
> AF_UNIX socket a few nanoseconds faster.  How many people do you think
> believe they have unacceptable X performance right now and that
> select()/poll() syscalls overhead is the cause?  Please get real.

I care, I always like faster graphics.

> People who want graphics performance are not pushing their data
> through X over a filedescriptor, they are either using direct
> rendering in the app itself (ala OpenGL) or they are using shared
> memory for the bulk of the data (ala Xshm or Xv extensions).

You're probably overgeneralizing.  Actually, I run games on my server and 
display the graphics on my laptop.  It works.  I'd be happy if it was faster.

I don't see right off how AIO would make that happen though.  Ben, could you 
please enlighten me, what would be the mechanism?  Are other OSes doing X 
with AIO?

--
Daniel


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:23                                         ` aio Linus Torvalds
@ 2001-12-20 10:18                                           ` Ingo Molnar
  2001-12-20 18:20                                             ` aio Robert Love
  0 siblings, 1 reply; 168+ messages in thread
From: Ingo Molnar @ 2001-12-20 10:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: David S. Miller, bcrl, cs, billh, linux-kernel, linux-aio


On Wed, 19 Dec 2001, Linus Torvalds wrote:

> I think the question you _should_ be lobbying at Ben and the other aio
> people is how the aio stuff could do zero-copy from disk cache to the
> network, ie do the things that Tux does internally where it does
> nonblocking reads from disk and then sends them out non-blocking to the
> network without having to copy the data _or_ having to use extremely
> expensive TLB mapping tricks to get at it..

months ago i already offered Ben to port TUX to the aio interfaces once
they are available in the kernel. Unfortunately right now i can't afford
maintaining two separate TUX trees - so it's a chicken and egg thing in
this context.

But once aio is available, i *will* do it, because one of Ben's goals is
fully state-machine-driven async block IO, which i'd like to use (and
test, and fine-tune, and improve) very much. (right now TUX does async
block IO via helper kernel threads. Async net-io is fully IRQ-driven.)
I'd also like to prove that our aio interfaces are capable.

there are two possibilities i can think of:

1) lets get Ben's patch in but do *not* export the syscalls, yet.

2) find some nice way of doing 'experimental syscalls', which are not
   guaranteed to stay that way. (Perhaps this is a naive proposition,
   often there is nothing more permanent than temporary solutions.)
   Something like reserving 'temporary' syscalls at the end of the syscall
   space, which would be frequently moved/removed/renamed just to keep
   folks from relying on it. No interface is guaranteed. Perhaps some
   technical solution can be found to make these syscalls truly temporary.

i'm sure people will get excited about (ie. use) aio once it's in the
kernel. Ben is very good at coding, perhaps not as good at PR, but should
such a level of PR really be a natural part of Linux development?

> Ie tie the "sendfile" and "aio" threads together, and ask Ben if we
> can do aio-sendfile and have thousands of asynchronous sendfiles going
> on at the same time, like Tux can do. And if not, then why not?
> Missing or bad interfaces?

i'd love to find out. *If* it's guaranteed that some sort of sane aio will
always be available from the point on it's introduced into the kernel then
i'll switch TUX to it. (it will change TUX upside down, this is why i
cannot maintain two separate TUX trees.) TUX doesn't need stable
interfaces. While TUX might not be as important, usage-wise, it's
certainly a good playing ground for such things.

	Ingo


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20  5:39   ` David S. Miller
  2001-12-20  5:58     ` Linus Torvalds
@ 2001-12-20 11:29     ` Rik van Riel
  2001-12-20 11:34       ` David S. Miller
  1 sibling, 1 reply; 168+ messages in thread
From: Rik van Riel @ 2001-12-20 11:29 UTC (permalink / raw)
  To: David S. Miller; +Cc: torvalds, bcrl, alan, davidel, linux-kernel

On Wed, 19 Dec 2001, David S. Miller wrote:
> From: Rik van Riel <riel@conectiva.com.br>
>    On Tue, 18 Dec 2001, Linus Torvalds wrote:
>
>    > The thing is, I'm personally very suspicious of the "features for that
>    > exclusive 0.1%" mentality.
>
>    Then why do we have sendfile(), or that idiotic sys_readahead() ?
>
> Sending files over sockets is 99% of what most network servers are
> actually doing today, it is much more than 0.1% :-)

The same could be said for AIO, there are a _lot_ of
server programs which are heavily overthreaded because
of a lack of AIO...

cheers,

Rik
-- 
Shortwave goes a long way:  irc.starchat.net  #swl

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20 11:29     ` Rik van Riel
@ 2001-12-20 11:34       ` David S. Miller
  0 siblings, 0 replies; 168+ messages in thread
From: David S. Miller @ 2001-12-20 11:34 UTC (permalink / raw)
  To: riel; +Cc: torvalds, bcrl, alan, davidel, linux-kernel

   From: Rik van Riel <riel@conectiva.com.br>
   Date: Thu, 20 Dec 2001 09:29:28 -0200 (BRST)

   On Wed, 19 Dec 2001, David S. Miller wrote:
   > Sending files over sockets is 99% of what most network servers are
   > actually doing today, it is much more than 0.1% :-)
   
   The same could be said for AIO, there are a _lot_ of
   server programs which are heavily overthreaded because
   of a lack of AIO...

If you read my most recent responses to Ingo's postings, you'll see
that I'm starting to completely agree with you :-)

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
       [not found]                                 ` <Pine.LNX.4.33.0112201127400.2656-100000@localhost.localdomain>
@ 2001-12-20 11:49                                   ` William Lee Irwin III
  2001-12-20 16:32                                   ` aio Dan Kegel
  2001-12-20 21:45                                   ` aio Lincoln Dale
  2 siblings, 0 replies; 168+ messages in thread
From: William Lee Irwin III @ 2001-12-20 11:49 UTC (permalink / raw)
  To: linux-kernel

On Thu, Dec 20, 2001 at 11:44:05AM +0100, Ingo Molnar wrote:
> we need a sane interface that covers *all* sorts of IO, not just sockets.
> I used to have exactly the same opinion as you have now, but now i'd like
> to have a common async IO interface that will cover network IO, block IO
> [or graphics IO, or whatever comes up]. We should have something saner and
> more explicit than a side-branch of fcntl() handling the socket fasync
> code.

I second this wholeheartedly. And I believe there are still more
motivations for providing asynchronous interfaces for all I/O in
the realm of assisting the userland:

(1) It would simplify the ways applications have and the kernel
	overhead of responding to user input while I/O is in progress.

(2) It would provide a more efficient way to do M:N threading than
	watchdogs and nonblocking poll/select in itimers.


Cheers,
Bill

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:13                           ` aio David S. Miller
  2001-12-20  3:47                             ` aio Benjamin LaHaise
@ 2001-12-20 14:38                             ` Luigi Genoni
  2001-12-20 17:26                             ` aio Henning Schmiedehausen
  2 siblings, 0 replies; 168+ messages in thread
From: Luigi Genoni @ 2001-12-20 14:38 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio



On Wed, 19 Dec 2001, David S. Miller wrote:

>    From: Bill Huey <billh@tierra.ucsd.edu>
>    Date: Wed, 19 Dec 2001 19:07:16 -0800
>
>    And using lkml as a AIO forum is probably outside of the scope of this list
>    and group.
>
> This whole thread exists because Linus wants public general and
> technical discussion on lkml of new features to happen before he
> considers putting them into the tree, and the fact that they are not
> in the tree because he isn't seeing such enthusiastic discussions
> happening at all.
>
YES, and he is right doing so.

> I don't think AIO, because of its non-trivial impact on the tree, is
> at all outside the scope of this list.  This is in fact the place
> where major stuff like AIO is meant to be discussed, not some special
> list where only "AIO people" hang out, of course people on that list
> will be enthusiastic about AIO!
agreed
>
> Frankly, on your other comments, I don't give a rats ass what BSD/OS
> people are doing about, nor how highly they rate, Java.  That is
> neither here nor there.  Java is going to be dead in a few years, and
> let's just agree to disagree about this particular point ok?
mmhh, java will not be dead until a lot of commercial software uses
it for graphical interfaces.
In fact it is simpler and cheaper for them to use java, and so we have to
deal with this bad future, where a dead language will be kept alive by
software houses.
That said, should we care about this? In my opinion, NO. And why should
we? When there are no good technical reasons, political reasons should
please disappear.

Luigi


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:45                       ` aio David S. Miller
                                           ` (2 preceding siblings ...)
       [not found]                         ` <mailman.1008817860.10606.linux-kernel2news@redhat.com>
@ 2001-12-20 16:16                         ` Dan Kegel
  2001-12-21 11:44                           ` aio Gerold Jury
  2001-12-20 17:24                         ` aio Henning Schmiedehausen
  4 siblings, 1 reply; 168+ messages in thread
From: Dan Kegel @ 2001-12-20 16:16 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

"David S. Miller" wrote:
> If AIO was so relevant+sexy we'd be having threads of discussion about
> the AIO implementation instead of threads about how relevant it is or
> is not for the general populace.  Wouldn't you concur?  :-)
> 
> The people doing Java server applets are such a small fraction of the
> Linux user community.

People writing code for NT/Win2K/WinXP are being channelled into
using AIO because that's the way to do things there (NT doesn't
really support nonblocking I/O).  Thus another valid economic
reason AIO is important is to make it easier to port code from NT.
I have received requests from NT folks for things like aio_recvfrom()
(and have passed them on to Ben), so I'm not just guessing here.

As should be clear from my c10k page, I love nonblocking I/O,
but I firmly believe that some form of AIO is vital.

- Dan

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
       [not found]                                 ` <Pine.LNX.4.33.0112201127400.2656-100000@localhost.localdomain>
  2001-12-20 11:49                                   ` aio William Lee Irwin III
@ 2001-12-20 16:32                                   ` Dan Kegel
  2001-12-20 18:05                                     ` aio Davide Libenzi
  2001-12-20 21:45                                   ` aio Lincoln Dale
  2 siblings, 1 reply; 168+ messages in thread
From: Dan Kegel @ 2001-12-20 16:32 UTC (permalink / raw)
  To: mingo; +Cc: David S. Miller, bcrl, billh, torvalds, linux-kernel, linux-aio

Ingo Molnar wrote:

> it's not a fair comparison. The system was set up to not exhibit any async
> IO load. So a pure, atomic sendfile() outperformed TUX slightly, where TUX
> did something slightly more complex (and more RFC-conform as well - see
> Date: caching in X12 for example). Not something i'd call a proof - this
> simply works around the async IO interface. (which RT-signal driven,
> fasync-helped async IO interface, as phttpd has proven, is not only hard
> to program and is not robust, it also performs *very badly*.)

Proper wrapper code can make them (almost) easy to program with.
See http://www.kegel.com/dkftpbench/doc/Poller_sigio.html for an example
of a wrapper that automatically handles the fallback to poll() on overflow.
Using this wrapper I wrote ftp clients and servers which use a thin wrapper
api that lets the user choose from select, poll, /dev/poll, kqueue/kevent, and RT signals
at runtime.

That said, I think that using the RT signal queue is just plain the
wrong way to go, and I can't wait for better approaches to make it
into the standard kernel someday.

- Dan

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  2:45                       ` aio David S. Miller
                                           ` (3 preceding siblings ...)
  2001-12-20 16:16                         ` aio Dan Kegel
@ 2001-12-20 17:24                         ` Henning Schmiedehausen
  4 siblings, 0 replies; 168+ messages in thread
From: Henning Schmiedehausen @ 2001-12-20 17:24 UTC (permalink / raw)
  To: linux-kernel

"David S. Miller" <davem@redhat.com> writes:

>The people doing Java server applets are such a small fraction of the
>Linux user community.

The people doing Java Servlets are maybe a small fraction of the Linux
Kernel Hackers community. Not the user community. They simply do not
show up here because they don't care for Linux 2.5.0-rc1-prepatched.

Short head count: Who here is also a regular reader on Apache Jakarta
lists?

Kernel hacking and Java most of the times doesn't mix. And Java folks
are completely ambivalent to their OS: If Linux doesn't deliver,
well. Windows, Solaris and BSD do. That's what Java is all about. 

	Regards
		Henning


-- 
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen       -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH     hps@intermeta.de

Am Schwabachgrund 22  Fon.: 09131 / 50654-0   info@intermeta.de
D-91054 Buckenhof     Fax.: 09131 / 50654-20   

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:13                           ` aio David S. Miller
  2001-12-20  3:47                             ` aio Benjamin LaHaise
  2001-12-20 14:38                             ` aio Luigi Genoni
@ 2001-12-20 17:26                             ` Henning Schmiedehausen
  2001-12-20 20:04                               ` aio M. Edward (Ed) Borasky
  2001-12-20 23:53                               ` aio David S. Miller
  2 siblings, 2 replies; 168+ messages in thread
From: Henning Schmiedehausen @ 2001-12-20 17:26 UTC (permalink / raw)
  To: linux-kernel

"David S. Miller" <davem@redhat.com> writes:

>neither here nor there.  Java is going to be dead in a few years, and
>let's just agree to disagree about this particular point ok?

Care to point out why? Because of Sun or because of C#?

	Regards
		Henning


-- 
Dipl.-Inf. (Univ.) Henning P. Schmiedehausen       -- Geschaeftsfuehrer
INTERMETA - Gesellschaft fuer Mehrwertdienste mbH     hps@intermeta.de

Am Schwabachgrund 22  Fon.: 09131 / 50654-0   info@intermeta.de
D-91054 Buckenhof     Fax.: 09131 / 50654-20   

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  6:09                                     ` aio Linus Torvalds
@ 2001-12-20 17:28                                       ` Suparna Bhattacharya
  0 siblings, 0 replies; 168+ messages in thread
From: Suparna Bhattacharya @ 2001-12-20 17:28 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bcrl, linux-aio, linux-kernel

On Wed, Dec 19, 2001 at 10:09:40PM -0800, Linus Torvalds wrote:
> Could we get back on track, and possibly discuss the patches themselves,
> ok? We want _constructive_ criticism of the interfaces.
> 
> I think it's clear that many people do want to have aio support. At least

Yes, to add to the lot, even though you probably don't need any more proof :)
we have at least 3 different products requiring this. Both for scalable 
communications aio (large no. of connections) as well as file/disk aio.  
In fact, one of the things that I've been worried about once 2.5 opened up
and the bio changes started flowing in (much as I was delighted to see
Jens' stuff finally getting integrated), was whether this would mean a 
longer timeframe before we can hope to see aio in a distribution (which is 
the question which I have to respond to our product groups about). 

> as far as I'm concerned, that's not the reason I want to have public
> discussion. I want to make sure that the interfaces are good for aio
> users, and that the design isn't stupid.
> 

My feeling about this is that we shouldn't necessarily need to
have the entire aio code in perfect shape and bring it in all in one shot.
The thing I like about the design is that it is quite possible to split
this up into smaller core patches and bring them in slowly. And I agree
with Ingo that we should be able to start stabilizing the basic internal
mechanisms or foundations on which aio is built before we freeze the 
external interfaces. In fact, these two things could happen in parallel.
So I was hoping that an evolutionary approach would work. In the current 
design, the aio path is handled separately from the normal i/o paths, so the 
impact on existing interfaces is less. It mainly affects aio users. So it 
ought to be possible to integrate it in a way that doesn't hurt regular
operations, and it should be easier to change it once it is in without
breaking too many things.

Existing aio users on other platforms that I have come across seem to use 
either POSIX aio (for file /disk aio) or completion port style interfaces 
(mainly for communications aio), both of which seem to be possible with 
Ben's implementation. One has to explicitly associate each i/o with the 
completion queue (ctx), rather than associate an fd as a whole with it so 
that all completion events on that fd come to the ctx. That should be OK. 
Besides with async poll support we can have per-file readiness notification 
as well. I was hoping for the async poll piece being available early to 
exercise the top half or event handling side of aio, so we have scalable 
select/poll support, so was focussing on that part to start with.

Your point about some critical discussion of the interfaces and the design 
is well-taken. We have had a few discussions on the aio mailing and more on 
irc about some aspects, but not quite a thorough out and out analysis of 
pros and cons of the whole design. I just started writing some points here, 
but then realized that it is going to take much longer, so decided to do
that while working with Ben on the documentation, and discuss more after 
that.

Regards
Suparna

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 16:32                                   ` aio Dan Kegel
@ 2001-12-20 18:05                                     ` Davide Libenzi
  0 siblings, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-20 18:05 UTC (permalink / raw)
  To: Dan Kegel
  Cc: mingo, David S. Miller, bcrl, billh, torvalds, linux-kernel,
	linux-aio

On Thu, 20 Dec 2001, Dan Kegel wrote:

> Ingo Molnar wrote:
>
> > it's not a fair comparison. The system was set up to not exhibit any async
> > IO load. So a pure, atomic sendfile() outperformed TUX slightly, where TUX
> > did something slightly more complex (and more RFC-conform as well - see
> > Date: caching in X12 for example). Not something i'd call a proof - this
> > simply works around the async IO interface. (which RT-signal driven,
> > fasync-helped async IO interface, as phttpd has proven, is not only hard
> > to program and is not robust, it also performs *very badly*.)
>
> Proper wrapper code can make them (almost) easy to program with.
> See http://www.kegel.com/dkftpbench/doc/Poller_sigio.html for an example
> of a wrapper that automatically handles the fallback to poll() on overflow.
> Using this wrapper I wrote ftp clients and servers which use a thin wrapper
> api that lets the user choose from select, poll, /dev/poll, kqueue/kevent, and RT signals
> at runtime.

Hey, you forgot /dev/epoll, the fastest one :)




- Davide



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 10:18                                           ` aio Ingo Molnar
@ 2001-12-20 18:20                                             ` Robert Love
  2001-12-20 22:30                                               ` aio Cameron Simpson
  0 siblings, 1 reply; 168+ messages in thread
From: Robert Love @ 2001-12-20 18:20 UTC (permalink / raw)
  To: mingo
  Cc: Linus Torvalds, David S. Miller, bcrl, cs, billh, linux-kernel,
	linux-aio

On Thu, 2001-12-20 at 05:18, Ingo Molnar wrote:

> there are two possibilities i can think of:
> 
> 1) lets get Ben's patch in but do *not* export the syscalls, yet.

This is an excellent way to give aio the testing and exposure Linus
wants without getting into the commitment / syscall mess.

Stick aio in the kernel, play with it via Tux, etc.  The really
interested can add temporary syscalls.  aio (which I like, btw) will get
testing and in time, once proven, we can add the syscalls.

Comments?

	Robert Love


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 17:26                             ` aio Henning Schmiedehausen
@ 2001-12-20 20:04                               ` M. Edward (Ed) Borasky
  2001-12-20 23:53                               ` aio David S. Miller
  1 sibling, 0 replies; 168+ messages in thread
From: M. Edward (Ed) Borasky @ 2001-12-20 20:04 UTC (permalink / raw)
  To: Henning Schmiedehausen; +Cc: linux-kernel

On Thu, 20 Dec 2001, Henning Schmiedehausen wrote:

> "David S. Miller" <davem@redhat.com> writes:
>
> >neither here nor there.  Java is going to be dead in a few years, and
> >let's just agree to disagree about this particular point ok?
>
> Care to point out why? Because of Sun or because of C#?

Because MSFT is bigger than SUNW :-) As that great American philosopher,
Damon Runyon, once said, "The race is not always to the swift, nor the
battle to the strong -- but that's the way to bet!"
--
M. Edward Borasky

znmeb@borasky-research.net
http://www.borasky-research.net

If I had 40 billion dollars for every software monopoly that sells an
unwieldy and hazardously complex development environment and is run by
an arrogant college dropout with delusions of grandeur who treats his
employees like serfs while he is acclaimed as a man of compelling
vision, I'd be a wealthy man.


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
       [not found]                                 ` <Pine.LNX.4.33.0112201127400.2656-100000@localhost.localdomain>
  2001-12-20 11:49                                   ` aio William Lee Irwin III
  2001-12-20 16:32                                   ` aio Dan Kegel
@ 2001-12-20 21:45                                   ` Lincoln Dale
  2001-12-20 21:59                                     ` aio Linus Torvalds
  2001-12-20 23:02                                     ` aio Lincoln Dale
  2 siblings, 2 replies; 168+ messages in thread
From: Lincoln Dale @ 2001-12-20 21:45 UTC (permalink / raw)
  To: Dan Kegel
  Cc: mingo, David S. Miller, bcrl, billh, torvalds, linux-kernel,
	linux-aio

At 08:32 AM 20/12/2001 -0800, Dan Kegel wrote:
>Proper wrapper code can make them (almost) easy to program with.
>See http://www.kegel.com/dkftpbench/doc/Poller_sigio.html for an example
>of a wrapper that automatically handles the fallback to poll() on overflow.
>Using this wrapper I wrote ftp clients and servers which use a thin wrapper
>api that lets the user choose from select, poll, /dev/poll, kqueue/kevent, 
>and RT signals
>at runtime.

SIGIO sucks in the real world for a few reasons right now, most of them
unrelated to 'sigio' itself:

  1. SIGIO uses signals.
         look at how signals are handled on multiprocessor (SMP) boxes.
         can you say "cache ping-pong", not to mention the locking and
         task_struct loop lookups?
         every signal to user-space results in 512 bytes of memory-copy
         from kernel to user-space.

  2. SIGIO is very heavy.
         userspace only gets back one event per system call, thus you end
         up with tens-of-thousands of user<->kernel transitions per second
         eating up valuable cpu resources.
         there is neither: (a) aggregation of SIGIO events on a per-socket
         basis, nor (b) aggregation of multiple SIGIO events from multiple
         sockets onto a single system call.

  3. enabling SIGIO is racy at socket-accept.
         multiple system calls are required to accept a connection on a
         socket and then enable SIGIO on it.  packets can arrive in the
         meantime.  one can work around this with a poll() but it's bad.

  4. in practical terms, SIGIO-based I/O isn't very good at expressing a
     "no POLL_OUT" signal.

  5. SIGIO is only a _notification_ mechanism.  it does NOTHING for
     zero-copy I/O from/to disk, from/to userspace, from/to network.


cheers,

lincoln.


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 21:45                                   ` aio Lincoln Dale
@ 2001-12-20 21:59                                     ` Linus Torvalds
  2001-12-24 11:44                                       ` aio Gerold Jury
  2001-12-20 23:02                                     ` aio Lincoln Dale
  1 sibling, 1 reply; 168+ messages in thread
From: Linus Torvalds @ 2001-12-20 21:59 UTC (permalink / raw)
  To: Lincoln Dale
  Cc: Dan Kegel, mingo, David S. Miller, bcrl, billh, linux-kernel,
	linux-aio


On Thu, 20 Dec 2001, Lincoln Dale wrote:
>
> SIGIO sucks in the real-world for a few reasons right now, most of them
> unrelated to 'sigio' itself:

Well, there _is_ one big one, which definitely is fundamentally related to
sigio itself:

sigio is an asynchronous event programming model.

And let's face it, asynchronous programming models suck. They inherently
require that you handle various race conditions etc, and have extra
locking.

Note that "asynchronous programming model" is not the same as
"asynchronous IO completion". The former implies a threaded user space,
the latter implies threaded kernel IO.

And let's face it - threading is _hard_ to get right. People just don't
think well about asynchronous events.

It's much easier to have a synchronous interface to the asynchronous IO,
ie one where you do not have to worry about events happening "at the same
time".

SIGIO just isn't very nice. It's useful for some event notification (ie if
you don't actually _do_ anything in the signal handler), but let's be
honest: it's an extremely heavy notifier. Something synchronous like
"poll" or "select" will beat it just about every time (yes, they don't
scale well, but neither does SIGIO).

		Linus


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 18:20                                             ` aio Robert Love
@ 2001-12-20 22:30                                               ` Cameron Simpson
  2001-12-20 22:46                                                 ` aio Benjamin LaHaise
  0 siblings, 1 reply; 168+ messages in thread
From: Cameron Simpson @ 2001-12-20 22:30 UTC (permalink / raw)
  To: Robert Love
  Cc: mingo, Linus Torvalds, David S. Miller, bcrl, billh, linux-kernel,
	linux-aio

On Thu, Dec 20, 2001 at 01:20:55PM -0500, Robert Love <rml@tech9.net> wrote:
| On Thu, 2001-12-20 at 05:18, Ingo Molnar wrote:
| > there are two possibilities i can think of:
| > 1) lets get Ben's patch in but do *not* export the syscalls, yet.
| 
| This is an excellent way to give aio the testing and exposure Linus
| wants without getting into the commitment / syscall mess.
| Stick aio in the kernel, play with it via Tux, etc.  The really
| interested can add temporary syscalls.  aio (which I like, btw) will get
| testing and in time, once proven, we can add the syscalls.
| Comments?

Only that it would be hard for user space people to try it - does Ben's
patch (with hypothetical syscalls) present the POSIX async interfaces out
of the box? If not, testing with in-kernel things is sufficient. But
if it does then it becomes more reasonable to transiently define some
syscall numbers (high up, in some range defined as "testing and like
shifting sands") so user space can test the interface.

Thought: is there a meta-syscall in the kernel API for calling other syscalls?
You could have such a beast taking negative numbers for experimental calls...
-- 
Cameron Simpson, DoD#743        cs@zip.com.au    http://www.zip.com.au/~cs/

Sometimes the only solution is to find a new problem.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* RE: Scheduler ( was: Just a second ) ...
  2001-12-20  6:01       ` David S. Miller
@ 2001-12-20 22:40         ` Troels Walsted Hansen
  2001-12-20 23:55           ` Chris Ricker
  0 siblings, 1 reply; 168+ messages in thread
From: Troels Walsted Hansen @ 2001-12-20 22:40 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: linux-kernel

>From: David S. Miller
>   From: Linus Torvalds <torvalds@transmeta.com>
>   Well, that was true when the thing was written, but whether anybody
_uses_
>   it any more, I don't know. Tux gets the same effect on its own, and
I
>   don't know if Apache defaults to using sendfile or not.
>   
>Samba uses it by default, that I know for sure :-)

I wish... Neither Samba 2.2.2 nor the bleeding-edge 3.0alpha11 includes
the word "sendfile" in the source, at least. :( Wonder why the sendfile
patches were never merged...

--
Troels Walsted Hansen


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 22:30                                               ` aio Cameron Simpson
@ 2001-12-20 22:46                                                 ` Benjamin LaHaise
  0 siblings, 0 replies; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-20 22:46 UTC (permalink / raw)
  To: Cameron Simpson
  Cc: Robert Love, mingo, Linus Torvalds, David S. Miller, billh,
	linux-kernel, linux-aio

On Fri, Dec 21, 2001 at 09:30:27AM +1100, Cameron Simpson wrote:
> Only that it would be hard for user space people to try it - does Ben's
> patch (with hypothetical syscalls) present the POSIX async interfaces out
> of the box?

No.  POSIX aio does not have any concept of a completion queue.  Completion 
in POSIX aio comes via a thread callback, signal delivery or polling, all 
of which are horrendously inefficient.

> If not, testing with in-kernel things is sufficient. But
> if it does then it becomes more reasonable to transiently define some
> syscall numbers (high up, in some defined as "testing and like shifting
> sands" range) so user space can test the interface.

Maybe.  The unfortunate aspect to this is that you can't tell if a number 
matches the name you expect it to be, and invariably people end up running 
the wrong code on the wrong kernel.  Or vendors start shipping patches to 
enable these new syscalls....

> Thought: is there a meta-syscall in the kernel API for calling other 
> syscalls?  You could have such a beast taking negative numbers for 
> experimental calls...

I'm working on something.  Stay tuned.

		-ben
-- 
Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 21:45                                   ` aio Lincoln Dale
  2001-12-20 21:59                                     ` aio Linus Torvalds
@ 2001-12-20 23:02                                     ` Lincoln Dale
  1 sibling, 0 replies; 168+ messages in thread
From: Lincoln Dale @ 2001-12-20 23:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dan Kegel, mingo, David S. Miller, bcrl, billh, linux-kernel,
	linux-aio

At 01:59 PM 20/12/2001 -0800, Linus Torvalds wrote:

>On Thu, 20 Dec 2001, Lincoln Dale wrote:
> >
> > SIGIO sucks in the real-world for a few reasons right now, most of them
> > unrelated to 'sigio' itself:
>
>Well, there _is_ one big one, which definitely is fundamentally related to
>sigio itself:
>
>sigio is an asynchronous event programming model.
>
>And let's face it, asynchronous programming models suck. They inherently
>require that you handle various race conditions etc, and have extra
>locking.

actually, i disagree with your assertion that "asynchronous programming
models suck".

for MANY applications, it doesn't matter.  the alternative to async is to do
either:
  - thread-per-connection or process-per-connection (a la apache, sendmail,
    inetd-type services, ...)
  - a system that blocks -- handles one connection at a time

the only time async actually starts to matter is if you start to stress the 
precipitous performance characteristics associated with thousands of 
concurrent tasks in a thread/process-per-connection model.  (limited 
processor L2 cache size, multiple tasks sharing the same cache-lines 
(suboptimal cache colouring), scheduler overhead, wasted 
stack-space-per-thread/process, ..).

if you care about that level of performance, then you generally move to an 
async model.
moving to an async model doesn't have to be hard -- people generally start 
with their own pseudo scheduler and go from there.
"harder" than non-async: yes.  but "hard": no.

>SIGIO just isn't very nice. It's useful for some event notification (ie if
>you don't actually _do_ anything in the signal handler), but let's be
>honest: it's an extremely heavy notifier. Something synchronous like
>"poll" or "select" will beat it just about every time (yes, they don't
>scale well, but neither does SIGIO).

actually, my experience (circa 12 months ago) was that they were roughly equal.
poll()'s performance dropped off significantly at a few thousand FDs
whereas sigio's latency just went up.
but it was somewhat trivial to _make_ poll() go faster by being intelligent
about which fds to poll.  simple logic of "if an FD didn't have anything
active, don't poll for it on the next poll() loop" didn't increase the
latency in servicing that FD by any noticeable amount but basically tripled
the # of FDs one could handle.


cheers,

lincoln.
NB. sounds like you're making a case for the current trend in Java Virtual
Machines' insistence on "lots of processes" being a good thing. <grin, duck,
run>


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 17:26                             ` aio Henning Schmiedehausen
  2001-12-20 20:04                               ` aio M. Edward (Ed) Borasky
@ 2001-12-20 23:53                               ` David S. Miller
  2001-12-21  0:28                                 ` Offtopic Java/C# [Re: aio] Bill Huey
  1 sibling, 1 reply; 168+ messages in thread
From: David S. Miller @ 2001-12-20 23:53 UTC (permalink / raw)
  To: henning; +Cc: linux-kernel

   From: henning@forge.intermeta.de (Henning Schmiedehausen)
   Date: Thu, 20 Dec 2001 17:26:05 +0000 (UTC)

   Care to point out why? Because of Sun or because of C#?

That's a circular question, because C# exists due to Sun's mistakes
with handling Java.  So my answer is "both".

^ permalink raw reply	[flat|nested] 168+ messages in thread

* RE: Scheduler ( was: Just a second ) ...
  2001-12-20 22:40         ` Troels Walsted Hansen
@ 2001-12-20 23:55           ` Chris Ricker
  2001-12-20 23:59             ` CaT
  2001-12-21  0:06             ` Davide Libenzi
  0 siblings, 2 replies; 168+ messages in thread
From: Chris Ricker @ 2001-12-20 23:55 UTC (permalink / raw)
  To: Troels Walsted Hansen; +Cc: 'David S. Miller', World Domination Now!

On Thu, 20 Dec 2001, Troels Walsted Hansen wrote:

> >From: David S. Miller
> >   From: Linus Torvalds <torvalds@transmeta.com>
> >   Well, that was true when the thing was written, but whether anybody
> _uses_
> >   it any more, I don't know. Tux gets the same effect on its own, and
> I
> >   don't know if Apache defaults to using sendfile or not.
> >   
> >Samba uses it by default, that I know for sure :-)
> 
> I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes
> the word "sendfile" in the source at least. :( Wonder why the sendfile
> patches where never merged...

The only real-world sources I've noticed actually using sendfile() are some
of the better ftp daemons (such as vsftpd).

later,
chris

-- 
Chris Ricker                                               kaboom@gatech.edu

This is a dare to the Bush administration.
        -- Thurston Moore



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-20 23:55           ` Chris Ricker
@ 2001-12-20 23:59             ` CaT
  2001-12-21  0:06             ` Davide Libenzi
  1 sibling, 0 replies; 168+ messages in thread
From: CaT @ 2001-12-20 23:59 UTC (permalink / raw)
  To: Chris Ricker
  Cc: Troels Walsted Hansen, 'David S. Miller',
	World Domination Now!

On Thu, Dec 20, 2001 at 04:55:55PM -0700, Chris Ricker wrote:
> > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes
> > the word "sendfile" in the source at least. :( Wonder why the sendfile
> > patches where never merged...
> 
> The only real-world source I've noticed actually using sendfile() are some
> of the better ftp daemons (such as vsftpd).

proftpd uses it also.

-- 
CaT        - A high level of technology does not a civilisation make.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* RE: Scheduler ( was: Just a second ) ...
  2001-12-20 23:55           ` Chris Ricker
  2001-12-20 23:59             ` CaT
@ 2001-12-21  0:06             ` Davide Libenzi
  1 sibling, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-21  0:06 UTC (permalink / raw)
  To: Chris Ricker
  Cc: Troels Walsted Hansen, 'David S. Miller',
	World Domination Now!

On Thu, 20 Dec 2001, Chris Ricker wrote:

> On Thu, 20 Dec 2001, Troels Walsted Hansen wrote:
>
> > >From: David S. Miller
> > >   From: Linus Torvalds <torvalds@transmeta.com>
> > >   Well, that was true when the thing was written, but whether anybody
> > _uses_
> > >   it any more, I don't know. Tux gets the same effect on its own, and
> > I
> > >   don't know if Apache defaults to using sendfile or not.
> > >
> > >Samba uses it by default, that I know for sure :-)
> >
> > I wish... Neither Samba 2.2.2 nor the bleeding edge 3.0alpha11 includes
> > the word "sendfile" in the source at least. :( Wonder why the sendfile
> > patches where never merged...
>
> The only real-world source I've noticed actually using sendfile() are some
> of the better ftp daemons (such as vsftpd).

And XMail :)




- Davide



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Offtopic Java/C# [Re: aio]
  2001-12-20 23:53                               ` aio David S. Miller
@ 2001-12-21  0:28                                 ` Bill Huey
  0 siblings, 0 replies; 168+ messages in thread
From: Bill Huey @ 2001-12-21  0:28 UTC (permalink / raw)
  To: David S. Miller; +Cc: henning, linux-kernel

On Thu, Dec 20, 2001 at 03:53:13PM -0800, David S. Miller wrote:
> That's a circular question, because C# exists due to Sun's mistakes
> with handling Java.  So my answer is "both".

Well, they really serve different purposes and can't be compared. One
is a unified object/class model for DCOM with more static typing stuff
(boxing, parametric types), while the other is more CL-ish, with class
reflection built closely into the language runtime, threading and other
self-contained things within that system.

bill


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 16:16                         ` aio Dan Kegel
@ 2001-12-21 11:44                           ` Gerold Jury
  2001-12-21 13:48                             ` aio Ingo Molnar
  0 siblings, 1 reply; 168+ messages in thread
From: Gerold Jury @ 2001-12-21 11:44 UTC (permalink / raw)
  To: Dan Kegel, David S. Miller; +Cc: bcrl, linux-kernel, linux-aio

On Thursday 20 December 2001 17:16, Dan Kegel wrote:
> "David S. Miller" wrote:
> > If AIO was so relevant+sexy we'd be having threads of discussion about
> > the AIO implementation instead of threads about how relevant it is or
> > is not for the general populace.  Wouldn't you concur?  :-)
> >
> > The people doing Java server applets are such a small fraction of the
> > Linux user community.
>
> reason AIO is important is to make it easier to port code from NT.
>
> but I firmly believe that some form of AIO is vital.
>
> - Dan

>From the aio-0.3.1/README
section Current State

  IPv4 TCP and UDP (rx only) sockets.

It is simply too early for sexy discussions. For me, the most appealing part 
of AIO is the socket handling. It seems a little bit broken in the current 
glibc emulation/implementation.
Recv and send operations are ordered when used on the same socket handle.
Thus a recv must be finished before a subsequent send will happen.
Good idea for files, bad for sockets.

SGI's implementation, kaio, which works perfectly for me, is widely ignored
and suffers from the same unreserved-syscall problem as Ben's aio. I am sure
there is a reason for ignoring SGI's kaio, I just do not remember it.

With the current state of the different implementations it is difficult to
have sexy discussions about them, or to use them.
But I would really like tooooooooooooooooooo.

Gerold

-
The one-sig-per-fd patch did not get much attention either.
No one seems to use sockets these days.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-21 11:44                           ` aio Gerold Jury
@ 2001-12-21 13:48                             ` Ingo Molnar
  2001-12-21 15:27                               ` aio Gerold Jury
  0 siblings, 1 reply; 168+ messages in thread
From: Ingo Molnar @ 2001-12-21 13:48 UTC (permalink / raw)
  To: Gerold Jury; +Cc: Dan Kegel, David S. Miller, bcrl, linux-kernel, linux-aio


On Fri, 21 Dec 2001, Gerold Jury wrote:

> It is simply too early for sexy discussions. For me, the most
> appealing part of AIO is the socket handling. It seems a little bit
> broken in the current glibc emulation/implementation. Recv and send
> operations are ordered when used on the same socket handle. Thus a
> recv must be finished before a subsequent send will happen. Good idea
> for files, bad for sockets.

is this a fundamental limitation expressed in the interface, or just an
implementational limitation? On sockets this is indeed a big problem, HTTP
pipelining wants completely separate receive/send queues.

	Ingo


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-21 13:48                             ` aio Ingo Molnar
@ 2001-12-21 15:27                               ` Gerold Jury
  2001-12-24 11:08                                 ` aio Gerold Jury
  0 siblings, 1 reply; 168+ messages in thread
From: Gerold Jury @ 2001-12-21 15:27 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel, linux-aio, bcrl

On Friday 21 December 2001 14:48, Ingo Molnar wrote:
> On Fri, 21 Dec 2001, Gerold Jury wrote:
> > It is simply too early for sexy discussions. For me, the most
> > appealing part of AIO is the socket handling. It seems a little bit
> > broken in the current glibc emulation/implementation. Recv and send
> > operations are ordered when used on the same socket handle. Thus a
> > recv must be finished before a subsequent send will happen. Good idea
> > for files, bad for sockets.
>
> is this a fundamental limitation expressed in the interface, or just an
> implementational limitation? On sockets this is indeed a big problem, HTTP
> pipelining wants completely separate receive/send queues.
>
> 	Ingo
>

That is a very good question.

The Single UNIX ® Specification, Version 2 has the following to say.

If _POSIX_SYNCHRONIZED_IO is defined and synchronised I/O is enabled on the 
file associated with aiocbp->aio_fildes, the behaviour of this function is 
according to the definitions of synchronised I/O data integrity completion 
and synchronised I/O file integrity completion.

Maybe I was a little bit too fast in blaming glibc. I will go and look for
more documentation about disabling synchronised I/O on a socket.

Dup()licating the socket handle is an easy workaround, but now I am
convinced a little bit of man-page digging will be lots of fun.

I hope the efforts of Benjamin LaHaise receive more attention, and as soon
as I know more about disabling synchronised I/O on sockets I will send
another email.

Gerold

--
I love AIO

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-21 17:24                               ` aio Alan Cox
@ 2001-12-21 17:16                                 ` Benjamin LaHaise
  2001-12-23  5:35                                   ` aio Bill Huey
  0 siblings, 1 reply; 168+ messages in thread
From: Benjamin LaHaise @ 2001-12-21 17:16 UTC (permalink / raw)
  To: Alan Cox; +Cc: David S. Miller, billh, torvalds, linux-kernel, linux-aio

On Fri, Dec 21, 2001 at 05:24:33PM +0000, Alan Cox wrote:
> select/poll is a win - and Java recently discovered poll/select semantics 8)

Anything is a win over Java's threading model.

		-ben
-- 
Killer Attack Fish.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:47                             ` aio Benjamin LaHaise
  2001-12-20  5:39                               ` aio David S. Miller
@ 2001-12-21 17:24                               ` Alan Cox
  2001-12-21 17:16                                 ` aio Benjamin LaHaise
  1 sibling, 1 reply; 168+ messages in thread
From: Alan Cox @ 2001-12-21 17:24 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: David S. Miller, billh, torvalds, linux-kernel, linux-aio

> Who cares about Java?  What about high performance LDAP servers or tux-like 
> userspace performance?  How about faster select and poll?  An X server that 

select/poll is a win - and Java recently discovered poll/select semantics 8)

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  1:20                   ` aio David S. Miller
                                       ` (2 preceding siblings ...)
       [not found]                     ` <mailman.1008816001.10138.linux-kernel2news@redhat.com>
@ 2001-12-21 17:28                     ` Alan Cox
  2001-12-23  5:46                       ` aio Bill Huey
  3 siblings, 1 reply; 168+ messages in thread
From: Alan Cox @ 2001-12-21 17:28 UTC (permalink / raw)
  To: David S. Miller; +Cc: billh, bcrl, torvalds, linux-kernel, linux-aio

> Precisely, in fact.  Anyone who can say that Java is going to be
> relevant in a few years time, with a straight face, is only kidding
> themselves.

Oh it'll be very relevant. It's leaking into all sorts of embedded uses, from
Digital TV to smartcards. It's still useless for serious high-end work and
likely to stay so.

> Java is not something to justify a new kernel feature, that is for
> certain.

There we agree. Things like the current async/thread mess in Java are
partly poor design of the language and greatly stupid design of the JVM.

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: Scheduler ( was: Just a second ) ...
  2001-12-18 18:27             ` Doug Ledford
                                 ` (2 preceding siblings ...)
  2001-12-19 11:05               ` Helge Hafting
@ 2001-12-21 20:23               ` Rob Landley
  3 siblings, 0 replies; 168+ messages in thread
From: Rob Landley @ 2001-12-21 20:23 UTC (permalink / raw)
  To: Doug Ledford, Andreas Dilger; +Cc: Kernel Mailing List

On Tuesday 18 December 2001 01:27 pm, Doug Ledford wrote:

> Weel, evidently esd and artsd both do this (well, I assume esd does now, it
> didn't do this in the past).  Basically, they both transmit silence over
> the sound chip when nothing else is going on.  So even though you don't
> hear anything, the same sound output DMA is taking place.  That avoids

THAT explains it.

My Dell Inspiron 3500 laptop's built-in sound (NeoMagic MagicMedia 256 AV, 
uses ad1848 module) works fine when I first boot the sucker, but loses its 
marbles after an APM suspend and stops receiving interrupts.  (Extensive 
poking around with setpci has so far failed to get it working again, but on a 
shutdown and restart the bios sets it up fine.  Not a clue what's up there.  
The bios and module agree it's using IRQ 7, but lspci insists it's IRQ 11, 
both before and after apm suspend.  Boggle.)

I was confused for a while about how exactly it was failing because KDE and 
mpg123 from the command line fail in different ways.  mpg123 will play the 
same half-second clip in a loop (ahah! no interrupt!), but sound in kde just 
vanishes and I get silence and hung apps whenever I try to launch anything.

The clue is that it doesn't always fail when I suspend it without having X 
up.  Translation: maybe the sound card's getting hosed by being open and in 
use on APM shutdown!

Hmmm...  I should poke at this over the weekend...

(Nope, not a new problem.  My laptop's sound has been like this since at 
least 2.4.4, which I think was the first version I installed on the box.  But 
it's still annoying, I can go weeks without a true reboot 'cause I have a 
zillion konqueror windows and such open.  I have to clear my desktop to get 
sound working again for a few hours.  Obnoxious...)

Rob

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-21 17:16                                 ` aio Benjamin LaHaise
@ 2001-12-23  5:35                                   ` Bill Huey
  0 siblings, 0 replies; 168+ messages in thread
From: Bill Huey @ 2001-12-23  5:35 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Alan Cox, David S. Miller, billh, torvalds, linux-kernel,
	linux-aio

On Fri, Dec 21, 2001 at 12:16:45PM -0500, Benjamin LaHaise wrote:
> On Fri, Dec 21, 2001 at 05:24:33PM +0000, Alan Cox wrote:
> > select/poll is a win - and Java recently discovered poll/select semantics 8)
> 
> Anything is a win over Java's threading model.
> 
> 		-ben

Yeah, it's just another abstraction layer that lives on top of the native
threading model, so you don't have to worry about stuff like spinlock
contention since it's been pushed down into the native threading
implementation. It doesn't really add a tremendous amount of overhead given
how it delegates all of that to the native OS threading model.

Also, it would be nice to have some regular way of doing read-write locks
without having to implement them in the Java language itself, but it's not
too critical since folks don't really push or use the JVM in that way just
yet. It's certainly important in certain high-contention systems in the
kernel.

bill


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-21 17:28                     ` aio Alan Cox
@ 2001-12-23  5:46                       ` Bill Huey
  2001-12-23  6:34                         ` aio Dan Kegel
  2001-12-26 20:42                         ` Java and Flam^H^H^H^H AIO (was: aio) Daniel Phillips
  0 siblings, 2 replies; 168+ messages in thread
From: Bill Huey @ 2001-12-23  5:46 UTC (permalink / raw)
  To: Alan Cox; +Cc: David S. Miller, billh, bcrl, torvalds, linux-kernel, linux-aio

On Fri, Dec 21, 2001 at 05:28:36PM +0000, Alan Cox wrote:
> > Precisely, in fact.  Anyone who can say that Java is going to be
> > relevant in a few years time, with a straight face, is only kidding
> > themselves.
> 
> Oh it'll be very relevant. Its leaking into all sorts of embedded uses, from
> Digital TV to smartcards. Its still useless for serious high end work an
> likely to stay so.
> 
> > Java is not something to justify a new kernel feature, that is for
> > certain.
> 
> There we agree. Things like the current asynch/thread mess in java are
> partly poor design of language and greatly stupid design of JVM.

It's not the fault of the JVM runtime nor the language per se, since
both are excellent. The blame should instead be placed on the political
process within Sun, which has created a lag in getting a decent IO event
model/system available in the form of an API.

This newer system is supposed to be able to scale to tens of thousands of
FDs and be able to handle heavy-duty server-side stuff in a more graceful
manner. It's a reasonable system from what I saw, but the implementation
of it is highly OS-dependent and will be subject to those environmental
constraints. Couple this with the HotSpot compiler (supposedly competitive
with gcc's -O3, going by benchmarks) and it should be highly usable for a
broad range of server-side work when intelligently engineered.

bill


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-23  5:46                       ` aio Bill Huey
@ 2001-12-23  6:34                         ` Dan Kegel
  2001-12-23 18:43                           ` aio Davide Libenzi
  2001-12-26 20:42                         ` Java and Flam^H^H^H^H AIO (was: aio) Daniel Phillips
  1 sibling, 1 reply; 168+ messages in thread
From: Dan Kegel @ 2001-12-23  6:34 UTC (permalink / raw)
  To: Bill Huey
  Cc: Alan Cox, David S. Miller, bcrl, torvalds, linux-kernel,
	linux-aio

Bill Huey wrote:
> > There we agree. Things like the current asynch/thread mess in java are
> > partly poor design of language and greatly stupid design of JVM.
> 
> It's not the fault of the JVM runtime nor the language per se, since
> both are excellent. The blame should instead be placed on the political
> process within Sun, which has created a lag in getting a decent IO event
> model/system available in the form of an API.
> 
> This newer system is supposed to be able to scale to tens of thousands of
> FDs and be able to handle heavy duty server side stuff in a more graceful
> manner. It's a reasonable system from what I saw, but the implementation
> of it is highly OS dependent and will be subject to those environmental
> constraints. Couple this with the HotSpot compiler (supposedly competitive
> with gcc's -O3 from benchmarks) and it should be highly usable for a broad
> range of server side work when intelligently engineered.

I served on JSR-51, the expert group that helped design the new I/O
model.  (The design was Sun's, but we had quite a bit of input.)
For network I/O, there's a Selector object which is essentially
a nice OO wrapper around the /dev/poll or kqueue/kevent abstraction.
Selector does have a distinctly Unixy feel to it, but it can probably
be implemented well on top of any reasonable OS; I'm quite sure
it can be expressed fairly well in terms of Windows NT's async I/O
or Linux's rt signal stuff.

(I suspect the initial Linux implementations will just use poll(),
but that's something the Blackdown team can fix.  And heck, it
ought to be easy to implement it on top of all the nifty poll
replacements and choose between them at jvm startup time without
any noticeable overhead.)

- Dan

p.s. Davide, I didn't forget /dev/epoll, I just haven't had time to
post Poller_devepoll yet!

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-23  6:34                         ` aio Dan Kegel
@ 2001-12-23 18:43                           ` Davide Libenzi
  0 siblings, 0 replies; 168+ messages in thread
From: Davide Libenzi @ 2001-12-23 18:43 UTC (permalink / raw)
  To: Dan Kegel
  Cc: Bill Huey, Alan Cox, David S. Miller, bcrl, torvalds,
	linux-kernel, linux-aio

On Sat, 22 Dec 2001, Dan Kegel wrote:

> p.s. Davide, I didn't forget /dev/epoll, I just haven't had time to
> post Poller_devepoll yet!

Yep, I just started feeling angry about this :)




- Davide



^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-21 15:27                               ` aio Gerold Jury
@ 2001-12-24 11:08                                 ` Gerold Jury
  0 siblings, 0 replies; 168+ messages in thread
From: Gerold Jury @ 2001-12-24 11:08 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel

> On Friday 21 December 2001 14:48, Ingo Molnar wrote:
> > is this a fundamental limitation expressed in the interface, or just an
> > implementational limitation? On sockets this is indeed a big problem,
> > HTTP pipelining wants completely separate receive/send queues.
> >
> > 	Ingo
>

I got the _POSIX_SYNCHRONIZED_IO completely wrong.
It has nothing to do with the ordering of the aio read/write requests.

Aio_read and aio_write work with absolute positions inside the file size.
The order of requests is unspecified in SUSv2.
SUSv2 neither prevents nor forces the desired behaviour.

It seems like it is up to the implementation how to deal with the request 
order.
As mentioned earlier, SGI-kaio does the right thing with the same interface.

I want to add, that a combination of sigwaitinfo / sigtimedwait and aio is a 
very efficient way to deal with sockets. The accept may be handled with real 
time signals as well by using fcntl F_SETSIG and F_SETFL FASYNC.


Gerold


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20 21:59                                     ` aio Linus Torvalds
@ 2001-12-24 11:44                                       ` Gerold Jury
  0 siblings, 0 replies; 168+ messages in thread
From: Gerold Jury @ 2001-12-24 11:44 UTC (permalink / raw)
  To: linux-kernel, linux-aio

sigtimedwait and sigwaitinfo in combination with SIGIO prevents the 
asynchronous event problem and works very well for me.

Gerold

On Thursday 20 December 2001 22:59, Linus Torvalds wrote:
> It's much easier to have a synchronous interface to the asynchronous IO,
> ie one where you do not have to worry about events happening "at the same
> time".
>

^ permalink raw reply	[flat|nested] 168+ messages in thread

* Java and Flam^H^H^H^H AIO (was: aio)
  2001-12-23  5:46                       ` aio Bill Huey
  2001-12-23  6:34                         ` aio Dan Kegel
@ 2001-12-26 20:42                         ` Daniel Phillips
  1 sibling, 0 replies; 168+ messages in thread
From: Daniel Phillips @ 2001-12-26 20:42 UTC (permalink / raw)
  To: Bill Huey, Alan Cox
  Cc: David S. Miller, billh, bcrl, torvalds, linux-kernel, linux-aio

On December 23, 2001 06:46 am, Bill Huey wrote:
> On Fri, Dec 21, 2001 at 05:28:36PM +0000, Alan Cox wrote:
> > > Precisely, in fact.  Anyone who can say that Java is going to be
> > > relevant in a few years time, with a straight face, is only kidding
> > > themselves.
> > 
> > Oh it'll be very relevant. It's leaking into all sorts of embedded uses,
> > from Digital TV to smartcards. It's still useless for serious high end
> > work and likely to stay so.
> > 
> > > Java is not something to justify a new kernel feature, that is for
> > > certain.
> > 
> > There we agree. Things like the current asynch/thread mess in java are
> > partly poor design of language and greatly stupid design of JVM.
> 
> It's not the fault of the JVM runtime nor the the language per se since
> both are excellent. The blame should instead be placed on the political
> process within Sun, which has created a lag in getting a decent IO event
> model/system available in the form of an API.

Hey wait, it can't be so.  Sun apparently uses a boot camp system to 
guarantee that every project finishes on time, every time. 

* daniel ducks and runs

--
Daniel


^ permalink raw reply	[flat|nested] 168+ messages in thread

* Re: aio
  2001-12-20  3:21                             ` aio Bill Huey
@ 2001-12-27  9:36                               ` Martin Dalecki
  0 siblings, 0 replies; 168+ messages in thread
From: Martin Dalecki @ 2001-12-27  9:36 UTC (permalink / raw)
  To: Bill Huey
  Cc: David S. Miller, kerndev, bcrl, torvalds, linux-kernel, linux-aio

Bill Huey wrote:

>On Wed, Dec 19, 2001 at 07:06:29PM -0800, David S. Miller wrote:
>
>>Firstly, you say this as if server java applets do not function at all
>>or with acceptable performance today.  That is not true for the vast
>>majority of cases.
>>
>>If java server applet performance in all cases is dependent upon AIO
>>(it is not), that would be pretty sad.  But it wouldn't be the first
>>
>
>Java is pretty incomplete in this area, which should be addressed to a
>great degree in the new NIO API.
>
>The core JVM isn't dependent on this stuff per se for performance, but
>it is critical to server side programs that have to deal with highly
>scalable IO systems, large numbers of FDs, that go beyond the current
>expressiveness of select()/poll().
>
>This is all standard fare in *any* kind of high performance networking
>application where some kind of high performance kernel/userspace event
>delivery system is needed, kqueue() principally.
>
>>time I've heard crap like that.  There is propaganda out there telling
>>people that 64-bit address spaces are needed for good java
>>performance.  Guess where that came from?  (hint: they invented java
>>and are in the buisness of selling 64-bit RISC processors)
>>
>
>What? Oh god. HotSpot is a pretty amazing compiler and it performs well.
>Swing does well now, but the lingering issue in Java is the sheer size
>of it and possibly GC issues. It's pretty clear that it's going to get
>larger, which is fine since memory is cheap.
>
I remind you: ORACLE 9i requires half a gig as a minimum just due to the
use of the CRAPPY PIECE OF SHIT written in Java, called, you guessed it:
just the bloody damn Installer. Java is really condemned just due to the
fact that both terms, speed and memory usage, are always only *relative*
to other systems.

And yes, GCs have only one problem - they try to give a general solution
for problems which can easily be proven to be mathematically unsolvable.
The resulting nondeterministic behaviour of applications is indeed the
thing which hurts most.


^ permalink raw reply	[flat|nested] 168+ messages in thread

end of thread, other threads:[~2001-12-27  9:49 UTC | newest]

Thread overview: 168+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20011218020456.A11541@redhat.com>
2001-12-18 16:50 ` Scheduler ( was: Just a second ) Linus Torvalds
2001-12-18 16:56   ` Rik van Riel
2001-12-18 17:18     ` Linus Torvalds
2001-12-18 19:04       ` Alan Cox
2001-12-18 21:02         ` Larry McVoy
2001-12-18 21:14           ` David S. Miller
2001-12-18 21:17             ` Larry McVoy
2001-12-18 21:19               ` Rik van Riel
2001-12-18 21:30               ` David S. Miller
2001-12-18 21:18           ` Rik van Riel
2001-12-19 17:44           ` IRC (was: Scheduler) Daniel Phillips
2001-12-19 17:51             ` Larry McVoy
2001-12-19 18:24               ` Daniel Phillips
2001-12-19 18:19             ` M. Edward (Ed) Borasky
2001-12-19 18:27               ` Daniel Phillips
2001-12-19 18:40               ` J Sloan
2001-12-19 16:50         ` Scheduler ( was: Just a second ) Daniel Phillips
     [not found]           ` <Pine.LNX.4.33.0112190859050.1872-100000@penguin.transmeta.com>
2001-12-19 18:57             ` aio Ben LaHaise
2001-12-19 19:29               ` aio Dan Kegel
2001-12-20  4:04                 ` aio Benjamin LaHaise
2001-12-19 20:09               ` aio Daniel Phillips
2001-12-19 20:21               ` aio Davide Libenzi
     [not found]               ` <mailman.1008792601.3391.linux-kernel2news@redhat.com>
2001-12-19 20:23                 ` aio Pete Zaitcev
2001-12-20  0:13               ` aio David S. Miller
2001-12-20  0:21                 ` aio Benjamin LaHaise
2001-12-20  0:36                   ` aio Andrew Morton
2001-12-20  0:55                     ` aio H. Peter Anvin
2001-12-20  0:47                   ` aio Davide Libenzi
2001-12-20  1:16                 ` aio Bill Huey
2001-12-20  1:20                   ` aio David S. Miller
2001-12-20  2:26                     ` aio Bill Huey
2001-12-20  2:45                       ` aio David S. Miller
2001-12-19 18:57                         ` aio John Heil
2001-12-20  3:06                           ` aio David S. Miller
2001-12-19 19:30                             ` aio John Heil
2001-12-20  5:29                               ` aio David S. Miller
2001-12-20  3:21                             ` aio Bill Huey
2001-12-27  9:36                               ` aio Martin Dalecki
2001-12-20  3:07                         ` aio Bill Huey
2001-12-20  3:13                           ` aio David S. Miller
2001-12-20  3:47                             ` aio Benjamin LaHaise
2001-12-20  5:39                               ` aio David S. Miller
2001-12-20  5:58                                 ` aio Benjamin LaHaise
2001-12-20  6:00                                   ` aio David S. Miller
2001-12-20  6:46                                     ` aio Mike Castle
2001-12-20  6:55                                       ` aio Robert Love
2001-12-20  7:13                                         ` aio Mike Castle
2001-12-20  7:01                                       ` aio David S. Miller
2001-12-20  7:27                                 ` aio Daniel Phillips
     [not found]                                 ` <Pine.LNX.4.33.0112201127400.2656-100000@localhost.localdomain>
2001-12-20 11:49                                   ` aio William Lee Irwin III
2001-12-20 16:32                                   ` aio Dan Kegel
2001-12-20 18:05                                     ` aio Davide Libenzi
2001-12-20 21:45                                   ` aio Lincoln Dale
2001-12-20 21:59                                     ` aio Linus Torvalds
2001-12-24 11:44                                       ` aio Gerold Jury
2001-12-20 23:02                                     ` aio Lincoln Dale
2001-12-21 17:24                               ` aio Alan Cox
2001-12-21 17:16                                 ` aio Benjamin LaHaise
2001-12-23  5:35                                   ` aio Bill Huey
2001-12-20 14:38                             ` aio Luigi Genoni
2001-12-20 17:26                             ` aio Henning Schmiedehausen
2001-12-20 20:04                               ` aio M. Edward (Ed) Borasky
2001-12-20 23:53                               ` aio David S. Miller
2001-12-21  0:28                                 ` Offtopic Java/C# [Re: aio] Bill Huey
     [not found]                         ` <mailman.1008817860.10606.linux-kernel2news@redhat.com>
2001-12-20  5:16                           ` aio Pete Zaitcev
2001-12-20 16:16                         ` aio Dan Kegel
2001-12-21 11:44                           ` aio Gerold Jury
2001-12-21 13:48                             ` aio Ingo Molnar
2001-12-21 15:27                               ` aio Gerold Jury
2001-12-24 11:08                                 ` aio Gerold Jury
2001-12-20 17:24                         ` aio Henning Schmiedehausen
2001-12-20  2:37                     ` aio Cameron Simpson
2001-12-20  2:47                       ` aio David S. Miller
2001-12-20  2:52                         ` aio Cameron Simpson
2001-12-20  2:58                           ` aio David S. Miller
2001-12-20  5:47                             ` aio Linus Torvalds
2001-12-20  5:57                               ` aio David S. Miller
2001-12-20  5:59                                 ` aio Benjamin LaHaise
2001-12-20  6:02                                   ` aio David S. Miller
2001-12-20  6:07                                     ` aio Benjamin LaHaise
2001-12-20  6:12                                       ` aio David S. Miller
2001-12-20  6:23                                         ` aio Linus Torvalds
2001-12-20 10:18                                           ` aio Ingo Molnar
2001-12-20 18:20                                             ` aio Robert Love
2001-12-20 22:30                                               ` aio Cameron Simpson
2001-12-20 22:46                                                 ` aio Benjamin LaHaise
2001-12-20  6:09                                     ` aio Linus Torvalds
2001-12-20 17:28                                       ` aio Suparna Bhattacharya
     [not found]                     ` <mailman.1008816001.10138.linux-kernel2news@redhat.com>
2001-12-20  5:07                       ` aio Pete Zaitcev
2001-12-20  5:10                         ` aio Cameron Simpson
2001-12-21 17:28                     ` aio Alan Cox
2001-12-23  5:46                       ` aio Bill Huey
2001-12-23  6:34                         ` aio Dan Kegel
2001-12-23 18:43                           ` aio Davide Libenzi
2001-12-26 20:42                         ` Java and Flam^H^H^H^H AIO (was: aio) Daniel Phillips
2001-12-18 19:11       ` Scheduler ( was: Just a second ) Mike Galbraith
2001-12-18 19:15       ` Rik van Riel
2001-12-18 22:32         ` in defense of the linux-kernel mailing list Ingo Molnar
2001-12-18 17:55   ` Scheduler ( was: Just a second ) Davide Libenzi
2001-12-18 19:43   ` Alexander Viro
     [not found] <Pine.LNX.4.33.0112181508001.3410-100000@penguin.transmeta.com>
2001-12-20  3:50 ` Rik van Riel
2001-12-20  4:04   ` Ryan Cumming
2001-12-20  5:39   ` David S. Miller
2001-12-20  5:58     ` Linus Torvalds
2001-12-20  6:01       ` David S. Miller
2001-12-20 22:40         ` Troels Walsted Hansen
2001-12-20 23:55           ` Chris Ricker
2001-12-20 23:59             ` CaT
2001-12-21  0:06             ` Davide Libenzi
2001-12-20 11:29     ` Rik van Riel
2001-12-20 11:34       ` David S. Miller
2001-12-20  5:52   ` Linus Torvalds
2001-12-18  5:59 V Ganesh
  -- strict thread matches above, loose matches on Subject: below --
2001-12-18  5:11 Thierry Forveille
2001-12-17 21:41 ` John Heil
2001-12-18 14:31 ` Alan Cox
     [not found] <20011217200946.D753@holomorphy.com>
2001-12-18  4:27 ` Linus Torvalds
2001-12-18  4:55   ` William Lee Irwin III
2001-12-18  6:09     ` Linus Torvalds
2001-12-18  6:34       ` Jeff Garzik
2001-12-18 12:23       ` Rik van Riel
2001-12-18 14:29       ` Alan Cox
2001-12-18 17:07         ` Linus Torvalds
2001-12-18 15:51       ` Martin Josefsson
2001-12-18 17:08         ` Linus Torvalds
2001-12-18 16:16       ` Roger Larsson
2001-12-18 17:16         ` Herman Oosthuysen
2001-12-18 17:16         ` Linus Torvalds
2001-12-18 17:21       ` David Mansfield
2001-12-18 17:27         ` Linus Torvalds
2001-12-18 17:54           ` Andreas Dilger
2001-12-18 18:27             ` Doug Ledford
2001-12-18 18:52               ` Andreas Dilger
2001-12-18 19:03                 ` Doug Ledford
2001-12-19  9:19               ` Peter Wächtler
2001-12-19 11:05               ` Helge Hafting
2001-12-21 20:23               ` Rob Landley
2001-12-18 18:35             ` Linus Torvalds
2001-12-18 18:58           ` Alan Cox
2001-12-18 19:31             ` Gerd Knorr
2001-12-18 18:25       ` William Lee Irwin III
2001-12-18 14:21     ` Adam Schrotenboer
2001-12-18 18:13   ` Davide Libenzi
2001-12-16  0:13 Just a second Linus Torvalds
2001-12-17 22:48 ` Scheduler ( was: Just a second ) Davide Libenzi
2001-12-17 22:53   ` Linus Torvalds
2001-12-17 23:15     ` Davide Libenzi
2001-12-17 23:18       ` Linus Torvalds
2001-12-17 23:39         ` Davide Libenzi
2001-12-17 23:52         ` Benjamin LaHaise
2001-12-18  1:11           ` Linus Torvalds
2001-12-18  1:46             ` H. Peter Anvin
2001-12-18  5:54             ` Benjamin LaHaise
2001-12-18  6:10               ` Linus Torvalds
2001-12-18  1:54     ` Rik van Riel
2001-12-18  2:35       ` Linus Torvalds
2001-12-18  2:51         ` David Lang
2001-12-18  3:08         ` Davide Libenzi
2001-12-18  3:19           ` Davide Libenzi
2001-12-18 14:09         ` Alan Cox
2001-12-18  9:12           ` John Heil
2001-12-18 15:34           ` degger
2001-12-18 18:35             ` Mike Kravetz
2001-12-18 18:48             ` Davide Libenzi
2001-12-18 16:50           ` Mike Kravetz
2001-12-18 17:22             ` Linus Torvalds
2001-12-18 17:50               ` Davide Libenzi
2001-12-18 17:00           ` Linus Torvalds
2001-12-18 19:17             ` Alan Cox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox