Re: [2.4.17/18pre] VM and swap - it's really unusable

public inbox for linux-kernel@vger.kernel.org

* Re: [2.4.17/18pre] VM and swap - it's really unusable
@ 2002-01-08  3:02 Dieter Nützel
  2002-01-08 10:55 ` Luigi Genoni
                   ` (2 more replies)
  0 siblings, 3 replies; 351+ messages in thread
From: Dieter Nützel @ 2002-01-08  3:02 UTC (permalink / raw)
  To: Marcelo Tosatti, Andrea Arcangeli, Rik van Riel
  Cc: Linux Kernel List, Andrew Morton, Robert Love

Is it possible to decide now what should go into 2.4.18 (maybe -pre3): -aa or
-rmap?
Andrew Morton's read-latency.patch is a clear winner for me, too.
What about 00_nanosleep-5 and bootmem?
The O(1) scheduler?
Maybe preemption? It is disengageable so nobody should be harmed but we get 
the chance for wider testing.

Any comments?

Thanks,
	Dieter

-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel@hamburg.de

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08  3:02 [2.4.17/18pre] VM and swap - it's really unusable Dieter Nützel
@ 2002-01-08 10:55 ` Luigi Genoni
  2002-01-08 13:21   ` Andrea Arcangeli
  2002-01-08 13:51 ` J.A. Magallon
       [not found] ` <E16OHLf-0000Dn-00@starship.berlin>
  2 siblings, 1 reply; 351+ messages in thread
From: Luigi Genoni @ 2002-01-08 10:55 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Marcelo Tosatti, Andrea Arcangeli, Rik van Riel,
	Linux Kernel List, Andrew Morton, Robert Love



On Tue, 8 Jan 2002, Dieter Nützel wrote (passim):

> Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or
> -rmap?
[...]
> Maybe preemption? It is disengageable so nobody should be harmed but we get
> the chance for wider testing.
>
> Any comments?
preemption?? this is eventually 2.5 stuff, and should not be integrated
into 2.4 stable tree. Of course a backport is possible, when/if it will be
quite well tested and well working on 2.5







* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 10:55 ` Luigi Genoni
@ 2002-01-08 13:21   ` Andrea Arcangeli
  2002-01-08 13:33     ` Anton Blanchard
                       ` (2 more replies)
  0 siblings, 3 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-08 13:21 UTC (permalink / raw)
  To: Luigi Genoni
  Cc: Dieter Nützel, Marcelo Tosatti, Rik van Riel,
	Linux Kernel List, Andrew Morton, Robert Love

On Tue, Jan 08, 2002 at 11:55:59AM +0100, Luigi Genoni wrote:
> 
> 
> On Tue, 8 Jan 2002, Dieter Nützel wrote (passim):
> 
> > Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or
> > -rmap?
> [...]
> > Maybe preemption? It is disengageable so nobody should be harmed but we get
> > the chance for wider testing.
> >
> > Any comments?
> preemption?? this is eventually 2.5 stuff, and should not be integrated

indeed ("eventually" in the italian sense btw, obvious to me, but not
for l-k).

I'm not against preemption (I can see the benefits for the mean
latency of real-time DSP), but the claims that preemption makes the
kernel faster don't make sense to me. More frequent scheduling, the
overhead of branches in the locks (you have to conditional_schedule after
the last preemption lock is released, and there are the cachelines for
the per-cpu preemption locks) and the other preemption stuff can only
make the kernel slower.  Furthermore, for multimedia playback any sane
kernel out there with lowlatency fixes applied will work as well as a
preemption kernel that pays for all the preemption overhead.

As for the other claim, that as the kernel becomes more granular
performance will increase with preemption in the kernel, that's obviously
wrong as well; it's clearly the other way around. Maybe what was meant was
"latency will decrease further". That's right, but performance will
decrease too, if anything.

So yes, mean latency will decrease with preemptive kernel, but your CPU
is definitely paying something for it.

> into 2.4 stable tree. Of course a backport is possible, when/if it will be
> quite well tested and well working on 2.5
> 
> 
> 
> 

Andrea
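
The overhead described in this message (a counter touched on every lock
operation, plus a branch at unlock that checks whether a reschedule is
pending) can be illustrated with a toy model. This is a minimal sketch in
Python, not kernel code: the class ToyCpu, its fields and its method names
are invented for the example, and the logic is an assumption about how a
preempt counter could work rather than a copy of the -preempt patch.

```python
class ToyCpu:
    """Toy per-CPU state. All names here are hypothetical, not kernel identifiers."""

    def __init__(self):
        self.preempt_count = 0      # how many spinlocks this CPU holds right now
        self.need_resched = False   # set when an interrupt wakes a waiting task
        self.unlock_checks = 0      # counts the extra branch paid on each unlock
        self.reschedules = 0

    def spin_lock(self):
        self.preempt_count += 1     # preemption is off while any lock is held

    def spin_unlock(self):
        self.preempt_count -= 1
        self.unlock_checks += 1     # this test runs on every unlock, needed or not
        if self.preempt_count == 0 and self.need_resched:
            self.schedule()         # last lock released: reschedule right away

    def interrupt_wakes_task(self):
        self.need_resched = True
        if self.preempt_count == 0:
            self.schedule()         # no lock held: preempt immediately

    def schedule(self):
        self.need_resched = False
        self.reschedules += 1


cpu = ToyCpu()
cpu.spin_lock()
cpu.spin_lock()              # nested lock
cpu.interrupt_wakes_task()   # cannot preempt yet: two locks are held
cpu.spin_unlock()            # one lock still held, so only the check is paid
cpu.spin_unlock()            # last lock released, the reschedule happens here
print(cpu.reschedules, cpu.unlock_checks)   # prints "1 2"
```

In this model every unlock pays for the check even when nothing is waiting,
which is the cost being argued about; how large that cost is on real
hardware is exactly what the thread disputes, and the sketch does not
measure it.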


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 13:21   ` Andrea Arcangeli
@ 2002-01-08 13:33     ` Anton Blanchard
  2002-01-08 15:00       ` Daniel Phillips
  2002-01-08 17:41     ` Luigi Genoni
  2002-01-14  0:46     ` Bill Davidsen
  2 siblings, 1 reply; 351+ messages in thread
From: Anton Blanchard @ 2002-01-08 13:33 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Luigi Genoni, Dieter Nützel, Marcelo Tosatti, Rik van Riel,
	Linux Kernel List, Andrew Morton, Robert Love

 
> So yes, mean latency will decrease with preemptive kernel, but your CPU
> is definitely paying something for it.

And Andrew Morton's work suggests he can do a much better job of
reducing latency than -preempt.

Anton


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 13:33     ` Anton Blanchard
@ 2002-01-08 15:00       ` Daniel Phillips
  2002-01-08 15:29         ` Andrea Arcangeli
  2002-01-08 19:47         ` Andrew Morton
  0 siblings, 2 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-08 15:00 UTC (permalink / raw)
  To: Anton Blanchard, Andrea Arcangeli
  Cc: Luigi Genoni, Dieter Nützel, Marcelo Tosatti, Rik van Riel,
	Linux Kernel List, Andrew Morton, Robert Love

On January 8, 2002 02:33 pm, Anton Blanchard wrote:
> Andrea Arcangeli [apparently] wrote:
> > So yes, mean latency will decrease with preemptive kernel, but your CPU
> > is definitely paying something for it.
> 
> And Andrew Morton's work suggests he can do a much better job of
> reducing latency than -preempt.

That's not a particularly clueful comment, Anton.  Obviously, any 
latency-busting hacks that Andrew does could also be patched into a
-preempt kernel.

What a preemptible kernel can do that a non-preemptible kernel can't is: 
reschedule exactly as often as necessary, instead of having lots of extra 
schedule points inserted all over the place, firing when *they* think the 
time is right, which may well be earlier than necessary.

The preemptible approach is much less of a maintenance headache, since
people don't have to be constantly doing audits to see if something changed,
and going in to fiddle with scheduling points.

Finally, with preemption, rescheduling can be forced with essentially zero 
latency in response to an arbitrary interrupt such as IO completion, whereas 
the non-preemptive kernel will have to 'coast to a stop'.  In other words, 
the non-preemptive kernel will have little lags between successive IOs, 
whereas the preemptive kernel can submit the next IO immediately.  So there 
are bound to be loads where the preemptive kernel turns in better latency 
*and throughput* than the scheduling point hack.

Mind you, I'm not devaluing Andrew's work, it's good and valuable.  However 
it's good to be aware of why that approach can't equal the latency-busting 
performance of the preemptive approach.

--
Daniel


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 15:00       ` Daniel Phillips
@ 2002-01-08 15:29         ` Andrea Arcangeli
  2002-01-08 15:54           ` Daniel Phillips
  2002-01-08 20:55           ` [2.4.17/18pre] VM and swap - it's really unusable Robert Love
  2002-01-08 19:47         ` Andrew Morton
  1 sibling, 2 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-08 15:29 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Anton Blanchard, Luigi Genoni, Dieter Nützel, Marcelo Tosatti,
	Rik van Riel, Linux Kernel List, Andrew Morton, Robert Love

On Tue, Jan 08, 2002 at 04:00:11PM +0100, Daniel Phillips wrote:
> On January 8, 2002 02:33 pm, Anton Blanchard wrote:
> > Andrea Arcangeli [apparently] wrote:
> > > So yes, mean latency will decrease with preemptive kernel, but your CPU
> > > is definitely paying something for it.
> > 
> > And Andrew Morton's work suggests he can do a much better job of
> > reducing latency than -preempt.
> 
> That's not a particularly clueful comment, Anton.  Obviously, any 
> latency-busting hacks that Andrew does could also be patched into a
> -preempt kernel.
> 
> What a preemptible kernel can do that a non-preemptible kernel can't is: 
> reschedule exactly as often as necessary, instead of having lots of extra 
> schedule points inserted all over the place, firing when *they* think the 
> time is right, which may well be earlier than necessary.

"extra schedule points all over the place", that's the -preempt kernel
not the lowlatency kernel! (oh yeah, you don't see them in the source,
but ask your CPU if it sees them)

> The preemptible approach is much less of a maintenance headache, since
> people don't have to be constantly doing audits to see if something changed,
> and going in to fiddle with scheduling points.

This, yes: it requires less maintenance. But you should still keep in
mind the details about the spinlocks; things like the checks the VM does
in shrink_cache are also needed with a preemptive kernel.

> Finally, with preemption, rescheduling can be forced with essentially zero 
> latency in response to an arbitrary interrupt such as IO completion, whereas 
> the non-preemptive kernel will have to 'coast to a stop'.  In other words, 
> the non-preemptive kernel will have little lags between successive IOs, 
> whereas the preemptive kernel can submit the next IO immediately.  So there 
> are bound to be loads where the preemptive kernel turns in better latency 
> *and throughput* than the scheduling point hack.

The I/O pipeline is big enough that a few msec earlier or later in a
submit_bh shouldn't make a difference. The batch logic in the
ll_rw_block layer also tries to reduce the rescheduling. And last but not
least, if the task is I/O bound, a preemptive kernel or not won't make
any difference to the submit_bh latency, because no task is eating cpu
and the latency will be that of a pure schedule call.

> Mind you, I'm not devaluing Andrew's work, it's good and valuable.  However 
> it's good to be aware of why that approach can't equal the latency-busting 
> performance of the preemptive approach.

I also don't want to devalue the preemptive kernel approach (the mean
latency it can reach is lower than that of the lowlat kernel; however,
I personally care only about worst-case latency, and this is why I don't
feel the need for -preempt), but I just wanted to make clear that the
idea floating around that the preemptive kernel is all goodness is
very far from reality: you get very low mean latency, but at a price.

Andrea
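
The distinction between mean and worst-case latency in this exchange can be
made concrete with a toy model. The following is a minimal sketch in Python
under stated assumptions: a single kernel code path 100 time units long,
three explicit scheduling points, and two spinlock-held regions. Every
number and every name (sched_point_latency, preempt_latency, LOCKS and so
on) is invented for illustration; none of it comes from the patches under
discussion, and the results say nothing about real kernels.

```python
def sched_point_latency(t, sched_points, path_end):
    """Non-preemptible toy kernel: wait for the next explicit scheduling point,
    or for the end of the kernel code path if none is left."""
    later = [p for p in sched_points if p >= t]
    return (min(later) if later else path_end) - t


def preempt_latency(t, locked_regions):
    """Preemptible toy kernel: reschedule at once unless a spinlock is held,
    in which case wait until that lock is released."""
    for start, end in locked_regions:
        if start <= t < end:
            return end - t
    return 0


def summarize(latencies):
    """Return (mean, worst case) for a list of latencies."""
    return sum(latencies) / len(latencies), max(latencies)


PATH_END = 100                      # length of the kernel path, arbitrary units
SCHED_POINTS = [25, 50, 75]         # hypothetical explicit scheduling points
LOCKS = [(10, 20), (60, 90)]        # hypothetical spinlock-held regions
# Same locks, but the long one is dropped and retaken at t=75 (a "lock break").
LOCKS_BROKEN = [(10, 20), (60, 75), (75, 90)]

arrivals = range(PATH_END)          # one wakeup interrupt at each time step
lowlat = summarize([sched_point_latency(t, SCHED_POINTS, PATH_END) for t in arrivals])
preempt = summarize([preempt_latency(t, LOCKS) for t in arrivals])
both = summarize([preempt_latency(t, LOCKS_BROKEN) for t in arrivals])

print("scheduling points:    mean %.2f, worst %d" % lowlat)   # mean 12.25, worst 25
print("preempt:              mean %.2f, worst %d" % preempt)  # mean 5.20, worst 30
print("preempt + lock break: mean %.2f, worst %d" % both)     # mean 2.95, worst 15
```

With these made-up inputs the preemptible variant has the lower mean, but
its worst case is set by the longest lock hold time; splitting the long
lock improves both figures. A different choice of lock regions or
scheduling points would give different results.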


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 15:29         ` Andrea Arcangeli
@ 2002-01-08 15:54           ` Daniel Phillips
  2002-01-08 16:38             ` Andrea Arcangeli
  2002-01-08 23:02             ` Luigi Genoni
  2002-01-08 20:55           ` [2.4.17/18pre] VM and swap - it's really unusable Robert Love
  1 sibling, 2 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-08 15:54 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Anton Blanchard, Luigi Genoni, Dieter Nützel, Marcelo Tosatti,
	Rik van Riel, Linux Kernel List, Andrew Morton, Robert Love

On January 8, 2002 04:29 pm, Andrea Arcangeli wrote:
> > The preemptible approach is much less of a maintenance headache, since
> > people don't have to be constantly doing audits to see if something changed,
> > and going in to fiddle with scheduling points.
>
> This, yes: it requires less maintenance. But you should still keep in
> mind the details about the spinlocks; things like the checks the VM does
> in shrink_cache are also needed with a preemptive kernel.

Yes of course, the spinlock regions still have to be analyzed and both
patches have to be maintained for that.  Long duration spinlocks are bad
by any measure, and have to be dealt with anyway.

> > Finally, with preemption, rescheduling can be forced with essentially zero 
> > latency in response to an arbitrary interrupt such as IO completion, whereas 
> > the non-preemptive kernel will have to 'coast to a stop'.  In other words, 
> > the non-preemptive kernel will have little lags between successive IOs, 
> > whereas the preemptive kernel can submit the next IO immediately.  So there 
> > are bound to be loads where the preemptive kernel turns in better latency 
> > *and throughput* than the scheduling point hack.
> 
> The I/O pipeline is big enough that a few msec earlier or later in a
> submit_bh shouldn't make a difference. The batch logic in the
> ll_rw_block layer also tries to reduce the rescheduling. And last but not
> least, if the task is I/O bound, a preemptive kernel or not won't make
> any difference to the submit_bh latency, because no task is eating cpu
> and the latency will be that of a pure schedule call.

That's not correct.  For one thing, you don't know that no task is eating
CPU, or that nobody is hogging the kernel.  Look at the above, and consider
the part about the little lags between IOs.

> > Mind you, I'm not devaluing Andrew's work, it's good and valuable.  However 
> > it's good to be aware of why that approach can't equal the latency-busting 
> > performance of the preemptive approach.
> 
> I also don't want to devalue the preemptive kernel approach (the mean
> latency it can reach is lower than that of the lowlat kernel; however,
> I personally care only about worst-case latency, and this is why I don't
> feel the need for -preempt),

This is exactly the case that -preempt handles well.  On the other hand,
trying to show that scheduling hacks satisfy any given latency bound is
equivalent to solving the halting problem.

I thought you had done some real time work?

> but I just wanted to make clear that the
> idea that is floating around that preemptive kernel is all goodness is
> very far from reality, you get very low mean latency but at a price.

A price lots of people are willing to pay.

By the way, have you measured the cost of -preempt in practice?

--
Daniel


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 15:54           ` Daniel Phillips
@ 2002-01-08 16:38             ` Andrea Arcangeli
  2002-01-08 23:02             ` Luigi Genoni
  1 sibling, 0 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-08 16:38 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Anton Blanchard, Luigi Genoni, Dieter Nützel, Marcelo Tosatti,
	Rik van Riel, Linux Kernel List, Andrew Morton, Robert Love

On Tue, Jan 08, 2002 at 04:54:58PM +0100, Daniel Phillips wrote:
> On January 8, 2002 04:29 pm, Andrea Arcangeli wrote:
> > > The preemptible approach is much less of a maintenance headache, since
> > > people don't have to be constantly doing audits to see if something changed,
> > > and going in to fiddle with scheduling points.
> >
> > This, yes: it requires less maintenance. But you should still keep in
> > mind the details about the spinlocks; things like the checks the VM does
> > in shrink_cache are also needed with a preemptive kernel.
> 
> Yes of course, the spinlock regions still have to be analyzed and both
> patches have to be maintained for that.  Long duration spinlocks are bad
> by any measure, and have to be dealt with anyway.
> 
> > > Finally, with preemption, rescheduling can be forced with essentially zero 
> > > latency in response to an arbitrary interrupt such as IO completion, whereas 
> > > the non-preemptive kernel will have to 'coast to a stop'.  In other words, 
> > > the non-preemptive kernel will have little lags between successive IOs, 
> > > whereas the preemptive kernel can submit the next IO immediately.  So there 
> > > are bound to be loads where the preemptive kernel turns in better latency 
> > > *and throughput* than the scheduling point hack.
> > 
> > The I/O pipeline is big enough that a few msec earlier or later in a
> > submit_bh shouldn't make a difference. The batch logic in the
> > ll_rw_block layer also tries to reduce the rescheduling. And last but not
> > least, if the task is I/O bound, a preemptive kernel or not won't make
> > any difference to the submit_bh latency, because no task is eating cpu
> > and the latency will be that of a pure schedule call.
> 
> That's not correct.  For one thing, you don't know that no task is eating
> CPU, or that nobody is hogging the kernel.  Look at the above, and consider
> the part about the little lags between IOs.

We agree. Actually "if the task is I/O bound", I meant "if nobody is
hogging CPU", I exactly wanted to make the example of no task hogging
CPU in general. 

> > > Mind you, I'm not devaluing Andrew's work, it's good and valuable.  However 
> > > it's good to be aware of why that approach can't equal the latency-busting 
> > > performance of the preemptive approach.
> > 
> > I also don't want to devalue the preemptive kernel approach (the mean
> > latency it can reach is lower than that of the lowlat kernel; however,
> > I personally care only about worst-case latency, and this is why I don't
> > feel the need for -preempt),
> 
> This is exactly the case that -preempt handles well.  On the other hand,
> trying to show that scheduling hacks satisfy any given latency bound is
> equivalent to solving the halting problem.
> 
> I thought you had done some real time work?
> 
> > but I just wanted to make clear that the
> > idea that is floating around that preemptive kernel is all goodness is
> > very far from reality, you get very low mean latency but at a price.
> 
> A price lots of people are willing to pay.

I'm not convinced that all those people know exactly what they're
buying then 8).

> 
> By the way, have you measured the cost of -preempt in practice?

Dropping the lock in spin_unlock made a difference in the numbers, so
the overhead must definitely be visible simply by loading the system
with threaded kernel computation (like webserving, etc.).

Andrea


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 15:54           ` Daniel Phillips
  2002-01-08 16:38             ` Andrea Arcangeli
@ 2002-01-08 23:02             ` Luigi Genoni
  2002-01-08 23:32               ` Ken Brownfield
                                 ` (2 more replies)
  1 sibling, 3 replies; 351+ messages in thread
From: Luigi Genoni @ 2002-01-08 23:02 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Andrea Arcangeli, Anton Blanchard, Dieter Nützel, Marcelo Tosatti,
	Rik van Riel, Linux Kernel List, Andrew Morton, Robert Love



On Tue, 8 Jan 2002, Daniel Phillips wrote:

> On January 8, 2002 04:29 pm, Andrea Arcangeli wrote:
> > > The preemptible approach is much less of a maintenance headache, since
> > > people don't have to be constantly doing audits to see if something changed,
> > > and going in to fiddle with scheduling points.
> >
> > This, yes: it requires less maintenance. But you should still keep in
> > mind the details about the spinlocks; things like the checks the VM does
> > in shrink_cache are also needed with a preemptive kernel.
>
> Yes of course, the spinlock regions still have to be analyzed and both
> patches have to be maintained for that.  Long duration spinlocks are bad
> by any measure, and have to be dealt with anyway.
>
> > > Finally, with preemption, rescheduling can be forced with essentially zero
> > > latency in response to an arbitrary interrupt such as IO completion, whereas
> > > the non-preemptive kernel will have to 'coast to a stop'.  In other words,
> > > the non-preemptive kernel will have little lags between successive IOs,
> > > whereas the preemptive kernel can submit the next IO immediately.  So there
> > > are bound to be loads where the preemptive kernel turns in better latency
> > > *and throughput* than the scheduling point hack.
> >
> > The I/O pipeline is big enough that a few msec earlier or later in a
> > submit_bh shouldn't make a difference. The batch logic in the
> > ll_rw_block layer also tries to reduce the rescheduling. And last but not
> > least, if the task is I/O bound, a preemptive kernel or not won't make
> > any difference to the submit_bh latency, because no task is eating cpu
> > and the latency will be that of a pure schedule call.
>
> That's not correct.  For one thing, you don't know that no task is eating
> CPU, or that nobody is hogging the kernel.  Look at the above, and consider
> the part about the little lags between IOs.
>
> > > Mind you, I'm not devaluing Andrew's work, it's good and valuable.  However
> > > it's good to be aware of why that approach can't equal the latency-busting
> > > performance of the preemptive approach.
> >
> > I also don't want to devalue the preemptive kernel approach (the mean
> > latency it can reach is lower than that of the lowlat kernel; however,
> > I personally care only about worst-case latency, and this is why I don't
> > feel the need for -preempt),
>
> This is exactly the case that -preempt handles well.  On the other hand,
> trying to show that scheduling hacks satisfy any given latency bound is
> equivalent to solving the halting problem.
>
> I thought you had done some real time work?
>
> > but I just wanted to make clear that the
> > idea that is floating around that preemptive kernel is all goodness is
> > very far from reality, you get very low mean latency but at a price.
>
> A price lots of people are willing to pay
Sometimes they are probably not making a good deal. In reality
preempt is good in many scenarios, as I said, and I agree that on
desktops, and on dedicated servers where just one application runs and
the CPU is probably idle most of the time, users do get a feeling of
speed. But please consider heavily loaded servers with 40 or more
users: some are running gcc, others g77, others g++ compilations, someone
runs pine or mutt or kmail, and netscape, and mozilla, and emacs (some
from xterm, kde or gnome), and so on. You can also have 4 or 8 CPUs, but
they are not infinite ;) (though I am mainly thinking of dual Athlon
systems).
There is a lot of memory and disk I/O.
This is not an unusual scenario on the interactive servers used at SNS.
Here preempt has too high a price.
>
> By the way, have you measured the cost of -preempt in practice?
>
Yes, I did a lot of tests, and with the current preempt patch I was
definitely seeing too big a performance loss.




* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 23:02             ` Luigi Genoni
@ 2002-01-08 23:32               ` Ken Brownfield
  2002-01-08 23:42                 ` Robert Love
                                   ` (2 more replies)
  2002-01-09  0:13               ` Dieter Nützel
  2002-01-09  6:26               ` Daniel Phillips
  2 siblings, 3 replies; 351+ messages in thread
From: Ken Brownfield @ 2002-01-08 23:32 UTC (permalink / raw)
  To: Luigi Genoni; +Cc: linux-kernel

On Wed, Jan 09, 2002 at 12:02:48AM +0100, Luigi Genoni wrote:
| Sometimes they are probably not making a good deal. In reality
| preempt is good in many scenarios, as I said, and I agree that on
| desktops, and on dedicated servers where just one application runs and
| the CPU is probably idle most of the time, users do get a feeling of
| speed. But please consider heavily loaded servers with 40 or more
| users: some are running gcc, others g77, others g++ compilations, someone
| runs pine or mutt or kmail, and netscape, and mozilla, and emacs (some
| from xterm, kde or gnome), and so on. You can also have 4 or 8 CPUs, but
| they are not infinite ;) (though I am mainly thinking of dual Athlon
| systems).
| There is a lot of memory and disk I/O.
| This is not an unusual scenario on the interactive servers used at SNS.
| Here preempt has too high a price.

MacOS 9 is the OS for you.

Essentially what the low-latency patches are is cooperative
multitasking.  Which has less overhead in some cases than preemptive as
long as everyone is equally nice and calls WaitNextEvent() within the
right inner loops.  In the absence of preemptive, Andrew's patch is the
next best thing.  But Bad Things happen without preemptive.  Just try
using Mac OS 9. ;)

Preemptive gives better interactivity under load, which is the whole
point of multitasking (think about it).  If you don't want the overhead
(which also exists without preemptive) run #processes == #processors.

Whether or not preemptive is applied, having a large number of processes
active is a performance hit from context switches, cache thrashing, etc.
Preemptive punishes (and rewards) everyone equally, thus better latency.

I'm really surprised that people are still actually arguing _against_
preemptive multitasking in this day and age.  This is a no-brainer in
the long run, where current corner cases aren't holding us back.

At least IMVHO.
-- 
Ken.
brownfld@irridia.com
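
The cooperative-multitasking point made in this message can be sketched
with Python generators, where a `yield` plays the role of a voluntary
scheduling point (the WaitNextEvent() call in the Mac OS analogy). This is
a minimal illustration, assuming a simple round-robin runner; the names
task and longest_burst and all the numbers are hypothetical and are not
taken from any of the patches discussed.

```python
from collections import deque


def task(total_work, chunk):
    """Do `total_work` units of work, voluntarily yielding every `chunk` units."""
    remaining = total_work
    while remaining > 0:
        burst = min(chunk, remaining)
        remaining -= burst
        yield burst                 # CPU time used before giving up the CPU


def longest_burst(tasks):
    """Run the tasks round-robin and return the longest uninterrupted burst,
    i.e. the longest time every other task had to wait for one task."""
    queue = deque(tasks)
    worst = 0
    while queue:
        current = queue.popleft()
        try:
            burst = next(current)
        except StopIteration:
            continue                # task finished, drop it from the queue
        worst = max(worst, burst)
        queue.append(current)
    return worst


polite = longest_burst([task(100, 5), task(100, 5)])
with_hog = longest_burst([task(100, 5), task(100, 100)])
print(polite, with_hog)             # prints "5 100"
```

As long as every task yields often, nobody waits long; a single task that
never yields makes everyone else wait for its whole run, which is the
failure mode a preemptive scheduler is meant to rule out.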

| > By the way, have you measured the cost of -preempt in practice?
| >
| Yes, I did a lot of tests, and with the current preempt patch I was
| definitely seeing too big a performance loss.


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 23:32               ` Ken Brownfield
@ 2002-01-08 23:42                 ` Robert Love
  2002-01-08 23:52                 ` Luigi Genoni
  2002-01-09  0:10                 ` Alan Cox
  2 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-08 23:42 UTC (permalink / raw)
  To: Ken Brownfield; +Cc: Luigi Genoni, linux-kernel

On Tue, 2002-01-08 at 18:32, Ken Brownfield wrote:
> On Wed, Jan 09, 2002 at 12:02:48AM +0100, Luigi Genoni wrote:
> | Sometimes they are probably not making a good deal. In reality
> | preempt is good in many scenarios, as I said, and I agree that on
> | desktops, and on dedicated servers where just one application runs and
> | the CPU is probably idle most of the time, users do get a feeling of
> | speed. But please consider heavily loaded servers with 40 or more
> | users: some are running gcc, others g77, others g++ compilations, someone
> | runs pine or mutt or kmail, and netscape, and mozilla, and emacs (some
> | from xterm, kde or gnome), and so on. You can also have 4 or 8 CPUs, but
> | they are not infinite ;) (though I am mainly thinking of dual Athlon
> | systems).
> | There is a lot of memory and disk I/O.
> | This is not an unusual scenario on the interactive servers used at SNS.
> | Here preempt has too high a price.
> 
> MacOS 9 is the OS for you.
> 
> Essentially what the low-latency patches are is cooperative
> multitasking.  Which has less overhead in some cases than preemptive as
> long as everyone is equally nice and calls WaitNextEvent() within the
> right inner loops.  In the absence of preemptive, Andrew's patch is the
> next best thing.  But Bad Things happen without preemptive.  Just try
> using Mac OS 9. ;)
> 
> Preemptive gives better interactivity under load, which is the whole
> point of multitasking (think about it).  If you don't want the overhead
> (which also exists without preemptive) run #processes == #processors.
> 
> Whether or not preemptive is applied, having a large number of processes
> active is a performance hit from context switches, cache thrashing, etc.
> Preemptive punishes (and rewards) everyone equally, thus better latency.
> 
> I'm really surprised that people are still actually arguing _against_
> preemptive multitasking in this day and age.  This is a no-brainer in
> the long run, where current corner cases aren't holding us back.

Amen.

Of course, the counter argument is that we, as kernel programmers, can
design everything to behave kindly cooperatively.  I reply "now that the
kernel is SMP safe it is trivial to become preemptible" but some still
don't take the patch.  I'll keep trucking along.

	Robert Love





* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 23:32               ` Ken Brownfield
  2002-01-08 23:42                 ` Robert Love
@ 2002-01-08 23:52                 ` Luigi Genoni
  2002-01-09  0:10                 ` Alan Cox
  2 siblings, 0 replies; 351+ messages in thread
From: Luigi Genoni @ 2002-01-08 23:52 UTC (permalink / raw)
  To: Ken Brownfield; +Cc: linux-kernel



On Tue, 8 Jan 2002, Ken Brownfield wrote:

> On Wed, Jan 09, 2002 at 12:02:48AM +0100, Luigi Genoni wrote:
> | Sometimes they are probably not making a good deal. In reality
> | preempt is good in many scenarios, as I said, and I agree that on
> | desktops, and on dedicated servers where just one application runs and
> | the CPU is probably idle most of the time, users do get a feeling of
> | speed. But please consider heavily loaded servers with 40 or more
> | users: some are running gcc, others g77, others g++ compilations, someone
> | runs pine or mutt or kmail, and netscape, and mozilla, and emacs (some
> | from xterm, kde or gnome), and so on. You can also have 4 or 8 CPUs, but
> | they are not infinite ;) (though I am mainly thinking of dual Athlon
> | systems).
> | There is a lot of memory and disk I/O.
> | This is not an unusual scenario on the interactive servers used at SNS.
> | Here preempt has too high a price.
>
> MacOS 9 is the OS for you.
>
> Essentially what the low-latency patches are is cooperative
> multitasking.  Which has less overhead in some cases than preemptive as
> long as everyone is equally nice and calls WaitNextEvent() within the
> right inner loops.  In the absence of preemptive, Andrew's patch is the
> next best thing.  But Bad Things happen without preemptive.  Just try
> using Mac OS 9 :)
Not exactly what I was thinking about.
I have heard some horror stories from the Mac sysadmins at SNS.
>
> Preemptive gives better interactivity under load, which is the whole
> point of multitasking (think about it).  If you don't want the overhead
> (which also exists without preemptive) run #processes == #processors.
>
> Whether or not preemptive is applied, having a large number of processes
> active is a performance hit from context switches, cache thrashing, etc.
> Preemptive punishes (and rewards) everyone equally, thus better latency.
You are supposing that I want them to be punished equally. But there are
cases when that is not what you want ;). Think of one user running a
Monte Carlo code as a test on the server I was describing. This job could
run for, let's say, a couple of hours, and even under nice 20 it can suck
up a lot.
>
> I'm really surprised that people are still actually arguing _against_
> preemptive multitasking in this day and age.  This is a no-brainer in
> the long run, where current corner cases aren't holding us back.
>
> At least IMVHO.
What I am talking about is some tests I did a few weeks ago. The initial post
of this thread, I think, was very clear about that. In the long run, with
a very well tested implementation, yes. Right now it is not a good idea to
insert preempt into the 2.4 stable tree,
because there is a lot of work to do
to get a very WELL TESTED implementation.



* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 23:32               ` Ken Brownfield
  2002-01-08 23:42                 ` Robert Love
  2002-01-08 23:52                 ` Luigi Genoni
@ 2002-01-09  0:10                 ` Alan Cox
  2002-01-09  0:29                   ` John Alvord
                                     ` (2 more replies)
  2 siblings, 3 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-09  0:10 UTC (permalink / raw)
  To: Ken Brownfield; +Cc: Luigi Genoni, linux-kernel

> Preemptive gives better interactivity under load, which is the whole
> point of multitasking (think about it).  If you don't want the overhead
> (which also exists without preemptive) run #processes == #processors.

That is generally not true. Pre-emption is used in user space to prevent
applications doing very stupid things. Pre-emption in a trusted environment
can often be most efficient if done by the programs themselves.

Userspace is not a trusted environment

> I'm really surprised that people are still actually arguing _against_
> preemptive multitasking in this day and age.  This is a no-brainer in
> the long run, where current corner cases aren't holding us back.

Andrew's patches give you 1mS worst case latency for normal situations, that
is below human perception, and below scheduling granularity. In other words,
without the efficiency loss and the debugging problems, you can push the
latency far enough below other effects that it isn't worth attacking any more.

Alan

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  0:10                 ` Alan Cox
@ 2002-01-09  0:29                   ` John Alvord
  2002-01-09  0:43                     ` Robert Love
                                       ` (2 more replies)
  2002-01-09  5:08                   ` Andrew Morton
  2002-01-10  9:59                   ` Ken Brownfield
  2 siblings, 3 replies; 351+ messages in thread
From: John Alvord @ 2002-01-09  0:29 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

On Wed, 9 Jan 2002 00:10:38 +0000 (GMT), Alan Cox
<alan@lxorguk.ukuu.org.uk> wrote:

>> Preemptive gives better interactivity under load, which is the whole
>> point of multitasking (think about it).  If you don't want the overhead
>> (which also exists without preemptive) run #processes == #processors.
>
>That is generally not true. Pre-emption is used in user space to prevent
>applications doing very stupid things. Pre-emption in a trusted environment
>can often be most efficient if done by the programs themselves.
>
>Userspace is not a trusted environment
The best part about planned preemption points is that there is minimal
state to save when an interruption occurs.

>
>> I'm really surprised that people are still actually arguing _against_
>> preemptive multitasking in this day and age.  This is a no-brainer in
>> the long run, where current corner cases aren't holding us back.
>
>Andrew's patches give you 1mS worst case latency for normal situations, that
>is below human perception, and below scheduling granularity. In other words,
>without the efficiency loss and the debugging problems, you can push the
>latency far enough below other effects that it isn't worth attacking any more.

Incidentally human visual perception runs around 200 milliseconds
minimum and hearing/touch perception around 100 milliseconds if the
signal has to go through the brain. Of course we extend our
perceptions with tools/programs etc.

john

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  0:29                   ` John Alvord
@ 2002-01-09  0:43                     ` Robert Love
  2002-01-09 16:58                     ` Kent Borg
  2002-01-10 19:08                     ` Jussi Laako
  2 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-09  0:43 UTC (permalink / raw)
  To: John Alvord; +Cc: linux-kernel

On Tue, 2002-01-08 at 19:29, John Alvord wrote:

> The best part about planned preemption points is that there is minimal
> state to save when an interruption occurs.

Actually, both preempt-kernel and low-latency do about the same amount
of work re saving state.

With preempt-kernel, when a task is preempted in-kernel we AND a flag
value into the preempt count.  That is all we need to keep track of
things.

With low-latency, the task state is set to TASK_RUNNING (which is a
precautionary measure).  So it is about the same, although low-latency
(and lock-break) often also have to do various setup with the locks and
all.

	Robert Love

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  0:29                   ` John Alvord
  2002-01-09  0:43                     ` Robert Love
@ 2002-01-09 16:58                     ` Kent Borg
  2002-01-14  1:28                       ` Bill Davidsen
  2002-01-10 19:08                     ` Jussi Laako
  2 siblings, 1 reply; 351+ messages in thread
From: Kent Borg @ 2002-01-09 16:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: Alan Cox, Robert Love, John Alvord

On Tue, Jan 08, 2002 at 04:29:59PM -0800, John Alvord wrote:
> Incidentally human visual perception runs around 200 milliseconds
> minimum and hearing/touch perception around 100 milliseconds if the
> signal has to go through the brain. Of course we extend our
> perceptions with tools/programs etc.

Cool!  That means movies don't need to run faster than 5
frames/second.  Maybe 10 frames/second for plenty of overkill.  No
need to look at keyboards and mice any more frequently either, what a
relief.  (And why do silly gamers want to go so much higher?)

Sarcasm mode "off" now...just because some experiments show it takes
humans a long time to push the correct button when you show them a
picture of a banana doesn't mean there is no reason to have a user
interface do anything any faster.  (I can come up with plenty of
examples if you would like.)

OK, now that I have pissed off a big hunk of the folks on the list,
let me bring up a different question: 

How does all this fit into doing a tick-less kernel?

There is something appealing about doing stuff only when there is
stuff to do, like: respond to input, handle some device that becomes
ready, or let another process run for a while.  Didn't IBM do some
nice work on this for Linux?  (*Was* it nice work?)  I was under the
impression that the current kernel isn't that far from being tickless.

A tickless kernel would be wonderful for battery powered devices that
could literally shut off when there be nothing to do, and it seems it
would (trivially?) help performance on high end power hogs too.

Why do we have regular HZ ticks?  (Other than I think I remember Linus
saying that he likes them.)  

Thanks,

-kb, the Kent who knows more about user interfaces than he does
preemption.

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 16:58                     ` Kent Borg
@ 2002-01-14  1:28                       ` Bill Davidsen
  2002-01-14  1:54                         ` Alan Cox
  2002-01-14 20:12                         ` george anzinger
  0 siblings, 2 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-14  1:28 UTC (permalink / raw)
  To: Linux Kernel Mailing List

On Wed, 9 Jan 2002, Kent Borg wrote:

> How does all this fit into doing a tick-less kernel?
> 
> There is something appealing about doing stuff only when there is
> stuff to do, like: respond to input, handle some device that becomes
> ready, or let another process run for a while.  Didn't IBM do some
> nice work on this for Linux?  (*Was* it nice work?)  I was under the
> impression that the current kernel isn't that far from being tickless.
> 
> A tickless kernel would be wonderful for battery powered devices that
> could literally shut off when there be nothing to do, and it seems it
> would (trivially?) help performance on high end power hogs too.
> 
> Why do we have regular HZ ticks?  (Other than I think I remember Linus
> saying that he likes them.)  

Feel free to quantify the savings over the current setup with max power
saving enabled in the kernel. I just don't see how "wonderful" it would
be, given that an idle system currently uses very little battery if you
setup the options to save power.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  1:28                       ` Bill Davidsen
@ 2002-01-14  1:54                         ` Alan Cox
  2002-01-14 20:12                         ` george anzinger
  1 sibling, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-14  1:54 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux Kernel Mailing List

> Feel free to quantify the savings over the current setup with max power
> saving enabled in the kernel. I just don't see how "wonderful" it would
> be, given that an idle system currently uses very little battery if you
> setup the options to save power.

IBM have a tickless kernel patch set for the S/390. Here it's not battery at
stake but VM overhead from sending timer interrupts to hundreds of otherwise
idle virtual machines.

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  1:28                       ` Bill Davidsen
  2002-01-14  1:54                         ` Alan Cox
@ 2002-01-14 20:12                         ` george anzinger
  1 sibling, 0 replies; 351+ messages in thread
From: george anzinger @ 2002-01-14 20:12 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux Kernel Mailing List

Bill Davidsen wrote:
> 
> On Wed, 9 Jan 2002, Kent Borg wrote:
> 
> > How does all this fit into doing a tick-less kernel?
> >
> > There is something appealing about doing stuff only when there is
> > stuff to do, like: respond to input, handle some device that becomes
> > ready, or let another process run for a while.  Didn't IBM do some
> > nice work on this for Linux?  (*Was* it nice work?)  I was under the
> > impression that the current kernel isn't that far from being tickless.
> >
> > A tickless kernel would be wonderful for battery powered devices that
> > could literally shut off when there be nothing to do, and it seems it
> > would (trivially?) help performance on high end power hogs too.
> >
> > Why do we have regular HZ ticks?  (Other than I think I remember Linus
> > saying that he likes them.)
>
I put a patch on sourceforge as part of the high-res-timers
investigation that implemented a tickless kernel with instrumentation.
It turns out to be overload prone, mostly due to the need to start and
stop a "slice" timer on each schedule() call.  I, for one, think this
issue is dead and rightly so.  The patch is still there for those who
want to try it.  See signature for URL.
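
As a rough illustration of why per-schedule() timer work can be overload
prone, here is a crude count in C.  It is only a sketch, not taken from the
patch: the function names are hypothetical, HZ=100 is an assumed tick rate,
and it counts operations without weighing what each one costs.

```c
#include <assert.h>

/* Timer interrupts per second with a fixed periodic tick.
 * This number does not depend on how busy the system is. */
static long periodic_tick_events(long hz)
{
    assert(hz > 0);
    return hz;
}

/* Timer operations per second when a "slice" timer is started and
 * stopped around every schedule() call (assumed: one start plus one
 * stop per call).  This number grows with the context-switch rate. */
static long slice_timer_ops(long schedules_per_sec)
{
    assert(schedules_per_sec >= 0);
    return 2 * schedules_per_sec;
}
```

With these assumptions an idle box doing 10 schedule() calls per second
performs 20 timer operations instead of 100 ticks, while a loaded box doing
5000 calls per second performs 10000.  The work scales with load, which is
the wrong direction when the machine is already overloaded.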

 
> Feel free to quantify the savings over the current setup with max power
> saving enabled in the kernel. I just don't see how "wonderful" it would
> be, given that an idle system currently uses very little battery if you
> setup the options to save power.
> 
> --
> bill davidsen <davidsen@tmr.com>
>   CTO, TMR Associates, Inc
> Doing interesting things with little computers since 1979.
> 

-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  0:29                   ` John Alvord
  2002-01-09  0:43                     ` Robert Love
  2002-01-09 16:58                     ` Kent Borg
@ 2002-01-10 19:08                     ` Jussi Laako
  2 siblings, 0 replies; 351+ messages in thread
From: Jussi Laako @ 2002-01-10 19:08 UTC (permalink / raw)
  To: John Alvord; +Cc: linux-kernel

John Alvord wrote:
> 
> Incidentally human visual perception runs around 200 milliseconds
> minimum and hearing/touch perception around 100 milliseconds if the
> signal has to go through the brain. Of course we extend our
> perceptions with tools/programs etc.

Have you ever tried to play piano with 100 ms latency from pressing the key
to sound? I can tell you that it's pretty difficult...

 - Jussi Laako

-- 
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B  39DD A4DE 63EB C216 1E4B
Available at PGP keyservers


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  0:10                 ` Alan Cox
  2002-01-09  0:29                   ` John Alvord
@ 2002-01-09  5:08                   ` Andrew Morton
  2002-01-09  5:42                     ` Robert Love
                                       ` (2 more replies)
  2002-01-10  9:59                   ` Ken Brownfield
  2 siblings, 3 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-09  5:08 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
> 
> Andrew's patches give you 1mS worst case latency for normal situations, that
> is below human perception, and below scheduling granularity.

The full ll patch is pretty gruesome though.

The high-end audio synth guys claim that two milliseconds is getting
to be too much.  They are generating real-time audio and they do
have more than one round-trip through the software.  It adds up.

Linux is being used in so many different applications now.  You are,
I think, one of the stronger recognisers of the fact that we do not
only use Linux to squirt out html and to provide shell prompts to snotty
students.  Good scheduling responsiveness is a valuable feature.

I haven't seen any figures for embedded XP, but it is said that
if you bend over backwards you can get 10 milliseconds out of NT4,
and 4-5 out of the fabled BeOS.  This is one area where we can
fairly easily be very much the best.  It's low-hanging fruit.

Internal preemptability is, in my opinion, the best way to deliver
this.

I accept your point about it making debugging harder - I would
suggest that the preempt code be altered so that it can be disabled
at runtime, rather than via a rebuild.  I suspect this can be
done at zero cost by setting init_task's preempt count to 1000000
via a kernel boot option.  And at almost-zero cost via a sysctl.
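
A minimal userspace sketch of that idea in C, assuming a simplified
per-task counter where zero means "may be preempted".  The struct and the
function names are hypothetical stand-ins, not the code from the preempt
patch; only the figure 1000000 comes from the paragraph above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a task's preemption counter. */
struct task {
    long preempt_count;   /* 0 means the task may be preempted */
};

/* Bias large enough that nested enable/disable pairs never reach 0. */
#define PREEMPT_DISABLE_BIAS 1000000L

/* Models the boot option: optionally start with the count biased. */
static void task_init(struct task *t, bool preempt_off_at_boot)
{
    t->preempt_count = preempt_off_at_boot ? PREEMPT_DISABLE_BIAS : 0;
}

/* Taking a lock (or similar) bumps the count. */
static void preempt_disable(struct task *t)
{
    t->preempt_count++;
}

/* Dropping the lock lowers it again; it must never go negative. */
static void preempt_enable(struct task *t)
{
    assert(t->preempt_count > 0);
    t->preempt_count--;
}

/* The existing "is the count zero?" test is the only check needed. */
static bool preemptible(const struct task *t)
{
    return t->preempt_count == 0;
}
```

The point of the bias is that no new test is added on the fast path: the
check that already exists simply never succeeds, which is why the cost
should be close to zero.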

I would further suggest that support be added to the kernel to
allow general users to both detect and find the source of latency
problems.  That's actually pretty easy - realfeel running at
2 kHz only consumes 2-3% of the CPU.  It can just be left ticking
over in the background.
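
For readers who want to see what such a background probe might look like,
here is a hedged sketch in C.  It is not realfeel: it uses POSIX
clock_nanosleep() with absolute deadlines as a stand-in for a periodic
interrupt source, and the function names are made up for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Sleep until successive absolute deadlines spaced 1/hz apart and
 * return the worst observed lateness in nanoseconds, or -1 on error. */
static int64_t max_wakeup_lateness_ns(int hz, int iterations)
{
    assert(hz > 0 && iterations > 0);
    const int64_t period_ns = 1000000000LL / hz;
    struct timespec now;

    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
        return -1;

    int64_t deadline = ts_to_ns(&now);
    int64_t worst = 0;

    for (int i = 0; i < iterations; i++) {
        deadline += period_ns;
        struct timespec target = {
            .tv_sec = deadline / 1000000000LL,
            .tv_nsec = deadline % 1000000000LL,
        };
        int rc;

        /* Absolute deadlines avoid accumulating drift; retry on EINTR. */
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
                                 &target, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -1;

        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        int64_t late = ts_to_ns(&now) - deadline;
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Calling max_wakeup_lateness_ns(2000, 2000) samples for about one second at
2 kHz.  The result mixes timer resolution with scheduling latency, so it is
only an upper-bound style indicator, not a precise measurement.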

With preemptability merged in 2.5 we can then work to fix the
long-held locks.  Most of them are simple.  Some of them are
very much not.  I'll gladly help with that.

-

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  5:08                   ` Andrew Morton
@ 2002-01-09  5:42                     ` Robert Love
  2002-01-09  9:08                     ` Helge Hafting
  2002-01-09 17:00                     ` Alan Cox
  2 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-09  5:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alan Cox, linux-kernel

On Wed, 2002-01-09 at 00:08, Andrew Morton wrote:
> [snip]
> With preemptability merged in 2.5 we can then work to fix the
> long-held locks.  Most of them are simple.  Some of them are
> very much not.  I'll gladly help with that.

Amen.  Thank you, Andrew.

Let's work together and make a better kernel.

	Robert Love


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  5:08                   ` Andrew Morton
  2002-01-09  5:42                     ` Robert Love
@ 2002-01-09  9:08                     ` Helge Hafting
  2002-01-09 17:00                     ` Alan Cox
  2 siblings, 0 replies; 351+ messages in thread
From: Helge Hafting @ 2002-01-09  9:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew Morton wrote:
[...]
> I haven't seen any figures for embedded XP, but it is said that
> if you bend over backwards you can get 10 milliseconds out of NT4,
> and 4-5 out of the fabled BeOS.  This is one area where we can
> fairly easily be very much the best.  It's low-hanging fruit.
> 
> Internal preemptability is, in my opinion, the best way to deliver
> this.
> 
> I accept your point about it making debugging harder - I would
> suggest that the preempt code be altered so that it can be disabled
> at runtime, rather than via a rebuild.  I suspect this can be
> done at zero cost by setting init_task's preempt count to 1000000
> via a kernel boot option.  And at almost-zero cost via a sysctl.

And with some bad luck, the bug goes away when you
do this.  The bug of the missing lock...

Helge Hafting

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  5:08                   ` Andrew Morton
  2002-01-09  5:42                     ` Robert Love
  2002-01-09  9:08                     ` Helge Hafting
@ 2002-01-09 17:00                     ` Alan Cox
  2002-01-09 11:44                       ` Rob Landley
  2 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-09 17:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Alan Cox, linux-kernel

> The high-end audio synth guys claim that two milliseconds is getting
> to be too much.  They are generating real-time audio and they do
> have more than one round-trip through the software.  It adds up.

Most of the stuff I've seen from high end audio people consists of
overthreaded, chains of code written without any consideration for the
actual cost of execution. There are exceptions - including
people dynamically compiling filters to get ideal cache and latency 
behaviour, but not enough.

Alan

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 17:00                     ` Alan Cox
@ 2002-01-09 11:44                       ` Rob Landley
  2002-01-09 19:57                         ` Andrew Morton
  2002-01-10  2:25                         ` Alan Cox
  0 siblings, 2 replies; 351+ messages in thread
From: Rob Landley @ 2002-01-09 11:44 UTC (permalink / raw)
  To: Alan Cox, Andrew Morton; +Cc: linux-kernel

On Wednesday 09 January 2002 12:00 pm, Alan Cox wrote:
> > The high-end audio synth guys claim that two milliseconds is getting
> > to be too much.  They are generating real-time audio and they do
> > have more than one round-trip through the software.  It adds up.
>
> Most of the stuff I've seen from high end audio people consists of
> overthreaded, chains of code written without any consideration for the
> actual cost of execution. There are exceptions - including
> people dynamically compiling filters to get ideal cache and latency
> behaviour, but not enough.
>
> Alan

News flash: people are writing sub-optimal apps in user space.

Do you want an operating system capable of running real-world code written by 
people who know more about their specific problem domain (audio) than about 
optimal coding in general, or do you want an operating system intended to 
only run well-behaved applications designed and implemented by experts?

Rob

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 11:44                       ` Rob Landley
@ 2002-01-09 19:57                         ` Andrew Morton
  2002-01-10 16:40                           ` Timothy Covell
  2002-01-10  2:25                         ` Alan Cox
  1 sibling, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-09 19:57 UTC (permalink / raw)
  To: Rob Landley; +Cc: Alan Cox, linux-kernel

Rob Landley wrote:
> 
> On Wednesday 09 January 2002 12:00 pm, Alan Cox wrote:
> > > The high-end audio synth guys claim that two milliseconds is getting
> > > to be too much.  They are generating real-time audio and they do
> > > have more than one round-trip through the software.  It adds up.
> >
> > Most of the stuff I've seen from high end audio people consists of
> > overthreaded, chains of code written without any consideration for the
> > actual cost of execution. There are exceptions - including
> > people dynamically compiling filters to get ideal cache and latency
> > behaviour, but not enough.
> >
> > Alan
> 
> News flash: people are writing sub-optimal apps in user space.

Not only in user space :)

> Do you want an operating system capable of running real-world code written by
> people who know more about their specific problem domain (audio) than about
> optimal coding in general, or do you want an operating system intended to
> only run well-behaved applications designed and implemented by experts?

The people with whom I dealt (Benno Senoner, Dave Philips, Paul
Barton-Davis) are deeply clueful about this stuff.  

I'll quote from an email Paul sent me a year ago.  I don't think
he'll mind.  This is, of course, a quite specialised application
area:

....

There are two kinds of situations where it's needed:

  1) real-time effects ("FX") processing
  2) real-time synthesis influenced by external controllers

In (1), we have an incoming audio signal that is to be processed in
some way (echo/flange/equalization/etc. etc.) and then delivered back
to the output audio stream. If the delay between the input and output
is more than a few msecs, there are several possible
consequences:

        * if the original source was acoustic (non-electronic), and
             the processed material is played back over monitors
             close to the acoustic source, you will get interesting
             filtering effects as the two signals interfere with each
             other. 

        * the musician will get confused by material in the 
             processed stream arriving "late"

        * the result may be useless.

In (2), a musician is using, for example, a keyboard that sends MIDI
"note on/note off" messages to the computer which are intended to
cause the synthesis engine to start/stop generating certain sounds. If
the delay between the musician pressing the key and hearing the sound
exceeds about 5msec, the system will feel difficult to play; worse, if
there is actual jitter in the +/- 5msec range, it will feel impossible
to play.

....

Without LL, Linux cannot reasonably be used for professional audio
work that involves real time FX or real time synthesis. The default
kernel has worst-case latencies noticeably worse than Windows, and
most people are reluctant to use that system already, not just because
of instability but latencies also. It's not a matter of it being "a bit
of a problem" - the 100msec worst case latencies visible in the
standard kernel make it totally implausible that you would ever deploy
Linux in a situation where RT FX/synthesis were going to happen.

By contrast, if we get LL in place, then we can potentially use Linux
in "black box"/"embedded" systems designed specifically for audio
users; all the flexibility of Linux, but if they choose to ignore most
of that, they'll still have a black rack-mounted box capable of doing
everything (or mostly everything) currently done by dedicated
hardware. As general purpose CPU performance continues to increase,
this becomes more and more overwhelmingly obvious as the way forward
for audio processing.

....

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 19:57                         ` Andrew Morton
@ 2002-01-10 16:40                           ` Timothy Covell
  0 siblings, 0 replies; 351+ messages in thread
From: Timothy Covell @ 2002-01-10 16:40 UTC (permalink / raw)
  To: Andrew Morton, Rob Landley; +Cc: Alan Cox, linux-kernel

On Wednesday 09 January 2002 13:57, Andrew Morton wrote:
[snip]
>
> Without LL, Linux cannot reasonably be used for professional audio
> work that involves real time FX or real time synthesis. The default
> kernel has worst-case latencies noticeably worse than Windows, and
> most people are reluctant to use that system already, not just because
> of instability but latencies also. It's not a matter of it being "a bit
> of a problem" - the 100msec worst case latencies visible in the
> standard kernel make it totally implausible that you would ever deploy
> Linux in a situation where RT FX/synthesis were going to happen.
>
> By contrast, if we get LL in place, then we can potentially use Linux
> in "black box"/"embedded" systems designed specifically for audio
> users; all the flexibility of Linux, but if they choose to ignore most
> of that, they'll still have a black rack-mounted box capable of doing
> everything (or mostly everything) currently done by dedicated
> hardware. As general purpose CPU performance continues to increase,
> this becomes more and more overwhelmingly obvious as the way forward
> for audio processing.
>

I keep on seeing these "Blackbox" apps like Tivo using Linux but the
fact remains the average folks cannot get any reasonable kind of A/V
performance and support under Linux.    That's what we need.

Needing to save money and get some fast cash (I'm unemployed), 
yesterday, I swapped out my dual P-III motherboard in my BeOS box
for a Via C-III (700MHz) based system.    And I got my first real hiccups
while using the OS when I was playing MP3s and _launching_ the TV
program _full screen_(640x480 on 640x480 virtual desktop window).

Obviously, when this stuff is done right, more CPU power can only help,
but it still has to be done right.  As I am sure that you know, BeOS claims
average latency of 250 microseconds.

-- 
timothy.covell@ashavan.org.

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 11:44                       ` Rob Landley
  2002-01-09 19:57                         ` Andrew Morton
@ 2002-01-10  2:25                         ` Alan Cox
  2002-01-10 10:06                           ` Rob Landley
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-10  2:25 UTC (permalink / raw)
  To: Rob Landley; +Cc: Alan Cox, Andrew Morton, linux-kernel

> Do you want an operating system capable of running real-world code written by 
> people who know more about their specific problem domain (audio) than about 
> optimal coding in general, or do you want an operating system intended to 
> only run well-behaved applications designed and implemented by experts?

I want an OS where a reasonably cluefully written audio program works. That
to me means aiming at the 1mS latency mark. Which doesn't seem to be needing
pre-empt. Beyond a typical 1mS latency you have hardware fun to worry about,
and the BIOS SMM code eating you.

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10  2:25                         ` Alan Cox
@ 2002-01-10 10:06                           ` Rob Landley
  2002-01-10 18:34                             ` Chris Friesen
  2002-01-10 19:01                             ` Alan Cox
  0 siblings, 2 replies; 351+ messages in thread
From: Rob Landley @ 2002-01-10 10:06 UTC (permalink / raw)
  To: Alan Cox; +Cc: Andrew Morton, linux-kernel

On Wednesday 09 January 2002 09:25 pm, Alan Cox wrote:
> > Do you want an operating system capable of running real-world code
> > written by people who know more about their specific problem domain
> > (audio) than about optimal coding in general, or do you want an operating
> > system intended to only run well-behaved applications designed and
> > implemented by experts?
>
> I want an OS where a reasonably cluefully written audio program works. That
> to me means aiming at the 1mS latency mark. Which doesn't seem to be
> needing pre-empt. Beyond a typical 1mS latency you have hardware fun to
> worry about, and the BIOS SMM code eating you.

I don't know what BIOS SMM code is, or what you mean by "hardware fun".  But 
the worst audio dropouts I have are "cp file.wav /dev/audio" when I forgot to 
kill cron and updatedb started up.  (This is considerably WORSE than mp3 
playing.)  I take it "cp" is badly written? :)

And a sound card with only 1mS of buffer in it is definitely not useable on 
windoze, the minimum buffer in the cheapest $12 PCI sound card I've seen is 
about 1/4 second (250ms).  (Is this what you mean by "hardware fun"?)  Even 
if the app was taking half that, it's still a > 100ms big gap where the OS 
leaves it hanging before you get a dropout.  (Okay, some of that's watermark 
policy, not sending more data to the card until half the buffer is 
exhausted...)  What sound output device DOESN'T have this much cache?  (You 
mentioned USB speakers in your diary at one point, which seemed to be like 
those old "parallel port cable plus a few resistors equals sound output" 
hacks...)

Now VIDEO is a slightly more interesting problem.  (Or synchronizing audio 
and video by sending really tiny chunks of audio.)  There's no hardware 
buffer there to cover our latency sins.  Then again, dropping frames is 
considered normal in the video world, isn't it? :)

Rob

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 10:06                           ` Rob Landley
@ 2002-01-10 18:34                             ` Chris Friesen
  2002-01-10 19:36                               ` David Weinehall
  2002-01-10 20:52                               ` Bernd Eckenfels
  2002-01-10 19:01                             ` Alan Cox
  1 sibling, 2 replies; 351+ messages in thread
From: Chris Friesen @ 2002-01-10 18:34 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

Rob Landley wrote:

> And a sound card with only 1mS of buffer in it is definitely not useable on
> windoze, the minimum buffer in the cheapest $12 PCI sound card I've seen is
> about 1/4 second (250ms).  (Is this what you mean by "hardware fun"?)  Even
> if the app was taking half that, it's still a > 100ms big gap where the OS
> leaves it hanging before you get a dropout.  (Okay, some of that's watermark
> policy, not sending more data to the card until half the buffer is
> exhausted...)  What sound output device DOESN'T have this much cache?

Imagine taking an input, doing dsp-type calculations on it, and sending it back
as output.  Now...imagine doing it in realtime with the output being fed back to
a monitor speaker.  Think about what would happen if the output of the monitor
speaker is 1/4 second behind the input at the mike.  Now do you see the
problem?  A few ms of delay might be okay.  A few hundred ms definitely is not.
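
To put numbers on the buffer sizes being discussed, a small C sketch.  The
sample format used in the note below (44.1 kHz, 16-bit, stereo) is an
assumption for illustration and is not stated anywhere in this thread; the
function name is hypothetical.

```c
#include <assert.h>

/* Milliseconds of audio held by a buffer of the given size.
 * A full buffer adds this much delay between input and output. */
static double buffer_latency_ms(long buffer_bytes, long sample_rate_hz,
                                int channels, int bytes_per_sample)
{
    assert(buffer_bytes >= 0 && sample_rate_hz > 0);
    assert(channels > 0 && bytes_per_sample > 0);

    double bytes_per_sec =
        (double)sample_rate_hz * channels * bytes_per_sample;
    return 1000.0 * (double)buffer_bytes / bytes_per_sec;
}
```

With the assumed format, buffer_latency_ms(44100, 44100, 2, 2) gives 250 ms,
i.e. the quarter-second buffer mentioned above, while a 5 ms budget
corresponds to a buffer of only 882 bytes.  That gap is why playback-sized
buffers cannot hide latency in a live monitoring setup.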

> Now VIDEO is a slightly more interesting problem.  (Or synchronizing audio
> and video by sending really tiny chunks of audio.)  There's no hardware
> buffer there to cover our latency sins.  Then again, dropping frames is
> considered normal in the video world, isn't it? :)

If I'm trying to watch a DVD on my computer, and assuming my CPU is powerful
enough to decode in realtime, then I want the DVD player to take
priority--dropping frames just because I'm starting up netscape is not
acceptable.

Chris

-- 
Chris Friesen                    | MailStop: 043/33/F10  
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 18:34                             ` Chris Friesen
@ 2002-01-10 19:36                               ` David Weinehall
  2002-01-10 20:00                                 ` Chris Friesen
  2002-01-10 20:52                               ` Bernd Eckenfels
  1 sibling, 1 reply; 351+ messages in thread
From: David Weinehall @ 2002-01-10 19:36 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Rob Landley, linux-kernel

On Thu, Jan 10, 2002 at 01:34:17PM -0500, Chris Friesen wrote:
> Rob Landley wrote:
> 
> > And a sound card with only 1mS of buffer in it is definitely not useable on
> > windoze, the minimum buffer in the cheapest $12 PCI sound card I've seen is
> > about 1/4 second (250ms).  (Is this what you mean by "hardware fun"?)  Even
> > if the app was taking half that, it's still a > 100ms big gap where the OS
> > leaves it hanging before you get a dropout.  (Okay, some of that's watermark
> > policy, not sending more data to the card until half the buffer is
> > exhausted...)  What sound output device DOESN'T have this much cache?
> 
> Imagine taking an input, doing dsp-type calculations on it, and sending it back
> as output.  Now...imagine doing it in realtime with the output being fed back to
> a monitor speaker.  Think about what would happen if the output of the monitor
> speaker is 1/4 second behind the input at the mike.  Now do you see the
> problem?  A few ms of delay might be okay.  A few hundred ms definitely is not.
> 
> > Now VIDEO is a slightly more interesting problem.  (Or synchronizing audio
> > and video by sending really tiny chunks of audio.)  There's no hardware
> > buffer there to cover our latency sins.  Then again, dropping frames is
> > considered normal in the video world, isn't it? :)
> 
> If I'm trying to watch a DVD on my computer, and assuming my CPU is powerful
> enough to decode in realtime, then I want the DVD player to take
> priority--dropping frames just because I'm starting up netscape is not
> acceptable.

Ummm, and you couldn't consider refraining from firing up Netscape
while watching the DVD, could you?!

I get your point, but the example was poorly chosen, imho.


Regards: David Weinehall
  _                                                                 _
 // David Weinehall <tao@acc.umu.se> /> Northern lights wander      \\
//  Maintainer of the v2.0 kernel   //  Dance across the winter sky //
\>  http://www.acc.umu.se/~tao/    </   Full colour fire           </

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 19:36                               ` David Weinehall
@ 2002-01-10 20:00                                 ` Chris Friesen
  2002-01-10 20:13                                   ` Jussi Laako
  0 siblings, 1 reply; 351+ messages in thread
From: Chris Friesen @ 2002-01-10 20:00 UTC (permalink / raw)
  To: linux-kernel

David Weinehall wrote:
> 
> On Thu, Jan 10, 2002 at 01:34:17PM -0500, Chris Friesen wrote:

> > If I'm trying to watch a DVD on my computer, and assuming my CPU is powerful
> > enough to decode in realtime, then I want the DVD player to take
> > priority--dropping frames just because I'm starting up netscape is not
> > acceptable.
> 
> Ummm, and you couldn't consider refraining from firing up Netscape
> while watching the DVD, could you?!
> 
> I get your point, but the example was poorly chosen, imho.

I chose netscape because it is probably the largest single app that I have on my
machine.  Other possibilities would be running a kernel compile, a recursive
search for specific file content through the entire filesystem, or anything else
that is likely to cause problems. It might even be someone else in the house
logged into it and running stuff over the network.


-- 
Chris Friesen                    | MailStop: 043/33/F10  
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 20:00                                 ` Chris Friesen
@ 2002-01-10 20:13                                   ` Jussi Laako
  0 siblings, 0 replies; 351+ messages in thread
From: Jussi Laako @ 2002-01-10 20:13 UTC (permalink / raw)
  To: linux-kernel

Chris Friesen wrote:
> 
> machine.  Other possibilities would be running a kernel compile, a 
> recursive search for specific file content through the entire filesystem, 
> or anything else that is likely to cause problems. It might even be 
> someone else in the house logged into it and running stuff over the 
> network.

It's not enjoyable late-night DVD watching when updatedb fires up in the middle
of the movie. Nor when you are trying to record some audio to the disk.
The vanilla kernel really chokes in those situations. Lowlatency patches help
a lot with this.


 - Jussi Laako

-- 
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B  39DD A4DE 63EB C216 1E4B
Available at PGP keyservers


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 18:34                             ` Chris Friesen
  2002-01-10 19:36                               ` David Weinehall
@ 2002-01-10 20:52                               ` Bernd Eckenfels
  1 sibling, 0 replies; 351+ messages in thread
From: Bernd Eckenfels @ 2002-01-10 20:52 UTC (permalink / raw)
  To: linux-kernel

In article <3C3DDEA9.E8FAB8DC@nortelnetworks.com> you wrote:
> Imagine taking an input, doing dsp-type calculations on it, and sending it back
> as output.  Now...imagine doing it in realtime with the output being fed back to
> a monitor speaker.  Think about what would happen if the output of the monitor
> speaker is 1/4 second behind the input at the mike.  Now do you see the
> problem?  A few ms of delay might be okay.

What kind of signal run time do you normally have in digital sound processing
equipment? AFAIK one can expect a few frames' worth of delay (n x 13ms).

Just don't feed the processed signal back to the singer's monitor box.

> If I'm trying to watch a DVD on my computer, and assuming my CPU is powerful
> enough to decode in realtime, then I want the DVD player to take
> priority--dropping frames just because I'm starting up netscape is not
> acceptable.

You do not start up netscape while you do realtime AV processing in a
professional environment.

Well, an easy fix is to have the LL patch and not use swap. Then you only
need reliable/predictable hardware (which is not so easy to get for a PC).

Greetings
Bernd


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 10:06                           ` Rob Landley
  2002-01-10 18:34                             ` Chris Friesen
@ 2002-01-10 19:01                             ` Alan Cox
  2002-01-11  2:47                               ` Nigel Gamble
  2002-01-14  2:46                               ` Pavel Machek
  1 sibling, 2 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-10 19:01 UTC (permalink / raw)
  To: Rob Landley; +Cc: Alan Cox, Andrew Morton, linux-kernel

> I don't know what BIOS SMM code is, or what you mean by "hardware fun".  But 
> the worst audio dropouts I have are "cp file.wav /dev/audio" when I forgot to 
> kill cron and updatedb started up.  (This is considerably WORSE than mp3 
> playing.)  I take it "cp" is badly written? :)

Those are ones that Andrew's patch should fix nicely. You might need a
decent VM as well though.

The fun below 1mS comes from

	1.	APM bios calls where the bios decides to take >1mS to have
		a chat with your batteries
	2.	Video cards pulling borderline legal PCI tricks to get
		better benchmarketing by stalling the entire bus

> And a sound card with only 1mS of buffer in it is definitely not useable on 
> windoze, the minimum buffer in the cheapest $12 PCI sound card I've seen is 
> about 1/4 second (250ms).  (Is this what you mean by "hardware fun"?)  Even 

For video conferencing and for real world audio mixing you can't use
that 250ms. Not even for games. If your audio is 150mS late in quake you
will notice it, really notice it. And the buffers on the audio card are
btw generally in RAM not the fifo on the chip, so they don't help when the
PCI bus loads up.

> exhausted...)  What sound output device DOESN'T have this much cache?  (You 
> mentioned USB speakers in your diary at one point, which seemed to be like 
> those old "parallel port cable plus a few resistors equals sound output" 
> hacks...)

Umm no, USB audio is rather good. USB sends isochronous, time guaranteed
sample streams down the USB bus, to the speakers where the D to A is clear
of the machine proper.

> Now VIDEO is a slightly more interesting problem.  (Or synchronizing audio 
> and video by sending really tiny chunks of audio.)  There's no hardware 
> buffer there to cover our latency sins.  Then again, dropping frames is 
> considered normal in the video world, isn't it? :)

You'll see those too. Pure playback is ok because you just have to buffer
enough rather than reliably hit deadlines.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 19:01                             ` Alan Cox
@ 2002-01-11  2:47                               ` Nigel Gamble
  2002-01-11  3:18                                 ` Andrew Morton
  2002-01-11 12:37                                 ` Alan Cox
  2002-01-14  2:46                               ` Pavel Machek
  1 sibling, 2 replies; 351+ messages in thread
From: Nigel Gamble @ 2002-01-11  2:47 UTC (permalink / raw)
  To: Alan Cox; +Cc: Rob Landley, Andrew Morton, linux-kernel

On Thu, 10 Jan 2002, Alan Cox wrote:
> The fun below 1mS comes from
>
> 	1.	APM bios calls where the bios decides to take >1mS to have
> 		a chat with your batteries
> 	2.	Video cards pulling borderline legal PCI tricks to get
> 		better benchmarketing by stalling the entire bus

Don't forget the embedded space, where the hardware vendor can ensure
that their hardware is well-behaved.  Even on a PC, it is possible for
someone who cares about realtime to spec a reasonable system.

On good hardware, we can easily do much better than 1ms latency with a
preemptible kernel and a spinlock cleanup.  I don't think the
limitations of some PC hardware should limit our goals for Linux.

Nigel Gamble                                    nigel@nrg.org
Mountain View, CA, USA.                         http://www.nrg.org/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11  2:47                               ` Nigel Gamble
@ 2002-01-11  3:18                                 ` Andrew Morton
  2002-01-11 12:37                                 ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-11  3:18 UTC (permalink / raw)
  To: nigel; +Cc: Alan Cox, Rob Landley, linux-kernel

Nigel Gamble wrote:
> 
> On Thu, 10 Jan 2002, Alan Cox wrote:
> > The fun below 1mS comes from
> >
> >       1.      APM bios calls where the bios decides to take >1mS to have
> >               a chat with your batteries
> >       2.      Video cards pulling borderline legal PCI tricks to get
> >               better benchmarketing by stalling the entire bus
> 
> Don't forget the embedded space, where the hardware vendor can ensure
> that their hardware is well-behaved.  Even on a PC, it is possible for
> someone who cares about realtime to spec a reasonable system.
> 
> On good hardware, we can easily do much better than 1ms latency with a
> preemptible kernel and a spinlock cleanup.  I don't think the
> limitations of some PC hardware should limit our goals for Linux.
> 

On 700MHz x86 running Cerberus we can do 50 microseconds average
and 1300 microseconds worst-case today.  

Below 1000 uSec, the required changes get exponentially larger
and more complex.  I doubt that it's sane to try to go below
a millisecond on a desktop-class machine with desktop-class
workload, disk, memory and swap capacities.

On a more constrained system, which is what I expect you're
referring to, 250 microseconds should be achievable.  Whether
or not that is achieved via preemptability is pretty irrelevant.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11  2:47                               ` Nigel Gamble
  2002-01-11  3:18                                 ` Andrew Morton
@ 2002-01-11 12:37                                 ` Alan Cox
  2002-01-11 20:33                                   ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-11 12:37 UTC (permalink / raw)
  To: nigel; +Cc: Alan Cox, Rob Landley, Andrew Morton, linux-kernel

> On good hardware, we can easily do much better than 1ms latency with a
> preemptible kernel and a spinlock cleanup.  I don't think the
> limitations of some PC hardware should limit our goals for Linux.

It's more than a spinlock cleanup at that point. To do anything useful you have
to tackle both priority inversion and some kind of at least semi-formal 
validation of the code itself. At the point it comes down to validating the
code I'd much rather validate rtlinux than the entire kernel.

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 12:37                                 ` Alan Cox
@ 2002-01-11 20:33                                   ` Robert Love
  2002-01-12  2:50                                     ` yodaiken
  2002-01-12 11:13                                     ` [2.4.17/18pre] VM and swap - it's really unusable Andrea Arcangeli
  0 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-11 20:33 UTC (permalink / raw)
  To: Alan Cox; +Cc: nigel, Rob Landley, Andrew Morton, linux-kernel

On Fri, 2002-01-11 at 07:37, Alan Cox wrote:

> It's more than a spinlock cleanup at that point. To do anything useful you have
> to tackle both priority inversion and some kind of at least semi-formal 
> validation of the code itself. At the point it comes down to validating the
> code I'd much rather validate rtlinux than the entire kernel

The preemptible kernel plus the spinlock cleanup could really take us
far.  Having looked at a lot of the long-held locks in the kernel, I am
confident at least reasonable progress could be made.

Beyond that, yah, we need a better locking construct.  Priority
inversion could be solved with a priority-inheriting mutex, which we can
tackle if and when we want to go that route.  Not now.

I want to lay the groundwork for a better kernel.  The preempt-kernel
patch gives real-world improvements, it provides a smoother user desktop
experience -- just look at the positive feedback.  Most importantly,
however, it provides a framework for superior response with our standard
kernel in its standard programming model. 

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 20:33                                   ` Robert Love
@ 2002-01-12  2:50                                     ` yodaiken
  2002-01-11 20:22                                       ` Rob Landley
  2002-01-12 11:13                                     ` [2.4.17/18pre] VM and swap - it's really unusable Andrea Arcangeli
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-12  2:50 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, nigel, Rob Landley, Andrew Morton, linux-kernel

On Fri, Jan 11, 2002 at 03:33:22PM -0500, Robert Love wrote:
> On Fri, 2002-01-11 at 07:37, Alan Cox wrote:
> The preemptible kernel plus the spinlock cleanup could really take us
> far.  Having looked at a lot of the long-held locks in the kernel, I am
> confident at least reasonable progress could be made.
> 
> Beyond that, yah, we need a better locking construct.  Priority
> inversion could be solved with a priority-inheriting mutex, which we can
> tackle if and when we want to go that route.  Not now.

Backing the car up to the edge of the cliff really gives us
good results. Beyond that, we could jump off the cliff
if we want to go that route.
Preempt leads to inheritance and inheritance leads to disaster.


All the numbers I've seen show Morton's low latency just works better. Are
there other numbers I should look at?



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12  2:50                                     ` yodaiken
@ 2002-01-11 20:22                                       ` Rob Landley
  2002-01-12  5:00                                         ` yodaiken
                                                           ` (4 more replies)
  0 siblings, 5 replies; 351+ messages in thread
From: Rob Landley @ 2002-01-11 20:22 UTC (permalink / raw)
  To: yodaiken, Robert Love; +Cc: Alan Cox, nigel, Andrew Morton, linux-kernel

On Friday 11 January 2002 09:50 pm, yodaiken@fsmlabs.com wrote:
> On Fri, Jan 11, 2002 at 03:33:22PM -0500, Robert Love wrote:
> > On Fri, 2002-01-11 at 07:37, Alan Cox wrote:
> > The preemptible kernel plus the spinlock cleanup could really take us
> > far.  Having looked at a lot of the long-held locks in the kernel, I am
> > confident at least reasonable progress could be made.
> >
> > Beyond that, yah, we need a better locking construct.  Priority
> > inversion could be solved with a priority-inheriting mutex, which we can
> > tackle if and when we want to go that route.  Not now.
>
> Backing the car up to the edge of the cliff really gives us
> good results. Beyond that, we could jump off the cliff
> if we want to go that route.
> Preempt leads to inheritance and inheritance leads to disaster.

If preempt leads to disaster then Linux can't do SMP.  Are you saying that's 
the case?

The preempt patch is really "SMP on UP".  If pre-empt shows up a problem, 
then it's a problem SMP users will see too.  If we can't take advantage of 
the existing SMP locking infrastructure to improve latency and interactive 
feel on UP machines, then SMP for linux DOES NOT WORK.

> All the numbers I've seen show Morton's low latency just works better. Are
> there other numbers I should look at?

This approach is basically a collection of heuristics.  The kernel has been 
profiled and everywhere a latency spike was found, a band-aid was put on it 
(an explicit scheduling point).  This doesn't say there aren't other latency 
spikes, just that with the collection of hardware and software being 
benchmarked, the latency spikes that were found have each had a band-aid 
individually applied to them.
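
For anyone who hasn't looked at what a scheduling point is, here is a userspace
analogy only, not code from either patch: need_resched_flag and the function
name are hypothetical stand-ins, and sched_yield() plays the role of calling
the scheduler.  A long loop checks a "someone else wants the CPU" flag every
so often and steps aside if it is set.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-task "reschedule requested" flag. */
volatile int need_resched_flag;

/*
 * Clear a buffer, checking for a pending reschedule request once per
 * "chunk" bytes.  Returns how many times the loop gave up the CPU.
 */
size_t zero_buffer_with_sched_points(unsigned char *buf, size_t len,
                                     size_t chunk)
{
    size_t yields = 0;
    size_t i;

    assert(chunk > 0);
    for (i = 0; i < len; i++) {
        buf[i] = 0;
        /* The explicit scheduling point: cheap test, rare yield. */
        if ((i + 1) % chunk == 0 && need_resched_flag) {
            need_resched_flag = 0;
            sched_yield();
            yields++;
        }
    }
    return yields;
}
```

The catch described above is visible here: the check only helps in loops where
somebody thought to put it.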

This isn't a BAD thing.  If the benchmarks used to find latency spikes are at 
all like real-world use, then it helps real-world applications.  But of 
COURSE the benchmarks are going to look good, since tuning the kernel to 
those benchmarks is the way the patch was developed!

The majority of the original low latency scheduling point work is handled 
automatically by the SMP on UP kernel.  You don't NEED to insert scheduling 
points anywhere you aren't inside a spinlock.  So the SMP on UP patch makes 
most of the explicit scheduling point patch go away, accomplishing the same 
thing in a less intrusive manner.  (Yes, it makes all kernels act like SMP 
kernels for debugging purposes.  But you can turn it off for debugging if you 
want to, that's just another toggle in the magic SysRq menu.  And this isn't 
entirely a bad thing: applying the enormous UP userbase to the remaining SMP 
bugs is bound to squeeze out one or two more obscure ones, but those bugs DO 
exist already on SMP.)

However, what's left of the explicit scheduling work is still very useful.  
When you ARE inside a spinlock, you can't just schedule, you have to save 
state, drop the lock(s), schedule, re-acquire the locks, and reload your 
state in case somebody else diddled with the structures you were using.  This 
is a lot harder than just scheduling, but breaking up long-held locks like 
this helps SMP scalability, AND helps latency in the SMP-on-UP case.
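
The save/drop/schedule/re-acquire/revalidate dance looks roughly like this.
It is a userspace sketch using a pthread mutex in place of a spinlock; struct
table, the generation counter and the function name are all invented for the
example, and it assumes every writer bumps "generation" while holding the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

struct table {
    pthread_mutex_t lock;
    unsigned long generation;   /* bumped by every writer, under lock */
    size_t len;
    int items[1024];
};

/* Sum all items without holding the lock for more than "batch" items. */
long sum_with_lock_break(struct table *t, size_t batch)
{
    long sum = 0;
    size_t i = 0;
    unsigned long seen;

    assert(batch > 0);
    pthread_mutex_lock(&t->lock);
    seen = t->generation;
    while (i < t->len) {
        sum += t->items[i++];
        if (i % batch == 0 && i < t->len) {
            /* State is saved in i, sum and seen; drop the lock. */
            pthread_mutex_unlock(&t->lock);
            sched_yield();
            pthread_mutex_lock(&t->lock);
            /* Somebody diddled with the table: start over. */
            if (t->generation != seen) {
                seen = t->generation;
                sum = 0;
                i = 0;
            }
        }
    }
    pthread_mutex_unlock(&t->lock);
    return sum;
}
```

Restarting from scratch is the crudest possible revalidation, and a steady
stream of writers could keep it restarting forever; real code would need
something smarter, which is exactly why this is harder than a plain
scheduling point.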

So the best approach is a combination of the two patches.  SMP-on-UP for 
everything outside of spinlocks, and then manually yielding locks that cause 
problems.  Both Robert Love and Andrew Morton have come out in favor of each 
other's patches on lkml just in the past few days.  The patches work together 
quite well, and each wants to see the other's patch applied.

Rob

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 20:22                                       ` Rob Landley
@ 2002-01-12  5:00                                         ` yodaiken
  2002-01-12 11:53                                           ` Roman Zippel
  2002-01-12  5:03                                         ` Andrew Morton
                                                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-12  5:00 UTC (permalink / raw)
  To: Rob Landley
  Cc: yodaiken, Robert Love, Alan Cox, nigel, Andrew Morton,
	linux-kernel

On Fri, Jan 11, 2002 at 03:22:08PM -0500, Rob Landley wrote:
> If preempt leads to disaster then Linux can't do SMP.  Are you saying that's 
> the case?
> 
> The preempt patch is really "SMP on UP".  If pre-empt shows up a problem, 

People keep repeating this, it must feel reassuring.

	/* in kernel mode: does it need a lock? */
	m = next_free_page_from_per_cpu_cache();

To start, preemptive means that the optimizations for SMP that reduce
locking by per cpu localization do not work. 
So, as I understand it:
	the preempt patch is really "crappy SMP on UP" 
may be correct. But what you wrote is not.
Did I miss something? That's not a rhetorical question - I recall
being wrong before so go ahead and explain what's wrong with my logic.
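
To make the per-cpu concern concrete, here is a deterministic userspace
sketch.  It is not kernel code: struct cpu_cache, pop_begin and pop_finish
are invented names, and the "preemption" is simulated by simply interleaving
the two halves of an unlocked pop from one CPU's private free list.

```c
#include <assert.h>
#include <stddef.h>

struct page { struct page *next; };
struct cpu_cache { struct page *head; };   /* private to one CPU */

/* First half of an unlocked pop: look at the head of the list. */
struct page *pop_begin(struct cpu_cache *c)
{
    return c->head;
}

/* Second half: unlink the page that the first half saw. */
void pop_finish(struct cpu_cache *c, struct page *p)
{
    if (p)
        c->head = p->next;
}

/* Returns 1 if two tasks on the same CPU were handed the same page. */
int double_alloc_if_preempted(int preempt_between_steps)
{
    struct page pages[3];
    struct cpu_cache cache;
    struct page *a, *b;

    pages[0].next = &pages[1];
    pages[1].next = &pages[2];
    pages[2].next = NULL;
    cache.head = &pages[0];

    a = pop_begin(&cache);          /* task A reads the head */
    if (preempt_between_steps) {
        b = pop_begin(&cache);      /* task B runs on the same CPU */
        pop_finish(&cache, b);
        pop_finish(&cache, a);      /* A resumes with a stale view */
    } else {
        pop_finish(&cache, a);      /* A finishes undisturbed */
        b = pop_begin(&cache);
        pop_finish(&cache, b);
    }
    return a == b;
}
```

Without the interleaving the two tasks get different pages; with it, both
walk away holding the same one, which is why the unlocked version needs
either a lock or preemption held off around it.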
        

> then it's a problem SMP users will see too.  If we can't take advantage of 
> the existing SMP locking infrastructure to improve latency and interactive 
> feel on UP machines, then SMP for linux DOES NOT WORK.
> 
> > All the numbers I've seen show Morton's low latency just works better. Are
> > there other numbers I should look at?
> 
> This approach is basically a collection of heuristics.  The kernel has been 

One patch makes the numbers look good (sort of)

One patch does not but "improves feel" and breaks an exceptionally useful 
rule: per cpu data in kernel that is not touched by interrupt code does not
need to be locked.

The basic assumption of the preempt trick is that locking for SMP is 
based on the same principles as locking for preemption and that's completely
false.  

I believe that the preempt path leads inexorably to
mutex-with-stupid-priority-trick and that would be very unfortunate indeed.
It's unavoidable because sooner or later someone will find that preempt +
SCHED_FIFO leads to 
		niced app 1 in K mode gets Sem A
		SCHED_FIFO app preempts and blocks on  Sem A
		whoops! app 2 in K mode preempts niced app 1

	Hey my DVD player has stalled, let's add sem_with_revolting_priority_trick!
	Why the hell is UP Windows XP3 blowing away my Linux box on DVD playing while
	Linux now runs with the grace and speed of IRIX? 
	And has anyone fixed all those mysterious hangs caused by the interesting 
	interaction of hundreds of preempted semaphores?
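
The three-task scenario can be put in toy form.  This is a tick-counting
model with made-up names and priorities, not a scheduler: the niced task
holds Sem A and needs low_hold_ticks of CPU to release it, app 2 wants
mid_run_ticks of CPU, and the SCHED_FIFO task is blocked the whole time.
The "inherit" switch models the priority trick by running the holder at the
waiter's priority.

```c
#include <assert.h>

/* Ticks the blocked high-priority task waits for Sem A to be released. */
unsigned wait_for_sem(unsigned low_hold_ticks, unsigned mid_run_ticks,
                      int inherit)
{
    unsigned waited = 0;
    const int mid_prio = 2;
    /* With inheritance the holder runs at the waiter's priority (3). */
    const int low_prio = inherit ? 3 : 1;

    while (low_hold_ticks > 0) {
        if (mid_run_ticks > 0 && mid_prio > low_prio)
            mid_run_ticks--;    /* app 2 preempts the semaphore holder */
        else
            low_hold_ticks--;   /* holder makes progress toward release */
        waited++;
    }
    return waited;
}
```

In the model, the wait without the trick is the holder's time plus all of
app 2's time; with it, only the holder's time.  What the model leaves out is
the cost and complexity of doing that bookkeeping on every semaphore.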


		



> profiled and everywhere a latency spike was found, a band-aid was put on it 
> (an explicit scheduling point).  This doesn't say there aren't other latency 
> spikes, just that with the collection of hardware and software being 
> benchmarked, the latency spikes that were found have each had a band-aid 
> individually applied to them.
> 
> This isn't a BAD thing.  If the benchmarks used to find latency spikes are at 
> all like real-world use, then it helps real-world applications.  But of 
> COURSE the benchmarks are going to look good, since tuning the kernel to 
> those benchmarks is the way the patch was developed!
> 
> The majority of the original low latency scheduling point work is handled 
> automatically by the SMP on UP kernel.  You don't NEED to insert scheduling 
> points anywhere you aren't inside a spinlock.  So the SMP on UP patch makes 
> most of the explicit scheduling point patch go away, accomplishing the same 
> thing in a less intrusive manner.  (Yes, it makes all kernels act like SMP 
> kernels for debugging purposes.  But you can turn it off for debugging if you 
> want to, that's just another toggle in the magic SysRq menu.  And this isn't 
> entirely a bad thing: applying the enormous UP userbase to the remaining SMP 
> bugs is bound to squeeze out one or two more obscure ones, but those bugs DO 
> exist already on SMP.)

This is the logic of _every_ variant of "let's put X windows in the kernel
and let the kernel hackers fix it".  Konqueror crashed when I used it 
yesterday. Let's put it in the kernel too and apply that enormous UP
userbase to the remaining bugs.

> 
> However, what's left of the explicit scheduling work is still very useful.  
> When you ARE inside a spinlock, you can't just schedule, you have to save 
> state, drop the lock(s), schedule, re-acquire the locks, and reload your 
> state in case somebody else diddled with the structures you were using.  This 
> is a lot harder than just scheduling, but breaking up long-held locks like 
> this helps SMP scalability, AND helps latency in the SMP-on-UP case.
> 
> So the best approach is a combination of the two patches.  SMP-on-UP for 
> everything outside of spinlocks, and then manually yielding locks that cause 
> problems.  Both Robert Love and Andrew Morton have come out in favor of each 
> other's patches on lkml just in the past few days.  The patches work together 
> quite well, and each wants to see the other's patch applied.


> 
> Rob

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12  5:00                                         ` yodaiken
@ 2002-01-12 11:53                                           ` Roman Zippel
  2002-01-12 12:28                                             ` yodaiken
  2002-01-12 18:46                                             ` Alan Cox
  0 siblings, 2 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-12 11:53 UTC (permalink / raw)
  To: yodaiken
  Cc: Rob Landley, Robert Love, Alan Cox, nigel, Andrew Morton,
	linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> I believe that the preempt path leads inexorably to
> mutex-with-stupid-priority-trick and that would be very unfortunate indeed.
> It's unavoidable because sooner or later someone will find that preempt +
> SCHED_FIFO leads to
>                 niced app 1 in K mode gets Sem A
>                 SCHED_FIFO app preempts and blocks on  Sem A
>                 whoops! app 2 in K mode preempts niced app 1

Please explain what's different without the preempt patch.

>         Hey my DVD player has stalled, let's add sem_with_revolting_priority_trick!
>         Why the hell is UP Windows XP3 blowing away my Linux box on DVD playing while
>         Linux now runs with the grace and speed of IRIX?

Because the IRIX implementation sucks, every implementation has to suck?
Somehow I have the suspicion you're trying to discourage everyone from
even trying, because if he'd succeeded you'd lose a big chunk of
potential RTLinux customers.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 11:53                                           ` Roman Zippel
@ 2002-01-12 12:28                                             ` yodaiken
  2002-01-12 13:25                                               ` Roman Zippel
  2002-01-12 18:46                                             ` Alan Cox
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-12 12:28 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Rob Landley, Robert Love, Alan Cox, nigel,
	Andrew Morton, linux-kernel

On Sat, Jan 12, 2002 at 12:53:06PM +0100, Roman Zippel wrote:
> Hi,
> 
> yodaiken@fsmlabs.com wrote:
> 
> > I believe that the preempt path leads inexorably to
> > mutex-with-stupid-priority-trick and that would be very unfortunate indeed.
> > It's unavoidable because sooner or later someone will find that preempt +
> > SCHED_FIFO leads to
> >                 niced app 1 in K mode gets Sem A
> >                 SCHED_FIFO app preempts and blocks on  Sem A
> >                 whoops! app 2 in K mode preempts niced app 1
> 
> Please explain what's different without the preempt patch.

See that "preempt" in line 2. Linux does not
preempt kernel mode processes otherwise. The beauty of the
non-preemptive kernel is that "in K mode every process makes progress"
and even the "niced app" will complete its use of SemA and 
release it in one run. If you have a reasonably fair scheduler you
can make very useful analysis with Linux now of the form

	Under 50 active processes in the system means that in every
	2 second interval every process
	will get at least 10ms of time to run.

That's a very valuable property and it goes away in a preemptive kernel 
to get you something vague.


> 
> >         Hey my DVD player has stalled, let's add sem_with_revolting_priority_trick!
> >         Why the hell is UP Windows XP3 blowing away my Linux box on DVD playing while
> >         Linux now runs with the grace and speed of IRIX?
> 
> Because the IRIX implementation sucks, every implementation has to suck?
> Somehow I have the suspicion you're trying to discourage everyone from
> > even trying, because if he'd succeeded you'd lose a big chunk of
> potential RTLinux customers.

So your argument is that I'm advocating Andrew Morton's patch which 
reduces latencies more than the preempt patch because I have a
financial interest in not reducing latencies? Subtle.

In any case, motive has no bearing on a technical argument. 
Your motive could be to make the 68K look better by reducing 
performance on other processors for all I know. 

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 12:28                                             ` yodaiken
@ 2002-01-12 13:25                                               ` Roman Zippel
  2002-01-12 14:56                                                 ` yodaiken
  2002-01-12 20:12                                                 ` Andrew Morton
  0 siblings, 2 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-12 13:25 UTC (permalink / raw)
  To: yodaiken
  Cc: Rob Landley, Robert Love, Alan Cox, nigel, Andrew Morton,
	linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> > > SCHED_FIFO leads to
> > >                 niced app 1 in K mode gets Sem A
> > >                 SCHED_FIFO app preempts and blocks on  Sem A
> > >                 whoops! app 2 in K mode preempts niced app 1
> >
> > Please explain what's different without the preempt patch.
> 
> See that "preempt" in line 2 . Linux does not
> preempt kernel mode processes otherwise. The beauty of the
> non-preemptive kernel is that "in K mode every process makes progress"
> and even the "niced app" will complete its use of SemA and
> release it in one run.

The point of using semaphores is that one can sleep while holding them,
whether this is forced by preemption or voluntary makes no difference.

> If you have a reasonably fair scheduler you
> can make very useful analysis with Linux now of the form
>
> >         Under 50 active processes in the system means that in every
>         2 second interval every process
>         will get at least 10ms of time to run.
> 
> That's a very valuable property and it goes away in a preemptive kernel
> to get you something vague.

How is that changed? AFAIK inserting more schedule points does not
change the behaviour of the scheduler. The niced app will still get its
time.

> So your argument is that I'm advocating Andrew Morton's patch which
> reduces latencies more than the preempt patch because I have a
> financial interest in not reducing latencies? Subtle.

Andrew's patch requires constant auditing, and Andrew can't audit all
drivers for possible problems. That doesn't mean Andrew's work is
wasted, since it identifies problems that preempting can't solve, but
it will always be a hunt for the worst cases, whereas preempting goes for
the general case.

> In any case, motive has no bearing on a technical argument.
> Your motive could be to make the 68K look better by reducing
> performance on other processors for all I know.

I am more than busy keeping it running (together with the few others who
are left) and, more importantly, I make no money off it.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 13:25                                               ` Roman Zippel
@ 2002-01-12 14:56                                                 ` yodaiken
  2002-01-12 17:48                                                   ` Roman Zippel
  2002-01-14 23:21                                                   ` george anzinger
  2002-01-12 20:12                                                 ` Andrew Morton
  1 sibling, 2 replies; 351+ messages in thread
From: yodaiken @ 2002-01-12 14:56 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Rob Landley, Robert Love, Alan Cox, nigel,
	Andrew Morton, linux-kernel

On Sat, Jan 12, 2002 at 02:25:03PM +0100, Roman Zippel wrote:
> Hi,
> 
> yodaiken@fsmlabs.com wrote:
> 
> > > > SCHED_FIFO leads to
> > > >                 niced app 1 in K mode gets Sem A
> > > >                 SCHED_FIFO app prempts and blocks on  Sem A
> > > >                 whoops! app 2 in K more preempts niced app 1
> > >
> > > Please explain what's different without the preempt patch.
> > 
> > See that "preempt" in line 2 . Linux does not
> > preempt kernel mode processes otherwise. The beauty of the
> > non-preemptive kernel is that "in K mode every process makes progress"
> > and even the "niced app" will complete its use of SemA and
> > release it in one run.
> 
> The point of using semaphores is that one can sleep while holding them,
> whether this is forced by preemption or voluntary makes no difference.

No. The point of using semaphores is that one can sleep while
_waiting_ for the resource. Sleeping while holding semaphores is
a different kettle of lampreys entirely.
And it makes a very big difference:
A:
	get sem on memory pool
		do something horrible to pool
	release sem on memory pool

In a preemptive kernel this can cause a deadlock. In a
non-preemptive kernel it cannot. You are correct in that
B:
	get sem on memory pool
		do potentially blocking operations
	release sem
is also dangerous - but I don't think that helps your case.
To fix B, we can enforce a coding rule - one of the reasons why
we have all those atomic ops in the kernel is to be able to 
avoid this problem.
To fix A in a preemptive kernel we need to start messing about with
priorities and that's a major error.
"The current kernel has too many places where processes
can sleep while holding semaphores so we should always have the 
potential of blocking with held semaphores" is, to me, a backwards
argument.
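
A toy model of scenario A, as a minimal sketch rather than kernel code:
a low-priority task holds the semaphore for critical_ticks, a
high-priority task is blocked on it, and a medium-priority CPU hog
becomes runnable at tick 1. The function name, the tick granularity and
the strict-priority policy are all assumptions made for illustration;
nothing here is taken from the preempt patch.

```python
# Hypothetical toy model, not kernel code: all names and numbers are assumptions.
def high_gets_sem_at(kernel_preempt, critical_ticks=3, medium_work=50):
    """Return the tick at which the blocked high-priority task gets the semaphore."""
    low_left = critical_ticks      # ticks the low-priority holder still needs
    medium_left = 0                # the medium-priority hog has not arrived yet
    tick = 0
    while low_left > 0:
        if tick == 1:
            medium_left = medium_work  # the hog becomes runnable
        # The high-priority task is blocked on the semaphore, so the
        # scheduler only chooses between the hog and the holder.
        if kernel_preempt and medium_left > 0:
            medium_left -= 1       # holder is preempted inside its critical section
        else:
            low_left -= 1          # holder makes progress and will release
        tick += 1
    return tick


if __name__ == "__main__":
    print("non-preemptive kernel:", high_gets_sem_at(False))
    print("preemptive kernel:    ", high_gets_sem_at(True))
```

With the defaults the holder releases at tick 3 without kernel
preemption and at tick 53 with it, so the wait grows by exactly
medium_work. In this model the delay is bounded by the hog's work; a hog
that never yields would make it unbounded.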

> > If you have a reasonably fair scheduler you
> > can make very useful analysis with Linux now of the form
> >
> >         Under 50 active proceses in the system means that in every
> >         2 second interval every process
> >         will get at least 10ms of time to run.
> > 
> > That's a very valuable property and it goes away in a preemptive kernel
> > to get you something vague.
> 
> How is that changed? AFAIK inserting more schedule points does not
> change the behaviour of the scheduler. The niced app will still get its
> time.

How many times can an app be preempted? In a non-preempt kernel
it can be preempted during user mode at timer frequency and no more,
and it cannot be preempted during kernel mode. So
	while(1){
		read mpeg data
		process
		write bitmap
		}

Assuming Andrew does not get too ambitious about read/write granularity, once this
process is scheduled on a non-preempt system it will always make progress. The non
preempt kernel says, "your kernel request will complete - if we have resources".
A preempt kernel says: "well, if nobody more important activates you get some time"
Now you do the analysis based on the computation of "goodness" to show that there is
a bound on preemption count during an execution of this process. I don't want to 
have to think that hard. 
Let's suppose the Gnome desktop constantly creates and 
destroys new fresh i/o bound tasks to do something. So with the old fashioned non
preempt (ignoring Andrew) we get
			wait no more than 1 second
			I'm scheduled and start a read 
			wait no more than one second
			I'm scheduled and in user mode for at least 10 milliseconds
			wait no more than 1 second
			I'm scheduled and do my write
			...
with preempt we get
			wait no more than 1 second
			I'm scheduled and start a read 
				I'm preempted
				read not done
				come back for 2 microseconds
				preempted again
				haven't issued the damn read request yet 
				ok a miracle happens, I finish the read request
				go to usermode and an interrupt happens
						well it would be stupid to have a goodness
						function in a preempt kernel that lets a low
						priority task finish its time slice so preempt
				...

> 
> > So your argument is that I'm advocating Andrew Morton's patch which
> > reduces latencies more than the preempt patch because I have a
> > financial interest in not reducing latencies? Subtle.
> 
> Andrew's patch requires constant audition and Andrew can't audit all
> drivers for possible problems. That doesn't mean Andrew's work is
> wasted, since it identifies problems, which preempting can't solve, but
> it will always be a hunt for the worst cases, where preempting goes for
> the general case.

The preempt patch requires constant auditing too - and more complex auditing.
After all, a missed audit in Andrew's patch will simply increase worst case timing.
A missed audit in preempt will hang the system.

> 
> > In any case, motive has no bearing on a technical argument.
> > Your motive could be to make the 68K look better by reducing
> > performance on other processors for all I know.
> 
> I am more than busy to keep it running (together with a few others, who
> are left) and more important I make no money of it.

Come on! First of all, you are causing me a great deal of pain by making
me struggle not to make some bad joke about the economics of Linux companies.
More important, not making money has nothing to do with purity of motivation -
don't you read this list?
And how do I know that you haven't got a stockpile of 68K boards that may
be worth big money once it's known that 68K linux is at the top of the heap?
Much less plausible money making schemes have been tried.

Seriously: for our business, a Linux kernel that can reliably run at millisecond
level latencies is only good. If you could get a Linux kernel to run at 
latencies of 100 microseconds worst case on a 486, I'd be a little more
worried  but even then ...
On an 800MHz Athlon, RTLinux scheduling jitter is 17 microseconds worst case right now.


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 14:56                                                 ` yodaiken
@ 2002-01-12 17:48                                                   ` Roman Zippel
  2002-01-12 19:23                                                     ` yodaiken
  2002-01-14 23:21                                                   ` george anzinger
  1 sibling, 1 reply; 351+ messages in thread
From: Roman Zippel @ 2002-01-12 17:48 UTC (permalink / raw)
  To: yodaiken
  Cc: Rob Landley, Robert Love, Alan Cox, nigel, Andrew Morton,
	linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> No. The point of using semaphores is that one can sleep while
> _waiting_ for the resource.
> [...]
> In a preemptive kernel this can cause a deadlock. In a non
> preemptive it cannot. You are correct in that
> B:
>         get sem on memory pool
>                 do potentially blocking operations
>         release sem
> is also dangerous - but I don't think that helps your case.
> To fix B, we can enforce a coding rule - one of the reasons why
> we have all those atomic ops in the kernel is to be able to
> avoid this problem.

Sorry, I can't follow you. First, one can sleep while waiting for the
semaphore _and_ while holding it. Second, we use atomic ops (e.g. for
resource management) exactly because they are not protected by any
semaphore/spinlock.

> Let's suppose the Gnome desktop constantly creates and
> destroys new fresh i/o bound tasks to do something. So with the old fashioned non
> preempt (ignoring Andrew) we get
> [...]

There is no priority problem! If there is a more important task to run,
the less important one simply has to wait, but it will still get its
time. Your deadlock situation does not exist. The average time a
process has to wait for a lower priority process might be increased, but
the worst case behaviour is still the same.
The problem that does exist is the coarse time slice accounting, which
is easier to exploit with the preempt kernel, but it's not a new
problem. On the other hand it's a solvable problem, which requires no
priority inversion.

> > Andrew's patch requires constant audition and Andrew can't audit all
> > drivers for possible problems. That doesn't mean Andrew's work is
> > wasted, since it identifies problems, which preempting can't solve, but
> > it will always be a hunt for the worst cases, where preempting goes for
> > the general case.
> 
> the preempt requires constant auditing too - and more complex auditing.
> After all, a missed audit in Andrew will simply increase worst case timing.
> A missed audit in preempt will hang the system.

As long as the scheduler isn't changed, this isn't true and as I said
there are latency problems which preempting can't solve, but it will
automatically take care of the rest.

> Come on! First of all, you are causing me a great deal of pain by making
> me struggle not to make some bad joke about the economics of Linux companies.

Feel free, I'm not a big believer in the economics of software companies
in general, anyway.

> More important, not making money has nothing to do with purity of motivation -
> don't you read this list?

Everyone has their own motivation and I do respect that, but I get
suspicious as soon as money is involved. If people disagree, they can
still get along nicely and do their thing independently. But if they
have to make a living by getting a share of a cake, it usually only
works as long as there is enough cake, otherwise it can get nasty very
quickly (and usually there is never enough cake).

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 17:48                                                   ` Roman Zippel
@ 2002-01-12 19:23                                                     ` yodaiken
  2002-01-12 21:21                                                       ` Roman Zippel
  0 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-12 19:23 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Rob Landley, Robert Love, Alan Cox, nigel,
	Andrew Morton, linux-kernel

On Sat, Jan 12, 2002 at 06:48:28PM +0100, Roman Zippel wrote:
> Hi,
> 
> yodaiken@fsmlabs.com wrote:
> 
> > No. The point of using semaphores is that one can sleep while
> > _waiting_ for the resource.
> > [...]
> > In a preemptive kernel this can cause a deadlock. In a non
> > preemptive it cannot. You are correct in that
> > B:
> >         get sem on memory pool
> >                 do potentially blocking operations
> >         release sem
> > is also dangerous - but I don't think that helps your case.
> > To fix B, we can enforce a coding rule - one of the reasons why
> > we have all those atomic ops in the kernel is to be able to
> > avoid this problem.
> 
> Sorry I can't follow you. First, one can sleep while waiting for the

We're having a write only discussion - time to stop.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:23                                                     ` yodaiken
@ 2002-01-12 21:21                                                       ` Roman Zippel
  2002-01-13  1:23                                                         ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Roman Zippel @ 2002-01-12 21:21 UTC (permalink / raw)
  To: yodaiken
  Cc: Rob Landley, Robert Love, Alan Cox, nigel, Andrew Morton,
	linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> We're having a write only discussion - time to stop.

Sorry, but I'm still waiting for the proof that preempting deadlocks the
system.
If n running processes have together m time slices, after m ticks every
process will have run its full share of the time, no matter how often
you schedule. (I assume here a correct time accounting, which is
currently not the case, but that's a different (and not new) problem.)
So even the low priority process will have the same time as before to do
its job; it will be delayed, but it will not be delayed forever, so I'm
failing to see how preempting Linux should deadlock.
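
The accounting argument can be checked with a small sketch. It assumes
exact per-tick accounting and that every process stays runnable for the
whole epoch (a process blocked on a resource is outside this model);
run_epoch and the random choice standing in for "schedule as often as
you like" are made up for illustration.

```python
import random

# Hypothetical model: one "epoch" in which process i may run slices[i] ticks.
def run_epoch(slices, seed):
    """Run one epoch under an arbitrary preemption pattern.

    Returns (ticks used, list of ticks each process actually ran).
    """
    rng = random.Random(seed)
    remaining = list(slices)
    ran = [0] * len(slices)
    ticks = 0
    while any(remaining):
        # Any process with slice left may be picked on any tick,
        # which models preemption at arbitrary points.
        runnable = [i for i, left in enumerate(remaining) if left > 0]
        i = rng.choice(runnable)
        remaining[i] -= 1
        ran[i] += 1
        ticks += 1
    return ticks, ran


if __name__ == "__main__":
    for seed in range(3):
        print(run_epoch([6, 3, 1], seed))
```

Whatever the seed, slices of 6, 3 and 1 are used up after exactly 10
ticks and each process has run exactly its own slice. Only the order
changes, which is the "delayed, but not forever" part of the argument.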

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 21:21                                                       ` Roman Zippel
@ 2002-01-13  1:23                                                         ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13  1:23 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Rob Landley, Robert Love, Alan Cox, nigel,
	Andrew Morton, linux-kernel

> So even the low priority process will have the same time as before to do
> it's job, it will be delayed, but it will not be delayed forever, so I'm
> failing to see how preempting Linux should deadlock.

First task scheduled takes a resource that a second task needs. 150 other
threads schedule via pre-emption, the one that it should share the resource
with cannot run but the rest do. Repeat. It doesn't deadlock, but it goes
massively unfair.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 14:56                                                 ` yodaiken
  2002-01-12 17:48                                                   ` Roman Zippel
@ 2002-01-14 23:21                                                   ` george anzinger
  2002-01-15  0:59                                                     ` yodaiken
  1 sibling, 1 reply; 351+ messages in thread
From: george anzinger @ 2002-01-14 23:21 UTC (permalink / raw)
  To: yodaiken
  Cc: Roman Zippel, Rob Landley, Robert Love, Alan Cox, nigel,
	Andrew Morton, linux-kernel

yodaiken@fsmlabs.com wrote:
> 
> On Sat, Jan 12, 2002 at 02:25:03PM +0100, Roman Zippel wrote:
> > Hi,
> >
> > yodaiken@fsmlabs.com wrote:
> >
> > > > > SCHED_FIFO leads to
> > > > >                 niced app 1 in K mode gets Sem A
> > > > >                 SCHED_FIFO app prempts and blocks on  Sem A
> > > > >                 whoops! app 2 in K more preempts niced app 1
> > > >
> > > > Please explain what's different without the preempt patch.
> > >
> > > See that "preempt" in line 2 . Linux does not
> > > preempt kernel mode processes otherwise. The beauty of the
> > > non-preemptive kernel is that "in K mode every process makes progress"
> > > and even the "niced app" will complete its use of SemA and
> > > release it in one run.
> >
> > The point of using semaphores is that one can sleep while holding them,
> > whether this is forced by preemption or voluntary makes no difference.
> 
> No. The point of using semaphores is that one can sleep while
> _waiting_ for the resource. Sleeping while holding semaphores is
> a different kettle of lampreys entirely.
> And it makes a very big difference
> A:
>         get sem on memory pool
>                 do something horrible to pool
>         release sem on memory pool
> 
> In a preemptive kernel this can cause a deadlock. In a non
> preemptive it cannot. You are correct in that
> B:
>         get sem on memory pool
>                 do potentially blocking operations
>         release sem
> is also dangerous - but I don't think that helps your case.
> To fix B, we can enforce a coding rule - one of the reasons why
> we have all those atomic ops in the kernel is to be able to
> avoid this problem.
> To fix A in a preemptive kernel we need to start messing about with
> priorities and that's a major error.
> "The current kernel has too many places where processes
> can sleep while holding semaphores so we should always have the
> potential of blocking with held semaphores" is, to me, a backwards
> argument.
> 
> > > If you have a reasonably fair scheduler you
> > > can make very useful analysis with Linux now of the form
> > >
> > >         Under 50 active proceses in the system means that in every
> > >         2 second interval every process
> > >         will get at least 10ms of time to run.
> > >
> > > That's a very valuable property and it goes away in a preemptive kernel
> > > to get you something vague.
> >
> > How is that changed? AFAIK inserting more schedule points does not
> > change the behaviour of the scheduler. The niced app will still get its
> > time.
> 
> How many times can an app be preempted? In a non preempt kernel
> is can be preempted during user mode at timer frequency and no more

Uh, it can be and is preempted in user mode by ANY interrupt, be it
keyboard, serial, lan, disc, etc.  The kernel looks for need_resched at
the end of ALL interrupts, not just the timer interrupt.

> and it cannot be preempted during kernel mode. So
>         while(1){
>                 read mpeg data
>                 process
>                 write bitmap
>                 }
> 
> Assuming Andrew does not get too ambitious about read/write granularity, once this
> process is scheduled on a non-preempt system it will always make progress. The non
> preempt kernel says, "your kernel request will complete - if we have resources".
> A preempt kernel says: "well, if nobody more important activates you get some time"
> Now you do the analysis based on the computation of "goodness" to show that there is
> a bound on preemption count during an execution of this process. I don't want to
> have to think that hard.
> Let's suppose the Gnome desktop constantly creates and
> destroys new fresh i/o bound tasks to do something. So with the old fashioned non
> preempt (ignoring Andrew) we get
>                         wait no more than 1 second
>                         I'm scheduled and start a read
>                         wait no more than one second
>                         I'm scheduled and in user mode for at least 10milliseconds
>                         wait no more than 1 second
>                         I'm scheduled and do my write
>                         ...
> with preempt we get
>                         wait no more than 1 second
>                         I'm scheduled and start a read
>                                 I'm preempted
>                                 read not done
>                                 come back for 2 microseconds
>                                 preempted again
>                                 haven't issued the damn read request yet
>                                 ok a miracle happens, I finish the read request
>                                 go to usermode and an interrupt happens
>                                                 well it would be stupid to have a goodness
>                                                 function in a preempt kernel that lets a low
>                                                 priority task finish its time slice so preempt
>                                 ...
> 
> >
> > > So your argument is that I'm advocating Andrew Morton's patch which
> > > reduces latencies more than the preempt patch because I have a
> > > financial interest in not reducing latencies? Subtle.
> >
> > Andrew's patch requires constant audition and Andrew can't audit all
> > drivers for possible problems. That doesn't mean Andrew's work is
> > wasted, since it identifies problems, which preempting can't solve, but
> > it will always be a hunt for the worst cases, where preempting goes for
> > the general case.
> 
> the preempt requires constant auditing too - and more complex auditing.
> After all, a missed audit in Andrew will simply increase worst case timing.
> A missed audit in preempt will hang the system.
> 
> >
> > > In any case, motive has no bearing on a technical argument.
> > > Your motive could be to make the 68K look better by reducing
> > > performance on other processors for all I know.
> >
> > I am more than busy to keep it running (together with a few others, who
> > are left) and more important I make no money of it.
> 
> Come on! First of all, you are causing me a great deal of pain by making
> me struggle not to make some bad joke about the economics of Linux companies.
> More important, not making money has nothing to do with purity of motivation -
> don't you read this list?
> And how do I know that you haven't got a stockpile of 68K boards that may
> be worth big money once it's known that 68K linux is at the top of the heap?
> Much less plausible money making schemes have been tried.
> 
> Seriously: for our business, a Linux kernel that can reliably run at millisecond
> level latencies is only good. If you could get a Linux kernel to run at
> latencies of 100 microseconds worst case on a 486, I'd be a little more
> worried  but even then ...
> On a 800Mhz Athlon, RTLinux scheduling jitter is 17microseconds worst case right now.
> 
> --
> ---------------------------------------------------------
> Victor Yodaiken
> Finite State Machine Labs: The RTLinux Company.
>  www.fsmlabs.com  www.rtlinux.com

-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 23:21                                                   ` george anzinger
@ 2002-01-15  0:59                                                     ` yodaiken
  2002-01-15  9:18                                                       ` Helge Hafting
  0 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-15  0:59 UTC (permalink / raw)
  To: george anzinger
  Cc: yodaiken, Roman Zippel, Rob Landley, Robert Love, Alan Cox, nigel,
	Andrew Morton, linux-kernel

On Mon, Jan 14, 2002 at 03:21:14PM -0800, george anzinger wrote:
> > > How is that changed? AFAIK inserting more schedule points does not
> > > change the behaviour of the scheduler. The niced app will still get its
> > > time.
> > 
> > How many times can an app be preempted? In a non preempt kernel
> > is can be preempted during user mode at timer frequency and no more
> 
> Uh, it can be and is preempted in user mode by ANY interrupt, be it
> keyboard, serial, lan, disc, etc.  The kernel looks for need_resched at
> the end of ALL interrupts, not just the timer interrupt.


Ouch.





^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-15  0:59                                                     ` yodaiken
@ 2002-01-15  9:18                                                       ` Helge Hafting
  0 siblings, 0 replies; 351+ messages in thread
From: Helge Hafting @ 2002-01-15  9:18 UTC (permalink / raw)
  To: yodaiken; +Cc: linux-kernel

yodaiken@fsmlabs.com wrote:
> 
> On Mon, Jan 14, 2002 at 03:21:14PM -0800, george anzinger wrote:
> > > > How is that changed? AFAIK inserting more schedule points does not
> > > > change the behaviour of the scheduler. The niced app will still get its
> > > > time.
> > >
> > > How many times can an app be preempted? In a non preempt kernel
> > > is can be preempted during user mode at timer frequency and no more
> >
> > Uh, it can be and is preempted in user mode by ANY interrupt, be it
> > keyboard, serial, lan, disc, etc.  The kernel looks for need_resched at
> > the end of ALL interrupts, not just the timer interrupt.
> 
> Ouch.

Ouch?  It is supposed to be that way.  Consider:

A high-priority task issues a disk read - and blocks.  Some
lower-priority process gets the cpu.  But then the disk io finishes
way before the low-priority process used up its timeslice.
The kernel gets an interrupt from the disk controller because
of that.  Perhaps the block device issues some more requests,
then time comes to return to user space.  The higher priority task
is now ready to run because its IO completed.  So of course
it is preferred over that low-priority thing.  In other words,
the low-priority task got preempted, this time by a disk
interrupt.

The same thing happens when a high-priority task waits for
other kinds of io, such as network, serial, and so on.
I am sure you wouldn't want it any other way.  Not
using the opportunity to switch task immediately after an io
completion interrupt would kill latency completely.
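
As a rough sketch of why this matters, the toy model below compares
checking need_resched on return from every interrupt with a hypothetical
kernel that only rescheduled on the timer tick. The function name, the
tick units and the second policy are assumptions for illustration; only
the need_resched flag itself comes from the discussion above.

```python
# Hypothetical toy model: time advances in ticks, the timer fires every
# timer_period ticks, and the disk interrupt fires once at io_done_tick.
def wakeup_latency(io_done_tick, timer_period, resched_on_any_irq):
    """Ticks between I/O completion and the high-priority task running."""
    need_resched = False
    tick = io_done_tick
    while True:
        irq_is_disk = (tick == io_done_tick)
        irq_is_timer = (tick % timer_period == 0)
        if irq_is_disk:
            need_resched = True   # handler wakes the waiting high-priority task
        # The flag is looked at on return from every interrupt, or, in the
        # made-up alternative, only on return from the timer interrupt.
        checks_flag = irq_is_timer or (irq_is_disk and resched_on_any_irq)
        if need_resched and checks_flag:
            return tick - io_done_tick
        tick += 1


if __name__ == "__main__":
    print("any interrupt:", wakeup_latency(3, 10, True))
    print("timer only:   ", wakeup_latency(3, 10, False))
```

With the I/O finishing at tick 3 and a timer period of 10, the task runs
immediately in the first case and has to wait 7 ticks in the second.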

Helge Hafting

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 13:25                                               ` Roman Zippel
  2002-01-12 14:56                                                 ` yodaiken
@ 2002-01-12 20:12                                                 ` Andrew Morton
  1 sibling, 0 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-12 20:12 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Rob Landley, Robert Love, Alan Cox, nigel, linux-kernel

Roman Zippel wrote:
> 
> Andrew's patch requires constant audition and Andrew can't audit all
> drivers for possible problems. That doesn't mean Andrew's work is
> wasted, since it identifies problems, which preempting can't solve, but
> it will always be a hunt for the worst cases, where preempting goes for
> the general case.

Guys,

I've heard this so many times, and it just ain't so.   The overwhelming
majority of problem areas are inside locks.  All the complexity and 
maintainability difficulties to which you refer exist in the preempt
patch as well.    There just is no difference.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 11:53                                           ` Roman Zippel
  2002-01-12 12:28                                             ` yodaiken
@ 2002-01-12 18:46                                             ` Alan Cox
  2002-01-12 20:42                                               ` Roman Zippel
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-12 18:46 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Rob Landley, Robert Love, Alan Cox, nigel,
	Andrew Morton, linux-kernel

> >         Hey my DVD player has stalled, lets add sem_with_revolting_priority_trick!
> >         Why the hell is UP Windows XP3 blowing away my Linux box on DVD playing while
> >         Linux now runs with the grace and speed of IRIX?
> 
> Because the IRIX implementation sucks, every implementation has to suck?
> Somehow I have the suspicion you're trying to discourage everyone from
> > even trying, because if he'd succeeded you'd lose a big chunk of
> potential RTLinux customers.

Victor has had the same message for years, as have others like Larry McVoy
(in fact if Larry and Victor agree on something it's unusual enough to
 remember). So I can vouch for the fact Victor hasn't changed his tune from
before rtlinux was ever any real commercial toy. I think you owe him an
apology.

Now rtlinux and low latency in the main kernel are two different things. One
gives you effectively a small embedded system to program for which talks
to Linux. From that you draw extremely reliable behaviour and very bounded
delay times. It's small enough that you can validate it too.

RtLinux isn't going to help you one bit when it comes to smooth movie playback 
because the DVD playback is dependent on the Linux file system layers and a
whole pile of other code. Low-latency does this quite nicely, and it takes
you to the point where hardware becomes the biggest latency cause for the
general case. Pre-empt doesn't buy you anything more. You can spend a
millisecond locked in an I/O instruction to an irritating device.

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 18:46                                             ` Alan Cox
@ 2002-01-12 20:42                                               ` Roman Zippel
  2002-01-12 22:13                                                 ` yodaiken
  2002-01-13  1:28                                                 ` Alan Cox
  0 siblings, 2 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-12 20:42 UTC (permalink / raw)
  To: Alan Cox
  Cc: yodaiken, Rob Landley, Robert Love, nigel, Andrew Morton,
	linux-kernel

Hi,

Alan Cox wrote:

> > Because the IRIX implementation sucks, every implementation has to suck?
> > Somehow I have the suspicion you're trying to discourage everyone from
> > even trying, because if he'd succeeded you'd loose a big chunk of
> > potential RTLinux customers.
> 
> Victor has had the same message for years, as have others like Larry McVoy
> (in fact if Larry and Victor agree on something its unusual enough to
>  remember). So I can vouch for the fact Victor hasn't changed his tune from
> before rtlinux was ever any real commercial toy. I think you owe him an
> apology.

Did I really say something that bad? I would actually be surprised if
Victor didn't act in the best interest of his company. The other
possibility is that Victor had such a terrible experience with
IRIX that he thinks any attempt to add better soft realtime or even
hard realtime capabilities (not just as an addon) must be doomed to fail.

> RtLinux isn't going to help you one bit when it comes to smooth movie playback
> because the DVD playback is dependant on the Linux file system layers and a
> whole pile of other code. Low-latency does this quite nicely, and it takes
> you to the point where hardware becomes the biggest latency cause for the
> general case. Pre-empt doesn't buy you anything more. You can spend a
> millisecond locked in an I/O instruction to an irritating device.

Preemption of course doesn't solve every problem. It's mainly useful to
get an event as fast as possible from kernel to user space. This can be
the mouse click or the buffer your process is waiting for. Latencies can
quickly add up here to become noticeable.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:42                                               ` Roman Zippel
@ 2002-01-12 22:13                                                 ` yodaiken
  2002-01-13  3:33                                                   ` Roman Zippel
  2002-01-13  1:28                                                 ` Alan Cox
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-12 22:13 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, yodaiken, Rob Landley, Robert Love, nigel,
	Andrew Morton, linux-kernel

On Sat, Jan 12, 2002 at 09:42:26PM +0100, Roman Zippel wrote:
> Hi,
> 
> Alan Cox wrote:
> 
> > > Because the IRIX implementation sucks, every implementation has to suck?
> > > Somehow I have the suspicion you're trying to discourage everyone from
> > > even trying, because if he'd succeeded you'd lose a big chunk of
> > > potential RTLinux customers.
> > 
> > Victor has had the same message for years, as have others like Larry McVoy
> > (in fact if Larry and Victor agree on something it's unusual enough to
> >  remember). So I can vouch for the fact Victor hasn't changed his tune from
> > before rtlinux was ever any real commercial toy. I think you owe him an
> > apology.
> 
> Did I really say something that bad? I would actually be surprised if
> Victor didn't act in the best interest of his company. The other
> possibility is that Victor had such a terrible experience with IRIX
> that he thinks any attempt to add better soft realtime or even hard
> realtime capabilities (not just as an add-on) is doomed to fail.

Well, how about a third possibility - that I see a problem you have not
seen and that you should try to argue on technical terms instead of psychoanalyzing
me or looking for financial motives?


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 22:13                                                 ` yodaiken
@ 2002-01-13  3:33                                                   ` Roman Zippel
  2002-01-13  4:02                                                     ` yodaiken
  0 siblings, 1 reply; 351+ messages in thread
From: Roman Zippel @ 2002-01-13  3:33 UTC (permalink / raw)
  To: yodaiken
  Cc: Alan Cox, Rob Landley, Robert Love, nigel, Andrew Morton,
	linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> Well, how about a third possibility - that I see a problem you have not
> seen and that you should try to argue on technical terms

I just don't see any problem that is really new. Alan's example is one
of the more extreme ones, but the only effect is that an operation can be
delayed far more than usual, but not indefinitely.
If you think preemption can cause a deadlock, maybe you could give me a
hint, which of the conditions for a deadlock is changed by preemption?

> instead of psychoanalyzing
> me or looking for financial motives?

If I had known how easily people are offended by the suggestion that they
might act out of financial interest, I wouldn't have made that comment.
Sorry, but I'm just annoyed at how you attack any attempt to add realtime
capabilities to the kernel, mostly with the argument that it sucks under
IRIX. If people want to try it, let them. I prefer to see patches, and if
they really do suck, I would be the first one to say so.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13  3:33                                                   ` Roman Zippel
@ 2002-01-13  4:02                                                     ` yodaiken
  0 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-13  4:02 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Alan Cox, Rob Landley, Robert Love, nigel,
	Andrew Morton, linux-kernel

On Sun, Jan 13, 2002 at 04:33:44AM +0100, Roman Zippel wrote:
> Hi,
> 
> yodaiken@fsmlabs.com wrote:
> 
> > Well, how about a third possibility - that I see a problem you have not
> > seen and that you should try to argue on technical terms
> 
> I just don't see any problem that is really new. Alan's example is one
> of the more extreme ones, but the only effect is that an operation can be
> delayed far more than usual, but not indefinitely.
> If you think preemption can cause a deadlock, maybe you could give me a
> hint, which of the conditions for a deadlock is changed by preemption?
> 
> > instead of psychoanalyzing
> > me or looking for financial motives?
> 
> If I had known how easily people are offended by the suggestion that they
> might act out of financial interest, I wouldn't have made that comment.
> Sorry, but I'm just annoyed at how you attack any attempt to add realtime
> capabilities to the kernel, mostly with the argument that it sucks under
> IRIX. If people want to try it, let them. I prefer to see patches, and if
> they really do suck, I would be the first one to say so.

I'm annoyed that you take a comment in which I said that the Morton approach
was much preferable to the preempt patch and respond by saying I "attack
any attempt to add realtime capabilities to the kernel". 
I'm all in favor of people trying all sorts of things. My original comment
was that the numbers I'd seen all favored the Morton patch and I still
haven't seen any evidence to the contrary.

I also made two very simple and specific comments:
	1) I don't see how processor specific caching, which seems
	essential for smp performance and will be more essential 
	with numa, works with this patch
	2) preempt seems to lead inescapably to priority inherit. If this
	is true, people better understand the ramifications now before they
	commit.

Of course, I think there are strong limits to what you can get for RT 
performance in the kernel - I think the RTLinux method is far superior.
Believe what you want - it won't change the numbers.

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:42                                               ` Roman Zippel
  2002-01-12 22:13                                                 ` yodaiken
@ 2002-01-13  1:28                                                 ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13  1:28 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, yodaiken, Rob Landley, Robert Love, nigel,
	Andrew Morton, linux-kernel

> Preemption of course doesn't solve every problem. It's mainly useful to
> get an event from kernel to user space as fast as possible. This can be
> the mouse click or the buffer your process is waiting for. Latencies can
> quickly add up here and become noticeable.

The pre-emption patch doesn't change the average latencies. Go run some real
benchmarks. It's lost in the noise after the low latency patch. A single inw
from some I/O cards can cost as much as the latency target we hit.

It's not a case of 90% of the result with 10% of the work; the pre-empt
patch is firmly in the all pain, no gain camp.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 20:22                                       ` Rob Landley
  2002-01-12  5:00                                         ` yodaiken
@ 2002-01-12  5:03                                         ` Andrew Morton
  2002-01-12 18:26                                           ` Jussi Laako
  2002-01-12  6:01                                         ` Robert Love
                                                           ` (2 subsequent siblings)
  4 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-12  5:03 UTC (permalink / raw)
  To: Rob Landley; +Cc: yodaiken, Robert Love, Alan Cox, nigel, linux-kernel

Rob Landley wrote:
> 
> On Friday 11 January 2002 09:50 pm, yodaiken@fsmlabs.com wrote:
> > On Fri, Jan 11, 2002 at 03:33:22PM -0500, Robert Love wrote:
> > > On Fri, 2002-01-11 at 07:37, Alan Cox wrote:
> > > The preemptible kernel plus the spinlock cleanup could really take us
> > > far.  Having looked at a lot of the long-held locks in the kernel, I am
> > > confident at least reasonable progress could be made.
> > >
> > > Beyond that, yah, we need a better locking construct.  Priority
> > > inversion could be solved with a priority-inheriting mutex, which we can
> > > tackle if and when we want to go that route.  Not now.
> >
> > Backing the car up to the edge of the cliff really gives us
> > good results. Beyond that, we could jump off the cliff
> > if we want to go that route.
> > Preempt leads to inheritance and inheritance leads to disaster.
> 
> If preempt leads to disaster then Linux can't do SMP.  Are you saying that's
> the case?

Victor is referring to priority inheritance, to solve priority inversion.

Priority inheritance seems undesirable for Linux - these applications are
already in the minority.   A realtime application on Linux should simply
avoid complex system calls which can lead to blockage on a SCHED_OTHER
thread.

If the app is well-designed, the only place in which it is likely to
be unexpectedly blocked inside the kernel is in the page allocator.
My approach to this problem is to cause non-SCHED_OTHER processes
to perform atomic (non-blocking) memory allocations, with a fallback
to non-atomic.

> The preempt patch is really "SMP on UP".  If pre-empt shows up a problem,
> then it's a problem SMP users will see too.  If we can't take advantage of
> the existing SMP locking infrastructure to improve latency and interactive
> feel on UP machines, then SMP for Linux DOES NOT WORK.
> 
> > All the numbers I've seen show Morton's low latency just works better. Are
> > there other numbers I should look at?
> 
> This approach is basically a collection of heuristics.  The kernel has been
> profiled and everywhere a latency spike was found, a band-aid was put on it
> (an explicit scheduling point).  This doesn't say there aren't other latency
> spikes, just that with the collection of hardware and software being
> benchmarked, the latency spikes that were found have each had a band-aid
> individually applied to them.

The preempt patch needs all this as well.
 
> This isn't a BAD thing.  If the benchmarks used to find latency spikes are at
> all like real-world use, then it helps real-world applications.  But of
> COURSE the benchmarks are going to look good, since tuning the kernel to
> those benchmarks is the way the patch was developed!
> 
> The majority of the original low latency scheduling point work is handled
> automatically by the SMP on UP kernel.

No it is not.

The preempt code only obsoletes a handful of the low-latency patch's
rescheduling points, and the most trivial ones at that: generic_file_read,
generic_file_write and a couple of /proc functions.

Of the sixty or so rescheduling points in the low-latency patch, about
fifty are inside locks.  Many of these are just lock_kernel().  About
half are not.

>  You don't NEED to insert scheduling
> points anywhere you aren't inside a spinlock.

I know of only four or five places in the kernel where large amounts of
time are spent in unlocked code.  All the other problem areas are inside locks.

>  So the SMP on UP patch makes
> most of the explicit scheduling point patch go away,

s/most/a trivial minority/

> accomplishing the same
> thing in a less intrusive manner.

s/less/more/

> (Yes, it makes all kernels act like SMP
> kernels for debugging purposes.  But you can turn it off for debugging if you
> want to, that's just another toggle in the magic sysreq menu.  And this isn't
> entirely a bad thing: applying the enormous UP userbase to the remaining SMP
> bugs is bound to squeeze out one or two more obscure ones, but those bugs DO
> exist already on SMP.)

Saying "it's a config option" is a cop-out.  The kernel developers should
be aiming at producing a piece of software which can be shrink-wrap
deployed to millions of people.

Arguably, enabling it on UP and disabling it on SMP may be a sensible
approach, merely because SMP tends to map onto applications which
do not require lower latencies.
  
> However, what's left of the explicit scheduling work is still very useful.
> When you ARE inside a spinlock, you can't just schedule, you have to save
> state, drop the lock(s), schedule, re-acquire the locks, and reload your
> state in case somebody else diddled with the structures you were using.  This
> is a lot harder than just scheduling, but breaking up long-held locks like
> this helps SMP scalability, AND helps latency in the SMP-on-UP case.

Yes, it _may_ help SMP scalability.  But a better approach is to replace
spinlocks with rwlocks when a lock is found to have this access pattern.
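
The lock-break sequence Rob describes above (save state, drop the lock,
schedule, re-acquire, re-check the state) can be sketched in userspace
roughly as follows. This is only an illustration under stated assumptions,
not code from either patch: a pthread mutex stands in for a spinlock,
sched_yield() for the scheduling point, and struct table, BATCH and
sum_with_lock_break are names invented here.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define BATCH 64                    /* hypothetical: entries handled per lock hold */

struct table {                      /* hypothetical lock-protected structure */
    pthread_mutex_t lock;
    unsigned long generation;       /* every writer bumps this under the lock */
    size_t len;
    const int *items;
};

/* Sum all items without ever holding the lock for more than BATCH entries. */
long sum_with_lock_break(struct table *t)
{
    long sum = 0;
    size_t i = 0;

    pthread_mutex_lock(&t->lock);
    assert(t->len == 0 || t->items != NULL);
    while (i < t->len) {
        sum += t->items[i++];
        if (i % BATCH == 0 && i < t->len) {
            unsigned long seen = t->generation;   /* save state */

            pthread_mutex_unlock(&t->lock);       /* drop the lock */
            sched_yield();                        /* explicit scheduling point */
            pthread_mutex_lock(&t->lock);         /* re-acquire */
            if (t->generation != seen) {          /* somebody changed the table */
                sum = 0;                          /* so start the walk again */
                i = 0;
            }
        }
    }
    pthread_mutex_unlock(&t->lock);
    return sum;
}
```

Restarting from scratch when the generation changes is the crudest possible
revalidation; it is also where the extra complexity Rob mentions comes from.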
 
> So the best approach is a combination of the two patches.  SMP-on-UP for
> everything outside of spinlocks, and then manually yielding locks that cause
> problems.

Well the ideal approach is to simply make the long-running locked code
faster, by better choice of algorithm and data structure.  Unfortunately,
in the majority of cases, this isn't possible.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12  5:03                                         ` Andrew Morton
@ 2002-01-12 18:26                                           ` Jussi Laako
  0 siblings, 0 replies; 351+ messages in thread
From: Jussi Laako @ 2002-01-12 18:26 UTC (permalink / raw)
  To: Andrew Morton; +Cc: yodaiken, Robert Love, linux-kernel

Andrew Morton wrote:
> 
> Priority inheritance seems undesirable for Linux - these applications are
> already in the minority.   A realtime application on Linux should simply
> avoid complex system calls which can lead to blockage on a SCHED_OTHER
> thread.

I think it's very common to have a SCHED_FIFO thread communicating with
various other processes through a pipe/fifo/socket or some other IPC
mechanism.

It would be great to have priority inheritance where a process receiving data
through a fifo from a SCHED_FIFO process would have its priority raised for
the duration of the transfer (see QNX's priority-inheriting message queues).
Too bad we don't have message queues, so that we could have
send/receive/reply-time priority inheritance.

So we could have

Process 1 at SCHED_FIFO sending data to two processes.
Process 2 at SCHED_FIFO receiving data from process 1.
Process 3 at SCHED_OTHER receiving data from process 1.
Process 4 at SCHED_OTHER sending data to process 5.
Process 5 at SCHED_OTHER receiving data from process 4.

And (2) would get the data first from (1) and then (3). And if (1) starts
sending data to (2) system would immediately start running (1/2) and even
pre-empt the ongoing system call of (3). Also (1/3) would take over/pre-empt
(4/5) because (3) inherits priority from sending process (1).

If this is currently _not_ done I think it's very strange.

But I think I have misunderstood the whole point of original message... :)

 - Jussi Laako

-- 
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B  39DD A4DE 63EB C216 1E4B
Available at PGP keyservers

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 20:22                                       ` Rob Landley
  2002-01-12  5:00                                         ` yodaiken
  2002-01-12  5:03                                         ` Andrew Morton
@ 2002-01-12  6:01                                         ` Robert Love
  2002-01-12 12:45                                           ` yodaiken
  2002-01-12 19:00                                           ` Alan Cox
  2002-01-12  9:52                                         ` arjan
  2002-01-14 12:08                                         ` [2.4.17/18pre] VM and swap - it's really unusable Helge Hafting
  4 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-12  6:01 UTC (permalink / raw)
  To: Rob Landley; +Cc: yodaiken, Alan Cox, nigel, Andrew Morton, linux-kernel

On Fri, 2002-01-11 at 15:22, Rob Landley wrote:

> So the best approach is a combination of the two patches.  SMP-on-UP for 
> everything outside of spinlocks, and then manually yielding locks that cause 
> problems.  Both Robert Love and Andrew Morton have come out in favor of each 
> other's patches on lkml just in the past few days.  The patches work together 
> quite well, and each wants to see the other's patch applied.

Right.  Here is what I want for 2.5 as a _general_ step towards a better
kernel that will yield better performance:

Merge the preemptible kernel patch.  A version is now out for
2.5.2-pre11 with support for Ingo's scheduler:

	ftp://ftp.kernel.org/pub/linux/kernel/people/rml/preempt-kernel

Next, make available a tool for profiling kernel latencies.  I have one
available now, preempt-stats, at the above url.  Andrew has some
excellent tools available at his website, too.  Something like this
could even be merged.  Daniel Phillips suggested a passive tool on IRC. 
Preempt-stats works like this.  It is off-by-default and, when enabled,
measures time between lock and unlock, reporting the top 20 worst-cases.
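
A rough userspace sketch of the measurement described above (time from
lock to unlock, keep the 20 worst cases) might look like the following.
This is not the preempt-stats code. The names timed_mutex, hold_stats,
record_hold and TOP_N are made up for illustration, and a pthread mutex
stands in for a kernel spinlock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define TOP_N 20                    /* hypothetical: worst cases to keep */

struct hold_stats {
    long worst_ns[TOP_N];           /* sorted, largest first; 0 means unused */
};

struct timed_mutex {                /* hypothetical wrapper, not a kernel type */
    pthread_mutex_t lock;
    struct timespec acquired;       /* when the current holder took the lock */
    struct hold_stats stats;        /* protected by lock */
};

/* Insert one hold time into the sorted top-N table, dropping the smallest. */
void record_hold(struct hold_stats *s, long held_ns)
{
    size_t i = 0;

    assert(held_ns >= 0);           /* monotonic clock: never negative */
    while (i < TOP_N && held_ns <= s->worst_ns[i])
        i++;
    if (i == TOP_N)
        return;                     /* not among the worst cases */
    memmove(&s->worst_ns[i + 1], &s->worst_ns[i],
            (TOP_N - 1 - i) * sizeof(s->worst_ns[0]));
    s->worst_ns[i] = held_ns;
}

void timed_mutex_lock(struct timed_mutex *m)
{
    pthread_mutex_lock(&m->lock);
    clock_gettime(CLOCK_MONOTONIC, &m->acquired);
}

void timed_mutex_unlock(struct timed_mutex *m)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    /* record while still holding the lock, since stats is protected by it */
    record_hold(&m->stats,
                (long)(now.tv_sec - m->acquired.tv_sec) * 1000000000L +
                (now.tv_nsec - m->acquired.tv_nsec));
    pthread_mutex_unlock(&m->lock);
}
```

Keeping the table sorted on insert makes reporting trivial: the worst
cases are simply worst_ns[0] onwards, until the first zero.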

Begin working on the worst-case locks.  Solutions like Andrew's
low-latency and my lock-break are a start.  Better (at least in general)
solutions are to analyze the locks.  Localize them; make them finer
grained.  Analyze the algorithms.  Find the big problems.  Anyone look
at the tty layer lately?  Ugh.  Using the preemptive kernel as a base
and the analysis of the locks as a list of culprits, clean this cruft
up.  This would benefit SMP, too.  Perhaps a better locking construct is
useful.

The immediate result is good; the future is better.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12  6:01                                         ` Robert Love
@ 2002-01-12 12:45                                           ` yodaiken
  2002-01-12 19:00                                           ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-12 12:45 UTC (permalink / raw)
  To: Robert Love
  Cc: Rob Landley, yodaiken, Alan Cox, nigel, Andrew Morton,
	linux-kernel

On Sat, Jan 12, 2002 at 01:01:39AM -0500, Robert Love wrote:
> could even be merged.  Daniel Phillips suggested a passive tool on IRC. 
> Preempt-stats works like this.  It is off-by-default and, when enabled,
> measures time between lock and unlock, reporting the top 20 worst-cases.

I think one of the problems with this entire debate is lack of meaningful
numbers. Not for the first time, I propose that you test with something
that tests application benefits instead of internal numbers that may not
mean anything. For example, there is a simple test
		/* user code */
		get time.
		count = 200*3600; /* one hour */
		while(count--){
			read cycle timer
			clock_nanosleep(5 milliseconds)
			read cycle timer
			compute actual delay and difference from 5 milliseconds
			store the worst case
			}
		get time.
		printf("After one hour the worst deviation is %d clock ticks\n",worst);
		printf("This was supposed to take one hour and it took %d", compute_elapsed());
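
For anyone who wants to try this, the loop above could be fleshed out
roughly as below. It is a sketch under assumptions, not a calibrated
benchmark: it uses CLOCK_MONOTONIC and nanoseconds where the outline says
"cycle timer" and "clock ticks", and the function name
worst_sleep_deviation_ns is invented here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Nanoseconds elapsed from a to b. */
static long long elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (b->tv_sec - a->tv_sec) * 1000000000LL + (b->tv_nsec - a->tv_nsec);
}

/*
 * Sleep `iterations` times for `period_ns` each and return the worst
 * deviation, in nanoseconds, between the requested and the measured delay.
 */
long long worst_sleep_deviation_ns(long iterations, long period_ns)
{
    struct timespec req = { 0, period_ns };
    struct timespec before, after;
    long long worst = 0;

    assert(iterations > 0 && period_ns > 0 && period_ns < 1000000000L);
    while (iterations--) {
        long long dev;

        clock_gettime(CLOCK_MONOTONIC, &before);
        /* relative sleep; an early wakeup simply shows up as a deviation */
        clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &after);

        dev = elapsed_ns(&before, &after) - period_ns;
        if (dev < 0)
            dev = -dev;
        if (dev > worst)
            worst = dev;                /* store the worst case */
    }
    return worst;
}
```

The one hour run from the outline would be
worst_sleep_deviation_ns(200L * 3600, 5000000L), wrapped in two more
clock_gettime() calls to check the total elapsed time.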

		

> 
> Begin working on the worst-case locks.  Solutions like Andrew's
> low-latency and my lock-break are a start.  Better (at least in general)
> solutions are to analyze the locks.  Localize them; make them finer
> grained.  Analyze the algorithms.  Find the big problems.  Anyone look

The theory that "fine grained = better" is not proved. It's obvious that
"fine grained = more time spent in the overhead of locking and unlocking locks and
		potentially more time spent in lock contention
		and lots more opportunities of cache ping-pong in real smp
		and much harder to debug"
But the performance gain that is supposed to balance that is often elusive.



> at the tty layer lately?  Ugh.  Using the preemptive kernel as a base
> and the analysis of the locks as a list of culprits, clean this cruft
> up.  This would benefit SMP, too.  Perhaps a better locking construct is
> useful.
> 
> The immediate result is good; the future is better.

Removing synchronization by removing contention 
is better engineering than fiddling about with synchronization
primitives, but it is much harder.


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12  6:01                                         ` Robert Love
  2002-01-12 12:45                                           ` yodaiken
@ 2002-01-12 19:00                                           ` Alan Cox
  2002-01-13  0:16                                             ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-12 19:00 UTC (permalink / raw)
  To: Robert Love
  Cc: Rob Landley, yodaiken, Alan Cox, nigel, Andrew Morton,
	linux-kernel

> Right.  Here is what I want for 2.5 as a _general_ step towards a better
> kernel that will yield better performance:

I see absolutely _no_ evidence to support this repeated claim. I'm still
waiting to see any evidence that low latency patches are not sufficient, or
an explanation of who is going to fix all the drivers you break in subtle
ways

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:00                                           ` Alan Cox
@ 2002-01-13  0:16                                             ` Robert Love
  2002-01-13  1:41                                               ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-13  0:16 UTC (permalink / raw)
  To: Alan Cox; +Cc: Rob Landley, yodaiken, nigel, Andrew Morton, linux-kernel

On Sat, 2002-01-12 at 14:00, Alan Cox wrote:

> I see absolutely _no_ evidence to support this repeated claim. I'm still
> waiting to see any evidence that low latency patches are not sufficient, or
> an explanation of who is going to fix all the drivers you break in subtle
> ways

I'll work on fixing things the patch breaks.  I don't think it will be
that bad.  I've been working on preemption for a long long time, and
before me others have been working for a long long time, and I just
don't see the hordes of broken drivers or the tons of race-conditions
due to per-CPU data.  I have seen some, and I have fixed them.

For a solution to latency concerns, I'd much prefer to lay a framework
down that provides a proper solution and then work on fine tuning the
kernel to get the desired latency out of it.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13  0:16                                             ` Robert Love
@ 2002-01-13  1:41                                               ` Alan Cox
  2002-01-13 22:50                                                 ` Daniel Phillips
  0 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-13  1:41 UTC (permalink / raw)
  To: Robert Love
  Cc: Alan Cox, Rob Landley, yodaiken, nigel, Andrew Morton,
	linux-kernel

> For a solution to latency concerns, I'd much prefer to lay a framework
> down that provides a proper solution and then work on fine tuning the
> kernel to get the desired latency out of it.

As the low latency patch proves, the framework has always been there, the
ll patches do the rest

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13  1:41                                               ` Alan Cox
@ 2002-01-13 22:50                                                 ` Daniel Phillips
  0 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-13 22:50 UTC (permalink / raw)
  To: Alan Cox, Robert Love
  Cc: Alan Cox, Rob Landley, yodaiken, nigel, Andrew Morton,
	linux-kernel

On January 13, 2002 02:41 am, Alan Cox wrote:
> [somebody wrote]
> > For a solution to latency concerns, I'd much prefer to lay a framework
> > down that provides a proper solution and then work on fine tuning the
> > kernel to get the desired latency out of it.
> 
> As the low latency patch proves, the framework has always been there, the
> ll patches do the rest

For that matter, the -preempt patch proves that the framework has always - 
i.e., since genesis of SMP - been there for a preemptible kernel.

--
Daniel


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 20:22                                       ` Rob Landley
                                                           ` (2 preceding siblings ...)
  2002-01-12  6:01                                         ` Robert Love
@ 2002-01-12  9:52                                         ` arjan
  2002-01-12 18:54                                           ` Alan Cox
  2002-01-14 12:08                                         ` [2.4.17/18pre] VM and swap - it's really unusable Helge Hafting
  4 siblings, 1 reply; 351+ messages in thread
From: arjan @ 2002-01-12  9:52 UTC (permalink / raw)
  To: Rob Landley; +Cc: linux-kernel

In article <20020112042404.WCSI23959.femail47.sdc1.sfba.home.com@there> you wrote:

> The preempt patch is really "SMP on UP".

Ok I've seen this misconception quite a lot now. THIS IS NOT TRUE. For one,
constructs that are ok on SMP are not automatically ok with the -preempt
patch, like per-cpu data. And there's a LOT more of that than you think.
Basically with preempt you change the locking rules from under all existing
code. Most will work, even more will appear to work as preemption isn't
an event THAT common (by this I mean the chance of getting preempted in your
4 lines of C code where you have per cpu data).
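
The per-cpu hazard can be shown with a small deterministic simulation.
Nothing below is kernel code: NCPU, percpu_hits, this_cpu and percpu_hit
are invented for illustration, and the preemption is faked with a flag
instead of a real scheduler.

```c
#include <assert.h>

#define NCPU 2                      /* simulated CPUs */

long percpu_hits[NCPU];             /* one counter per simulated CPU */
int this_cpu;                       /* CPU the simulated task is running on */

/*
 * Unlocked read-modify-write of per-CPU data.  If preempted_midway is
 * non-zero, simulate being scheduled out between the read and the write:
 * another task on the same CPU bumps the counter, and we resume on a
 * different CPU still holding the stale index and value.
 */
void percpu_hit(int preempted_midway)
{
    int cpu = this_cpu;
    long seen;

    assert(cpu >= 0 && cpu < NCPU);
    seen = percpu_hits[cpu];                /* read */
    if (preempted_midway) {
        percpu_hits[cpu]++;                 /* the other task's update */
        this_cpu = (cpu + 1) % NCPU;        /* we were migrated meanwhile */
    }
    percpu_hits[cpu] = seen + 1;            /* write back the stale value */
}
```

With preempted_midway set, two increments happen but the counter only
goes up by one. Without preemption the same code is safe on SMP, since no
other CPU touches this CPU's slot.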

Also, once you add locks around the  per-cpu data, for the core code it
might be close to smp. For drivers it's not though. Drivers assume that when
they do

outb(foo,bar);
outb(foo2,bar2);

that those happen "close" to each other in time. Especially in initialisation
paths (where the driver thread is the only thread that can see the
data structures/device) there are no spinlocks held, so preempt can trigger
here. Sure, in the current situation you can get an interrupt, but the Linux
interrupt delay is not more than, say, 1ms, while a schedule-out can easily
take a second or two. Do we know all devices can stand such delays?
I dare say we don't, as the hardware requirements currently aren't coded
in the drivers.

Add to that that there's no actual benefit of -preempt over the -lowlat
patch latency wise (you REALLY need to combine them or -preempt sucks raw
eggs for latency)....

Greetings,
   Arjan van de Ven

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12  9:52                                         ` arjan
@ 2002-01-12 18:54                                           ` Alan Cox
  2002-01-12 19:23                                             ` Ed Sweetman
                                                               ` (4 more replies)
  0 siblings, 5 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-12 18:54 UTC (permalink / raw)
  To: arjan; +Cc: Rob Landley, linux-kernel

Another example is in the network drivers. The 8390 core for one example
carefully disables an IRQ on the card so that it can avoid spinlocking on 
uniprocessor boxes.

So with pre-empt this happens

	driver magic
	disable_irq(dev->irq)
PRE-EMPT:
	[large periods of time running other code]
PRE-EMPT:
	We get back and we've missed 300 packets, the serial port sharing
	the IRQ has dropped our internet connection completely.

["Don't do that then" isn't a valid answer here. If I did hold a lock
 it would be for several milliseconds at a time anyway and would reliably
 trash performance this time]

There are numerous other examples in the kernel tree where the current code
knows that there is a small bounded time between two actions in kernel space
that do not have a sleep. They are not spin locked, and putting spin locks
everywhere will just trash performance. They are pure hardware interactions
so you can't automatically detect them.

That is why the pre-empt code is a much much bigger problem and task than the
low latency code.

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 18:54                                           ` Alan Cox
@ 2002-01-12 19:23                                             ` Ed Sweetman
  2002-01-12 19:35                                               ` yodaiken
  2002-01-12 20:09                                               ` Alan Cox
  2002-01-12 19:26                                             ` Robert Love
                                                               ` (3 subsequent siblings)
  4 siblings, 2 replies; 351+ messages in thread
From: Ed Sweetman @ 2002-01-12 19:23 UTC (permalink / raw)
  To: arjan, Alan Cox; +Cc: Rob Landley, linux-kernel

> Another example is in the network drivers. The 8390 core for one example
> carefully disables an IRQ on the card so that it can avoid spinlocking on
> uniprocessor boxes.
>
> So with pre-empt this happens
>
> 	driver magic
> 	disable_irq(dev->irq)
> PRE-EMPT:
> 	[large periods of time running other code]
> PRE-EMPT:
> 	We get back and we've missed 300 packets, the serial port sharing
> 	the IRQ has dropped our internet connection completely.
>
> ["Don't do that then" isn't a valid answer here. If I did hold a lock
>  it would be for several milliseconds at a time anyway and would reliably
>  trash performance this time]
>
> There are numerous other examples in the kernel tree where the current code
> knows that there is a small bounded time between two actions in kernel space
> that do not have a sleep. They are not spin locked, and putting spin locks
> everywhere will just trash performance. They are pure hardware interactions
> so you can't automatically detect them.

Hardware-to-hardware interactions could have a higher priority than normal
programs being run.   That way they're not preempted by simple programs; they
would have to be purposely preempted by the user.

> That is why the pre-empt code is a much much bigger problem and task than the
> low latency code.

As for lowering latency, sure, the low latency code probably does nearly as
well as the preempt patch.  That's fine.  Shortening the time locks are held
through better code can help to a certain extent (unless a lot of the kernel
code is poorly written, which I doubt).  In its present state though, my idea
to fix the kernel would be to give higher priorities to the parts of the
kernel that take locks which shouldn't normally be broken.  That way we can
distinguish between locks that are safe to preempt and the ones that can do
harm.  But those people who require their app to be treated specially can run
it with -20 and preempt everything.   To me that makes sense.  Is there a
reason why it doesn't, besides aesthetics?   The only way the people making
the aesthetic argument are going to be pleased is if the kernel is designed
from the ground up to be better latency- and lock-wise.   A lot of people
would rather not have to wait until then.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:23                                             ` Ed Sweetman
@ 2002-01-12 19:35                                               ` yodaiken
  2002-01-12 20:09                                               ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-12 19:35 UTC (permalink / raw)
  To: Ed Sweetman; +Cc: arjan, Alan Cox, Rob Landley, linux-kernel

On Sat, Jan 12, 2002 at 02:23:00PM -0500, Ed Sweetman wrote:
> hardware to hardware could have a higher priority than normal programs being
> run.   That way they're not preempted by simple programs, it would have to
> be purposely preempted by the user.

Priority is currently, and sensibly, by process. A process may run user
code, do sys-calls, or field interrupts both soft and hard. Now do you want to
adjust the priority at every transition?

> Lowering the latency, sure the low latency code probably does nearly as well
> as the preempt patch.  that's fine.  Shortening the time locks are held by
> better code can help to a certain extent (unless a lot of the kernel code is
> poorly written, which I doubt).  In its present state though, my idea to
> fix the kernel would be to give parts of the kernel where locks are made,

"Fix" what? What is the objective of your fix?


> that shouldn't be broken normally, higher priorities.  That way we can
> distinguish between safe locks to preempt at and the ones that can do harm.
> But those people who require their app to be treated special can run it
> with -20 and preempt everything.   To me that makes sense.  Is there a

So:
	get semaphore on slab memory and raise priority
		get preempted by "treated special" app that then
			does an operation on the slab queues 

Is that what you want?
		
> reason why it doesn't?  Besides aesthetics.   The only way the aesthetic

It doesn't work? Is that a sufficient reason?


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:23                                             ` Ed Sweetman
  2002-01-12 19:35                                               ` yodaiken
@ 2002-01-12 20:09                                               ` Alan Cox
  2002-01-20  0:08                                                 ` Pavel Machek
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-12 20:09 UTC (permalink / raw)
  To: Ed Sweetman; +Cc: arjan, Alan Cox, Rob Landley, linux-kernel

> hardware to hardware could have a higher priority than normal programs being
> run.   That way they're not preempted by simple programs, it would have to
> be purposely preempted by the user.

How do you know they are there? How do you detect the situation, or do you
plan to audit every driver?

> Lowering the latency, sure the low latency code probably does nearly as well
> as the preempt patch.  that's fine.  Shortening the time locks are held by

Not nearly as well. In the tests I've seen it runs _better_ than just pre-empt,
and pre-empt + low latency is the same as pure low latency - 1mS

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:09                                               ` Alan Cox
@ 2002-01-20  0:08                                                 ` Pavel Machek
  0 siblings, 0 replies; 351+ messages in thread
From: Pavel Machek @ 2002-01-20  0:08 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ed Sweetman, arjan, Rob Landley, linux-kernel

Hi!

> > hardware to hardware could have a higher priority than normal programs being
> > run.   That way they're not preempted by simple programs, it would have to
> > be purposely preempted by the user.
> 
> How do you know they are there. How do you detect the situation, or do you
> plan to audit every driver ?

Any driver which depends on timing is broken. 2.4.9 was happy to spend
two seconds in an interrupt (console switch). So I doubt too many drivers
are broken like that.

And... the drivers were broken already. That is not a reason against the
patch!
									Pavel
-- 
(about SSSCA) "I don't say this lightly.  However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 18:54                                           ` Alan Cox
  2002-01-12 19:23                                             ` Ed Sweetman
@ 2002-01-12 19:26                                             ` Robert Love
  2002-01-12 19:36                                               ` yodaiken
  2002-01-12 20:07                                               ` Alan Cox
  2002-01-12 20:53                                             ` Roman Zippel
                                                               ` (2 subsequent siblings)
  4 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-12 19:26 UTC (permalink / raw)
  To: Alan Cox; +Cc: arjan, Rob Landley, linux-kernel

On Sat, 2002-01-12 at 13:54, Alan Cox wrote:
> Another example is in the network drivers. The 8390 core for one example
> carefully disables an IRQ on the card so that it can avoid spinlocking on 
> uniprocessor boxes.
> 
> So with pre-empt this happens
> 
> 	driver magic
> 	disable_irq(dev->irq)
> PRE-EMPT:
> 	[large periods of time running other code]
> PRE-EMPT:
> 	We get back and we've missed 300 packets, the serial port sharing
> 	the IRQ has dropped our internet connection completely.

We don't preempt while IRQs are disabled.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:26                                             ` Robert Love
@ 2002-01-12 19:36                                               ` yodaiken
  2002-01-12 20:07                                               ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-12 19:36 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

On Sat, Jan 12, 2002 at 02:26:27PM -0500, Robert Love wrote:
> On Sat, 2002-01-12 at 13:54, Alan Cox wrote:
> > Another example is in the network drivers. The 8390 core for one example
> > carefully disables an IRQ on the card so that it can avoid spinlocking on 
> > uniprocessor boxes.
> > 
> > So with pre-empt this happens
> > 
> > 	driver magic
> > 	disable_irq(dev->irq)
> > PRE-EMPT:
> > 	[large periods of time running other code]
> > PRE-EMPT:
> > 	We get back and we've missed 300 packets, the serial port sharing
> > 	the IRQ has dropped our internet connection completely.
> 
> We don't preempt while IRQ are disabled.

You read the mask map and somehow figure out which masked IRQs correspond to
active devices?

> 
> 	Robert Love

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:26                                             ` Robert Love
  2002-01-12 19:36                                               ` yodaiken
@ 2002-01-12 20:07                                               ` Alan Cox
  2002-01-12 20:03                                                 ` Robert Love
  2002-01-12 20:36                                                 ` Kenneth Johansson
  1 sibling, 2 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-12 20:07 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

> > PRE-EMPT:
> > 	We get back and we've missed 300 packets, the serial port sharing
> > 	the IRQ has dropped our internet connection completely.
> 
> We don't preempt while IRQ are disabled.

I must have missed that in the code. I can see you check __cli() status but
I didn't see anywhere you check disable_irq(). Even if you did, it doesn't
help when I mask the IRQ on the chip rather than using disable_irq() calls.
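
The gap described here can be modelled in userspace. This is only a sketch,
assuming a preemption test that consults a single global "interrupts off"
flag; the struct and every mask_* name are hypothetical and are neither
kernel symbols nor code from the preempt patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct mask_cpu {
    bool irqs_globally_off;     /* what a __cli()-style check can see */
    unsigned int masked_lines;  /* bit n set: line n masked individually */
};

/* Mask one line only; the timer and every other line stay live. */
void mask_disable_irq(struct mask_cpu *cpu, unsigned int line)
{
    assert(line < 32);
    cpu->masked_lines |= 1u << line;
}

void mask_enable_irq(struct mask_cpu *cpu, unsigned int line)
{
    assert(line < 32);
    cpu->masked_lines &= ~(1u << line);
}

/* A preemption test that looks only at the global flag. */
bool mask_may_preempt(const struct mask_cpu *cpu)
{
    return !cpu->irqs_globally_off;
}

/* Reports whether preemption is still allowed while one line is masked. */
bool mask_preemptible_with_line_masked(unsigned int line)
{
    struct mask_cpu cpu = { false, 0 };
    bool allowed;

    mask_disable_irq(&cpu, line);
    allowed = mask_may_preempt(&cpu);   /* masked_lines is never consulted */
    mask_enable_irq(&cpu, line);
    return allowed;
}
```

In this model the per-line mask is invisible to the test, so the section
between the disable and the enable remains preemptible. Masking the
interrupt on the card itself would not even show up in masked_lines.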

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:07                                               ` Alan Cox
@ 2002-01-12 20:03                                                 ` Robert Love
  2002-01-12 20:21                                                   ` Alan Cox
  2002-01-12 20:36                                                 ` Kenneth Johansson
  1 sibling, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-12 20:03 UTC (permalink / raw)
  To: Alan Cox; +Cc: arjan, Rob Landley, linux-kernel

On Sat, 2002-01-12 at 15:07, Alan Cox wrote:

> > We don't preempt while IRQ are disabled.
> 
> I must have missed that in the code. I can see you check __cli() status but
> I didn't see anywhere you check disable_irq(). Even if you did it doesnt
> help when I mask the irq on the chip rather than using disable_irq() calls.

Well, if IRQs are disabled we won't have the timer... would not the
system panic anyhow if schedule() was called while in an interrupt
handler?

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:03                                                 ` Robert Love
@ 2002-01-12 20:21                                                   ` Alan Cox
  2002-01-13  3:10                                                     ` Robert Love
  2002-01-13 22:02                                                     ` Daniel Phillips
  0 siblings, 2 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-12 20:21 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

> > I didn't see anywhere you check disable_irq(). Even if you did it doesnt
> > help when I mask the irq on the chip rather than using disable_irq() calls.
> 
> Well, if IRQs are disabled we won't have the timer... would not the
> system panic anyhow if schedule() was called while in an interrupt
> handler?

You completely misunderstand.

	disable_irq(n)

I disable a single specific interrupt, I don't disable the timer interrupt.
Your code doesn't seem to handle that. It's just one of the examples of where
you really need priority handling, and that's a horrible dark and slippery
slope.

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:21                                                   ` Alan Cox
@ 2002-01-13  3:10                                                     ` Robert Love
  2002-01-13 11:39                                                       ` Russell King
  2002-01-13 15:59                                                       ` Alan Cox
  2002-01-13 22:02                                                     ` Daniel Phillips
  1 sibling, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-13  3:10 UTC (permalink / raw)
  To: Alan Cox; +Cc: arjan, Rob Landley, linux-kernel

On Sat, 2002-01-12 at 15:21, Alan Cox wrote:

> You completely misunderstand.
> 
> 	disable_irq(n)
> 
> I disable a single specific interrupt, I don't disable the timer interrupt.
> Your code doesn't seem to handle that.

It can if we increment the preempt_count in disable_irq_nosync and
decrement it on enable_irq.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13  3:10                                                     ` Robert Love
@ 2002-01-13 11:39                                                       ` Russell King
  2002-01-13 18:24                                                         ` Robert Love
  2002-01-13 15:59                                                       ` Alan Cox
  1 sibling, 1 reply; 351+ messages in thread
From: Russell King @ 2002-01-13 11:39 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

On Sat, Jan 12, 2002 at 10:10:55PM -0500, Robert Love wrote:
> It can if we increment the preempt_count in disable_irq_nosync and
> decrement it on enable_irq.

Who says you're going to be enabling IRQs any time soon?  AFAIK, there is
nothing that requires enable_irq to be called after disable_irq_nosync.

In fact, you could well have the following in a driver:

	/* initial shutdown of device */

	disable_irq_nosync(i); /* or disable_irq(i); */

	/* other shutdown stuff */

	free_irq(i, private);

and you would have to audit all drivers to find out if they did this -
this would seriously damage your preempt_count.
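
The imbalance can be shown with a small userspace model. It is only a
sketch, assuming the proposed scheme of bumping a counter on disable and
dropping it on enable; the cnt_* names are hypothetical stand-ins, not the
real kernel functions.

```c
#include <assert.h>

/* Hypothetical model of the proposed counting scheme; not kernel code. */
static int cnt_preempt_count;   /* 0 means preemption is allowed */

void cnt_disable_irq_nosync(int irq)
{
    (void)irq;
    cnt_preempt_count++;        /* proposed: disabling a line blocks preemption */
}

void cnt_enable_irq(int irq)
{
    (void)irq;
    assert(cnt_preempt_count > 0);
    cnt_preempt_count--;        /* proposed: enabling the line allows it again */
}

void cnt_free_irq(int irq)
{
    (void)irq;                  /* releases the line; the count is untouched */
}

/* Balanced use: returns the net change in the count (expected 0). */
int cnt_balanced_section(int irq)
{
    int before = cnt_preempt_count;

    cnt_disable_irq_nosync(irq);
    cnt_enable_irq(irq);
    return cnt_preempt_count - before;
}

/* Shutdown path as in the example: disable, free, never enable. */
int cnt_shutdown_path(int irq)
{
    int before = cnt_preempt_count;

    cnt_disable_irq_nosync(irq);
    cnt_free_irq(irq);
    return cnt_preempt_count - before;
}
```

Each pass through the shutdown path leaves the model's count one higher,
so preemption would stay off for good unless every such driver were found
and changed.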

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 11:39                                                       ` Russell King
@ 2002-01-13 18:24                                                         ` Robert Love
  2002-01-13 19:06                                                           ` Russell King
  2002-01-13 19:30                                                           ` Alan Cox
  0 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-13 18:24 UTC (permalink / raw)
  To: Russell King; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

On Sun, 2002-01-13 at 06:39, Russell King wrote:
> On Sat, Jan 12, 2002 at 10:10:55PM -0500, Robert Love wrote:
> > It can if we increment the preempt_count in disable_irq_nosync and
> > decrement it on enable_irq.
> 
> Who says you're going to be enabling IRQs any time soon?  AFAIK, there is
> nothing that requires enable_irq to be called after disable_irq_nosync.
> 
> In fact, you could well have the following in a driver:
> 
> 	/* initial shutdown of device */
> 
> 	disable_irq_nosync(i); /* or disable_irq(i); */
> 
> 	/* other shutdown stuff */
> 
> 	free_irq(i, private);
> 
> and you would have to audit all drivers to find out if they did this -
> this would seriously damage your preempt_count.

I wasn't thinking.  Anytime we are in an interrupt handler, preemption
is disabled.  Regardless of how (or even if) interrupts are disabled. 
We bump preempt_count on the entry path.  So, no problem.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:24                                                         ` Robert Love
@ 2002-01-13 19:06                                                           ` Russell King
  2002-01-13 19:30                                                           ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: Russell King @ 2002-01-13 19:06 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

On Sun, Jan 13, 2002 at 01:24:20PM -0500, Robert Love wrote:
> On Sun, 2002-01-13 at 06:39, Russell King wrote:
> > On Sat, Jan 12, 2002 at 10:10:55PM -0500, Robert Love wrote:
> > > It can if we increment the preempt_count in disable_irq_nosync and
> > > decrement it on enable_irq.
> > 
> > Who says you're going to be enabling IRQs any time soon?  AFAIK, there is
> > nothing that requires enable_irq to be called after disable_irq_nosync.
> > 
> > In fact, you could well have the following in a driver:
> > 
> > 	/* initial shutdown of device */
> > 
> > 	disable_irq_nosync(i); /* or disable_irq(i); */
> > 
> > 	/* other shutdown stuff */
> > 
> > 	free_irq(i, private);
> > 
> > and you would have to audit all drivers to find out if they did this -
> > this would seriously damage your preempt_count.
> 
> I wasn't thinking.  Anytime we are in an interrupt handler, preemption
> is disabled.  Regardless of how (or even if) interrupts are disabled. 
> We bump preempt_count on the entry path.  So, no problem.

Err.  This isn't *inside* an interrupt handler.  This could well be in
the driver shutdown code (eg, when fops->release is called).
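
A userspace sketch of the distinction, assuming the count is bumped only on
the interrupt entry path; struct ctx_model and the ctx_* functions are
hypothetical and do not correspond to real kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; names are illustrative, not kernel symbols. */
struct ctx_model {
    int preempt_count;   /* 0 means preemption is allowed */
    bool line_masked;    /* set by a disable_irq()-style call */
};

bool ctx_preemptible(const struct ctx_model *m)
{
    return m->preempt_count == 0;
}

/* Interrupt entry bumps the count, the handler runs, exit drops it. */
bool ctx_preemptible_inside_handler(struct ctx_model *m)
{
    bool p;

    m->preempt_count++;          /* entry path */
    p = ctx_preemptible(m);      /* state seen by the handler body */
    assert(m->preempt_count > 0);
    m->preempt_count--;          /* exit path */
    return p;
}

/* Process context, e.g. a release path: masks the line, no entry bump. */
bool ctx_preemptible_in_release_path(struct ctx_model *m)
{
    m->line_masked = true;
    return ctx_preemptible(m);
}
```

In the model the entry bump covers the handler, but the release path runs
with a count of zero and so can still be preempted with the line masked.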

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:24                                                         ` Robert Love
  2002-01-13 19:06                                                           ` Russell King
@ 2002-01-13 19:30                                                           ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13 19:30 UTC (permalink / raw)
  To: Robert Love; +Cc: Russell King, Alan Cox, arjan, Rob Landley, linux-kernel

> I wasn't thinking.  Anytime we are in an interrupt handler, preemption
> is disabled.  Regardless of how (or even if) interrupts are disabled. 
> We bump preempt_count on the entry path.  So, no problem.

The code path isn't in an interrupt handler.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13  3:10                                                     ` Robert Love
  2002-01-13 11:39                                                       ` Russell King
@ 2002-01-13 15:59                                                       ` Alan Cox
  2002-01-13 18:20                                                         ` Robert Love
  2002-01-14  5:59                                                         ` Daniel Phillips
  1 sibling, 2 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13 15:59 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

> > I disable a single specific interrupt, I don't disable the timer interrupt.
> > Your code doesn't seem to handle that.
> 
> It can if we increment the preempt_count in disable_irq_nosync and
> decrement it on enable_irq.

A driver that knows how its IRQ is handled and that it is the sole
user (e.g. ISA) may, and some do, leave it disabled for hours at a time.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:59                                                       ` Alan Cox
@ 2002-01-13 18:20                                                         ` Robert Love
  2002-01-14  5:59                                                         ` Daniel Phillips
  1 sibling, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-13 18:20 UTC (permalink / raw)
  To: Alan Cox; +Cc: arjan, Rob Landley, linux-kernel

On Sun, 2002-01-13 at 10:59, Alan Cox wrote:
> > I disable a single specific interrupt, I don't disable the timer interrupt.
> > Your code doesn't seem to handle that.
> 
> It can if we increment the preempt_count in disable_irq_nosync and
> decrement it on enable_irq.

OK, Alan, you spooked me with the disable_irq mess and admittedly my
initial solution wasn't ideal for a few reasons.

But it isn't a problem after all. In hw_irq.h we bump the count in the
interrupt path.  This should handle any handler, however we end up in
it.

I realized it because if we did not have a global solution to interrupt
request handlers, dropping spinlocks in the handler, even with IRQs
disabled, would cause a preemptive schedule.  All interrupts are
properly protected.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:59                                                       ` Alan Cox
  2002-01-13 18:20                                                         ` Robert Love
@ 2002-01-14  5:59                                                         ` Daniel Phillips
  2002-01-13 22:23                                                           ` Rob Landley
  1 sibling, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14  5:59 UTC (permalink / raw)
  To: Alan Cox, Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

On January 13, 2002 04:59 pm, Alan Cox wrote:
> > > I disable a single specific interrupt, I don't disable the timer interrupt.
> > > Your code doesn't seem to handle that.
> > 
> > It can if we increment the preempt_count in disable_irq_nosync and
> > decrement it on enable_irq.
> 
> A driver that knows about how its irq is handled and that it is sole
> user (eg ISA) may and some do leave it disabled for hours at a time

Good point.  Preemption would be disabled for that thread if we mindlessly
shut it off for every disable_irq.  For that driver we probably just want to
leave preemption enabled; it can't hurt.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:59                                                         ` Daniel Phillips
@ 2002-01-13 22:23                                                           ` Rob Landley
  0 siblings, 0 replies; 351+ messages in thread
From: Rob Landley @ 2002-01-13 22:23 UTC (permalink / raw)
  To: Daniel Phillips, Alan Cox, Robert Love; +Cc: Alan Cox, arjan, linux-kernel

On Monday 14 January 2002 12:59 am, Daniel Phillips wrote:
> On January 13, 2002 04:59 pm, Alan Cox wrote:
> > > > I disable a single specific interrupt, I don't disable the timer interrupt.
> > > > Your code doesn't seem to handle that.
> > >
> > > It can if we increment the preempt_count in disable_irq_nosync and
> > > decrement it on enable_irq.
> >
> > A driver that knows about how its irq is handled and that it is sole
> > user (eg ISA) may and some do leave it disabled for hours at a time
>
> Good point.  Preemption would be disabled for that thread if we mindlessly
> shut it off for every irq_disable.  For that driver we probably just want
> to leave preemption enabled, it can't hurt.

Once we return to user space, we can preempt again.  If preemption is still 
disabled upon return from the syscall, I'd say it's okay to switch it back on 
now. :)

Unless I'm missing something fundamental...?

Rob

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:21                                                   ` Alan Cox
  2002-01-13  3:10                                                     ` Robert Love
@ 2002-01-13 22:02                                                     ` Daniel Phillips
  1 sibling, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-13 22:02 UTC (permalink / raw)
  To: Alan Cox, Robert Love; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

On January 12, 2002 09:21 pm, Alan Cox wrote:
> > > I didn't see anywhere you check disable_irq(). Even if you did it doesnt
> > > help when I mask the irq on the chip rather than using disable_irq() calls.
> > 
> > Well, if IRQs are disabled we won't have the timer... would not the
> > system panic anyhow if schedule() was called while in an interrupt
> > handler?
> 
> You completely misunderstand.
> 
> 	disable_irq(n)
> 
> I disable a single specific interrupt, I don't disable the timer interrupt.
> Your code doesn't seem to handle that. Its just one of the examples of where
> you really need priority handling, and thats a horrible dark and slippery
> slope

He just needs to disable preemption there, it's just a slight mod to 
disable/enable_irq.  You probably have a few more of those, though...

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:07                                               ` Alan Cox
  2002-01-12 20:03                                                 ` Robert Love
@ 2002-01-12 20:36                                                 ` Kenneth Johansson
  2002-01-12 23:01                                                   ` Robert Love
  2002-01-13  1:30                                                   ` Alan Cox
  1 sibling, 2 replies; 351+ messages in thread
From: Kenneth Johansson @ 2002-01-12 20:36 UTC (permalink / raw)
  To: Alan Cox; +Cc: Robert Love, arjan, Rob Landley, linux-kernel

Alan Cox wrote:

> > > PRE-EMPT:
> > >     We get back and we've missed 300 packets, the serial port sharing
> > >     the IRQ has dropped our internet connection completely.
> >
> > We don't preempt while IRQ are disabled.
>
> I must have missed that in the code. I can see you check __cli() status but
> I didn't see anywhere you check disable_irq(). Even if you did it doesnt
> help when I mask the irq on the chip rather than using disable_irq() calls.
>
> Alan

But you get interrupted by other interrupts then, so you have the same problem
regardless of any preemption patch. You hopefully lose the CPU for a much
shorter time, but it is still the same problem.



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:36                                                 ` Kenneth Johansson
@ 2002-01-12 23:01                                                   ` Robert Love
  2002-01-13  0:02                                                     ` J Sloan
  2002-01-13  1:38                                                     ` Alan Cox
  2002-01-13  1:30                                                   ` Alan Cox
  1 sibling, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-12 23:01 UTC (permalink / raw)
  To: Kenneth Johansson; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

On Sat, 2002-01-12 at 15:36, Kenneth Johansson wrote:

> > I must have missed that in the code. I can see you check __cli() status but
> > I didn't see anywhere you check disable_irq(). Even if you did it doesnt
> > help when I mask the irq on the chip rather than using disable_irq() calls.
> >
> > Alan
> 
> But you get interrupted by other interrupts then so you have the same problem
> regardless of any preemption patch you hopefully lose the cpu for a much
> shorter time but still the same problem.

Agreed.  Further, you can't put _any_ upper bound on the number of
interrupts that could occur, preempt or not.  Sure, preempt can make it
worse, but I don't see it.  I have no bug reports to correlate.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 23:01                                                   ` Robert Love
@ 2002-01-13  0:02                                                     ` J Sloan
  2002-01-13  1:38                                                     ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: J Sloan @ 2002-01-13  0:02 UTC (permalink / raw)
  To: Robert Love
  Cc: Kenneth Johansson, Alan Cox, arjan, Rob Landley, linux-kernel,
	Andrew Morton

Robert Love wrote:

>Agreed.  Further, you can't put _any_ upper bound on the number of
>interrupts that could occur, preempt or not.  Sure, preempt can make it
>worse, but I don't see it.  I have no bug reports to correlate.
>

OTOH we do have a pile of user reports which
say the low latency patches give better results.

From my view here, low latency provides a more
silky feel when e.g. playing RtCW or Q3A -

BTW I have checked out 2.4.18pre2-aa2 and
am now running 2.4.18-pre3 + mini low latency.

* -aa absolutely kicks major booty in benchmarks.

* -mini-low-latency seems to do no worse than
stock kernel benchmark-wise, but seems to be
somehow smoother. I played some mp3s while
running dbench 16 and heard no hitches. Also
the RtCW test was successful, e.g. movement
was fluid and I was victorious in most skirmishes
with win32 opponents.

Regards,

jjs

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 23:01                                                   ` Robert Love
  2002-01-13  0:02                                                     ` J Sloan
@ 2002-01-13  1:38                                                     ` Alan Cox
  2002-01-13 15:18                                                       ` Roman Zippel
       [not found]                                                       ` <16QNVQ-2JqEACC@fwd03.sul.t-online.com>
  1 sibling, 2 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13  1:38 UTC (permalink / raw)
  To: Robert Love; +Cc: Kenneth Johansson, Alan Cox, arjan, Rob Landley, linux-kernel

> > But you get interrupted by other interrupts then so you have the same problem
> > regardless of any preemption patch you hopefully lose the cpu for a much
> > shorter time but still the same problem.
> 
> Agreed.  Further, you can't put _any_ upper bound on the number of
> interrupts that could occur, preempt or not.  Sure, preempt can make it
> worse, but I don't see it.  I have no bug reports to correlate.

How many full benchmark sets have you done on an NE2000? It's quite obvious
from your earlier mail you hadn't even considered problems like this.

Let me ask you the _right_ question instead

-	Prove to me that there are no cases that pre-empt screws up
	like this.
-	Prove to me that pre-empt is better than the big low latency patch

All I have seen so far is benchmarks that say low latency is better as is,
and evidence that preempt patches cause far more problems than they solve
and have complex and subtle side effects nobody yet understands.

Furthermore, it's obvious that the only way to fix these side effects is to
implement full priority handling to avoid priority inversion issues (which
is precisely what the IRQ problem is). That means implementing interrupt
handlers as threads, heavyweight locks, and an end result I'm really not
interested in using.

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13  1:38                                                     ` Alan Cox
@ 2002-01-13 15:18                                                       ` Roman Zippel
  2002-01-13 15:36                                                         ` Arjan van de Ven
                                                                           ` (3 more replies)
       [not found]                                                       ` <16QNVQ-2JqEACC@fwd03.sul.t-online.com>
  1 sibling, 4 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-13 15:18 UTC (permalink / raw)
  To: Alan Cox; +Cc: Robert Love, Kenneth Johansson, arjan, Rob Landley, linux-kernel

Hi,

Alan Cox wrote:

> All I have seen so far is benchmarks that say low latency is better as is,

If Andrew did a good job (which he obviously did), I don't doubt that.

> and evidence that preempt patches cause far more problems than they solve
> and have complex and subtle side effects nobody yet understands.

I'm aware of two side effects:
- preempt exposes already existing problems, which are worth fixing
independent of preempt.
- it can cause unexpected delays, which should be nonfatal, otherwise
worth fixing as well.

What somehow got lost in this discussion is that the two patches don't
necessarily conflict with each other. They both attack the same problem
with different approaches, which complement each other. I prefer to get
the best of both patches.
The ll patch identifies problems which preempt alone can't fix; on the
other hand, the ll patch inserts schedule calls all over the place, where
preempt can handle this transparently. So far I haven't seen any
evidence that preempt introduces any _new_ serious problems, so I'd
rather get the best out of both.

> Furthermore it's obvious that the only way to fix these side effects is to
> implement full priority handling to avoid priority inversion issues (which
> is precisely what the IRQ problem is). That means implementing interrupt
> handlers as threads, heavyweight locks, and an end result I'm really not
> interested in using.

It's not really necessary to go that far. It's generally a good idea to
keep interrupt handlers as short as possible; we use bh or tasklets for
exactly that reason. I don't think we need to work around broken
hardware, but getting decent latency on halfway decent hardware should
not be a problem.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:18                                                       ` Roman Zippel
@ 2002-01-13 15:36                                                         ` Arjan van de Ven
  2002-01-14  5:03                                                           ` Daniel Phillips
  2002-01-13 15:45                                                         ` Alan Cox
                                                                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 351+ messages in thread
From: Arjan van de Ven @ 2002-01-13 15:36 UTC (permalink / raw)
  To: Roman Zippel; +Cc: linux-kernel

On Sun, Jan 13, 2002 at 04:18:29PM +0100, Roman Zippel wrote:

> What somehow got lost in this discussion, that both patches don't
> necessarily conflict with each other, they both attack the same problem
> with different approaches, which complement each other. I prefer to get
> the best of both patches.

If you do this (and I've seen the results of doing both at once vs. only
either of them vs. pure) then there's NO benefit left for the preemption.
Sure, AVERAGE latency goes down slightly, but this is in the usec
range, since worst case is already 1 msec or less. Below the 1 msec range it
really doesn't matter anymore, however.

At that point you're adding all the complexity for the negligible-to-no-gain
case...

Greetings,
   Arjan van de Ven

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:36                                                         ` Arjan van de Ven
@ 2002-01-14  5:03                                                           ` Daniel Phillips
  2002-01-14  5:09                                                             ` Andrew Morton
                                                                               ` (2 more replies)
  0 siblings, 3 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14  5:03 UTC (permalink / raw)
  To: Arjan van de Ven, Roman Zippel; +Cc: linux-kernel

On January 13, 2002 04:36 pm, Arjan van de Ven wrote:
> On Sun, Jan 13, 2002 at 04:18:29PM +0100, Roman Zippel wrote:
> 
> > What somehow got lost in this discussion, that both patches don't
> > necessarily conflict with each other, they both attack the same problem
> > with different approaches, which complement each other. I prefer to get
> > the best of both patches.
> 
> If you do this (and I've seen the results of doing both at once vs only
> either of them vs pure) then there's NO benefit for the preemption left.

Sorry, that's incorrect.  I stated why earlier in this thread and akpm signed 
off on it.  With preempt you get ASAP (i.e., as soon as the outermost 
spinlock is done) process scheduling.  With hand-coded scheduling points you 
get 'as soon as it happens to hit a scheduling point'.
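
The distinction drawn above can be put into a toy model. This is only a
sketch under made-up assumptions: time is in arbitrary units, the function
names are hypothetical, and the kernel path is reduced to a list of explicit
scheduling points and a list of intervals during which the outermost
spinlock is held. It is not a measurement of either patch.

```python
def latency_with_sched_points(wake_time, sched_points):
    """Delay from wake-up to the first explicit scheduling point.

    sched_points: times (hypothetical units, e.g. usec) at which the running
    kernel path happens to call schedule(). Returns None if it never does.
    """
    later = [t for t in sched_points if t >= wake_time]
    return min(later) - wake_time if later else None


def latency_with_preempt(wake_time, lock_sections):
    """Delay from wake-up until the outermost spinlock is dropped.

    lock_sections: non-overlapping (start, end) intervals during which the
    outermost lock is held. Outside such an interval preemption is immediate.
    """
    for start, end in lock_sections:
        if start <= wake_time < end:
            return end - wake_time
    return 0


if __name__ == "__main__":
    sched_points = [100, 500]      # assumed hand-coded scheduling points
    lock_sections = [(40, 60)]     # assumed outermost-lock hold interval
    for wake in (50, 70):
        print(wake,
              latency_with_sched_points(wake, sched_points),
              latency_with_preempt(wake, lock_sections))
```

In this model a wake-up inside the locked region waits for the unlock under
preempt, and for the next scheduling point otherwise; a wake-up outside any
lock is served at once under preempt. How much that matters in practice
depends entirely on where the long paths actually are.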

That is not the only benefit, just the most obvious one.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:03                                                           ` Daniel Phillips
@ 2002-01-14  5:09                                                             ` Andrew Morton
  2002-01-14  9:24                                                               ` Daniel Phillips
  2002-01-14  5:34                                                             ` yodaiken
  2002-01-14  8:24                                                             ` Arjan van de Ven
  2 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-14  5:09 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Arjan van de Ven, Roman Zippel, linux-kernel

Daniel Phillips wrote:
> 
> On January 13, 2002 04:36 pm, Arjan van de Ven wrote:
> > On Sun, Jan 13, 2002 at 04:18:29PM +0100, Roman Zippel wrote:
> >
> > > What somehow got lost in this discussion, that both patches don't
> > > necessarily conflict with each other, they both attack the same problem
> > > with different approaches, which complement each other. I prefer to get
> > > the best of both patches.
> >
> > If you do this (and I've seen the results of doing both at once vs only
> > > either of them vs pure) then there's NO benefit for the preemption left.
> 
> Sorry, that's incorrect.  I stated why earlier in this thread and akpm signed
> off on it.  With preempt you get ASAP (i.e., as soon as the outermost
> spinlock is done) process scheduling.  With hand-coded scheduling points you
> get 'as soon as it happens to hit a scheduling point'.

With preempt it's "as soon as you hit a lock-break point".  They're equivalent,
for the inside-lock case, which is where most of the problems and complexity
lie.

> That is not the only benefit, just the most obvious one.

I'd say the most obvious benefit of preempt is that it catches some
of the cases which the explicit schedules do not - the stuff which
the developer didn't test for, and which is outside locks.

How useful this is, is moot.

But we can *make* it useful.  I believe that internal preemption is
the foundation to improve 2.5 kernel latency.  But first we need
consensus that we *want* linux to be a low-latency kernel.

Do we have that?

If we do, then as I've said before, holding a lock for more than N milliseconds
becomes a bug to be fixed.  We can put tools in the hands of testers to
locate those bugs.  Easy.
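
As a rough user-space illustration of such a tool (not kernel code, and not
taken from any existing patch), a lock wrapper can time each hold and record
the ones that exceed a limit. The class name, the limit, and the injectable
clock are all assumptions made for the sketch.

```python
import threading
import time


class TimedLock:
    """Lock wrapper that records holds longer than limit_ms milliseconds."""

    def __init__(self, name, limit_ms, clock=time.monotonic):
        self._lock = threading.Lock()
        self._clock = clock            # injectable so tests can fake time
        self.name = name
        self.limit_s = limit_ms / 1000.0
        self.violations = []           # list of (lock name, seconds held)

    def __enter__(self):
        self._lock.acquire()
        self._t0 = self._clock()       # start timing once the lock is ours
        return self

    def __exit__(self, exc_type, exc, tb):
        held = self._clock() - self._t0
        if held > self.limit_s:
            # Recorded while still holding the lock, so appends never race.
            self.violations.append((self.name, held))
        self._lock.release()
        return False                   # never swallow exceptions


if __name__ == "__main__":
    lock = TimedLock("demo_lock", limit_ms=1)
    with lock:
        time.sleep(0.005)              # deliberately hold for about 5 ms
    print(lock.violations)
```

A tester would run a workload and then look at `violations` to find the
offending call sites; a real kernel tool would need a cheap timestamp source
and a way to report the caller, which this sketch does not attempt.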

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:09                                                             ` Andrew Morton
@ 2002-01-14  9:24                                                               ` Daniel Phillips
  0 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14  9:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Arjan van de Ven, Roman Zippel, linux-kernel

On January 14, 2002 06:09 am, Andrew Morton wrote:
> Daniel Phillips wrote:
> I believe that internal preemption is
> the foundation to improve 2.5 kernel latency.  But first we need
> consensus that we *want* linux to be a low-latency kernel.
> 
> Do we have that?

You have it from me, for what it's worth ;-)

> If we do, then as I've said before, holding a lock for more than N
> milliseconds becomes a bug to be fixed.  We can put tools in the hands of 
> testers to locate those bugs.  Easy.

Perhaps not a bug, but bad-acting.  Just as putting a huge object on the 
stack is not necessarily a bug, but deserves a quick larting nonetheless.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:03                                                           ` Daniel Phillips
  2002-01-14  5:09                                                             ` Andrew Morton
@ 2002-01-14  5:34                                                             ` yodaiken
  2002-01-14 11:14                                                               ` Roman Zippel
                                                                                 ` (2 more replies)
  2002-01-14  8:24                                                             ` Arjan van de Ven
  2 siblings, 3 replies; 351+ messages in thread
From: yodaiken @ 2002-01-14  5:34 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, Jan 14, 2002 at 06:03:43AM +0100, Daniel Phillips wrote:
> On January 13, 2002 04:36 pm, Arjan van de Ven wrote:
> > On Sun, Jan 13, 2002 at 04:18:29PM +0100, Roman Zippel wrote:
> > 
> > > What somehow got lost in this discussion, that both patches don't
> > > necessarily conflict with each other, they both attack the same problem
> > > with different approaches, which complement each other. I prefer to get
> > > the best of both patches.
> > 
> > If you do this (and I've seen the results of doing both at once vs only
> > either of them vs pure) then there's NO benefit for the preemption left.
> 
> Sorry, that's incorrect.  I stated why earlier in this thread and akpm signed 
> off on it.  With preempt you get ASAP (i.e., as soon as the outermost 
> spinlock is done) process scheduling.  With hand-coded scheduling points you 
> get 'as soon as it happens to hit a scheduling point'.
> 
> That is not the only benefit, just the most obvious one.

My understanding of this situation is as follows:
	The pure preempt measurements show some improvement on synthetic
	latency benchmarks that have not been shown to have any relationship
	to any real application

	The LL measurements show _better_ results on similar benchmarks.

	Some people find preempt improves "feel"
	Some people find LL improves "feel"

	The interactions of these improvements with Ingo's scheduler, aa mm, and
	other recent patches are exceptionally murky

	We have one benchmark that shows that kernel compiles run on different 
	untarred trees show a slight advantage for preempt+Ingo via some
	unknown mechanism. This benchmark, aside from its dubious repeatability 
	tests something that seems to have no relationship to _anything_  at all
	by running a huge number of compile processes.

	Nobody has answered my question about the conflict between SMP per-cpu caching
	and preempt. Since NUMA is apparently the future of MP in the PC world and
	the future of Linux servers, it's interesting to consider this tradeoff.
	
	Nobody has answered the question about how to make sure all processes
	make progress with preempt.

	Nobody has offered a single benchmark of actual application code benefits
	from either preempt or LL.

	Nobody has clearly explained how to avoid what I claim to be the inevitable 
	result of preempt -- priority inheritance locks (not just semaphores).
	What we have is some "we'll figure that out when we get to it".

	It's not even clear how preempt is supposed to interact with SCHED_FIFO.


As for your "most obvious" "benefit": it's neither obvious that it happens
nor obvious that it is a benefit.  According to the measurements I've seen, Andrew
reduces latency _more_ than preempt. Andrew's argument, as I understand it, is that
the longest latencies are within spinlocks anyway, so speeding up preemption
outside those locks misses the problem. If he is correct, then even if you are correct,
it doesn't matter: preempt is reducing already short latencies.

BTW: there is a detailed explanation of how priority inherit works in Solaris in the
UNIX Internals book. It's worth reading and thinking about.

I'm not at all sure that putting preempt into 2.5 is a bad idea. I think that 2.4
has a long lifetime ahead of it and the debacle that will follow putting preempt into 2.5
will eventually discredit the entire idea for at least a year or two.
But I think that there are some much more important scheduling issues that are being ignored to
"improve the feel" of DVD playing. The key one is some idea of being able to assure processes
of some rate of progress.  This is not classical RT, but it is important to multimedia and 
databases and also to some applications we are interested in looking at. 




---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:34                                                             ` yodaiken
@ 2002-01-14 11:14                                                               ` Roman Zippel
  2002-01-14 11:47                                                                 ` Alan Cox
                                                                                   ` (2 more replies)
  2002-01-14 12:17                                                               ` Momchil Velikov
  2002-01-14 15:08                                                               ` Russ Leighton
  2 siblings, 3 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 11:14 UTC (permalink / raw)
  To: yodaiken; +Cc: Daniel Phillips, Arjan van de Ven, linux-kernel

Hi,

On Sun, 13 Jan 2002 yodaiken@fsmlabs.com wrote:

> 	Nobody has answered my question about the conflict between SMP per-cpu caching
> 	and preempt. Since NUMA is apparently the future of MP in the PC world and
> 	the future of Linux servers, it's interesting to consider this tradeoff.

Preempt is a UP feature so far.

> 	Nobody has answered the question about how to make sure all processes
> 	make progress with preempt.

The same way as without preempt.

> 	Nobody has clearly explained how to avoid what I claim to be the inevitable
> 	result of preempt -- priority inheritance locks (not just semaphores).
> 	What we have is some "we'll figure that out when we get to it".

So far you haven't given any reason why preempt should lead to this.
(If I missed something, please explain it in a way a mere mortal can
understand.)

> 	It's not even clear how preempt is supposed to interact with SCHED_FIFO.

The same way as without preempt.

More of other FUD deleted, Victor, could you please stop this?

bye, Roman


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:14                                                               ` Roman Zippel
@ 2002-01-14 11:47                                                                 ` Alan Cox
  2002-01-14 12:00                                                                   ` Roman Zippel
  2002-01-14 13:38                                                                 ` yodaiken
  2002-01-14 14:07                                                                 ` Guest section DW
  2 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-14 11:47 UTC (permalink / raw)
  To: Roman Zippel; +Cc: yodaiken, Daniel Phillips, Arjan van de Ven, linux-kernel

> More of other FUD deleted, Victor, could you please stop this?

Insulting people won't make problems go away Roman.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:47                                                                 ` Alan Cox
@ 2002-01-14 12:00                                                                   ` Roman Zippel
  2002-01-14 12:27                                                                     ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 12:00 UTC (permalink / raw)
  To: Alan Cox; +Cc: yodaiken, Daniel Phillips, Arjan van de Ven, linux-kernel

Hi,

On Mon, 14 Jan 2002, Alan Cox wrote:

> > More of other FUD deleted, Victor, could you please stop this?
>
> Insulting people won't make problems go away Roman.

I'm really trying to avoid this. I'm more than happy to discuss
theoretical or practical problems _if_ they are backed by arguments;
the latter are very thin with Victor. Making pointless claims only triggers
the above reaction. If I really did miss a major argument so far, I will
publicly apologize.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 12:00                                                                   ` Roman Zippel
@ 2002-01-14 12:27                                                                     ` Alan Cox
  2002-01-14 13:39                                                                       ` Roman Zippel
  2002-01-14 20:02                                                                       ` Andrew Morton
  0 siblings, 2 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-14 12:27 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, yodaiken, Daniel Phillips, Arjan van de Ven,
	linux-kernel

> I'm really trying to avoid this, I'm more than happy to discuss
> theoretical or practical problems _if_ they are backed by arguments,
> latter are very thin with Victor. Making pointless claims only triggers
> above reaction. If I did really miss a major argument so far, I will
> publicly apologize.

You seem to be missing the fact that latency guarantees only work if you
can make progress. If a low priority process is pre-empted owning a
resource (quite likely) then you won't get your good latency. To
handle those cases you get into priority boosting, and all sorts of lock
complexity - so that the task that owns the resource temporarily can borrow
your priority in order that you can make progress at your needed speed.
That gets horrendously complex, and you get huge chains of priority
dependencies including hardware level ones.

The low latency patches don't make that problem go away, but they achieve
equivalent real world latencies up to at least the point you have to do
priority handling of that kind. 
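
For readers who have not met the idea, the "borrow your priority" scheme
described above looks roughly like this for a single lock. It is a
deliberately small model with hypothetical names, where a larger number
means a higher priority. It does not handle the chained case (an owner that
is itself blocked on another lock), which is exactly where the complexity
mentioned above comes from.

```python
class Task:
    def __init__(self, name, prio):
        self.name = name
        self.base_prio = prio   # priority the task was given
        self.prio = prio        # effective priority, may be boosted


class PIMutex:
    """Single-lock priority inheritance model (no lock chains)."""

    def __init__(self):
        self.owner = None
        self.waiters = []

    def lock(self, task):
        """Return True if task got the lock, False if it must wait."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        if task.prio > self.owner.prio:
            # The owner temporarily borrows the waiter's priority.
            self.owner.prio = task.prio
        return False

    def unlock(self):
        """Drop the boost and hand the lock to the highest-priority waiter."""
        self.owner.prio = self.owner.base_prio
        if self.waiters:
            self.waiters.sort(key=lambda t: t.prio, reverse=True)
            self.owner = self.waiters.pop(0)
        else:
            self.owner = None
        return self.owner


if __name__ == "__main__":
    low, high = Task("low", 1), Task("high", 99)
    m = PIMutex()
    m.lock(low)
    m.lock(high)                 # high blocks, low is boosted
    print(low.prio)              # effective priority while boosted
    print(m.unlock().name)       # lock passes to the waiter
```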

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 12:27                                                                     ` Alan Cox
@ 2002-01-14 13:39                                                                       ` Roman Zippel
  2002-01-14 14:35                                                                         ` Alan Cox
  2002-01-14 14:35                                                                         ` Rik van Riel
  2002-01-14 20:02                                                                       ` Andrew Morton
  1 sibling, 2 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 13:39 UTC (permalink / raw)
  To: Alan Cox; +Cc: yodaiken, Daniel Phillips, Arjan van de Ven, linux-kernel

Hi,

On Mon, 14 Jan 2002, Alan Cox wrote:

> You seem to be missing the fact that latency guarantees only work if you
> can make progress. If a low priority process is pre-empted owning a
> resource (quite likely) then you won't get your good latency. To
> handle those cases you get into priority boosting, and all sorts of lock
> complexity - so that the task that owns the resource temporarily can borrow
> your priority in order that you can make progress at your needed speed.
> That gets horrendously complex, and you get huge chains of priority
> dependencies including hardware level ones.

Any ll approach so far only addresses a single type of latency: the time
from waking up an important process until it really gets the cpu. What is
not handled by any patch are i/o latencies, meaning the average time to
get access to a specific resource. (To be exact, breaking up locks does of
course modify i/o latencies, but that's more a side effect.)
I/O latencies are only relevant for this discussion insofar as we need to
verify they are not overly harmed by improving scheduling latencies. Preempting
does not modify the behaviour of the scheduler; all it does is increase
the scheduling frequency. This means it can happen that a low priority
task locks a resource for a longer time, because it's interrupted by
another task. Nevertheless, the current scheduler guarantees every process
gets its share of the cpu time(*), so the low priority task will continue
and release the resource within a guaranteed amount of time.
So the worst behaviour I see is that on a loaded system, a low priority
task can hold up another task. If that task happens to be our interactive
task, the interactivity is of course gone. But this problem is not really
new, as we have no guarantees regarding i/o latencies. So everyone using
any patch should be aware that it's not a magical tool, and that to get
better scheduling latencies one has to trade something else. But so far
I haven't seen any evidence that it makes something else much worse.

(*) This of course assumes accurate cpu time accounting, but I mentioned
this problem before. On the other hand it's also fixable; the tickless
patch looks most interesting in this regard.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:39                                                                       ` Roman Zippel
@ 2002-01-14 14:35                                                                         ` Alan Cox
  2002-01-14 14:30                                                                           ` Roman Zippel
  2002-01-14 14:35                                                                         ` Rik van Riel
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-14 14:35 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, yodaiken, Daniel Phillips, Arjan van de Ven,
	linux-kernel

> So the worst behaviour I see is that on a loaded system, a low priority
> task can hold up another task, if that task should be our interactive
> task, the interactivity is of course gone. But this problem is not really
> new, as we have no guarantees regarding i/o latencies. So everyone using
> any patch should be aware of that it's not a magical tool and for getting
> better scheduling latencies, one has to trade something else, but so far
> I haven't seen any evidence that it makes something else much worse.

The issue is that it doesn't make anything better. It's more complex than ll
but gains nothing.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 14:35                                                                         ` Alan Cox
@ 2002-01-14 14:30                                                                           ` Roman Zippel
  0 siblings, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 14:30 UTC (permalink / raw)
  To: Alan Cox; +Cc: yodaiken, Daniel Phillips, Arjan van de Ven, linux-kernel

Hi,

On Mon, 14 Jan 2002, Alan Cox wrote:

> It doesn't make anything better is the issue. It's more complex than ll but
> gains nothing

Please see my previous mail about maintenance costs.

bye, Roman


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:39                                                                       ` Roman Zippel
  2002-01-14 14:35                                                                         ` Alan Cox
@ 2002-01-14 14:35                                                                         ` Rik van Riel
  2002-01-14 16:19                                                                           ` Roman Zippel
  2002-01-14 20:05                                                                           ` Robert Love
  1 sibling, 2 replies; 351+ messages in thread
From: Rik van Riel @ 2002-01-14 14:35 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, yodaiken, Daniel Phillips, Arjan van de Ven,
	linux-kernel

On Mon, 14 Jan 2002, Roman Zippel wrote:

> Any ll approach so far only addresses a single type of latency - the
> time from waking up an important process until it really gets the cpu.
> What is not handled by any patch are i/o latencies, that means the
> average time to get access to a specific resource.

OK, suppose you have three tasks.

A is a SCHED_FIFO task
B is a nice 0 SCHED_OTHER task
C is a nice +19 SCHED_OTHER task

Task B is your standard CPU hog, running all the time, task C has
grabbed  an inode semaphore (no spinlock), task A wakes up, preempts
task C, tries to grab the inode semaphore and goes back to sleep.

Now task A has to wait for task B to give up the CPU before task C
can run again and release the semaphore.

Without preemption task C would not have been preempted and it would
have released the lock much sooner, meaning task A could have gotten
the resource earlier.

Using the low latency patch we'd insert some smart code into the
algorithm so task A also releases the lock before rescheduling.

Before you say this thing never happens in practice, I ran into
this thing in real life with the SCHED_IDLE patch. In fact, this
problem was so severe it convinced me to abandon SCHED_IDLE ;))
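
To put rough numbers on the scenario, here is a toy single-CPU model. It
assumes (my simplification, not the real 2.4 scheduler) that once A has
blocked, B and C alternate in fixed slices with B going first, and that
without preemption C simply runs on until it drops the semaphore. The slice
lengths are invented.

```python
def wait_for_semaphore(c_work, preemptible, b_slice=10, c_slice=1):
    """Ticks task A waits before C releases the inode semaphore.

    c_work:      CPU ticks C still needs before it releases the semaphore
    preemptible: True if A's wake-up preempts C inside the kernel
    b_slice:     ticks the nice 0 hog B runs per round (assumed)
    c_slice:     ticks the nice +19 task C runs per round (assumed)
    """
    if not preemptible:
        # C is not interrupted, so it finishes its work and releases.
        return c_work

    elapsed = 0
    remaining = c_work
    while remaining > 0:
        elapsed += b_slice                 # B runs first every round
        run = min(c_slice, remaining)      # then C gets its small share
        elapsed += run
        remaining -= run
    return elapsed


if __name__ == "__main__":
    print(wait_for_semaphore(3, preemptible=False))
    print(wait_for_semaphore(3, preemptible=True))
```

With these invented slices, three ticks of remaining work for C turn into a
much longer wait for A once C can be preempted while holding the semaphore.
With SCHED_IDLE the share for C drops to zero whenever B is runnable, and the
loop would never terminate.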

regards,

Rik
-- 
"Linux holds advantages over the single-vendor commercial OS"
    -- Microsoft's "Competing with Linux" document

http://www.surriel.com/		http://distro.conectiva.com/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 14:35                                                                         ` Rik van Riel
@ 2002-01-14 16:19                                                                           ` Roman Zippel
  2002-01-14 20:05                                                                           ` Robert Love
  1 sibling, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 16:19 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Alan Cox, yodaiken, Daniel Phillips, Arjan van de Ven,
	linux-kernel

Hi,

On Mon, 14 Jan 2002, Rik van Riel wrote:

> Without preemption task C would not have been preempted and it would
> have released the lock much sooner, meaning task A could have gotten
> the resource earlier.

Define "much sooner". Nobody disputes that low priority tasks can be
delayed; that's actually the purpose of both patches.

> Using the low latency patch we'd insert some smart code into the
> algorithm so task A also releases the lock before rescheduling.

Could you please show me that "smart code"?

> Before you say this thing never happens in practice, I ran into
> this thing in real life with the SCHED_IDLE patch. In fact, this
> problem was so severe it convinced me to abandon SCHED_IDLE ;))

SCHED_IDLE is something completely different from preempt. Rik, do I
really have to explain the difference?

bye, Roman


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 14:35                                                                         ` Rik van Riel
  2002-01-14 16:19                                                                           ` Roman Zippel
@ 2002-01-14 20:05                                                                           ` Robert Love
  1 sibling, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-14 20:05 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Roman Zippel, Alan Cox, yodaiken, Daniel Phillips,
	Arjan van de Ven, linux-kernel

On Mon, 2002-01-14 at 09:35, Rik van Riel wrote:
> OK, suppose you have three tasks.
> 
> A is a SCHED_FIFO task
> B is a nice 0 SCHED_OTHER task
> C is a nice +19 SCHED_OTHER task
> 
> Task B is your standard CPU hog, running all the time, task C has
> grabbed  an inode semaphore (no spinlock), task A wakes up, preempts
> task C, tries to grab the inode semaphore and goes back to sleep.
> 
> Now task A has to wait for task B to give up the CPU before task C
> can run again and release the semaphore.
> 
> Without preemption task C would not have been preempted and it would
> have released the lock much sooner, meaning task A could have gotten
> the resource earlier.
> 
> Using the low latency patch we'd insert some smart code into the
> algorithm so task A also releases the lock before rescheduling.
> 
> Before you say this thing never happens in practice, I ran into
> this thing in real life with the SCHED_IDLE patch. In fact, this
> problem was so severe it convinced me to abandon SCHED_IDLE ;))

This isn't related.  The problem you described can happen nearly as
easily on a non-preemptible system.  We have plenty of semaphores held
across schedules and there is no reason to single out ones that acquire
and release the semaphore in short, non-preemptible, sequences.  We
always have this "problem."

SCHED_IDLE is much different, as you know, because the SCHED_IDLE task
holding the lock can _never_ get scheduled if there is a CPU hog on the
system!  With the preemptive case, we only worry about an increase in
this period, which is at the expense of fairness in running higher
priority tasks.  But I think you know this ...

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 12:27                                                                     ` Alan Cox
  2002-01-14 13:39                                                                       ` Roman Zippel
@ 2002-01-14 20:02                                                                       ` Andrew Morton
  2002-01-14 21:19                                                                         ` Alan Cox
  1 sibling, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-14 20:02 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roman Zippel, yodaiken, Daniel Phillips, Arjan van de Ven,
	linux-kernel

Alan Cox wrote:
> 
> > I'm really trying to avoid this, I'm more than happy to discuss
> > theoretical or practical problems _if_ they are backed by arguments,
> > latter are very thin with Victor. Making pointless claims only triggers
> > above reaction. If I did really miss a major argument so far, I will
> > publicly apologize.
> 
> You seem to be missing the fact that latency guarantees only work if you
> can make progress. If a low priority process is pre-empted owning a
> resource (quite likely) then you won't get your good latency. To
> handle those cases you get into priority boosting, and all sorts of lock
> complexity - so that the task that owns the resource temporarily can borrow
> your priority in order that you can make progress at your needed speed.
> That gets horrendously complex, and you get huge chains of priority
> dependencies including hardware level ones.
> 

It would be useful to define the scope and design guidelines of a "real
time task".   Obviously, if it tries to perform filesystem or network
I/O it can block for a long time.  If it acquires VFS locks it can suffer
bad priority inversion.

I have all along assumed that a well-designed RT application would delegate
all these operations to SCHED_OTHER worker processes, probably via shared
memory/shared mappings.  So in the simplest case, you'd have a SCHED_FIFO
task which talks to the hardware, and which has a helper task which reads
and writes stuff from and to disk.  With sufficient buffering and readahead
to cover the worst case IO latencies.
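
A back-of-the-envelope way to size that buffering: the shared buffer has to
hold at least as much data as the SCHED_FIFO task consumes during the worst
IO stall the helper can suffer. The function below is only that arithmetic;
the name, the block rounding, and the example figures are assumptions.

```python
import math


def min_buffer_bytes(rate_bytes_per_s, worst_io_latency_s, block_size=4096):
    """Smallest block-aligned buffer that rides out one worst-case IO stall."""
    needed = rate_bytes_per_s * worst_io_latency_s
    return math.ceil(needed / block_size) * block_size


if __name__ == "__main__":
    # Example: 16-bit stereo audio at 44.1 kHz, assumed 0.5 s worst-case stall.
    print(min_buffer_bytes(44100 * 4, 0.5))
```

In practice one would add headroom on top of this, since the worst-case
latency is itself only an estimate.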

If this is generally workable, then it means that the areas of possible
priority inversion are quite small - basically device driver read/write
functions.  The main remaining area where priority inversion can 
happen is in the page allocator.  I'm experimenting/thinking about giving
non-SCHED_OTHER tasks a modified form of atomic allocation to defeat this.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 20:02                                                                       ` Andrew Morton
@ 2002-01-14 21:19                                                                         ` Alan Cox
  2002-01-14 21:11                                                                           ` Andrew Morton
  0 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-14 21:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, Roman Zippel, yodaiken, Daniel Phillips,
	Arjan van de Ven, linux-kernel

> I have all along assumed that a well-designed RT application would delegate
> all these operations to SCHED_OTHER worker processes, probably via shared
> memory/shared mappings.  So in the simplest case, you'd have a SCHED_FIFO
> task which talks to the hardware, and which has a helper task which reads
> and writes stuff from and to disk.  With sufficient buffering and readahead
> to cover the worst case IO latencies.

A real RT task has hard guarantees and to all intents and purposes you may
deem the system failed if it ever misses one (arguably if you cannot verify
it will never miss one).

The stuff we care about is things like DVD players which tangle with
sockets, pipes, X11, memory allocation, and synchronization between multiple
hardware devices all running at slightly incorrect clocks.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 21:19                                                                         ` Alan Cox
@ 2002-01-14 21:11                                                                           ` Andrew Morton
  2002-01-14 21:30                                                                             ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-14 21:11 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roman Zippel, yodaiken, Daniel Phillips, Arjan van de Ven,
	linux-kernel

Alan Cox wrote:
> 
> > I have all along assumed that a well-designed RT application would delegate
> > all these operations to SCHED_OTHER worker processes, probably via shared
> > memory/shared mappings.  So in the simplest case, you'd have a SCHED_FIFO
> > task which talks to the hardware, and which has a helper task which reads
> > and writes stuff from and to disk.  With sufficient buffering and readahead
> > to cover the worst case IO latencies.
> 
> A real RT task has hard guarantees and to all intents and purposes you may
> deem the system failed if it ever misses one (arguably if you cannot verify
> it will never miss one).

We know that :)  Here, "RT" means "Linux-RT": something which is non-SCHED_OTHER,
and which we'd prefer didn't completely suck.

> The stuff we care about is things like DVD players which tangle with
> sockets, pipes, X11, memory allocation, and synchronization between multiple
> hardware devices all running at slightly incorrect clocks.

Well, that's my point.  A well-designed DVD player would have two processes.
One which tangles with the sockets, pipes, disks, etc, and which feeds data into
and out of the SCHED_FIFO task via a shared, mlocked memory region.

What I'm trying to develop here is a set of guidelines which will allow
application developers to design these programs with a reasonable
degree of success.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 21:11                                                                           ` Andrew Morton
@ 2002-01-14 21:30                                                                             ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-14 21:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alan Cox, Roman Zippel, yodaiken, Daniel Phillips,
	Arjan van de Ven, linux-kernel

> Well, that's my point.  A well-designed DVD player would have two processes.
> One which tangles with the sockets, pipes, disks, etc, and which feeds data into
> and out of the SCHED_FIFO task via a shared, mlocked memory region.
> 
> What I'm trying to develop here is a set of guidelines which will allow
> application developers to design these programs with a reasonable
> degree of success.

What about the X server 8)

Given 1mS and vague fairness DVD is more than acceptable

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:14                                                               ` Roman Zippel
  2002-01-14 11:47                                                                 ` Alan Cox
@ 2002-01-14 13:38                                                                 ` yodaiken
  2002-01-14 14:40                                                                   ` Roman Zippel
  2002-01-14 14:07                                                                 ` Guest section DW
  2 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-14 13:38 UTC (permalink / raw)
  To: Roman Zippel; +Cc: yodaiken, Daniel Phillips, Arjan van de Ven, linux-kernel

On Mon, Jan 14, 2002 at 12:14:47PM +0100, Roman Zippel wrote:
> Hi,
> 
> On Sun, 13 Jan 2002 yodaiken@fsmlabs.com wrote:
> 
> > 	Nobody has answered my question about the conflict between SMP per-cpu caching
> > 	and preempt. Since NUMA is apparently the future of MP in the PC world and
> > 	the future of Linux servers, it's interesting to consider this tradeoff.
> 
> Preempt is a UP feature so far.

I think this is a sufficient summary of your engineering approach.

 ...

> More of other FUD deleted, Victor, could you please stop this?

I guess that Andrew, Alan, Andrea and I all are raising objections that
you ignore because we  have some kind of shared bias.

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:38                                                                 ` yodaiken
@ 2002-01-14 14:40                                                                   ` Roman Zippel
  0 siblings, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 14:40 UTC (permalink / raw)
  To: yodaiken; +Cc: Daniel Phillips, Arjan van de Ven, linux-kernel

Hi,

On Mon, 14 Jan 2002 yodaiken@fsmlabs.com wrote:

> > > 	Nobody has answered my question about the conflict between SMP per-cpu caching
> > > 	and preempt. Since NUMA is apparently the future of MP in the PC world and
> > > 	the future of Linux servers, it's interesting to consider this tradeoff.
> >
> > Preempt is a UP feature so far.
>
> I think this is a sufficient summary of your engineering approach.

Would you please care to explain, what the hell you want?
Preempt on SMP has more problems than you mention above, so that the scope
of my arguments only included UP. Sorry, if I missed something, but
preempt on SMP is an entirely different discussion.

> > More of other FUD deleted, Victor, could you please stop this?
>
> I guess that Andrew, Alan, Andrea and I all are raising objections that
> you ignore because we  have some kind of shared bias.

No, your sparse use of arguments makes the difference.

bye, Roman


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:14                                                               ` Roman Zippel
  2002-01-14 11:47                                                                 ` Alan Cox
  2002-01-14 13:38                                                                 ` yodaiken
@ 2002-01-14 14:07                                                                 ` Guest section DW
  2 siblings, 0 replies; 351+ messages in thread
From: Guest section DW @ 2002-01-14 14:07 UTC (permalink / raw)
  To: Roman Zippel; +Cc: yodaiken, Daniel Phillips, Arjan van de Ven, linux-kernel

On Mon, Jan 14, 2002 at 12:14:47PM +0100, Roman Zippel wrote:

> On Sun, 13 Jan 2002 yodaiken@fsmlabs.com wrote:
> 
> > 	Nobody has answered the question about how to make sure all processes
> > 	make progress with preempt.
> 
> The same way as without preempt.
> 
> More of other FUD deleted, Victor, could you please stop this?

Roman, Victor asks meaningful questions.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:34                                                             ` yodaiken
  2002-01-14 11:14                                                               ` Roman Zippel
@ 2002-01-14 12:17                                                               ` Momchil Velikov
  2002-01-14 12:45                                                                 ` Oliver Neukum
  2002-01-14 13:45                                                                 ` yodaiken
  2002-01-14 15:08                                                               ` Russ Leighton
  2 siblings, 2 replies; 351+ messages in thread
From: Momchil Velikov @ 2002-01-14 12:17 UTC (permalink / raw)
  To: yodaiken; +Cc: Daniel Phillips, Arjan van de Ven, Roman Zippel, linux-kernel

>>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
yodaiken> 	It's not even clear how preempt is supposed to interact with SCHED_FIFO.

How so ? The POSIX specification is not clear enough or it is not to be followed ?

Regards,
-velco


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 12:17                                                               ` Momchil Velikov
@ 2002-01-14 12:45                                                                 ` Oliver Neukum
  2002-01-14 16:32                                                                   ` Momchil Velikov
  2002-01-14 13:45                                                                 ` yodaiken
  1 sibling, 1 reply; 351+ messages in thread
From: Oliver Neukum @ 2002-01-14 12:45 UTC (permalink / raw)
  To: Momchil Velikov, yodaiken
  Cc: Daniel Phillips, Arjan van de Ven, Roman Zippel, linux-kernel

On Monday 14 January 2002 13:17, Momchil Velikov wrote:
> >>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
>
> yodaiken> 	It's not even clear how preempt is supposed to interact with
> SCHED_FIFO.
>
> How so ? The POSIX specification is not clear enough or it is not to be
> followed ?

You can have an rt task block on a lock held by a normal task that was 
preempted by a rt task of lower priority. The same problem as with the 
sched_idle patches.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 12:45                                                                 ` Oliver Neukum
@ 2002-01-14 16:32                                                                   ` Momchil Velikov
  2002-01-14 17:43                                                                     ` Alan Cox
  2002-01-14 18:04                                                                     ` Oliver Neukum
  0 siblings, 2 replies; 351+ messages in thread
From: Momchil Velikov @ 2002-01-14 16:32 UTC (permalink / raw)
  To: Oliver.Neukum
  Cc: yodaiken, Daniel Phillips, Arjan van de Ven, Roman Zippel,
	linux-kernel

>>>>> "Oliver" == Oliver Neukum <520047054719-0001@t-online.de> writes:

Oliver> On Monday 14 January 2002 13:17, Momchil Velikov wrote:
>> >>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
>> 
yodaiken> It's not even clear how preempt is supposed to interact with
>> SCHED_FIFO.
>> 
>> How so ? The POSIX specification is not clear enough or it is not to be
>> followed ?

Oliver> You can have an rt task block on a lock held by a normal task that was 
Oliver> preempted by a rt task of lower priority. The same problem as with the 
Oliver> sched_idle patches.

This can happen with a non-preemptible kernel too. And it has nothing to
do with scheduling policy.



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 16:32                                                                   ` Momchil Velikov
@ 2002-01-14 17:43                                                                     ` Alan Cox
  2002-01-14 22:34                                                                       ` Momchil Velikov
  2002-01-14 18:04                                                                     ` Oliver Neukum
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-14 17:43 UTC (permalink / raw)
  To: Momchil Velikov
  Cc: Oliver.Neukum, yodaiken, Daniel Phillips, Arjan van de Ven,
	Roman Zippel, linux-kernel

> Oliver> You can have an rt task block on a lock held by a normal task that was 
> Oliver> preempted by a rt task of lower priority. The same problem as with the 
> Oliver> sched_idle patches.
> 
> This can happen with a non-preemptible kernel too. And it has nothing to
> do with scheduling policy.

So why bother adding pre-emption. As you keep saying - it doesn't
gain anything

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 17:43                                                                     ` Alan Cox
@ 2002-01-14 22:34                                                                       ` Momchil Velikov
  2002-01-14 22:46                                                                         ` yodaiken
  0 siblings, 1 reply; 351+ messages in thread
From: Momchil Velikov @ 2002-01-14 22:34 UTC (permalink / raw)
  To: Alan Cox
  Cc: Oliver.Neukum, yodaiken, Daniel Phillips, Arjan van de Ven,
	Roman Zippel, linux-kernel

>>>>> "Alan" == Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

Oliver> You can have an rt task block on a lock held by a normal task that was 
Oliver> preempted by a rt task of lower priority. The same problem as with the 
Oliver> sched_idle patches.
>> 
>> This can happen with a non-preemptible kernel too. And it has nothing to
>> do with scheduling policy.

Alan> So why bother adding pre-emption. As you keep saying - it doesn't
Alan> gain anything

Nope. I don't. I said (at least in the above) it didn't hurt.

One can consider a non-preemptible kernel as a special kind of
priority inversion, preemptible kernel will eliminate _that_ case of
priority inversion.

Regards,
-velco


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 22:34                                                                       ` Momchil Velikov
@ 2002-01-14 22:46                                                                         ` yodaiken
       [not found]                                                                           ` <876664vxm8.fsf@fadata.bg>
  0 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-14 22:46 UTC (permalink / raw)
  To: Momchil Velikov
  Cc: Alan Cox, Oliver.Neukum, yodaiken, Daniel Phillips,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Tue, Jan 15, 2002 at 12:34:01AM +0200, Momchil Velikov wrote:
> One can consider a non-preemptible kernel as a special kind of
> priority inversion, preemptible kernel will eliminate _that_ case of
> priority inversion.

The problem here is that priority means something very different in 
a time-shared system than in a hard real-time system. And even in real-time
systems, as Walpole and colleagues have pointed out, priority doesn't
really capture much of what is needed for good scheduling.

In a general purpose system,  priorities are dynamic and "fair". 
The priority of even the lowliest process increases while it waits
for time. In a raw real-time system, the low priority process can sit
forever and should wait until no higher priority thread needs the 
processor. So it's absurd to talk of priority inversion in a non RT
system. When a low priority process is delaying a higher priority task
for reasons of fairness, increased throughput, or any other valid
objective, that is not a scheduling error.

> 
> Regards,
> -velco

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com

^ permalink raw reply	[flat|nested] 351+ messages in thread

[parent not found: <876664vxm8.fsf@fadata.bg>]

* Re: [2.4.17/18pre] VM and swap - it's really unusable
       [not found]                                                                           ` <876664vxm8.fsf@fadata.bg>
@ 2002-01-15 12:31                                                                             ` yodaiken
  2002-01-20 10:31                                                                               ` george anzinger
  0 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-15 12:31 UTC (permalink / raw)
  To: Momchil Velikov
  Cc: yodaiken, Alan Cox, Oliver.Neukum, Daniel Phillips,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Tue, Jan 15, 2002 at 10:15:11AM +0200, Momchil Velikov wrote:
> >>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
> 
> yodaiken> On Tue, Jan 15, 2002 at 12:34:01AM +0200, Momchil Velikov wrote:
> >> One can consider a non-preemptible kernel as a special kind of
> >> priority inversion, preemptible kernel will eliminate _that_ case of
> >> priority inversion.
> 
> yodaiken> The problem here is that priority means something very different in 
> yodaiken> a time-shared system than in a hard real-time system. And even in real-time
> yodaiken> systems, as Walpole and colleagues have pointed out, priority doesn't
> yodaiken> really capture much of what is needed for good scheduling.
> 
> Well, maybe there are other policies (notably static scheduling ;),
> which may be preferable in one or other case, anyway, I personally
> tend to think rate-monotonic scheduling is quite adequate in practice.

That's not what we see in real RT applications.

> Of course, anyone is free to prove me wrong by implementing earliest
> deadline or slack time scheduler in the kernel.

My impression is that these are of limited use in applications.

> 
> yodaiken> In a general purpose system,  priorities are dynamic and "fair". 
> yodaiken> The priority of even the lowliest process increases while it waits
> yodaiken> for time. In a raw real-time system, the low priority process can sit
> yodaiken> forever and should wait until no higher priority thread needs the 
> yodaiken> processor. 
> 
> *nod*
> 
> yodaiken>             So it's absurd to talk of priority inversion in a non RT
> yodaiken> system. When a low priority process is delaying a higher priority task
> yodaiken> for reasons of fairness, increased throughput, or any other valid
> yodaiken> objective, that is not a scheduling error.
> 
> What is a non real-time system?  E.g. are a desktop playing ogg or a
> server serving multimedia content over ATM, real-time systems?  The
> point is that one can view the system as real-time or not depending on
> the dominant _current_ load.  A system may have real-time load now and

I don't think so, and certainly the straightforward priority based model
of traditional RT seems grossly wrong. What's the highest priority process - 
is it the CD player or is it the niced sendmail that is running in the
background? The fact is we require that _both_ make progress. Therefore
at some time the sendmail process must have effectively higher priority
than the CD player. If we then factor in the many kernel threads that may
be running at any time, the calculation becomes even more problematic. 
In RTLinux we have a much simpler environment and we can use the rule:
	The highest priority runnable task should be the running task
	within some T and the OS gets better as T goes to 0.
If something starves, that's the user problem - our users are programmers
who we want to provide with as much control as possible. We certainly
do not want to have a situation where the stop command to the robot arm
is delayed because the OS needs to run bdflush ...


For Linux, even Linux doing the critical work of playing DVDs, this is 
not so simple. The target users are not programmers, liveness and 
responsiveness remain important criteria ... 
It's clear that much of the problem when people discuss this issue is
lack of clear goals. "Preemption" is a controversial implementation 
technique, not a goal in itself - I hope. Even "low latency" doesn't
really capture the intended goal - "low latency" to do what?  If it's
simply "low latency" switch to highest priority process, that's not
necessarily anything that is going to be useful in itself. The
"resource reservation" work at CMU seems to me to be way off the point
too. The OGI people have some interesting thoughts about this stuff, but
I don't think they've come close to working out a comprehensive model.

What I think we need is a kind of interval real-time scheduling.
Something like:
	The system has a basic timing period of N milliseconds where
	N is at least 500 and probably more.

	Over a N millisecond period each process gets a full scheduling quantum
	and, if it requests, a full I/O quantum.
	For a niced process there is some calculated interval greater than
	N. An I/O quantum should correspond somehow to a rate of I/O.

	RTLIMIT is used to set the max number of processes allowed to start
	and this determines the computation length of "one quantum"

There is this cliche about RT that RT is about deadlines, not speed, but I think
that's only partially true. To say that every task will run during a 1 second period
with today's PC/Server technology is fundamentally different than to say that 
every task will run during a 10 millisecond period.

> a non real-time one an hour later, both occasionally intermixed with
> tasks of the other class. It is unreasonable to say "This system is
> non-realtime so use that kernel and when you want to play videos,
> please reboot to something else, for example XX-Linux".

Yes, but it's not unreasonable to pop up a window and ask the user if
he/she wants to have the scheduling rates of other applications
stepped down to run the DVD player. 
	The Oracle logging server is designed to run at 1 second periods
	increasing the responsiveness of your current process will 
	reduce that to a 2 second period.
	Email delivery will be delayed up to 1 minute
	...
	YES NO

and so we can follow the most important OS design technique of all time:
	push complexity to user space.
 

---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-15 12:31                                                                             ` yodaiken
@ 2002-01-20 10:31                                                                               ` george anzinger
  2002-01-20 14:34                                                                                 ` yodaiken
  0 siblings, 1 reply; 351+ messages in thread
From: george anzinger @ 2002-01-20 10:31 UTC (permalink / raw)
  To: yodaiken
  Cc: Momchil Velikov, Alan Cox, Oliver.Neukum, Daniel Phillips,
	Arjan van de Ven, Roman Zippel, linux-kernel

yodaiken@fsmlabs.com wrote:
> 
> On Tue, Jan 15, 2002 at 10:15:11AM +0200, Momchil Velikov wrote:
> > >>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
> >
> > yodaiken> On Tue, Jan 15, 2002 at 12:34:01AM +0200, Momchil Velikov wrote:
> > >> One can consider a non-preemptible kernel as a special kind of
> > >> priority inversion, preemptible kernel will eliminate _that_ case of
> > >> priority inversion.
> >
> > yodaiken> The problem here is that priority means something very different in
> > yodaiken> a time-shared system than in a hard real-time system. And even in real-time
> > yodaiken> systems, as Walpole and colleagues have pointed out, priority doesn't
> > yodaiken> really capture much of what is needed for good scheduling.
> >
> > Well, maybe there are other policies (notably static scheduling ;),
> > which may be preferable in one or other case, anyway, I personally
> > tend to think rate-monotonic scheduling is quite adequate in practice.
> 
> That's not what we see in real RT applications.
> 
> > Of course, anyone is free to prove me wrong by implementing earliest
> > deadline or slack time scheduler in the kernel.
> 
> My impression is that these are of limited use in applications.
> 
> >
> > yodaiken> In a general purpose system,  priorities are dynamic and "fair".
> > yodaiken> The priority of even the lowliest process increases while it waits
> > yodaiken> for time. In a raw real-time system, the low priority process can sit
> > yodaiken> forever and should wait until no higher priority thread needs the
> > yodaiken> processor.
> >
> > *nod*
> >
> > yodaiken>             So it's absurd to talk of priority inversion in a non RT
> > yodaiken> system. When a low priority process is delaying a higher priority task
> > yodaiken> for reasons of fairness, increased throughput, or any other valid
> > yodaiken> objective, that is not a scheduling error.
> >
> > What is a non real-time system?  E.g. are a desktop playing ogg or a
> > server serving multimedia content over ATM, real-time systems?  The
> > point is that one can view the system as real-time or not depending on
> > the dominant _current_ load.  A system may have real-time load now and
> 
> I don't think so, and certainly the straightforward priority based model
> of traditional RT seems grossly wrong. What's the highest priority process -
> is it the CD player or is it the niced sendmail that is running in the
> background? The fact is we require that _both_ make progress. Therefore
> at some time the sendmail process must have effectively higher priority
> than the CD player. If we then factor in the many kernel threads that may
> be running at any time, the calculation becomes even more problematic.
> In RTLinux we have a much simpler environment and we can use the rule:
>         The highest priority runnable task should be the running task
>         within some T and the OS gets better as T goes to 0.
> If something starves, that's the user problem - our users are programmers
> who we want to provide with as much control as possible. We certainly
> do not want to have a situation where the stop command to the robot arm
> is delayed because the OS needs to run bdflush ...
> 
> For Linux, even Linux doing the critical work of playing DVDs, this is
> not so simple. The target users are not programmers, liveness and
> responsiveness remain important criteria ...
> It's clear that much of the problem when people discuss this issue is
> lack of clear goals. "Preemption" is a controversial implementation
> technique, not a goal in itself - I hope. Even "low latency" doesn't
> really capture the intended goal - "low latency" to do what?  If it's
> simply "low latency" switch to highest priority process, that's not
> necessarily anything that is going to be useful in itself. The
> "resource reservation" work at CMU seems to me to be way off the point
> too. The OGI people have some interesting thoughts about this stuff, but
> I don't think they've come close to working out a comprehensive model.
> 
> What I think we need is a kind of interval real-time scheduling.
> Something like:
>         The system has a basic timing period of N milliseconds where
>         N is at least 500 and probably more.
> 
>         Over a N millisecond period each process gets a full scheduling quantum
>         and, if it requests, a full I/O quantum.
>         For a niced process there is some calculated interval greater than
>         N. An I/O quantum should correspond somehow to a rate of I/O.
> 
>         RTLIMIT is used to set the max number of processes allowed to start
>         and this determines the computation length of "one quantum"

Have you looked at SCHED_SPORADIC (see 1003.1d-1999)?
> 
> There is this cliche about RT that RT is about deadlines, not speed, but I think
> that's only partially true. To say that every task will run during a 1 second period
> with today's PC/Server technology is fundamentally different than to say that
> every task will run during a 10 millisecond period.
> 
> > a non real-time one an hour later, both occasionally intermixed with
> > tasks of the other class. It is unreasonable to say "This system is
> > non-realtime so use that kernel and when you want to play videos,
> > please reboot to something else, for example XX-Linux".
> 
> Yes, but it's not unreasonable to pop up a window and ask the user if
> he/she wants to have the scheduling rates of other applications
> stepped down to run the DVD player.
>         The Oracle logging server is designed to run at 1 second periods
>         increasing the responsiveness of your current process will
>         reduce that to a 2 second period.
>         Email delivery will be delayed up to 1 minute
>         ...
>         YES NO
> 
> and so we can follow the most important OS design technique of all time:
>         push complexity to user space.
> 
> 
> ---------------------------------------------------------
> Victor Yodaiken
> Finite State Machine Labs: The RTLinux Company.
>  www.fsmlabs.com  www.rtlinux.com
> 

-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-20 10:31                                                                               ` george anzinger
@ 2002-01-20 14:34                                                                                 ` yodaiken
  0 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-20 14:34 UTC (permalink / raw)
  To: george anzinger
  Cc: yodaiken, Momchil Velikov, Alan Cox, Oliver.Neukum,
	Daniel Phillips, Arjan van de Ven, Roman Zippel, linux-kernel

On Sun, Jan 20, 2002 at 02:31:51AM -0800, george anzinger wrote:
> yodaiken@fsmlabs.com wrote:
> > What I think we need is a kind of interval real-time scheduling.
> > Something like:
> >         The system has a basic timing period of N milliseconds where
> >         N is at least 500 and probably more.
> > 
> >         Over a N millisecond period each process gets a full scheduling quantum
> >         and, if it requests, a full I/O quantum.
> >         For a niced process there is some calculated interval greater than
> >         N. An I/O quantum should correspond somehow to a rate of I/O.
> > 
> >         RTLIMIT is used to set the max number of processes allowed to start
> >         and this determines the computation length of "one quantum"
> 
> Have you looked at SCHED_SPORADIC (see 1003.1d-1999)?

Yes, but I don't think it does much - although its description is quite
long and complex.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 16:32                                                                   ` Momchil Velikov
  2002-01-14 17:43                                                                     ` Alan Cox
@ 2002-01-14 18:04                                                                     ` Oliver Neukum
  2002-01-14 20:09                                                                       ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: Oliver Neukum @ 2002-01-14 18:04 UTC (permalink / raw)
  To: Momchil Velikov
  Cc: yodaiken, Daniel Phillips, Arjan van de Ven, Roman Zippel,
	linux-kernel

> >> How so ? The POSIX specification is not clear enough or it is not to be
> >> followed ?
>
> Oliver> You can have an rt task block on a lock held by a normal task that was
> Oliver> preempted by a rt task of lower priority. The same problem as with the
> Oliver> sched_idle patches.
>
> This can happen with a non-preemptible kernel too. And it has nothing to
> do with scheduling policy.

It can happen if you sleep with a lock held.
It can not happen at random points in the code.
Thus there is a relation to preemption in kernel mode.

To cure that problem tasks holding a lock would have to be given
the highest priority of all tasks blocking on that lock. The semaphore
code would get much more complex, even in the successful code path,
which would hurt a lot.

If on the other hand sleeping in kernel mode is explicit, you can simply
give any task being woken up a timeslice and the scheduling requirements
are met. If that should be a problem.
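
To make the priority-inheritance cure concrete, here is a minimal sketch
in a toy model. It is not kernel code: the toy_* names and the
"higher number means higher priority" convention are assumptions made
only for illustration. It shows the rule itself (the holder runs at the
highest priority of any task blocked on the lock) and the bookkeeping
that every acquire, block and release would have to carry.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not kernel code: a higher number means a higher priority. */
struct toy_task {
    int base_prio;  /* priority assigned by the user */
    int eff_prio;   /* priority the scheduler actually uses */
};

struct toy_lock {
    struct toy_task *holder;  /* NULL when the lock is free */
    int max_waiter_prio;      /* highest priority among blocked waiters */
};

/* Take a free lock. Even this uncontended path has to record the holder. */
static void toy_lock_acquire(struct toy_lock *l, struct toy_task *t)
{
    assert(l->holder == NULL);
    l->holder = t;
    l->max_waiter_prio = -1;
}

/* A task blocks on a held lock: boost the holder if the waiter outranks it. */
static void toy_lock_block(struct toy_lock *l, const struct toy_task *waiter)
{
    assert(l->holder != NULL);
    if (waiter->eff_prio > l->max_waiter_prio)
        l->max_waiter_prio = waiter->eff_prio;
    if (l->max_waiter_prio > l->holder->eff_prio)
        l->holder->eff_prio = l->max_waiter_prio;
}

/* Release: the holder drops back to its own priority. */
static void toy_lock_release(struct toy_lock *l)
{
    assert(l->holder != NULL);
    l->holder->eff_prio = l->holder->base_prio;
    l->holder = NULL;
    l->max_waiter_prio = -1;
}
```

The sketch resets the holder to its base priority on release, which is
only right if a task holds one lock at a time. Handling nested locks and
chains of blocked holders is where the extra complexity would come from.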

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 18:04                                                                     ` Oliver Neukum
@ 2002-01-14 20:09                                                                       ` Robert Love
  2002-01-14 20:22                                                                         ` Oliver Neukum
  0 siblings, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-14 20:09 UTC (permalink / raw)
  To: Oliver.Neukum
  Cc: Momchil Velikov, yodaiken, Daniel Phillips, Arjan van de Ven,
	Roman Zippel, linux-kernel

On Mon, 2002-01-14 at 13:04, Oliver Neukum wrote:

> It can happen if you sleep with a lock held.
> It can not happen at random points in the code.
> Thus there is a relation to preemption in kernel mode.
> 
> To cure that problem tasks holding a lock would have to be given
> the highest priority of all tasks blocking on that lock. The semaphore
> code would get much more complex, even in the successful code path,
> which would hurt a lot.

No, this isn't needed.  This same problem would occur without
preemption.  Our semaphores now have locking rules such that we aren't
going to have blatant priority inversion like this (1 holds A needs B, 2
holds B needs A).

When priority inversion begins to become a problem is if we intend to
start turning existing spinlocks into semaphores.  There the locking
rules are weaker, and thus we would need to do priority inheriting.  But
that's not now.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 20:09                                                                       ` Robert Love
@ 2002-01-14 20:22                                                                         ` Oliver Neukum
  2002-01-14 20:36                                                                           ` Robert Love
  0 siblings, 1 reply; 351+ messages in thread
From: Oliver Neukum @ 2002-01-14 20:22 UTC (permalink / raw)
  To: Robert Love
  Cc: Momchil Velikov, yodaiken, Daniel Phillips, Arjan van de Ven,
	Roman Zippel, linux-kernel

On Monday 14 January 2002 21:09, Robert Love wrote:
> On Mon, 2002-01-14 at 13:04, Oliver Neukum wrote:
> > It can happen if you sleep with a lock held.
> > It can not happen at random points in the code.
> > Thus there is a relation to preemption in kernel mode.
> >
> > To cure that problem tasks holding a lock would have to be given
> > the highest priority of all tasks blocking on that lock. The semaphore
> > code would get much more complex, even in the successful code path,
> > which would hurt a lot.
>
> No, this isn't needed.  This same problem would occur without
> preemption.  Our semaphores now have locking rules such that we aren't
> going to have blatant priority inversion like this (1 holds A needs B, 2
> holds B needs A).

No this is a good old deadlock.
The problem with preemption and SCHED_FIFO is, that due to SCHED_FIFO
you have no guarantee that any task will make any progress at all.
Thus a semaphore could basically be held forever.
That can happen without preemption only if you do something that
might block.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 20:22                                                                         ` Oliver Neukum
@ 2002-01-14 20:36                                                                           ` Robert Love
  2002-01-14 22:46                                                                             ` Oliver Neukum
  0 siblings, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-14 20:36 UTC (permalink / raw)
  To: Oliver.Neukum
  Cc: Momchil Velikov, yodaiken, Daniel Phillips, Roman Zippel,
	linux-kernel

On Mon, 2002-01-14 at 15:22, Oliver Neukum wrote:

> > No, this isn't needed.  This same problem would occur without
> > preemption.  Our semaphores now have locking rules such that we aren't
> > going to have blatant priority inversion like this (1 holds A needs B, 2
> > holds B needs A).
> 
> No this is a good old deadlock.
> The problem with preemption and SCHED_FIFO is, that due to SCHED_FIFO
> you have no guarantee that any task will make any progress at all.
> Thus a semaphore could basically be held forever.
> That can happen without preemption only if you do something that
> might block.

Well, semaphores block.  And we have these races right now with
SCHED_FIFO tasks.  I still contend preempt does not change the nature of
the problem and it certainly doesn't introduce a new one.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 20:36                                                                           ` Robert Love
@ 2002-01-14 22:46                                                                             ` Oliver Neukum
  2002-01-15  3:01                                                                               ` george anzinger
  0 siblings, 1 reply; 351+ messages in thread
From: Oliver Neukum @ 2002-01-14 22:46 UTC (permalink / raw)
  To: Robert Love
  Cc: Momchil Velikov, yodaiken, Daniel Phillips, Roman Zippel,
	linux-kernel

> Well, semaphores block.  And we have these races right now with
> SCHED_FIFO tasks.  I still contend preempt does not change the nature of
> the problem and it certainly doesn't introduce a new one.

But it does:

down(&sem);
do_something_that_cannot_block();
up(&sem);

Will stop a SCHED_FIFO task for a definite amount of time. Only
until it returns from the kernel to user space at worst.

If do_something_that_cannot_block() can be preempted, a SCHED_FIFO
task can block indefinitely long on the semaphore, because you have
no guarantee that the scheduler will ever again select the preempted task.
In fact it must never again select the preempted task as long as there's
another runnable SCHED_FIFO task.
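
The scenario can be sketched with a toy "pick next task" function. This
is an illustration under stated assumptions, not the 2.4 scheduler: the
toy_* names are hypothetical, and the only rule modelled is that any
runnable SCHED_FIFO task is chosen ahead of every SCHED_OTHER task.

```c
#include <assert.h>
#include <stddef.h>

enum toy_policy { TOY_OTHER, TOY_FIFO };

struct toy_task {
    enum toy_policy policy;
    int rt_prio;   /* only meaningful for TOY_FIFO; higher wins */
    int runnable;  /* 0 while blocked, e.g. inside down(&sem) */
};

/* Return the index of the task to run next, or -1 if none is runnable. */
static int toy_pick(const struct toy_task *t, size_t n)
{
    int best = -1;
    size_t i;

    assert(t != NULL);
    for (i = 0; i < n; i++) {
        if (!t[i].runnable)
            continue;
        if (best < 0) {
            best = (int)i;
            continue;
        }
        /* Any runnable FIFO task beats every SCHED_OTHER task. */
        if (t[i].policy == TOY_FIFO && t[best].policy == TOY_OTHER)
            best = (int)i;
        /* Among FIFO tasks the higher rt_prio wins; ties keep the earlier one. */
        else if (t[i].policy == TOY_FIFO && t[best].policy == TOY_FIFO &&
                 t[i].rt_prio > t[best].rt_prio)
            best = (int)i;
    }
    return best;
}
```

With a SCHED_OTHER semaphore holder, a runnable SCHED_FIFO task at
priority 10 and a SCHED_FIFO task at priority 20 blocked on the
semaphore, toy_pick() keeps returning the priority 10 task. The holder,
and with it the priority 20 waiter, makes no progress until that task
blocks.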

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 22:46                                                                             ` Oliver Neukum
@ 2002-01-15  3:01                                                                               ` george anzinger
  0 siblings, 0 replies; 351+ messages in thread
From: george anzinger @ 2002-01-15  3:01 UTC (permalink / raw)
  To: Oliver.Neukum
  Cc: Robert Love, Momchil Velikov, yodaiken, Daniel Phillips,
	Roman Zippel, linux-kernel

Oliver Neukum wrote:
> 
> > Well, semaphores block.  And we have these races right now with
> > SCHED_FIFO tasks.  I still contend preempt does not change the nature of
> > the problem and it certainly doesn't introduce a new one.
> 
> But it does:
> 
> down(&sem);
> do_something_that_cannot_block();
> up(&sem);
> 
> Will stop a SCHED_FIFO task for a definite amount of time. Only
> until it returns from the kernel to user space at worst.
> 
> If do_something_that_cannot_block() can be preempted, a SCHED_FIFO
> task can block indefinitely long on the semaphore, because you have
> no guarantee that the scheduler will ever again select the preempted task.
> In fact it must never again select the preempted task as long as there's
> another runnable SCHED_FIFO task.
> 
This is not true, and if it is, it is a scheduler bug.  When a task (any
task) gets preempted it is not moved from the front of its queue, thus,
in this case, the FIFO task will still be the first task at its priority
to run.  Also, it can only be preempted by another real time task of
higher priority.  Now it is possible that that task may block on the
same sem, but this is simple priority inversion and has nothing to do
with the sem holder being FIFO, RR or any thing else.  In other words
preemption does NOT change a FIFO (or any other) task's position in the
dispatch queue.
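
A minimal sketch of that queue rule, as a toy model of a single
SCHED_FIFO priority level. The toy_* names are hypothetical and the
fixed-size array is only for brevity. The point it shows is that waking
up appends to the tail and blocking removes the task, while being
preempted by a higher priority level changes nothing in the queue.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX 8

/* Toy model of one SCHED_FIFO priority level, not the real scheduler. */
struct toy_fifo {
    int task[TOY_MAX];  /* task ids in dispatch order, task[0] is the head */
    size_t len;
};

/* A task that becomes runnable (wakes up) goes to the tail. */
static void toy_fifo_wake(struct toy_fifo *q, int id)
{
    assert(q->len < TOY_MAX);
    q->task[q->len++] = id;
}

/* The task that runs next at this level is always the head. */
static int toy_fifo_head(const struct toy_fifo *q)
{
    assert(q->len > 0);
    return q->task[0];
}

/* A task that blocks leaves the queue; the others keep their order. */
static void toy_fifo_block(struct toy_fifo *q, int id)
{
    size_t i, j;

    for (i = 0; i < q->len; i++) {
        if (q->task[i] == id) {
            for (j = i + 1; j < q->len; j++)
                q->task[j - 1] = q->task[j];
            q->len--;
            return;
        }
    }
}

/* Preemption by a higher priority level: deliberately a no-op here. */
static void toy_fifo_preempt(struct toy_fifo *q)
{
    (void)q;
}
```

After toy_fifo_preempt() the head is unchanged, so the preempted task is
the first to run again at its priority once the higher priority task
blocks or exits.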


-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 12:17                                                               ` Momchil Velikov
  2002-01-14 12:45                                                                 ` Oliver Neukum
@ 2002-01-14 13:45                                                                 ` yodaiken
  2002-01-14 13:48                                                                   ` yodaiken
                                                                                     ` (3 more replies)
  1 sibling, 4 replies; 351+ messages in thread
From: yodaiken @ 2002-01-14 13:45 UTC (permalink / raw)
  To: Momchil Velikov
  Cc: yodaiken, Daniel Phillips, Arjan van de Ven, Roman Zippel,
	linux-kernel

On Mon, Jan 14, 2002 at 02:17:46PM +0200, Momchil Velikov wrote:
> >>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
> yodaiken> 	It's not even clear how preempt is supposed to interact with SCHED_FIFO.
> 
> How so ? The POSIX specification is not clear enough or it is not to be followed ?

POSIX makes no specification of how scheduling classes interact - unless something changed
in the new version.

But more than that, the problem of preemption is much more complex when you have
tasks that do not share the "goodness fade" with everything else. That is, given a
set of SCHED_OTHER processes at time T0, it is reasonable to design the scheduler so
that there is some D so that by time T0+D each process has become the highest priority
and has received cpu up to either a complete time slice or an I/O block. Linux kind of
has this property now, and I believe that making this more robust and easier to analyze
is going to be an enormously important issue.  However, once you add SCHED_FIFO in the
current scheme, this becomes more complex. And with preempt, you cannot even offer the
assurance that once a process gets the cpu it will make _any_ advance at all.
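
As a rough illustration of the bound D, assume plain round-robin with a
fixed slice in place of the real goodness-based selection. That is an
assumption made for the sketch, and the toy_* names are hypothetical.
With n always-runnable SCHED_OTHER tasks, D is n slices. Once a
real-time task may hold the cpu first for an arbitrary time, the same
bound no longer holds.

```c
#include <assert.h>

/* Time (ms) by which each of ntasks always-runnable tasks has received
 * one full slice under plain round-robin: the bound D. */
static long toy_rr_bound_ms(int ntasks, long slice_ms)
{
    assert(ntasks > 0 && slice_ms > 0);
    return (long)ntasks * slice_ms;
}

/* Time (ms) at which the task in position idx (0-based) finishes its
 * first slice when a real-time task first holds the cpu for rt_ms. */
static long toy_first_slice_done_ms(int idx, long slice_ms, long rt_ms)
{
    assert(idx >= 0 && slice_ms > 0 && rt_ms >= 0);
    return rt_ms + (long)(idx + 1) * slice_ms;
}
```

For four tasks and a 50 ms slice D is 200 ms, and the last task finishes
its first slice exactly then. If a SCHED_FIFO task runs for one second
first, the same task finishes at 1200 ms, so D now depends on the
behaviour of the real-time task rather than on n alone.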

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:45                                                                 ` yodaiken
@ 2002-01-14 13:48                                                                   ` yodaiken
  2002-01-14 14:56                                                                   ` Roman Zippel
                                                                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-14 13:48 UTC (permalink / raw)
  To: yodaiken
  Cc: Momchil Velikov, Daniel Phillips, Arjan van de Ven, Roman Zippel,
	linux-kernel

I forgot the line that says: "Oliver pointed out the immediate problem but .."

On Mon, Jan 14, 2002 at 06:45:48AM -0700, yodaiken@fsmlabs.com wrote:
> On Mon, Jan 14, 2002 at 02:17:46PM +0200, Momchil Velikov wrote:
> > >>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
> > yodaiken> 	It's not even clear how preempt is supposed to interact with SCHED_FIFO.
> > 
> > How so ? The POSIX specification is not clear enough or it is not to be followed ?
> 
> POSIX makes no specification of how scheduling classes interact - unless something changed
> in the new version.
> 
> But more than that, the problem of preemption is much more complex when you have
> tasks that do not share the "goodness fade" with everything else. That is, given a
> set of SCHED_OTHER processes at time T0, it is reasonable to design the scheduler so
> that there is some D so that by time T0+D each process has become the highest priority
> and has received cpu up to either a complete time slice or an I/O block. Linux kind of
> has this property now, and I believe that making this more robust and easier to analyze
> is going to be an enormously important issue.  However, once you add SCHED_FIFO in the
> current scheme, this becomes more complex. And with preempt, you cannot even offer the
> assurance that once a process gets the cpu it will make _any_ advance at all.
> 
> 
> 
> -- 
> ---------------------------------------------------------
> Victor Yodaiken 
> Finite State Machine Labs: The RTLinux Company.
>  www.fsmlabs.com  www.rtlinux.com

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:45                                                                 ` yodaiken
  2002-01-14 13:48                                                                   ` yodaiken
@ 2002-01-14 14:56                                                                   ` Roman Zippel
  2002-01-14 16:18                                                                     ` yodaiken
  2002-01-14 16:36                                                                   ` Momchil Velikov
  2002-01-14 17:36                                                                   ` Daniel Phillips
  3 siblings, 1 reply; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 14:56 UTC (permalink / raw)
  To: yodaiken; +Cc: Momchil Velikov, Daniel Phillips, Arjan van de Ven, linux-kernel

Hi,

On Mon, 14 Jan 2002 yodaiken@fsmlabs.com wrote:

> is going to be an enormously important issue.  However, once you add SCHED_FIFO in the
> current scheme, this becomes more complex. And with preempt, you cannot even offer the
> assurance that once a process gets the cpu it will make _any_ advance at all.

I'm not sure if I understand you correctly, but how is this related to
preempt? A SCHED_FIFO task only delays SCHED_OTHER tasks, but it doesn't
consume their time slice, so the remaining tasks still get their
(previously assigned) time at the cpu, until all tasks have consumed
their share.

bye, Roman



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 14:56                                                                   ` Roman Zippel
@ 2002-01-14 16:18                                                                     ` yodaiken
  2002-01-14 18:54                                                                       ` Roman Zippel
  0 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-14 16:18 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Momchil Velikov, Daniel Phillips, Arjan van de Ven,
	linux-kernel

On Mon, Jan 14, 2002 at 03:56:05PM +0100, Roman Zippel wrote:
> Hi,
> 
> On Mon, 14 Jan 2002 yodaiken@fsmlabs.com wrote:
> 
> > is going to be an enormously important issue.  However, once you add SCHED_FIFO in the
> > current scheme, this becomes more complex. And with preempt, you cannot even offer the
> > assurance that once a process gets the cpu it will make _any_ advance at all.
> 
> I'm not sure if I understand you correctly, but how is this related to
> preempt?

It's pretty subtle. If there is no preempt, processes don't get preempted.
If there is preempt, they can be preempted. Amazing isn't it? 

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 16:18                                                                     ` yodaiken
@ 2002-01-14 18:54                                                                       ` Roman Zippel
  0 siblings, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 18:54 UTC (permalink / raw)
  To: yodaiken; +Cc: Momchil Velikov, Daniel Phillips, Arjan van de Ven, linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> > > is going to be an enormously important issue.  However, once you add SCHED_FIFO in the
> > > current scheme, this becomes more complex. And with preempt, you cannot even offer the
> > > assurance that once a process gets the cpu it will make _any_ advance at all.
> >
> > I'm not sure if I understand you correctly, but how is this related to
> > preempt?
> 
> It's pretty subtle. If there is no preempt, processes don't get preempted.
> If there is preempt, they can be preempted. Amazing isn't it?

I just can't win against such brilliant argumentation, I'm out.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:45                                                                 ` yodaiken
  2002-01-14 13:48                                                                   ` yodaiken
  2002-01-14 14:56                                                                   ` Roman Zippel
@ 2002-01-14 16:36                                                                   ` Momchil Velikov
       [not found]                                                                     ` <20020114030925.A1363@viejo.fsmlabs.com>
  2002-01-14 17:36                                                                   ` Daniel Phillips
  3 siblings, 1 reply; 351+ messages in thread
From: Momchil Velikov @ 2002-01-14 16:36 UTC (permalink / raw)
  To: yodaiken; +Cc: Daniel Phillips, Arjan van de Ven, Roman Zippel, linux-kernel

>>>>> "yodaiken" == yodaiken  <yodaiken@fsmlabs.com> writes:
yodaiken> current scheme, this becomes more complex. And with preempt, you cannot even offer the
yodaiken> assurance that once a process gets the cpu it will make _any_ advance at all.

So? It either shouldn't have got the CPU anyway (maybe the CPU is
needed for other things) or user's priority setup is seriously borked.

The scheduling policies, algorithms, mechanisms, whatever ... do not
guarantee schedulability by themselves.


^ permalink raw reply	[flat|nested] 351+ messages in thread

[parent not found: <20020114030925.A1363@viejo.fsmlabs.com>]

* Re: [2.4.17/18pre] VM and swap - it's really unusable
       [not found]                                                                     ` <20020114030925.A1363@viejo.fsmlabs.com>
@ 2002-01-14 18:43                                                                       ` Daniel Phillips
  2002-01-14 18:39                                                                         ` yodaiken
                                                                                           ` (2 more replies)
  0 siblings, 3 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14 18:43 UTC (permalink / raw)
  To: yodaiken, Momchil Velikov
  Cc: yodaiken, Arjan van de Ven, Roman Zippel, linux-kernel

On January 14, 2002 10:09 am, yodaiken@fsmlabs.com wrote:
> UNIX generally tries to ensure liveness. So you know that
> 	cat lkarchive | grep feel | wc
> will complete and not just that, it will run pretty reasonably because
> for UNIX _every_ process is important and gets cpu and IO time.
> When you start trying to add special low latency tasks, you endanger
> liveness.  And preempt is especially corrosive because one of the 
> mechanisms UNIX uses to assure liveness is to make sure that once a 
> process starts it can do a significant chunk of work.

You're claiming that preemption by nature is not Unix-like?

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 18:43                                                                       ` Daniel Phillips
@ 2002-01-14 18:39                                                                         ` yodaiken
  2002-01-14 20:16                                                                           ` Robert Love
  2002-01-14 19:16                                                                         ` Rick Stevens
  2002-01-15  3:07                                                                         ` george anzinger
  2 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-14 18:39 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: yodaiken, Momchil Velikov, Arjan van de Ven, Roman Zippel,
	linux-kernel

On Mon, Jan 14, 2002 at 07:43:59PM +0100, Daniel Phillips wrote:
> On January 14, 2002 10:09 am, yodaiken@fsmlabs.com wrote:
> > UNIX generally tries to ensure liveness. So you know that
> > 	cat lkarchive | grep feel | wc
> > will complete and not just that, it will run pretty reasonably because
> > for UNIX _every_ process is important and gets cpu and IO time.
> > When you start trying to add special low latency tasks, you endanger
> > liveness.  And preempt is especially corrosive because one of the 
> > mechanisms UNIX uses to assure liveness is to make sure that once a 
> > process starts it can do a significant chunk of work.
> 
> You're claiming that preemption by nature is not Unix-like?

Kernel preemption is not traditionally part of UNIX. 

> 
> --
> Daniel

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 18:39                                                                         ` yodaiken
@ 2002-01-14 20:16                                                                           ` Robert Love
  0 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-14 20:16 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, Momchil Velikov, Arjan van de Ven, Roman Zippel,
	linux-kernel

On Mon, 2002-01-14 at 13:39, yodaiken@fsmlabs.com wrote:

> > You're claiming that preemption by nature is not Unix-like?
> 
> Kernel preemption is not traditionally part of UNIX. 

True, the original AT&T UNIX was non-preemptible, but it also didn't originally
have paging.  Today, Solaris, IRIX, latest BSD (via BSDng), etc. are all
preemptible kernels.

Ask Core whether SMPng in FreeBSD 5.0 will include preempt, I think they
are still debating.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 18:43                                                                       ` Daniel Phillips
  2002-01-14 18:39                                                                         ` yodaiken
@ 2002-01-14 19:16                                                                         ` Rick Stevens
  2002-01-15  3:07                                                                         ` george anzinger
  2 siblings, 0 replies; 351+ messages in thread
From: Rick Stevens @ 2002-01-14 19:16 UTC (permalink / raw)
  To: linux-kernel

Daniel Phillips wrote:

> On January 14, 2002 10:09 am, yodaiken@fsmlabs.com wrote:
> 
>>UNIX generally tries to ensure liveness. So you know that
>>	cat lkarchive | grep feel | wc
>>will complete and not just that, it will run pretty reasonably because
>>for UNIX _every_ process is important and gets cpu and IO time.
>>When you start trying to add special low latency tasks, you endanger
>>liveness.  And preempt is especially corrosive because one of the 
>>mechanisms UNIX uses to assure liveness is to make sure that once a 
>>process starts it can do a significant chunk of work.
>>
> 
> You're claiming that preemption by nature is not Unix-like?


Unix started out life as a _time-sharing_ OS.  It never claimed to
be preemptive or real time.  For those, you waited a while, then
got to run MACH.
----------------------------------------------------------------------
- Rick Stevens, SSE, VitalStream, Inc.      rstevens@vitalstream.com -
- 949-743-2010 (Voice)                    http://www.vitalstream.com -
-                                                                    -
-        Change is inevitable, except from a vending machine.        -
----------------------------------------------------------------------


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 18:43                                                                       ` Daniel Phillips
  2002-01-14 18:39                                                                         ` yodaiken
  2002-01-14 19:16                                                                         ` Rick Stevens
@ 2002-01-15  3:07                                                                         ` george anzinger
  2002-01-15  3:31                                                                           ` Daniel Phillips
  2002-01-15 12:39                                                                           ` yodaiken
  2 siblings, 2 replies; 351+ messages in thread
From: george anzinger @ 2002-01-15  3:07 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: yodaiken, Momchil Velikov, Arjan van de Ven, Roman Zippel,
	linux-kernel

Daniel Phillips wrote:
> 
> On January 14, 2002 10:09 am, yodaiken@fsmlabs.com wrote:
> > UNIX generally tries to ensure liveness. So you know that
> >       cat lkarchive | grep feel | wc
> > will complete and not just that, it will run pretty reasonably because
> > for UNIX _every_ process is important and gets cpu and IO time.
> > When you start trying to add special low latency tasks, you endanger
> > liveness.  And preempt is especially corrosive because one of the
> > mechanisms UNIX uses to assure liveness is to make sure that once a
> > process starts it can do a significant chunk of work.
> 
If I read this right, your complaint is not with preemption but with
scheduler policy.  Clearly both are needed to "assure liveness". 
Another way of looking at preemption is that it enables a more
responsive and nimble scheduler policy (after all it is the scheduler
that decided that task A should give way to task B.  All preemption does
is to allow that to happen with greater dispatch.)  Given that, we can
then discuss what scheduler policy should be.
-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-15  3:07                                                                         ` george anzinger
@ 2002-01-15  3:31                                                                           ` Daniel Phillips
  2002-01-15 12:39                                                                           ` yodaiken
  1 sibling, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-15  3:31 UTC (permalink / raw)
  To: george anzinger
  Cc: yodaiken, Momchil Velikov, Arjan van de Ven, Roman Zippel,
	linux-kernel

On January 15, 2002 04:07 am, george anzinger wrote:
> Daniel Phillips wrote:
> > 
> > On January 14, 2002 10:09 am, yodaiken@fsmlabs.com wrote:
> > > UNIX generally tries to ensure liveness. So you know that
> > >       cat lkarchive | grep feel | wc
> > > will complete and not just that, it will run pretty reasonably because
> > > for UNIX _every_ process is important and gets cpu and IO time.
> > > When you start trying to add special low latency tasks, you endanger
> > > liveness.  And preempt is especially corrosive because one of the
> > > mechanisms UNIX uses to assure liveness is to make sure that once a
> > > process starts it can do a significant chunk of work.
>
> If I read this right, your complaint is not with preemption but with
> scheduler policy.  Clearly both are needed to "assure liveness". 
> Another way of looking at preemption is that it enables a more
> responsive and nimble scheduler policy (after all it is the scheduler
> that decided that task A should give way to task B.  All preemption does
> is to allow that to happen with greater dispatch.)  Given that, we can
> then discuss what scheduler policy should be.

You responded to the wrong person, however I'll take this opportunity to 
agree with you, on the basis of my years of experience with critical path 
scheduling.  For project schedules 'earliest completion' is the name of the 
game, within bounds of available resources.  When you delay an individual 
'task' (I'm using the project management term here) past the earliest time it 
can be scheduled, you are using up its 'float', and if the delay is longer 
than the task's float, the completion time of the schedule as a whole will be 
delayed.  This is no different for a computer than it is for a group of 
people, it is still a scheduling problem.  Delaying any random task risks 
delaying the schedule as a whole, and that risk approaches certainty as the 
number of delays approaches infinity.

N.B.: the above observation is aimed at project managers, who will know 
exactly what I'm talking about.  Otherwise, don't worry if it sounds like so 
much BS, it actually isn't ;-)
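
The float arithmetic above can be written down in a few lines. This is a
sketch with hypothetical toy_* names, using abstract time units and
considering a single delayed task.

```c
#include <assert.h>

/* Float (slack) of a task: how long it can be delayed past its earliest
 * start without moving the end of the schedule. */
static int toy_float(int earliest_start, int latest_start)
{
    assert(latest_start >= earliest_start);
    return latest_start - earliest_start;
}

/* Slip of the whole schedule caused by delaying one task. */
static int toy_schedule_slip(int delay, int task_float)
{
    assert(delay >= 0 && task_float >= 0);
    return delay > task_float ? delay - task_float : 0;
}
```

A task with earliest start 3 and latest start 8 has a float of 5. A
delay of 4 is absorbed, a delay of 7 moves the completion time by 2, and
a task with zero float passes every delay straight through.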

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-15  3:07                                                                         ` george anzinger
  2002-01-15  3:31                                                                           ` Daniel Phillips
@ 2002-01-15 12:39                                                                           ` yodaiken
  2002-01-21 15:38                                                                             ` Daniel Phillips
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-15 12:39 UTC (permalink / raw)
  To: george anzinger
  Cc: Daniel Phillips, yodaiken, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On Mon, Jan 14, 2002 at 07:07:46PM -0800, george anzinger wrote:
> Daniel Phillips wrote:
> > 
> > On January 14, 2002 10:09 am, yodaiken@fsmlabs.com wrote:
> > > UNIX generally tries to ensure liveness. So you know that
> > >       cat lkarchive | grep feel | wc
> > > will complete and not just that, it will run pretty reasonably because
> > > for UNIX _every_ process is important and gets cpu and IO time.
> > > When you start trying to add special low latency tasks, you endanger
> > > liveness.  And preempt is especially corrosive because one of the
> > > mechanisms UNIX uses to assure liveness is to make sure that once a
> > > process starts it can do a significant chunk of work.
> > 
> If I read this right, your complaint is not with preemption but with
> scheduler policy.  Clearly both are needed to "assure liveness". 

You are right. I think, however, that "preemption" is part of
a package of ideas about how the system should work,
so it would probably be better to separate these issues out.

> Another way of looking at preemption is that it enables a more
> responsive and nimble scheduler policy (after all, it is the scheduler
> that decided that task A should give way to task B.  All preemption does
> is to allow that to happen with greater dispatch.)  Given that, we can
> then discuss what scheduler policy should be.

If you would write this as 
	Another way of looking at preemption is that it is intended
	to enable a more responsive ...
then we would be off to a good start in narrowing the discussion.
My reservation about preemption as an implementation technique is that
it has costs, which seem to be not easily boundable, but not very 
clear benefits.


> -- 
> George           george@mvista.com
> High-res-timers: http://sourceforge.net/projects/high-res-timers/
> Real time sched: http://sourceforge.net/projects/rtsched/

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-15 12:39                                                                           ` yodaiken
@ 2002-01-21 15:38                                                                             ` Daniel Phillips
  2002-01-21 15:43                                                                               ` yodaiken
  0 siblings, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-21 15:38 UTC (permalink / raw)
  To: yodaiken, george anzinger
  Cc: yodaiken, Momchil Velikov, Arjan van de Ven, Roman Zippel,
	linux-kernel

On January 15, 2002 01:39 pm, yodaiken@fsmlabs.com wrote:
> My reservation about preemption as an implementation technique is that
> it has costs, which seem to be not easily boundable, but not very 
> clear benefits.

To me the benefit is clear enough: ASAP scheduling of IO threads, a simple 
heuristic that improves both throughput and latency.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 15:38                                                                             ` Daniel Phillips
@ 2002-01-21 15:43                                                                               ` yodaiken
  2002-01-21 16:05                                                                                 ` Daniel Phillips
  2002-01-21 20:35                                                                                 ` Bill Davidsen
  0 siblings, 2 replies; 351+ messages in thread
From: yodaiken @ 2002-01-21 15:43 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: yodaiken, george anzinger, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On Mon, Jan 21, 2002 at 04:38:59PM +0100, Daniel Phillips wrote:
> On January 15, 2002 01:39 pm, yodaiken@fsmlabs.com wrote:
> > My reservation about preemption as an implementation technique is that
> > it has costs, which seem to be not easily boundable, but not very 
> > clear benefits.
> 
> To me the benefit is clear enough: ASAP scheduling of IO threads, a simple 
> heuristic that improves both throughput and latency.

I think of "benefit", perhaps naively, in terms of something that can
be measured or demonstrated rather than just announced.


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 15:43                                                                               ` yodaiken
@ 2002-01-21 16:05                                                                                 ` Daniel Phillips
  2002-01-21 16:06                                                                                   ` yodaiken
  2002-01-21 19:26                                                                                   ` Mark Hahn
  2002-01-21 20:35                                                                                 ` Bill Davidsen
  1 sibling, 2 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-21 16:05 UTC (permalink / raw)
  To: yodaiken
  Cc: yodaiken, george anzinger, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On January 21, 2002 04:43 pm, yodaiken@fsmlabs.com wrote:
> On Mon, Jan 21, 2002 at 04:38:59PM +0100, Daniel Phillips wrote:
> > On January 15, 2002 01:39 pm, yodaiken@fsmlabs.com wrote:
> > > My reservation about preemption as an implementation technique is that
> > > it has costs, which seem to be not easily boundable, but not very 
> > > clear benefits.
> > 
> > To me the benefit is clear enough: ASAP scheduling of IO threads, a 
> > simple heuristic that improves both throughput and latency.
> 
> I think of "benefit", perhaps naively, in terms of something that can
> be measured or demonstrated rather than just announced.

But you see why asap scheduling improves latency/throughput *in theory*, 
don't you?  As for the measured benefit, there has been a steady stream of 
positive reports on lkml.  My own experience is that the usability of my 
laptop with its small memory is much improved under heavy IO load.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:05                                                                                 ` Daniel Phillips
@ 2002-01-21 16:06                                                                                   ` yodaiken
  2002-01-21 16:33                                                                                     ` Peter Wächtler
                                                                                                       ` (3 more replies)
  2002-01-21 19:26                                                                                   ` Mark Hahn
  1 sibling, 4 replies; 351+ messages in thread
From: yodaiken @ 2002-01-21 16:06 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: yodaiken, george anzinger, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On Mon, Jan 21, 2002 at 05:05:01PM +0100, Daniel Phillips wrote:
> > I think of "benefit", perhaps naively, in terms of something that can
> > be measured or demonstrated rather than just announced.
> 
> But you see why asap scheduling improves latency/throughput *in theory*, 

Nope. And I don't even see a relationship between preemption and asap I/O
scheduling. What makes you think that I/O threads won't be preempted by
other threads?

> don't you?  As for the measured benefit, there has been a steady stream of 
> positive reports on lkml. 

I have not seen a single well structured benchmark that shows a significant
difference. I've seen lots of benchmarks with odd mixes of different patches
showing something unknown. How about a simple clear dbench?

> My own experience is that the usability of my 
> laptop with its small memory is much improved under heavy IO load.

No comment.

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:06                                                                                   ` yodaiken
@ 2002-01-21 16:33                                                                                     ` Peter Wächtler
  2002-01-21 16:45                                                                                       ` yodaiken
  2002-01-21 16:48                                                                                     ` Daniel Phillips
                                                                                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 351+ messages in thread
From: Peter Wächtler @ 2002-01-21 16:33 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

yodaiken@fsmlabs.com wrote:
> 
> On Mon, Jan 21, 2002 at 05:05:01PM +0100, Daniel Phillips wrote:
> > > I think of "benefit", perhaps naively, in terms of something that can
> > > be measured or demonstrated rather than just announced.
> >
> > But you see why asap scheduling improves latency/throughput *in theory*,
> 
> Nope. And I don't even see a relationship between preemption and asap I/O
> scheduling. What makes you think that I/O threads won't be preempted by
> other threads?
> 

I/O intensive threads block early voluntarily - while CPU hogs don't.
Statistically, there is a higher chance that a CPU hog gets preempted
than an I/O-bound thread (which gives up the CPU within some microseconds anyway).

The next IO request is hitting the device "earlier" - instead of waiting
for the next schedule() - that makes sense to me.

With this scenario the system CPU utilization gets "bigger" for the benefit
of "faster" IO. OTOH, seti@home needs longer to run.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:33                                                                                     ` Peter Wächtler
@ 2002-01-21 16:45                                                                                       ` yodaiken
  2002-01-21 17:12                                                                                         ` Peter Wächtler
  0 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-21 16:45 UTC (permalink / raw)
  To: Peter Wächtler
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, Jan 21, 2002 at 05:33:50PM +0100, Peter Wächtler wrote:
> yodaiken@fsmlabs.com wrote:
> > 
> > On Mon, Jan 21, 2002 at 05:05:01PM +0100, Daniel Phillips wrote:
> > > > I think of "benefit", perhaps naively, in terms of something that can
> > > > be measured or demonstrated rather than just announced.
> > >
> > > But you see why asap scheduling improves latency/throughput *in theory*,
> > 
> > Nope. And I don't even see a relationship between preemption and asap I/O
> > scheduling. What makes you think that I/O threads won't be preempted by
> > other threads?
> > 
> 
> I/O intensive threads block early voluntarily - while CPU hogs don't.

Since the preemption patch only allows additional preemption in kernel
mode, I'm curious to know what the compute bound tasks are doing in 
kernel mode. Did Linux add in-kernel matrix multiplication while
I was not looking?


> Statistically, there is a higher chance that a CPU hog gets preempted
> than an I/O-bound thread (which gives up the CPU within some microseconds anyway).


"Statistically"?  As far as I know, most I/O in Linux does not block.
When you say "statistically", you should have some analysis with clearly
stated assumptions. 

> 
> The next IO request is hitting the device "earlier" - instead of waiting
> for the next schedule() - that makes sense to me.
> 
> With this scenario the system CPU utilization gets "bigger" for the benefit
> of "faster" IO. OTOH, seti@home needs longer to run.

Sorry. No sale.


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:45                                                                                       ` yodaiken
@ 2002-01-21 17:12                                                                                         ` Peter Wächtler
  2002-01-21 17:15                                                                                           ` yodaiken
  0 siblings, 1 reply; 351+ messages in thread
From: Peter Wächtler @ 2002-01-21 17:12 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

yodaiken@fsmlabs.com wrote:
> 
> On Mon, Jan 21, 2002 at 05:33:50PM +0100, Peter Wächtler wrote:
> > yodaiken@fsmlabs.com wrote:
> > >
> > > On Mon, Jan 21, 2002 at 05:05:01PM +0100, Daniel Phillips wrote:
> > > > > I think of "benefit", perhaps naively, in terms of something that can
> > > > > be measured or demonstrated rather than just announced.
> > > >
> > > > But you see why asap scheduling improves latency/throughput *in theory*,
> > >
> > > Nope. And I don't even see a relationship between preemption and asap I/O
> > > scheduling. What makes you think that I/O threads won't be preempted by
> > > other threads?
> > >
> >
> > I/O intensive threads block early voluntarily - while CPU hogs don't.
> 
> Since the preemption patch only allows additional preemption in kernel
> mode, I'm curious to know what the compute bound tasks are doing in
> kernel mode. Did Linux add in-kernel matrix multiplication while
> I was not looking?
> 

Dead right you are.
Then there are only slow system calls left. Umh, execve(), fork()
(with big address space) - what about page_launder etc.?


> > Statistically, there is a higher chance that a CPU hog gets preempted
> > than an I/O-bound thread (which gives up the CPU within some microseconds anyway).
> 
> "Statistically"?  As far as I know, most I/O in Linux does not block.

You mean, the syscall returns without a reschedule?
Ahem, now it's time for some statistics on where the kernel spends its time ;-)

But what is a possible explanation for the people who think their 
systems behave better with preemption - strong belief?

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 17:12                                                                                         ` Peter Wächtler
@ 2002-01-21 17:15                                                                                           ` yodaiken
  0 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-21 17:15 UTC (permalink / raw)
  To: Peter Wächtler
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, Jan 21, 2002 at 06:12:58PM +0100, Peter Wächtler wrote:
> > Since the preemption patch only allows additional preemption in kernel
> > mode, I'm curious to know what the compute bound tasks are doing in
> > kernel mode. Did Linux add in-kernel matrix multiplication while
> > I was not looking?
> > 
> 
> Dead right you are.
> Then there are only slow system calls left. Umh, execve(), fork()
> (with big address space) - what about page_launder etc.?

Those are, in some sense, I/O right? It's not clear to me
that preempting page_launder is sensible.

> But what is a possible explanation for the people who think their 
> systems behave better with preemption - strong belief?

Beats me. Maybe it really does work - but maybe not and nobody
has advanced any analysis or numbers that make the case.



-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:06                                                                                   ` yodaiken
  2002-01-21 16:33                                                                                     ` Peter Wächtler
@ 2002-01-21 16:48                                                                                     ` Daniel Phillips
  2002-01-21 16:50                                                                                       ` yodaiken
  2002-01-21 21:24                                                                                       ` Robert Love
  2002-01-21 21:16                                                                                     ` Robert Love
  2002-01-22  0:27                                                                                     ` Roman Zippel
  3 siblings, 2 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-21 16:48 UTC (permalink / raw)
  To: yodaiken
  Cc: yodaiken, george anzinger, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On January 21, 2002 05:06 pm, yodaiken@fsmlabs.com wrote:
> On Mon, Jan 21, 2002 at 05:05:01PM +0100, Daniel Phillips wrote:
> > > I think of "benefit", perhaps naively, in terms of something that can
> > > be measured or demonstrated rather than just announced.
> > 
> > But you see why asap scheduling improves latency/throughput *in theory*, 
> 
> Nope. And I don't even see a relationship between preemption and asap I/O
> scheduling. What makes you think that I/O threads won't be preempted by
> other threads?

Consider a thread reading from disk in such a way that readahead is no help, 
i.e., perhaps the disk is fragmented.  At each step the IO thread schedules a 
read and sleeps until the read completes, then schedules the next one.  At 
the same time there is a hog in the kernel, or perhaps there is 
competition from other tasks using the kernel.  In any event, it will 
frequently transpire that at the time the disk IO completes there is somebody 
in the kernel.  Without preemption the IO thread has to wait until the kernel 
hog blocks, hits a scheduling point or exits the kernel.

The result, without preemption, is:

         IO thread      Kernel hog       Disk
             |              .
             |--------------.-----------> .
             .              |             |
             .              |             |
             .              |             |
             .              |             |
             .              |<------------|
             .              |             .
             .              |             .
             .              |             .
             .<-------------|             .
             |--------------.-----------> .
             .              .             |
             .              .             |
             .              |             |
             .              |             |
             .              |             |
             .              |<------------|
             .              |             .
             .              |             .
             .              |             .
             .<-------------|             .
             |--------------.-----------> .
             .              .             |
             .              .             |
             .              |             |
             .              |             |
             .              |             |
             .              |<------------|
             .              |             .
             .              |             .
             .              |             .
             .<-------------|             .
             .

Whereas with preemption, we have:

         IO thread      Kernel hog       Disk
             |              .
             |--------------.-----------> .
             .              |             |
             .              |             |
             .              |             |
             .              |             |
             .<-------------|-------------|
             |--------------.-----------> .
             .              |             |
             .              |             |
             .              |             |
             .              |             |
             .<-------------|-------------|
             |--------------.-----------> .
             .              |             |
             .              |             |
             .              |             |
             .              |             |
             .<-------------|-------------|
             |

The disk and the IO thread are active a higher portion of the time, while the 
kernel hog gets the same amount of time.  So in this case we have improved 
both latency and throughput.

Naturally I constructed this case to show the effect most clearly.  There are 
many possible variations on the above scenario.  It does seem to explain the 
latency/throughput improvements that have been reported in practice.
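
The two diagrams can also be reproduced with a toy model.  The sketch below 
is only that: run_reads() and its parameters are hypothetical, the tick 
counts are invented to match the pictures, and it assumes that submitting a 
read costs the IO thread no measurable CPU time.

```python
import math


def run_reads(n_reads, disk_time, hog_slice, preemptible):
    """Simulate n_reads dependent disk reads issued by one IO thread.

    Assumptions (all hypothetical, chosen to mirror the diagrams):
      * each read keeps the disk busy for disk_time ticks;
      * a kernel hog starts running as soon as the IO thread sleeps and
        only reaches a scheduling point every hog_slice ticks;
      * submitting a read costs the IO thread no measurable CPU time.

    Returns (total_ticks, disk_utilization).
    """
    now = 0
    for _ in range(n_reads):
        done = now + disk_time          # tick at which the disk finishes
        if preemptible:
            wake = done                 # the hog is preempted right away
        else:
            # The IO thread waits for the hog's next scheduling point.
            slices = math.ceil(disk_time / hog_slice)
            wake = now + slices * hog_slice
        now = wake                      # IO thread runs, issues the next read
    return now, (n_reads * disk_time) / now


without = run_reads(3, disk_time=4, hog_slice=7, preemptible=False)
with_preempt = run_reads(3, disk_time=4, hog_slice=7, preemptible=True)
print(without)
print(with_preempt)
```

With these made-up numbers the three reads take 21 ticks without preemption 
and 12 ticks with it, and the disk goes from being busy 12/21 of the time to 
being busy all of the time.  The model says nothing about how often a real 
kernel actually has such a hog in it.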

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:48                                                                                     ` Daniel Phillips
@ 2002-01-21 16:50                                                                                       ` yodaiken
  2002-01-21 17:32                                                                                         ` Chris Friesen
  2002-01-21 21:22                                                                                         ` Robert Love
  2002-01-21 21:24                                                                                       ` Robert Love
  1 sibling, 2 replies; 351+ messages in thread
From: yodaiken @ 2002-01-21 16:50 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: yodaiken, george anzinger, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On Mon, Jan 21, 2002 at 05:48:30PM +0100, Daniel Phillips wrote:
> On January 21, 2002 05:06 pm, yodaiken@fsmlabs.com wrote:
> > On Mon, Jan 21, 2002 at 05:05:01PM +0100, Daniel Phillips wrote:
> > > > I think of "benefit", perhaps naively, in terms of something that can
> > > > be measured or demonstrated rather than just announced.
> > > 
> > > But you see why asap scheduling improves latency/throughput *in theory*, 
> > 
> > Nope. And I don't even see a relationship between preemption and asap I/O
> > scheduling. What makes you think that I/O threads won't be preempted by
> > other threads?
> 
> Consider a thread reading from disk in such a way that readahead is no help, 
> i.e., perhaps the disk is fragmented.  At each step the IO thread schedules a 
> read and sleeps until the read completes, then schedules the next one.  At 
> the same time there is a hog in the kernel, or perhaps there is 
> competition from other tasks using the kernel.  In any event, it will 
> frequently transpire that at the time the disk IO completes there is somebody 
> in the kernel.  Without preemption the IO thread has to wait until the kernel 
> hog blocks, hits a scheduling point or exits the kernel.


So your claim is that:
	Preemption improves latency when there are both kernel cpu bound
	tasks and tasks that are I/O bound with very low cache hit
	rates?

Is that it?

Can you give me an example of a CPU bound task that runs
mostly in kernel? Doesn't that seem like a kernel bug?

> Naturally I constructed this case to show the effect most clearly.  There are 

How about a plausible case?

> many possible variations on the above scenario.  It does seem to explain the 
> latency/throughput improvements that have been reported in practice.

I still keep missing these reports. Can you help me here?
(Obviously "my laptop seems more effervescent" is not what I'm looking
 for.)





-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:50                                                                                       ` yodaiken
@ 2002-01-21 17:32                                                                                         ` Chris Friesen
  2002-01-21 17:52                                                                                           ` yodaiken
  2002-01-21 21:22                                                                                         ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: Chris Friesen @ 2002-01-21 17:32 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

yodaiken@fsmlabs.com wrote:

> So your claim is that:
>         Preemption improves latency when there are both kernel cpu bound
>         tasks and tasks that are I/O bound with very low cache hit
>         rates?
> 
> Is that it?
> 
> Can you give me an example of a CPU bound task that runs
> mostly in kernel? Doesn't that seem like a kernel bug?

cat /dev/urandom > /dev/null

-- 
Chris Friesen                    | MailStop: 043/33/F10  
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 17:32                                                                                         ` Chris Friesen
@ 2002-01-21 17:52                                                                                           ` yodaiken
  2002-01-21 18:59                                                                                             ` Chris Friesen
  2002-01-21 19:00                                                                                             ` Peter Wächtler
  0 siblings, 2 replies; 351+ messages in thread
From: yodaiken @ 2002-01-21 17:52 UTC (permalink / raw)
  To: Chris Friesen
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel


On Mon, Jan 21, 2002 at 12:32:57PM -0500, Chris Friesen wrote:
> yodaiken@fsmlabs.com wrote:
> 
> > So your claim is that:
> >         Preemption improves latency when there are both kernel cpu bound
> >         tasks and tasks that are I/O bound with very low cache hit
> >         rates?
> > 
> > Is that it?
> > 
> > Can you give me an example of a CPU bound task that runs
> > mostly in kernel? Doesn't that seem like a kernel bug?
> 
> cat /dev/urandom > /dev/null

Don't see any of Daniel's postulated long latencies there.  (Sorry, but
I'm having a hard time figuring out what is meant as a serious comment
here).


> 
> -- 
> Chris Friesen                    | MailStop: 043/33/F10  
> Nortel Networks                  | work: (613) 765-0557
> 3500 Carling Avenue              | fax:  (613) 765-2986
> Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 17:52                                                                                           ` yodaiken
@ 2002-01-21 18:59                                                                                             ` Chris Friesen
  2002-01-21 19:00                                                                                             ` Peter Wächtler
  1 sibling, 0 replies; 351+ messages in thread
From: Chris Friesen @ 2002-01-21 18:59 UTC (permalink / raw)
  To: linux-kernel

yodaiken@fsmlabs.com wrote:
> 
> On Mon, Jan 21, 2002 at 12:32:57PM -0500, Chris Friesen wrote:
> > yodaiken@fsmlabs.com wrote:
> >
> > > So your claim is that:
> > >         Preemption improves latency when there are both kernel cpu bound
> > >         tasks and tasks that are I/O bound with very low cache hit
> > >         rates?
> > >
> > > Is that it?
> > >
> > > Can you give me an example of a CPU bound task that runs
> > > mostly in kernel? Doesn't that seem like a kernel bug?
> >
> > cat /dev/urandom > /dev/null
> 
> Don't see any of Daniel's postulated long latencies there.  (Sorry, but
> I'm having a hard time figuring out what is meant as a serious comment
> here).

No, that one wasn't serious.  And while it is CPU bound and mostly in the
kernel, you're right that there are no long latencies to cause issues...

-- 
Chris Friesen                    | MailStop: 043/33/F10  
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 17:52                                                                                           ` yodaiken
  2002-01-21 18:59                                                                                             ` Chris Friesen
@ 2002-01-21 19:00                                                                                             ` Peter Wächtler
  1 sibling, 0 replies; 351+ messages in thread
From: Peter Wächtler @ 2002-01-21 19:00 UTC (permalink / raw)
  To: yodaiken
  Cc: Chris Friesen, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

yodaiken@fsmlabs.com wrote:
> 
> On Mon, Jan 21, 2002 at 12:32:57PM -0500, Chris Friesen wrote:
> > yodaiken@fsmlabs.com wrote:
> >
> > > So your claim is that:
> > >         Preemption improves latency when there are both kernel cpu bound
> > >         tasks and tasks that are I/O bound with very low cache hit
> > >         rates?
> > >
> > > Is that it?
> > >
> > > Can you give me an example of a CPU bound task that runs
> > > mostly in kernel? Doesn't that seem like a kernel bug?
> >
> > cat /dev/urandom > /dev/null
> 
> Don't see any of Daniel's postulated long latencies there.  (Sorry, but
> I'm having a hard time figuring out what is meant as a serious comment
> here).
> 

This does not count for a "general" kernel:

zisofs reading
e2compr

I will try to compare with the preemption kernel patch for our webbox-like
device - but there I am mostly interested in "GUI feeling" - and we will
use e2compr on a 2.4.9ac kernel.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:50                                                                                       ` yodaiken
  2002-01-21 17:32                                                                                         ` Chris Friesen
@ 2002-01-21 21:22                                                                                         ` Robert Love
  2002-01-21 21:54                                                                                           ` yodaiken
  2002-01-21 22:18                                                                                           ` Horst von Brand
  1 sibling, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-21 21:22 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, 2002-01-21 at 11:50, yodaiken@fsmlabs.com wrote:
> On Mon, Jan 21, 2002 at 05:48:30PM +0100, Daniel Phillips wrote:

> > Consider a thread reading from disk in such a way that readahead is no help, 
> > i.e., perhaps the disk is fragmented.  At each step the IO thread schedules a 
> > read and sleeps until the read completes, then schedules the next one.  At 
> > the same time there is a hog in the kernel, or perhaps there is 
> > competition from other tasks using the kernel.  In any event, it will 
> > frequently transpire that at the time the disk IO completes there is somebody 
> > in the kernel.  Without preemption the IO thread has to wait until the kernel 
> > hog blocks, hits a scheduling point or exits the kernel.
> 
> 
> So your claim is that:
> 	Preemption improves latency when there are both kernel cpu bound
> 	tasks and tasks that are I/O bound with very low cache hit
> 	rates?
> 
> Is that it?
> 
> Can you give me an example of a CPU bound task that runs
> mostly in kernel? Doesn't that seem like a kernel bug?

It doesn't have to run mostly in the kernel.  It just has to be in the
kernel when the I/O-bound task awakes.  Further, there are plenty of
what we consider CPU-bound tasks that are interactive and/or
graphics-oriented and this adds much to their time in the kernel.

In a given period of time, a CPU bound task can run at any point within
the allotment it is given.  On the other hand, an I/O-bound task spends much
time blocked and thus can only run when I/O is available and it is
awake.  It is thus advantageous to schedule it within the bounds of the
I/O being available, and as tightly in those bounds as possible.  This
more fairly distributes scheduling to all tasks.  Same goes for RT
tasks, interactive tasks, etc.

The result is faster wake-up-to-run and thus higher throughput.  I just
sent some dbench scores to correlate this.
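
A minimal sketch of why wake-up-to-run delay can show up as throughput,
assuming a single thread that issues strictly serial, dependent reads (the
fragmented-disk case described above).  The millisecond figures are
hypothetical and chosen only to make the arithmetic visible; they are not
measurements from any kernel.

```python
# Toy model, not a measurement: the millisecond figures are hypothetical.
def requests_per_second(service_ms, wakeup_delay_ms):
    """Throughput of one thread issuing strictly serial, dependent reads.

    Each request costs the disk service time plus however long the thread
    waits to run again after the I/O completes.
    """
    return 1000.0 / (service_ms + wakeup_delay_ms)

prompt = requests_per_second(service_ms=8.0, wakeup_delay_ms=0.0)
delayed = requests_per_second(service_ms=8.0, wakeup_delay_ms=2.0)
print(f"woken at once:   {prompt:.0f} requests/s")
print(f"woken 2 ms late: {delayed:.0f} requests/s")
```

In this model the disk sits idle during every wake-up delay, so any delay
comes straight out of the request rate.  Whether real workloads behave
this way is exactly what the benchmarks have to show.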

> I still keep missing these reports. Can you help me here?
> (Obviously "my laptop seems more effervescent" is not what I'm looking
>  for.)

While we certainly need tangible empirical benefits, users finding their
desktop experience smoother and thus more enjoyable is just about the
best thing we can ask for.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:22                                                                                         ` Robert Love
@ 2002-01-21 21:54                                                                                           ` yodaiken
  2002-01-21 22:19                                                                                             ` Robert Love
  2002-01-21 22:18                                                                                           ` Horst von Brand
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-21 21:54 UTC (permalink / raw)
  To: Robert Love
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, Jan 21, 2002 at 04:22:58PM -0500, Robert Love wrote:
> On Mon, 2002-01-21 at 11:50, yodaiken@fsmlabs.com wrote:
> > On Mon, Jan 21, 2002 at 05:48:30PM +0100, Daniel Phillips wrote:
> 
> > > Consider a thread reading from disk in such a way that readahead is no help, 
> > > i.e., perhaps the disk is fragmented.  At each step the IO thread schedules a 
> > > read and sleeps until the read completes, then schedules the next one.  At 
> > > the same time there is a hog in the kernel, or perhaps there is 
> > > competition from other tasks using the kernel.  In any event, it will 
> > > frequently transpire that at the time the disk IO completes there is somebody 
> > > in the kernel.  Without preemption the IO thread has to wait until the kernel 
> > > hog blocks, hits a scheduling point or exits the kernel.
> > 
> > 
> > So your claim is that:
> > 	Preemption improves latency when there are both kernel cpu bound
> > 	tasks and tasks that are I/O bound with very low cache hit
> > 	rates?
> > 
> > Is that it?
> > 
> > Can you give me an example of a CPU bound task that runs
> > mostly in kernel? Doesn't that seem like a kernel bug?
> 
> It doesn't have to run mostly in the kernel.  It just has to be in the
> kernel when the I/O-bound task awakes.  Further, there are plenty of

How does that work? Won't the switch happen on exit from the kernel?

> what we consider CPU-bound tasks that are interactive and/or
> graphics-oriented and this adds much to their time in the kernel.

I'm not sure what an "interactive and/or graphics-oriented" CPU bound
task might be. Is there a definition?

> In a given period of time, a CPU bound task can run at any point within
> the allotment it is given.  On the other hand, an I/O-bound task spends much
> time blocked and thus can only run when I/O is available and it is
> awake.  It is thus advantageous to schedule it within the bounds of the
> I/O being available, and as tightly in those bounds as possible.  This
> more fairly distributes scheduling to all tasks.  Same goes for RT
> tasks, interactive tasks, etc.

So you think of an "I/O bound task" as  "an I/O bound task that spends
most of its time blocked". Won't the latencies of such tasks already be
pretty high? I'd think that better caching and read-ahead is the correct
fix.


> The result is faster wake-up-to-run and thus higher throughput.  I just
> sent some dbench scores to correlate this.
> 
> > I still keep missing these reports. Can you help me here?
> > (Obviously "my laptop seems more effervescent" is not what I'm looking
> >  for.)
> 
> While we certainly need tangible empirical benefits, users finding their
> desktop experience smoother and thus more enjoyable is just about the
> best thing we can ask for.

It depends on what you want. 

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:54                                                                                           ` yodaiken
@ 2002-01-21 22:19                                                                                             ` Robert Love
  0 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-21 22:19 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, 2002-01-21 at 16:54, yodaiken@fsmlabs.com wrote:

> > It doesn't have to run mostly in the kernel.  It just has to be in the
> > kernel when the I/O-bound task awakes.  Further, there are plenty of
> 
> How does that work? Won't the switch happen on exit from the kernel?

Sure, but we have essentially unbounded times in the kernel ...
I/O-bound tasks shouldn't have to wait for do_try_to_free_pages to
finish for some lower priority process.

> > what we consider CPU-bound tasks that are interactive and/or
> > graphics-oriented and this adds much to their time in the kernel.
> 
> I'm not sure what an "interactive and/or graphics-oriented" CPU bound
> task might be. Is there a definition?

I'm talking about today's GUI application.  It does computation and it's
riddled with bloated GUI code.  So it is certainly CPU bound.  At the
same time it is interactive (blocking or polling on user input) and
involves some graphics output.  So it is involved in I/O, too.  What
would you consider it?

> So you think of an "I/O bound task" as  "an I/O bound task that spends
> most of its time blocked". Won't the latencies of such tasks already be
> pretty high? I'd think that better caching and read-ahead is the correct
> fix.

I should correct myself, I didn't mean "most of its time" but a
statistically large portion.  All it needs to do is find itself woken
when something else is in the kernel.

> > While we certainly need tangible empirical benefits, users finding their
> > desktop experience smoother and thus more enjoyable is just about the
> > best thing we can ask for.
> 
> It depends on what you want. 

I want a better kernel.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:22                                                                                         ` Robert Love
  2002-01-21 21:54                                                                                           ` yodaiken
@ 2002-01-21 22:18                                                                                           ` Horst von Brand
  2002-01-21 22:53                                                                                             ` Chris Friesen
  2002-01-29 23:12                                                                                             ` Bill Davidsen
  1 sibling, 2 replies; 351+ messages in thread
From: Horst von Brand @ 2002-01-21 22:18 UTC (permalink / raw)
  To: Robert Love; +Cc: linux-kernel

Robert Love <rml@tech9.net> said:

[...]

> It doesn't have to run mostly in the kernel.  It just has to be in the
> kernel when the I/O-bound task awakes.  Further, there are plenty of
> what we consider CPU-bound tasks that are interactive and/or
> graphics-oriented and this adds much to their time in the kernel.

Look, I don't know about you, but system (kernel) time around here is
rarely very high as a %. Perhaps 5% could be called "typical". And it is
during those 5% (i.e., something like 5% of the time) any of this stuff
will make a difference at all. This will be _hard_ to "feel" (if it is
possible to feel at all).

For the mostly positive (subjective) responses you see, there is something
called "psychology", which would predict that for _exactly_ the same "feel"
(whatever that may be) somebody who just made an effort downloading
patches, applying them, reconfiguring and building a kernel "to make it feel
better" _will_ feel it better. I.e., nobody wants to have to say "Okay,
lots of work down the drain". Besides, those who see no difference will
shut up, those that delude themselves most will be vocal about it.

There was a famous experiment in determining productivity of people under
various circumstances. Whatever they changed, the productivity went
up. They finally had to conclude their results weren't due to the
environmental changes, but to the fact that the subjects felt important
(and motivated). Something similar might be happening here.

I'm not saying it _is_ like this, but as long as there are no reproducible
ways to put numbers to the "feel", and make controlled experiments, no one
will know for sure.

> In a given period of time, [...]

Yes, yes, we all know the theory, and it is obviously true. Question is
just, _how much_ is the change? Is it large enough to compensate for the
pain it causes?

[...]

> While we certainly need tangible empirical benefits, users finding their
> desktop experience smoother and thus more enjoyable is just about the
> best thing we can ask for.

It just isn't enough justification for a wholesale redesign of basic
assumptions in the kernel, sorry.
-- 
Horst von Brand			     http://counter.li.org # 22616

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 22:18                                                                                           ` Horst von Brand
@ 2002-01-21 22:53                                                                                             ` Chris Friesen
  2002-01-29 23:12                                                                                             ` Bill Davidsen
  1 sibling, 0 replies; 351+ messages in thread
From: Chris Friesen @ 2002-01-21 22:53 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Robert Love, linux-kernel

Horst von Brand wrote:
> 
> Robert Love <rml@tech9.net> said:
> 
> [...]
> 
> > It doesn't have to run mostly in the kernel.  It just has to be in the
> > kernel when the I/O-bound task awakes.  Further, there are plenty of
> > what we consider CPU-bound tasks that are interactive and/or
> > graphics-oriented and this adds much to their time in the kernel.
> 
> Look, I don't know about you, but system (kernel) time around here is
> rarely very high as a %. Perhaps 5% could be called "typical". And it is
> during those 5% (i.e., something like 5% of the time) any of this stuff
> will make a difference at all. This will be _hard_ to "feel" (if it is
> possible to feel at all).

As a thought experiment...

1) top usually averages over 5 seconds
2) 5% of 5 seconds is 0.25 seconds


What I'm getting at is that it is possible that there are cases where we could
be taking significant amounts of time to respond to something, without the
overall average being too high.

If we have even a single 0.1 second delay, that's going to be noticeable to the
user without seriously bumping up system percentages.
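
The arithmetic can be written out as a short sketch.  The 5 second window
and the 5% figure are the ones quoted above; the function names are
invented for this illustration.

```python
# Illustrative arithmetic only.  The 5 s averaging window and the 5% system
# share come from the discussion above; 0.1 s is the example stall.
WINDOW_S = 5.0        # averaging window attributed to top
SYSTEM_SHARE = 0.05   # "typical" share of time spent in the kernel

def kernel_time_in_window(window_s, share):
    """Seconds of kernel time hidden inside one averaging window."""
    return window_s * share

def share_of_single_stall(stall_s, window_s):
    """Fraction of the window consumed by one uninterrupted stall."""
    return stall_s / window_s

budget = kernel_time_in_window(WINDOW_S, SYSTEM_SHARE)
stall_share = share_of_single_stall(0.1, WINDOW_S)
print(f"kernel time per window: {budget:.2f} s")
print(f"one 0.1 s stall is {stall_share:.0%} of the window")
```

So a stall long enough to notice fits inside a 5% average with room to
spare, which is why the average alone cannot rule such stalls out.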

-- 
Chris Friesen                    | MailStop: 043/33/F10  
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 22:18                                                                                           ` Horst von Brand
  2002-01-21 22:53                                                                                             ` Chris Friesen
@ 2002-01-29 23:12                                                                                             ` Bill Davidsen
  1 sibling, 0 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-29 23:12 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Robert Love, linux-kernel

On Mon, 21 Jan 2002, Horst von Brand wrote:

> Robert Love <rml@tech9.net> said:
> 
> [...]
> 
> > It doesn't have to run mostly in the kernel.  It just has to be in the
> > kernel when the I/O-bound task awakes.  Further, there are plenty of
> > what we consider CPU-bound tasks that are interactive and/or
> > graphics-oriented and this adds much to their time in the kernel.
	[ snip ]
> For the mostly positive (subjective) responses you see, there is something
> called "psychology", which would predict that for _exactly_ the same "feel"
> (whatever that may be) somebody who just made an effort downloading
> patches, applying them, reconfiguring and building a kernel "to make it feel
> better" _will_ feel it better. I.e., nobody wants to have to say "Okay,
> lots of work down the drain". Besides, those who see no difference will
> shut up, those that delude themselves most will be vocal about it.

Ah, that's it, we're deluding ourselves. To the point that booting a
kernel without identification and having casual users watch the cursor
move while running a standard load will result in those users sharing our
delusion.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:48                                                                                     ` Daniel Phillips
  2002-01-21 16:50                                                                                       ` yodaiken
@ 2002-01-21 21:24                                                                                       ` Robert Love
  1 sibling, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-21 21:24 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: yodaiken, george anzinger, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On Mon, 2002-01-21 at 11:48, Daniel Phillips wrote:

> The disk and the IO thread are active a higher portion of the time, while the 
> kernel hog gets the same amount of time.  So in this case we have improved 
> both latency and throughput.
> 
> Naturally I constructed this case to show the effect most clearly.  There are 
> many possible variations on the above scenario.  It does seem to explain the 
> latency/throughput improvements that have been reported in practice.

Well put.  I think this is exactly the scenario we are observing.

It's the same benefit to latency.  We react quicker.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:06                                                                                   ` yodaiken
  2002-01-21 16:33                                                                                     ` Peter Wächtler
  2002-01-21 16:48                                                                                     ` Daniel Phillips
@ 2002-01-21 21:16                                                                                     ` Robert Love
  2002-01-21 21:33                                                                                       ` Andrew Morton
  2002-01-21 21:49                                                                                       ` yodaiken
  2002-01-22  0:27                                                                                     ` Roman Zippel
  3 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-21 21:16 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, 2002-01-21 at 11:06, yodaiken@fsmlabs.com wrote:

> I have not seen a single well structured benchmark that shows a significant
> difference. I've seen lots of benchmarks with odd mixes of different patches
> showing something unknown. How about a simple clear dbench?

I and many others have been posting benchmarks for months.

Here:

(average of 4 runs of `dbench 16')
2.5.3-pre1:		25.7608 MB/s
2.5.3-pre1-preempt:	32.341 MB/s

(old, average of 4 runs of `dbench 16')
2.5.2-pre11:		24.5364 MB/s
2.5.2-pre11-preempt:	27.5192 MB/s
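
For scale, the relative gains implied by these averages can be worked out
with a short sketch.  It uses only the four averages above; the helper
name is invented for illustration, and the result says nothing about
run-to-run variance.

```python
# Averages in MB/s as posted above: (stock kernel, preempt kernel).
results = {
    "2.5.3-pre1": (25.7608, 32.341),
    "2.5.2-pre11": (24.5364, 27.5192),
}

def relative_gain(base, patched):
    """Fractional throughput change of the patched kernel over the base."""
    return (patched - base) / base

gains = {name: relative_gain(*pair) for name, pair in results.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:+.1%}")
```

That works out to roughly a quarter more throughput on 2.5.3-pre1 and
roughly an eighth more on 2.5.2-pre11.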

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:16                                                                                     ` Robert Love
@ 2002-01-21 21:33                                                                                       ` Andrew Morton
  2002-01-21 21:59                                                                                         ` J Sloan
  2002-01-21 21:49                                                                                       ` yodaiken
  1 sibling, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-21 21:33 UTC (permalink / raw)
  To: Robert Love
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

Robert Love wrote:
> 
> On Mon, 2002-01-21 at 11:06, yodaiken@fsmlabs.com wrote:
> 
> > I have not seen a single well structured benchmark that shows a significant
> > difference. I've seen lots of benchmarks with odd mixes of different patches
> > showing something unknown. How about a simple clear dbench?
> 
> I and many others have been posting benchmarks for months.
> 
> Here:
> 
> (average of 4 runs of `dbench 16')
> 2.5.3-pre1:             25.7608 MB/s
> 2.5.3-pre1-preempt:     32.341 MB/s
> 

Well I spent several hours last week trying to reproduce and
account for these observations and basically came up with
nothing.  The patch-induced variation was of the same order
as the inter-run variation.

You should publish the dbench dots.  They're most informative.
Look at these two:

16 clients started
..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................+..............................................................................................+++++++++++++++****************
Throughput 6.48209 MB/sec (NB=8.10261 MB/sec  64.8209 MBit/sec)

16 clients started
.................................................................................................................................................................................................................................+...................................................................................................................................................................................................................................................................+................................................................................+............................+++++++++++++****************
Throughput 7.76901 MB/sec (NB=9.71126 MB/sec  77.6901 MBit/sec)

These are identical runs.  Empty filesystem, 64 megs of memory.

Note how in the second run, a few clients terminated early,
and the throughput numbers increased quite markedly.
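
To put a number on that, here is a small sketch comparing the spread
between these two identical runs with the gain in the averages quoted at
the top of this message.  Only figures posted in this thread are used; the
helper name is made up for the example.

```python
def relative_difference(low, high):
    """Fractional increase from the lower figure to the higher one."""
    return (high - low) / low

# Two identical dbench runs from this message, in MB/s.
run_spread = relative_difference(6.48209, 7.76901)

# Averages quoted above for 2.5.3-pre1 without and with preempt, in MB/s.
claimed_gain = relative_difference(25.7608, 32.341)

print(f"spread between identical runs: {run_spread:.1%}")
print(f"gain in the quoted averages:   {claimed_gain:.1%}")
```

The two figures are of the same order, so a handful of runs cannot
separate a patch effect from this kind of instability.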

I don't know what causes this, and frankly I'm not really
interested.  It's some bizarre freaky dbench instability.

Similar effects occur with the `make -j12 bzImage' swapstorm
test.  After a while, all the `cc' instances start to
get synchronised at the onset of their peak memory demand.
The earlier and longer this happens, the worse the runtime.
It's an unstable system and tiny input perturbations create
large effects on the output.

Sorry, but these are not interesting, repeatable or stable workloads,
and I remain unconvinced about claims that low-latency or preempt
aid I/O throughput.  And even if a statistically significant
benefit _can_ be empirically demonstrated, it would be incautious
to claim a general benefit without a solid explanation of what
is causing it.  (Apart from the RAID5 kernel thread starvation
effect, which _is_ explained).

If someone can show me a sensible and repeatable I/O gain then
great, I can go in and work out exactly where it's coming from
and then we finally have a real, tangible, non-hand-wavy
explanation.  It may be there, but I just don't see it yet.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:33                                                                                       ` Andrew Morton
@ 2002-01-21 21:59                                                                                         ` J Sloan
  0 siblings, 0 replies; 351+ messages in thread
From: J Sloan @ 2002-01-21 21:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Linux kernel

Remember, the -aa kernels are the dbench champs -

They don't use preempt, but do use mini-ll...

Just a rant from the peanut gallery....

cu

jjs

Andrew Morton wrote:

>Robert Love wrote:
>
>>On Mon, 2002-01-21 at 11:06, yodaiken@fsmlabs.com wrote:
>>
>>>I have not seen a single well structured benchmark that shows a significant
>>>difference. I've seen lots of benchmarks with odd mixes of different patches
>>>showing something unknown. How about a simple clear dbench?
>>>
>>I and many others have been posting benchmarks for months.
>>
>>Here:
>>
>>(average of 4 runs of `dbench 16')
>>2.5.3-pre1:             25.7608 MB/s
>>2.5.3-pre1-preempt:     32.341 MB/s
>>
>
>Well I spent several hours last week trying to reproduce and
>account for these observations and basically came up with
>nothing.  The patch-induced variation was of the same order
>as the inter-run variation.
>
>You should publish the dbench dots.  They're most informative.
>Look at these two:
>
>16 clients started
>..................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................+..............................................................................................+++++++++++++++****************
>Throughput 6.48209 MB/sec (NB=8.10261 MB/sec  64.8209 MBit/sec)
>
>16 clients started
>.................................................................................................................................................................................................................................+...................................................................................................................................................................................................................................................................+................................................................................+............................+++++++++++++****************
>Throughput 7.76901 MB/sec (NB=9.71126 MB/sec  77.6901 MBit/sec)
>
>These are identical runs.  Empty filesystem, 64 megs of memory.
>
>Note how in the second run, a few clients terminated early,
>and the throughput numbers increased quite markedly.
>
>I don't know what causes this, and frankly I'm not really
>interested.  It's some bizarre freaky dbench instability.
>
>Similar effects occur with the `make -j12 bzImage' swapstorm
>test.  After a while, all the `cc' instances start to
>get synchronised at the onset of their peak memory demand.
>The earlier and longer this happens, the worse the runtime.
>It's an unstable system and tiny input perturbations create
>large effects on the output.
>
>Sorry, but these are not interesting, repeatable or stable workloads,
>and I remain unconvinced about claims that low-latency or preempt
>aid I/O throughput.  And even if a statistically significant
>benefit _can_ be empirically demonstrated, it would be incautious
>to claim a general benefit without a solid explanation of what
>is causing it.  (Apart from the RAID5 kernel thread starvation
>effect, which _is_ explained).
>
>If someone can show me a sensible and repeatable I/O gain then
>great, I can go in and work out exactly where it's coming from
>and then we finally have a real, tangible, non-hand-wavy
>explanation.  It may be there, but I just don't see it yet.
>
>-



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:16                                                                                     ` Robert Love
  2002-01-21 21:33                                                                                       ` Andrew Morton
@ 2002-01-21 21:49                                                                                       ` yodaiken
  2002-01-21 22:01                                                                                         ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-21 21:49 UTC (permalink / raw)
  To: Robert Love
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel



On Mon, Jan 21, 2002 at 04:16:51PM -0500, Robert Love wrote:
> On Mon, 2002-01-21 at 11:06, yodaiken@fsmlabs.com wrote:
> 
> > I have not seen a single well structured benchmark that shows a significant
> > difference. I've seen lots of benchmarks with odd mixes of different patches
> > showing something unknown. How about a simple clear dbench?
> 
> I and many others have been posting benchmarks for months.
> 
> Here:
> 
> (average of 4 runs of `dbench 16')
> 2.5.3-pre1:		25.7608 MB/s
> 2.5.3-pre1-preempt:	32.341 MB/s
> 
> (old, average of 4 runs of `dbench 16')
> 2.5.2-pre11:		24.5364 MB/s
> 2.5.2-pre11-preempt:	27.5192 MB/s
> 

Robert, with all due respect, my tests of dbench show such high
variation that 4 miserable runs prove exactly nothing.
Did these even come on the same filesystem? 



---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:49                                                                                       ` yodaiken
@ 2002-01-21 22:01                                                                                         ` Robert Love
  2002-01-21 20:52                                                                                           ` Marcelo Tosatti
  2002-01-21 23:56                                                                                           ` yodaiken
  0 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-21 22:01 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, 2002-01-21 at 16:49, yodaiken@fsmlabs.com wrote:

> > (average of 4 runs of `dbench 16')
> > 2.5.3-pre1:		25.7608 MB/s
> > 2.5.3-pre1-preempt:	32.341 MB/s
> > 
> > (old, average of 4 runs of `dbench 16')
> > 2.5.2-pre11:		24.5364 MB/s
> > 2.5.2-pre11-preempt:	27.5192 MB/s

> Robert, with all due respect, my tests of dbench show such high
> variation that 4 miserable runs prove exactly nothing.

Well you asked for dbench.  Would you prefer 10 runs each?  There were,
however, no statistical anomalies and the variation was low enough such
that I suspect I could construct a reasonable confidence interval from
these 16 runs.
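
For anyone who wants to check a claim like this against their own runs,
here is a hedged sketch of a 95% confidence interval for a mean from a
small sample.  The per-run figures behind the averages above were not
posted, so the example input is the pair of identical dbench runs that
were posted elsewhere in this thread.  The critical values are the standard
two-sided Student t values; the function name is invented for this
illustration.

```python
import math
import statistics

# Two-sided 95% Student t critical values, keyed by degrees of freedom.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182}

def mean_ci95(samples):
    """Return (low, mean, high) for a 95% confidence interval of the mean.

    Only sample sizes of 2 to 4 are covered by the table above.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of mean
    half_width = T_95[n - 1] * sem
    return mean - half_width, mean, mean + half_width

# The two identical dbench runs posted elsewhere in this thread, in MB/s.
identical_runs = [6.48209, 7.76901]
low, mean, high = mean_ci95(identical_runs)
print(f"mean {mean:.3f} MB/s, 95% CI [{low:.3f}, {high:.3f}]")
```

With four runs per kernel the same function applies with three degrees of
freedom; how tight the interval is then depends entirely on how much the
individual runs vary.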

I've run these tests over and over again sufficiently that the
repeatability of obtaining improved marks under a preemptive kernel is
evident to me.

You can see very old (2.4.6) yet still positive results from Nigel, too:
http://kpreempt.sourceforge.net.

I guess the point is, everyone argues preemption is detrimental to
throughput.  I'm not going to argue that we aren't adding complexity,
because clearly we are.  But now we have tests showing throughput is
improved and people still argue.  I've seen the same behavior under
bonnie, timing kernel compiles, etc ...

> Did these even come on the same filesystem?

Yes, why would you suspect otherwise?

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 22:01                                                                                         ` Robert Love
@ 2002-01-21 20:52                                                                                           ` Marcelo Tosatti
  2002-01-21 22:26                                                                                             ` Robert Love
  2002-01-21 23:56                                                                                           ` yodaiken
  1 sibling, 1 reply; 351+ messages in thread
From: Marcelo Tosatti @ 2002-01-21 20:52 UTC (permalink / raw)
  To: Robert Love
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel



On 21 Jan 2002, Robert Love wrote:

> On Mon, 2002-01-21 at 16:49, yodaiken@fsmlabs.com wrote:
> 
> > > (average of 4 runs of `dbench 16')
> > > 2.5.3-pre1:		25.7608 MB/s
> > > 2.5.3-pre1-preempt:	32.341 MB/s
> > > 
> > > (old, average of 4 runs of `dbench 16')
> > > 2.5.2-pre11:		24.5364 MB/s
> > > 2.5.2-pre11-preempt:	27.5192 MB/s
> 
> > Robert, with all due respect, my tests of dbench show such high
> > variation that 4 miserable runs prove exactly nothing.
> 
> Well you asked for dbench.  Would you prefer 10 runs each?  There were,
> however, no statistical anomalies and the variation was low enough such
> that I suspect I could construct a reasonable confidence interval from
> these 16 runs.
> 
> I've run these tests over and over again sufficiently that the
> repeatability of obtaining improved marks under a preemptive kernel is
> evident to me.
> 
> You can see very old (2.4.6) yet still positive results from Nigel, too:
> http://kpreempt.sourceforge.net.
> 
> I guess the point is, everyone argues preemption is detrimental to
> throughput.  I'm not going to argue that we aren't adding complexity,
> because clearly we are.  But now we have tests showing throughput is
> improved and people still argue.  I've seen the same behavior under
> bonnie, timing kernel compiles, etc ...

Sure, you've seen it. But _why_ does it happen?

That is the point.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 20:52                                                                                           ` Marcelo Tosatti
@ 2002-01-21 22:26                                                                                             ` Robert Love
  0 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-21 22:26 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, 2002-01-21 at 15:52, Marcelo Tosatti wrote:

> > I guess the point is, everyone argues preemption is detrimental to
> > throughput.  I'm not going to argue that we aren't adding complexity,
> > because clearly we are.  But now we have tests showing throughput is
> > improved and people still argue.  I've seen the same behavior under
> > bonnie, timing kernel compiles, etc ...
> 
> Sure, you've seen it. But _why_ does it happen?
> 
> That is the point.

Daniel just reiterated it, but I suspect we better multitask a mix of
tasks.  I/O-bound tasks that are woken can be run quicker and thus
throughput increases.

I'm not trying to tout preempt-kernel as a throughput solution.  I think
it is a neat and promising side-note to the patch, and one that
benchmarks are correlating.  Ignore it as a statistical error and
consider throughput untouched if you want. 

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 22:01                                                                                         ` Robert Love
  2002-01-21 20:52                                                                                           ` Marcelo Tosatti
@ 2002-01-21 23:56                                                                                           ` yodaiken
  2002-01-22  0:45                                                                                             ` Roman Zippel
                                                                                                               ` (2 more replies)
  1 sibling, 3 replies; 351+ messages in thread
From: yodaiken @ 2002-01-21 23:56 UTC (permalink / raw)
  To: Robert Love
  Cc: yodaiken, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, Roman Zippel, linux-kernel

On Mon, Jan 21, 2002 at 05:01:45PM -0500, Robert Love wrote:
> On Mon, 2002-01-21 at 16:49, yodaiken@fsmlabs.com wrote:
> 
> > > (average of 4 runs of `dbench 16')
> > > 2.5.3-pre1:		25.7608 MB/s
> > > 2.5.3-pre1-preempt:	32.341 MB/s
> > > 
> > > (old, average of 4 runs of `dbench 16')
> > > 2.5.2-pre11:		24.5364 MB/s
> > > 2.5.2-pre11-preempt:	27.5192 MB/s
> 
> > Robert, with all due respect, my tests of dbench show such high
> > variation that 4 miserable runs prove exactly nothing.
> 
> Well you asked for dbench.  Would you prefer 10 runs each?  There were,

50. And I'd like standard deviation as well as average, best, worst.

And then I'd like to see the same test done with an RT task running in
the background - since I assume preempt has as its main purpose
to enable SCHED_FIFO.
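
The summary asked for here (average, standard deviation, best, worst over
many runs) could be produced along these lines. This is only a minimal
sketch and not code from the thread: the names run_stats, summarize_runs
and approx_sqrt are made up for illustration, and the square root uses a
small Newton iteration purely to avoid a libm dependency.

```c
#include <stddef.h>

/* Summary of a set of benchmark runs, e.g. dbench throughput in MB/s. */
struct run_stats {
	double mean;
	double stddev;	/* sample standard deviation (n - 1 denominator) */
	double best;	/* highest throughput seen */
	double worst;	/* lowest throughput seen */
};

/*
 * Newton's method square root, used only to avoid linking libm.
 * Adequate for benchmark-scale magnitudes; not a general replacement
 * for sqrt().
 */
static double approx_sqrt(double x)
{
	double g;
	int i;

	if (x <= 0.0)
		return 0.0;
	g = x > 1.0 ? x : 1.0;
	for (i = 0; i < 64; i++)
		g = 0.5 * (g + x / g);
	return g;
}

/* Returns 0 on success, -1 if fewer than two runs are supplied. */
int summarize_runs(const double *mbps, size_t n, struct run_stats *out)
{
	double sum = 0.0, sq = 0.0;
	size_t i;

	if (n < 2)
		return -1;

	out->best = out->worst = mbps[0];
	for (i = 0; i < n; i++) {
		sum += mbps[i];
		if (mbps[i] > out->best)
			out->best = mbps[i];
		if (mbps[i] < out->worst)
			out->worst = mbps[i];
	}
	out->mean = sum / (double)n;

	/* Second pass: sum of squared deviations from the mean. */
	for (i = 0; i < n; i++) {
		double d = mbps[i] - out->mean;
		sq += d * d;
	}
	out->stddev = approx_sqrt(sq / (double)(n - 1));
	return 0;
}
```

Feeding it the 50 per-run figures for each kernel would show whether the
gap between the two means is large compared with the spread, which is the
question four-run averages cannot answer.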

> I guess the point is, everyone argues preemption is detrimental to
> throughput.  I'm not going to argue that we aren't adding complexity,

I have not seen that argued - certainly I have not argued it myself.
My argument is:
	It makes the kernel _much_ more complex
	It has known costs e.g. by making the lockless 
		per-processor caching  more difficult if not impossible
	It seems to lead to a requirement for inheritance
	It has no demonstrated benefits.

> > Did these even come on the same filesystem?
> 
> Yes, why would you suspect otherwise?
> 

Because the prior cited benchmark was a kernel compile on different trees.
Was the filesystem unchanged between runs?
I'm not suggesting you are cheating, it's easy to overlook something 
critical when there are so many variables.


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 23:56                                                                                           ` yodaiken
@ 2002-01-22  0:45                                                                                             ` Roman Zippel
  2002-01-22  1:34                                                                                               ` yodaiken
  2002-01-22  2:10                                                                                             ` Daniel Phillips
  2002-01-29 23:36                                                                                             ` Bill Davidsen
  2 siblings, 1 reply; 351+ messages in thread
From: Roman Zippel @ 2002-01-22  0:45 UTC (permalink / raw)
  To: yodaiken
  Cc: Robert Love, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> I have not seen that argued - certainly I have not argued it myself.
> My argument is:

Which is your usual FUD and everything was discussed before, just
repeating claims without any arguments doesn't make it better. :-(

>         It makes the kernel _much_ more complex

It certainly adds complexity, but which new feature doesn't?
Please specify "_much_ more complex".

>         It has known costs e.g. by making the lockless
>                 per-processor caching  more difficult if not impossible

It would only expose the locking, which is currently hidden in the smp
infrastructure.
What's so "impossible" about it?

>         It seems to lead to a requirement for inheritance

Where? I still haven't seen any proof for this.

>         It has no demonstrated benefits.

http://kpreempt.sourceforge.net/benno/linux+kp-2.4.6/3x256.html

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-22  0:45                                                                                             ` Roman Zippel
@ 2002-01-22  1:34                                                                                               ` yodaiken
  2002-01-22  9:13                                                                                                 ` Roman Zippel
  0 siblings, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-22  1:34 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Robert Love, Daniel Phillips, george anzinger,
	Momchil Velikov, Arjan van de Ven, linux-kernel

On Tue, Jan 22, 2002 at 01:45:37AM +0100, Roman Zippel wrote:
> Hi,
> 
> yodaiken@fsmlabs.com wrote:
> 
> > I have not seen that argued - certainly I have not argued it myself.
> > My argument is:
> 
> Which is your usual FUD and everything was discussed before, just
> repeating claims without any arguments doesn't make it better. :-(
> 

I was not talking to you and will not be.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-22  1:34                                                                                               ` yodaiken
@ 2002-01-22  9:13                                                                                                 ` Roman Zippel
  0 siblings, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-22  9:13 UTC (permalink / raw)
  To: yodaiken
  Cc: Robert Love, Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, linux-kernel

Hi,

> I was not talking to you and will not be.

Thanks.
Somehow I have the feeling you didn't have anything important to tell me
anyway. :-)

bye, Roman

PS: SCNR


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 23:56                                                                                           ` yodaiken
  2002-01-22  0:45                                                                                             ` Roman Zippel
@ 2002-01-22  2:10                                                                                             ` Daniel Phillips
  2002-01-24 15:19                                                                                               ` yodaiken
  2002-01-29 23:36                                                                                             ` Bill Davidsen
  2 siblings, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-22  2:10 UTC (permalink / raw)
  To: yodaiken, Robert Love
  Cc: yodaiken, george anzinger, Momchil Velikov, Arjan van de Ven,
	Roman Zippel, linux-kernel

On January 22, 2002 12:56 am, yodaiken@fsmlabs.com wrote:
> I have not seen that argued - certainly I have not argued it myself.
> My argument is:
> 	It makes the kernel _much_ more complex

The patch itself is simple, so this must be an extended interpretation of the 
word 'complex'.

> 	It has known costs e.g. by making the lockless 
> 		per-processor caching  more difficult if not impossible

Not at all, the lazy man's way of dealing with this is to disable preemption 
around that code, an efficient operation.

> 	It seems to lead to a requirement for inheritance

I don't know about that.  From the (long) thread above, it looks like you 
haven't successfully proved the assertion that -preempt introduces any new 
inheritance requirement.

> 	It has no demonstrated benefits.

Demonstrated to who?  I have certainly demonstrated the benefits to myself, 
and others have attested to doing the same.

As far as arguments go, your main points don't seem to be rooted in firm 
ground at all.  On the other hand, the proponents of this patch have 
compelling arguments: it makes Linux feel smoother, it makes certain tests 
run faster, it doesn't slow anything down measurably, it's stable and so on.  
I even explained why it does what it does.  I don't understand why you're so 
vehemently opposed to this, especially as it's a config option.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-22  2:10                                                                                             ` Daniel Phillips
@ 2002-01-24 15:19                                                                                               ` yodaiken
  2002-01-24 21:15                                                                                                 ` Roman Zippel
  2002-01-26  2:36                                                                                                 ` Jamie Lokier
  0 siblings, 2 replies; 351+ messages in thread
From: yodaiken @ 2002-01-24 15:19 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: yodaiken, Robert Love, george anzinger, Momchil Velikov,
	Arjan van de Ven, linux-kernel

On Tue, Jan 22, 2002 at 03:10:42AM +0100, Daniel Phillips wrote:
> On January 22, 2002 12:56 am, yodaiken@fsmlabs.com wrote:
> > I have not seen that argued - certainly I have not argued it myself.
> > My argument is:
> > 	It makes the kernel _much_ more complex
> 
> The patch itself is simple, so this must be an extended interpretation of the 
> word 'complex'.

I'm at a loss here. You seem to be arguing that
there is a relationship between the complexity of a patch
that changes the entire synchronization assumption of the
kernel and the complexity of the result. But that seems
unbelievable. Is there some component of your argument I've missed?

> 
> > 	It has known costs e.g. by making the lockless 
> > 		per-processor caching  more difficult if not impossible
> 
> Not at all, the lazy man's way of dealing with this is to disable preemption 
> around that code, an efficient operation.

Well, aside from how easy it will be to isolate all that information,
doesn't that defeat the purpose of the patch?  There is a big
difference in design between
	try to get from very lightweight cache
		fallback to slow but fair and safe pool
and
	try to bound worst case times

In the first case we amortize worst case times by making average case
very low. This is a common design methodology in the Linux kernel: semaphores
are the classic example.
So I'm sure that you can add any number of hacks to the kernel, but my
argument stands: per-processor caching is a common case optimization that
de-optimizes worst case. If the purpose of preemption is to reduce
latency, per-processor caching is at cross-purposes.
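
The first design (try the lightweight cache, fall back to the slow pool)
can be sketched in user space roughly as follows. This is an illustration
under stated assumptions, not kernel code: CACHE_NCPUS, CACHE_SLOTS,
obj_alloc, obj_free and the static pool are all hypothetical, and the
locking that a real shared pool would need is only indicated in comments.

```c
#include <stddef.h>

#define CACHE_NCPUS 2	/* hypothetical CPU count */
#define CACHE_SLOTS 4	/* hypothetical per-CPU cache depth */

/* Small per-CPU stack of free objects; touched without any lock. */
struct cpu_cache {
	void *slot[CACHE_SLOTS];
	int count;
};

static struct cpu_cache cache[CACHE_NCPUS];
static unsigned long slow_path_hits;	/* how often the fallback ran */

/* Stand-in for the slow but fair and safe global pool. */
static char pool_storage[64][32];
static int pool_next;

static void *pool_alloc(void)
{
	/* A real shared pool would take a lock here. */
	slow_path_hits++;
	if (pool_next >= 64)
		return NULL;
	return pool_storage[pool_next++];
}

void *obj_alloc(int cpu)
{
	struct cpu_cache *c = &cache[cpu];

	if (c->count > 0)
		return c->slot[--c->count];	/* common case: cheap */
	return pool_alloc();			/* rare case: expensive */
}

void obj_free(int cpu, void *obj)
{
	struct cpu_cache *c = &cache[cpu];

	if (c->count < CACHE_SLOTS)
		c->slot[c->count++] = obj;
	/* A full cache would hand obj back to the pool; omitted here. */
}
```

The average cost is dominated by the cheap path, while the worst case is
whatever the fallback costs, which is the amortization being described.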


It's also worth pointing out that every use of cpu-specific information
is dangerous if preemption is extended to smp.
	x = smp_processor_id();
	//get preempted
	do_something(memslab[x]); // used to be safe since only current can
				// do this.
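
The hazard in the fragment above, and the guard suggested earlier in the
thread (disable preemption around the per-CPU access), can be modelled in
user space. Everything below is a toy sketch: the model_* helpers and the
forced "migration" are invented for illustration and are not the kernel's
preempt_disable()/preempt_enable() or smp_processor_id() implementations.

```c
#define MODEL_NCPUS 2

static int current_cpu;			/* CPU the modelled task runs on */
static int preempt_count;		/* > 0 means preemption is disabled */
static int memslab[MODEL_NCPUS];	/* per-CPU data, as in the fragment */

static int  model_smp_processor_id(void) { return current_cpu; }
static void model_preempt_disable(void)  { preempt_count++; }
static void model_preempt_enable(void)   { preempt_count--; }

/*
 * A possible preemption point. If preemption is allowed, the model
 * pessimistically moves the task to the other CPU.
 */
static void model_preempt_point(void)
{
	if (preempt_count == 0)
		current_cpu = (current_cpu + 1) % MODEL_NCPUS;
}

/* Unguarded access: returns 1 if x still names the CPU we run on. */
int touch_unguarded(void)
{
	int x = model_smp_processor_id();

	model_preempt_point();		/* task may move here */
	memslab[x]++;			/* may touch another CPU's slot */
	return x == model_smp_processor_id();
}

/* Guarded access: the CPU id cannot go stale inside the section. */
int touch_guarded(void)
{
	int x, ok;

	model_preempt_disable();
	x = model_smp_processor_id();
	model_preempt_point();		/* no effect while disabled */
	memslab[x]++;
	ok = (x == model_smp_processor_id());
	model_preempt_enable();
	return ok;
}
```

In this model touch_unguarded() always ends up on the wrong CPU, whereas
touch_guarded() never does. The price is that the guarded section becomes
a non-preemptible region, which is the latency concern raised here.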

	
> > 	It seems to lead to a requirement for inheritance
> 
> I don't know about that.  From the (long) thread above, it looks like you 
> haven't successfully proved the assertion that -preempt introduces any new 
> inheritance requirement.

Oliver cited the trivial case. He was ignored.

> > 	It has no demonstrated benefits.
> 
> Demonstrated to who?  I have certainly demonstrated the benefits to myself, 
> and others have attested to doing the same.

I've heard similar arguments in favor of aromatherapy and Scientology.

What's amazing about all the arguments in favor of preemption is that we
don't see any published numbers of the obvious application: a periodic
SCHED_FIFO process. We've done these experiments and the results are
_dismal_. 
Even ignoring this, the repeated publication of numbers showing that
Andrew's patches get better results and the repeated statement by 
Andrew that the hard part of latency reduction
is _not_ solved by preemption alone
is continually met with repetitions of the same unsubstantiated chorus:
	But it is easier to maintain  



> As far as arguments go, your main points don't seem to be rooted in firm 
> ground at all.  On the other hand, the proponents of this patch have 
> compelling arguments: it makes Linux feel smoother, it makes certain tests 
> run faster, it doesn't slow anything down measurably, it's stable and so on.  
> I even explained why it does what it does.  I don't understand why you're so 
> vehemently opposed to this, especially as it's a config option.

What you proposed is a
claimed explanation of why a task that experienced regular unfixable
latencies of multiple milliseconds waiting for I/O would have additional
latencies possibly reduced by some unknown amount. You failed to make a
case that this is either something that actually happens or that it would
ever make any difference.

BTW: I've made no arguments that the patch should not be made an option.
I've argued that it is based on a very bad design premise. As for
whether it should be added to 2.5,  sold by other vendors, advertised on TV,
used when operating heavy machinery, or taken by pregnant women,
that's not up to me.


-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-24 15:19                                                                                               ` yodaiken
@ 2002-01-24 21:15                                                                                                 ` Roman Zippel
  2002-01-26  2:36                                                                                                 ` Jamie Lokier
  1 sibling, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-24 21:15 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, Robert Love, george anzinger, Momchil Velikov,
	Arjan van de Ven, linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

I know you're not talking to me (you even removed me and only me from
the Cc list...) and personally I don't care, but other people might get
the impression you had something to say.

> > The patch itself is simple, so this must be an extended interpretation of the 
> > word 'complex'.
> 
> I'm at a loss here. You seem to be arguing that
> there is a relationship between the complexity of a patch
> that changes the entire synchronization assumption of the
> kernel and the complexity of the result. But that seems
> unbelievable. Is there some component of your argument I've missed?

Which "entire synchronization assumption"? It changes timing
assumptions, which sometimes are used for synchronization. If something
breaks because of this, it was already broken before.
Maybe you could explain this "relationship" in a bit more detail?

> It's also worth pointing out that every use of cpu-specific information
> is dangerous if preemption is extended to smp.
>         x = smp_processor_id();
>         //get preempted
>         do_something(memslab[x]); // used to be safe since only current can
>                                 // do this.

This is a well known problem, which is also fixable. It's also known,
that these fixes would have a larger impact on the kernel, for that
reason preempt for smp is not a real option right now, so that it is
outside the scope of this discussion (or what's left of it).

> > >     It seems to lead to a requirement for inheritance
> >
> > I don't know about that.  From the (long) thread above, it looks like you
> > haven't successfully proved the assertion that -preempt introduces any new
> > inheritance requirement.
> 
> Oliver cited the trivial case. He was ignored.

I really tried to find a detailed explanation of this case, but I
haven't found anything. It's possible I missed it, so I'd be happy about
any pointer. From the hints I found, that case seemed rather unlikely and
seemed to assume a really bad scheduler.

> I've heard similar arguments in favor of aromatherapy and Scientology.

Now we are compared to this. Nice "arguments". :-(

> What's amazing about all the arguments in favor of preemption is that we
> don't see any published numbers of the obvious application: a periodic
> SCHED_FIFO process. We've done these experiments and the results are
> _dismal_.

url?

> Even ignoring this, the repeated publication of numbers showing that
> Andrew's patches get better results and the repeated statement by
> Andrew that the hard part of latency reduction
> is _not_ solved by preemption alone
> is continually met with repetitions of the same unsubstantiated chorus:
>         But it is easier to maintain

"repeated statement"? I've seen Andrew stating it once, repeated by you
and here is my (unanswered) response
http://marc.theaimsgroup.com/?l=linux-kernel&m=101100365432615&w=2.

> > As far as arguments go, your main points don't seem to be rooted in firm
> > ground at all.  On the other hand, the proponents of this patch have
> > compelling arguments: it makes Linux feel smoother, it makes certain tests
> > run faster, it doesn't slow anything down measurably, it's stable and so on.
> > I even explained why it does what it does.  I don't understand why you're so
> > vehemently opposed to this, especially as it's a config option.
> 
> What you proposed is a
> claimed explanation of why a task that experienced regular unfixable
> latencies of multiple milliseconds waiting for I/O would have additional
> latencies possibly reduced by some unknown amount. You failed to make a
> case that this is either something that actually happens or that it would
> ever make any difference.

You happily ignore the cases where it makes a difference. They might be
purely empirical, but they exist and you make no attempt at explaining
them.

> I've argued that it is based on a very bad design premise.

"Arguing" means for me to come up with arguments, which are really rare
from you. Maybe you can surprise us with some real arguments? I'd be
even happy about urls. So far I have only seen statements with very thin
backing. If I missed something, I still have an open offer for an
apology.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-24 15:19                                                                                               ` yodaiken
  2002-01-24 21:15                                                                                                 ` Roman Zippel
@ 2002-01-26  2:36                                                                                                 ` Jamie Lokier
  1 sibling, 0 replies; 351+ messages in thread
From: Jamie Lokier @ 2002-01-26  2:36 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, Robert Love, george anzinger, Momchil Velikov,
	Arjan van de Ven, linux-kernel

yodaiken@fsmlabs.com wrote:
> > > 	It has no demonstrated benefits.
> > 
> > Demonstrated to who?  I have certainly demonstrated the benefits to
> > myself, and others have attested to doing the same.
> 
> I've heard similar arguments in favor of aromatherapy and Scientology.
> 
> What's amazing about all the arguments in favor of preemption is that we
> don't see any published numbers of the obvious application: a periodic
> SCHED_FIFO process. We've done these experiments and the results are
> _dismal_. 

Hi Victor,

I am specifically interested in SCHED_FIFO performance (for a software
modem driver).  Can you publish these results of yours of SCHED_FIFO
performance with & without the preempt patches?

Thanks,
-- Jamie

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 23:56                                                                                           ` yodaiken
  2002-01-22  0:45                                                                                             ` Roman Zippel
  2002-01-22  2:10                                                                                             ` Daniel Phillips
@ 2002-01-29 23:36                                                                                             ` Bill Davidsen
  2 siblings, 0 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-29 23:36 UTC (permalink / raw)
  To: yodaiken; +Cc: Linux Kernel Mailing List

On Mon, 21 Jan 2002 yodaiken@fsmlabs.com wrote:

> I have not seen that argued - certainly I have not argued it myself.
> My argument is:
> 	It makes the kernel _much_ more complex
  It modifies a tiny fraction of a percent of the kernel, which is
currently simplistic rather than simple. Nearly everyone who looks at it
makes some improvement, be it preempt, low latency, etc.

> 	It has known costs e.g. by making the lockless 
> 		per-processor caching  more difficult if not impossible
  How much slowdown did you measure when you tested the effect of that?

> 	It seems to lead to a requirement for inheritance
  To the limited extent that I agree, so what?

> 	It has no demonstrated benefits.
  You have that backward. There are  many people who say they can see a
benefit, and no one has shown either a quantified bad impact or a single
user account which said it was worse. And I bet you looked, didn't you?

  I believe that a system will run better for a single user, and better
for a server with high interrupt rates, like DNS or web servers, where
many threads may be blocked on i/o, but there is significant CPU load as
well.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:06                                                                                   ` yodaiken
                                                                                                       ` (2 preceding siblings ...)
  2002-01-21 21:16                                                                                     ` Robert Love
@ 2002-01-22  0:27                                                                                     ` Roman Zippel
  3 siblings, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-22  0:27 UTC (permalink / raw)
  To: yodaiken
  Cc: Daniel Phillips, george anzinger, Momchil Velikov,
	Arjan van de Ven, linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> > don't you?  As for the measured benefit, there have been a steady stream of
> > positive reports on lkml.
> 
> I have not seen a single well structured benchmark that shows a significant
> difference. I've seen lots of benchmarks with odd mixes of different patches
> showing something unknown. How about a simple clear dbench?

So let's explore the benchmark argument a bit. As usual, some things need
to be clarified first.

It would be useful to know what we actually want to measure in the
benchmark. The main goal is to reduce scheduling latencies, so it would
make sense to test this. Results can be found at
http://www.gardena.net/benno/linux/audio/ and
http://kpreempt.sourceforge.net/. I haven't found a direct comparison of
preempt+lockbreak and the ll patch, but I wouldn't expect major differences
and it isn't really important. It is only important that both approaches
improve the scheduling latency considerably.

Now what is mostly mentioned in this thread are i/o benchmarks.
Improving i/o performance isn't really the main goal of the patches, so
it's only important that i/o performance isn't harmed. So far I haven't
seen any results indicating something like this, so case closed.

Victor, could you please explain what you are trying to prove with the
benchmark argument??? I see you arguing a lot, but I don't really see
for what.

Finally my theory of why preempt performs better. If we assume that the
kernel spends too much time in kernel space, we only need to look at the
ll patch for possible latency problems. My favourites are two places -
copy_(to|from)_user and the page table scan. Both are likely places for
loads like grep or kernel compile. I can see two possible effects here.
First, lots of small i/o can be issued faster instead of copying data
around. Second, we spend too much time scanning the page tables looking
for freeable pages while enough i/o is already scheduled, so instead
something useful is done and we just wait for the i/o to finish. Anyway,
anyone who really wants to know could modify the profiling code and
test where in the kernel we schedule so often.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 16:05                                                                                 ` Daniel Phillips
  2002-01-21 16:06                                                                                   ` yodaiken
@ 2002-01-21 19:26                                                                                   ` Mark Hahn
  2002-01-21 20:16                                                                                     ` Allan Sandfeld
  2002-01-22 10:57                                                                                     ` Peter Wächtler
  1 sibling, 2 replies; 351+ messages in thread
From: Mark Hahn @ 2002-01-21 19:26 UTC (permalink / raw)
  To: linux-kernel

> > > To me the benefit is clear enough: ASAP scheduling of IO threads, a 
> > > simple heuristic that improves both throughput and latency.
> > 
> > I think of "benefit", perhaps naively, in terms of something that can
> > be measured or demonstrated rather than just announced.
> 
> But you see why asap scheduling improves latency/throughput *in theory*, 
> don't you?  

NO, IT DOES NOT. why can't you preempt-ophiles get that through your heads? 

	eager scheduling is NOT optimal in general.  

for instance, suppose my disk can only read a sector at a time.
scheduling my sequentially-reading process to wake eagerly
is most definitely PESSIMAL.  laziness is a cardinal virtue!
this doesn't preclude heuristics to sometimes short-cut the laziness.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 19:26                                                                                   ` Mark Hahn
@ 2002-01-21 20:16                                                                                     ` Allan Sandfeld
  2002-01-22 10:57                                                                                     ` Peter Wächtler
  1 sibling, 0 replies; 351+ messages in thread
From: Allan Sandfeld @ 2002-01-21 20:16 UTC (permalink / raw)
  To: linux-kernel

On Monday 21 January 2002 20:26, Mark Hahn wrote:
> > > > To me the benefit is clear enough: ASAP scheduling of IO threads, a
> > > > simple heuristic that improves both throughput and latency.
> > >
> > > I think of "benefit", perhaps naively, in terms of something that can
> > > be measured or demonstrated rather than just announced.
> >
> > But you see why asap scheduling improves latency/throughput *in theory*,
> > don't you?
>
> NO, IT DOES NOT. why can't you preempt-ophiles get that through your heads?
>
> 	eager scheduling is NOT optimal in general.
>
> for instance, suppose my disk can only read a sector at a time.
> scheduling my sequentially-reading process to wake eagerly
> is most definitely PESSIMAL.  laziness is a cardinal virtue!
> this doesn't preclude heuristics to sometimes short-cut the laziness.
>
It's because your system is behaving wrongly for your dream to come true. If
you want to handle several expected inputs from IO, you shouldn't ask it for an
interrupt for every package. Rather you should rely on a timer function and
periodically handle new data and, in case of nothing new, go back to
interrupts.

Eager scheduling is OPTIMAL for the semantics in your system. If that is not
optimal for throughput/whatever, it's the code that is wrong!

-Allan


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 19:26                                                                                   ` Mark Hahn
  2002-01-21 20:16                                                                                     ` Allan Sandfeld
@ 2002-01-22 10:57                                                                                     ` Peter Wächtler
  1 sibling, 0 replies; 351+ messages in thread
From: Peter Wächtler @ 2002-01-22 10:57 UTC (permalink / raw)
  To: Mark Hahn; +Cc: linux-kernel

Mark Hahn wrote:
> 
> > > > To me the benefit is clear enough: ASAP scheduling of IO threads, a
> > > > simple heuristic that improves both throughput and latency.
> > >
> > > I think of "benefit", perhaps naiively, in terms of something that can
> > > be measured or demonstrated rather than just announced.
> >
> > But you see why asap scheduling improves latency/throughput *in theory*,
> > don't you?
> 
> NO, IT DOES NOT. why can't you preempt-ophiles get that through your heads?
> 
>         eager scheduling is NOT optimal in general.
> 
> for instance, suppose my disk can only read a sector at a time.
> scheduling my sequentially-reading process to wake eagerly
> is most definitly PESSIMAL.  laziness is a cardinal virtue!
> this doesn't preclude heuristics to sometimes short-cut the laziness.
> 

Do you think there are no other benefits besides the scheduling latency in
a realtime system?

In a realtime system you want your event handling code (outside of the
interrupt handler [on Linux: bottom halves/tasklets/softirqs?]) to get running
on the CPU as fast as possible. Therefore a realtime kernel is often fully
preemptible (well, there are always critical sections that have to disable
interrupts).

So the time between the interrupt handler wanting to schedule a specific 
task/thread and the next scheduling decision is crucial, right? 

I have no hard numbers, but I can imagine that this can also lead to
better IO (in terms of latency AND IO throughput, but at the cost of
cpu cycles [user space CPU throughput]).

I don't know the Linux kernel well enough right now, but if you shorten
the scheduling latency, that could be a win for faster IO. But there's always
a tradeoff: if you spend too much time in scheduling decisions/preparations,
the overhead eats the lower latency (especially if your mutexes have to deal
with priority inversion, giving a lock holder at least the same priority as
the lock contender for the period it holds the lock).
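
The priority handling mentioned in the last parenthesis can be sketched as a
toy model. This is only bookkeeping, with no real blocking or scheduling, and
the names (task, pi_lock, pi_lock_acquire, pi_lock_release) are invented for
the example. It assumes that a larger number means a higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
struct task {
    int base_prio; /* priority the task was assigned */
    int eff_prio;  /* priority the scheduler would actually use */
};

struct pi_lock {
    struct task *owner; /* NULL while the lock is free */
};

/* Returns 1 if t now owns the lock, 0 if t would have to block.
 * A blocked contender lends its priority to the current owner. */
int pi_lock_acquire(struct pi_lock *l, struct task *t)
{
    assert(l != NULL && t != NULL);
    if (l->owner == NULL) {
        l->owner = t;
        return 1;
    }
    if (t->eff_prio > l->owner->eff_prio)
        l->owner->eff_prio = t->eff_prio; /* boost the lock holder */
    return 0;
}

/* Dropping the lock also drops any borrowed priority. */
void pi_lock_release(struct pi_lock *l)
{
    assert(l != NULL);
    if (l->owner != NULL) {
        l->owner->eff_prio = l->owner->base_prio;
        l->owner = NULL;
    }
}
```

The extra compare and store on every contended acquire, plus the restore on
release, is a small instance of the overhead being weighed against the lower
latency.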

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 15:43                                                                               ` yodaiken
  2002-01-21 16:05                                                                                 ` Daniel Phillips
@ 2002-01-21 20:35                                                                                 ` Bill Davidsen
  2002-01-21 20:49                                                                                   ` yodaiken
  2002-01-21 21:42                                                                                   ` Mark Hahn
  1 sibling, 2 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-21 20:35 UTC (permalink / raw)
  To: yodaiken; +Cc: Linux Kernel Mailing List

On Mon, 21 Jan 2002 yodaiken@fsmlabs.com wrote:

> I think of "benefit", perhaps naiively, in terms of something that can
> be measured or demonstrated rather than just announced.

I guess there are people who assume that anything which can't be given a
number doesn't matter or possibly doesn't exist. If software can't come up
with a number for the beauty of a sunset, cuteness of a baby, or taste of
a good wine, then obviously all that subjective stuff is meaningless.

However, since we have art, food, and wine critics making a living giving
their meaningless opinions, I guess the majority of us recognize that even
without a number produced by a benchmark there is "subjectively better." I
don't know of anyone who doesn't feel that the rmap patches, even with
some admitted imperfections, make the system more responsive. I haven't
seen one person who questioned this after trying it.

Explaining responsiveness is like describing color to a blind person, any
quantifications totally miss the experience.

There are some responsemarks which may or may not be useful, feel free to
actually locate and run these and post the results instead of posting
multiple ways to ask for quantification. Linux people work on the things
that interest them, and most of us can tell a pea from a bowling ball
without a caliper.

Consider this a response to your other notes of similar nature.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 20:35                                                                                 ` Bill Davidsen
@ 2002-01-21 20:49                                                                                   ` yodaiken
  2002-01-21 21:42                                                                                   ` Mark Hahn
  1 sibling, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-21 20:49 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: yodaiken, Linux Kernel Mailing List

On Mon, Jan 21, 2002 at 03:35:35PM -0500, Bill Davidsen wrote:
> However, since we have art, food, and wine critics making a living giving
> their meaningless opinions, I guess the majority of us recognize that even
> without a number produced by a benchmark there is "subjectively better." I
> don't know of anyone who doesn't feel that the rmap patches, even with
> some admitted imperfections, make the system more responsive. I haven't
> seen one person who questioned this after trying it.

Thanks for stating the case so clearly.



-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 20:35                                                                                 ` Bill Davidsen
  2002-01-21 20:49                                                                                   ` yodaiken
@ 2002-01-21 21:42                                                                                   ` Mark Hahn
  2002-01-22  0:58                                                                                     ` Ken Brownfield
  2002-01-22 16:51                                                                                     ` Bill Davidsen
  1 sibling, 2 replies; 351+ messages in thread
From: Mark Hahn @ 2002-01-21 21:42 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux Kernel Mailing List

> Explaining responsiveness is like describing color to a blind person, any
> quantifications totally miss the experience.

you overestimate human uniqueness - we all have near-identical
perceptual hardware, and there *is* an absolute limit
beyond which no human can perceive.  for our purposes, let's
say it's 5ms.

> There are some responsemarks which may or may not be useful, feel free to
> actually locate and run these and post the results instead of posting

I posted "realfeel" last year, AKPM added some touches to it.
it's in his amlat bundle.  
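
For anyone who wants a number to argue about, something in the spirit of a
responsiveness benchmark can be sketched in a few lines. This is NOT
realfeel, only an assumption-laden stand-in: it sleeps for a fixed interval
and records how late the wakeup was. The function name and parameters are
made up for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
long long ts_to_ns(const struct timespec *ts)
{
    return (long long)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Sleep interval_ns, `iterations` times, and return the worst observed
 * lateness in nanoseconds, or -1 if the clock cannot be read. */
long long worst_wakeup_lateness_ns(int iterations, long interval_ns)
{
    struct timespec req;
    long long worst = 0;

    assert(iterations > 0 && interval_ns > 0);
    req.tv_sec = interval_ns / 1000000000L;
    req.tv_nsec = interval_ns % 1000000000L;

    for (int i = 0; i < iterations; i++) {
        struct timespec before, after;
        long long late;

        if (clock_gettime(CLOCK_MONOTONIC, &before) != 0)
            return -1;
        nanosleep(&req, NULL); /* an early return only lowers `late` */
        if (clock_gettime(CLOCK_MONOTONIC, &after) != 0)
            return -1;

        late = ts_to_ns(&after) - ts_to_ns(&before) - interval_ns;
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Run it on an idle box and again under heavy I/O, and compare the worst case
against whatever perceptual limit you believe in. The result mixes timer
resolution with scheduling delay, so treat it as a rough indicator only.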


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:42                                                                                   ` Mark Hahn
@ 2002-01-22  0:58                                                                                     ` Ken Brownfield
  2002-01-22 16:51                                                                                     ` Bill Davidsen
  1 sibling, 0 replies; 351+ messages in thread
From: Ken Brownfield @ 2002-01-22  0:58 UTC (permalink / raw)
  To: linux-kernel

I'm ranting, and not at you directly, Mark.  But I think it's important
to get perspective here.

On Mon, Jan 21, 2002 at 04:42:28PM -0500, Mark Hahn wrote:
| > Explaining responsiveness is like describing color to a blind person, any
| > quantifications totally miss the experience.
| 
| you overestimate human uniqueness - we all have near-identical
| perceptual hardware, and there *is* an absolute limit
| beyond which no human can perceive.  for our purposes, let's
| say it's 5ms.

A 5ms wavelength is 200Hz.  Whether xterm lines scroll with a 5ms or a
10ms latency will most likely not be perceivable by most people.  On the
other hand, audio and video processing is worthless with a 5ms latency,
especially when the current worst-case is atrocious.

5ms is clearly a poor choice for human latency, but hard/software
latencies can never be small enough, just as processor speeds can never
be fast enough.  And we certainly can't pull these debatably "absolute"
limits out of thin air.

Of course, Linux is just for people who don't actually want to USE an
operating system, only those who want to write kernel code and don't
want to have to worry about "needless" complexity.

It is an invalid statement that all kernel modifications need to have
benchmarks or human case studies to justify them.  There are many parts
and functions of these systems that are not well-specified by
benchmarks, not to mention that one person's benchmark is another
person's non-real-world invalid test.

Ultimately the maintainers need to make this decision, based on the
community's exploration of theory, practice, and from appropriate
benchmarks or other research.  Research like the O(1) scheduler, rmap,
preemptive, and others need SUPPORT from this community.  One random
person should not get in the way of progress in a kernel that is not
only intended for that one person's narrow view of the world.

That being said, this one person's narrow view of the world is quite
important.  But however cheesy it sounds, "the greater good" is the
priority.

IMHO, latency is a critical part of Linux's growth, as is SMP
scalability (and a hundred other projects).  These are, perhaps due to
hardware restrictions, not a top priority.  But work on this type of
advancement MUST occur, and it must occur without having to fend off
people whose sole basis for argument is that they don't need it, or that
it will be hard, or that they need proof from sources they are unwilling
to mention.

From what I've seen of the preemption and latency discussions, the Linux
kernel is going to be constrained to old technology and old ideas for
the foreseeable future.  People unable or unwilling to see the light
insist on being able to see the chicken before the egg is hatched.

Can this community be open-source and research-oriented without seeing
the business model and profit forecasts first?

I hope so.
-- 
Ken.
brownfld@irridia.com

| > There are some responsemarks which may or may not be useful, feel free to
| > actually locate and run these and post the results instead of posting
| 
| I posted "realfeel" last year, AKPM added some touches to it.
| it's in his amlat bundle.  

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-21 21:42                                                                                   ` Mark Hahn
  2002-01-22  0:58                                                                                     ` Ken Brownfield
@ 2002-01-22 16:51                                                                                     ` Bill Davidsen
  2002-01-22 20:50                                                                                       ` Jussi Laako
  1 sibling, 1 reply; 351+ messages in thread
From: Bill Davidsen @ 2002-01-22 16:51 UTC (permalink / raw)
  To: Mark Hahn; +Cc: Linux Kernel Mailing List

On Mon, 21 Jan 2002, Mark Hahn wrote:

> you overestimate human uniqueness - we all have near-identical
> perceptual hardware, and there *is* an absolute limit
> beyond which no human can perceive.  for our purposes, let's
> say it's 5ms.

Let's say that if the original poster didn't see the difference (a) he has
NO functioning "perceptual hardware," or (b) he hasn't tried it, which I
invited him to do.

> > There are some responsemarks which may or may not be useful, feel free to
> > actually locate and run these and post the results instead of posting
> 
> I posted "realfeel" last year, AKPM added some touches to it.
> it's in his amlat bundle.  

I was making the point that no quantification is needed, rmap is at the
"wow that's better!" level everywhere I've tried it.

Now if someone could get the rmap performance with memory pressure, add
the -aa improvements to heavy i/o and large memory, and season with a
touch of J4 scheduler, I think we could have response which would blow
your fingers off the keyboard.

OT: has someone gotten 2.4.17 rmap-11c and J4 playing together? I looked
at it for about five minutes but had no time last night. 

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-22 16:51                                                                                     ` Bill Davidsen
@ 2002-01-22 20:50                                                                                       ` Jussi Laako
  2002-01-29 23:05                                                                                         ` Bill Davidsen
  0 siblings, 1 reply; 351+ messages in thread
From: Jussi Laako @ 2002-01-22 20:50 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Linux Kernel Mailing List

Bill Davidsen wrote:
> 
> OT: has someone gotten 2.4.17 rmap-11c and J4 playing together? I looked
> at it for about five minutes but had no time last night.

It's in my -jl13 http://www.pp.song.fi/~visitor/linux/


 - Jussi Laako

-- 
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B  39DD A4DE 63EB C216 1E4B
Available at PGP keyservers


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-22 20:50                                                                                       ` Jussi Laako
@ 2002-01-29 23:05                                                                                         ` Bill Davidsen
  2002-01-29 23:33                                                                                           ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Bill Davidsen @ 2002-01-29 23:05 UTC (permalink / raw)
  To: Jussi Laako; +Cc: Linux Kernel Mailing List

On Tue, 22 Jan 2002, Jussi Laako wrote:

> Bill Davidsen wrote:
> > 
> > OT: has someone gotten 2.4.17 rmap-11c and J4 playing together? I looked
> > at it for about five minutes but had no time last night.
> 
> It's in my -jl13 http://www.pp.song.fi/~visitor/linux/

It was 15 by the time I tried it, but I'm still running it. What sched is
in pre7ac1? I looked at the post Alan put up and didn't see it on the
first pass. The Jn scheduler and low latency seem to play well with
rmap, but the -aa changes to enable better bdflush tuning work well. I
don't want to tune for dbench, but I don't want to ignore it, either.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-29 23:05                                                                                         ` Bill Davidsen
@ 2002-01-29 23:33                                                                                           ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-29 23:33 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Jussi Laako, Linux Kernel Mailing List

> It was 15 by the time I tried it, but I'm still running it. What sched is
> in pre7ac1? I looked at the post Alan put up and didn't see it on the
> first pass. The Jn scheduler and low latency seem to play well with

I've not touched the scheduler at all. It's in a bit of flux
and I'm trying to accumulate stuff to go to Marcelo sooner rather than later

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:45                                                                 ` yodaiken
                                                                                     ` (2 preceding siblings ...)
  2002-01-14 16:36                                                                   ` Momchil Velikov
@ 2002-01-14 17:36                                                                   ` Daniel Phillips
  3 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14 17:36 UTC (permalink / raw)
  To: yodaiken, Momchil Velikov
  Cc: yodaiken, Arjan van de Ven, Roman Zippel, linux-kernel

On January 14, 2002 02:45 pm, yodaiken@fsmlabs.com wrote:
> POSIX makes no specification of how scheduling classes interact - unless something changed
> in the new version.
> 
> But more than that, the problem of preemption is much more complex when you have
> task that do not share the "goodness fade" with everything else. That is, given a
> set of SCHED_OTHER processes at time T0, it is reasonable to design the scheduler so
> that there is some D so that by time T0+D each process has become the highest priority
> and has received cpu up to either a complete time slice or a I/O block. Linux kind of
> has this property now, and I believe that making this more robust and easier to analyze
> is going to be an enormously important issue.  However, once you add SCHED_FIFO in the
> current scheme, this becomes more complex. And with preempt, you cannot even offer the
> assurance that once a process gets the cpu it will make _any_ advance at all.

So the prediction here is that SCHED_FIFO + preempt can livelock some set of correctly
designed processes, is that it?  I don't see exactly how that could happen, though that
may simply mean I didn't read closely enough.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:34                                                             ` yodaiken
  2002-01-14 11:14                                                               ` Roman Zippel
  2002-01-14 12:17                                                               ` Momchil Velikov
@ 2002-01-14 15:08                                                               ` Russ Leighton
  2 siblings, 0 replies; 351+ messages in thread
From: Russ Leighton @ 2002-01-14 15:08 UTC (permalink / raw)
  To: linux-kernel

This is getting silly ... feedback like "ll is better than PK", "feels
smooth", "is responsive", "my kernel compile is faster than yours", etc. is
not getting us any closer to the "how" of making a better kernel.

What's the goal? How should SMP and NUMA behave? How is success measured?

It would be good to be very clear on the ultimate purpose before making
radical changes. All of these changes are dancing around some vague concept
of responsiveness...so define it!

These comments seem to set a better tone for this thread; perhaps we can
concentrate on _useful_ debate around some well defined goal.

yodaiken@fsmlabs.com wrote:

> The key one is some idea of being able to assure processes
> of some rate of progress.  This is not classical RT, but it is important to multimedia and 
> databases and also to some applications we are interested in looking at. 

Andrew Morton wrote:

> But we can **make** it useful.  I believe that internal preemption is
> the foundation to improve 2.5 kernel latency.  But first we need
> consensus that we **want** linux to be a low-latency kernel.
> 
> Do we have that?
> 
> If we do, then as I've said before, holding a lock for more than N milliseconds
> becomes a bug to be fixed.  We can put tools in the hands of testers to
> locate those bugs.  Easy.
> 
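
The kind of tester tool hinted at in the last quote could, under some
assumptions, look like the sketch below. It only shows the bookkeeping
around a lock: the lock itself is omitted, timestamps are passed in by the
caller, and the names (timed_lock, timed_lock_acquire, timed_lock_release)
and the limit parameter are hypothetical.

```c
#include <assert.h>

/* Hypothetical hold-time bookkeeping that would wrap a real lock. */
struct timed_lock {
    int held;                /* 1 while the lock is held */
    long long acquired_ns;   /* timestamp of the last acquire */
    long long worst_hold_ns; /* longest hold time seen so far */
    unsigned violations;     /* holds that exceeded the limit */
};

void timed_lock_acquire(struct timed_lock *l, long long now_ns)
{
    assert(!l->held);
    /* the real lock would be taken here */
    l->held = 1;
    l->acquired_ns = now_ns;
}

/* Returns the hold time and counts it as a violation if it exceeded
 * limit_ns (the "N milliseconds" of the quote, in nanoseconds). */
long long timed_lock_release(struct timed_lock *l, long long now_ns,
                             long long limit_ns)
{
    long long hold;

    assert(l->held);
    hold = now_ns - l->acquired_ns;
    if (hold > l->worst_hold_ns)
        l->worst_hold_ns = hold;
    if (hold > limit_ns)
        l->violations++;
    /* the real lock would be dropped here */
    l->held = 0;
    return hold;
}
```

A non-zero violations count then points straight at the lock that needs
fixing, which is a measurable definition of at least one part of the goal.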

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  5:03                                                           ` Daniel Phillips
  2002-01-14  5:09                                                             ` Andrew Morton
  2002-01-14  5:34                                                             ` yodaiken
@ 2002-01-14  8:24                                                             ` Arjan van de Ven
  2 siblings, 0 replies; 351+ messages in thread
From: Arjan van de Ven @ 2002-01-14  8:24 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Roman Zippel, linux-kernel

On Mon, Jan 14, 2002 at 06:03:43AM +0100, Daniel Phillips wrote:
> Sorry, that's incorrect.  I stated why earlier in this thread and akpm signed 
> off on it.  With preempt you get ASAP (i.e., as soon as the outermost 
> spinlock is done) process scheduling.  With hand-coded scheduling points you 
> get 'as soon as it happens to hit a scheduling point'.
> 
> That is not the only benefit, just the most obvious one.

Big duh. So you get there 1 usec sooner. NOBODY will notice that. NOBODY.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:18                                                       ` Roman Zippel
  2002-01-13 15:36                                                         ` Arjan van de Ven
@ 2002-01-13 15:45                                                         ` Alan Cox
  2002-01-13 20:25                                                           ` Roman Zippel
  2002-01-13 18:13                                                         ` Robert Love
  2002-01-14  1:50                                                         ` Rik van Riel
  3 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-13 15:45 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

> What somehow got lost in this discussion, that both patches don't
> necessarily conflict with each other, they both attack the same problem
> with different approaches, which complement each other. I prefer to get
> the best of both patches.

When you look at the benchmark there is no difference between ll and 
ll+pre-empt. ll alone takes you to the 1ms point. pre-empt takes you no
further and to get much out of pre-emption requires you go and do all the
hideously slow and complex priority inversion stuff.

> exactly that reason. I don't think we need to work around broken
> hardware, but halfway decent hardware should not be a problem to get
> decent latency.

We have to work around common hardware not designed for SMP - the 8390 isn't
a broken chip in that sense, it's just from a different era, and there are a
lot of them.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:45                                                         ` Alan Cox
@ 2002-01-13 20:25                                                           ` Roman Zippel
  2002-01-13 21:11                                                             ` Alan Cox
  2002-01-14  0:10                                                             ` yodaiken
  0 siblings, 2 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-13 20:25 UTC (permalink / raw)
  To: Alan Cox; +Cc: Robert Love, Kenneth Johansson, arjan, Rob Landley, linux-kernel

Hi,

Alan Cox wrote:

> > What somehow got lost in this discussion, that both patches don't
> > necessarily conflict with each other, they both attack the same problem
> > with different approaches, which complement each other. I prefer to get
> > the best of both patches.
> 
> When you look at the benchmark there is no difference between ll and
> ll+pre-empt. ll alone takes you to the 1ms point.

I don't doubt that, but would you seriously consider the ll patch for
inclusion into the main kernel?
It's a useful patch for anyone who needs good latencies now, but it's
still a quick&dirty solution. Preempt offers a clean solution for a
certain part of the problem, as it's possible to cleanly localize the
needed changes for preemption (at least for UP). That means the ll patch
becomes smaller and future work on ll becomes simpler, since a certain
type of latency problem is handled automatically (and transparently),
so you do gain something by it.
The remaining places pointed out in the ll patch are worth a closer look
as well, as in most of them we currently hold a spinlock for too long. These
should be fixed too, as they mean possible contention problems on SMP.

> pre-empt takes you no
> further and to get much out of pre-emption requires you go and do all the
> hideously slow and complex priority inversion stuff.

The possibility of priority inversion problems is not new; it was
already discussed before. It was not considered a serious problem, since
all processes will still make progress. Preempt now increases the
likelihood that such a situation occurs, but nonetheless the processes will
still make progress. I can't remember any past report that
indicated a problem caused by priority inversion, so I simply can't
believe it should become a massive problem now with preempt.

> > exactly that reason. I don't think we need to work around broken
> > hardware, but halfway decent hardware should not be a problem to get
> > decent latency.
> 
> We have to work around common hardware not designed for SMP - the 8390 isnt
> a broken chip in that sense, its just from a different era, and there are a
> lot of them.

Please let me rephrase, I just don't expect terribly good latency
numbers with non-DMA hardware.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 20:25                                                           ` Roman Zippel
@ 2002-01-13 21:11                                                             ` Alan Cox
  2002-01-14  0:33                                                               ` Stephan von Krawczynski
  2002-01-14  0:10                                                             ` yodaiken
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-13 21:11 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

> I don't doubt that, but would you seriously consider the ll patch for
> inclusion into the main kernel?

The mini ll patch definitely. The full ll one needs some head scratching to
be sure it's correct. pre-empt is a 2.5 thing which in some ways is easier
because it doesn't matter if it breaks something.

> Please let me rephrase, I just don't expect terrible good latency
> numbers with non dma hardware.

Expect the same with DMA hardware too at times.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 21:11                                                             ` Alan Cox
@ 2002-01-14  0:33                                                               ` Stephan von Krawczynski
  2002-01-14  0:50                                                                 ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-14  0:33 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roman Zippel, Alan Cox, Robert Love, Kenneth Johansson, arjan,
	Rob Landley, linux-kernel

> > I don't doubt that, but would you seriously consider the ll patch for
> > inclusion into the main kernel?
>
> The mini ll patch definitely.

Huh?
Can you point at anyone who experienced a significant benefit from it?
I can see a lot of interesting patches ahead if you let this go.
Tell me honestly that the idea behind this patch is not _crap_. You
can only make this basic idea work if you patch a tremendous lot of
those conditional_schedules() through the kernel. We already saw it
starting off in some graphics drivers, network drivers. Why not just
all of it? You will not be far away in the end from the 'round 4000 I
already stated in an earlier post.
I do believe Robert's preempt is a lot cleaner in its idea of _how_ to
achieve basically the same goal. Although I am at least as sceptical as
you about a race-free implementation.

> The full ll one needs some head scratching to
> be sure its correct.

You may simply call it _counting_ (the files to patch).

> pre-empt is a 2.5 thing which in some ways is easier
> because it doesnt matter if it breaks something.

So I understand you agree somehow with me in the answer to "what idea
is really better?"...

Regards,
Stephan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:33                                                               ` Stephan von Krawczynski
@ 2002-01-14  0:50                                                                 ` Alan Cox
  2002-01-14  1:17                                                                   ` Robert Love
  2002-01-14  9:45                                                                   ` Stephan von Krawczynski
  0 siblings, 2 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-14  0:50 UTC (permalink / raw)
  To: Stephan von Krawczynski
  Cc: Roman Zippel, Alan Cox, Robert Love, Kenneth Johansson, arjan,
	Rob Landley, linux-kernel

> Tell me honestly that the idea behind this patch is not _crap_. You
> can only make this basic idea work if you patch a tremendous lot of
> those conditional_schedules() through the kernel. We already saw it
> starting off in some graphics drivers, network drivers. Why not just
> all of it? You will not be far away in the end from the 'round 4000 I
> already stated in earlier post.

There are very few places you need to touch to get a massive benefit. Most
of the kernel already behaves extremely well.
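
What is being traded off can be shown with a toy model of explicit
scheduling points. It is a sketch only: the function name and the "stride"
parameter are invented, delay is counted in loop iterations rather than
time, and nothing here is taken from the ll or preempt patches.

```c
#include <assert.h>
#include <stddef.h>

/* request_at[i] != 0 means a reschedule request is raised while the loop
 * is working on item i. A scheduling point is hit after every `stride`
 * items. Returns the worst delay, in iterations, between a request and
 * the scheduling point (or the end of the loop) that finally honours it. */
size_t worst_resched_delay(const int *request_at, size_t n, size_t stride)
{
    size_t worst = 0, pending_since = 0;
    int pending = 0;

    assert(request_at != NULL || n == 0);
    assert(stride > 0);

    for (size_t i = 0; i < n; i++) {
        if (request_at[i] && !pending) {
            pending = 1;
            pending_since = i;
        }
        /* ... one unit of the long-running work would happen here ... */
        if ((i + 1) % stride == 0 && pending) { /* scheduling point */
            size_t delay = i - pending_since + 1;
            if (delay > worst)
                worst = delay;
            pending = 0;
        }
    }
    if (pending && n - pending_since > worst)
        worst = n - pending_since; /* only honoured when the loop ends */
    return worst;
}
```

With a scheduling point after every item the delay is bounded by one item;
with a sparse stride it grows towards the stride. A few well placed points
in the longest loops give most of the benefit, and how many loops need them
is the number being argued about.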

> So I understand you agree somehow with me in the answer to "what idea
> is really better?"...

Do you want a clean simple solution or complex elegance ? For 2.4 I definitely
favour clean and simple. For 2.5 it's an open debate

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:50                                                                 ` Alan Cox
@ 2002-01-14  1:17                                                                   ` Robert Love
  2002-01-14  9:49                                                                     ` Stephan von Krawczynski
  2002-01-14  9:45                                                                   ` Stephan von Krawczynski
  1 sibling, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-14  1:17 UTC (permalink / raw)
  To: Alan Cox
  Cc: Stephan von Krawczynski, Roman Zippel, Kenneth Johansson, arjan,
	Rob Landley, linux-kernel

On Sun, 2002-01-13 at 19:50, Alan Cox wrote:

> Do you want a clean simple solution or complex elegance ? For 2.4 I definitely
> favour clean and simple. For 2.5 its an open debate

Make no mistake, I do not intend to see preempt-kernel in 2.4.  I will,
however, continue to maintain the patch for endusers and such that use
it.  A proper in-kernel solution for 2.4 is, in my opinion, mini-ll,
perhaps extended with any other obviously-completely-utterly sane bits
from full-ll.

For 2.5, however, I tout preempt as the answer.  This does not mean just
preempt.  It means a preemptible kernel as a basis for beginning
low-latency works in manners other than explicit scheduling statements.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  1:17                                                                   ` Robert Love
@ 2002-01-14  9:49                                                                     ` Stephan von Krawczynski
  0 siblings, 0 replies; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-14  9:49 UTC (permalink / raw)
  To: Robert Love; +Cc: alan, zippel, ken, arjan, landley, linux-kernel

On 13 Jan 2002 20:17:11 -0500
Robert Love <rml@tech9.net> wrote:

> For 2.5, however, I tout preempt as the answer.  This does not mean just
> preempt.  It means a preemptible kernel as a basis for beginning
> low-latency works in manners other than explicit scheduling statements.

Aha, exactly my thoughts...

Regards,
Stephan



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:50                                                                 ` Alan Cox
  2002-01-14  1:17                                                                   ` Robert Love
@ 2002-01-14  9:45                                                                   ` Stephan von Krawczynski
  2002-01-14 10:04                                                                     ` Andrew Morton
                                                                                       ` (3 more replies)
  1 sibling, 4 replies; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-14  9:45 UTC (permalink / raw)
  To: Alan Cox; +Cc: zippel, rml, ken, arjan, landley, linux-kernel

On Mon, 14 Jan 2002 00:50:54 +0000 (GMT)
Alan Cox <alan@lxorguk.ukuu.org.uk> wrote:

> > all of it? You will not be far away in the end from the 'round 4000 I 
> > already stated in earlier post.                                       
> 
> There are very few places you need to touch to get a massive benefit. Most
> of the kernel already behaves extremely well.

Just a short question: the last (add-on) patch to mini-ll I saw on the list
patches: drivers/net/3c59x.c
drivers/net/8139too.c
drivers/net/eepro100.c

Unfortunately I have neither of those. This would mean I cannot benefit from
_these_ patches, but instead would need _others_ (like tulip or
name-one-of-the-rest-of-the-drivers) to see _some_ effect you tell me I
_should_ see (I currently see _none_). How do you argue then against the
statement: we need patches for /drivers/net/*.c ?? I do not expect 3c59x.c to
be particularly bad in comparison to tulip/*.c or let's say via-rhine.c, do you?

> > So I understand you agree somehow with me in the answer to "what idea 
> > is really better?"...                                                 
> 
> Do you want a clean simple solution or complex elegance ? For 2.4 I definitely
> favour clean and simple. For 2.5 its an open debate

Hm, obviously the ll-patches look simple, but their pure required number makes
me think they are as well stupid as simple. This whole story looks like making
an old mac do real multitasking, just spread around scheduling points
throughout the code ... This is like drilling for water on top of the mountain.
I want the water too, but I state there must be a nice valley somewhere around
...

Regards,
Stephan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  9:45                                                                   ` Stephan von Krawczynski
@ 2002-01-14 10:04                                                                     ` Andrew Morton
  2002-01-14 11:47                                                                       ` Stephan von Krawczynski
  2002-01-14 10:09                                                                     ` Alan Cox
                                                                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-14 10:04 UTC (permalink / raw)
  To: Stephan von Krawczynski
  Cc: Alan Cox, zippel, rml, ken, arjan, landley, linux-kernel

Stephan von Krawczynski wrote:
> 
> ...
> Unfortunately me have neither of those. This would mean I cannot benefit from
> _these_ patches, but instead would need _others_ (like tulip or
> name-one-of-the-rest-of-the-drivers) to see _some_ effect you tell me I
> _should_ see (I currently see _none_). How do you argue then against the
> statement: we need patches for /drivers/net/*.c ?? I do not expect 3c59x.c to
> be particularly bad in comparison to tulip/*.c or lets say via-rhine.c, do you?
> 

In 3c59x.c, probably the biggest problem will be the call to issue_and_wait()
in boomerang_start_xmit().  On a LAN which is experiencing heavy collision rates
this can take as long as 2,000 PCI cycles (it's quite rare, and possibly an
erratum).  It is called under at least two spinlocks.

In via-rhine, wait_for_reset() can busywait for up to ten milliseconds.
via_rhine_tx_timeout() calls it from under a spinlock.

In eepro100.c, wait_for_cmd_done() can busywait for one millisecond
and is called multiple times under spinlock.

Preemption will help _some_ of this, but by no means all, or enough.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 10:04                                                                     ` Andrew Morton
@ 2002-01-14 11:47                                                                       ` Stephan von Krawczynski
  2002-01-14 12:29                                                                         ` Alan Cox
  2002-01-14 19:58                                                                         ` george anzinger
  0 siblings, 2 replies; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-14 11:47 UTC (permalink / raw)
  To: Andrew Morton; +Cc: alan, zippel, rml, ken, arjan, landley, linux-kernel

On Mon, 14 Jan 2002 02:04:56 -0800
Andrew Morton <akpm@zip.com.au> wrote:

> Stephan von Krawczynski wrote:
> > 
> > ...
> > Unfortunately me have neither of those. This would mean I cannot benefit
> > from _these_ patches, but instead would need _others_
[...]
> 
> In 3c59x.c, probably the biggest problem will be the call to issue_and_wait()
> in boomerang_start_xmit().  On a LAN which is experiencing heavy collision
> rates this can take as long as 2,000 PCI cycles (it's quite rare, and possibly
> an erratum).  It is called under at least two spinlocks.
> 
> In via-rhine, wait_for_reset() can busywait for up to ten milliseconds.
> via_rhine_tx_timeout() calls it from under a spinlock.
> 
> In eepro100.c, wait_for_cmd_done() can busywait for one millisecond
> and is called multiple times under spinlock.

Did I get that right: as long as a spinlock is held, there is no sense in
conditional_schedule()?

> Preemption will help _some_ of this, but by no means all, or enough.

Maybe we should really try to shorten the lock-times _first_. You mentioned a
way to find the bad guys?

Regards,
Stephan



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:47                                                                       ` Stephan von Krawczynski
@ 2002-01-14 12:29                                                                         ` Alan Cox
  2002-01-14 22:20                                                                           ` Jussi Laako
  2002-01-14 19:58                                                                         ` george anzinger
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-14 12:29 UTC (permalink / raw)
  To: Stephan von Krawczynski
  Cc: Andrew Morton, alan, zippel, rml, ken, arjan, landley,
	linux-kernel

> > In eepro100.c, wait_for_cmd_done() can busywait for one millisecond
> > and is called multiple times under spinlock.
> 
> Did I get that right, as long as spinlocked no sense in conditional_schedule()
> ?

No conditional schedule, no pre-emption. You would need to rewrite that code
to do something like try for 100uS then queue a 1 tick timer to retry
asynchronously. That makes the code vastly more complex for an error case, and
for some drivers where irq mask is required during reset waits it won't help.

Yet again there are basically 1mS limitations buried in the hardware.
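
As an illustration of the rewrite described above (spin for a bounded time,
then hand the rest of the wait to a deferred retry), here is a minimal
userspace sketch. It is not taken from any driver: fake_dev, fake_cmd_done()
and bounded_wait() are made-up names, the "hardware" is a counter, and the
deferred retry is only a flag where a real driver would presumably queue a
timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device: becomes ready after some polls. */
struct fake_dev {
	int polls_until_ready;	/* polls left before the command completes */
	int retry_queued;	/* set to 1 when a deferred retry is requested */
};

enum wait_result { WAIT_DONE, WAIT_DEFERRED };

/* One poll of the simulated command register. */
static bool fake_cmd_done(struct fake_dev *dev)
{
	if (dev->polls_until_ready > 0)
		dev->polls_until_ready--;
	return dev->polls_until_ready == 0;
}

/*
 * Spin for at most max_polls iterations. If the device is still busy,
 * note that a retry is needed and return instead of spinning on.
 * Assumption: the caller (or a timer in a real driver) calls this again.
 */
static enum wait_result bounded_wait(struct fake_dev *dev, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (fake_cmd_done(dev))
			return WAIT_DONE;
	}
	dev->retry_queued = 1;
	return WAIT_DEFERRED;
}
```

The extra state (the retry flag, a caller that has to cope with
WAIT_DEFERRED) is the added complexity for an error case, even in this toy
form.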

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 12:29                                                                         ` Alan Cox
@ 2002-01-14 22:20                                                                           ` Jussi Laako
  2002-01-15  1:43                                                                             ` Stephan von Krawczynski
  0 siblings, 1 reply; 351+ messages in thread
From: Jussi Laako @ 2002-01-14 22:20 UTC (permalink / raw)
  To: Alan Cox; +Cc: Andrew Morton, linux-kernel

Alan Cox wrote:
> 
> > > In eepro100.c, wait_for_cmd_done() can busywait for one millisecond
> > > and is called multiple times under spinlock.
> >
> > Did I get that right, as long as spinlocked no sense in 
> > conditional_schedule()
> 
> No conditional schedule, no pre-emption. You would need to rewrite that 
> code to do something like try for 100uS then queue a 1 tick timer to 
> retry asynchronously. That makes the code vastly more complex for an 
> error case and for some drivers where irq mask is required during reset 
> waits won't help.

wait_for_cmd_done() and similar functions in other drivers are called, let's
say, 3 times from an interrupt handler or spinlocked routine and 20 times from
functions that run with neither interrupts disabled nor a spinlock held.

Spinlocked regions are usually protected by spin_lock_irqsave().

So the code reads

	if (!spin_is_locked(sl))
		conditional_schedule();

This doesn't make the whole problem go away, but could make the situation a
little bit better for most of the time?


 - Jussi Laako

-- 
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B  39DD A4DE 63EB C216 1E4B
Available at PGP keyservers


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 22:20                                                                           ` Jussi Laako
@ 2002-01-15  1:43                                                                             ` Stephan von Krawczynski
  2002-01-15 20:29                                                                               ` Jussi Laako
  0 siblings, 1 reply; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-15  1:43 UTC (permalink / raw)
  To: Jussi Laako; +Cc: Alan Cox, Andrew Morton, linux-kernel

> Alan Cox wrote:
> >
> > > > In eepro100.c, wait_for_cmd_done() can busywait for one millisecond
> > > > and is called multiple times under spinlock.
> > >
> > > Did I get that right, as long as spinlocked no sense in
> > > conditional_schedule()
> >
> > No conditional schedule, no pre-emption. You would need to rewrite that
> > code to do something like try for 100uS then queue a 1 tick timer to
> > retry asynchronously. That makes the code vastly more complex for an
> > error case and for some drivers where irq mask is required during reset
> > waits won't help.
>
> That wait_for_cmd_done() and similar functions in other drivers are called
> let's say 3 times in interrupt handler or spinlocked routine and 20 times in
> non-interrupts disabled nor spinlocked functions.
>
> Spinlocked regions are usually protected by spin_lock_irqsave().

Now I have really waited for this one: _usually_.
My servers work usually, except for the days they hit that other
rather unusual part of the code...

> So the code reads
>
> 	if (!spin_is_locked(sl))
> 		conditional_schedule();
>
> This doesn't make the whole problem go away, but could make the situation a
> little bit better for most of the time?

Time to take out these big hats and rename ourself to Gandalf or the
like. What do you expect your server to do, having no problem "most of
the time"? Please read Albert E. Time can be pretty relative to your
personal point of view...

Regards,
Stephan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-15  1:43                                                                             ` Stephan von Krawczynski
@ 2002-01-15 20:29                                                                               ` Jussi Laako
  0 siblings, 0 replies; 351+ messages in thread
From: Jussi Laako @ 2002-01-15 20:29 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

Stephan von Krawczynski wrote:
> 
> Time to take out these big hats and rename ourself to Gandalf or the
> like. What do you expect your server to do, having no problem "most of
> the time"? Please read Albert E. Time can be pretty relative to your
> personal point of view...

Is this flaming really necessary?

So we don't have to care about how long some driver spends in its internal
loops? So we could as well start writing drivers like

	while (!frame_received()) udelay(1000000);

Because one day we will have a power shortage anyway, and that will cause a
latency peak of a few hours anyway? And if the user pulls the ethernet plug we
don't have to do anything else?


 - Jussi Laako

-- 
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B  39DD A4DE 63EB C216 1E4B
Available at PGP keyservers


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:47                                                                       ` Stephan von Krawczynski
  2002-01-14 12:29                                                                         ` Alan Cox
@ 2002-01-14 19:58                                                                         ` george anzinger
  1 sibling, 0 replies; 351+ messages in thread
From: george anzinger @ 2002-01-14 19:58 UTC (permalink / raw)
  To: Stephan von Krawczynski
  Cc: Andrew Morton, alan, zippel, rml, ken, arjan, landley,
	linux-kernel

Stephan von Krawczynski wrote:
> 
> On Mon, 14 Jan 2002 02:04:56 -0800
> Andrew Morton <akpm@zip.com.au> wrote:
> 
> > Stephan von Krawczynski wrote:
> > >
> > > ...
> > > > Unfortunately me have neither of those. This would mean I cannot benefit
> > > > from _these_ patches, but instead would need _others_
> [...]
> > >
> > > In 3c59x.c, probably the biggest problem will be the call to issue_and_wait()
> > > in boomerang_start_xmit().  On a LAN which is experiencing heavy collision
> > > rates this can take as long as 2,000 PCI cycles (it's quite rare, and possibly
> > > an erratum).  It is called under at least two spinlocks.
> >
> > In via-rhine, wait_for_reset() can busywait for up to ten milliseconds.
> > via_rhine_tx_timeout() calls it from under a spinlock.
> >
> > In eepro100.c, wait_for_cmd_done() can busywait for one millisecond
> > and is called multiple times under spinlock.
> 
> Did I get that right, as long as spinlocked no sense in conditional_schedule()
> ?
> 
> > Preemption will help _some_ of this, but by no means all, or enough.
> 
> Maybe we should really try to shorten the lock-times _first_. You mentioned a
> way to find the bad guys?

Apply the preempt patch and then the preempt-stats patch.  Follow
instructions that come with the stats patch.  It will report on the
longest preempt disable times since the last report.  You need to
provide a load that will exercise the bad code, but it will tell you
which, where, and how bad.  Note: it measures preempt off time, NOT how
long it took to get to some task, i.e. it does not depend on requesting
preemption at the worst possible time.
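
The measurement idea described above (record how long preemption stayed
off, keep the worst case, reset on each report) can be sketched in a few
lines of userspace C. This is only an illustration of the bookkeeping, not
the preempt-stats patch itself, whose implementation is not shown in this
thread; struct preempt_stats and the stats_*() helpers are hypothetical
names, and timestamps are passed in by hand instead of being read from a
clock.

```c
#include <assert.h>

/* Hypothetical bookkeeping for "longest preempt-off time since last report". */
struct preempt_stats {
	unsigned long off_since;	/* timestamp when preemption was disabled */
	unsigned long worst;		/* longest disabled interval so far */
};

/* Called where preemption gets disabled (e.g. on taking a lock). */
static void stats_preempt_off(struct preempt_stats *s, unsigned long now)
{
	s->off_since = now;
}

/* Called where preemption is enabled again; keeps the maximum. */
static void stats_preempt_on(struct preempt_stats *s, unsigned long now)
{
	unsigned long len = now - s->off_since;

	if (len > s->worst)
		s->worst = len;
}

/* Return the worst interval seen and start a fresh measurement period. */
static unsigned long stats_report(struct preempt_stats *s)
{
	unsigned long worst = s->worst;

	s->worst = 0;
	return worst;
}
```

Because only the off-to-on interval is recorded, the result does not depend
on a preemption request arriving at the worst possible moment.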
-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  9:45                                                                   ` Stephan von Krawczynski
  2002-01-14 10:04                                                                     ` Andrew Morton
@ 2002-01-14 10:09                                                                     ` Alan Cox
  2002-01-14 15:02                                                                     ` J.A. Magallon
  2002-01-14 22:03                                                                     ` Jussi Laako
  3 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-14 10:09 UTC (permalink / raw)
  To: Stephan von Krawczynski
  Cc: Alan Cox, zippel, rml, ken, arjan, landley, linux-kernel

> Just a short question: the last (add-on) patch to mini-ll I saw on the list
> patches: drivers/net/3c59x.c
> drivers/net/8139too.c
> drivers/net/eepro100.c

I've seen multiple quite frankly bizarre patches like that. I've not applied
them and don't see the point

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  9:45                                                                   ` Stephan von Krawczynski
  2002-01-14 10:04                                                                     ` Andrew Morton
  2002-01-14 10:09                                                                     ` Alan Cox
@ 2002-01-14 15:02                                                                     ` J.A. Magallon
  2002-01-14 15:03                                                                       ` Arjan van de Ven
  2002-01-14 19:35                                                                       ` Robert Love
  2002-01-14 22:03                                                                     ` Jussi Laako
  3 siblings, 2 replies; 351+ messages in thread
From: J.A. Magallon @ 2002-01-14 15:02 UTC (permalink / raw)
  To: Stephan von Krawczynski
  Cc: Alan Cox, zippel, rml, ken, arjan, landley, linux-kernel


On 20020114 Stephan von Krawczynski wrote:
>
>Hm, obviously the ll-patches look simple, but their pure required number makes
>me think they are as well stupid as simple. This whole story looks like making
>an old mac do real multitasking, just spread around scheduling points

Yup. That reminds me...
Is there any kernel call that every driver makes, where a
conditional_schedule() could be hidden so that everyone does it without even
knowing? Just like Apple put SystemTask() inside GetNextEvent()...

-- 
J.A. Magallon                           #  Let the source be with you...        
mailto:jamagallon@able.es
Mandrake Linux release 8.2 (Cooker) for i586
Linux werewolf 2.4.18-pre3-beo #5 SMP Sun Jan 13 02:14:04 CET 2002 i686

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 15:02                                                                     ` J.A. Magallon
@ 2002-01-14 15:03                                                                       ` Arjan van de Ven
  2002-01-14 19:50                                                                         ` george anzinger
  2002-01-14 19:35                                                                       ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: Arjan van de Ven @ 2002-01-14 15:03 UTC (permalink / raw)
  To: J.A. Magallon, linux-kernel

"J.A. Magallon" wrote:
> 
> On 20020114 Stephan von Krawczynski wrote:
> >
> >Hm, obviously the ll-patches look simple, but their pure required number makes
> >me think they are as well stupid as simple. This whole story looks like making
> >an old mac do real multitasking, just spread around scheduling points
> 
> Yup. That remind me of...
> Would there be any kernel call every driver is doing just to hide there
> a conditional_schedule() so everyone does it even without knowledge of it ?
> Just like Apple put the SystemTask() inside GetNextEvent()...


Well the preempt patch sort of does this in every spin_unlock*() .....

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 15:03                                                                       ` Arjan van de Ven
@ 2002-01-14 19:50                                                                         ` george anzinger
  0 siblings, 0 replies; 351+ messages in thread
From: george anzinger @ 2002-01-14 19:50 UTC (permalink / raw)
  To: arjanv; +Cc: J.A. Magallon, linux-kernel

Arjan van de Ven wrote:
> 
> "J.A. Magallon" wrote:
> >
> > On 20020114 Stephan von Krawczynski wrote:
> > >
> > >Hm, obviously the ll-patches look simple, but their pure required number makes
> > >me think they are as well stupid as simple. This whole story looks like making
> > >an old mac do real multitasking, just spread around scheduling points
> >
> > Yup. That remind me of...
> > Would there be any kernel call every driver is doing just to hide there
> > a conditional_schedule() so everyone does it even without knowledge of it ?
> > Just like Apple put the SystemTask() inside GetNextEvent()...
> 
> Well the preempt patch sort of does this in every spin_unlock*() .....
> -
Gosh, not really.  The nature of the preempt patch is to allow
preemption on completion of the interrupt that put the contending task
back in the run list.  This can not be done if a spin lock is held, so
yes, there is a test on exit from the spin lock, but the point is that
this is only needed when the lock is released, not in unlocked code.

The utility of most of the ll patches is that they address the problem
within locked regions.  This is why there is a lock-break patch that is
designed to augment the preempt patch.  It picks up several of the long
held spin locks and pops out of them early to allow preemption, and then
relocks and continues (after picking up the pieces, of course).
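
The lock-break pattern described above (drop a long-held lock early, let
preemption happen, relock and continue) can be sketched in userspace C as
follows. This is not code from the lock-break patch: struct fake_lock and
its helpers are hypothetical stand-ins that only count how much work is
done per hold, and the batch parameter (assumed to be greater than zero) is
an invented tuning knob.

```c
#include <assert.h>

/* Hypothetical stand-in for a spinlock that records hold lengths. */
struct fake_lock {
	int held;	/* 1 while the lock is taken */
	int cur_hold;	/* work items done in the current hold */
	int max_hold;	/* longest hold seen so far, in work items */
	int breaks;	/* number of times the lock was dropped early */
};

static void fake_lock_acquire(struct fake_lock *l)
{
	l->held = 1;
	l->cur_hold = 0;
}

static void fake_lock_release(struct fake_lock *l)
{
	if (l->cur_hold > l->max_hold)
		l->max_hold = l->cur_hold;
	l->held = 0;
}

/*
 * Sum n items under the lock, but never process more than batch items
 * in one hold. Between batches the lock is dropped and retaken, which
 * is where a preemptible kernel could switch tasks. "Picking up the
 * pieces" is trivial here because the index i stays valid; real code
 * would have to revalidate its position after relocking.
 */
static long sum_with_lock_break(struct fake_lock *l, const int *items,
				int n, int batch)
{
	long sum = 0;

	fake_lock_acquire(l);
	for (int i = 0; i < n; i++) {
		if (l->cur_hold == batch) {
			fake_lock_release(l);	/* preemption window */
			l->breaks++;
			fake_lock_acquire(l);
		}
		sum += items[i];
		l->cur_hold++;
	}
	fake_lock_release(l);
	return sum;
}
```

With ten items and a batch of four, the longest hold is capped at four
items at the cost of two extra release/acquire pairs.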
-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 15:02                                                                     ` J.A. Magallon
  2002-01-14 15:03                                                                       ` Arjan van de Ven
@ 2002-01-14 19:35                                                                       ` Robert Love
  2002-01-14 15:46                                                                         ` Rob Landley
  1 sibling, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-14 19:35 UTC (permalink / raw)
  To: J.A. Magallon
  Cc: Stephan von Krawczynski, Alan Cox, zippel, ken, arjan, landley,
	linux-kernel

On Mon, 2002-01-14 at 10:02, J.A. Magallon wrote:

> Yup. That remind me of...
> Would there be any kernel call every driver is doing just to hide there
> a conditional_schedule() so everyone does it even without knowledge of it ?
> Just like Apple put the SystemTask() inside GetNextEvent()...

It's not nearly that easy.  If it were, we would all certainly switch to
the preemptive kernel design, and preempt whenever and wherever we
needed.

Instead, we have to worry about reentrancy and thus can not preempt
inside critical regions (denoted by spinlocks).  So we can't have
preempt there, and have more work to do -- thus this discussion.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 19:35                                                                       ` Robert Love
@ 2002-01-14 15:46                                                                         ` Rob Landley
  0 siblings, 0 replies; 351+ messages in thread
From: Rob Landley @ 2002-01-14 15:46 UTC (permalink / raw)
  To: Robert Love, J.A. Magallon
  Cc: Stephan von Krawczynski, Alan Cox, zippel, ken, arjan,
	linux-kernel

On Monday 14 January 2002 02:35 pm, Robert Love wrote:
> On Mon, 2002-01-14 at 10:02, J.A. Magallon wrote:
> > Yup. That remind me of...
> > Would there be any kernel call every driver is doing just to hide there
> > a conditional_schedule() so everyone does it even without knowledge of it
> > ? Just like Apple put the SystemTask() inside GetNextEvent()...
>
> It's not nearly that easy.  If it were, we would all certainly switch to
> the preemptive kernel design, and preempt whenever and wherever we
> needed.
>
> Instead, we have to worry about reentrancy and thus can not preempt
> inside critical regions (denoted by spinlocks).  So we can't have
> preempt there, and have more work to do -- thus this discussion.
>
> 	Robert Love

The real question is: what can get in.

Variants of the explicit scheduling points have been around for over a year, 
since Ingo's original version.  Just a few days ago Marcelo once again said 
that if all the patch does is add scheduling points, he had no intention of 
integrating it.  Linus's opinion on the matter has pretty much been about the 
same since Ingo's version.

If explicit scheduling points ARE a better first step than preempt (which 
doesn't necessarily eliminate preempt, it just lets us move forward while 
arguing), when the heck might they possibly appear in a mainline kernel we 
don't have to manually patch each new release of?

Rob

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  9:45                                                                   ` Stephan von Krawczynski
                                                                                       ` (2 preceding siblings ...)
  2002-01-14 15:02                                                                     ` J.A. Magallon
@ 2002-01-14 22:03                                                                     ` Jussi Laako
  2002-01-15  1:34                                                                       ` Stephan von Krawczynski
  3 siblings, 1 reply; 351+ messages in thread
From: Jussi Laako @ 2002-01-14 22:03 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: Alan Cox, linux-kernel

Stephan von Krawczynski wrote:
> 
> Just a short question: the last (add-on) patch to mini-ll I saw on the 

It was for full-ll, not for mini-ll.

> patches: drivers/net/3c59x.c
> drivers/net/8139too.c
> drivers/net/eepro100.c
> 
> Unfortunately me have neither of those. This would mean I cannot benefit 
> from _these_ patches, but instead would need _others_ (like tulip or
> name-one-of-the-rest-of-the-drivers) to see _some_ effect you tell me I
> _should_ see (I currently see _none_). How do you argue then against the
> statement: we need patches for /drivers/net/*.c ?? I do not expect 
> 3c59x.c to be particularly bad in comparison to tulip/*.c or lets say 
> via-rhine.c, do you?

I also checked the tulip driver (which is the one I use at home) and didn't
find any need for "fixing" there. I will definitely take a closer look at that
driver in the future.

WLAN drivers seem to need some hacking, but I'm not very interested in that
area. I think WLAN is one big security hole that no one should be using...


 - Jussi Laako

-- 
PGP key fingerprint: 161D 6FED 6A92 39E2 EB5B  39DD A4DE 63EB C216 1E4B
Available at PGP keyservers


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 22:03                                                                     ` Jussi Laako
@ 2002-01-15  1:34                                                                       ` Stephan von Krawczynski
  0 siblings, 0 replies; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-15  1:34 UTC (permalink / raw)
  To: Jussi Laako; +Cc: Alan Cox, linux-kernel

> Stephan von Krawczynski wrote:

> I also checked the tulip driver (which is the one I use at home) and didn't
> find need for "fixing" there. I will definitely take a closer look at that
> driver in future.

Aha, I see, everything as expected...

> WLAN drivers seem to need some hacking, but I'm not very interested in that
> area. I think WLAN is one big security hole that noone should be using...

Uhm, I got hot news: internet is insecure.
Come on, this is really silly, every network is a security problem,
not speaking of the guys sitting in front of the screens...

Regards,
Stephan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 20:25                                                           ` Roman Zippel
  2002-01-13 21:11                                                             ` Alan Cox
@ 2002-01-14  0:10                                                             ` yodaiken
  2002-01-14  0:41                                                               ` Roman Zippel
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-14  0:10 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

On Sun, Jan 13, 2002 at 09:25:50PM +0100, Roman Zippel wrote:
> I don't doubt that, but would you seriously consider the ll patch for
> inclusion into the main kernel?
> It's a useful patch for anyone, who needs good latencies now, but it's
> still a quick&dirty solution. Preempt offers a clean solution for a
> certain part of the problem, as it's possible to cleanly localize the
> needed changes for preemption (at least for UP). That means the ll patch
> becomes smaller and future work on ll becomes simpler, since a certain

That is exactly what Andrew Morton disputes. So why do you think he is
wrong?



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:10                                                             ` yodaiken
@ 2002-01-14  0:41                                                               ` Roman Zippel
  2002-01-14  1:05                                                                 ` yodaiken
  2002-01-14  1:19                                                                 ` Robert Love
  0 siblings, 2 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14  0:41 UTC (permalink / raw)
  To: yodaiken
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> > It's a useful patch for anyone, who needs good latencies now, but it's
> > still a quick&dirty solution. Preempt offers a clean solution for a
> > certain part of the problem, as it's possible to cleanly localize the
> > needed changes for preemption (at least for UP). That means the ll patch
> > becomes smaller and future work on ll becomes simpler, since a certain
> 
> That is exactly what Andrew Morton disputes. So why do you think he is
> wrong?

Please explain, what do you mean?

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:41                                                               ` Roman Zippel
@ 2002-01-14  1:05                                                                 ` yodaiken
  2002-01-14 10:16                                                                   ` Roman Zippel
  2002-01-14  1:19                                                                 ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-14  1:05 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Alan Cox, Robert Love, Kenneth Johansson, arjan,
	Rob Landley, linux-kernel

On Mon, Jan 14, 2002 at 01:41:35AM +0100, Roman Zippel wrote:
> Hi,
> 
> yodaiken@fsmlabs.com wrote:
> 
> > > It's a useful patch for anyone, who needs good latencies now, but it's
> > > still a quick&dirty solution. Preempt offers a clean solution for a
> > > certain part of the problem, as it's possible to cleanly localize the
> > > needed changes for preemption (at least for UP). That means the ll patch
> > > becomes smaller and future work on ll becomes simpler, since a certain
> > 
> > That is exactly what Andrew Morton disputes. So why do you think he is
> > wrong?
> 
> Please explain, what do you mean?

I mean, that these conversations are not very useful if you don't
read what the other people write.
Here's a prior response by Andrew to a post by you.

From akpm@zip.com.au  Sat Jan 12 13:15:22 2002
Roman Zippel wrote:
> 
> Andrew's patch requires constant audition and Andrew can't audit all
> drivers for possible problems. That doesn't mean Andrew's work is
> wasted, since it identifies problems, which preempting can't solve, but
> it will always be a hunt for the worst cases, where preempting goes for
> the general case.

Guys,

I've heard this so many times, and it just ain't so.   The overwhelming
majority of problem areas are inside locks.  All the complexity and 
maintainability difficulties to which you refer exist in the preempt
patch as well.    There just is no difference.

> 
> bye, Roman

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  1:05                                                                 ` yodaiken
@ 2002-01-14 10:16                                                                   ` Roman Zippel
  0 siblings, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 10:16 UTC (permalink / raw)
  To: yodaiken
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

Hi,

yodaiken@fsmlabs.com wrote:

> I mean, that these conversations are not very useful if you don't
> read what the other people write.

Oh, I do read everything that's for me, but it happens I'm not answering
everything, but thanks for your kind reminder.

> Here's a prior response by Andrew to a post by you.
>
> I've heard this so many times, and it just ain't so.   The overwhelming
> majority of problem areas are inside locks.  All the complexity and
> maintainability difficulties to which you refer exist in the preempt
> patch as well.    There just is no difference.

There is a difference. There is of course the maintenance work which
both approaches have in common: every kernel has to be tested for new
latency problems. What differs is the number of problems that need fixing,
as preempt can handle the problems outside of locks automatically, though
inserting schedule points for these cases is usually quite simple. (I
leave it open whether these problems are really only the minority; it
doesn't matter much, it only matters that they do exist.)
The remaining problems need to be examined again by either approach.
Breaking up the locks and inserting (implicit or explicit) schedule points
is often the simpler solution, but analyzing the problem and adjusting the
algorithm leads usually to the cleaner solution. Anyway, this is again
pretty much common for both approaches.
There is an additional cost for maintaining the explicit schedule points,
as they mean additional code all over the kernel, which has to be
maintained and verified every time something is changed in that area.
This work has to be done by someone; if the ll patch were included
in the standard kernel, the burden would fall on every maintainer of
the systems that were changed. The easy approach for these maintainers
would of course be to just drop these schedule points, but then the ll
maintainer has to start from zero, which adds to the testing costs above.
This additional cost does not exist for all cases that are automatically
handled by preempting.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:41                                                               ` Roman Zippel
  2002-01-14  1:05                                                                 ` yodaiken
@ 2002-01-14  1:19                                                                 ` Robert Love
  1 sibling, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-14  1:19 UTC (permalink / raw)
  To: Roman Zippel
  Cc: yodaiken, Alan Cox, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

On Sun, 2002-01-13 at 19:41, Roman Zippel wrote:

> > That is exactly what Andrew Morton disputes. So why do you think he is
> > wrong?

Victor is saying that Andrew contends the hard parts of his low-latency
patch are just as hard to maintain with a preemptive kernel.  This is
true, for the places where spinlocks are held anyway, but it assumes we
continue to treat lock breaking and explicit scheduling as our only
solution.  It isn't under a preemptible kernel.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:18                                                       ` Roman Zippel
  2002-01-13 15:36                                                         ` Arjan van de Ven
  2002-01-13 15:45                                                         ` Alan Cox
@ 2002-01-13 18:13                                                         ` Robert Love
  2002-01-14  1:50                                                         ` Rik van Riel
  3 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-13 18:13 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, Kenneth Johansson, arjan, Rob Landley, linux-kernel

On Sun, 2002-01-13 at 10:18, Roman Zippel wrote:

> What somehow got lost in this discussion, that both patches don't
> necessarily conflict with each other, they both attack the same problem
> with different approaches, which complement each other. I prefer to get
> the best of both patches.
> The ll patch identifies problem, which preempt alone can't fix, on the
> other hand the ll patch inserts schedule calls all over the place, where
> preempt can handle this transparently. So far I haven't seen any
> evidence, that preempt introduces any _new_ serious problems, so I'd
> rather like to see to get the best out of both.

Good point.  In fact, I have an "ll patch" for preempt-kernel, it is
called lock-break and available at
	ftp://ftp.kernel.org/pub/linux/kernel/people/rml/lock-break

While I am not so sure this sort of explicit work is the answer -- I'd
much prefer to work on the algorithms to shorten lock time or lock into
different locks -- it does work.  The work is based heavily on Andrew's
ll patch but designed for use with preempt-kernel.  This means we can
drop some of the conditional schedules that aren't needed, and in others
we don't need to call schedule (just drop the locks).

	Robert Love
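
[A minimal userspace sketch of the lock-break idea described above, assuming
a long loop that would otherwise run entirely under one lock. The names
model_lock, model_lock_acquire, model_lock_release and walk_with_lock_break
are hypothetical and are not taken from the lock-break patch or from the
kernel. The lock here is only a counter, so the sketch shows the control
flow, not real mutual exclusion.]

```c
#include <stddef.h>

/* Hypothetical stand-in for a spinlock: it only records state and counts. */
struct model_lock {
    int held;                    /* 1 while "locked" */
    unsigned long acquisitions;  /* how many times the lock was taken */
};

void model_lock_acquire(struct model_lock *l)
{
    l->held = 1;
    l->acquisitions++;
}

void model_lock_release(struct model_lock *l)
{
    l->held = 0;
}

/*
 * Walk n items under the lock, releasing and retaking it after every
 * `batch` items (batch == 0 means never break the lock).
 * Returns the longest run of items handled without dropping the lock.
 */
size_t walk_with_lock_break(struct model_lock *l, size_t n, size_t batch)
{
    size_t longest = 0;
    size_t run = 0;

    model_lock_acquire(l);
    for (size_t i = 0; i < n; i++) {
        /* ... per-item work would go here ... */
        run++;
        if (run > longest)
            longest = run;

        if (batch != 0 && run == batch && i + 1 < n) {
            /*
             * Under a preemptible kernel, dropping the lock is already
             * a point where a waiting task may run, so no explicit
             * schedule call is modelled here.
             */
            model_lock_release(l);
            model_lock_acquire(l);
            run = 0;
        }
    }
    model_lock_release(l);
    return longest;
}
```

[With n = 10 and batch = 4 the lock is taken three times and never held for
more than four items; with batch = 0 it is taken once and held for all ten.]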


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:18                                                       ` Roman Zippel
                                                                           ` (2 preceding siblings ...)
  2002-01-13 18:13                                                         ` Robert Love
@ 2002-01-14  1:50                                                         ` Rik van Riel
  2002-01-14  1:56                                                           ` Robert Love
  2002-01-14 10:55                                                           ` Roman Zippel
  3 siblings, 2 replies; 351+ messages in thread
From: Rik van Riel @ 2002-01-14  1:50 UTC (permalink / raw)
  To: Roman Zippel
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

On Sun, 13 Jan 2002, Roman Zippel wrote:

> So far I haven't seen any evidence, that preempt introduces any _new_
> serious problems, so I'd rather like to see to get the best out of
> both.

Are you seriously suggesting you haven't read a single
email in this thread yet ?

Rik
-- 
"Linux holds advantages over the single-vendor commercial OS"
    -- Microsoft's "Competing with Linux" document

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  1:50                                                         ` Rik van Riel
@ 2002-01-14  1:56                                                           ` Robert Love
  2002-01-14 10:55                                                           ` Roman Zippel
  1 sibling, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-14  1:56 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Roman Zippel, Alan Cox, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

On Sun, 2002-01-13 at 20:50, Rik van Riel wrote:

> > So far I haven't seen any evidence, that preempt introduces any _new_
> > serious problems, so I'd rather like to see to get the best out of
> > both.
> 
> Are you seriously suggesting you haven't read a single
> email in this thread yet ?

No, I think he is suggesting he doesn't consider any of the problems
serious.  A lot of it is just smoke.  What is "bad" wrt 2.5?

	Robert Love



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  1:50                                                         ` Rik van Riel
  2002-01-14  1:56                                                           ` Robert Love
@ 2002-01-14 10:55                                                           ` Roman Zippel
  1 sibling, 0 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-14 10:55 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Alan Cox, Robert Love, Kenneth Johansson, arjan, Rob Landley,
	linux-kernel

Hi,

On Sun, 13 Jan 2002, Rik van Riel wrote:

> > So far I haven't seen any evidence, that preempt introduces any _new_
> > serious problems, so I'd rather like to see to get the best out of
> > both.
>
> Are you seriously suggesting you haven't read a single
> email in this thread yet ?

Could you please be more explicit?

bye, Roman


^ permalink raw reply	[flat|nested] 351+ messages in thread

[parent not found: <16QNVQ-2JqEACC@fwd03.sul.t-online.com>]

[parent not found: <3C43D5E1.6785695C@mvista.com>]

* Re: [2.4.17/18pre] VM and swap - it's really unusable
       [not found]                                                         ` <3C43D5E1.6785695C@mvista.com>
@ 2002-01-15  8:32                                                           ` Oliver Neukum
  0 siblings, 0 replies; 351+ messages in thread
From: Oliver Neukum @ 2002-01-15  8:32 UTC (permalink / raw)
  To: george anzinger; +Cc: Momchil Velikov, linux-kernel

On Tuesday 15 January 2002 08:10, george anzinger wrote:

> Yes, this is classic priority inversion.  It is here now, today with
> semaphores which are held by code that blocks.  If the code doesn't
> block, why not use a spin lock?  If it does, well the problem is here

Because eg. other code that holds the semaphore needs to sleep

> now.  I suppose we could set a preempt disable around a semaphore if it
> makes you feel better.  It doesn't fix the problem if the task blocks

It would make me feel better, but it would defeat the purpose.
There's a lot of code holding semaphores.

> AND it is legal to block while holding a preemption lock.

But it's easier to fix. If you can be preempted only by explicitly
sleeping, you can beat priority inversion by changing basically
only wake_up. If you can be preempted at random, you need to know
who holds a semaphore.

	Regards
		Oliver

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:36                                                 ` Kenneth Johansson
  2002-01-12 23:01                                                   ` Robert Love
@ 2002-01-13  1:30                                                   ` Alan Cox
  1 sibling, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13  1:30 UTC (permalink / raw)
  To: Kenneth Johansson; +Cc: Alan Cox, Robert Love, arjan, Rob Landley, linux-kernel

> > I must have missed that in the code. I can see you check __cli() status but
> > I didn't see anywhere you check disable_irq(). Even if you did it doesn't
> > help when I mask the irq on the chip rather than using disable_irq() calls.
> 
> But you get interrupted by other interrupts then, so you have the same problem
> regardless of any preemption patch; you hopefully lose the cpu for a much
> shorter time but still the same problem.

Interrupt paths are well sub millisecond, a pre empt might mean I don't get
the CPU back for measurable chunks of a second.  They are totally different
guarantees. 

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 18:54                                           ` Alan Cox
  2002-01-12 19:23                                             ` Ed Sweetman
  2002-01-12 19:26                                             ` Robert Love
@ 2002-01-12 20:53                                             ` Roman Zippel
  2002-01-12 23:07                                               ` Rob Landley
  2002-01-13  1:26                                               ` Alan Cox
  2002-01-13 22:06                                             ` Daniel Phillips
  2002-01-14  7:22                                             ` Alans example against preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable) Roger Larsson
  4 siblings, 2 replies; 351+ messages in thread
From: Roman Zippel @ 2002-01-12 20:53 UTC (permalink / raw)
  To: Alan Cox; +Cc: arjan, Rob Landley, linux-kernel

Hi,

Alan Cox wrote:

> So with pre-empt this happens
> 
>         driver magic
>         disable_irq(dev->irq)
> PRE-EMPT:
>         [large periods of time running other code]
> PRE-EMPT:
>         We get back and we've missed 300 packets, the serial port sharing
>         the IRQ has dropped our internet connection completely.

But it shouldn't deadlock as Victor is suggesting.

> There are numerous other examples in the kernel tree where the current code
> knows that there is a small bounded time between two actions in kernel space
> that do not have a sleep. They are not spin locked, and putting spin locks
> everywhere will just trash performance. They are pure hardware interactions
> so you can't automatically detect them.

Why should spin locks trash performance, while an expensive disable_irq()
doesn't?

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:53                                             ` Roman Zippel
@ 2002-01-12 23:07                                               ` Rob Landley
  2002-01-13 16:03                                                 ` Alan Cox
  2002-01-13  1:26                                               ` Alan Cox
  1 sibling, 1 reply; 351+ messages in thread
From: Rob Landley @ 2002-01-12 23:07 UTC (permalink / raw)
  To: Roman Zippel, Alan Cox; +Cc: arjan, linux-kernel

On Saturday 12 January 2002 03:53 pm, Roman Zippel wrote:
> Hi,
>
> Alan Cox wrote:
> > So with pre-empt this happens
> >
> >         driver magic
> >         disable_irq(dev->irq)
> > PRE-EMPT:
> >         [large periods of time running other code]
> > PRE-EMPT:
> >         We get back and we've missed 300 packets, the serial port sharing
> >         the IRQ has dropped our internet connection completely.
>
> But it shouldn't deadlock as Victor is suggesting.

Um, hang on...

Obviously, Alan, you know more about the networking stack than I do. :)  But 
could you define "large periods of time running other code"?

The real performance KILLER is when your code gets swapped out, and that 
simply doesn't happen in-kernel.  Yes, the niced app may get swapped out, but 
the syscall it's making is pinned in ram and it will only block on disk 
access when it returns.  So we're talking what kind of delay here, one second?

As for scheduling, even a nice 19 task will still get SOME time, and we can 
find out exactly what the worst case is since we hand out time slices and we 
don't hand out more until EVERYBODY exhausts theirs, including seriously 
niced processes.  So this isn't exactly non-deterministic behavior, is it?  
There IS an upper bound here...

There ISN'T an upper bound on interrupts.  We've got some nasty interrupts in 
the system.  How long does the PCI bus get tied up with spinning video cards 
flooding the bus to make their benchmarks look 5% better?  How long of a 
latency spike did we (until recently) get switching between graphics and text 
consoles?  (I heard that got fixed, moved into a tasklet or some such.  
Haven't looked at it yet.)  Without Andre's IDE patches, how much latency can 
the disk insert at random?

Yes, it's possible that if you have a fork bomb trying to take down your 
system, and you're using an old 10baseT ethernet driver built with some 
serious assumptions about how the kernel works, that you could drop some 
packets.  But I find it interesting that make -j can be considered a fairly 
unrealistic test intentionally overloading the system, yet an example with 
150 active threads all eating CPU time is NOT considered an example of how 
your process's receive buffer could easily fill up and drop packets no matter 
HOW fast the interrupt is working since even 10baseT feeds you 1.1 megabytes 
per second and with a 1 second delay we might have to swap stuff out to make 
room for them if we don't read from the socket in that long...

One other fun little thing about the scheduler: a process that is submitting 
network packets probably isn't entirely CPU bound, is it?  It's doing I/O.  
So even if it's niced, if it's competing with totally CPU bound tasks isn't 
it likely to get promoted?  How real-world is your overload-induced failure 
case example?

As for dropping 300 packets killing your connection, are you saying 802.11 
cards can't have a static burst that blocks your connection for half a 
second?  I've had full second gaps in network traffic on my cable modem, all 
the time, and the current overload behavior of most routers is dropping lots 
and lots of packets on the floor.  (My in-house network is still using an 
ancient 10baseT half-duplex hub.  I'm lazy, and it's still way faster than my 
upstream connection to the internet.)  Datagram delivery is not guaranteed.  
It never has been.  Maybe it will be once ECN comes in, but that's not yet.

What's one of the other examples you were worried about, besides NE2K (which 
can't do 100baseT, even on PCI, and a 100baseT PCI card is now $9 new.  Just 
a data point.)

Rob

(P.S.  The only behavior difference between preempt and SMP, apart from 
contention for per-cpu data, is the potential insertion of latency spikes in 
kernel code, which interrupts do anyway.  You're saying it can matter when 
something disables an interrupt.  Robert Love suggested the macro that 
disables an interrupt can count as a preemption guard just like a spinlock.  
Is this NOT enough to fix the objection?)
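
[A small worked calculation for the 10baseT figure above, as a sketch only.
It uses the raw line rate of 10 Mbit/s and ignores framing overhead, so it
gives an upper bound of 1,250,000 bytes per second, in the same range as the
1.1 megabytes quoted once overhead is deducted. The function names and the
64 KiB receive buffer used in the note below are assumptions for
illustration, not values from the thread.]

```c
#include <stdint.h>

/*
 * Bytes that arrive on a link running at `bits_per_sec` while the reader
 * is stalled for `stall_ms` milliseconds. Framing overhead is ignored,
 * so this is an upper bound on payload.
 */
uint64_t backlog_bytes(uint64_t bits_per_sec, uint64_t stall_ms)
{
    return bits_per_sec / 8 * stall_ms / 1000;
}

/* Nonzero if that backlog would not fit in a buffer of `buf_bytes`. */
int backlog_overflows(uint64_t bits_per_sec, uint64_t stall_ms,
                      uint64_t buf_bytes)
{
    return backlog_bytes(bits_per_sec, stall_ms) > buf_bytes;
}
```

[For a one second stall at 10 Mbit/s this gives 1,250,000 bytes, far more
than an assumed 64 KiB buffer; for a 10 ms stall it gives 12,500 bytes,
which would fit.]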

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 23:07                                               ` Rob Landley
@ 2002-01-13 16:03                                                 ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13 16:03 UTC (permalink / raw)
  To: Rob Landley; +Cc: Roman Zippel, Alan Cox, arjan, linux-kernel

> Obviously, Alan, you know more about the networking stack than I do. :)  But 
> could you define "large periods of time running other code"?

10ths of a second if there is a lot to let run instead of this thread. Even
1/100th is bad news. 

> There ISN'T an upper bound on interrupts.  We've got some nasty interrupts in 
> the system.  How long does the PCI bus get tied up with spinning video cards 
> flooding the bus to make their benchmarks look 5% better?  How long of a 

They don't flood the bus with interrupts, they lock the bus off for several
milliseconds worst case. Which btw you'll note means that lowlatency already
achieves the best value you can get

> latency spike did we (until recently) get switching between graphics and text 
> consoles?  (I heard that got fixed, moved into a tasklet or some such.  
> Haven't looked at it yet.)  Without Andre's IDE patches, how much latency can 

Been fixed in -ac for ages, and finally made Linus tree.

> the disk insert at random?

IDE with or without Andre's changes can insert multiple millisecond delays
on the bus in some situations. Again pre-empt patch can offer you nothing 
because the hardware limit is easily met by low latency

> One other fun little thing about the scheduler: a process that is submitting 
> network packets probably isn't entirely CPU bound, is it?  It's doing I/O.  

Network packets get submitted from _outside_

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:53                                             ` Roman Zippel
  2002-01-12 23:07                                               ` Rob Landley
@ 2002-01-13  1:26                                               ` Alan Cox
  2002-01-13 13:34                                                 ` Roman Zippel
  1 sibling, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-13  1:26 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

> > everywhere will just trash performance. They are pure hardware interactions
> > so you can't automatically detect them.
> 
> Why should spin locks trash performance, while an expensive disable_irq()
> doesn't?

disable_irq only blocks _one_ interrupt line, spin_lock_irqsave locks the
interrupt off on a uniprocessor, and  50% of the time off on a
dual processor. 

If I use a spin lock you can't run a modem and an NE2000 card together on
Linux 2.4. That's why I had to do that work on the code. It's one of myriads
of basic obvious cases that the pre-empt patch gets wrong

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13  1:26                                               ` Alan Cox
@ 2002-01-13 13:34                                                 ` Roman Zippel
  2002-01-13 15:19                                                   ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Roman Zippel @ 2002-01-13 13:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: arjan, Rob Landley, linux-kernel

Hi,

Alan Cox wrote:

> disable_irq only blocks _one_ interrupt line, spin_lock_irqsave locks the
> interrupt off on a uniprocessor, and  50% of the time off on a
> dual processor.
> 
> If I use a spin lock you can't run a modem and an NE2000 card together on
> Linux 2.4. That's why I had to do that work on the code. It's one of myriads
> of basic obvious cases that the pre-empt patch gets wrong

I wouldn't say it gets it wrong, the driver also has to take a non irq
spinlock anyway, so the window is quite small and even then the packet
is only delayed.
But now I really have to look at that driver and try a more optimistic
irq disabling approach, otherwise it will happily disable the most
important shared interrupt on my Amiga for ages.

bye, Roman

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 13:34                                                 ` Roman Zippel
@ 2002-01-13 15:19                                                   ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13 15:19 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

> I wouldn't say it gets it wrong, the driver also has to take a non irq
> spinlock anyway, so the window is quite small and even then the packet
> is only delayed.

Or you lose a pile of them

> But now I really have to look at that driver and try a more optimistic
> irq disabling approach, otherwise it will happily disable the most
> important shared interrupt on my Amiga for ages.

If you play with the code remember that the irq delivery on x86 is
asynchronous. You can disable the irq on the chip, synchronize_irq() on 
the result and very occasionally get the irq delivered after all of that

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 18:54                                           ` Alan Cox
                                                               ` (2 preceding siblings ...)
  2002-01-12 20:53                                             ` Roman Zippel
@ 2002-01-13 22:06                                             ` Daniel Phillips
  2002-01-14  7:22                                             ` Alans example against preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable) Roger Larsson
  4 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-13 22:06 UTC (permalink / raw)
  To: Alan Cox, arjan; +Cc: Rob Landley, linux-kernel

On January 12, 2002 07:54 pm, Alan Cox wrote:
> Another example is in the network drivers. The 8390 core for one example
> carefully disables an IRQ on the card so that it can avoid spinlocking on 
> uniprocessor boxes.
> 
> So with pre-empt this happens
> 
> 	driver magic
> 	disable_irq(dev->irq)
	inc task's preempt inhibit

> PRE-EMPT:
> 	[large periods of time running other code]
> PRE-EMPT:
> 	We get back and we've missed 300 packets, the serial port sharing
> 	the IRQ has dropped our internet connection completely.

	We are ok
	dec task's preempt inhibit
	jmp if nonzero

--
Daniel
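
[A minimal userspace model of the counter scheme outlined above, assuming a
per-task "preempt inhibit" count that is raised when the irq is disabled and
lowered when it is re-enabled. The struct and function names are hypothetical
and are not kernel APIs. No real interrupt line is touched, only the
bookkeeping is shown.]

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-task state; not the kernel's task_struct. */
struct task_model {
    int preempt_inhibit;  /* > 0 means this task must not be preempted */
    bool need_resched;    /* a preemption was requested while inhibited */
};

/* Models disable_irq(): also raise the task's preempt inhibit count. */
void model_disable_irq(struct task_model *t)
{
    t->preempt_inhibit++;
}

/*
 * Models enable_irq(): lower the count and report whether a deferred
 * reschedule should run now (count back at zero and one was requested).
 */
bool model_enable_irq(struct task_model *t)
{
    t->preempt_inhibit--;
    if (t->preempt_inhibit == 0 && t->need_resched) {
        t->need_resched = false;
        return true;
    }
    return false;
}

/*
 * Models a preemption attempt: allowed at once when the count is zero,
 * otherwise only recorded so it can happen after model_enable_irq().
 */
bool model_try_preempt(struct task_model *t)
{
    if (t->preempt_inhibit > 0) {
        t->need_resched = true;
        return false;
    }
    return true;
}
```

[In this model a preemption attempted between model_disable_irq() and
model_enable_irq() is refused and deferred, which is the "We are ok" case
above.]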

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Alans example against preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-12 18:54                                           ` Alan Cox
                                                               ` (3 preceding siblings ...)
  2002-01-13 22:06                                             ` Daniel Phillips
@ 2002-01-14  7:22                                             ` Roger Larsson
  2002-01-14  9:18                                               ` Alan Cox
  4 siblings, 1 reply; 351+ messages in thread
From: Roger Larsson @ 2002-01-14  7:22 UTC (permalink / raw)
  To: Alan Cox, arjan; +Cc: Rob Landley, linux-kernel

On Saturday den 12 January 2002 19.54, Alan Cox wrote:
> Another example is in the network drivers. The 8390 core for one example
> carefully disables an IRQ on the card so that it can avoid spinlocking on
> uniprocessor boxes.
>
> So with pre-empt this happens
>
> 	driver magic
> 	disable_irq(dev->irq)
> PRE-EMPT:
> 	[large periods of time running other code]
> PRE-EMPT:
> 	We get back and we've missed 300 packets, the serial port sharing
> 	the IRQ has dropped our internet connection completely.
>
> ["Don't do that then" isnt a valid answer here. If I did hold a lock
>  it would be for several milliseconds at a time anyway and would reliably
>  trash performance this time]
>

I checked the code in ./drivers/net/8390.c - this is what it REALLY looks like...

	/* Ugly but a reset can be slow, yet must be protected */
		
	disable_irq_nosync(dev->irq);
	spin_lock(&ei_local->page_lock);
		
	/* Try to restart the card.  Perhaps the user has fixed something. */
	ei_reset_8390(dev);
	NS8390_init(dev, 1);
		
	spin_unlock(&ei_local->page_lock);
	enable_irq(dev->irq);

This should be mostly OK for the preemptive kernel. Swapping the irq and spin 
lock lines should be preferred. But I think that is the case in SMP too...

Suppose two processors do the disable_irq_nosync - unlikely but possible...
One gets the spinlock, the other waits
The first runs through the code, exits the spin lock, enables irq
The second starts running the code - without irq disabled!!!

This would work in both cases.
	/* Ugly but a reset can be slow, yet must be protected */
		
	spin_lock(&ei_local->page_lock);
	disable_irq_nosync(dev->irq);
		
	/* Try to restart the card.  Perhaps the user has fixed something. */
	ei_reset_8390(dev);
	NS8390_init(dev, 1);
		
	enable_irq(dev->irq);
	spin_unlock(&ei_local->page_lock);
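
Whether the second processor in the scenario above really ends up running with
the IRQ enabled depends on how disable and enable calls nest. A minimal sketch,
assuming the IRQ layer keeps a per-line depth count so that the line is only
unmasked when the last outstanding disable is undone. The struct and function
names are hypothetical and only model the behaviour; they are not kernel code.

```c
#include <assert.h>

/* Hypothetical model of one interrupt line with nested disables. */
struct irq_line_model {
	unsigned int depth;	/* outstanding disable calls */
	int masked;		/* 1 while the line is masked */
};

/* Mask the line on the first disable; later disables only count. */
void model_disable_irq(struct irq_line_model *line)
{
	if (line->depth++ == 0)
		line->masked = 1;
}

/* Unmask the line only when the last outstanding disable is undone. */
void model_enable_irq(struct irq_line_model *line)
{
	assert(line->depth > 0);	/* an unbalanced enable would be a bug */
	if (--line->depth == 0)
		line->masked = 0;
}
```

Under that assumption, two disables followed by one enable leave the line
masked, so the second processor would still run its reset with the IRQ off.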

/RogerL



-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Alans example against preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-14  7:22                                             ` Alans example against preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable) Roger Larsson
@ 2002-01-14  9:18                                               ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-14  9:18 UTC (permalink / raw)
  To: Roger Larsson; +Cc: Alan Cox, arjan, Rob Landley, linux-kernel

> This should be mostly OK for the preemptive kernel. Swapping the irq and spin
> lock lines should be preferred. But I think that is the case in SMP too...

You deadlock if you swap the two lines over. In this case for pre-empt you
really have to go in and add non pre-emption places to the driver
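
One possible reading of the deadlock, sketched as a tiny model. It assumes the
card's interrupt handler also takes page_lock, and that taking the lock does
not mask interrupts on the local CPU (with real spinning, i.e. SMP or a
pre-emptible kernel). All names below are hypothetical; this is not driver
code.

```c
#include <assert.h>

/* Order in which the reset path performs its two protective steps. */
enum first_step { MASK_IRQ_FIRST, TAKE_LOCK_FIRST };

/*
 * Model an interrupt from the card arriving on the same CPU in the
 * window between the two steps.  Returns 1 if the handler would run
 * and then spin forever on a lock its own CPU already holds.
 */
int irq_in_window_deadlocks(enum first_step order)
{
	int lock_held, irq_masked, handler_runs;

	assert(order == MASK_IRQ_FIRST || order == TAKE_LOCK_FIRST);

	lock_held = (order == TAKE_LOCK_FIRST);
	irq_masked = (order == MASK_IRQ_FIRST);

	/* The handler only runs if the line is still unmasked ... */
	handler_runs = !irq_masked;

	/* ... and it only hangs if the interrupted code holds the lock. */
	return handler_runs && lock_held;
}
```

In this model the original order (mask the IRQ, then lock) has no such window,
while the swapped order does.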

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 20:22                                       ` Rob Landley
                                                           ` (3 preceding siblings ...)
  2002-01-12  9:52                                         ` arjan
@ 2002-01-14 12:08                                         ` Helge Hafting
  2002-01-18 22:41                                           ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable) Pavel Machek
  4 siblings, 1 reply; 351+ messages in thread
From: Helge Hafting @ 2002-01-14 12:08 UTC (permalink / raw)
  To: Rob Landley, linux-kernel

Rob Landley wrote:
> 
> On Friday 11 January 2002 09:50 pm, yodaiken@fsmlabs.com wrote:
> > On Fri, Jan 11, 2002 at 03:33:22PM -0500, Robert Love wrote:
> > > On Fri, 2002-01-11 at 07:37, Alan Cox wrote:
> > > The preemptible kernel plus the spinlock cleanup could really take us
> > > far.  Having looked at a lot of the long-held locks in the kernel, I am
> > > confident at least reasonable progress could be made.
> > >
> > > Beyond that, yah, we need a better locking construct.  Priority
> > > inversion could be solved with a priority-inheriting mutex, which we can
> > > tackle if and when we want to go that route.  Not now.
> >
> > Backing the car up to the edge of the cliff really gives us
> > good results. Beyond that, we could jump off the cliff
> > if we want to go that route.
> > Preempt leads to inheritance and inheritance leads to disaster.
> 
> If preempt leads to disaster then Linux can't do SMP.  Are you saying that's
> the case?

There is a difference.  Preempt has the same locking requirements as
SMP, but there are also _timing_ requirements.

> The preempt patch is really "SMP on UP".  If pre-empt shows up a problem,
> then it's a problem SMP users will see too.  If we can't take advantage of
> the existing SMP locking infrastructure to improve latency and interactive
> feel on UP machines, then SMP for Linux DOES NOT WORK.

One example where preempt may break and SMP does not:

Consider driver code.  Critical data structures are protected by
spinlocks, but some of the access to the hardware device itself is
outside those locks.  (I can prove that the other processors can't get
there with the driver in that state anyway.)

Now, hardware access has timing requirements.  That works on SMP because
you don't lose the CPU to anything but interrupts, and they are fast. 
You get it back almost immediately.  The device in question times out
after a much longer interval.

But preempt may decide to run a time-consuming higher-priority task in
the middle of device access, causing the hardware to time out and fail.
Hardware access isn't necessarily in an interrupt handler.  It may be
done directly in a read/write/ioctl call if the device happens
to be available at the moment.

This is a case where SMP works even though preempt may fail.  I don't
know if this is an issue for existing drivers, but it is possible.

Helge Hafting



> 
> > All the numbers I've seen show Morton's low latency just works better. Are
> > there other numbers I should look at.
> 
> This approach is basically a collection of heuristics.  The kernel has been
> profiled and everywhere a latency spike was found, a band-aid was put on it
> (an explicit scheduling point).  This doesn't say there aren't other latency
> spikes, just that with the collection of hardware and software being
> benchmarked, the latency spikes that were found have each had a band-aid
> individually applied to them.
> 
> This isn't a BAD thing.  If the benchmarks used to find latency spikes are at
> all like real-world use, then it helps real-world applications.  But of
> COURSE the benchmarks are going to look good, since tuning the kernel to
> those benchmarks is the way the patch was developed!
> 
> The majority of the original low latency scheduling point work is handled
> automatically by the SMP on UP kernel.  You don't NEED to insert scheduling
> points anywhere you aren't inside a spinlock.  So the SMP on UP patch makes
> most of the explicit scheduling point patch go away, accomplishing the same
> thing in a less intrusive manner.  (Yes, it makes all kernels act like SMP
> kernels for debugging purposes.  But you can turn it off for debugging if you
> want to, that's just another toggle in the magic sysreq menu.  And this isn't
> entirely a bad thing: applying the enormous UP userbase to the remaining SMP
> bugs is bound to squeeze out one or two more obscure ones, but those bugs DO
> exist already on SMP.)
> 
> However, what's left of the explicit scheduling work is still very useful.
> When you ARE inside a spinlock, you can't just schedule, you have to save
> state, drop the lock(s), schedule, re-acquire the locks, and reload your
> state in case somebody else diddled with the structures you were using.  This
> is a lot harder than just scheduling, but breaking up long-held locks like
> this helps SMP scalability, AND helps latency in the SMP-on-UP case.
> 
> So the best approach is a combination of the two patches.  SMP-on-UP for
> everything outside of spinlocks, and then manually yielding locks that cause
> problems.  Both Robert Love and Andrew Morton have come out in favor of each
> other's patches on lkml just in the past few days.  The patches work together
> quite well, and each wants to see the other's patch applied.
> 
> Rob

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-14 12:08                                         ` [2.4.17/18pre] VM and swap - it's really unusable Helge Hafting
@ 2002-01-18 22:41                                           ` Pavel Machek
  2002-01-20 11:22                                             ` Rob Landley
  2002-01-20 20:22                                             ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable) Robert Love
  0 siblings, 2 replies; 351+ messages in thread
From: Pavel Machek @ 2002-01-18 22:41 UTC (permalink / raw)
  To: Helge Hafting; +Cc: Rob Landley, linux-kernel

Hi!

> > If preempt leads to disaster then Linux can't do SMP.  Are you saying that's
> > the case?
> 
> There is a difference.  Preempt has the same locking requirements as
> SMP, but there are also _timing_ requirements.
...
> > The preempt patch is really "SMP on UP".  If pre-empt shows up a problem,
> > then it's a problem SMP users will see too.  If we can't take advantage of
> > the existing SMP locking infrastructure to improve latency and interactive
> > feel on UP machines, then SMP for Linux DOES NOT WORK.
> 
> One example where preempt may break and SMP does not:
> 
> Consider driver code.  Critical data structures are protected by
> spinlocks, but some of the access to the hardware device itself is
> outside those locks.  (I can prove that the other processors can't get
> there with the driver in that state anyway.)
> 
> Now, hardware access has timing requirements.  That works on SMP because
> you don't lose the CPU to anything but interrupts, and they are fast. 
> You get it back almost immediately.  The device in question times out
> after a much longer interval.

So... how long do you have to stay in interrupt for it to be a bug?

There's *no* requirement that says "it may not take a second to handle
an interrupt". Actually I guess that some nasty conditions (UHCI needs
reset?) may take that long in interrupt. Oh, and actually a few releases
ago, console switching was done from interrupt and it *did* take 2
seconds for me.

If someone assumes interrupts are "short", he has broken code already.

									Pavel
-- 
(about SSSCA) "I don't say this lightly.  However, I really think that the U.S.
no longer is classifiable as a democracy, but rather as a plutocracy." --hpa

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-18 22:41                                           ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable) Pavel Machek
@ 2002-01-20 11:22                                             ` Rob Landley
  2002-01-21 21:48                                               ` Alan Cox
  2002-01-20 20:22                                             ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable) Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: Rob Landley @ 2002-01-20 11:22 UTC (permalink / raw)
  To: Pavel Machek, Helge Hafting; +Cc: linux-kernel

On Friday 18 January 2002 05:41 pm, Pavel Machek wrote:

> So... how long do you have to stay in interrupt for it to be a bug?
>
> There's *no* requirement that says "it may not take a second to handle
> an interrupt". Actually I guess that some nasty conditions (UHCI needs
> reset?) may take that long in interrupt. Oh, and actually a few releases
> ago, console switching was done from interrupt and it *did* take 2
> seconds for me.
>
> If someone assumes interrupts are "short", he has broken code already.

That kinda defeats the entire purpose of low-latency patches, doesn't it?

I'm not entirely certain what Alan's smoking if he's raising the straw man 
argument of a two second delay dropping 300 packets and causing connections 
to abort (my sister's on DSL, 5 second dropouts every time the phone rings, 
but connections continue just fine.  Wouldn't want to play quake under those 
circumstances, but would I really have the ~200 CPU-hog background threads 
while playing quake as per Alan's example, and even then the argument is just 
that either the network driver or the network card is bad...)

Still not as bad an example as Aunt Tillie, I'll grant you...

Rob

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-20 11:22                                             ` Rob Landley
@ 2002-01-21 21:48                                               ` Alan Cox
  2002-01-22 11:52                                                 ` Rob Landley
  0 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-21 21:48 UTC (permalink / raw)
  To: Rob Landley; +Cc: Pavel Machek, Helge Hafting, linux-kernel

> I'm not entirely certain what Alan's smoking if he's raising the straw man 
> argument of a two second delay dropping 300 packets and causing connections 

Go read my original mail about the NE2000 driver. If you are going to accuse
me of smoking things you could at least read the posts you base it on


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-21 21:48                                               ` Alan Cox
@ 2002-01-22 11:52                                                 ` Rob Landley
  2002-01-27 20:37                                                   ` Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Rob Landley @ 2002-01-22 11:52 UTC (permalink / raw)
  To: Alan Cox; +Cc: Pavel Machek, Helge Hafting, linux-kernel

On Monday 21 January 2002 04:48 pm, Alan Cox wrote:
> > I'm not entirely certain what Alan's smoking if he's raising the straw
> > man argument of a two second delay dropping 300 packets and causing
> > connections
>
> Go read my original mail about the NE2000 driver. If you are going to
> accuse me of smoking things

For which I apologize,

> you could at least read the posts you base it on

I did.

Okay, let's review:

> On Sat, 12 Jan 2002 18:54:27 +0000 (GMT), Alan Cox spaketh thusly:
> 
>Another example is in the network drivers. The 8390 core for one example
>carefully disables an IRQ on the card so that it can avoid spinlocking on 
>uniprocessor boxes.

Sounds like a bit of a kludge, but it's not my code.  However, without 
preempt aren't spinlocks basically NOPs on uniprocessor boxes?  What did I 
miss?

And wasn't there discussion of using IRQ disabling as a preempt barrier (at 
least until the syscall returns to userspace or finishes a module unload 
call, clueing us in that it won't reenable it any time soon)?

>So with pre-empt this happens
>
>        driver magic
>        disable_irq(dev->irq)
>PRE-EMPT:
>        [large periods of time running other code]
>PRE-EMPT:
>        We get back and we've missed 300 packets, the serial port sharing
>        the IRQ has dropped our internet connection completely.

Okay, please point out where I missed a curve here:

An NE2K cannot go faster than 10baseT.  (Never designed to.  It's an old ISA 
standard dragged along to PCI largely because they had these chips lying 
around and nobody wanted to come up with a new interface anyway.  But it can 
only handle packets one at a time, as far as I know.  I've got several of 
these suckers lying around in various drawers, some of which are ISA.  I'm 
considering throwing them out since a new 100baseT card is $9 retail.  But I 
digress...).

With 10baseT you've got a theoretical maximum throughput of 1.25 (decimal) 
megabytes/second.  Assuming 1500 byte packet sizes at 10 megabits per second 
on a saturated link, we're talking 833 packets/second so a little over a 
third of a second is the shortest amount of time in which you can drop 300 
packets.
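
The arithmetic above can be checked with a short sketch. It uses the same
simplification as the estimate (no preamble, inter-frame gap or header
overhead), and the function names are made up for illustration.

```c
#include <assert.h>

/* Packets per second on a saturated link, ignoring framing overhead. */
double packets_per_second(double link_bits_per_sec, double packet_bytes)
{
	assert(packet_bytes > 0.0);
	return link_bits_per_sec / 8.0 / packet_bytes;
}

/* Shortest stall, in seconds, during which `dropped` packets can arrive. */
double min_stall_seconds(double dropped, double link_bits_per_sec,
			 double packet_bytes)
{
	return dropped / packets_per_second(link_bits_per_sec, packet_bytes);
}
```

With 10e6 bits/second and 1500 byte packets this gives about 833 packets per
second, so 300 packets take roughly 0.36 seconds to arrive.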

So in a worst case scenario latency spike introduced by an overloaded system 
running make -j where niced down CPU bound processes are doing network I/O 
through a driver that's not doing the right locking for preempt to know it 
shouldn't be interrupted...  Yeah, it could lose 300 packets.  Why this is a 
bad thing when we're designing gigabit ethernet systems with interrupt 
mitigation so they intentionally drop thousands of packets at a time rather 
than livelocking...  Open question.  TCP/IP is designed to retransmit around 
this sort of thing, and even with ECN it's not going to forget how. 

But that wasn't really the bad thing.  The bad thing was the incidentally 
misconfigured serial connection (sharing the network card's IRQ) hanging up.  
Serial maxes out at 115,200 which is 14400 bytes/sec (assuming perfect 8 bit 
encoding with no overhead), and losing 1/3 of that means 4800 bytes, which is 
indeed noticeably more than a 16550a UART's 16 byte buffer.  And SLIP and PPP 
also tend to have a smaller MTU (around 256 bytes for latency reasons), 
meaning the loss here could be a whole 18 packets.  (Assuming your 56k modem 
that can't actually quite do 56k isn't the real bottleneck, but we won't go 
there...)
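
The serial-side numbers follow the same pattern. A minimal sketch, assuming 8
bits per byte with no start/stop-bit overhead as in the estimate above; the
function names are hypothetical.

```c
#include <assert.h>

/* Bytes per second on a serial line, assuming 8 bits per byte. */
unsigned long serial_bytes_per_second(unsigned long baud)
{
	return baud / 8;
}

/* Bytes arriving during a stall of stall_num/stall_den seconds. */
unsigned long bytes_during_stall(unsigned long baud,
				 unsigned long stall_num,
				 unsigned long stall_den)
{
	assert(stall_den > 0);
	return serial_bytes_per_second(baud) * stall_num / stall_den;
}

/* Whole packets of `mtu` bytes that fit into `bytes`. */
unsigned long whole_packets(unsigned long bytes, unsigned long mtu)
{
	assert(mtu > 0);
	return bytes / mtu;
}
```

At 115200 bps a one-third second stall covers 4800 bytes, which is 18 whole
256 byte packets and 300 times the 16 byte UART buffer.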

And I agree that's not good for playing quake, but again: playing quake with 
"make -j" running in the background isn't going to give you the world's 
greatest frame rates anyway.  The 1/3 second latency spike was ENTIRELY due 
to the computer being loaded up with other things to do and not scheduling 
back to you before then.  If your game of quake is experiencing that kind of 
SCHEDULING latency spikes, it's unplayable anyway.

As for hanging up, I've used SLIP at 2400 bps with no error correction, on 
noisy phone lines.  (It sucked, but it more or less worked.)  Yes, it would 
hang up at times when the line noise made retransmission impossible for more 
than about fifteen seconds at a time, which is why PPP was invented.  PPP is 
designed to be MORE robust than SLIP, more intelligent than SLIP about 
retransmits, and above all not to give up nearly as easily.  It's been a 
couple years since I've messed with it in depth, but it seems to me the phone 
generally physically lost carrier before PPP gave up.  (Should PPP over 
Ethernet ever "hang up" and exit during a network storm?)  The modem itself 
doesn't care about the data being transmitted through it, that has no bearing 
on its carrier detect status.  And if pppd exited due to a 1/3 second dropout 
(producing at most 2 garbled packets: one cut off at the start and one cut 
off at the end, the rest simply dropped), then there would be something wrong 
with pppd.

So you've got a "gloom and doom" scenario that, even in this fairly 
pathological worst case, doesn't really seem all that bad.  And it's also a 
purely theoretical objection of a kind that I haven't heard anybody actually 
testing the patch complaining about, AND one that seems like it could be 
addressed by using IRQ disabling as a latency guard in addition to spinlocks.

>["Don't do that then" isnt a valid answer here. If I did hold a lock
> it would be for several milliseconds at a time anyway and would reliably
> trash performance this time]

If NE2K is holding the lock for several milliseconds at a time, how is it 
managing 833 packets/second?  (Is it NOT doing one per interrupt?)

If it's holding the lock for several milliseconds, the overhead of acquiring 
the lock in the first place isn't exactly a show-stopper, is it?

If spinlocks don't get compiled in on non-preempt UP boxes (and are basically 
just an increment in preempt), where is the killer overhead in the UP case?  
If you're saying spinlocks would kill SMP performance, on a lock which should 
basically have no contention at all (when we used to have 100baseT drivers 
using the Big Kernel Lock), 

And again, this is where the use of IRQ blocking as a preempt guard comes in 
handy.  (Which naturally expires when you return to userspace anyway, so 
hand-waving about unlimited blocking time is just that: there IS an upper 
bound here.  And an IRQ block that's part of a device shutdown is really a 
different call, which would probably mostly be confined to the module unload 
code anyway.)

>There are numerous other examples in the kernel tree where the current code
>knows that there is a small bounded time between two actions in kernel space
>that do not have a sleep.

Such as?

(And if the use of IRQ disabling as a preempt guard doesn't fix it, then the 
code is ALREADY hosed because interrupts can be arbitrarily long.  We're 
trying to keep them short now, but we used to switch consoles from interrupt 
context and that could take a LONG time if we were wandering between graphics 
and text consoles.  So you're saying this is code that didn't show up as a 
bug back then...)

>They are not spin locked, and putting spin locks
>everywhere will just trash performance. They are pure hardware interactions
>so you can't automatically detect them.

If they don't have IRQs blocked, they don't have any real latency guarantees 
anyway.  If they DO have IRQs blocked, they can be automatically detected.

>That is why the pre-empt code is a much much bigger problem and task than the
>low latency code.

I don't see it.  Care to point out what I've missed?

>Alan

Rob

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-22 11:52                                                 ` Rob Landley
@ 2002-01-27 20:37                                                   ` Alan Cox
  2002-01-27 22:10                                                     ` Nigel Gamble
  0 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-27 20:37 UTC (permalink / raw)
  To: Rob Landley; +Cc: Alan Cox, Pavel Machek, Helge Hafting, linux-kernel

> >carefully disables an IRQ on the card so that it can avoid spinlocking on 
> >uniprocessor boxes.
> 
> Sounds like a bit of a kludge, but it's not my code.  However, without 
> preempt aren't spinlocks basically NOPs on uniprocessor boxes?  What did I 
> miss?

A spin lock is a nop on uniprocessor. That is much of the point of this. Most
NE2000s are in uniprocessor boxes, so they are the primary target.

> An NE2K cannot go faster than 10baseT.  (Never designed to.  It's an old ISA 

Wrong. There are multiple 100Mbit NE2000 clones (notably PCMCIA ones). I
have one in my laptop for example.

> testing the patch complaining about, AND one that seems like it could be 
> addressed by using IRQ disabling as a latency guard in addition to spinlocks.

I don't believe anyone has tested the driver hard with pre-empt. It's not that
this driver can't be fixed. It's that this is one tiny example of maybe 
thousands of other similar flaws lurking. There is no obvious automated way
to find them either.

> If it's holding the lock for several milliseconds, the overhead of acquiring 
> the lock in the first place isn't exactly a show-stopper, is it?

I don't hold the lock with interrupts off for several milliseconds

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-27 20:37                                                   ` Alan Cox
@ 2002-01-27 22:10                                                     ` Nigel Gamble
  2002-01-27 22:56                                                       ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre]u Alan Cox
  0 siblings, 1 reply; 351+ messages in thread
From: Nigel Gamble @ 2002-01-27 22:10 UTC (permalink / raw)
  To: Alan Cox; +Cc: Rob Landley, Pavel Machek, Helge Hafting, linux-kernel

On Sun, 27 Jan 2002, Alan Cox wrote:
> I don't believe anyone has tested the driver hard with pre-empt. It's not that
> this driver can't be fixed. It's that this is one tiny example of maybe
> thousands of other similar flaws lurking. There is no obvious automated way
> to find them either.

You could make the same argument against SMP, but Linux has SMP support
despite all the thousands of SMP flaws that once lurked with no obvious
automated way to find them.  Most of them have been found.

Actually, there is a way to help to automate the finding of preemption
problems:  you keep a log of kernel preemption events in a circular
buffer, and dump the log after something unexpected happens (like a
kernel oops).  Then you search the log for preemptions that happened in
suspicious places.  Kernel preemptions don't happen very often, so the
log usually goes back several seconds, which is usually plenty of time
to catch the preemption that happened in the wrong place.  (Since SMP
locking problems are also preemption problems, this technique can also
catch SMP problems.)
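
The logging idea can be sketched roughly as follows. This is only a user-space
illustration of a circular event log, not the patch mentioned below; the
struct layout, the buffer size and all names are assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>

#define PREEMPT_LOG_SIZE 8	/* hypothetical; a real log would be larger */

struct preempt_event {
	unsigned long timestamp;	/* when the preemption happened */
	unsigned long address;		/* where execution was preempted */
};

struct preempt_log {
	struct preempt_event events[PREEMPT_LOG_SIZE];
	size_t next;	/* slot the next event is written to */
	size_t count;	/* valid events, at most PREEMPT_LOG_SIZE */
};

/* Record one preemption, overwriting the oldest entry when full. */
void preempt_log_record(struct preempt_log *log, unsigned long timestamp,
			unsigned long address)
{
	assert(log != NULL);
	log->events[log->next].timestamp = timestamp;
	log->events[log->next].address = address;
	log->next = (log->next + 1) % PREEMPT_LOG_SIZE;
	if (log->count < PREEMPT_LOG_SIZE)
		log->count++;
}

/* Copy the logged events oldest-first into out; returns how many. */
size_t preempt_log_dump(const struct preempt_log *log,
			struct preempt_event *out)
{
	size_t start, i;

	assert(log != NULL && out != NULL);
	start = (log->next + PREEMPT_LOG_SIZE - log->count) % PREEMPT_LOG_SIZE;
	for (i = 0; i < log->count; i++)
		out[i] = log->events[(start + i) % PREEMPT_LOG_SIZE];
	return log->count;
}

/* Count logged preemptions whose address lies in [lo, hi). */
size_t preempt_log_count_in_range(const struct preempt_log *log,
				  unsigned long lo, unsigned long hi)
{
	size_t i, hits = 0;

	assert(log != NULL);
	for (i = 0; i < log->count; i++)
		if (log->events[i].address >= lo && log->events[i].address < hi)
			hits++;
	return hits;
}
```

After an oops, the dump gives the most recent preemptions in order, and the
range check is one simple way to look for preemptions inside a code region
that was assumed to run without interruption.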

I have a patch to do this for earlier versions of the kernel preemption
patch - I need to bring it up to date and send it to Robert for use with
the latest versions of his patch.

Nigel Gamble                                    nigel@nrg.org
Mountain View, CA, USA.                         http://www.nrg.org/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre]u
  2002-01-27 22:10                                                     ` Nigel Gamble
@ 2002-01-27 22:56                                                       ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-27 22:56 UTC (permalink / raw)
  To: nigel; +Cc: Alan Cox, Rob Landley, Pavel Machek, Helge Hafting, linux-kernel

> You could make the same argument against SMP, but Linux has SMP support
> despite all the thousands of SMP flaws that once lurked with no obvious
> automated way to find them.  Most of them have been found.

We spent four years on that. It was also done in a very careful and 
precise manner starting with SMP that gave the same guarantees as non SMP
for the 2.0 kernel tree, then moving on to relaxing guarantees in certain
places -as they were audited- for 2.2 and with 2.4 increasing the coverage
to all major points of contention except the scsi layer

> I have a patch to do this for earlier versions of the kernel preemption
> patch - I need to bring it up to date and send it to Robert for use with
> the latest versions of his patch.

Nod - thats a productive approach


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-18 22:41                                           ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable) Pavel Machek
  2002-01-20 11:22                                             ` Rob Landley
@ 2002-01-20 20:22                                             ` Robert Love
  1 sibling, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-20 20:22 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Helge Hafting, Rob Landley, linux-kernel

On Fri, 2002-01-18 at 17:41, Pavel Machek wrote:

> So... how long do you have to stay in interrupt for it to be a bug?
> 
> There's *no* requirement that says "it may not take a second to handle
> an interrupt". Actually I guess that some nasty conditions (UHCI needs
> reset?) may take that long in interrupt. Oh, and actually a few releases
> ago, console switching was done from interrupt and it *did* take 2
> seconds for me.
> 
> If someone assumes interrupts are "short", he has broken code already.

Agreed.  Conversely, however, writing code that introduces long
interrupt-off periods should be considered a BUG.

In other words, relying on short interrupt-off periods is bad form, but
so is writing gross code that rudely keeps them off.

I think we all considered the long off periods in VT switching and fbdev
a bug.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-11 20:33                                   ` Robert Love
  2002-01-12  2:50                                     ` yodaiken
@ 2002-01-12 11:13                                     ` Andrea Arcangeli
  2002-01-12 15:07                                       ` jogi
  1 sibling, 1 reply; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-12 11:13 UTC (permalink / raw)
  To: Robert Love; +Cc: Alan Cox, nigel, Rob Landley, Andrew Morton, linux-kernel

On Fri, Jan 11, 2002 at 03:33:22PM -0500, Robert Love wrote:
> On Fri, 2002-01-11 at 07:37, Alan Cox wrote:
> 
> > It's more than a spinlock cleanup at that point. To do anything useful you have
> > to tackle both priority inversion and some kind of at least semi-formal 
> > validation of the code itself. At the point it comes down to validating the
> > code I'd much rather validate rtlinux than the entire kernel
> 
> The preemptible kernel plus the spinlock cleanup could really take us
> far.  Having looked at a lot of the long-held locks in the kernel, I am
> confident at least reasonable progress could be made.
> 
> Beyond that, yah, we need a better locking construct.  Priority
> inversion could be solved with a priority-inheriting mutex, which we can
> tackle if and when we want to go that route.  Not now.
> 
> I want to lay the groundwork for a better kernel.  The preempt-kernel
> patch gives real-world improvements, it provides a smoother user desktop
> experience -- just look at the positive feedback.  Most importantly,
> however, it provides a framework for superior response with our standard

I don't know how to tell you this: positive feedback compared to the mainline
kernel is totally irrelevant. Mainline has broken read/write/sendfile
syscalls that can hang the machine, etc. That was fixed ages ago in
many ways, and the current way is very lightweight. If you can get positive
feedback compared to -aa, _that_ will matter.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 11:13                                     ` [2.4.17/18pre] VM and swap - it's really unusable Andrea Arcangeli
@ 2002-01-12 15:07                                       ` jogi
  2002-01-12 16:05                                         ` Andrea Arcangeli
                                                           ` (2 more replies)
  0 siblings, 3 replies; 351+ messages in thread
From: jogi @ 2002-01-12 15:07 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Robert Love, Alan Cox, nigel, Rob Landley, Andrew Morton,
	linux-kernel

On Sat, Jan 12, 2002 at 12:13:15PM +0100, Andrea Arcangeli wrote:
> On Fri, Jan 11, 2002 at 03:33:22PM -0500, Robert Love wrote:
> > On Fri, 2002-01-11 at 07:37, Alan Cox wrote:
> > 
> > > It's more than a spinlock cleanup at that point. To do anything useful you have
> > > to tackle both priority inversion and some kind of at least semi-formal 
> > > validation of the code itself. At the point it comes down to validating the
> > > code I'd much rather validate rtlinux than the entire kernel
> > 
> > The preemptible kernel plus the spinlock cleanup could really take us
> > far.  Having looked at a lot of the long-held locks in the kernel, I am
> > confident at least reasonable progress could be made.
> > 
> > Beyond that, yah, we need a better locking construct.  Priority
> > inversion could be solved with a priority-inheriting mutex, which we can
> > tackle if and when we want to go that route.  Not now.
> > 
> > I want to lay the groundwork for a better kernel.  The preempt-kernel
> > patch gives real-world improvements, it provides a smoother user desktop
> > experience -- just look at the positive feedback.  Most importantly,
> > however, it provides a framework for superior response with our standard
> 
> I don't know how to tell you this: positive feedback compared to the mainline
> kernel is totally irrelevant. Mainline has broken read/write/sendfile
> syscalls that can hang the machine, etc. That was fixed ages ago in
> many ways, and the current way is very lightweight. If you can get positive
> feedback compared to -aa, _that_ will matter.

Hello Andrea,

I did my usual compile tests (untar kernel archive, apply patches,
make -j<value> ...).

Here are some results (wall time + percent CPU) for each of five consecutive runs:

        13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp
j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%
j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%
j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%
j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%
j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%

j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%
j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%
j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%
j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%
j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%

* build incomplete (OOM killer killed several cc1 ... )

So far 2.4.13-pre5aa1 had been the king of the hill in compile times.
But this has changed. Now the (by far) fastest kernel is 2.4.18-pre3
+ Ingo's scheduler patch (s) + the preemptive patch (p). I have not yet
tested the preemptive patch alone, since I don't know whether the one I
have applies cleanly against -pre3 without Ingo's patch. I used the
following patches:

s: sched-O1-2.4.17-H6.patch
p: preempt-kernel-rml-2.4.18-pre3-ingo-1.patch

I hope this info is useful to someone.

Kind regards,

   Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 15:07                                       ` jogi
@ 2002-01-12 16:05                                         ` Andrea Arcangeli
  2002-01-13 15:15                                           ` jogi
  2002-01-12 16:52                                         ` yodaiken
  2002-01-13 22:55                                         ` Daniel Phillips
  2 siblings, 1 reply; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-12 16:05 UTC (permalink / raw)
  To: jogi; +Cc: Robert Love, Alan Cox, nigel, Rob Landley, Andrew Morton,
	linux-kernel

On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> On Sat, Jan 12, 2002 at 12:13:15PM +0100, Andrea Arcangeli wrote:
> > On Fri, Jan 11, 2002 at 03:33:22PM -0500, Robert Love wrote:
> > > On Fri, 2002-01-11 at 07:37, Alan Cox wrote:
> > > 
> > > > It's more than a spinlock cleanup at that point. To do anything useful you have
> > > > to tackle both priority inversion and some kind of at least semi-formal 
> > > > validation of the code itself. At the point it comes down to validating the
> > > > code I'd much rather validate rtlinux than the entire kernel
> > > 
> > > The preemptible kernel plus the spinlock cleanup could really take us
> > > far.  Having looked at a lot of the long-held locks in the kernel, I am
> > > confident at least reasonable progress could be made.
> > > 
> > > Beyond that, yah, we need a better locking construct.  Priority
> > > inversion could be solved with a priority-inheriting mutex, which we can
> > > tackle if and when we want to go that route.  Not now.
> > > 
> > > I want to lay the groundwork for a better kernel.  The preempt-kernel
> > > patch gives real-world improvements, it provides a smoother user desktop
> > > experience -- just look at the positive feedback.  Most importantly,
> > > however, it provides a framework for superior response with our standard
> > 
> > I don't know how to tell you, positive feedback compared to mainline
> > kernel is totally irrelevant, mainline has broken read/write/sendfile
> > syscalls that can hang the machine etc... That was fixed ages ago in
> > many ways, current way is very lightweight, if you can get positive
> > feedback compared to -aa _that_ will matter.
> 
> Hello Andrea,
> 
> I did my usual compile tests (untar the kernel archive, apply the patches,
> make -j<value> ...).
> 
> Here are some results (wall time and percent CPU) for each of five consecutive runs:
> 
>         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp
> j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%
> j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%
> j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%
> j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%
> j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%
> 
> j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%
> j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%
> j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%
> j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%
> j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%
> 
> * build incomplete (OOM killer killed several cc1 ... )
> 
> So far 2.4.13-pre5aa1 had been the king of the hill in compile times.
> But this has changed. Now the (by far) fastest kernel is 2.4.18-pre3
> + Ingo's scheduler patch (s) + the preemptive patch (p). I have not yet
> tested the preemptive patch alone, since I don't know whether the one I
> have applies cleanly against -pre3 without Ingo's patch. I used the
> following patches:
> 
> s: sched-O1-2.4.17-H6.patch
> p: preempt-kernel-rml-2.4.18-pre3-ingo-1.patch
> 
> I hope this info is useful to someone.

The improvement of "sp" over "s" is quite visible. I'm not sure how a
small difference in time spent in the kernel can make such a difference
in the final numbers, given that compilation is mostly a userspace task.
I assume you were swapping, or at the very least running out of cache,
right?

btw, I'd be curious if you could repeat the same test with -j1 or -j2
(a more realistic, real-world load)?

Still, the other numbers remain interesting for a thrashing machine, but
a few percent difference on a thrashing box isn't a big difference; vm
changes can influence those numbers more than any preempt or scheduler
change (assuming, of course, my guess that you're swapping is right :).
I guess "p" helps because we simply miss some schedule point in some vm
routine. Hints?

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 16:05                                         ` Andrea Arcangeli
@ 2002-01-13 15:15                                           ` jogi
  0 siblings, 0 replies; 351+ messages in thread
From: jogi @ 2002-01-13 15:15 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Robert Love, Alan Cox, nigel, Rob Landley, Andrew Morton,
	linux-kernel

On Sat, Jan 12, 2002 at 05:05:28PM +0100, Andrea Arcangeli wrote:
> On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:

[...]

> > Hello Andrea,
> > 
> > I did my usual compile tests (untar the kernel archive, apply the patches,
> > make -j<value> ...).
> > 
> > Here are some results (wall time and percent CPU) for each of five consecutive runs:
> > 
> >         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp
> > j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%
> > j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%
> > j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%
> > j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%
> > j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%
> > 
> > j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%
> > j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%
> > j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%
> > j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%
> > j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%
> > 
> > * build incomplete (OOM killer killed several cc1 ... )
> > 
> > So far 2.4.13-pre5aa1 had been the king of the hill in compile times.
> > But this has changed. Now the (by far) fastest kernel is 2.4.18-pre3
> > + Ingo's scheduler patch (s) + the preemptive patch (p). I have not yet
> > tested the preemptive patch alone, since I don't know whether the one I
> > have applies cleanly against -pre3 without Ingo's patch. I used the
> > following patches:
> > 
> > s: sched-O1-2.4.17-H6.patch
> > p: preempt-kernel-rml-2.4.18-pre3-ingo-1.patch
> > 
> > I hope this info is useful to someone.
> 
> The improvement of "sp" over "s" is quite visible. I'm not sure how a
> small difference in time spent in the kernel can make such a difference
> in the final numbers, given that compilation is mostly a userspace task.
> I assume you were swapping, or at the very least running out of cache,
> right?

The system is *heavily* swapping. Plain 2.4.18-pre3 can not even finish
the jobs because it runs out of memory. That's why I used j75 or j100
initially. Otherwise there was not even a difference between the 2.4.10+
vm and the 2.4.9-ac+ vm. All I want to test with this "benchmark" is how
well the system reacts when I throw *lots* of compilation jobs at it ...

> btw, I'd be curious if you could repeat the same test with -j1 or -j2
> (a more realistic, real-world load)?

Using just -j1 or -j2 will probably make no difference (I will test it anyway
and post the results).

> Still, the other numbers remain interesting for a thrashing machine, but
> a few percent difference on a thrashing box isn't a big difference; vm
> changes can influence those numbers more than any preempt or scheduler
> change (assuming, of course, my guess that you're swapping is right :).
> I guess "p" helps because we simply miss some schedule point in some vm
> routine. Hints?

But what *I* like most about the preemptive results is that the run times
vary so little from run to run. Looking at plain 2.4.18-pre3, there is a huge
difference in runtime between the fastest and the slowest run.
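
A minimal sketch of how that run-to-run spread could be put into numbers,
using the j75 wall times from the table quoted above. The helper names
(wall_to_seconds, spread_seconds) and the array names are made up for this
illustration; only the timing strings come from the table.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a "m:ss.ss" wall-clock string, as printed by time(1), to seconds.
 * Returns a negative value if the string does not parse. */
static double wall_to_seconds(const char *s)
{
    int minutes;
    double seconds;

    if (sscanf(s, "%d:%lf", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Spread (slowest run minus fastest run) of n runs, in seconds. */
static double spread_seconds(const char *const runs[], int n)
{
    assert(n > 0);

    double lo = wall_to_seconds(runs[0]);
    double hi = lo;

    for (int i = 1; i < n; i++) {
        double t = wall_to_seconds(runs[i]);
        if (t < lo)
            lo = t;
        if (t > hi)
            hi = t;
    }
    return hi - lo;
}

/* j75 wall times copied from the table: plain 18-pre3 and 18-pre3sp. */
static const char *const pre3_j75[] = {
    "6:48.83", "6:49.43", "7:01.01", "9:33.78", "7:24.52"
};
static const char *const pre3sp_j75[] = {
    "5:42.66", "6:00.83", "5:48.00", "5:49.07", "5:58.06"
};
```

Calling spread_seconds(pre3_j75, 5) gives a spread of about 165 seconds for
plain 18-pre3, against about 18 seconds for spread_seconds(pre3sp_j75, 5).
Five runs per kernel is a small sample, so this says something about these
runs only, not about the kernels in general.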

Regards,

   Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 15:07                                       ` jogi
  2002-01-12 16:05                                         ` Andrea Arcangeli
@ 2002-01-12 16:52                                         ` yodaiken
  2002-01-12 17:00                                           ` Andrea Arcangeli
  2002-01-13 15:18                                           ` jogi
  2002-01-13 22:55                                         ` Daniel Phillips
  2 siblings, 2 replies; 351+ messages in thread
From: yodaiken @ 2002-01-12 16:52 UTC (permalink / raw)
  To: jogi
  Cc: Andrea Arcangeli, Robert Love, Alan Cox, nigel, Rob Landley,
	Andrew Morton, linux-kernel

On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> I did my usual compile testings (untar kernel archive, apply patches,
> make -j<value> ...

If I understand your test, 
you are testing different loads - you are compiling kernels that may differ
in size and makefile organization, not to mention different layout on the
file system and disk.

What happens when you do the same test, compiling one kernel under multiple
different kernels?


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 16:52                                         ` yodaiken
@ 2002-01-12 17:00                                           ` Andrea Arcangeli
  2002-01-12 19:00                                             ` Ed Sweetman
  2002-01-13 15:18                                           ` jogi
  1 sibling, 1 reply; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-12 17:00 UTC (permalink / raw)
  To: yodaiken
  Cc: jogi, Robert Love, Alan Cox, nigel, Rob Landley, Andrew Morton,
	linux-kernel

On Sat, Jan 12, 2002 at 09:52:09AM -0700, yodaiken@fsmlabs.com wrote:
> On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> > I did my usual compile testings (untar kernel archive, apply patches,
> > make -j<value> ...
> 
> If I understand your test, 
> you are testing different loads - you are compiling kernels that may differ
> in size and makefile organization, not to mention different layout on the
> file system and disk.

Ouch, I had assumed that wasn't the case.

> 
> What happens when you do the same test, compiling one kernel under multiple
> different kernels?

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 17:00                                           ` Andrea Arcangeli
@ 2002-01-12 19:00                                             ` Ed Sweetman
  2002-01-12 20:23                                               ` Andrew Morton
  2002-01-13 15:22                                               ` jogi
  0 siblings, 2 replies; 351+ messages in thread
From: Ed Sweetman @ 2002-01-12 19:00 UTC (permalink / raw)
  To: Andrea Arcangeli, yodaiken
  Cc: jogi, Robert Love, Alan Cox, nigel, Rob Landley, Andrew Morton,
	linux-kernel

> On Sat, Jan 12, 2002 at 09:52:09AM -0700, yodaiken@fsmlabs.com wrote:
> > On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> > > I did my usual compile testings (untar kernel archive, apply patches,
> > > make -j<value> ...
> >
> > If I understand your test,
> > you are testing different loads - you are compiling kernels that may differ
> > in size and makefile organization, not to mention different layout on the
> > file system and disk.

Can someone tell me why we're "testing" the preempt kernel by running
make -j on a build?  What exactly is this going to show us?  The only thing
I can think of is that it shows whether throughput is damaged by preempt
when you run a single app.  You don't get to see the effects of kernel
preemption because all the damn thing is doing is preempting itself.

If you want to test the preempt kernel you're going to need something that
can find the mean latency or "time to action" for a particular program, or
for all programs running at the time, and then run multiple programs that
you would find on various people's systems.   That is the "feel" people talk
about when they praise the preempt patch.  make -j'ing something and
testing nothing else will show you nothing important except "does
throughput get screwed by the preempt patch."   Perhaps checking the
latencies of a common program on people's systems, like mozilla or konqueror,
while doing a 'make -j N bzImage' would be a better idea.
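
One rough way to get such a number without modifying a browser is a small
userspace probe that sleeps for a fixed interval and records how late each
wakeup arrives while the build runs. The sketch below is only an
illustration under that assumption: the names elapsed_ns, latency_stats and
measure_wakeup_latency are invented here, and it measures scheduling delay
of one sleeping process, not the "time to action" of a real application.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from a to b (b must not be earlier than a). */
static int64_t elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (b->tv_nsec - a->tv_nsec);
}

struct latency_stats {
    int64_t mean_ns;   /* average lateness of a wakeup */
    int64_t max_ns;    /* worst lateness seen */
};

/* Sleep for request_ns, `iterations` times, and record how much later
 * than requested each wakeup arrived. */
static struct latency_stats measure_wakeup_latency(long request_ns,
                                                   int iterations)
{
    struct latency_stats st = { 0, 0 };
    struct timespec req = { request_ns / 1000000000L,
                            request_ns % 1000000000L };
    int64_t total = 0;

    assert(iterations > 0 && request_ns >= 0);

    for (int i = 0; i < iterations; i++) {
        struct timespec before, after;

        clock_gettime(CLOCK_MONOTONIC, &before);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &after);

        int64_t late = elapsed_ns(&before, &after) - request_ns;
        if (late < 0)      /* sleep was cut short, e.g. by a signal */
            late = 0;
        total += late;
        if (late > st.max_ns)
            st.max_ns = late;
    }
    st.mean_ns = total / iterations;
    return st;
}
```

Running measure_wakeup_latency(1000000L, 1000) once on an idle box and once
during the make -j run, under each kernel, would give a mean and a
worst-case figure to compare. The worst case is probably the more telling
of the two for "feel", since a single long stall is what a user notices.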

> Ouch, I assumed this wasn't the case indeed.
>
> >
> > What happens when you do the same test, compiling one kernel under multiple
> > different kernels?
>
> Andrea

You should _always_ use the same kernel tree at the same point each time you
rerun the test under a different kernel.  Always make clean before rebooting
to the next kernel.  Setting up the test bed should be pretty
straightforward.   Make sure the build tree is clean, then make dep it.   Reboot
to the next kernel.   Load up mozilla but nothing else (mozilla should be
modified a bit to display the time it takes to do certain things, such as
displaying drop-down menus, loading, and opening a new window; also, you
should make the homepage something on the local drive, or blank).  Start
make -j 4 bzImage, then load mozilla (whether other gnome/gtk libraries are
already loaded because gnome is running doesn't matter, as long as it's the
same each time).  Mozilla should then output the times it takes to do certain
things, and that should give you a good idea of how the preempt patch is
performing, assuming everything is running at the same priority, your memory
isn't being maxed out, and your hdd isn't eating the majority of the cpu time.

But I really think make -j'ing and only testing that, or reporting only those
numbers, is a complete waste of time if you're trying to look at the
preempt patch's performance.  I like using mozilla in this example because
it's a big bulky app that most people have (kde users possibly excluded),
where an improvement in latency or "time to action" is actually important to
people and can't be easily ignored.

Well, those are just my two cents.  I'd do it myself but I'm waiting for
hardware to replace the broken crap I have now.  But if nobody has done it
by then I'll set that up.

-formerly safemode

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:00                                             ` Ed Sweetman
@ 2002-01-12 20:23                                               ` Andrew Morton
  2002-01-12 21:02                                                 ` Erik Andersen
                                                                   ` (7 more replies)
  2002-01-13 15:22                                               ` jogi
  1 sibling, 8 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-12 20:23 UTC (permalink / raw)
  To: Ed Sweetman
  Cc: Andrea Arcangeli, yodaiken, jogi, Robert Love, Alan Cox, nigel,
	Rob Landley, linux-kernel

Ed Sweetman wrote:
> 
> If you want to test the preempt kernel you're going to need something that
> can find the mean latency or "time to action" for a particular program or
> all programs being run at the time and then run multiple programs that you
> would find on various people's systems.   That is the "feel" people talk
> about when they praise the preempt patch.

Right.  And that is precisely why I created the "mini-ll" patch.  To
give the improved "feel" in a way which is acceptable for merging into
the 2.4 kernel.

And guess what?   Nobody has tested the damn thing, so it's going
nowhere.

Here it is again:



--- linux-2.4.18-pre3/fs/buffer.c	Fri Dec 21 11:19:14 2001
+++ linux-akpm/fs/buffer.c	Sat Jan 12 12:22:29 2002
@@ -249,12 +249,19 @@ static int wait_for_buffers(kdev_t dev, 
 	struct buffer_head * next;
 	int nr;
 
-	next = lru_list[index];
 	nr = nr_buffers_type[index];
+repeat:
+	next = lru_list[index];
 	while (next && --nr >= 0) {
 		struct buffer_head *bh = next;
 		next = bh->b_next_free;
 
+		if (dev == NODEV && current->need_resched) {
+			spin_unlock(&lru_list_lock);
+			conditional_schedule();
+			spin_lock(&lru_list_lock);
+			goto repeat;
+		}
 		if (!buffer_locked(bh)) {
 			if (refile)
 				__refile_buffer(bh);
@@ -1174,8 +1181,10 @@ struct buffer_head * bread(kdev_t dev, i
 
 	bh = getblk(dev, block, size);
 	touch_buffer(bh);
-	if (buffer_uptodate(bh))
+	if (buffer_uptodate(bh)) {
+		conditional_schedule();
 		return bh;
+	}
 	ll_rw_block(READ, 1, &bh);
 	wait_on_buffer(bh);
 	if (buffer_uptodate(bh))
--- linux-2.4.18-pre3/fs/dcache.c	Fri Dec 21 11:19:14 2001
+++ linux-akpm/fs/dcache.c	Sat Jan 12 12:22:29 2002
@@ -71,7 +71,7 @@ static inline void d_free(struct dentry 
  * d_iput() operation if defined.
  * Called with dcache_lock held, drops it.
  */
-static inline void dentry_iput(struct dentry * dentry)
+static void dentry_iput(struct dentry * dentry)
 {
 	struct inode *inode = dentry->d_inode;
 	if (inode) {
@@ -84,6 +84,7 @@ static inline void dentry_iput(struct de
 			iput(inode);
 	} else
 		spin_unlock(&dcache_lock);
+	conditional_schedule();
 }
 
 /* 
--- linux-2.4.18-pre3/fs/jbd/commit.c	Fri Dec 21 11:19:14 2001
+++ linux-akpm/fs/jbd/commit.c	Sat Jan 12 12:22:29 2002
@@ -212,6 +212,16 @@ write_out_data_locked:
 				__journal_remove_journal_head(bh);
 				refile_buffer(bh);
 				__brelse(bh);
+				if (current->need_resched) {
+					if (commit_transaction->t_sync_datalist)
+						commit_transaction->t_sync_datalist =
+							next_jh;
+					if (bufs)
+						break;
+					spin_unlock(&journal_datalist_lock);
+					conditional_schedule();
+					goto write_out_data;
+				}
 			}
 		}
 		if (bufs == ARRAY_SIZE(wbuf)) {
--- linux-2.4.18-pre3/fs/proc/array.c	Thu Oct 11 09:00:01 2001
+++ linux-akpm/fs/proc/array.c	Sat Jan 12 12:22:29 2002
@@ -415,6 +415,8 @@ static inline void statm_pte_range(pmd_t
 		pte_t page = *pte;
 		struct page *ptpage;
 
+		conditional_schedule();
+
 		address += PAGE_SIZE;
 		pte++;
 		if (pte_none(page))
--- linux-2.4.18-pre3/fs/proc/generic.c	Fri Sep  7 10:53:59 2001
+++ linux-akpm/fs/proc/generic.c	Sat Jan 12 12:22:29 2002
@@ -98,7 +98,9 @@ proc_file_read(struct file * file, char 
 				retval = n;
 			break;
 		}
-		
+
+		conditional_schedule();
+
 		/* This is a hack to allow mangling of file pos independent
  		 * of actual bytes read.  Simply place the data at page,
  		 * return the bytes, and set `start' to the desired offset
--- linux-2.4.18-pre3/include/linux/condsched.h	Thu Jan  1 00:00:00 1970
+++ linux-akpm/include/linux/condsched.h	Sat Jan 12 12:22:29 2002
@@ -0,0 +1,18 @@
+#ifndef _LINUX_CONDSCHED_H
+#define _LINUX_CONDSCHED_H
+
+#ifndef __LINUX_COMPILER_H
+#include <linux/compiler.h>
+#endif
+
+#ifndef __ASSEMBLY__
+#define conditional_schedule()				\
+do {							\
+	if (unlikely(current->need_resched)) {		\
+		__set_current_state(TASK_RUNNING);	\
+		schedule();				\
+	}						\
+} while(0)
+#endif
+
+#endif
--- linux-2.4.18-pre3/include/linux/sched.h	Fri Dec 21 11:19:23 2001
+++ linux-akpm/include/linux/sched.h	Sat Jan 12 12:22:29 2002
@@ -13,6 +13,7 @@ extern unsigned long event;
 #include <linux/times.h>
 #include <linux/timex.h>
 #include <linux/rbtree.h>
+#include <linux/condsched.h>
 
 #include <asm/system.h>
 #include <asm/semaphore.h>
--- linux-2.4.18-pre3/mm/filemap.c	Thu Jan 10 13:39:50 2002
+++ linux-akpm/mm/filemap.c	Sat Jan 12 12:22:29 2002
@@ -296,10 +296,7 @@ static int truncate_list_pages(struct li
 
 			page_cache_release(page);
 
-			if (current->need_resched) {
-				__set_current_state(TASK_RUNNING);
-				schedule();
-			}
+			conditional_schedule();
 
 			spin_lock(&pagecache_lock);
 			goto restart;
@@ -609,6 +606,7 @@ void filemap_fdatasync(struct address_sp
 			UnlockPage(page);
 
 		page_cache_release(page);
+		conditional_schedule();
 		spin_lock(&pagecache_lock);
 	}
 	spin_unlock(&pagecache_lock);
@@ -1392,6 +1390,9 @@ page_ok:
 		offset &= ~PAGE_CACHE_MASK;
 
 		page_cache_release(page);
+
+		conditional_schedule();
+
 		if (ret == nr && desc->count)
 			continue;
 		break;
@@ -3025,6 +3026,8 @@ unlock:
 		SetPageReferenced(page);
 		UnlockPage(page);
 		page_cache_release(page);
+
+		conditional_schedule();
 
 		if (status < 0)
 			break;
--- linux-2.4.18-pre3/drivers/block/ll_rw_blk.c	Thu Jan 10 13:39:49 2002
+++ linux-akpm/drivers/block/ll_rw_blk.c	Sat Jan 12 12:22:29 2002
@@ -917,6 +917,7 @@ void submit_bh(int rw, struct buffer_hea
 			kstat.pgpgin += count;
 			break;
 	}
+	conditional_schedule();
 }
 
 /**
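
The wait_for_buffers() hunk above follows a lock-break pattern: when a
reschedule is pending, drop the lock, schedule, retake the lock, and restart
the scan from the list head. Because nr is read once, before the repeat:
label, every restart keeps consuming the same budget, so the loop stays
bounded. The sketch below is a single-threaded userspace toy model of that
control flow only. Everything in it (struct sim, scan_with_lock_break, the
resched_every knob) is hypothetical; the "lock" is a plain int and
"scheduling" is a counter increment, not real kernel behaviour.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;
};

struct sim {
    int lock_held;      /* stands in for lru_list_lock */
    int need_resched;   /* stands in for current->need_resched */
    int yields;         /* how many times we "scheduled" */
    int resched_every;  /* raise need_resched every N visits; 0 = never */
};

/* Walk the list with a visit budget of nr, breaking the "lock" and
 * restarting from the head whenever a reschedule is pending.
 * Returns the number of loop iterations performed (never more than nr). */
static int scan_with_lock_break(struct sim *s, struct node *head, int nr)
{
    struct node *next;
    int visited = 0;

    s->lock_held = 1;                   /* spin_lock() */
repeat:
    next = head;
    while (next && --nr >= 0) {         /* budget survives every restart */
        struct node *n = next;

        next = n->next;
        visited++;

        /* Pretend a timer tick asked for a reschedule. */
        if (s->resched_every && visited % s->resched_every == 0)
            s->need_resched = 1;

        if (s->need_resched) {
            s->lock_held = 0;           /* spin_unlock() */
            s->need_resched = 0;        /* conditional_schedule() */
            s->yields++;
            s->lock_held = 1;           /* spin_lock() */
            goto repeat;                /* list may have changed: restart */
        }
    }
    s->lock_held = 0;                   /* spin_unlock() */
    return visited;
}
```

With a ten-node list, nr = 10 and resched_every = 4, the model yields twice
and still stops after ten iterations. Unlike the real function, this toy
never removes entries from the list, so a restart revisits the same nodes;
in the kernel the restart is needed because the list can change while the
lock is dropped.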

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:23                                               ` Andrew Morton
@ 2002-01-12 21:02                                                 ` Erik Andersen
  2002-01-12 21:18                                                   ` Stephan von Krawczynski
  2002-01-12 21:16                                                 ` Stephan von Krawczynski
                                                                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 351+ messages in thread
From: Erik Andersen @ 2002-01-12 21:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

On Sat Jan 12, 2002 at 12:23:09PM -0800, Andrew Morton wrote:
> Ed Sweetman wrote:
> > 
> > If you want to test the preempt kernel you're going to need something that
> > can find the mean latency or "time to action" for a particular program or
> > all programs being run at the time and then run multiple programs that you
> > would find on various people's systems.   That is the "feel" people talk
> > about when they praise the preempt patch.
> 
> Right.  And that is precisely why I created the "mini-ll" patch.  To
> give the improved "feel" in a way which is acceptable for merging into
> the 2.4 kernel.
> 
> And guess what?   Nobody has tested the damn thing, so it's going
> nowhere.

I've tested it.  I've been running it on my box for the last
several days.  Works just great.  My box has been quite solid
with it and I've not seen anything to prevent your sending it 
to Marcelo for 2.4.18...

 -Erik

--
Erik B. Andersen             http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 21:02                                                 ` Erik Andersen
@ 2002-01-12 21:18                                                   ` Stephan von Krawczynski
  2002-01-12 23:24                                                     ` Erik Andersen
  0 siblings, 1 reply; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-12 21:18 UTC (permalink / raw)
  To: andersen; +Cc: akpm, linux-kernel

On Sat, 12 Jan 2002 14:02:13 -0700
Erik Andersen <andersen@codepoet.org> wrote:

> > And guess what?   Nobody has tested the damn thing, so it's going
> > nowhere.
> 
> I've tested it.  I've been running it on my box for the last
> several days.  Works just great.  My box has been quite solid
> with it and I've not seen anything to prevent your sending it 
> to Marcelo for 2.4.18...

Sorry for this dumb question:
What exactly is the difference from vanilla in your setup? Better
interactive feeling? Throughput?

Regards,
Stephan


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 21:18                                                   ` Stephan von Krawczynski
@ 2002-01-12 23:24                                                     ` Erik Andersen
  0 siblings, 0 replies; 351+ messages in thread
From: Erik Andersen @ 2002-01-12 23:24 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: akpm, linux-kernel

On Sat Jan 12, 2002 at 10:18:35PM +0100, Stephan von Krawczynski wrote:
> On Sat, 12 Jan 2002 14:02:13 -0700
> Erik Andersen <andersen@codepoet.org> wrote:
> 
> > > And guess what?   Nobody has tested the damn thing, so it's going
> > > nowhere.
> > 
> > I've tested it.  I've been running it on my box for the last
> > several days.  Works just great.  My box has been quite solid
> > with it and I've not seen anything to prevent your sending it 
> > to Marcelo for 2.4.18...
> 
> Sorry for this dumb question:
> What exactly is the difference from vanilla in your setup? Better
> interactive feeling? Throughput?

To be honest, not a _huge_ difference.  I've been mostly doing
development, and when I happen to have, for example,  a kernel
compile and a gcc compile going on, xmms isn't skipping for me
at all, while previously I would hear skips every so often.

 -Erik

--
Erik B. Andersen             http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:23                                               ` Andrew Morton
  2002-01-12 21:02                                                 ` Erik Andersen
@ 2002-01-12 21:16                                                 ` Stephan von Krawczynski
  2002-01-12 22:25                                                 ` Francois Romieu
                                                                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 351+ messages in thread
From: Stephan von Krawczynski @ 2002-01-12 21:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: ed.sweetman, andrea, yodaiken, jogi, rml, alan, nigel, landley,
	linux-kernel

On Sat, 12 Jan 2002 12:23:09 -0800
Andrew Morton <akpm@zip.com.au> wrote:

> Ed Sweetman wrote:
> > 
> > If you want to test the preempt kernel you're going to need something that
> > can find the mean latency or "time to action" for a particular program or
> > all programs being run at the time and then run multiple programs that you
> > would find on various people's systems.   That is the "feel" people talk
> > about when they praise the preempt patch.
> 
> Right.  And that is precisely why I created the "mini-ll" patch.  To
> give the improved "feel" in a way which is acceptable for merging into
> the 2.4 kernel.

Hm, I am not quite sure about what you expect to hear about it, but:

a) It applies cleanly to 2.4.18-pre3.
b) It compiles
c) During a load of around 150 produced by (of course :-) "make -j bzImage" and
concurrent XMMS playing while my mail-client and mozilla are open, I cannot
"feel" a really big difference in interactivity compared to the vanilla kernel.
XMMS hiccups sometimes, the mouse does kangarooing, and switching between
different X-screens and screen refresh (especially mozilla of course) are no
big hit.

This is a dual PIII-1GHz/2 GB RAM and some swap. During make no swapping is
going on.

Sorry, but I cannot see (feel) the difference in _this_ test (if this is really
a test for what you intend to do). Compile time btw makes no difference either.
Perhaps this try is rather something for ingo and the scheduler...

Regards,
Stephan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:23                                               ` Andrew Morton
  2002-01-12 21:02                                                 ` Erik Andersen
  2002-01-12 21:16                                                 ` Stephan von Krawczynski
@ 2002-01-12 22:25                                                 ` Francois Romieu
  2002-01-13  1:32                                                 ` Alan Cox
                                                                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 351+ messages in thread
From: Francois Romieu @ 2002-01-12 22:25 UTC (permalink / raw)
  To: linux-kernel

[Cc: trimmed]

Andrew Morton <akpm@zip.com.au> :
[mini-ll]
> And guess what?   Nobody has tested the damn thing, so it's going
> nowhere.

It allows me to del^W read NFS-mounted mail behind a linux router while I 
copy files locally on the router. If I don't apply mini-ll to the router, 
it's a "server foo not responding, still trying" fest. You know what
"interactivity feel" means when it happens.

If someone suspects the hardware is crap, it's a PIV motherboard with 
built-in Promise20265 and four IBM IC35L060AVER07-0 on their own channel.
Each disk has been able to behave normally during RAID1 rebuild.

Without mini-ll:
  well-chosen file I/O => no file I/O, no networking, no console, *big pain*.
With mini-ll:
  well-chosen file I/O => *only* those I/O suck (less than before btw).

-- 
Ueimor

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:23                                               ` Andrew Morton
                                                                   ` (2 preceding siblings ...)
  2002-01-12 22:25                                                 ` Francois Romieu
@ 2002-01-13  1:32                                                 ` Alan Cox
  2002-01-13  1:57                                                 ` J.A. Magallon
                                                                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-13  1:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ed Sweetman, Andrea Arcangeli, yodaiken, jogi, Robert Love,
	Alan Cox, nigel, Rob Landley, linux-kernel

> And guess what?   Nobody has tested the damn thing, so it's going
> nowhere.

I've been testing it. It works for me, it's not as good as the full one,
it seems obviously correct. What else am I supposed to say?

I'm pretty much exclusively running Andre's new IDE code too. In fact I'm
back to a page long list of applied diffs versus 2.4.18pre3, most of which
I need to feed Marcelo

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:23                                               ` Andrew Morton
                                                                   ` (3 preceding siblings ...)
  2002-01-13  1:32                                                 ` Alan Cox
@ 2002-01-13  1:57                                                 ` J.A. Magallon
  2002-01-13  8:03                                                 ` Rusty Russell
                                                                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 351+ messages in thread
From: J.A. Magallon @ 2002-01-13  1:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ed Sweetman, Andrea Arcangeli, yodaiken, jogi, Robert Love,
	Alan Cox, nigel, Rob Landley, linux-kernel


On 20020112 Andrew Morton wrote:
>Ed Sweetman wrote:
>> 
>> If you want to test the preempt kernel you're going to need something that
>> can find the mean latency or "time to action" for a particular program or
>> all programs being run at the time and then run multiple programs that you
>> would find on various peoples' systems.   That is the "feel" people talk
>> about when they praise the preempt patch.
>
>Right.  And that is precisely why I created the "mini-ll" patch.  To
>give the improved "feel" in a way which is acceptable for merging into
>the 2.4 kernel.
>
>And guess what?   Nobody has tested the damn thing, so it's going
>nowhere.
>

I have been running mini-ll on -pre3 for a while, and have just booted pre3
with full-ll. I see no marvelous difference between them, but I am not pushing
my box to its knees.
I can get numbers for you, but is there any test out there that gives them?
Something like 'under this damned test your system just delayed as much as xxx us'.
That kind of 'my xmms does not skip' does not look like a very serious measure.
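
One rough way to get a number of the kind asked for here is to request a
short sleep many times and record how late the wakeup arrives. The sketch
below is an illustration only, not a tool anyone in this thread used; the
function name worst_wakeup_delay_us and its parameters are made up for the
example, and it assumes a POSIX system with clock_gettime() and nanosleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC timestamps. */
static long long ts_diff_ns(const struct timespec *start,
                            const struct timespec *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (end->tv_nsec - start->tv_nsec);
}

/*
 * Hypothetical helper: sleep for `interval_us` microseconds `iterations`
 * times and return the worst overshoot (actual sleep minus requested
 * sleep) in microseconds. Returns -1 if the clock cannot be read.
 */
long long worst_wakeup_delay_us(long interval_us, int iterations)
{
    assert(interval_us > 0 && interval_us < 1000000 && iterations > 0);

    struct timespec req = { .tv_sec = 0, .tv_nsec = interval_us * 1000L };
    long long worst_ns = 0;

    for (int i = 0; i < iterations; i++) {
        struct timespec before, after;

        if (clock_gettime(CLOCK_MONOTONIC, &before) != 0)
            return -1;
        nanosleep(&req, NULL);  /* an interrupted sleep just counts as short */
        if (clock_gettime(CLOCK_MONOTONIC, &after) != 0)
            return -1;

        long long late_ns = ts_diff_ns(&before, &after) - (long long)req.tv_nsec;
        if (late_ns > worst_ns)
            worst_ns = late_ns;
    }
    return worst_ns / 1000;
}
```

Calling worst_wakeup_delay_us(1000, 10000) while the disk or swap load runs
in the background would give one "delayed as much as xxx us" figure per
kernel. It measures timer wakeup delay seen from userspace, which is only
a proxy for the scheduling latency discussed in this thread.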

And could you tell me if some of these patches can interfere with the results?
This is what I am running just now:
- 2.4.18-pre3
- vm fixes from aa (vm-22, vm-raend, truncate-garbage)
- ext3-0.9.17 update
- ide-20011210 (hint: plz, make it in mainline for the time of .18...)
- irqrate-A1
- interrupts-seq-file
- spinlock-cacheline + fast-pte from -aa
- scalable timers
- sensors-cvs
- bproc 3.1.5

On top of that I have run full-ll+ll_fixes (from -aa) or mini-ll+ll_fixes.

(If someone is interested, patches are at
http://giga.cps.unizar.es/~magallon/linux/ )

TIA.

-- 
J.A. Magallon                           #  Let the source be with you...        
mailto:jamagallon@able.es
Mandrake Linux release 8.2 (Cooker) for i586
Linux werewolf 2.4.18-pre3-beo #5 SMP Sun Jan 13 02:14:04 CET 2002 i686

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:23                                               ` Andrew Morton
                                                                   ` (4 preceding siblings ...)
  2002-01-13  1:57                                                 ` J.A. Magallon
@ 2002-01-13  8:03                                                 ` Rusty Russell
  2002-01-13 17:42                                                 ` jogi
       [not found]                                                 ` <3C40A6BB.1090100@pobox.com>
  7 siblings, 0 replies; 351+ messages in thread
From: Rusty Russell @ 2002-01-13  8:03 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, torvalds

On Sat, 12 Jan 2002 12:23:09 -0800
Andrew Morton <akpm@zip.com.au> wrote:
> And guess what?   Nobody has tested the damn thing, so it's going
> nowhere.

Haven't had latency problems, to be honest.  Maybe I should start playing mp3s
while I code?

1) conditional_schedule?  Hmmm... Why the __set_current_state?  I think I prefer
   an explicit "if (need_schedule()) schedule()", with
   #define need_schedule() unlikely(current->need_resched)

2) I hate condsched.h: Use sched.h please!

3) Why this:
   > +#ifndef __LINUX_COMPILER_H
   > +#include <linux/compiler.h>
   > +#endif

Other than that, I like this patch.  Linus?
Rusty.
-- 
  Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
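
Point 1 above suggests an explicit "if (need_schedule()) schedule()" check.
The following is a userspace sketch of that pattern only, not code from the
mini-ll patch: struct task, fake_schedule() and process_items() are
hypothetical stand-ins for the kernel's current task, schedule() and a
long-running kernel loop, and unlikely() is rebuilt here from the GCC/Clang
__builtin_expect builtin.

```c
#include <assert.h>

/* Branch hint, same idea as the kernel's unlikely(); needs GCC or Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical stand-in for the kernel's `current` task. */
struct task {
    volatile int need_resched;  /* set when another task should run */
    int yields;                 /* how often the loop gave up the CPU */
};

#define need_schedule(t) unlikely((t)->need_resched)

/* Stand-in for schedule(): record the yield and clear the request. */
static void fake_schedule(struct task *t)
{
    t->yields++;
    t->need_resched = 0;
}

/*
 * A long-running loop with one explicit scheduling point per item.
 * `request_every` simulates an interrupt asking for a reschedule every
 * N items (0 = never). Returns the number of items processed.
 */
int process_items(struct task *t, int nitems, int request_every)
{
    assert(nitems >= 0 && request_every >= 0);
    int done = 0;

    for (int i = 1; i <= nitems; i++) {
        done++;                              /* the actual work */
        if (request_every && i % request_every == 0)
            t->need_resched = 1;             /* simulated wakeup */
        if (need_schedule(t))                /* the scheduling point */
            fake_schedule(t);
    }
    return done;
}
```

The check costs one predicted-not-taken branch per iteration when nobody is
waiting, which is the argument for spelling it out at the call site.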

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 20:23                                               ` Andrew Morton
                                                                   ` (5 preceding siblings ...)
  2002-01-13  8:03                                                 ` Rusty Russell
@ 2002-01-13 17:42                                                 ` jogi
  2002-01-13 18:22                                                   ` Robert Love
       [not found]                                                 ` <3C40A6BB.1090100@pobox.com>
  7 siblings, 1 reply; 351+ messages in thread
From: jogi @ 2002-01-13 17:42 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ed Sweetman, Andrea Arcangeli, yodaiken, Robert Love, Alan Cox,
	nigel, Rob Landley, linux-kernel

On Sat, Jan 12, 2002 at 12:23:09PM -0800, Andrew Morton wrote:
> Ed Sweetman wrote:
> > 
> > If you want to test the preempt kernel you're going to need something that
> > can find the mean latency or "time to action" for a particular program or
> > all programs being run at the time and then run multiple programs that you
> > would find on various peoples' systems.   That is the "feel" people talk
> > about when they praise the preempt patch.
> 
> Right.  And that is precisely why I created the "mini-ll" patch.  To
> give the improved "feel" in a way which is acceptable for merging into
> the 2.4 kernel.
> 
> And guess what?   Nobody has tested the damn thing, so it's going
> nowhere.

Ok, as promised, here are the results:

        13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
		                                                                                          
j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%
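
To compare columns of the table above on more than eyeballing, the five
runs per kernel can be averaged. This is a small sketch under the
assumption that the first value in each cell is elapsed wall-clock time in
minutes:seconds; parse_elapsed() and mean_elapsed() are names invented for
the example, and only two of the j75 columns are copied in.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "minutes:seconds" such as "6:24.79" into seconds; -1.0 on error. */
double parse_elapsed(const char *text)
{
    int minutes;
    double seconds;

    if (sscanf(text, "%d:%lf", &minutes, &seconds) != 2
        || minutes < 0 || seconds < 0.0)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Mean of `count` elapsed-time strings in seconds; -1.0 if one fails to parse. */
double mean_elapsed(const char *const runs[], int count)
{
    assert(count > 0);
    double total = 0.0;

    for (int i = 0; i < count; i++) {
        double s = parse_elapsed(runs[i]);
        if (s < 0.0)
            return -1.0;
        total += s;
    }
    return total / count;
}

/* j75 rows copied from the 18-pre3sp and 18-pre3minill columns above. */
static const char *const j75_pre3sp[] = {
    "5:42.66", "6:00.83", "5:48.00", "5:49.07", "5:58.06"
};
static const char *const j75_minill[] = {
    "7:07.56", "7:17.15", "6:47.48", "6:34.02", "7:01.39"
};
```

mean_elapsed(j75_pre3sp, 5) comes to about 351.7 s (5:51.7) and
mean_elapsed(j75_minill, 5) to about 417.5 s (6:57.5). With only five runs
and this much spread between runs, the means are a summary, not proof of
anything.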

Regards,

   Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 17:42                                                 ` jogi
@ 2002-01-13 18:22                                                   ` Robert Love
  2002-01-13 19:32                                                     ` Alan Cox
                                                                       ` (5 more replies)
  0 siblings, 6 replies; 351+ messages in thread
From: Robert Love @ 2002-01-13 18:22 UTC (permalink / raw)
  To: jogi
  Cc: Andrew Morton, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox,
	nigel, Rob Landley, linux-kernel

On Sun, 2002-01-13 at 12:42, jogi@planetzork.ping.de wrote:

>         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
> j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
> j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
> j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
> j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
> j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
> 		                                                                                          
> j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
> j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
> j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
> j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
> j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%

Again, preempt seems to reign supreme.  Where is all the information
correlating preempt is inferior?  To be fair, however, we should bench a
mini-ll+s test.

But I stand by my original point that none of this matters all too
much.  A preemptive kernel will allow for future latency reduction
_without_ using explicit scheduling points everywhere there is a
problem.  This means we can tackle the problem and not provide a million
bandaids.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:22                                                   ` Robert Love
@ 2002-01-13 19:32                                                     ` Alan Cox
  2002-01-14 11:41                                                       ` Andrea Arcangeli
  2002-01-13 19:35                                                     ` J Sloan
                                                                       ` (4 subsequent siblings)
  5 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-13 19:32 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, Andrew Morton, Ed Sweetman, Andrea Arcangeli, yodaiken,
	Alan Cox, nigel, Rob Landley, linux-kernel

> Again, preempt seems to reign supreme.  Where is all the information
> correlating preempt is inferior?  To be fair, however, we should bench a
> mini-ll+s test.

How about some actual latency numbers ?

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 19:32                                                     ` Alan Cox
@ 2002-01-14 11:41                                                       ` Andrea Arcangeli
  0 siblings, 0 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-14 11:41 UTC (permalink / raw)
  To: Alan Cox
  Cc: Robert Love, jogi, Andrew Morton, Ed Sweetman, yodaiken, nigel,
	Rob Landley, linux-kernel

On Sun, Jan 13, 2002 at 07:32:18PM +0000, Alan Cox wrote:
> > Again, preempt seems to reign supreme.  Where is all the information
> > correlating preempt is inferior?  To be fair, however, we should bench a
> > mini-ll+s test.
> 
> How about some actual latency numbers ?

with a huge rescheduling rate (huge swapout/swapin load) and the
scheduler walking over 100 tasks at each schedule it is insane to
deduce anything from those numbers (-preempt was using O(1)
scheduler!!!!). So please don't make any assumption by just looking at
those numbers.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:22                                                   ` Robert Love
  2002-01-13 19:32                                                     ` Alan Cox
@ 2002-01-13 19:35                                                     ` J Sloan
  2002-01-14  6:49                                                       ` Daniel Phillips
  2002-01-13 19:46                                                     ` Andrew Morton
                                                                       ` (3 subsequent siblings)
  5 siblings, 1 reply; 351+ messages in thread
From: J Sloan @ 2002-01-13 19:35 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, Andrew Morton, Ed Sweetman, Andrea Arcangeli, yodaiken,
	Alan Cox, nigel, Rob Landley, linux-kernel

The problem here is that when people report
that the low latency patch works better for them
than the preempt patch, they aren't talking about
benchmarking the time to compile a kernel, they
are talking about interactive feel and smoothness.

You're speaking to a peripheral issue.

I've no agenda other than wanting to see linux
as an attractive option for the multimedia and
gaming crowds - and in my experience, the low
latency patches simply give a much smoother
feel and a more pleasant experience. Kernel
compilation time is the farthest thing from my
mind when e.g. playing Q3A!

I'd be happy to check out the preempt patch
again and see if anything's changed, if the
problem of tux+preempt oopsing has been
dealt with -

Regards,

jjs

Robert Love wrote:

>On Sun, 2002-01-13 at 12:42, jogi@planetzork.ping.de wrote:
>
>>        13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
>>j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
>>j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
>>j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
>>j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
>>j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
>>		                                                                                          
>>j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
>>j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
>>j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
>>j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
>>j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%
>>
>
>Again, preempt seems to reign supreme.  Where is all the information
>correlating preempt is inferior?  To be fair, however, we should bench a
>mini-ll+s test.
>
>But I stand by my original point that none of this matters all too
>much.  A preemptive kernel will allow for future latency reduction
>_without_ using explicit scheduling points everywhere there is a
>problem.  This means we can tackle the problem and not provide a million
>bandaids.
>
>	Robert Love
>



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 19:35                                                     ` J Sloan
@ 2002-01-14  6:49                                                       ` Daniel Phillips
  2002-01-15  1:31                                                         ` J Sloan
  0 siblings, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14  6:49 UTC (permalink / raw)
  To: J Sloan, Robert Love
  Cc: jogi, Andrew Morton, Ed Sweetman, Andrea Arcangeli, yodaiken,
	Alan Cox, nigel, Rob Landley, linux-kernel

On January 13, 2002 08:35 pm, J Sloan wrote:
> The problem here is that when people report
> that the low latency patch works better for them
> than the preempt patch, they aren't talking about
> benchmarking the time to compile a kernel, they
> are talking about interactive feel and smoothness.

Nobody is claiming the low latency patch works better than 
-preempt+lock_break, only that low latency can equal -preempt+lock_break, 
which is a claim I'm skeptical of, but oh well.

> I've no agenda other than wanting to see linux
> as an attractive option for the multimedia and
> gaming crowds - and in my experience, the low
> latency patches simply give a much smoother
> feel and a more pleasant experience. Kernel
> compilation time is the farthest thing from my
> mind when e.g. playing Q3A!

You need to read the thread *way* more closely ;-)

> I'd be happy to check out the preempt patch
> again and see if anything's changed, if the
> problem of tux+preempt oopsing has been
> dealt with -

Right, useful.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  6:49                                                       ` Daniel Phillips
@ 2002-01-15  1:31                                                         ` J Sloan
  0 siblings, 0 replies; 351+ messages in thread
From: J Sloan @ 2002-01-15  1:31 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: J Sloan, Robert Love, jogi, Andrew Morton, Ed Sweetman,
	Andrea Arcangeli, yodaiken, Alan Cox, nigel, Rob Landley,
	linux-kernel

Daniel Phillips wrote:

>On January 13, 2002 08:35 pm, J Sloan wrote:
>
>>The problem here is that when people report
>>that the low latency patch works better for them
>>than the preempt patch, they aren't talking about
>>benchmarking the time to compile a kernel, they
>>are talking about interactive feel and smoothness.
>>
>
>Nobody is claiming the low latency patch works better than 
>-preempt+lock_break, only that low latency can equal -preempt+lock_break, 
>which is a claim I'm skeptical of, but oh well.
>
AFAICT Alan Cox  et al are saying that low-latency
gives better latency than -preempt, but that if lock-break
is added to -preempt, the results are basically the same.

IOW lock-break + preempt =~ low-latency as far as the
latency question is concerned.

>>I've no agenda other than wanting to see linux
>>as an attractive option for the multimedia and
>>gaming crowds - and in my experience, the low
>>latency patches simply give a much smoother
>>feel and a more pleasant experience. Kernel
>>compilation time is the farthest thing from my
>>mind when e.g. playing Q3A!
>>
>
>You need to read the thread *way* more closely ;-)
>
Admittedly my observations have been more from
an "end-user" point of view, because at the end
of the day, what I experience while using Linux as
a multimedia/gaming platform is worth more than a
barrel of benchmarks - and while kernel compilation
time is of interest, it is just _one_  benchmark in the
greater scheme of things. (not to mention that that
benchmark result could probably be matched in a
non -preempt kernel via /proc tuning)


>>I'd be happy to check out the preempt patch
>>again and see if anything's changed, if the
>>problem of tux+preempt oopsing has been
>>dealt with -
>>
>
>Right, useful.
>
See my previous reply, or the archives -

Regards,

jjs




^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:22                                                   ` Robert Love
  2002-01-13 19:32                                                     ` Alan Cox
  2002-01-13 19:35                                                     ` J Sloan
@ 2002-01-13 19:46                                                     ` Andrew Morton
  2002-01-13 20:04                                                       ` Robert Love
  2002-01-13 20:17                                                     ` jogi
                                                                       ` (2 subsequent siblings)
  5 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-13 19:46 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel

Robert Love wrote:
> 
> Again, preempt seems to reign supreme.  Where is all the information
> correlating preempt is inferior?  To be fair, however, we should bench a
> mini-ll+s test.

I can't say that I have ever seen any significant change in throughput
of anything with any of this stuff.

Benchmarks are well and good, but until we have a solid explanation for
the throughput changes which people are seeing, it's risky to claim
that there is a general benefit.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 19:46                                                     ` Andrew Morton
@ 2002-01-13 20:04                                                       ` Robert Love
  2002-01-13 20:30                                                         ` Andrew Morton
  2002-01-14 11:56                                                         ` Andrea Arcangeli
  0 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-13 20:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: jogi, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel

On Sun, 2002-01-13 at 14:46, Andrew Morton wrote:

> I can't say that I have ever seen any significant change in throughput
> of anything with any of this stuff.

I can send you some numbers.  It is typically 5-10% throughput increase
under load.  Obviously this work won't help a single task on a single
user system.  But things like (ack!) dbench 16 show a marked
improvement.

> Benchmarks are well and good, but until we have a solid explanation for
> the throughput changes which people are seeing, it's risky to claim
> that there is a general benefit.

I have an explanation.  We can schedule quicker off a woken task.  When
an event occurs that allows an I/O-blocked task to run, its time-to-run
is shorter.  Same event/response improvement that helps interactivity.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 20:04                                                       ` Robert Love
@ 2002-01-13 20:30                                                         ` Andrew Morton
  2002-01-14 11:56                                                         ` Andrea Arcangeli
  1 sibling, 0 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-13 20:30 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel

Robert Love wrote:
> 
> ...
> > Benchmarks are well and good, but until we have a solid explanation for
> > the throughput changes which people are seeing, it's risky to claim
> > that there is a general benefit.
> 
> I have an explanation.  We can schedule quicker off a woken task.  When
> an event occurs that allows an I/O-blocked task to run, its time-to-run
> is shorter.  Same event/response improvement that helps interactivity.
> 

Sounds more like handwaving than an explanation :)

The way to speed up dbench is to allow the processes which want to delete
files to actually do that.  This reduces the total amount of IO which the
test performs.  Another way is to increase usable memory (or at least to
delay the onset of balance_dirty going synchronous).  Possibly it's something
to do with letting kswapd schedule earlier.  Or bdflush.

In the swapstorm case, it's again not clear to me.  Perhaps it's due to prompter
kswapd activity, perhaps due somehow to improved request merging.

As I say, without a precise and detailed understanding of the mechanisms
I wouldn't be prepared to claim more than "speeds up dbench and swapstorms
for some reason".

(I'd _like_ to know the complete reason - that way we can stare at it
and maybe make things even better.  Doing a binary search through the
various chunks of the mini-ll patch would be instructive).
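
The binary search suggested above could be organised along these lines.
This is a sketch of the search logic only: it assumes the patch has been
split into an ordered list of chunks, that a kernel with the first k chunks
applied can be built and benchmarked, and that the speedup, once present,
does not vanish as more chunks are added. The names speedup_test,
first_chunk_with_speedup() and pretend_culprit() are invented for the
example.

```c
#include <assert.h>

/* Returns nonzero if a kernel with chunks 1..k applied shows the speedup. */
typedef int (*speedup_test)(int k, void *ctx);

/*
 * Smallest k in 1..nchunks such that applying chunks 1..k shows the
 * speedup. Assumes zero chunks shows none and all chunks show it.
 * Needs about log2(nchunks) build-and-benchmark cycles.
 */
int first_chunk_with_speedup(int nchunks, speedup_test shows_speedup, void *ctx)
{
    assert(nchunks > 0 && shows_speedup != 0);
    int lo = 0;        /* known: chunks 1..lo do not show the speedup */
    int hi = nchunks;  /* assumed: chunks 1..hi do show it */

    while (hi - lo > 1) {
        int mid = lo + (hi - lo) / 2;
        if (shows_speedup(mid, ctx))
            hi = mid;
        else
            lo = mid;
    }
    return hi;
}

/* Dry-run predicate: pretend chunk number *(int *)ctx is the one that matters. */
int pretend_culprit(int k, void *ctx)
{
    return k >= *(const int *)ctx;
}
```

In a real run the predicate would wrap "apply chunks, build, boot, run
dbench, compare against baseline", so noisy benchmark results are the weak
point of the method, not the search itself.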

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 20:04                                                       ` Robert Love
  2002-01-13 20:30                                                         ` Andrew Morton
@ 2002-01-14 11:56                                                         ` Andrea Arcangeli
  2002-01-14 13:38                                                           ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-14 11:56 UTC (permalink / raw)
  To: Robert Love
  Cc: Andrew Morton, jogi, Ed Sweetman, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel

On Sun, Jan 13, 2002 at 03:04:35PM -0500, Robert Love wrote:
> user system.  But things like (ack!) dbench 16 show a marked
> improvement.

please try again on top of -aa, and I have to specify this: benchmarked
in a way that can be trusted and compared, so we can make some use of
this information.  This means with -18pre2aa2 alone and only -preempt on
top of -18pre2aa2.

NOTE: I'd be glad to say "preempt rules", "go preempt", "preempt is
cool" like you as soon as I have some proof it makes _THE_ difference
and that it is worth the mess on SMP (per-cpu, RCU locking, etc...), not
to mention the other architectures. But at the moment there's only a
number of people running xmms on mainline with the broken scheduling
points, and those numbers that cannot be compared in any sane way.

I repeat, I'm not against preempt, I just want to get some real world
proof and measurement, and at the moment I think preempt isn't worth it.
But I may change my mind if you give us _any_ real world proof that such
a low mean latency, of the order of 10/100 usec, matters to get most of
the cpu cycles out of the cpu during thrashing (as it could be possible
to speculate from the broken benchmark posted in this thread), and that
there's no real regression with the additional branches in the
spin_unlock in 100% system load. (And of course, only for anything above
2.5. I still think there are more interesting optimizations to do
rather than requiring everybody to spend lots of time fixing drivers,
auditing, fixing smp, rcu locking etc... but ok, if it is an obviously
good thing [aka no real regression and only benefits long term] it would
be ok to do it early as well.)

I'm not particularly worried about the preempt lock around the per-cpu
stuff, that's cacheline local, and it could go into the schedule_data
like I did for the rcu per-cpu variables, so they're at zero cacheline
cost (the RCU_poll patch now costs only 1 instruction per schedule and
zero memory overhead [an incl instruction, precisely]).

> > Benchmarks are well and good, but until we have a solid explanation for
> > the throughput changes which people are seeing, it's risky to claim
> > that there is a general benefit.
> 
> I have an explanation.  We can schedule quicker off a woken task.  When
> an event occurs that allows an I/O-blocked task to run, its time-to-run
> is shorter.  Same event/response improvement that helps interactivity.

That's a nice speculation out of a broken comparison. It may really be
the case, but there's no way to be sure: before that sum of usec you should
also sum the seconds spent walking the tasklist in the non-O(1)
scheduler.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:56                                                         ` Andrea Arcangeli
@ 2002-01-14 13:38                                                           ` Robert Love
  2002-01-14 15:45                                                             ` Andrea Arcangeli
  0 siblings, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-14 13:38 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Andrew Morton, jogi, Ed Sweetman, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel

On Mon, 2002-01-14 at 06:56, Andrea Arcangeli wrote:
> On Sun, Jan 13, 2002 at 03:04:35PM -0500, Robert Love wrote:
> > user system.  But things like (ack!) dbench 16 show a marked
> > improvement.
> 
> please try again on top of -aa, and I've to specify this : benchmarked
> in a way that can be trusted and compared, so we can make some use of
> this information.  This means with -18pre2aa2 alone and only -preempt on
> top of -18pre2aa2.

I realize the test isn't directly comparing what we want, so I asked him
for ll+O(1) benchmark, which he gave.  Another set would be to do
preempt and ll alone.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 13:38                                                           ` Robert Love
@ 2002-01-14 15:45                                                             ` Andrea Arcangeli
  0 siblings, 0 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-14 15:45 UTC (permalink / raw)
  To: Robert Love
  Cc: Andrew Morton, jogi, Ed Sweetman, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel

On Mon, Jan 14, 2002 at 08:38:54AM -0500, Robert Love wrote:
> On Mon, 2002-01-14 at 06:56, Andrea Arcangeli wrote:
> > On Sun, Jan 13, 2002 at 03:04:35PM -0500, Robert Love wrote:
> > > user system.  But things like (ack!) dbench 16 show a marked
> > > improvement.
> > 
> > please try again on top of -aa, and I've to specify this : benchmarked
> > in a way that can be trusted and compared, so we can make some use of
> > this information.  This means with -18pre2aa2 alone and only -preempt on
> > top of -18pre2aa2.
> 
> I realize the test isn't directly comparing what we want, so I asked him
> for ll+O(1) benchmark, which he gave.  Another set would be to do
      ^^ actually mini-ll

right (I was still in the middle of the backlog of my emails, so I
didn't know he had just produced the mini-ll+O(1) numbers). The
mini-ll+O(1) results show that -preempt is still a bit faster (as
expected, not much faster anymore). The reason it is faster is probably
really the sum of the few usec of latency of userspace cpu cycles that
you save. However, the small difference in numbers in this pathological
case (-j1 obviously cannot take advantage of the few usec of reduced
latency) still makes me think it isn't worth the pain and the
complexity, or at least somebody should also prove that it doesn't
visibly drop performance in a 100% cpu bound _system_ (not user) time
load (ala pagecache_lock collision testcase with sendfile etc..), in
general with a single thread in the system.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:22                                                   ` Robert Love
                                                                       ` (2 preceding siblings ...)
  2002-01-13 19:46                                                     ` Andrew Morton
@ 2002-01-13 20:17                                                     ` jogi
       [not found]                                                       ` <Pine.LNX.4.33.0201131533530.14774-100000@coffee.psychology.mcmaster.ca>
  2002-01-13 23:52                                                     ` yodaiken
  2002-01-14 11:39                                                     ` Andrea Arcangeli
  5 siblings, 1 reply; 351+ messages in thread
From: jogi @ 2002-01-13 20:17 UTC (permalink / raw)
  To: Robert Love
  Cc: Andrew Morton, Ed Sweetman, Andrea Arcangeli, yodaiken, Alan Cox,
	nigel, Rob Landley, linux-kernel

On Sun, Jan 13, 2002 at 01:22:57PM -0500, Robert Love wrote:
> On Sun, 2002-01-13 at 12:42, jogi@planetzork.ping.de wrote:
> 
> >         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
> > j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
> > j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
> > j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
> > j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
> > j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
> > 		                                                                                          
> > j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
> > j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
> > j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
> > j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
> > j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%
> 
> Again, preempt seems to reign supreme.  Where is all the information
> correlating preempt is inferior?  To be fair, however, we should bench a
> mini-ll+s test.

Your wish is granted. Here are the results for mini-ll + scheduler:

j100:   8:26.54
j100:   7:50.35
j100:   6:49.59
j100:   6:39.30
j100:   6:39.70
j75:    6:01.02
j75:    6:12.16
j75:    6:04.60
j75:    6:24.58
j75:    6:28.00

Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>

^ permalink raw reply	[flat|nested] 351+ messages in thread

[parent not found: <Pine.LNX.4.33.0201131533530.14774-100000@coffee.psychology.mcmaster.ca>]

* Re: [2.4.17/18pre] VM and swap - it's really unusable
       [not found]                                                       ` <Pine.LNX.4.33.0201131533530.14774-100000@coffee.psychology.mcmaster.ca>
@ 2002-01-13 22:14                                                         ` jogi
  0 siblings, 0 replies; 351+ messages in thread
From: jogi @ 2002-01-13 22:14 UTC (permalink / raw)
  To: linux-kernel

On Sun, Jan 13, 2002 at 03:35:08PM -0500, Mark Hahn wrote:
> > > >         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
> > > > j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
> > > > j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
> > > > j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
> > > > j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
> > > > j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
> > > > 		                                                                                          
> > > > j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
> > > > j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
> > > > j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
> > > > j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
> > > > j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%
> > > 
> > > Again, preempt seems to reign supreme.  Where is all the information
> > > correlating preempt is inferior?  To be fair, however, we should bench a
> > > mini-ll+s test.
> > 
> > Your wish is granted. Here are the results for mini-ll + scheduler:
> > 
> > j100:   8:26.54
> > j100:   7:50.35
> > j100:   6:49.59
> > j100:   6:39.30
> > j100:   6:39.70
> > j75:    6:01.02
> > j75:    6:12.16
> > j75:    6:04.60
> > j75:    6:24.58
> > j75:    6:28.00
> 
> how about a real benchmark like -j2 or so (is this a dual machine?)

Why does everybody think this is not a *real* benchmark? Back in the
good old days at the university, the systems I tried to compile
applications on were *always* overloaded. Would it make a difference to
you if I ran

for a in lots_of.srpm; do
  rpm --rebuild $a &
done

Basically this gives the same result: lots of compile jobs running in
parallel. All *I* am doing is taking it to an extreme, since running
the compile with make -j2 does not make a *noticeable* difference at all.
And as I said previously, my idea was to put the system under high memory
pressure and test the different VMs (AA and RvR) ...

Furthermore, some people think this combination (sched+preempt) is only
good for latency (if at all). All I can say is that this works *very*
well for me latency-wise. Since I don't know how to measure latency
exactly, I tried running my compile script (make -j50) alongside my
usual desktop + xmms. Result: xmms was *not* skipping, although the
system was ~70MB into swap and the load was >50. Changing workspaces
worked immediately every time. I was able to get xmms to skip for
a short while by starting galeon, StarOffice and gimp with ~10 pictures
all at the same time, but once all the applications had come up xmms was
no longer skipping, with the system ~130MB into swap. This is the best
result for me so far, but I have to admit that I did not test mini-ll
+sched in this area (I can test this on Wednesday at the earliest, sorry).

Since it has been a while since I posted my system specs, here they are:

- Athlon 1.2GHz (single proc)
- 256 MB
- IDE drive (quantum)

> also, I've often found the user/sys/elapsed components to all be interesting;
> how do they look?  (I'd expect preempt to have more sys, for instance.)

        13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3smini  
        (sys) (user)
j100:   30.78 297.07    32.40 294.38        *           27.74 296.02    27.55 292.95    28.30 297.67
j100:   30.92 297.11    33.04 295.15        *           29.14 296.25    26.88 292.77    28.13 296.44
j100:   29.58 297.90    34.01 294.16        *           27.56 295.76    26.25 293.79    27.96 296.47
j100:   30.62 297.13    32.00 294.30        *           28.47 296.46    27.64 293.42    27.50 297.47
j100:   30.48 299.43    32.28 295.42        *           27.77 296.44    27.53 292.10    27.23 297.24

As expected, the system and user times are almost identical across kernels.
The "fastest" compile results are always the ones where the job gets the most
%cpu. So I guess it would be more interesting to see how much cpu time e.g.
kswapd gets.

I will probably have to extend my script to run vmstat in the background ...
Would this provide useful data?


Regards,

   Jogi


-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:22                                                   ` Robert Love
                                                                       ` (3 preceding siblings ...)
  2002-01-13 20:17                                                     ` jogi
@ 2002-01-13 23:52                                                     ` yodaiken
  2002-01-14 11:39                                                     ` Andrea Arcangeli
  5 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-13 23:52 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, Andrew Morton, Ed Sweetman, Andrea Arcangeli, yodaiken,
	Alan Cox, nigel, Rob Landley, linux-kernel


Well to start with:
	1) Maybe I should be more precise: The latency measurements I've seen
	posted all favor Morton and not preempt. Since the claimed purpose
	of both patches is improving latency, isn't that more interesting
	than measurements of a kernel compile?
	
	2) In these measurements
	the tree is different each time, so the measurement doesn't
	seem very stable. It's not exactly a secret that file layout 
	can have an effect on performance.

	3) There is no measure of preempt without Ingo's scheduler

	4) this is what I want to see:
		Run the periodic SCHED_FIFO task I've posted multiple times
		Let's see worst case error
		Let's see effect on the background kernel compile

	All the rest is just so much talk about "interactive feel". I saw
	exactly the same claims from the people who wanted kernel graphics.
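
The periodic task referred to in point 4 is not included in this message.
Purely as an illustration, a test of that kind could look like the sketch
below. It is an assumption of this example, not the posted code: it uses the
POSIX clock_nanosleep() interface with absolute deadlines, and the name
worst_case_lateness_ns() is made up here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute timespec by ns nanoseconds (ns < 1 s assumed). */
static void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    if (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec += 1;
    }
}

/* Signed difference a - b in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *a,
                                const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
           + (a->tv_nsec - b->tv_nsec);
}

/*
 * Hypothetical helper: wake up once per period for `iterations` periods
 * and return the worst observed wakeup lateness in nanoseconds, or -1 if
 * a clock call fails.
 */
int64_t worst_case_lateness_ns(long period_ns, int iterations)
{
    struct sched_param sp = { .sched_priority = 1 };
    struct timespec next, now;
    int64_t worst = 0;

    assert(period_ns > 0 && period_ns < NSEC_PER_SEC);

    /* Needs root privileges; if it fails we carry on as SCHED_OTHER. */
    (void)sched_setscheduler(0, SCHED_FIFO, &sp);

    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -1;

    for (int i = 0; i < iterations; i++) {
        int rc;

        timespec_add_ns(&next, period_ns);
        /* Absolute deadline, so lateness does not accumulate as drift. */
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0 || clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -1;

        int64_t late = timespec_diff_ns(&now, &next);
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

Calling worst_case_lateness_ns(1000000, 10000) while a kernel compile runs in
the background would report the worst wakeup error for a 1 ms period over ten
seconds; the compile time with and without the task gives the effect on the
background load.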




On Sun, Jan 13, 2002 at 01:22:57PM -0500, Robert Love wrote:
> On Sun, 2002-01-13 at 12:42, jogi@planetzork.ping.de wrote:
> 
> >         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
> > j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
> > j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
> > j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
> > j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
> > j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
> > 		                                                                                          
> > j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
> > j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
> > j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
> > j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
> > j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%
> 
> Again, preempt seems to reign supreme.  Where is all the information
> correlating preempt is inferior?  To be fair, however, we should bench a
> mini-ll+s test.
> 
> But I stand by my original point that none of this matters all too
> much.  A preemptive kernel will allow for future latency reduction
> _without_ using explicit scheduling points everywhere there is a
> problem.  This means we can tackle the problem and not provide a million
> bandaids.
> 
> 	Robert Love

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:22                                                   ` Robert Love
                                                                       ` (4 preceding siblings ...)
  2002-01-13 23:52                                                     ` yodaiken
@ 2002-01-14 11:39                                                     ` Andrea Arcangeli
  5 siblings, 0 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-14 11:39 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, Andrew Morton, Ed Sweetman, yodaiken, Alan Cox, nigel,
	Rob Landley, linux-kernel

On Sun, Jan 13, 2002 at 01:22:57PM -0500, Robert Love wrote:
> On Sun, 2002-01-13 at 12:42, jogi@planetzork.ping.de wrote:
> 
> >         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp       18-pre3minill  
> > j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%        *
> > j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%        *
> > j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%        *
> > j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%        *
> > j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%        *
> > 		                                                                                          
> > j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%    7:07.56  77%
> > j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%    7:17.15  74%
> > j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%    6:47.48  80%
> > j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%    6:34.02  83%
> > j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%    7:01.39  77%
> 
> Again, preempt seems to reign supreme.  Where is all the information

those comparisons are totally flawed. There's nothing to compare in
there. 

minill lacks the O(1) scheduler, and -aa has a faster vm etc... there's
absolutely nothing to compare in the above numbers, since all the
variables change at the same time.

I'm amazed I have to say this, but in short:

1) to compare minill with preempt, apply each patch to 18-pre3 as the
   only patch applied (no O(1) in the way of preempt!!!!)
2) to compare -aa with preempt, apply -preempt on top of -aa and see
   what difference it makes

If you don't follow exactly those simple rules you will change a huge
number of variables at the same time, and it will again be impossible to
make any comparison or deduction from the numbers.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

[parent not found: <3C40A6BB.1090100@pobox.com>]

* Re: [2.4.17/18pre] VM and swap - it's really unusable
       [not found]                                                 ` <3C40A6BB.1090100@pobox.com>
@ 2002-01-14 11:34                                                   ` Andrea Arcangeli
  2002-01-14 20:27                                                     ` Andrew Morton
  0 siblings, 1 reply; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-14 11:34 UTC (permalink / raw)
  To: J Sloan
  Cc: Andrew Morton, Ed Sweetman, yodaiken, jogi, Robert Love, Alan Cox,
	nigel, Rob Landley, linux-kernel

On Sat, Jan 12, 2002 at 01:12:27PM -0800, J Sloan wrote:
> 
>    Ah - if it stands a chance of going into 2.4,
>    I'll test the heck out of it!
>    I'll give it the Q3A test, the RtCW test, the
>    xine/xmms/dbench tests, and more - glad
>    to be of service.
>    jjs
>    Andrew Morton wrote:
> 
> Ed Sweetman wrote:
> 
> If you want to test the preempt kernel you're going to need something that
> can find the mean latency or "time to action" for a particular program or
> all programs being run at the time and then run multiple programs that you
> would find on various peoples' systems.   That is the "feel" people talk
> about when they praise the preempt patch.
> 
> Right.  And that is precisely why I created the "mini-ll" patch.  To
> give the improved "feel" in a way which is acceptable for merging into
> the 2.4 kernel.
> And guess what?   Nobody has tested the damn thing, so it's going
> nowhere.
> Here it is again:
> --- linux-2.4.18-pre3/fs/buffer.c       Fri Dec 21 11:19:14 2001
> +++ linux-akpm/fs/buffer.c      Sat Jan 12 12:22:29 2002
> @@ -249,12 +249,19 @@ static int wait_for_buffers(kdev_t dev, 
>         struct buffer_head * next;
>         int nr;
>  
> -       next = lru_list[index];
>         nr = nr_buffers_type[index];
> +repeat:
> +       next = lru_list[index];
>         while (next && --nr >= 0) {
>                 struct buffer_head *bh = next;
>                 next = bh->b_next_free;
>  
> +               if (dev == NODEV && current->need_resched) {
> +                       spin_unlock(&lru_list_lock);
> +                       conditional_schedule();
> +                       spin_lock(&lru_list_lock);
> +                       goto repeat;
> +               }
>                 if (!buffer_locked(bh)) {

this introduces the possibility of looping indefinitely, which is why I
rejected it when I merged the other mini-ll points into -aa. If you
want to do anything like that, at the very least you should roll the head
of the list as well, or something like that.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:34                                                   ` Andrea Arcangeli
@ 2002-01-14 20:27                                                     ` Andrew Morton
  0 siblings, 0 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-14 20:27 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: J Sloan, Ed Sweetman, yodaiken, jogi, Robert Love, Alan Cox,
	nigel, Rob Landley, linux-kernel

Andrea Arcangeli wrote:
> 
> > --- linux-2.4.18-pre3/fs/buffer.c       Fri Dec 21 11:19:14 2001
> > +++ linux-akpm/fs/buffer.c      Sat Jan 12 12:22:29 2002
> > @@ -249,12 +249,19 @@ static int wait_for_buffers(kdev_t dev,
> >         struct buffer_head * next;
> >         int nr;
> >
> > -       next = lru_list[index];
> >         nr = nr_buffers_type[index];
> > +repeat:
> > +       next = lru_list[index];
> >         while (next && --nr >= 0) {
> >                 struct buffer_head *bh = next;
> >                 next = bh->b_next_free;
> >
> > +               if (dev == NODEV && current->need_resched) {
> > +                       spin_unlock(&lru_list_lock);
> > +                       conditional_schedule();
> > +                       spin_lock(&lru_list_lock);
> > +                       goto repeat;
> > +               }
> >                 if (!buffer_locked(bh)) {
> 
> this introduces possibility of looping indefinitely, this is why I
> rejected it while I merged the mini-ll other points into -aa, if you
> want to do anything like that at the very least you should roll the head
> of the list as well or something like that.

I ended up deciding that the `NODEV' check here avoids livelocks.
Unless, of course, the scheduling pressure is so high that we can't
even run a few statements.  In which case the interrupt load will be so 
high that the machine stops anyway.  Possibly it needs to check `refile'
as well.

A technique I frequently use in the full-ll patch is to drop out and
reschedule only after we've executed the loop (say) 16 times.  This
assures that forward progress is made.  There's a test mode in the full
ll patch - in this mode, it *always* assumes that need_resched is true.
If the patch runs OK in this mode without livelocking, we know that it
can't livelock.
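
The forward-progress argument can be illustrated with a toy user-space model.
This is only a sketch under stated assumptions, not code from either patch:
scan_list(), struct scan_stats and MIN_BATCH are made-up names, every buffer
visited is assumed to be refiled off the list (the kupdate scenario), and
need_resched is held constant so that passing 1 plays the role of the
"always true" test mode.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_BATCH 16   /* iterations guaranteed between lock breaks */

struct scan_stats {
    size_t refiled;      /* buffers moved off the list */
    size_t lock_breaks;  /* times the lock was dropped to reschedule */
};

/*
 * Toy model of a list scan that drops its lock to reschedule and then
 * restarts from the list head. `pending` is the number of buffers still
 * on the list; `need_resched` models current->need_resched.
 */
struct scan_stats scan_list(size_t pending, int need_resched)
{
    struct scan_stats st = { 0, 0 };
    size_t since_break = 0;

    while (pending > 0) {
        if (need_resched && since_break >= MIN_BATCH) {
            /* unlock, schedule, relock, restart from the list head */
            st.lock_breaks++;
            since_break = 0;
            continue;
        }
        pending--;        /* refile one buffer off the list */
        st.refiled++;
        since_break++;
    }
    return st;
}
```

Because at least MIN_BATCH buffers leave the list between two lock breaks, the
scan terminates even with need_resched stuck at 1. Dropping the since_break
test from the condition makes the same call spin forever, which is the
indefinite loop being discussed.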

Anyway, I'll revisit this.  It is a "must fix".  wait_for_buffers() is
possibly the worst cause of latency in the kernel.  The usual scenario
is where kupdate has written 10,000 buffers and then sleeps.  Next time
it wakes, it has 10,000 clean, unlocked buffers to move from BUF_LOCKED
onto BUF_CLEAN.  It does this with lru_list_lock held.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 19:00                                             ` Ed Sweetman
  2002-01-12 20:23                                               ` Andrew Morton
@ 2002-01-13 15:22                                               ` jogi
  2002-01-14 23:05                                                 ` george anzinger
  1 sibling, 1 reply; 351+ messages in thread
From: jogi @ 2002-01-13 15:22 UTC (permalink / raw)
  To: Ed Sweetman
  Cc: Andrea Arcangeli, yodaiken, Robert Love, Alan Cox, nigel,
	Rob Landley, Andrew Morton, linux-kernel

On Sat, Jan 12, 2002 at 02:00:17PM -0500, Ed Sweetman wrote:
> 
> 
> > On Sat, Jan 12, 2002 at 09:52:09AM -0700, yodaiken@fsmlabs.com wrote:
> > > On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> > > > I did my usual compile testings (untar kernel archive, apply patches,
> > > > make -j<value> ...
> > >
> > > If I understand your test,
> > > you are testing different loads - you are compiling kernels that may
> differ
> > > in size and makefile organization, not to mention different layout on
> the
> > > file system and disk.
> 
> Can someone tell me why we're "testing" the preempt kernel by running
> make -j on a build?  What exactly is this going to show us?  The only thing
> i can think of is showing us that throughput is not damaged when you want to
> run single apps by using preempt.  You dont get to see the effects of the
> kernel preemption because all the damn thing is doing is preempting itself.
> 
> If you want to test the preempt kernel you're going to need something that
> can find the mean latency or "time to action" for a particular program or
> all programs being run at the time and then run multiple programs that you
> would find on various peoples' systems.   That is the "feel" people talk
> about when they praise the preempt patch.  make -j'ing something and not
> testing anything else but that will show you nothing important except "does
> throughput get screwed by the preempt patch."   Perhaps checking the
> latencies on a common program on people's systems like mozilla or konqueror
> while doing a 'make -j N bzImage'  would be a better idea.

That's the second test I normally run: just running xmms while
doing the kernel compile. I only wanted to check whether the system slows
down because of preemption, but instead it compiled the kernel even
faster :-) So far I have not been able to test the latency, though, and
it is very difficult to "measure" skipping of xmms ...

> > Ouch, I assumed this wasn't the case indeed.

Sorry for not answering immediately, but I am compiling the same kernel
source with the same .config, and everything else I could think of is
the same too! I even do a 'rm -rf linux' after every run and untar the
same sources *every* time.

Regards,

   Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:22                                               ` jogi
@ 2002-01-14 23:05                                                 ` george anzinger
  0 siblings, 0 replies; 351+ messages in thread
From: george anzinger @ 2002-01-14 23:05 UTC (permalink / raw)
  To: jogi
  Cc: Ed Sweetman, Andrea Arcangeli, yodaiken, Robert Love, Alan Cox,
	nigel, Rob Landley, Andrew Morton, linux-kernel

jogi@planetzork.ping.de wrote:
> 
> On Sat, Jan 12, 2002 at 02:00:17PM -0500, Ed Sweetman wrote:
> >
> >
> > > On Sat, Jan 12, 2002 at 09:52:09AM -0700, yodaiken@fsmlabs.com wrote:
> > > > On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> > > > > I did my usual compile testings (untar kernel archive, apply patches,
> > > > > make -j<value> ...
> > > >
> > > > If I understand your test,
> > > > you are testing different loads - you are compiling kernels that may
> > differ
> > > > in size and makefile organization, not to mention different layout on
> > the
> > > > file system and disk.
> >
> > Can someone tell me why we're "testing" the preempt kernel by running
> > make -j on a build?  What exactly is this going to show us?  The only thing
> > i can think of is showing us that throughput is not damaged when you want to
> > run single apps by using preempt.  You dont get to see the effects of the
> > kernel preemption because all the damn thing is doing is preempting itself.
> >
> > If you want to test the preempt kernel you're going to need something that
> > can find the mean latency or "time to action" for a particular program or
> > all programs being run at the time and then run multiple programs that you
> > would find on various peoples' systems.   That is the "feel" people talk
> > about when they praise the preempt patch.  make -j'ing something and not
> > testing anything else but that will show you nothing important except "does
> > throughput get screwed by the preempt patch."   Perhaps checking the
> > latencies on a common program on people's systems like mozilla or konqueror
> > while doing a 'make -j N bzImage'  would be a better idea.
> 
> That's the second test I am normally running. Just running xmms while
> doing the kernel compile. I just wanted to check if the system slows
> down because of preemption but instead it compiled the kernel even
> faster :-) 

This sort of thing is nice to hear, but, it does show up a problem in
the non-preempt kernel.  That preemption improves compile performance
implies that the kernel is not doing the right thing during a normal
compile and that preemption, to some extent, corrects the problem.  But
preemption adds the overhead of additional context switches.  It would
be nice to know where the time is coming from.  I.e. let's assume that
the actual compile takes about the same amount of execution time with or
without preemption.  Then for the preemptable kernel to do the job
faster something else must go up, idle time perhaps.  If this is the
case, then there is some place in the kernel that is wasting cpu time
and that is preemptable and the preemptable patch is moving this idle
time to the idle process.  

Whatever the reason, while I do want to promote preemption, I think we
should look at this issue and, at the very least, explain it.
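
One way to start on that explanation is to split each run's wall-clock time
into CPU time and everything else. The sketch below is illustrative only:
struct run and off_cpu_seconds() are made-up names, and the sample figures in
the note that follows are taken from the elapsed and sys/user tables jogi
posted elsewhere in this thread.

```c
#include <assert.h>

struct run {
    double elapsed;  /* wall-clock seconds */
    double user;     /* user CPU seconds */
    double sys;      /* system CPU seconds */
};

/* Wall-clock time not accounted to the job as user or system CPU time. */
double off_cpu_seconds(struct run r)
{
    assert(r.elapsed >= 0.0);
    return r.elapsed - (r.user + r.sys);
}
```

For the first j100 row, 18-pre3s (6:39.55 elapsed, 296.02 user, 27.74 sys)
leaves about 75.8 s unaccounted for, while 18-pre3sp (6:24.79 elapsed, 292.95
user, 27.55 sys) leaves about 64.3 s. On those figures the CPU time is nearly
the same and most of the difference is in time the job spent off the CPU.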

>           But so far I was not able to test the latency and furthermore
> it is very difficult to "measure" skipping of xmms ...
> 
> > > Ouch, I assumed this wasn't the case indeed.
> 
> Sorry for not answering immediately but I am compiling the same kernel
> source with the same .config and everything I could think of being the
> same! I even do a 'rm -rf linux' after every run and untar the same
> sources *every* time.
> 
> Regards,
> 
>    Jogi
> 
> --
> 
> Well, yeah ... I suppose there's no point in getting greedy, is there?
> 
>     << Calvin & Hobbes >>

-- 
George           george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 16:52                                         ` yodaiken
  2002-01-12 17:00                                           ` Andrea Arcangeli
@ 2002-01-13 15:18                                           ` jogi
  2002-01-13 17:51                                             ` yodaiken
  2002-01-13 18:11                                             ` Robert Love
  1 sibling, 2 replies; 351+ messages in thread
From: jogi @ 2002-01-13 15:18 UTC (permalink / raw)
  To: yodaiken
  Cc: Andrea Arcangeli, Robert Love, Alan Cox, nigel, Rob Landley,
	Andrew Morton, linux-kernel

On Sat, Jan 12, 2002 at 09:52:09AM -0700, yodaiken@fsmlabs.com wrote:
> On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> > I did my usual compile testings (untar kernel archive, apply patches,
> > make -j<value> ...
> 
> If I understand your test, 
> you are testing different loads - you are compiling kernels that may differ
> in size and makefile organization, not to mention different layout on the
> file system and disk.

No, I use a script which is run in single user mode after a reboot. So
there are only a few processes running when I start the script (see
attachment) and the jobs should start from the same environment.

> What happens when you do the same test, compiling one kernel under multiple
> different kernels?

That is exactly what I am doing. I even try to my best to have the exact
same starting environment ...

Regards,

   Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:18                                           ` jogi
@ 2002-01-13 17:51                                             ` yodaiken
  2002-01-13 18:10                                               ` jogi
  2002-01-13 18:11                                             ` Robert Love
  1 sibling, 1 reply; 351+ messages in thread
From: yodaiken @ 2002-01-13 17:51 UTC (permalink / raw)
  To: jogi
  Cc: yodaiken, Andrea Arcangeli, Robert Love, Alan Cox, nigel,
	Rob Landley, Andrew Morton, linux-kernel

On Sun, Jan 13, 2002 at 04:18:23PM +0100, jogi@planetzork.ping.de wrote:
> On Sat, Jan 12, 2002 at 09:52:09AM -0700, yodaiken@fsmlabs.com wrote:
> > On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> > > I did my usual compile testings (untar kernel archive, apply patches,
> > > make -j<value> ...
> > 
> > If I understand your test, 
> > you are testing different loads - you are compiling kernels that may differ
> > in size and makefile organization, not to mention different layout on the
> > file system and disk.
> 
> No, I use a script which is run in single user mode after a reboot. So
> there are only a few processes running when I start the script (see
> attachment) and the jobs should start from the same environment.

But your description makes it sound like you do
	untar kernel X
	apply patches Y
	make -j  Tree

I'm sorry if I'm getting you wrong, but each of these steps is
variable.
Even if X and Y are the same each time, "Tree" is different.

The test should be
	reboot
		N times
		make clean
		time make -j Tree

Am I misunderstanding your test?


> 
> > What happens when you do the same test, compiling one kernel under multiple
> > different kernels?
> 
> That is exactly what I am doing. I even try to my best to have the exact
> same starting environment ...
> 
> Regards,
> 
>    Jogi
> 
> -- 
> 
> Well, yeah ... I suppose there's no point in getting greedy, is there?
> 
>     << Calvin & Hobbes >>

-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 17:51                                             ` yodaiken
@ 2002-01-13 18:10                                               ` jogi
  0 siblings, 0 replies; 351+ messages in thread
From: jogi @ 2002-01-13 18:10 UTC (permalink / raw)
  To: yodaiken
  Cc: Andrea Arcangeli, Robert Love, Alan Cox, nigel, Rob Landley,
	Andrew Morton, linux-kernel

On Sun, Jan 13, 2002 at 10:51:04AM -0700, yodaiken@fsmlabs.com wrote:
> On Sun, Jan 13, 2002 at 04:18:23PM +0100, jogi@planetzork.ping.de wrote:
> > On Sat, Jan 12, 2002 at 09:52:09AM -0700, yodaiken@fsmlabs.com wrote:
> > > On Sat, Jan 12, 2002 at 04:07:14PM +0100, jogi@planetzork.ping.de wrote:
> > > > I did my usual compile testings (untar kernel archive, apply patches,
> > > > make -j<value> ...
> > > 
> > > If I understand your test, 
> > > you are testing different loads - you are compiling kernels that may differ
> > > in size and makefile organization, not to mention different layout on the
> > > file system and disk.
> > 
> > No, I use a script which is run in single user mode after a reboot. So
> > there are only a few processes running when I start the script (see
> > attachment) and the jobs should start from the same environment.
> 
> But your description makes it sound like you do
> 	untar kernel X
> 	apply patches Y
> 	make -j  Tree
> 
> I'm sorry if I'm getting you wrong, but each of these steps is
> variable.
> Even if X and Y are the same each time, "Tree" is different.

X and Y are the same. But I don't really get why this is still
"different" ... If you think this could be because of fs
fragmentation, then I will extend my test. I think I have a spare
partition somewhere which I can format each time before untarring the
kernel sources and so on. But why can I reproduce the results then?
OK, not exactly, but the results do get close ...
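
How close the repeated runs get can be put into numbers with a mean and a
max-min spread per kernel. The helper below is a minimal sketch for that;
summarize() and struct summary are names invented for this example, and the
inputs are elapsed times converted to seconds.

```c
#include <assert.h>
#include <stddef.h>

struct summary {
    double mean;   /* arithmetic mean of the samples */
    double range;  /* max - min, a crude measure of run-to-run spread */
};

/* Summarize n elapsed times given in seconds. */
struct summary summarize(const double *secs, size_t n)
{
    struct summary s = { 0.0, 0.0 };
    double min, max, sum = 0.0;

    assert(secs != NULL && n > 0);
    min = max = secs[0];
    for (size_t i = 0; i < n; i++) {
        sum += secs[i];
        if (secs[i] < min)
            min = secs[i];
        if (secs[i] > max)
            max = secs[i];
    }
    s.mean = sum / (double)n;
    s.range = max - min;
    return s;
}
```

Applied to the five j100 rows of the table in this thread, 18-pre3s comes out
at a mean of about 421 s with a spread of about 100 s, and 18-pre3sp at a mean
of about 374 s with a spread of about 21 s.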

Furthermore I am timing not only the make -j<value> but also the
complete untar and applying of patches. So basically I am timing the
following:

tar xvf linux-x.y.z.tar
patch -p0 < some_patches
cd linux; cp ../config-x.y.z .config
make oldconfig dep
make -j $PAR bzImage modules

and afterwards

cd .. ; rm -rf linux

and start again. It's just the same as doing 'rpm --rebuild' with

MAKE=make -j $PAR

> The test should be
> 	reboot
> 		N times
> 		make clean
> 		time make -j Tree
> 
> Am I misunderstanding your test?

No, but I don't understand why this should make any difference. I do not
propose my way of testing as *the* benchmark. It's just a benchmark of
something which I do most of the time on my system (compiling), taken to
an extreme ...

Kind regards,

   Jogi

-- 

Well, yeah ... I suppose there's no point in getting greedy, is there?

    << Calvin & Hobbes >>


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 15:18                                           ` jogi
  2002-01-13 17:51                                             ` yodaiken
@ 2002-01-13 18:11                                             ` Robert Love
  2002-01-14 11:32                                               ` Andrea Arcangeli
  1 sibling, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-13 18:11 UTC (permalink / raw)
  To: jogi
  Cc: yodaiken, Andrea Arcangeli, Alan Cox, nigel, Rob Landley,
	Andrew Morton, linux-kernel

On Sun, 2002-01-13 at 10:18, jogi@planetzork.ping.de wrote:

> No, I use a script which is run in single user mode after a reboot. So
> there are only a few processes running when I start the script (see
> attachment) and the jobs should start from the same environment.
> 
> > What happens when you do the same test, compiling one kernel under multiple
> > different kernels?
> 
> That is exactly what I am doing. I even try my best to have the exact
> same starting environment ...

So there you go, his testing is accurate.  Now we have results showing that
preempt works and is best, and it is still refuted.  Everyone is running
around with these "ll is best" or "preempt sucks throughput" claims, and they
are not true.  Further, with preempt we can improve things cleanly, and I
don't think that necessarily implies priority inversion problems.

	Robert Love



* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 18:11                                             ` Robert Love
@ 2002-01-14 11:32                                               ` Andrea Arcangeli
  0 siblings, 0 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-14 11:32 UTC (permalink / raw)
  To: Robert Love
  Cc: jogi, yodaiken, Alan Cox, nigel, Rob Landley, Andrew Morton,
	linux-kernel

On Sun, Jan 13, 2002 at 01:11:21PM -0500, Robert Love wrote:
> On Sun, 2002-01-13 at 10:18, jogi@planetzork.ping.de wrote:
> 
> > No, I use a script which is run in single user mode after a reboot. So
> > there are only a few processes running when I start the script (see
> > attachment) and the jobs should start from the same environment.
> > 
> > > What happens when you do the same test, compiling one kernel under multiple
> > > different kernels?
> > 
> > That is exactly what I am doing. I even try my best to have the exact
> > same starting environment ...
> 
> So there you go, his testing is accurate.  Now we have results that
> preempt works and is best and it is still refuted.  Everyone is running
> around with these "ll is best" or "preempt sucks throughput" and that is

Assuming the report can be trusted, this is not a test where we can
measure a throughput regression: it is a VM intensive test and nothing
else.  Swap load.

In short, run top and check that you have 100% system load and that the
cpus are never idle or in userspace, and _then_ it will most certainly be
an interesting benchmark for -preempt throughput.
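
The same check can be scripted instead of watching top. What follows is
only a minimal sketch in C, assuming the aggregate "cpu" line of /proc/stat
begins with the user, nice, system and idle tick counters; the struct and
function names are made up for illustration and do not come from any patch
in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Cumulative tick counters taken from the aggregate "cpu" line. */
struct cpu_ticks {
    unsigned long user, nice, system, idle;
};

/* Parse a line of the form "cpu  user nice system idle ...".
 * Returns 0 on success, -1 if this is not the aggregate cpu line
 * (per-cpu lines such as "cpu0 ..." are rejected on purpose). */
int parse_cpu_line(const char *line, struct cpu_ticks *t)
{
    if (strncmp(line, "cpu ", 4) != 0)
        return -1;
    int n = sscanf(line + 4, "%lu %lu %lu %lu",
                   &t->user, &t->nice, &t->system, &t->idle);
    return n == 4 ? 0 : -1;
}

/* Percentage of the ticks elapsed between two samples that were
 * spent in system mode. Returns 0.0 if no ticks elapsed. */
double system_share(const struct cpu_ticks *a, const struct cpu_ticks *b)
{
    unsigned long total = (b->user - a->user) + (b->nice - a->nice) +
                          (b->system - a->system) + (b->idle - a->idle);
    if (total == 0)
        return 0.0;
    return 100.0 * (double)(b->system - a->system) / (double)total;
}
```

Sampling the line before and after the run and passing both samples to
system_share() gives the fraction of time spent in the kernel; a value well
below 100 would mean the run is not the kernel-bound case described above.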

Furthermore the whole comparison is flawed: -O(1) alone is as broken as
mainline w.r.t. the scheduling points, and -aa has the right scheduling
points but not the -O(1) scheduler, so there's no way to compare those
numbers at all. If you want to make any real comparison you should apply
-preempt on top of -aa.

Assuming it is really -preempt that makes the numbers more repeatable
(and not the fact that -O(1) alone has the broken rescheduling points),
this still doesn't prove anything yet. The lower numbers are most
certainly because the tasks taking the page faults get rescheduled
faster; -aa didn't do more cpu work, it apparently just had the cpus more
idle than -preempt. This may be the indication of an important scheduling
point missing somewhere. If somebody could run a lowlatency measurement
during a swap intensive load and send me the offending IP, that could
probably be addressed with a one liner.

Andrea


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-12 15:07                                       ` jogi
  2002-01-12 16:05                                         ` Andrea Arcangeli
  2002-01-12 16:52                                         ` yodaiken
@ 2002-01-13 22:55                                         ` Daniel Phillips
  2002-01-13 22:56                                           ` Robert Love
  2002-01-14 11:18                                           ` Marian Jancar
  2 siblings, 2 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-13 22:55 UTC (permalink / raw)
  To: jogi, Andrea Arcangeli
  Cc: Robert Love, Alan Cox, nigel, Rob Landley, Andrew Morton,
	linux-kernel

On January 12, 2002 04:07 pm, jogi@planetzork.ping.de wrote:
> Hello Andrea,
> 
> I did my usual compile testings (untar kernel archive, apply patches,
> make -j<value> ...
> 
> Here are some results (Wall time + Percent cpu) for each of the consecutive five runs:
> 
>         13-pre5aa1      18-pre2aa2      18-pre3         18-pre3s        18-pre3sp
> j100:   6:59.79  78%    7:07.62  76%        *           6:39.55  81%    6:24.79  83%
> j100:   7:03.39  77%    8:10.04  66%        *           8:07.13  66%    6:21.23  83%
> j100:   6:40.40  81%    7:43.15  70%        *           6:37.46  81%    6:03.68  87%
> j100:   7:45.12  70%    7:11.59  75%        *           7:14.46  74%    6:06.98  87%
> j100:   6:56.71  79%    7:36.12  71%        *           6:26.59  83%    6:11.30  86%
> 
> j75:    6:22.33  85%    6:42.50  81%    6:48.83  80%    6:01.61  89%    5:42.66  93%
> j75:    6:41.47  81%    7:19.79  74%    6:49.43  79%    5:59.82  89%    6:00.83  88%
> j75:    6:10.32  88%    6:44.98  80%    7:01.01  77%    6:02.99  88%    5:48.00  91%
> j75:    6:28.55  84%    6:44.21  80%    9:33.78  57%    6:19.83  85%    5:49.07  91%
> j75:    6:17.15  86%    6:46.58  80%    7:24.52  73%    6:23.50  84%    5:58.06  88%
> 
> * build incomplete (OOM killer killed several cc1 ... )
> 
> So far 2.4.13-pre5aa1 had been the king of the block in compile times.
> But this has changed. Now the (by far) fastest kernel is 2.4.18-pre3
> + Ingo's scheduler patch (s) + preemptive patch (p). I did not test
> the preemptive patch alone so far since I don't know if the one I have
> applies cleanly against -pre3 without Ingo's patch. I used the
> following patches:
> 
> s: sched-O1-2.4.17-H6.patch
> p: preempt-kernel-rml-2.4.18-pre3-ingo-1.patch
> 
> I hope this info is useful to someone.

I'd like to add my 'me too' to those who have requested a re-run of this test, building
the *identical* kernel tree every time, starting from the same initial conditions.
Maybe that's what you did, but it's not clear from your post.
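
To judge how repeatable the runs are, it helps to look at the mean and the
spread of the five runs per kernel rather than at single times. A minimal
sketch in C, using only the j100 wall times quoted above converted to
seconds; the function and array names are hypothetical, and max minus min
is just one simple choice of spread measure.

```c
#include <stddef.h>

/* j100 wall times from the quoted table, converted to seconds. */
const double j100_13pre5aa1[5] = { 419.79, 423.39, 400.40, 465.12, 416.71 };
const double j100_18pre3sp[5]  = { 384.79, 381.23, 363.68, 366.98, 371.30 };

/* Arithmetic mean of n samples (n must be at least 1). */
double mean(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Spread of n samples, taken as slowest run minus fastest run. */
double spread(const double *x, size_t n)
{
    double lo = x[0], hi = x[0];
    for (size_t i = 1; i < n; i++) {
        if (x[i] < lo)
            lo = x[i];
        if (x[i] > hi)
            hi = x[i];
    }
    return hi - lo;
}
```

For these two columns the means come out at roughly 425 s and 374 s. With
only five runs per kernel, a spread of the same order as the difference
between two means would be a reason to repeat the test before drawing
conclusions.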

Thanks,

Daniel


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 22:55                                         ` Daniel Phillips
@ 2002-01-13 22:56                                           ` Robert Love
  2002-01-14  0:11                                             ` yodaiken
  2002-01-14 11:18                                           ` Marian Jancar
  1 sibling, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-13 22:56 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: jogi, Andrea Arcangeli, Alan Cox, nigel, Rob Landley,
	Andrew Morton, linux-kernel

On Sun, 2002-01-13 at 17:55, Daniel Phillips wrote:

> I'd like to add my 'me too' to those who have requested a re-run of this test, building
> the *identical* kernel tree every time, starting from the same initial conditions.
> Maybe that's what you did, but it's not clear from your post.

He later said he did in fact build the same tree, from the same initial
condition, in single user mode, etc etc ... sounded like good testing
methodology to me.

I later asked for a test of Ingo's sched with ll (to compare to Ingo's
sched with preempt).  In this test, like the others, preempt gives the
best times.

	Robert Love



* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 22:56                                           ` Robert Love
@ 2002-01-14  0:11                                             ` yodaiken
  0 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-14  0:11 UTC (permalink / raw)
  To: Robert Love
  Cc: Daniel Phillips, jogi, Andrea Arcangeli, Alan Cox, nigel,
	Rob Landley, Andrew Morton, linux-kernel

On Sun, Jan 13, 2002 at 05:56:25PM -0500, Robert Love wrote:
> On Sun, 2002-01-13 at 17:55, Daniel Phillips wrote:
> 
> > I'd like to add my 'me too' to those who have requested a re-run of this test, building
> > the *identical* kernel tree every time, starting from the same initial conditions.
> > Maybe that's what you did, but it's not clear from your post.
> 
> He later said he did in fact build the same tree, from the same initial
> condition, in single user mode, etc etc ... sounded like good testing
> methodology to me.

Really? You think that 
		unpack a tar archive
		make

is a repeatable benchmark?


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-13 22:55                                         ` Daniel Phillips
  2002-01-13 22:56                                           ` Robert Love
@ 2002-01-14 11:18                                           ` Marian Jancar
  2002-01-14 14:16                                             ` yodaiken
  1 sibling, 1 reply; 351+ messages in thread
From: Marian Jancar @ 2002-01-14 11:18 UTC (permalink / raw)
  To: linux-kernel

Daniel Phillips wrote:

>I'd like to add my 'me too' to those who have requested a re-run of this test, building
>the *identical* kernel tree every time, starting from the same initial conditions.
>Maybe that's what you did, but it's not clear from your post.
>

It's obvious it's the same source code under different kernels. It would
make no sense to do otherwise, and it would require more effort than just
booting another image and running the same script.

Marian




* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 11:18                                           ` Marian Jancar
@ 2002-01-14 14:16                                             ` yodaiken
  0 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-14 14:16 UTC (permalink / raw)
  To: Marian Jancar; +Cc: linux-kernel

On Mon, Jan 14, 2002 at 12:18:57PM +0100, Marian Jancar wrote:
> Daniel Phillips wrote:
> 
> >I'd like to add my 'me too' to those who have requested a re-run of this test, building
> >the *identical* kernel tree every time, starting from the same initial conditions.
> >Maybe that's what you did, but it's not clear from your post.
> >
> 
> It's obvious it's the same source code under different kernels. It would
> make no sense to do otherwise, and it would require more effort than just
> booting another image and running the same script.

It's a different tree each time. Same contents. Different 
tree.



-- 
---------------------------------------------------------
Victor Yodaiken 
Finite State Machine Labs: The RTLinux Company.
 www.fsmlabs.com  www.rtlinux.com



* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10 19:01                             ` Alan Cox
  2002-01-11  2:47                               ` Nigel Gamble
@ 2002-01-14  2:46                               ` Pavel Machek
  1 sibling, 0 replies; 351+ messages in thread
From: Pavel Machek @ 2002-01-14  2:46 UTC (permalink / raw)
  To: Alan Cox; +Cc: Rob Landley, Andrew Morton, linux-kernel

Hi!

> > exhausted...)  What sound output device DOESN'T have this much cache?  (You 
> > mentioned USB speakers in your diary at one point, which seemed to be like 
> > those old "parallel port cable plus a few resistors equals sound output" 
> > hacks...)
> 
> Umm no, USB audio is rather good. USB sends isochronous, time guaranteed
> sample streams down the USB bus, to the speakers where the D to A is clear
> of the machine proper.

Actually usb speakers are *very* sensitive to latency, and when they drop
out, they drop out for half a second...
									Pavel

-- 
Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt,
details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html.



* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  0:10                 ` Alan Cox
  2002-01-09  0:29                   ` John Alvord
  2002-01-09  5:08                   ` Andrew Morton
@ 2002-01-10  9:59                   ` Ken Brownfield
  2002-01-10 11:04                     ` Alan Cox
  2 siblings, 1 reply; 351+ messages in thread
From: Ken Brownfield @ 2002-01-10  9:59 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

On Wed, Jan 09, 2002 at 12:10:38AM +0000, Alan Cox wrote:
| That is generally not true. Pre-emption is used in user space to prevent
| applications doing very stupid things. Pre-emption in a trusted environment
| can often be most efficient if done by the programs themselves.
| Userspace is not a trusted environment

That's true, but at some point in the future I think the work involved
in making sure all new additional kernel code and all new intra-kernel
interactions are "tuned" becomes larger than going preemptive all the
way down.

Apple had its arguments for cooperative, along the same lines as what
you've mentioned I believe.  And while I agree that the kernel is a much
_more_ trusted environment, I think the possibilities easily remain for
abuse given that there are A) more and more people contributing kernel
code every day, and B) countless unspeakably evil modules out there.

And the preempt tunability that has been mentioned sounds like it would
go a long way.

| Andrew's patches give you 1 ms worst case latency for normal situations; that
| is below human perception, and below scheduling granularity. In other words,
| without the efficiency loss and the debugging problems you can place the
| latency far enough below other effects that it isn't worth attacking any more.

It sounds like the LL patches are easier and less prone to locking
issues with a lot of the benefit.  But I can't help but feel that it's
not using the right tool for the job.  I think the end result of
stabilizing a preemptive kernel (in 2.5?) is worth the price, IMHO.
-- 
Ken.
brownfld@irridia.com


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-10  9:59                   ` Ken Brownfield
@ 2002-01-10 11:04                     ` Alan Cox
  0 siblings, 0 replies; 351+ messages in thread
From: Alan Cox @ 2002-01-10 11:04 UTC (permalink / raw)
  To: Ken Brownfield; +Cc: Alan Cox, linux-kernel

> That's true, but at some point in the future I think the work involved
> in making sure all new additional kernel code and all new intra-kernel
> interactions are "tuned" becomes larger than going preemptive all the
> way down.

It makes no difference to the kernel: the work is the same in all cases,
because you cannot pre-empt while holding a lock. Therefore you have to do
all the lock analysis anyway.


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 23:02             ` Luigi Genoni
  2002-01-08 23:32               ` Ken Brownfield
@ 2002-01-09  0:13               ` Dieter Nützel
  2002-01-09  6:26               ` Daniel Phillips
  2 siblings, 0 replies; 351+ messages in thread
From: Dieter Nützel @ 2002-01-09  0:13 UTC (permalink / raw)
  To: Luigi Genoni, Daniel Phillips
  Cc: Andrea Arcangeli, Anton Blanchard, Marcelo Tosatti, Rik van Riel,
	Linux Kernel List, Andrew Morton, Robert Love, George Anzinger

On Wednesday, 9. January 2002 00:02, Luigi Genoni wrote:
> On Tue, 8 Jan 2002, Daniel Phillips wrote:
> > On January 8, 2002 04:29 pm, Andrea Arcangeli wrote:
[-]
> > > I also don't want to devaluate the preemptive kernel approch (the mean
> > > latency it can reach is lower than the one of the lowlat kernel,
> > > however I personally care only about worst case latency and this is why
> > > I don't feel the need of -preempt),
> >
> > This is exactly the case that -preempt handles well.  On the other hand,
> > trying to show that scheduling hacks satisfy any given latency bound is
> > equivalent to solving the halting problem.
> >
> > I thought you had done some real time work?
> >
> > > but I just wanted to make clear that the
> > > idea that is floating around that preemptive kernel is all goodness is
> > > very far from reality, you get very low mean latency but at a price.
> >
> > A price lots of people are willing to pay
>
> Probably sometimes they are not making a good business. In the reality
> preempt is good in many scenarios, as I said, and I agree that for
> desktops, and dedicated servers where just one application runs, and
> probably the CPU is idle the most of the time,

OK, good. You are pretty much on the same line as I am.

Should we start to differentiate not only between UP and SMP systems but
also between desktops and (big) servers?
I remember one saying: "Think, this patch is worthwhile only for ~0.05% of
the Linux users..." (He meant the multi SMP system users.)

Almost 99.95% of Linux users are running desktops and I am somewhat tired
of saying, "sorry, Linux is under development..."
Look at the imprint of the famous German c't magazine (they are not even known
as Linux bashers...;-). It shows little penguins falling like dominoes
(starting with 2.4.17).

Let me rephrase it:
I appreciate all your great work and I know "only" some (little) internals of
it, but we should do some interactivity improvements for the 2.4 kernel, too.
I know what Andrew's (lowlatency) patch and Robert's (George Anzinger's)
preempt patch are worth. In short, the system (bigger desktop) flies.

The holy grail would be a combination of preempt+lock-break plus lowlatency
and Ingo's O(1) scheduler.

My main focus is 3D graphics, not the kernel, and I use KDE (yes, a little
luxury:-) 'cause KDE is C++ and most visualization systems are C and later
C++.

Without the above patches even my 1 GHz Athlon II, 640 MB, feels sluggish.
But I don't forget to think about throughput, which is also useful for
"heavy" compiler runs...

> indeed users have a feeling
> of speed. Please consider that on heavily loaded servers, with 40 or more
> users, some are running gcc, others g77, others g++ compilations, someone
> runs pine or mutt or kmail, and netscape, and mozilla, and emacs (someone
> from xterm, kde or gnome), and so on.
> You can also have 4/8 CPUs but they are not infinite ;) (but I am
> mainly thinking of dual Athlon systems).
> There is a lot of memory and disk I/O.
> This is not a strange scenario on the interactive servers used at SNS.
> Here preempt has too high a price

That's why preempt is a compile time option, btw.

> > By the way, have you measured the cost of -preempt in practice?
>
> Yes, I did a lot of tests, and with the current preempt patch I was
> definitely seeing too big a performance loss.

Have you tried with stock 2.4.17 or with additional patches?
2.4.17-rc2aa2 (10_vm-21)?

The latter makes a big difference in throughput for me (with and without
preempt).

I am preparing some numbers.
Does anybody want some special tests?
dbench (yes, I know...) with and without MP3 during run
latencytest0.42-png
bonnie++
getc_putc

Thank you for all your serious answers. This was definitely not intended as a 
flamewar start.

-Dieter
-- 
Dieter Nützel
Graduate Student, Computer Science

University of Hamburg
Department of Computer Science
@home: Dieter.Nuetzel@hamburg.de


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 23:02             ` Luigi Genoni
  2002-01-08 23:32               ` Ken Brownfield
  2002-01-09  0:13               ` Dieter Nützel
@ 2002-01-09  6:26               ` Daniel Phillips
  2002-01-09  7:25                 ` Preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable) Roger Larsson
  2 siblings, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-09  6:26 UTC (permalink / raw)
  To: Luigi Genoni
  Cc: Andrea Arcangeli, Anton Blanchard, Dieter Nützel, Marcelo Tosatti,
	Rik van Riel, Linux Kernel List, Andrew Morton, Robert Love

On January 9, 2002 12:02 am, Luigi Genoni wrote:
> On Tue, 8 Jan 2002, Daniel Phillips wrote:
> > On January 8, 2002 04:29 pm, Andrea Arcangeli wrote:
> > > but I just wanted to make clear that the
> > > idea that is floating around that preemptive kernel is all goodness is
> > > very far from reality, you get very low mean latency but at a price.
> >
> > A price lots of people are willing to pay
>
> Probably sometimes they are not making a good business.

Perhaps.  But they are happy customers and their music sounds better.

Note: the dominating cost of -preempt is not Robert's patch, but the fact 
that you need to have CONFIG_SMP enabled, even for uniprocessor, turning all 
those stub macros into real spinlocks.  For a dual processor you have to have 
this anyway and it just isn't an issue.

Personally, I don't intend to ever get another single-processor machine, 
except maybe a laptop, and that's only if Transmeta doesn't come up with a 
dual-processor laptop configuration.

> > By the way, have you measured the cost of -preempt in practice?
>
> Yes, I did a lot of tests, and with the current preempt patch I was
> definitely seeing too big a performance loss.

Was this on uniprocessor machines, or your dual Athlons?  How did you measure 
the performance?

--
Daniel


* Preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-09  6:26               ` Daniel Phillips
@ 2002-01-09  7:25                 ` Roger Larsson
  2002-01-09  7:48                   ` Daniel Phillips
  0 siblings, 1 reply; 351+ messages in thread
From: Roger Larsson @ 2002-01-09  7:25 UTC (permalink / raw)
  To: Daniel Phillips, Luigi Genoni
  Cc: Andrea Arcangeli, Anton Blanchard, Dieter Nützel, Marcelo Tosatti,
	Rik van Riel, Linux Kernel List, Andrew Morton, Robert Love

(the subject has been wrong for some time now...)

On Wednesday 9 January 2002 07.26, Daniel Phillips wrote:
> On January 9, 2002 12:02 am, Luigi Genoni wrote:
> > On Tue, 8 Jan 2002, Daniel Phillips wrote:
> > > On January 8, 2002 04:29 pm, Andrea Arcangeli wrote:
> > > > but I just wanted to make clear that the
> > > > idea that is floating around that preemptive kernel is all goodness
> > > > is very far from reality, you get very low mean latency but at a
> > > > price.
> > >
> > > A price lots of people are willing to pay
> >
> > Probably sometimes they are not making a good business.
>
> Perhaps.  But they are happy customers and their music sounds better.
>
> Note: the dominating cost of -preempt is not Robert's patch, but the fact
> that you need to have CONFIG_SMP enabled, even for uniprocessor, turning
> all those stub macros into real spinlocks.  For a dual processor you have
> to have this anyway and it just isn't an issue.
>

Well, you don't: the first versions used the SMP spinlock macros, but later
ones replaced them with their own code (basically an INC on entry and a DEC
and test when leaving).

Think about what happens on a UP.
There are two cases:
 - the processor is in the critical section: it cannot be preempted, so no
   other process can take the CPU away from it.
 - the processor is not in a critical section: then no other process can be
   executing inside one either, so the lock can never be busy.
=> no real spinlocks needed on a UP
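
The INC on entry / DEC and test on exit scheme can be modelled in a few
lines. This is only a userspace sketch with made-up names (cpu_model,
model_lock and so on), not the code from the preempt patch; it assumes a
single CPU and models a pending reschedule as a plain flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of one CPU; every name here is hypothetical. */
struct cpu_model {
    int  preempt_count;  /* > 0 while inside a critical section */
    bool need_resched;   /* a task is waiting to preempt us */
    int  reschedules;    /* how many times the "scheduler" ran */
};

/* Stands in for a call into the scheduler. */
void model_schedule(struct cpu_model *cpu)
{
    cpu->need_resched = false;
    cpu->reschedules++;
}

/* "Lock" on UP: just an increment, nothing to spin on. */
void model_lock(struct cpu_model *cpu)
{
    cpu->preempt_count++;
}

/* "Unlock" on UP: decrement, then test for a pending reschedule
 * once the outermost critical section has been left. */
void model_unlock(struct cpu_model *cpu)
{
    assert(cpu->preempt_count > 0);
    if (--cpu->preempt_count == 0 && cpu->need_resched)
        model_schedule(cpu);
}

/* An interrupt wakes a task: preempt immediately only when the
 * CPU is outside every critical section, otherwise defer. */
void model_wakeup_irq(struct cpu_model *cpu)
{
    cpu->need_resched = true;
    if (cpu->preempt_count == 0)
        model_schedule(cpu);
}
```

A wakeup that arrives while the count is non-zero only sets the flag; the
reschedule then happens in the model_unlock() call that brings the count
back to zero, which is the "DEC and test" step.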

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden


* Re: Preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable)
  2002-01-09  7:25                 ` Preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable) Roger Larsson
@ 2002-01-09  7:48                   ` Daniel Phillips
  0 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-09  7:48 UTC (permalink / raw)
  To: Roger Larsson, Luigi Genoni
  Cc: Andrea Arcangeli, Anton Blanchard, Dieter Nützel, Marcelo Tosatti,
	Rik van Riel, Linux Kernel List, Andrew Morton, Robert Love

On January 9, 2002 08:25 am, Roger Larsson wrote:
> (the subject has been wrong for some time now...)
> 
> On Wednesday den 9 January 2002 07.26, Daniel Phillips wrote:
> > On January 9, 2002 12:02 am, Luigi Genoni wrote:
> > > On Tue, 8 Jan 2002, Daniel Phillips wrote:
> > > > On January 8, 2002 04:29 pm, Andrea Arcangeli wrote:
> > > > > but I just wanted to make clear that the
> > > > > idea that is floating around that preemptive kernel is all goodness
> > > > > is very far from reality, you get very low mean latency but at a
> > > > > price.
> > > >
> > > > A price lots of people are willing to pay
> > >
> > > Probably sometimes they are not making a good business.
> >
> > Perhaps.  But they are happy customers and their music sounds better.
> >
> > Note: the dominating cost of -preempt is not Robert's patch, but the fact
> > that you need to have CONFIG_SMP enabled, even for uniprocessor, turning
> > all those stub macros into real spinlocks.  For a dual processor you have
> > to have this anyway and it just isn't an issue.
> 
> Well you don't - the first versions used the SMP spinlocks macros but
> replaced them with own code. (basically an INC on entry and a DEC and test
> when leaving)
> 
> Think about what happens on a UP
> There are two cases
>  - the processor is in the critical section, it can not be preempted = no
>    other process can take the CPU away from it.
>  - the processor is not in a critical section, no process can be executing
>    inside it = can never be busy.
> => no real spinlocks needed on a UP

Right, thanks, it was immediately obvious when you pointed out that the 
macros are just used to find the bounds of the critical regions.  So the cost 
of -preempt is somewhat less than I had imagined.

--
Daniel


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 15:29         ` Andrea Arcangeli
  2002-01-08 15:54           ` Daniel Phillips
@ 2002-01-08 20:55           ` Robert Love
  2002-01-09 11:24             ` Andrea Arcangeli
  1 sibling, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-08 20:55 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Daniel Phillips, Anton Blanchard, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Andrew Morton

On Tue, 2002-01-08 at 10:29, Andrea Arcangeli wrote:

> "extra schedule points all over the place", that's the -preempt kernel
> not the lowlatency kernel! (oh yeah, you don't see them in the source
> but ask your CPU if it sees them)

How so?  The branch on drop of the last lock?  It's not a factor in
profiles I've seen.  And it is marked unlikely.  The other change is the
usual check for reschedule on return from interrupt, but that is the
case already, we just allow it when in-kernel, too.

This makes me think the end conclusion would be that preemptive
multitasking in general is bad.  Why don't we increase the timeslice
and tick period, in that case?

One can argue the complexity degrades performance, but tests show
otherwise, in both throughput and latency.  Besides, like I always say,
it's an option that uses existing kernel (SMP lock) infrastructure.  You
don't have to use it ;)

> > The preemptible approach is much less of a maintenance headache, since 
> > people don't have to be constantly doing audits to see if something changed, 
> > and going in to fiddle with scheduling points.
> 
> this yes, it requires less maintenance, but still you should keep in
> mind the details about the spinlocks; things like the checks the VM does
> in shrink_cache are needed also with a preemptive kernel.

They impact SMP in the same way they impact preempt-kernel.  Long-held
locks are never good.  Weird locking rules are never good.

> The I/O pipeline is big enough that a few msec before or later in a
> submit_bh shouldn't make a difference, the batch logic in the
> ll_rw_block layer also try to reduce the reschedule, and last but not
> the least if the task is I/O bound preemptive kernel or not won't make
> any difference in the submit_bh latency because no task is eating cpu
> and latency will be the one of pure schedule call.

Yet throughput tests show a marked increase.  I believe this is true with
Andrew's patch, too.  We multitask better.  We dispatch waiting tasks
faster.  Without the patch, a queued I/O task may stall for some time
waiting for some hog to get out of the kernel.

Although the patch's goal is to improve interactivity (*), no one says
the higher priority task we preempt in favor of has to be CPU-bound.  A
woken up I/O-bound task benefits just as much from the patch.

(*) this is, IMO, where we benefit most though.  By far the most
pleasing benchmark isn't to see some x% decrease in latency or y more
MB/s in bonnie, but to feel the improvement in interactivity.  On a
multitasking desktop, it is noticeable.

> I also don't want to devaluate the preemptive kernel approch (the mean
> latency it can reach is lower than the one of the lowlat kernel, however
> I personally care only about worst case latency and this is why I don't
> feel the need of -preempt), but I just wanted to make clear that the
> idea that is floating around that preemptive kernel is all goodness is
> very far from reality, you get very low mean latency but at a price.

Andrea, I don't want you or anyone to believe preemption is a free
ride.  On the other hand, the patch has a _huge_ userbase and you can't
question that.  You also can't question the benchmarks that show
improvements in average _and_ worst case latency _and_ throughput.

I don't expect you to use the patch.  If it were merged, it would be an
option.  It provides a framework for continuing to improve latency.  It
is a solution to the problem (i.e. latency is poor because the kernel is
non-preemptible) instead of a hack.  I agree worst-case latency is
important, and I agree the patch shines more in the average case.  But we
do affect the worst case.  And now a framework exists for working on fixing
the worst-case latencies too.  And in the end, it's just an option for
some, but a better kernel for others.

	Robert Love


* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:55           ` [2.4.17/18pre] VM and swap - it's really unusable Robert Love
@ 2002-01-09 11:24             ` Andrea Arcangeli
  2002-01-09 14:07               ` Ed Sweetman
  0 siblings, 1 reply; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-09 11:24 UTC (permalink / raw)
  To: Robert Love
  Cc: Daniel Phillips, Anton Blanchard, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Andrew Morton

On Tue, Jan 08, 2002 at 03:55:38PM -0500, Robert Love wrote:
> On Tue, 2002-01-08 at 10:29, Andrea Arcangeli wrote:
> 
> > "extra schedule points all over the place", that's the -preempt kernel
> > not the lowlatency kernel! (on yeah, you don't see them in the source
> > but ask your CPU if it sees them)
> 
> How so?  The branch on drop of the last lock?  It's not a factor in

exactly, this is the reschedule point I meant. Oh, note that it's
unlikely also in the lowlatency patch. Please count the number of times
you add this branch in -preempt, and how many times we add this
branch in lowlat, and then tell me who is adding rescheduling points
all over the kernel.

> This makes me think the end conclusion would be that preemptive
> multitasking in general is bad.  Why don't we increase the timeslice and
> and tick period, in that case?

that would increase performance, but we'd lose interactivity.

> One can argue the complexity degrades performance, but tests show
> otherwise.  In throughput and latency.  Besides, like I always say, its

which benchmarks? you should make sure the CPU spends all its cycles in
the kernel to benchmark the performance degradation (this is the normal
case of webserving with a few gigabit ethernet cards using sendfile).

> ride.  On the other hand, the patch has a _huge_ userbase and you can't

I question this because it is too risky to apply. There is no way any
distribution or production system could ever consider applying the
preempt kernel and shipping it in its next 2.4 kernel update. You never
know if a driver will deadlock because it is doing a test-and-set-bit busy
loop by hand instead of using spin_lock, and you cannot audit all the
device drivers out there. It is not like the VM that is self contained
and that can be replaced without any caller noticing, this instead
impacts every single driver out there and you'd need to audit all of
them, which is not feasible I think and that should be done by giving
everybody the time to test. This is also what makes preempt config
option risky, if we go preempt we should force everybody to use it, at
least during 2.5, so we get the useful feedback from testers of all the
hardware, or nobody could trust -preempt.

NOTE: I trust your work with spinlocks, locks around per-cpu data
structures etc.. is perfect, I trust that part, as said it's the driver
doing test and set bit that you cannot audit that is the problem here
and that makes it potentially unstable, not your changes.  And also the
per-cpu data structures sounds a little risky (but for example for UP
that's not an issue).
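
[Illustration of the driver concern above. This is a minimal user-space
sketch, not code from any driver or from the -preempt patch; struct up_sim
and every function name here are hypothetical. It assumes the preemption
logic only refuses to preempt while a counter bumped by spin_lock-style
locks is non-zero, so a lock built by hand from a test-and-set bit is
invisible to it. On a single CPU, a holder that is preempted by a task
which then busy-waits on the same bit cannot run to release it, at least
until the spinner is itself scheduled away.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; nothing here is a real kernel structure. */
struct up_sim {
    bool bit_locked;     /* the driver's private lock bit */
    int  preempt_count;  /* the only thing the preemption logic looks at */
};

/* Hand-rolled lock: a test-and-set loop that never touches preempt_count. */
static void handrolled_lock(struct up_sim *s)
{
    assert(!s->bit_locked);
    s->bit_locked = true;
}

/* spin_lock-style lock: takes the bit and also marks the section
 * as non-preemptible. */
static void counted_lock(struct up_sim *s)
{
    assert(!s->bit_locked);
    s->preempt_count++;
    s->bit_locked = true;
}

static void counted_unlock(struct up_sim *s)
{
    assert(s->bit_locked && s->preempt_count > 0);
    s->bit_locked = false;
    s->preempt_count--;
}

/* True if the lock holder could be preempted right now by a task that
 * then spins on the same bit while the holder is unable to run. */
static bool holder_can_be_starved(const struct up_sim *s)
{
    return s->bit_locked && s->preempt_count == 0;
}
```

[With handrolled_lock() the check returns true; with counted_lock() it
returns false until the lock is dropped. Finding every hand-rolled case is
the audit problem described above.]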

> question that.  You also can't question the benchmarks that show
> improvements in average _and_ worst case latency _and_ throughput.

I don't question that some benchmarks are faster with -preempt; the
interesting thing is to find out why, because it shouldn't be the case.
Andrew for example mentioned software raid: there are good reasons for
which -preempt could be faster there, so we added a single schedule point
and we now have that case covered in 18pre2aa1. We don't need reschedule
points all over the place like in -preempt to cover things like that.
It is good to find them out so we can fix those bugs, I consider them
bugs :).

Again: I'm not completely against preempt, it can reach a mean latency
much lower than mainline (it can reschedule immediately in the middle of
long copy-users for example), so it definitely has value, it's just
that I'm not sure it's worth it.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 11:24             ` Andrea Arcangeli
@ 2002-01-09 14:07               ` Ed Sweetman
  2002-01-09 14:27                 ` Andrea Arcangeli
  0 siblings, 1 reply; 351+ messages in thread
From: Ed Sweetman @ 2002-01-09 14:07 UTC (permalink / raw)
  To: Andrea Arcangeli, Robert Love
  Cc: Daniel Phillips, Anton Blanchard, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Andrew Morton


----- Original Message -----
From: "Andrea Arcangeli" <andrea@suse.de>
To: "Robert Love" <rml@tech9.net>
Cc: "Daniel Phillips" <phillips@bonn-fries.net>; "Anton Blanchard"
<anton@samba.org>; "Luigi Genoni" <kernel@Expansa.sns.it>; "Dieter N?tzel"
<Dieter.Nuetzel@hamburg.de>; "Marcelo Tosatti" <marcelo@conectiva.com.br>;
"Rik van Riel" <riel@conectiva.com.br>; "Linux Kernel List"
<linux-kernel@vger.kernel.org>; "Andrew Morton" <akpm@zip.com.au>
Sent: Wednesday, January 09, 2002 6:24 AM
Subject: Re: [2.4.17/18pre] VM and swap - it's really unusable


> On Tue, Jan 08, 2002 at 03:55:38PM -0500, Robert Love wrote:
> > On Tue, 2002-01-08 at 10:29, Andrea Arcangeli wrote:
> >
> > > "extra schedule points all over the place", that's the -preempt kernel
> > > not the lowlatency kernel! (on yeah, you don't see them in the source
> > > but ask your CPU if it sees them)
> >
> > How so?  The branch on drop of the last lock?  It's not a factor in
>
> exactly, this is the reschedule point I meant. Oh note that it's
> unlikely also in the lowlatecy patch. Please count the number of time
> you add this branch in the -preempt, and how many times we add this
> branch in the lowlat and then tell me who is adding rescheduling points
> in the kernel all over the place.
>
> > This makes me think the end conclusion would be that preemptive
> > multitasking in general is bad.  Why don't we increase the timeslice and
> > and tick period, in that case?
>
> that would increase performance, but we'd lost interactivity.
>
> > One can argue the complexity degrades performance, but tests show
> > otherwise.  In throughput and latency.  Besides, like I always say, its
>
> which benchmarks? you should make sure the CPU spend all its cycles in
> the kernel to benchmark the perfrormance degradation (this is the normal
> case of webserving with a few gigabit ethernet cards using sendfile).
 I haven't seen any interactive tests where the preempt patch showed worse
results than the vanilla kernel.  The only cases where it gives worse
performance are in a single-tasking environment, such as running bonnie
or dbench and similar apps that need to throttle the system.  This is
obviously expected behavior though.  Performance degradation might be seen
on a per-app basis, but when looking at the system as a whole, performance
has never degraded with the patch as far as I've seen.  Better overall
performance is what has led to better "benchmark" performance on the tests
being run by people.


> > ride.  On the other hand, the patch has a _huge_ userbase and you can't
>
> I question this because it is too risky to apply. There is no way any
> distribution or production system could ever consider applying the
> preempt kernel and ship it in its next kernel update 2.4. You never know
> if a driver will deadlock because it is doing a test and set bit busy
> loop by hand instead of using spin_lock and you cannot audit all the
> device drivers out there. It is not like the VM that is self contained
> and that can be replaced without any caller noticing, this instead
> impacts every single driver out there and you'd need to audit all of
> them, which is not feasible I think and that should be done by giving
> everybody the time to test. This is also what makes preempt config
> option risky, if we go preempt we should force everybody to use it, at
> least during 2.5, so we get the useful feedback from testers of all the
> hardware, or nobody could trust -preempt.
 I disagree.   Red Hat shipped gcc 2.96 when it was producing incompatible
binaries and was buggy as all hell, so why not ship a kernel that is
"unstable" and "risky" if it promises better performance?
If scheduling points are ugly in 2.4, then they'd be ugly in 2.5.  The only
solution to the problem you see with it is making 2.5 fully preemptible from
the ground up instead of having to add fixes.  If nobody wants to do things
the hard way (assuming there is a better way), is it better to leave it
unfixed rather than fix it?  Of course I'm assuming that the idea of a fully
preemptible kernel is better than the current version we have now.


> NOTE: I trust your work with spinlocks, locks around per-cpu data
> structures etc.. is perfect, I trust that part, as said it's the driver
> doing test and set bit that you cannot audit that is the problem here
> and that makes it potentially unstable, not your changes.  And also the
> per-cpu data structures sounds a little risky (but for example for UP
> that's not an issue).
>
> > question that.  You also can't question the benchmarks that show
> > improvements in average _and_ worst case latency _and_ throughput.
>
> I don't question some benchmark is faster with -preempt, the interesting
> thing is to find why because it shouldn't be the case, Andrew for
> example mentioned software raid, there are good reasons for which
> -preempt could be faster there, so we added a single sechdule point and
> we just have that case covered in 18pre2aa1, we don't need reschedule
> points all over the place like in -preempt to cover things like that.
> It is good to find them out so we can fix those bugs, I consider them
> bugs :).

  I think Robert Love is trying to give the kernel the highest flexibility.
Making it flexible in key areas will improve your worst cases, but a lot of
the time during normal use it's the multitude of smaller cases that is
noticeable.

> Again: I'm not completly against preempt, it can reach an mean latency
> much lower than mainline (it can reschedule immediatly in the middle of
> long copy-users for example), so it definitely has a value, it's just
> that I'm not sure if it worth it.
>
> Andrea


Ok so the medicine is worse than the disease.   I take it that you only want
some key points made for rescheduling instead of the full preempt patch by
Robert.   That seems logical enough.   The only issue I see is that for the
most part people don't like the idea of needing to add scheduling points.  So
how would the kernel need to be fixed in order to not need them and still be
fully preemptible, like it's getting in Robert's patch?  If it just can't, then
is it really best to hang out somewhere on the edge of preemptible
multitasking because some people are in denial that the kernel needs to be
patched so much to work correctly, and for the sake of single-tasking
performance?

Now in just my own opinion I think of linux as a multitasking kernel and as
such it should perform that function as well as possible.  If you want to
run a single program as fast as possible then simply don't run anything
else, and nothing can preempt it to degrade its performance.   The fact that
you can run multiple apps and still run a single program as fast as possible
without degrading its performance is a bug if those other apps (at the
same priority) have to wait longer than they should, if we want linux to be a
multitasking kernel.  Just to sum things up, if there is a way for linux to be
fully preemptible without scheduling points, then perhaps that should be a
major focus for 2.5 instead of picking and choosing (ugly) scheduling points,
but if not then the argument about them not being elegant is moot because
then the kernel, itself, is far from elegant already, so what exactly are
you saving?

- Formerly safemode


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 14:07               ` Ed Sweetman
@ 2002-01-09 14:27                 ` Andrea Arcangeli
  2002-01-09 14:51                   ` Arjan van de Ven
  0 siblings, 1 reply; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-09 14:27 UTC (permalink / raw)
  To: Ed Sweetman
  Cc: Robert Love, Daniel Phillips, Anton Blanchard, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List,
	Andrew Morton

On Wed, Jan 09, 2002 at 09:07:55AM -0500, Ed Sweetman wrote:
> Ok so the medicine is worse than the disease.   I take it that you only want
> some key points made for rescheduling instead of the full preempt patch by
> Robert.   That seems logical enough.   The only issue i see is that for the

My ideal is for the kernel to have a worst-case latency as low as -preempt,
but without being preemptive. That's possible to achieve, I don't think
we're that far.

mean latency is another matter, but I personally don't care much about mean
latency and I much prefer to save cpu cycles instead.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 14:27                 ` Andrea Arcangeli
@ 2002-01-09 14:51                   ` Arjan van de Ven
  2002-01-09 17:02                     ` Roger Larsson
  2002-01-09 17:13                     ` Daniel Phillips
  0 siblings, 2 replies; 351+ messages in thread
From: Arjan van de Ven @ 2002-01-09 14:51 UTC (permalink / raw)
  To: Andrea Arcangeli, linux-kernel

Andrea Arcangeli wrote:
> 
> On Wed, Jan 09, 2002 at 09:07:55AM -0500, Ed Sweetman wrote:
> > Ok so the medicine is worse than the disease.   I take it that you only want
> > some key points made for rescheduling instead of the full preempt patch by
> > Robert.   That seems logical enough.   The only issue i see is that for the
> 
> My ideal is to have the kernel to be as low worst latency as -preempt,
> but without being preemptive. that's possible to achieve, I don't think
> we're that far.
> 
> mean latency is another matter, but I personally don't mind about mean
> latency and I much prefer to save cpu cycles instead.

hear hear!

The akpm patch is achieving a MUCH better latency than pure -preempt,
and only has 40 or so coded preemption points instead of a few hundred
(eg every spin_unlock)....

and if with 40 we can get <= 1ms then everybody will be happy; if you
want, say, 50 usec latency instead you need RTLinux anyway. With 1ms
_worst case_ latency the "mean" latency is obviously also very good.......

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 14:51                   ` Arjan van de Ven
@ 2002-01-09 17:02                     ` Roger Larsson
  2002-01-09 17:10                       ` Arjan van de Ven
  2002-01-09 17:13                     ` Daniel Phillips
  1 sibling, 1 reply; 351+ messages in thread
From: Roger Larsson @ 2002-01-09 17:02 UTC (permalink / raw)
  To: arjanv, Andrea Arcangeli, linux-kernel

On Wednesday den 9 January 2002 15.51, Arjan van de Ven wrote:
> Andrea Arcangeli wrote:
> > On Wed, Jan 09, 2002 at 09:07:55AM -0500, Ed Sweetman wrote:
> > > Ok so the medicine is worse than the disease.   I take it that you only
> > > want some key points made for rescheduling instead of the full preempt
> > > patch by Robert.   That seems logical enough.   The only issue i see is
> > > that for the
> >
> > My ideal is to have the kernel to be as low worst latency as -preempt,
> > but without being preemptive. that's possible to achieve, I don't think
> > we're that far.
> >
> > mean latency is another matter, but I personally don't mind about mean
> > latency and I much prefer to save cpu cycles instead.
>
> hear hear!
>
> The akpm patch is achieving a MUCH better latency than pure -preempt,
> and only has 40
> or so coded preemption points instead of a few hundred (eg every
> spin_unlock)....

The difference is that the preemptive kernel mostly uses existing 
infrastructure. When SMP scalability gets better due to holding locks
for a shorter time then the preemptive kernel will improve as well!

AND it can be used on a UP computer to "simulate" SMP and that
should help the quality of the total code base...

This is my idea:
* Add the preemptive kernel
* "Remove" reschedule points from main kernel.
   Note that reschedule points that do nothing more than
   test and schedule can be NOOPed since they will never trigger in a
   preemptive kernel...

>
> and if with 40 we can get <= 1ms then everybody will be happy; if you
> want, say, 50 usec
> latency instead you need RTLinux anyway. With 1ms _worst case_ latency
> the "mean" latency
> is obviously also very good.......

Worst case latency... is VERY hard to prove if you rely on schedule points.
Since they are typically added after the fact...
If the code suddenly ends up on a road less travelled...

With preemptive kernel your worst latency is the longest held spinlock. 
PERIOD.
(you can of course be delayed by an even higher priority process)
* Make sure that there are no "infinite" loops inside any spinlock.

"infinite" == over ALL or ALL/x of something since someone, somewhere
 will have ALL close to infinite... (infinity/x is still infinity... :-)
   example code is looping through LRU list to find a victim page...
   once it was not infinite due to the small number of pages...

Note that akpm patches usually have a "do not do this" list with known
problem spots (ok, usually in hard-to-break spinlocks).

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 17:02                     ` Roger Larsson
@ 2002-01-09 17:10                       ` Arjan van de Ven
  0 siblings, 0 replies; 351+ messages in thread
From: Arjan van de Ven @ 2002-01-09 17:10 UTC (permalink / raw)
  To: Roger Larsson; +Cc: arjanv, Andrea Arcangeli, linux-kernel

On Wed, Jan 09, 2002 at 06:02:53PM +0100, Roger Larsson wrote:
> The difference is that the preemptive kernel mostly uses existing 
> infrastructure. When SMP scalability gets better due to holding locks
> for a shorter time then the preemptive kernel will improve as well!

Ehm. Holding locks for a shorter time is not guaranteed to improve smp
scalability. In fact it can completely kill it due to cacheline pingpong
effects.

> > and if with 40 we can get <= 1ms then everybody will be happy; if you
> > want, say, 50 usec
> > latency instead you need RTLinux anyway. With 1ms _worst case_ latency
> > the "mean" latency
> > is obviously also very good.......
> 
> Worst case latency... is VERY hard to prove if you rely on schedule points.

Agreed. It's "worst case" in the soft real time sense. But we've beaten the
kernel quite hard during such tests....

> With preemptive kernel your worst latency is the longest held spinlock. 
> PERIOD.

Yes and without the same stuff akpm does that's about 80 to 90 ms right now. 

> Note: that akpm patches usually hava a - "do not do this list" with known
> problem spots (ok, usually in a hard to break spinlocks).

Usually in hardware related parts. Even with -preempt you'll get this.
Hopefully only during hardware initialisation, but there are just cases
where you need to go WAAAY too far if you want to go below, say, 5ms during
device init.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 14:51                   ` Arjan van de Ven
  2002-01-09 17:02                     ` Roger Larsson
@ 2002-01-09 17:13                     ` Daniel Phillips
  1 sibling, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-09 17:13 UTC (permalink / raw)
  To: arjanv, Andrea Arcangeli, linux-kernel

On January 9, 2002 03:51 pm, Arjan van de Ven wrote:
> Andrea Arcangeli wrote:
> > 
> > On Wed, Jan 09, 2002 at 09:07:55AM -0500, Ed Sweetman wrote:
> > > Ok so the medicine is worse than the disease.   I take it that you only want
> > > some key points made for rescheduling instead of the full preempt patch by
> > > Robert.   That seems logical enough.   The only issue i see is that for the
> > 
> > My ideal is to have the kernel to be as low worst latency as -preempt,
> > but without being preemptive. that's possible to achieve, I don't think
> > we're that far.
> > 
> > mean latency is another matter, but I personally don't mind about mean
> > latency and I much prefer to save cpu cycles instead.
> 
> hear hear!

> The akpm patch is achieving a MUCH better latency than pure -preempt,

Can you please point us at the benchmark results that support your claim?

> and only has 40 
> or so coded preemption points instead of a few hundred (eg every
> spin_unlock).... 

So?  The cost of this is, in theory, a dec and a branch normally not taken.
Robert hasn't coded it that way in the current incarnation, and personally,
I'd rather see the correctness proven before the microoptimizations are
done, but that's where it's going in theory.  Big deal.
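
[To make the cost under discussion concrete, here is a minimal user-space
sketch of "a dec and a branch normally not taken" on unlock. It is not
taken from Robert's patch, which by the account above is not yet coded
this way; struct task_sim and the sim_* functions are hypothetical names,
and the "lock" is reduced to a nesting counter. The assumption is that an
interrupt sets need_resched, and the switch is deferred until the last
lock is dropped.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-task state, for illustration only. */
struct task_sim {
    int  preempt_count;  /* > 0 while at least one lock is held */
    bool need_resched;   /* set from an interrupt when another task should run */
    int  reschedules;    /* number of times we switched away */
};

/* Stand-in for schedule(): just records that a switch happened. */
static void sim_schedule(struct task_sim *t)
{
    t->need_resched = false;
    t->reschedules++;
}

static void sim_spin_lock(struct task_sim *t)
{
    t->preempt_count++;  /* entering a non-preemptible section */
}

static void sim_spin_unlock(struct task_sim *t)
{
    assert(t->preempt_count > 0);
    /* The dec, and the branch that is normally not taken. */
    if (--t->preempt_count == 0 && t->need_resched)
        sim_schedule(t);
}
```

[With two nested locks held and need_resched set, the first unlock does
nothing extra and the second one reschedules. The branch sits in every
unlock, which is the "few hundred" preemption points being counted
elsewhere in this thread.]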

On the other hand, I just did a test for myself that pretty well makes up
my mind about this patch.  I'm typing this right now on a 64 Meg laptop with
a slow disk, dma turned off.  On this machine, debian apt-get dist-upgrade
is essentially a DoS - once it gets to unpacking packages and configuring,
for whatever reason, the machine becomes almost unusable.  Changing windows
for example, can take 10-15 seconds.  Updatedb, while not quite as bad, is
definitely an irritant as far as interactive use goes.

With Robert's patch, the machine is a little sluggish during apt-get, but
quite usable.  This is a *huge* difference.  And during updatedb, well, I
hardly notice it's happening, except for the disk light.

So I like this patch.  What was your complaint again?  If you've got hard
numbers and repeatable benchmarks, please trot them out.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 15:00       ` Daniel Phillips
  2002-01-08 15:29         ` Andrea Arcangeli
@ 2002-01-08 19:47         ` Andrew Morton
  2002-01-08 20:13           ` Alan Cox
                             ` (4 more replies)
  1 sibling, 5 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-08 19:47 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

Daniel Phillips wrote:
> 
> On January 8, 2002 02:33 pm, Anton Blanchard wrote:
> > Andrea Arcangeli [apparently] wrote:
> > > So yes, mean latency will decrease with preemptive kernel, but your CPU
> > > is definitely paying something for it.
> >
> > And Andrew Morton's work suggests he can do a much better job of
> > reducing latency than -preempt.
> 
> That's not a particularly clueful comment, Anton.  Obviously, any
> latency-busting hacks that Andrew does could also be patched into a
> -preempt kernel.

Yes.  The important part is the implicit dropping of the BKL across
schedule().

> What a preemptible kernel can do that a non-preemptible kernel can't is:
> reschedule exactly as often as necessary, instead of having lots of extra
> schedule points inserted all over the place, firing when *they* think the
> time is right, which may well be earlier than necessary.

Nope.  `if (current->need_resched)' -> the time is right (beyond right,
actually).

> The preemptible approach is much less of a maintainance headache, since
> people don't have to be constantly doing audits to see if something changed,
> and going in to fiddle with scheduling points.

Except it doesn't work.  The full-on low-latency patch has ~60 rescheduling
points.  Of these, ~40 involve popping spinlocks.  Really, the only significant
latency sources which the preemptible kernel solves are generic_file_read()
and generic_file_write().

So preemptible kernel needs lock-break to be useful.  And then it's basically
the same thing, with the same maintainability problems.  And believe me, these
are considerable.  Mainly because the areas which need busting up exactly
coincide with the areas where there has been most churn in the kernel.
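
[A rough sketch of what "lock-break" means here, as a user-space
simulation rather than code from the low-latency patch. struct cpu_sim,
scan_with_lock_break() and the resched_every parameter are hypothetical;
the lock is a flag and schedule() is a counter. The loop walks n items
under a lock and, whenever a reschedule is pending, pops the lock,
reschedules, and retakes it.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one CPU, for illustration only. */
struct cpu_sim {
    bool lock_held;
    bool need_resched;
    int  reschedules;   /* how often the scan gave up the CPU */
    int  longest_hold;  /* most items processed under one lock hold */
};

/* Walk n items under a lock, breaking the lock when a reschedule is
 * pending.  resched_every models an interrupt that sets need_resched
 * once every that many items (0 means never). */
static void scan_with_lock_break(struct cpu_sim *c, size_t n,
                                 size_t resched_every)
{
    int held = 0;

    assert(!c->lock_held);
    c->lock_held = true;
    for (size_t i = 0; i < n; i++) {
        held++;                               /* "process" item i */
        if (resched_every && (i + 1) % resched_every == 0)
            c->need_resched = true;           /* simulated interrupt */

        if (c->need_resched) {
            if (held > c->longest_hold)
                c->longest_hold = held;
            c->lock_held = false;             /* pop the lock */
            c->need_resched = false;
            c->reschedules++;                 /* stand-in for schedule() */
            c->lock_held = true;              /* retake it */
            held = 0;
        }
    }
    if (held > c->longest_hold)
        c->longest_hold = held;
    c->lock_held = false;
}
```

[The sketch leaves out the hard part: after retaking the lock, real code
has to check that its position in the list is still valid, because
anything may have changed while the lock was dropped. That revalidation
is where the maintainability cost mentioned above comes from.]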

> Finally, with preemption, rescheduling can be forced with essentially zero
> latency in response to an arbitrary interrupt such as IO completion, whereas
> the non-preemptive kernel will have to 'coast to a stop'.  In other words,
> the non-preemptive kernel will have little lags between successive IOs,
> whereas the preemptive kernel can submit the next IO immediately.  So there
> are bound to be loads where the preemptive kernel turns in better latency
> *and throughput* than the scheduling point hack.

Latency yes.  Throughput no.

I don't think the "preempt slows down the kernel" argument is very valid
really.  Let's invert the argument - Linux is multitasking, and that has a
cost.  There's no reason why certain bits of the kernel need to violate that
just to get a bit more throughput.  If it really worries you, set HZ=10 and
increase all the timeslices, etc.

Now, there *may* be overheads added due to losing the implicit locking which
per-CPU data gives you.

The main cost of preempt IMO is in complexity and stability risks.

(BTW: I took a weird oops testing the preempt patch on an SMP NFS client.
The fault address was 0x0aXXXXXX.  No useful backtrace, unfortunately).

> Mind you, I'm not devaluing Andrew's work, it's good and valuable.  However
> it's good to be aware of why that approach can't equal the latency-busting
> performance of the preemptive approach.

There's no point in just merging the preempt patch and saying "there,
that's done".  It doesn't do anything.

Instead, a decision needs to be made: "Linux will henceforth be a 
low-latency kernel".  Now, IF we can come to this decision, then
internal preemption is the way to do it.  But it affects ALL kernel
developers.  Because we'll need to introduce a new rule: "it is a
bug to spend more than five milliseconds holding any locks".

So.  Do we want a low-latency kernel?  Are we prepared to mandate
the five-millisecond rule?   It can be done, but won't be easy, and
we'll never get complete coverage.  But I don't see the will around
here.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 19:47         ` Andrew Morton
@ 2002-01-08 20:13           ` Alan Cox
  2002-01-08 22:00             ` Roger Larsson
  2002-01-08 20:18           ` Daniel Phillips
                             ` (3 subsequent siblings)
  4 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-08 20:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Phillips, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List,
	Robert Love

> low-latency kernel".  Now, IF we can come to this decision, then
> internal preemption is the way to do it.  But it affects ALL kernel

The pre-empt patches just make things much much harder to debug. They
remove some of the predictability and the normal call chain following
goes out of the window because you end up seeing crashes in a thread with
no idea what ran the microsecond before

Some of that happens now but this makes it vastly worse.

The low latency patches don't change the basic predictability and
debuggability but allow you to hit a 1mS pre-empt target for the general
case.

Alan

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:13           ` Alan Cox
@ 2002-01-08 22:00             ` Roger Larsson
  0 siblings, 0 replies; 351+ messages in thread
From: Roger Larsson @ 2002-01-08 22:00 UTC (permalink / raw)
  To: Alan Cox, Andrew Morton
  Cc: Daniel Phillips, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List,
	Robert Love

On Tuesdayen den 8 January 2002 21.13, Alan Cox wrote:
> > low-latency kernel".  Now, IF we can come to this decision, then
> > internal preemption is the way to do it.  But it affects ALL kernel
>
> The pre-empt patches just make things much much harder to debug. They
> remove some of the predictability and the normal call chain following
> goes out of the window because you end up seeing crashes in a thread with
> no idea what ran the microsecond before
>
> Some of that happens now but this makes it vastly worse.
>
> The low latency patches don't change the basic predictability and
> debuggability but allow you to hit a 1mS pre-empt target for the general
> case.
>

Yes, it does make things much much harder to debug - but:
* If you get a problem on a preemptive UP kernel, it is likely to be a problem
  on SMP too - and those are hard to debug as well. But the positive aspect
  is that you get more people that can do the debugging... :-)
  (One CPU gets delayed handling an IRQ, the other runs into the critical
   section)
* It is optional at compile time.
   And could even be made run time optional / CPU ! Just set a too big value
   on the counter and it will never reschedule...

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 19:47         ` Andrew Morton
  2002-01-08 20:13           ` Alan Cox
@ 2002-01-08 20:18           ` Daniel Phillips
  2002-01-08 21:19             ` Robert Love
  2002-01-14  1:08             ` Bill Davidsen
  2002-01-08 20:59           ` Daniel Phillips
                             ` (2 subsequent siblings)
  4 siblings, 2 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-08 20:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

On January 8, 2002 08:47 pm, Andrew Morton wrote:
> There's no point in just merging the preempt patch and saying "there,
> that's done".  It doesn't do anything.
> 
> Instead, a decision needs to be made: "Linux will henceforth be a 
> low-latency kernel".

I thought the intention was to make it a config option?

> Now, IF we can come to this decision, then
> internal preemption is the way to do it.  But it affects ALL kernel
> developers.  Because we'll need to introduce a new rule: "it is a
> bug to spend more than five milliseconds holding any locks".
> 
> So.  Do we we want a low-latency kernel?  Are we prepared to mandate
> the five-millisecond rule?   It can be done, but won't be easy, and
> we'll never get complete coverage.  But I don't see the will around
> here.

At least the flaming has gotten a little less ;-)

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:18           ` Daniel Phillips
@ 2002-01-08 21:19             ` Robert Love
  2002-01-14  1:08             ` Bill Davidsen
  1 sibling, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-08 21:19 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List

On Tue, 2002-01-08 at 15:18, Daniel Phillips wrote:

> > Instead, a decision needs to be made: "Linux will henceforth be a 
> > low-latency kernel".
> 
> I thought the intention was to make it a config option?

It was originally, it is now, and I intend it to be.

Further, since it uses the existing SMP locks, it doesn't introduce new
design decisions (the one being protection of implicitly locked per-CPU
data on preempt).

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:18           ` Daniel Phillips
  2002-01-08 21:19             ` Robert Love
@ 2002-01-14  1:08             ` Bill Davidsen
  1 sibling, 0 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-14  1:08 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Linux Kernel Mailing List

On Tue, 8 Jan 2002, Daniel Phillips wrote:

> On January 8, 2002 08:47 pm, Andrew Morton wrote:
> > There's no point in just merging the preempt patch and saying "there,
> > that's done".  It doesn't do anything.
> > 
> > Instead, a decision needs to be made: "Linux will henceforth be a 
> > low-latency kernel".
> 
> I thought the intention was to make it a config option?

Irrelevant, it has to be implemented in order to be an option, so the
amount of work involved is the same either way. And if you want to make it
a runtime setting you add a slight bit of work and overhead deciding if LL
is wanted.

I'm not advocating that, but it would allow admins to enable LL when the
system was slow and see if it really made a change. Rebooting is bound to
change the load ;-)

> > Now, IF we can come to this decision, then
> > internal preemption is the way to do it.  But it affects ALL kernel
> > developers.  Because we'll need to introduce a new rule: "it is a
> > bug to spend more than five milliseconds holding any locks".
> > 
> > So.  Do we want a low-latency kernel?  Are we prepared to mandate
> > the five-millisecond rule?   It can be done, but won't be easy, and
> > we'll never get complete coverage.  But I don't see the will around
> > here.

Really? You have people working on low latency, people working on preempt,
and at least a few of us trying to characterize the problems with large
memory and i/o. I would say latency has become a real issue, and you only
need enough "will" to have one person write useful code; this isn't a
committee.

Since changes of this type don't need to be perfect and address all cases,
just help some and not make others worse, I think we will see improvement
in 2.4.xx without waiting for 2.5 or 2.6. No one is complaining that the
Linux overall throughput is bad, that network performance is bad, etc. But
responsiveness has become an issue, and I'm sure there's enough will to
solve it. "Solve" means getting most of the delays to be caused by
hardware capacity instead of kernel ineptitude.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 19:47         ` Andrew Morton
  2002-01-08 20:13           ` Alan Cox
  2002-01-08 20:18           ` Daniel Phillips
@ 2002-01-08 20:59           ` Daniel Phillips
  2002-01-08 21:08             ` Rik van Riel
                               ` (4 more replies)
  2002-01-08 21:08           ` Robert Love
  2002-01-09  0:31           ` Oliver Xymoron
  4 siblings, 5 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-08 20:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

On January 8, 2002 08:47 pm, Andrew Morton wrote:
> Daniel Phillips wrote:
> > What a preemptible kernel can do that a non-preemptible kernel can't is:
> > reschedule exactly as often as necessary, instead of having lots of extra
> > schedule points inserted all over the place, firing when *they* think the
> > time is right, which may well be earlier than necessary.
> 
> Nope.  `if (current->need_resched)' -> the time is right (beyond right,
> actually).

Oops, sorry, right.

The preemptible kernel can reschedule, on average, sooner than the 
scheduling-point kernel, which has to wait for a scheduling point to roll 
around.

And while I'm enumerating differences, the preemptable kernel (in this 
incarnation) has a slight per-spinlock cost, while the non-preemptable kernel 
has the fixed cost of checking for rescheduling, at intervals throughout all 
'interesting' kernel code, essentially all long-running loops.  But by clever 
coding it's possible to finesse away almost all the overhead of those loop 
checks, so in the end, the non-preemptible low-latency patch has a slight 
efficiency advantage here, with emphasis on 'slight'.
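
To make the "clever coding" point concrete, here is a minimal user-space
sketch of one way to amortize a scheduling-point check: test the flag only
once every `stride` iterations of a long loop.  It is not code from the
low-latency patch or the preempt patch; `task_model`,
`conditional_schedule_model()` and `process_items()` are hypothetical names
invented for this illustration, and the "yield" is just a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-task state; not taken from either patch. */
struct task_model {
	int need_resched;       /* set when a task switch is wanted */
	unsigned long yields;   /* how many times the loop gave up the CPU */
	unsigned long checks;   /* how many times the flag was tested */
};

/* Model of a scheduling point: test the flag, yield only if it is set. */
static void conditional_schedule_model(struct task_model *t)
{
	t->checks++;
	if (t->need_resched) {
		t->need_resched = 0;  /* a real schedule() would switch tasks here */
		t->yields++;
	}
}

/*
 * A long-running loop that tests the flag once every `stride` iterations,
 * so the cost of the test is spread over `stride` units of real work.
 * Returns the sum of the items processed.
 */
static unsigned long process_items(struct task_model *t,
				   const unsigned *items,
				   size_t n, size_t stride)
{
	unsigned long sum = 0;

	assert(stride > 0);
	for (size_t i = 0; i < n; i++) {
		sum += items[i];
		if ((i + 1) % stride == 0)
			conditional_schedule_model(t);
	}
	return sum;
}
```

A larger stride lowers the checking overhead but lengthens the worst-case
wait before a pending reschedule is noticed, which is the trade-off being
discussed here.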

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:59           ` Daniel Phillips
@ 2002-01-08 21:08             ` Rik van Riel
  2002-01-08 21:15               ` Robert Love
  2002-01-08 21:51               ` Daniel Phillips
  2002-01-08 21:10             ` Andrew Morton
                               ` (3 subsequent siblings)
  4 siblings, 2 replies; 351+ messages in thread
From: Rik van Riel @ 2002-01-08 21:08 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Linux Kernel List, Robert Love

On Tue, 8 Jan 2002, Daniel Phillips wrote:

> The preemptible kernel can reschedule, on average, sooner than the
> scheduling-point kernel, which has to wait for a scheduling point to
> roll around.

The preemptible kernel ALSO has to wait for a scheduling point
to roll around, since it cannot preempt with spinlocks held.

Considering this, I don't see much of an advantage to adding
kernel preemption.

regards,

Rik
-- 
"Linux holds advantages over the single-vendor commercial OS"
    -- Microsoft's "Competing with Linux" document

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:08             ` Rik van Riel
@ 2002-01-08 21:15               ` Robert Love
  2002-01-08 21:24                 ` Rik van Riel
  2002-01-08 21:51               ` Daniel Phillips
  1 sibling, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-08 21:15 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Daniel Phillips, Andrew Morton, Anton Blanchard, Andrea Arcangeli,
	Luigi Genoni, Dieter Nützel, Marcelo Tosatti, Linux Kernel List

On Tue, 2002-01-08 at 16:08, Rik van Riel wrote:

> The preemptible kernel ALSO has to wait for a scheduling point
> to roll around, since it cannot preempt with spinlocks held.
> 
> Considering this, I don't see much of an advantage to adding
> kernel preemption.

It only has to wait if locks are held and then only until the locks are
dropped.  Otherwise it will preempt on the next return from interrupt.

Future work would be to look into long-held locks and see what we can
do.

Without preempt-kernel, we have none of this: either run until
completion or explicit scheduling points. 
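
The rule described above (preempt on return from interrupt unless a lock
is held, otherwise as soon as the last lock is dropped) can be modelled in
user space roughly as follows.  This is only a sketch under stated
assumptions, not the preempt-kernel patch: `task_model` and the `*_model()`
functions are hypothetical names, and a "context switch" is represented by
incrementing a counter.

```c
#include <assert.h>

/* Hypothetical per-task state; field names are illustrative only. */
struct task_model {
	int preempt_count;  /* number of spinlocks currently held */
	int need_resched;   /* a more important task is waiting */
	int switches;       /* how many times this task was preempted */
};

/* Preempt only when no locks are held and a reschedule is pending. */
static void preempt_if_allowed(struct task_model *t)
{
	if (t->preempt_count == 0 && t->need_resched) {
		t->need_resched = 0;
		t->switches++;  /* stands in for a call to schedule() */
	}
}

static void spin_lock_model(struct task_model *t)
{
	t->preempt_count++;
}

static void spin_unlock_model(struct task_model *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
	preempt_if_allowed(t);  /* earliest point for a deferred preemption */
}

/* An interrupt (e.g. the timer) may request a switch, then checks on exit. */
static void irq_return_model(struct task_model *t, int wake_higher_prio)
{
	if (wake_higher_prio)
		t->need_resched = 1;
	preempt_if_allowed(t);
}
```

In this model an interrupt that arrives with no locks held preempts at
once, while one that arrives inside a locked region is deferred until
`spin_unlock_model()` brings the count back to zero.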

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:15               ` Robert Love
@ 2002-01-08 21:24                 ` Rik van Riel
  2002-01-08 21:45                   ` Robert Love
  0 siblings, 1 reply; 351+ messages in thread
From: Rik van Riel @ 2002-01-08 21:24 UTC (permalink / raw)
  To: Robert Love
  Cc: Daniel Phillips, Andrew Morton, Anton Blanchard, Andrea Arcangeli,
	Luigi Genoni, Dieter Nützel, Marcelo Tosatti, Linux Kernel List

On 8 Jan 2002, Robert Love wrote:
> On Tue, 2002-01-08 at 16:08, Rik van Riel wrote:
>
> > The preemptible kernel ALSO has to wait for a scheduling point
> > to roll around, since it cannot preempt with spinlocks held.
> >
> > Considering this, I don't see much of an advantage to adding
> > kernel preemption.
>
> It only has to wait if locks are held and then only until the locks are
> dropped.  Otherwise it will preempt on the next return from interrupt.

So what exactly _is_ the difference between an explicit
preemption point and a place where we need to explicitly
drop a spinlock ?

From what I can see, there really isn't a difference.

> Future work would be to look into long-held locks and see what we can
> do.

One thing we could do is download Andrew Morton's patch ;)

Rik
-- 
"Linux holds advantages over the single-vendor commercial OS"
    -- Microsoft's "Competing with Linux" document

http://www.surriel.com/		http://distro.conectiva.com/


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:24                 ` Rik van Riel
@ 2002-01-08 21:45                   ` Robert Love
  2002-01-08 22:31                     ` Andrew Morton
  0 siblings, 1 reply; 351+ messages in thread
From: Robert Love @ 2002-01-08 21:45 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Daniel Phillips, Andrew Morton, Anton Blanchard, Andrea Arcangeli,
	Luigi Genoni, Dieter Nützel, Marcelo Tosatti, Linux Kernel List

On Tue, 2002-01-08 at 16:24, Rik van Riel wrote:

> So what exactly _is_ the difference between an explicit
> preemption point and a place where we need to explicitly
> drop a spinlock ?

In that case nothing, except that when we drop the lock and check it is
the earliest place where preemption is allowed.  In the normal scenario,
however, we have a check for reschedule on return from interrupt (e.g.
the timer) and thus preempt in the same manner as with user space and
that is the key.

> > Future work would be to look into long-held locks and see what we can
> > do.
> 
> One thing we could do is download Andrew Morton's patch ;)

That is certainly one option, and Andrew's patch is very good. 
Nonetheless, I think we need a more general framework that tackles the
problem itself.  Preemptible kernel does this, yields results now, and
allows for greater return later on.

	Robert Love

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:45                   ` Robert Love
@ 2002-01-08 22:31                     ` Andrew Morton
  0 siblings, 0 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-08 22:31 UTC (permalink / raw)
  To: Robert Love
  Cc: Rik van Riel, Daniel Phillips, Anton Blanchard, Andrea Arcangeli,
	Luigi Genoni, Dieter Nützel, Marcelo Tosatti, Linux Kernel List

Robert Love wrote:
> 
> On Tue, 2002-01-08 at 16:24, Rik van Riel wrote:
> 
> > So what exactly _is_ the difference between an explicit
> > preemption point and a place where we need to explicitly
> > drop a spinlock ?
> 
> In that case nothing, except that when we drop the lock and check it is
> the earliest place where preemption is allowed.  In the normal scenario,
> however, we have a check for reschedule on return from interrupt (e.g.
> the timer) and thus preempt in the same manner as with user space and
> that is the key.

One could do:

static inline void spin_unlock(spinlock_t *lock)
{
	__asm__ __volatile__(
		spin_unlock_string
	);

	if (--current->lock_depth == 0 &&
	    current->need_resched &&
	    current->state == TASK_RUNNING)
		schedule();
}

But I have generally avoided "global" solutions like this, in favour
of nailing the _specific_ code which is causing the problem.  Which
is a lot more work, but more useful.

The scheduling points in bread() and submit_bh() in the mini-ll patch
go against this (masochistic) philosophy.

> > > Future work would be to look into long-held locks and see what we can
> > > do.
> >
> > One thing we could do is download Andrew Morton's patch ;)
> 
> That is certainly one option, and Andrew's patch is very good.
> Nonetheless, I think we need a more general framework that tackles the
> problem itself.  Preemptible kernel does this, yields results now, and
> allows for greater return later on.

We need something which makes 2.4.x not suck.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:08             ` Rik van Riel
  2002-01-08 21:15               ` Robert Love
@ 2002-01-08 21:51               ` Daniel Phillips
  1 sibling, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-08 21:51 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Linux Kernel List, Robert Love

On January 8, 2002 10:08 pm, Rik van Riel wrote:
> On Tue, 8 Jan 2002, Daniel Phillips wrote:
> 
> > The preemptible kernel can reschedule, on average, sooner than the
> > scheduling-point kernel, which has to wait for a scheduling point to
> > roll around.
> 
> The preemptible kernel ALSO has to wait for a scheduling point
> to roll around, since it cannot preempt with spinlocks held.

Think about the relative amount of time spent inside spinlocks vs regular 
kernel.

> Considering this, I don't see much of an advantage to adding
> kernel preemption.

And now?

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:59           ` Daniel Phillips
  2002-01-08 21:08             ` Rik van Riel
@ 2002-01-08 21:10             ` Andrew Morton
  2002-01-08 21:17             ` Robert Love
                               ` (2 subsequent siblings)
  4 siblings, 0 replies; 351+ messages in thread
From: Andrew Morton @ 2002-01-08 21:10 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

Daniel Phillips wrote:
> 
> On January 8, 2002 08:47 pm, Andrew Morton wrote:
> > Daniel Phillips wrote:
> > > What a preemptible kernel can do that a non-preemptible kernel can't is:
> > > reschedule exactly as often as necessary, instead of having lots of extra
> > > schedule points inserted all over the place, firing when *they* think the
> > > time is right, which may well be earlier than necessary.
> >
> > Nope.  `if (current->need_resched)' -> the time is right (beyond right,
> > actually).
> 
> Oops, sorry, right.
> 
> The preemptible kernel can reschedule, on average, sooner than the
> scheduling-point kernel, which has to wait for a scheduling point to roll
> around.

That's theory.  Practice (ie: instrumentation) says that the preempt
patch makes little improvement over conditional yields in generic_file_read()
and generic_file_write().  Four lines.  Additional yields in wait_for_buffers()
(where we move zillions of buffers from BUF_LOCKED to BUF_CLEAN) and in submit_bh()
and bread() are cream.

Preemptability is global in its impact, and in its effect.  It requires
global changes to make it useful.  If we're prepared to make those
changes (scan_swap_map, truncate_inode_pages, etc) then fine.  Go for
it.  We'll end up with a better kernel.

> And while I'm enumerating differences, the preemptable kernel (in this
> incarnation) has a slight per-spinlock cost, while the non-preemptable kernel
> has the fixed cost of checking for rescheduling, at intervals throughout all
> 'interesting' kernel code, essentially all long-running loops.  But by clever
> coding it's possible to finesse away almost all the overhead of those loop
> checks, so in the end, the non-preemptible low-latency patch has a slight
> efficiency advantage here, with emphasis on 'slight'.
> 

As I said: I don't buy the efficiency worries at all.  If scheduling pressure
is so high that either approach impacts performance, then scheduling pressure
is too high.  We need to fix the context switch rate and/or speed up context
switches.  The overhead of conditional_schedule() or preempt will be zilch.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:59           ` Daniel Phillips
  2002-01-08 21:08             ` Rik van Riel
  2002-01-08 21:10             ` Andrew Morton
@ 2002-01-08 21:17             ` Robert Love
  2002-01-08 21:57               ` Daniel Phillips
  2002-01-14  1:22               ` Bill Davidsen
  2002-01-08 22:21             ` Andrew Morton
  2002-01-08 23:26             ` Luigi Genoni
  4 siblings, 2 replies; 351+ messages in thread
From: Robert Love @ 2002-01-08 21:17 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List

On Tue, 2002-01-08 at 15:59, Daniel Phillips wrote:

> And while I'm enumerating differences, the preemptable kernel (in this 
> incarnation) has a slight per-spinlock cost, while the non-preemptable kernel 
> has the fixed cost of checking for rescheduling, at intervals throughout all 
> 'interesting' kernel code, essentially all long-running loops.  But by clever 
> coding it's possible to finesse away almost all the overhead of those loop 
> checks, so in the end, the non-preemptible low-latency patch has a slight 
> efficiency advantage here, with emphasis on 'slight'.

True (re spinlock weight in preemptible kernel) but how is that not
comparable to explicit scheduling points?  Worse, the preempt-kernel
typically does its preemption on a branch on return to interrupt
(similar to user space's preemption).  What better time to check and
reschedule if needed?

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:17             ` Robert Love
@ 2002-01-08 21:57               ` Daniel Phillips
  2002-01-08 22:01                 ` Robert Love
  2002-01-14  1:22               ` Bill Davidsen
  1 sibling, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-08 21:57 UTC (permalink / raw)
  To: Robert Love
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List

On January 8, 2002 10:17 pm, Robert Love wrote:
> On Tue, 2002-01-08 at 15:59, Daniel Phillips wrote:
> 
> > And while I'm enumerating differences, the preemptable kernel (in this 
> > incarnation) has a slight per-spinlock cost, while the non-preemptable kernel 
> > has the fixed cost of checking for rescheduling, at intervals throughout all 
> > 'interesting' kernel code, essentially all long-running loops.  But by clever 
> > coding it's possible to finesse away almost all the overhead of those loop 
> > checks, so in the end, the non-preemptible low-latency patch has a slight 
> > efficiency advantage here, with emphasis on 'slight'.
> 
> True (re spinlock weight in preemptible kernel) but how is that not
> comparable to explicit scheduling points?  Worse, the preempt-kernel
> typically does its preemption on a branch on return to interrupt
> (similar to user space's preemption).  What better time to check and
> reschedule if needed?

The per-spinlock cost I was referring to is the cost of the inc/dec per
spinlock.  I guess this cost is small enough as to be hard to measure, but
I have not tried so I don't know.  Curiously, none of the people I've heard
making pronouncements on the overhead of your preempt patch seem to have 
measured it either.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:57               ` Daniel Phillips
@ 2002-01-08 22:01                 ` Robert Love
  0 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-08 22:01 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List

On Tue, 2002-01-08 at 16:57, Daniel Phillips wrote:

> > True (re spinlock weight in preemptible kernel) but how is that not
> > comparable to explicit scheduling points?  Worse, the preempt-kernel
> > typically does its preemption on a branch on return to interrupt
> > (similar to user space's preemption).  What better time to check and
> > reschedule if needed?
> 
> The per-spinlock cost I was referring to is the cost of the inc/dec per
> spinlock.  I guess this cost is small enough as to be hard to measure, but
> I have not tried so I don't know.  Curiously, none of the people I've heard
> making pronouncements on the overhead of your preempt patch seem to have 
> measured it either.

;-)
 
If they did I suspect it would be minimal.  Andrew's point on complexity
and overhead in this manner is exact -- such things are just not an
issue.

I see two valid arguments against kernel preemption, and I'll be the
first to admit them:

- we introduce new problems with kernel programming.  specifically, the
issue with implicitly locked per-CPU data.  honestly, this isn't a huge
deal.  I've been working on preempt-kernel for awhile now and the
problems we have found and fixed are minimal.  admittedly, however,
especially wrt the future, preempt-kernel may introduce new concerns.  I
say let's rise to meet them.

- we don't do enough for the worst-case latency.  this is where future
work is useful and where preempt-kernel provides the framework for a
better kernel.

I want a better kernel.  Hell, I want the best kernel.  In my opinion,
one factor of that is having a preemptible kernel.
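
To illustrate the first concern (implicitly locked per-CPU data), here is
a small user-space model of what can go wrong and how disabling preemption
around the access avoids it.  It is a sketch under simplifying
assumptions, not code from the preempt-kernel patch: `cpu_model`,
`maybe_migrate()` and `bump_counter()` are hypothetical, and "preemption"
is reduced to a pending migration to the other CPU.

```c
#include <assert.h>

#define NR_CPUS_MODEL 2

struct cpu_model {
	int current_cpu;      /* CPU the task is running on */
	int preempt_count;    /* > 0 means preemption is disabled */
	int migrate_pending;  /* a preemption that would move the task */
	long counter[NR_CPUS_MODEL];  /* per-CPU data with no lock of its own */
};

/* Called wherever the model allows a preemption to take effect. */
static void maybe_migrate(struct cpu_model *m)
{
	if (m->preempt_count == 0 && m->migrate_pending) {
		m->current_cpu = (m->current_cpu + 1) % NR_CPUS_MODEL;
		m->migrate_pending = 0;
	}
}

/*
 * Read-modify-write of "this CPU's" counter with a preemption window in
 * the middle.  With guard != 0 preemption is disabled across the update.
 * Returns 1 if the write landed while still on the CPU owning the slot.
 */
static int bump_counter(struct cpu_model *m, int guard)
{
	if (guard)
		m->preempt_count++;

	int cpu = m->current_cpu;   /* pick "my" slot */
	long v = m->counter[cpu];
	maybe_migrate(m);           /* preemption window mid-update */
	m->counter[cpu] = v + 1;    /* write back to the slot picked earlier */
	int on_owner = (cpu == m->current_cpu);

	if (guard) {
		m->preempt_count--;
		maybe_migrate(m);   /* deferred preemption happens here */
	}
	return on_owner;
}
```

Without the guard the task can end up writing another CPU's slot after
being moved; with it, the move is simply delayed until the update is done.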

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 21:17             ` Robert Love
  2002-01-08 21:57               ` Daniel Phillips
@ 2002-01-14  1:22               ` Bill Davidsen
  1 sibling, 0 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-14  1:22 UTC (permalink / raw)
  To: Linux Kernel Mailing List

On Tue, 8 Jan 2002, Robert Love wrote:

> On Tue, 2002-01-08 at 15:59, Daniel Phillips wrote:
> 
> > And while I'm enumerating differences, the preemptable kernel (in this 
> > incarnation) has a slight per-spinlock cost, while the non-preemptable kernel 
> > has the fixed cost of checking for rescheduling, at intervals throughout all 
> > 'interesting' kernel code, essentially all long-running loops.  But by clever 
> > coding it's possible to finesse away almost all the overhead of those loop 
> > checks, so in the end, the non-preemptible low-latency patch has a slight 
> > efficiency advantage here, with emphasis on 'slight'.
> 
> True (re spinlock weight in preemptible kernel) but how is that not
> comparable to explicit scheduling points?  Worse, the preempt-kernel
> typically does its preemption on a branch on return to interrupt
> (similar to user space's preemption).  What better time to check and
> reschedule if needed?

I'm not sure that preempt and low latency really are attacking the same
problem. What I am finding is the LL improves overall performance when a
process does something which is physically slow, like a find in a
directory with 20k files. On the other hand PK makes the response of the
system better to changes. In particular I see the DNS servers which have
other work running, even backups or reports, are more responsive with PK,
as are usenet news servers. I find it hard to measure "feels faster" with
either approach, although like the supreme court "I know it when I see
it."

I'd like to hope that some of each will get in the main kernel, PK has
been stable for me for a while, LL has never been unstable but I've run it
less.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:59           ` Daniel Phillips
                               ` (2 preceding siblings ...)
  2002-01-08 21:17             ` Robert Love
@ 2002-01-08 22:21             ` Andrew Morton
  2002-01-09  9:17               ` Daniel Phillips
  2002-01-08 23:26             ` Luigi Genoni
  4 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-08 22:21 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter N?tzel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

Daniel Phillips wrote:
> 
> On January 8, 2002 08:47 pm, Andrew Morton wrote:
> > Daniel Phillips wrote:
> > > What a preemptible kernel can do that a non-preemptible kernel can't is:
> > > reschedule exactly as often as necessary, instead of having lots of extra
> > > schedule points inserted all over the place, firing when *they* think the
> > > time is right, which may well be earlier than necessary.
> >
> > Nope.  `if (current->need_resched)' -> the time is right (beyond right,
> > actually).
> 
> Oops, sorry, right.
> 
> The preemptible kernel can reschedule, on average, sooner than the
> scheduling-point kernel, which has to wait for a scheduling point to roll
> around.
> 

Yes.  It can also fix problematic areas which my testing
didn't cover.

Incidentally, there's the SMP problem.  Suppose we
have the code:

	lock_kernel();
	for (lots) {
		do(something sucky);
		if (current->need_resched)
			schedule();
	}
	unlock_kernel();

This works fine on UP, but not on SMP.  The scenario:

- CPU A runs this loop.

- CPU B is spinning on the lock.

- Interrupt occurs, kernel elects to run RT task on CPU B.
  CPU A doesn't have need_resched set, and just keeps 
  on going.  CPU B is stuck spinning on the lock.

This is only an issue for the low-latency patch - all the
other approaches still have sufficiently bad worst-case that
this scenario isn't worth worrying about.

I toyed with creating spin_lock_while_polling_resched(),
but ended up changing the scheduler to set need_resched
against _all_ CPUs if an RT task is being woken (yes, yuk).
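
The workaround in the last paragraph can be pictured with a short
user-space model.  This is a sketch of the idea only, not the scheduler
change from the low-latency patch: `cpu_state` and `wake_task_model()` are
hypothetical names, and the choice of target CPU is passed in rather than
computed.

```c
#include <assert.h>

#define NR_CPUS_MODEL 4

struct cpu_state {
	int need_resched;  /* polled by scheduling points on this CPU */
};

/*
 * Model of a wakeup.  `target` is the CPU chosen to run the woken task.
 * For an ordinary task only that CPU is flagged.  For an RT task every
 * CPU is flagged, so a lock holder on another CPU also reaches its next
 * scheduling point and lets go of the lock the target is spinning on.
 * Returns the number of CPUs flagged by this call.
 */
static int wake_task_model(struct cpu_state cpus[], int target, int is_rt)
{
	int flagged = 0;

	assert(target >= 0 && target < NR_CPUS_MODEL);
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (is_rt || cpu == target) {
			cpus[cpu].need_resched = 1;
			flagged++;
		}
	}
	return flagged;
}
```

The cost is visible in the model: an RT wakeup disturbs every CPU, whether
or not it holds the contended lock, which is presumably the reason for the
"yuk".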

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 22:21             ` Andrew Morton
@ 2002-01-09  9:17               ` Daniel Phillips
  2002-01-09  9:26                 ` Andrew Morton
  0 siblings, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-09  9:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter N?tzel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

On January 8, 2002 11:21 pm, Andrew Morton wrote:
> Daniel Phillips wrote:
> > The preemptible kernel can reschedule, on average, sooner than the
> > scheduling-point kernel, which has to wait for a scheduling point to roll
> > around.
> 
> Yes.  It can also fix problematic areas which my testing
> didn't cover.

I bet, with a minor hack, it can help you *find* those problem areas too.  
You compile the two patches together and automatically log any event along 
with the execution address, where your explicit schedule points failed to 
reschedule in time.  Sort of like a profile but suited exactly to your 
problem.

This just detects the problem areas in normal kernel execution, not 
spinlocks, but that is probably where most of the maintenance will be anyway.

By the way, did you check for latency in directory operations?

> Incidentally, there's the SMP problem.  Suppose we
> have the code:
> 
> 	lock_kernel();
> 	for (lots) {
> 		do(something sucky);
> 		if (current->need_resched)
> 			schedule();
> 	}
> 	unlock_kernel();
> 
> This works fine on UP, but not on SMP.  The scenario:
> 
> - CPU A runs this loop.
> 
> - CPU B is spinning on the lock.
> 
> - Interrupt occurs, kernel elects to run RT task on CPU B.
>   CPU A doesn't have need_resched set, and just keeps 
>   on going.  CPU B is stuck spinning on the lock.
> 
> This is only an issue for the low-latency patch - all the
> other approaches still have sufficiently bad worst-case that
> this scenario isn't worth worrying about.
> 
> I toyed with creating spin_lock_while_polling_resched(),
> but ended up changing the scheduler to set need_resched
> against _all_ CPUs if an RT task is being woken (yes, yuk).

Heh, subtle.  Thanks for pointing that out and making my head hurt.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  9:17               ` Daniel Phillips
@ 2002-01-09  9:26                 ` Andrew Morton
  2002-01-09  9:48                   ` Daniel Phillips
  0 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-09  9:26 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter N?tzel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

Daniel Phillips wrote:
> 
> On January 8, 2002 11:21 pm, Andrew Morton wrote:
> > Daniel Phillips wrote:
> > > The preemptible kernel can reschedule, on average, sooner than the
> > > scheduling-point kernel, which has to wait for a scheduling point to roll
> > > around.
> >
> > Yes.  It can also fix problematic areas which my testing
> > didn't cover.
> 
> I bet, with a minor hack, it can help you *find* those problem areas too.
> You compile the two patches together and automatically log any event along
> with the execution address, where your explicit schedule points failed to
> reschedule in time.  Sort of like a profile but suited exactly to your
> problem.

Well, one of the instrumentation patches which I use detects a
scheduling overrun at interrupt time and emits an all-CPU backtrace.
You just feed the trace into ksymoops or gdb then go stare at
the offending code.  

That's the easy part - the hard part is getting sufficient coverage.
There are surprising places.  close_files(), exit_notify(), ...
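
For readers who have not seen that kind of instrumentation, the detection
step can be sketched in user space as below.  This is an assumption-laden
model, not the actual instrumentation patch: `overrun_model` and its
helpers are hypothetical, time is an abstract tick count, and where the
real tool emits an all-CPU backtrace the model just records a label for
the interrupted code.

```c
#include <assert.h>
#include <stddef.h>

struct overrun_model {
	int pending;              /* a reschedule has been requested */
	unsigned long since;      /* tick at which it was requested */
	unsigned long limit;      /* longest tolerated delay, in ticks */
	unsigned long overruns;   /* number of overruns reported */
	const char *last_site;    /* code interrupted at the last overrun */
};

/* Record when a reschedule was first requested. */
static void mark_need_resched(struct overrun_model *m, unsigned long now)
{
	if (!m->pending) {
		m->pending = 1;
		m->since = now;
	}
}

/* The task finally reached a scheduling point. */
static void did_schedule(struct overrun_model *m)
{
	m->pending = 0;
}

/*
 * Called from the modelled timer interrupt.  `site` names the code that
 * was interrupted.  Returns 1 if an overrun was reported, 0 otherwise.
 */
static int timer_tick_check(struct overrun_model *m, unsigned long now,
			    const char *site)
{
	if (m->pending && now - m->since > m->limit) {
		m->overruns++;
		m->last_site = site;
		return 1;
	}
	return 0;
}
```

As the message says, the check itself is the easy part; it only reports
code paths that the test workload actually exercises.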

> This just detects the problem areas in normal kernel execution, not
> spinlocks, but that is probably where most of the maintenance will be anyway.
> 
> By the way, did you check for latency in directory operations?

Yes.  They can be very bad for really large directories.  Scheduling
on the found-in-cache case in bread() kills that one easily for most
local filesystems.  There may still be a problem in ext2.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  9:26                 ` Andrew Morton
@ 2002-01-09  9:48                   ` Daniel Phillips
  0 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-09  9:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Anton Blanchard, Andrea Arcangeli, Luigi Genoni, Dieter N?tzel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

On January 9, 2002 10:26 am, Andrew Morton wrote:
> Daniel Phillips wrote:
> > By the way, did you check for latency in directory operations?
> 
> Yes.  They can be very bad for really large directories.  Scheduling
> on the found-in-cache case in bread() kills that one easily for most
> local filesystems.  There may still be a problem in ext2.

An indexed directory won't have that problem - I'll get to finishing off the
htree patch pretty soon[1].  In any event, the analogous technique will work: 
a schedule point in ext2_bread.

[1] Wli's hash work is happening at a convenient time.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 20:59           ` Daniel Phillips
                               ` (3 preceding siblings ...)
  2002-01-08 22:21             ` Andrew Morton
@ 2002-01-08 23:26             ` Luigi Genoni
  2002-01-09  6:36               ` Daniel Phillips
  4 siblings, 1 reply; 351+ messages in thread
From: Luigi Genoni @ 2002-01-08 23:26 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love



On Tue, 8 Jan 2002, Daniel Phillips wrote:

> On January 8, 2002 08:47 pm, Andrew Morton wrote:
> > Daniel Phillips wrote:
> > > What a preemptible kernel can do that a non-preemptible kernel can't is:
> > > reschedule exactly as often as necessary, instead of having lots of extra
> > > schedule points inserted all over the place, firing when *they* think the
> > > time is right, which may well be earlier than necessary.
> >
> > Nope.  `if (current->need_resched)' -> the time is right (beyond right,
> > actually).
>
> Oops, sorry, right.
>
> The preemptible kernel can reschedule, on average, sooner than the
> scheduling-point kernel, which has to wait for a scheduling point to roll
> around.
>
mmhhh. At which cost? And then anyway if I have a spinlock, I still have
to wait for a scheduling point to roll around.




^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 23:26             ` Luigi Genoni
@ 2002-01-09  6:36               ` Daniel Phillips
  0 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-09  6:36 UTC (permalink / raw)
  To: Luigi Genoni
  Cc: Andrew Morton, Anton Blanchard, Andrea Arcangeli, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Robert Love

On January 9, 2002 12:26 am, Luigi Genoni wrote:
> On Tue, 8 Jan 2002, Daniel Phillips wrote:
> 
> > On January 8, 2002 08:47 pm, Andrew Morton wrote:
> > > Daniel Phillips wrote:
> > > > What a preemptible kernel can do that a non-preemptible kernel can't is:
> > > > reschedule exactly as often as necessary, instead of having lots of extra
> > > > schedule points inserted all over the place, firing when *they* think the
> > > > time is right, which may well be earlier than necessary.
> > >
> > > Nope.  `if (current->need_resched)' -> the time is right (beyond right,
> > > actually).
> >
> > Oops, sorry, right.
> >
> > The preemptible kernel can reschedule, on average, sooner than the
> > scheduling-point kernel, which has to wait for a scheduling point to roll
> > around.
>
> Mmh. At what cost? And anyway, if I hold a spinlock, I still have
> to wait for a scheduling point to roll around.

Did you read the thread?  Think about the relative amount of time spent in
spinlocks vs the amount of time spent in the regular kernel.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 19:47         ` Andrew Morton
                             ` (2 preceding siblings ...)
  2002-01-08 20:59           ` Daniel Phillips
@ 2002-01-08 21:08           ` Robert Love
  2002-01-09  0:31           ` Oliver Xymoron
  4 siblings, 0 replies; 351+ messages in thread
From: Robert Love @ 2002-01-08 21:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Phillips, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List

On Tue, 2002-01-08 at 14:47, Andrew Morton wrote:

> > What a preemptible kernel can do that a non-preemptible kernel can't is:
> > reschedule exactly as often as necessary, instead of having lots of extra
> > schedule points inserted all over the place, firing when *they* think the
> > time is right, which may well be earlier than necessary.
> 
> Nope.  `if (current->need_resched)' -> the time is right (beyond right,
> actually).

Eh, I disagree here.  The right time is the moment a high-priority task
becomes runnable.  Given your HZ, only a fully preemptible kernel can
come close to meeting that.

> > Finally, with preemption, rescheduling can be forced with essentially zero
> > latency in response to an arbitrary interrupt such as IO completion, whereas
> > the non-preemptive kernel will have to 'coast to a stop'.  In other words,
> > the non-preemptive kernel will have little lags between successive IOs,
> > whereas the preemptive kernel can submit the next IO immediately.  So there
> > are bound to be loads where the preemptive kernel turns in better latency
> > *and throughput* than the scheduling point hack.
> 
> Latency yes.  Throughput no.

I bet in _many_ (most?) workloads the preemptible kernel turns in better
throughput.  Anytime there is load on the system, there should be a
benefit.  I bet the same goes for your patch.  I've certainly verified
it for both in various loads myself.

> I don't think the "preempt slows down the kernel" argument is very valid
> really.  Let's invert the argument - Linux is multitasking, and that has a
> cost.  There's no reason why certain bits of the kernel need to violate that
> just to get a bit more throughput.  If it really worries you, set HZ=10 and
> increase all the timeslices, etc.

Very well said.  I always find an answer to the "more complexity, more
context switching, blah blah" arguments as ultimately being arguments
tantamount to "preemptive multitasking sucks".

> Now, there *may* be overheads added due to losing the implicit locking which
> per-CPU data gives you.

Perhaps, but note what preempt enable and disable statements effectively
are: an inc and a dec.  Not even atomic.

Yes, there is a branch on reenable.  This may be an interesting change
to look into.  FWIW, we have a construct that doesn't check for
reschedule on reenable, too.
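
To make the "an inc and a dec" point concrete, here is a minimal user-space sketch. It is an assumption-laden illustration, not the preempt patch itself: the counter is a single global rather than per-CPU, `schedule_stub()` is a hypothetical stand-in for schedule(), and the function names are chosen for readability rather than copied from the patch. Disable is an increment, enable is a decrement plus one branch, and the second enable variant skips the branch.

```c
#include <assert.h>
#include <stdbool.h>

static int  preempt_count;   /* per-CPU in a real kernel; one global here */
static bool need_resched;    /* set when a higher-priority task is runnable */
static int  schedule_calls;  /* how often we actually rescheduled */

/* Hypothetical stand-in for schedule(). */
static void schedule_stub(void)
{
    schedule_calls++;
    need_resched = false;
}

/* Entering a non-preemptible region: a plain, non-atomic increment. */
static void preempt_disable(void)
{
    preempt_count++;
}

/* Leaving a region without checking for a pending reschedule. */
static void preempt_enable_no_resched(void)
{
    assert(preempt_count > 0);   /* unbalanced enable would be a bug */
    preempt_count--;
}

/*
 * Leaving a region: decrement, then the one branch mentioned above.
 * Only the outermost enable (count back to zero) may reschedule.
 */
static void preempt_enable(void)
{
    assert(preempt_count > 0);
    if (--preempt_count == 0 && need_resched)
        schedule_stub();
}
```

Nested regions behave as expected in this model: a reschedule requested inside two nested disables is deferred until the outer enable runs.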

> The main cost of preempt IMO is in complexity and stability risks.
> 
> (BTW: I took a weird oops testing the preempt patch on an SMP NFS client.
> The fault address was 0x0aXXXXXX.  No useful backtrace, unfortunately).

Should have sent me the oops :)

> > Mind you, I'm not devaluing Andrew's work, it's good and valuable.  However
> > it's good to be aware of why that approach can't equal the latency-busting
> > performance of the preemptive approach.
> 
> There's no point in just merging the preempt patch and saying "there,
> that's done".  It doesn't do anything.
> 
> Instead, a decision needs to be made: "Linux will henceforth be a 
> low-latency kernel".  Now, IF we can come to this decision, then
> internal preemption is the way to do it.  But it affects ALL kernel
> developers.  Because we'll need to introduce a new rule: "it is a
> bug to spend more than five milliseconds holding any locks".
> 
> So.  Do we want a low-latency kernel?  Are we prepared to mandate
> the five-millisecond rule?   It can be done, but won't be easy, and
> we'll never get complete coverage.  But I don't see the will around
> here.

I agree here, but then I do have three points:

- proper lock use that benefits SMP scalability benefits preempt-kernel
induced latency improvements.  In other words, things like short lock
durations, fine-grained locking, and ditching the BKL benefit both
worlds.

- with the preemptible kernel in place, we can look at the long-held
locks and figure ways to combat them.  The preempt-stats patch helps us
find them.  Then, we can take a lock-break approach.  We can look into
finer grained locks.  We can localize the lock if it is global or
the BKL.  Or we can do something radical like make long-held spinlocks
priority-inheriting when preempt-kernel is enabled.  In other words,
preempt-kernel becomes a framework for proper solutions in the future.

- finally, the usual: it's an option.

	Robert Love


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 19:47         ` Andrew Morton
                             ` (3 preceding siblings ...)
  2002-01-08 21:08           ` Robert Love
@ 2002-01-09  0:31           ` Oliver Xymoron
  2002-01-09  9:50             ` Helge Hafting
  4 siblings, 1 reply; 351+ messages in thread
From: Oliver Xymoron @ 2002-01-09  0:31 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Daniel Phillips, Anton Blanchard, Andrea Arcangeli, Luigi Genoni,
	Dieter Nützel, Marcelo Tosatti, Rik van Riel, Linux Kernel List,
	Robert Love

On Tue, 8 Jan 2002, Andrew Morton wrote:

> > What a preemptible kernel can do that a non-preemptible kernel can't is:
> > reschedule exactly as often as necessary, instead of having lots of extra
> > schedule points inserted all over the place, firing when *they* think the
> > time is right, which may well be earlier than necessary.
>
> Nope.  `if (current->need_resched)' -> the time is right (beyond right,
> actually).

Have we ever considered making rescheduling work like get_user? That is,
make current->need_resched be a pointer, and if we need to reschedule,
make it an INVALID pointer that causes us to fault and call schedule in
its fault path?

Orthogonally, for rescheduling points with locks, we could build a version
of the spinlocks that know when they're blocking other processes and can
do a spin_yield(&lock) in places where they can safely give up a lock. On
single processor, spin_yield could translate to a scheduling point.
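
A rough single-threaded model of the spin_yield() idea, purely as a sketch: `struct yield_lock`, its `waiters` counter and `process_items()` are hypothetical names invented for this illustration, and real contention is faked by pretending one waiter runs to completion each time the owner lets go. The point is only the shape of the call: at a safe point the holder asks "is anyone waiting?" and drops and retakes the lock only if so.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration; not the kernel's spinlock_t. */
struct yield_lock {
    bool held;     /* true while the simulated owner holds the lock */
    int  waiters;  /* how many others are (pretend) spinning on it */
    int  yields;   /* how many times the owner gave the lock up */
};

static void yl_lock(struct yield_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void yl_unlock(struct yield_lock *l)
{
    assert(l->held);
    l->held = false;
}

/*
 * Called only where it is safe to lose the lock. Returns true if the
 * lock was dropped and retaken, so the caller knows to revalidate state.
 */
static bool spin_yield(struct yield_lock *l)
{
    assert(l->held);
    if (l->waiters == 0)
        return false;    /* nobody is blocked: keep going, no cost */
    l->held = false;     /* release; a real waiter would get in here */
    l->waiters--;        /* simulation: one waiter ran to completion */
    l->yields++;
    l->held = true;      /* retake the lock and continue */
    return true;
}

/* A long lock-holding loop with a yield point after each item. */
static int process_items(struct yield_lock *l, int nitems)
{
    int done = 0;

    yl_lock(l);
    for (int i = 0; i < nitems; i++) {
        done++;          /* "work" on one item under the lock */
        spin_yield(l);   /* safe point: let a waiter in if there is one */
    }
    yl_unlock(l);
    return done;
}
```

With two pretend waiters and five items, the owner yields exactly twice and then runs the remaining items without giving up the lock again.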

-- 
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.."

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09  0:31           ` Oliver Xymoron
@ 2002-01-09  9:50             ` Helge Hafting
  0 siblings, 0 replies; 351+ messages in thread
From: Helge Hafting @ 2002-01-09  9:50 UTC (permalink / raw)
  To: Oliver Xymoron, linux-kernel

Oliver Xymoron wrote:
[...]
> Have we ever considered making rescheduling work like get_user? That is,
> make current->need_resched be a pointer, and if we need to reschedule,
> make it an INVALID pointer that causes us to fault and call schedule in
> its fault path?

Elegant perhaps, but now you take the time to do a completely unnecessary
page fault when rescheduling.  This has a cost which is high on some
architectures.  But the point of rescheduling was to improve interactive
performance and io latency.  Every page fault may have to check for this
case.

Helge Hafting

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 13:21   ` Andrea Arcangeli
  2002-01-08 13:33     ` Anton Blanchard
@ 2002-01-08 17:41     ` Luigi Genoni
  2002-01-14  0:46     ` Bill Davidsen
  2 siblings, 0 replies; 351+ messages in thread
From: Luigi Genoni @ 2002-01-08 17:41 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Dieter Nützel, Marcelo Tosatti, Rik van Riel,
	Linux Kernel List, Andrew Morton, Robert Love



On Tue, 8 Jan 2002, Andrea Arcangeli wrote:

> On Tue, Jan 08, 2002 at 11:55:59AM +0100, Luigi Genoni wrote:
> >
> >
> > On Tue, 8 Jan 2002, Dieter [iso-8859-15] Nützel wrote (passim):
> >
> > > Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or
> > > -rmap?
> > [...]
> > > Maybe preemption? It is disengageable so nobody should be harmed but we get
> > > the chance for wider testing.
> > >
> > > Any comments?
> > preemption?? this is eventually 2.5 stuff, and should not be integrated
>
> indeed ("eventually" in the italian sense btw, obvious to me, but not
> for l-k).
>
> I'm not against preemption (I can see the benefits about the mean
> latency for real time DSP) but the claims about preemption making the
> kernel faster doesn't make sense to me. more frequent scheduling,
> overhead of branches in the locks (you've to conditional_schedule after
> the last preemption lock is released and the cachelines for the per-cpu
> preemption locks) and the other preemption stuff can only make the
> kernel slower.  Furthermore, for multimedia playback any sane kernel out
> there with lowlatency fixes applied will work as well as a preemption
> kernel that pays for all the preemption overhead.
I would add that preemption simply gives a feeling of more speed with
interactive usage (with one single user on the system), and also has some
advantages for dedicated servers, but outside those conditions it never
showed, in my experience, a real and decisive advantage.
Of course we are supposing that the preemptive scheduler is very well
done, because otherwise (with a badly working scheduler) there is nothing
worse than preemption.
>
> About the other claim that as the kernel becomes more granular
> performance will increase with preemption in kernel, that's obviously
> wrong as well, it's clearly the other way around. Maybe it was meant
> "latency will decrease further", that's right, but also performance will
> decrease if anything.
>
> So yes, mean latency will decrease with preemptive kernel, but your CPU
> is definitely paying something for it.
agreed. Obviously this choice depends on what you want to do with your
system. If you have more than a couple of interactive users (and here I
have also 50 interactive users at the same time on every single system),
preemption is not what you want, period. If you have a desktop system,
well, it is a different situation.
>
> > into 2.4 stable tree. Of course a backport is possible, when/if it will be
> > quite well tested and well working on 2.5
> >
> Andrea


Luigi


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08 13:21   ` Andrea Arcangeli
  2002-01-08 13:33     ` Anton Blanchard
  2002-01-08 17:41     ` Luigi Genoni
@ 2002-01-14  0:46     ` Bill Davidsen
  2002-01-14  1:14       ` yodaiken
                         ` (5 more replies)
  2 siblings, 6 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-14  0:46 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: Linux Kernel Mailing List

On Tue, 8 Jan 2002, Andrea Arcangeli wrote:

> I'm not against preemption (I can see the benefits about the mean
> latency for real time DSP) but the claims about preemption making the
> kernel faster doesn't make sense to me. more frequent scheduling,
> overhead of branches in the locks (you've to conditional_schedule after
> the last preemption lock is released and the cachelines for the per-cpu
> preemption locks) and the other preemption stuff can only make the
> kernel slower.  Furthermore, for multimedia playback any sane kernel out
> there with lowlatency fixes applied will work as well as a preemption
> kernel that pays for all the preemption overhead.

I'm not sure I have seen claims that it makes the kernel faster, but it
sure makes the latency lower, and improves performance for systems doing a
lot of network activity (DNS servers) with anything else going on in the
systems, such as daily reports and backups.

I will try the low latency kernel stuff, but I think intrinsically that if
you want to service the incoming requests quickly you have to dispatch to
them quickly, not at the end of a time slice. Preempt is a way to avoid
having to play with RT processes, and I think it's desirable in general as
an option where the load will benefit from such behaviour.

I'm not sure it "competes" with low latency, since many of the things LL is
doing are "good things" in general.

Finally, I doubt that any of this will address my biggest problem with
Linux, which is that as memory gets cheap a program doing significant disk
writing can get buffers VERY full (perhaps a whole CD's worth) before the
kernel decides to do the write, at which point the system becomes
non-responsive for seconds at a time while the disk light comes on and
stays on. That's another problem, and I did play with some patches this
weekend without making myself really happy :-( Another topic,
unfortunately.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:46     ` Bill Davidsen
@ 2002-01-14  1:14       ` yodaiken
  2002-01-14  2:04       ` Andrew Morton
                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 351+ messages in thread
From: yodaiken @ 2002-01-14  1:14 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Andrea Arcangeli, Linux Kernel Mailing List

On Sun, Jan 13, 2002 at 07:46:54PM -0500, Bill Davidsen wrote:
> Finally, I doubt that any of this will address my biggest problem with
> Linux, which is that as memory gets cheap a program doing significant disk
> writing can get buffers VERY full (perhaps a whole CD's worth) before the
> kernel decides to do the write, at which point the system becomes
> non-responsive for seconds at a time while the disk light comes on and
> stays on. That's another problem, and I did play with some patches this
> weekend without making myself really happy :-( Another topic,
> unfortunately.

I think this is a critical problem. I'd like to be able to have some
assurance that a task with a buffer of size N doing read-disk->write-disk
will maintain data flow at some minimal rate over intervals of 1 or 2
seconds or something like that.


^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:46     ` Bill Davidsen
  2002-01-14  1:14       ` yodaiken
@ 2002-01-14  2:04       ` Andrew Morton
  2002-01-16 15:15         ` Bill Davidsen
  2002-01-14  9:08       ` Daniel Phillips
                         ` (3 subsequent siblings)
  5 siblings, 1 reply; 351+ messages in thread
From: Andrew Morton @ 2002-01-14  2:04 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Andrea Arcangeli, Linux Kernel Mailing List

Bill Davidsen wrote:
> 
> Finally, I doubt that any of this will address my biggest problem with
> Linux, which is that as memory gets cheap a program doing significant disk
> writing can get buffers VERY full (perhaps a whole CD's worth) before the
> kernel decides to do the write, at which point the system becomes
> non-responsive for seconds at a time while the disk light comes on and
> stays on. That's another problem, and I did play with some patches this
> weekend without making myself really happy :-( Another topic,
> unfortunately.

/proc/sys/vm/bdflush: Decreasing the kupdate interval from five
seconds, decreasing the nfract and nfract_sync setting in there
should smooth this out.  The -aa patches add start and stop
levels for bdflush as well, which means that bdflush can be the
one who blocks on IO rather than your process.  And it means that
the request queue doesn't get 100% drained as soon as the writer
hits nfract_sync.

All very interesting and it will be fun to play with when it
*finally* gets merged.

But with the current elevator design, disk read latencies will
still be painful.

-

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  2:04       ` Andrew Morton
@ 2002-01-16 15:15         ` Bill Davidsen
  0 siblings, 0 replies; 351+ messages in thread
From: Bill Davidsen @ 2002-01-16 15:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Andrea Arcangeli, Linux Kernel Mailing List

On Sun, 13 Jan 2002, Andrew Morton wrote:

> Bill Davidsen wrote:
> > 
> > Finally, I doubt that any of this will address my biggest problem with
> > Linux, which is that as memory gets cheap a program doing significant disk
> > writing can get buffers VERY full (perhaps a whole CD's worth) before the
> > kernel decides to do the write, at which point the system becomes
> > non-responsive for seconds at a time while the disk light comes on and
> > stays on.

> /proc/sys/vm/bdflush: Decreasing the kupdate interval from five
> seconds, decreasing the nfract and nfract_sync setting in there
> should smooth this out.  The -aa patches add start and stop
> levels for bdflush as well, which means that bdflush can be the
> one who blocks on IO rather than your process.  And it means that
> the request queue doesn't get 100% drained as soon as the writer
> hits nfract_sync.

Been there, done that. Makes it "less bad" if the right settings are
chosen. I will try -aa on 2.4.18-pre3 (or 4 if the patch is out), I've
been trying -ac this morning. Looking at the code, it doesn't look as if
the logic is what I want to see, no matter how tuned. Last night I tried a
patch, and several of the "unused" elements in the bdflush were reused, but
I froze w/o any io for seconds, so I don't have it right.

What I want is a smooth increase in how much is written as the dirty
buffers increase. Percentage of {anything} may be the wrong thing to use;
the problem is dirty buffers vs. disk write bandwidth. On a 2GB machine
it's a smaller percentage than on a 128M machine, but the absolute numbers
seem to be similar.
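
As a rough illustration of what "a smooth increase" could look like, here is a small C sketch. It is not bdflush code and does not correspond to any existing tunable: `writeback_batch()` and its parameters `start`, `limit` and `max_batch` are hypothetical, and the linear ramp is just one possible shape. The thresholds are absolute page counts rather than percentages of memory, so the same numbers would apply on a 128M and a 2GB machine.

```c
#include <assert.h>

/*
 * How many dirty pages to write out on this pass.
 *
 *   dirty     - current number of dirty pages
 *   start     - below this, write nothing (absolute count, not a percentage)
 *   limit     - at or above this, write at full speed
 *   max_batch - pages per pass the disk is assumed to sustain
 *
 * Between start and limit the batch grows linearly, so write-out ramps
 * up gradually instead of switching from idle to flat out.
 */
static unsigned long writeback_batch(unsigned long dirty,
                                     unsigned long start,
                                     unsigned long limit,
                                     unsigned long max_batch)
{
    assert(limit > start);   /* the ramp needs a non-empty range */

    if (dirty <= start)
        return 0;
    if (dirty >= limit)
        return max_batch;
    return (dirty - start) * max_batch / (limit - start);
}
```

For example, with start=1000, limit=5000 and max_batch=400, a system with 3000 dirty pages would write 200 pages on this pass, and one with 2000 dirty pages would write 100.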

> All very interesting and it will be fun to play with when it
> *finally* gets merged.
> 
> But with the current elevator design, disk read latencies will
> still be painful.

There are a few patches around to change that. Note I didn't say "fix"
just change.

Finally, one of my goals is to be able to keep a large free page pool. I
have two apps which will suddenly need another 8MB or so, and if they have
to wait for disk they become unpleasant. With current memory prices I
don't mind "wasting" 256MB or so, if it means I get better response when I
want it. This is all part of tuning the system to application, I certainly
wouldn't use it on other machines.

Better doc of bdflush wouldn't be amiss, either. When/if it stops changing
I will clean up my notes from a stack of 3x5 cards to something useful and
make them available. If there's a doc giving what the bdflush values do
(in current kernels) and what happens when you change them, and what to
tune first if you have this problem, I haven't found it. 

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:46     ` Bill Davidsen
  2002-01-14  1:14       ` yodaiken
  2002-01-14  2:04       ` Andrew Morton
@ 2002-01-14  9:08       ` Daniel Phillips
  2002-01-14  9:08       ` Daniel Phillips
                         ` (2 subsequent siblings)
  5 siblings, 0 replies; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14  9:08 UTC (permalink / raw)
  To: Bill Davidsen, Andrea Arcangeli; +Cc: Linux Kernel Mailing List

On January 14, 2002 01:46 am, Bill Davidsen wrote:
> Finally, I doubt that any of this will address my biggest problem with
> Linux, which is that as memory gets cheap a program doing significant disk
> writing can get buffers VERY full (perhaps a whole CD's worth) before the
> kernel decides to do the write, at which point the system becomes
> non-responsive for seconds at a time while the disk light comes on and
> stays on.  That's another problem, and I did play with some patches this
> weekend without making myself really happy :-( Another topic,
> unfortunately.

Patience, this is understood, a solution is known and a fix is in the 
pipeline.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14  0:46     ` Bill Davidsen
                         ` (4 preceding siblings ...)
  2002-01-14  9:08       ` Daniel Phillips
@ 2002-01-14 16:40       ` Daniel Phillips
  2002-01-14 17:42         ` Alan Cox
  5 siblings, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-14 16:40 UTC (permalink / raw)
  To: Bill Davidsen, Andrea Arcangeli; +Cc: Linux Kernel Mailing List

On January 14, 2002 01:46 am, Bill Davidsen wrote:
> Finally, I doubt that any of this will address my biggest problem with
> Linux, which is that as memory gets cheap a program doing significant disk
> writing can get buffers VERY full (perhaps a whole CD's worth) before the
> kernel decides to do the write, at which point the system becomes
> non-responsive for seconds at a time while the disk light comes on and
> stays on.  That's another problem, and I did play with some patches this
> weekend without making myself really happy :-( Another topic,
> unfortunately.

Patience, the problem is understood and there will be a fix in the 2.5 
timeframe.

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 16:40       ` Daniel Phillips
@ 2002-01-14 17:42         ` Alan Cox
  2002-01-14 21:28           ` J Sloan
  0 siblings, 1 reply; 351+ messages in thread
From: Alan Cox @ 2002-01-14 17:42 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Bill Davidsen, Andrea Arcangeli, Linux Kernel Mailing List

> > stays on.  That's another problem, and I did play with some patches this
> > weekend without making myself really happy :-( Another topic,
> > unfortunately.
> 
> Patience, the problem is understood and there will be a fix in the 2.5 
> timeframe.

Without a fix in the 2.4 timeframe everyone has to run 2.2. That strikes
me as decidedly non optimal. If you are having VM problems try both the
Andrea -aa and the Rik rmap-11b patches (*not together*) and report back

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-14 17:42         ` Alan Cox
@ 2002-01-14 21:28           ` J Sloan
  0 siblings, 0 replies; 351+ messages in thread
From: J Sloan @ 2002-01-14 21:28 UTC (permalink / raw)
  To: Alan Cox
  Cc: Daniel Phillips, Bill Davidsen, Andrea Arcangeli,
	Linux Kernel Mailing List

Alan Cox wrote:

>>>stays on.  That's another problem, and I did play with some patches this
>>>weekend without making myself really happy :-( Another topic,
>>>unfortunately.
>>>
>>Patience, the problem is understood and there will be a fix in the 2.5 
>>timeframe.
>>
>
>Without a fix in the 2.4 timeframe everyone has to run 2.2. That strikes
>me as decidedly non optimal. If you are having VM problems try both the
>Andrea -aa and the Rik rmap-11b patches (*not together*) and report back
>
Easiest is to grab 2.4.17 and apply 2.4.18pre2 and 2.4.18pre2-aa2 -

pre2-aa2 has all the fixes and tweaks I had been doing by hand.

cu

jjs

>



^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-08  3:02 [2.4.17/18pre] VM and swap - it's really unusable Dieter Nützel
  2002-01-08 10:55 ` Luigi Genoni
@ 2002-01-08 13:51 ` J.A. Magallon
       [not found] ` <E16OHLf-0000Dn-00@starship.berlin>
  2 siblings, 0 replies; 351+ messages in thread
From: J.A. Magallon @ 2002-01-08 13:51 UTC (permalink / raw)
  To: Dieter Nützel
  Cc: Marcelo Tosatti, Andrea Arcangeli, Rik van Riel,
	Linux Kernel List, Andrew Morton, Robert Love


On 20020108 Dieter Nützel wrote:
>Is it possible to decide, now what should go into 2.4.18 (maybe -pre3) -aa or 
>-rmap?
>Andrew Morten`s read-latency.patch is a clear winner for me, too.
>What about 00_nanosleep-5 and bootmem?
>The O(1) scheduler?
>Maybe preemption? It is disengageable so nobody should be harmed but we get 
>the chance for wider testing.
>
>Any comments?
>

I would prefer the ton of small, useful and safe bits in Andrea's kernel
(vm-21, cache-aligned-spinlocks, compiler, gcc3, rwsem, highmem fixes...)

-- 
J.A. Magallon                           #  Let the source be with you...        
mailto:jamagallon@able.es
Mandrake Linux release 8.2 (Cooker) for i586
Linux werewolf 2.4.18-pre2-beo #1 SMP Tue Jan 8 03:18:18 CET 2002 i686

^ permalink raw reply	[flat|nested] 351+ messages in thread

[parent not found: <E16OHLf-0000Dn-00@starship.berlin>]

[parent not found: <20020109145509.G1543@inspiron.school.suse.de>]

* Re: [2.4.17/18pre] VM and swap - it's really unusable
       [not found]   ` <20020109145509.G1543@inspiron.school.suse.de>
@ 2002-01-09 14:07     ` Daniel Phillips
  2002-01-09 14:22       ` Andrea Arcangeli
  0 siblings, 1 reply; 351+ messages in thread
From: Daniel Phillips @ 2002-01-09 14:07 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Robert Love, Anton Blanchard, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Andrew Morton

On January 9, 2002 02:55 pm, Andrea Arcangeli wrote:
> On Wed, Jan 09, 2002 at 12:56:50PM +0100, Daniel Phillips wrote:
> > BTW, I find your main argument confusing.  First you don't want -preempt with
> > CONFIG_EXPERIMENTAL because it might not get wide enough testing, so you want 
> > to enable it by default in the mainline kernel, then you argue it's too risky 
> > because everybody will use it and it might break some obscure driver.  Sorry, 
> > you lost me back there.
> 
> the point I am making is very simple: _if_ we include it, it should _not_
> be a config option.

That doesn't make any sense to me.  Why should _SMP be a config option and not
_PREEMPT?

--
Daniel

^ permalink raw reply	[flat|nested] 351+ messages in thread

* Re: [2.4.17/18pre] VM and swap - it's really unusable
  2002-01-09 14:07     ` Daniel Phillips
@ 2002-01-09 14:22       ` Andrea Arcangeli
  0 siblings, 0 replies; 351+ messages in thread
From: Andrea Arcangeli @ 2002-01-09 14:22 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Robert Love, Anton Blanchard, Luigi Genoni, Dieter Nützel,
	Marcelo Tosatti, Rik van Riel, Linux Kernel List, Andrew Morton

On Wed, Jan 09, 2002 at 03:07:40PM +0100, Daniel Phillips wrote:
> On January 9, 2002 02:55 pm, Andrea Arcangeli wrote:
> > On Wed, Jan 09, 2002 at 12:56:50PM +0100, Daniel Phillips wrote:
> > > BTW, I find your main argument confusing.  First you don't want -preempt with
> > > CONFIG_EXPERIMENTAL because it might not get wide enough testing, so you want 
> > > to enable it by default in the mainline kernel, then you argue it's too risky 
> > > because everybody will use it and it might break some obscure driver.  Sorry, 
> > > you lost me back there.
> > 
> > the point I am making is very simple: _if_ we include it, it should _not_
> > be a config option.
> 
> That doesn't make any sense to me.  Why should _SMP be a config option and not

getting the drivers tested with preempt enable makes lots of sense to
me.

> _PREEMPT?

SMP in 2.1 wasn't a config option.

Andrea

^ permalink raw reply	[flat|nested] 351+ messages in thread

end of thread, other threads:[~2002-01-29 23:38 UTC | newest]

Thread overview: 351+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-01-08  3:02 [2.4.17/18pre] VM and swap - it's really unusable Dieter Nützel
2002-01-08 10:55 ` Luigi Genoni
2002-01-08 13:21   ` Andrea Arcangeli
2002-01-08 13:33     ` Anton Blanchard
2002-01-08 15:00       ` Daniel Phillips
2002-01-08 15:29         ` Andrea Arcangeli
2002-01-08 15:54           ` Daniel Phillips
2002-01-08 16:38             ` Andrea Arcangeli
2002-01-08 23:02             ` Luigi Genoni
2002-01-08 23:32               ` Ken Brownfield
2002-01-08 23:42                 ` Robert Love
2002-01-08 23:52                 ` Luigi Genoni
2002-01-09  0:10                 ` Alan Cox
2002-01-09  0:29                   ` John Alvord
2002-01-09  0:43                     ` Robert Love
2002-01-09 16:58                     ` Kent Borg
2002-01-14  1:28                       ` Bill Davidsen
2002-01-14  1:54                         ` Alan Cox
2002-01-14 20:12                         ` george anzinger
2002-01-10 19:08                     ` Jussi Laako
2002-01-09  5:08                   ` Andrew Morton
2002-01-09  5:42                     ` Robert Love
2002-01-09  9:08                     ` Helge Hafting
2002-01-09 17:00                     ` Alan Cox
2002-01-09 11:44                       ` Rob Landley
2002-01-09 19:57                         ` Andrew Morton
2002-01-10 16:40                           ` Timothy Covell
2002-01-10  2:25                         ` Alan Cox
2002-01-10 10:06                           ` Rob Landley
2002-01-10 18:34                             ` Chris Friesen
2002-01-10 19:36                               ` David Weinehall
2002-01-10 20:00                                 ` Chris Friesen
2002-01-10 20:13                                   ` Jussi Laako
2002-01-10 20:52                               ` Bernd Eckenfels
2002-01-10 19:01                             ` Alan Cox
2002-01-11  2:47                               ` Nigel Gamble
2002-01-11  3:18                                 ` Andrew Morton
2002-01-11 12:37                                 ` Alan Cox
2002-01-11 20:33                                   ` Robert Love
2002-01-12  2:50                                     ` yodaiken
2002-01-11 20:22                                       ` Rob Landley
2002-01-12  5:00                                         ` yodaiken
2002-01-12 11:53                                           ` Roman Zippel
2002-01-12 12:28                                             ` yodaiken
2002-01-12 13:25                                               ` Roman Zippel
2002-01-12 14:56                                                 ` yodaiken
2002-01-12 17:48                                                   ` Roman Zippel
2002-01-12 19:23                                                     ` yodaiken
2002-01-12 21:21                                                       ` Roman Zippel
2002-01-13  1:23                                                         ` Alan Cox
2002-01-14 23:21                                                   ` george anzinger
2002-01-15  0:59                                                     ` yodaiken
2002-01-15  9:18                                                       ` Helge Hafting
2002-01-12 20:12                                                 ` Andrew Morton
2002-01-12 18:46                                             ` Alan Cox
2002-01-12 20:42                                               ` Roman Zippel
2002-01-12 22:13                                                 ` yodaiken
2002-01-13  3:33                                                   ` Roman Zippel
2002-01-13  4:02                                                     ` yodaiken
2002-01-13  1:28                                                 ` Alan Cox
2002-01-12  5:03                                         ` Andrew Morton
2002-01-12 18:26                                           ` Jussi Laako
2002-01-12  6:01                                         ` Robert Love
2002-01-12 12:45                                           ` yodaiken
2002-01-12 19:00                                           ` Alan Cox
2002-01-13  0:16                                             ` Robert Love
2002-01-13  1:41                                               ` Alan Cox
2002-01-13 22:50                                                 ` Daniel Phillips
2002-01-12  9:52                                         ` arjan
2002-01-12 18:54                                           ` Alan Cox
2002-01-12 19:23                                             ` Ed Sweetman
2002-01-12 19:35                                               ` yodaiken
2002-01-12 20:09                                               ` Alan Cox
2002-01-20  0:08                                                 ` Pavel Machek
2002-01-12 19:26                                             ` Robert Love
2002-01-12 19:36                                               ` yodaiken
2002-01-12 20:07                                               ` Alan Cox
2002-01-12 20:03                                                 ` Robert Love
2002-01-12 20:21                                                   ` Alan Cox
2002-01-13  3:10                                                     ` Robert Love
2002-01-13 11:39                                                       ` Russell King
2002-01-13 18:24                                                         ` Robert Love
2002-01-13 19:06                                                           ` Russell King
2002-01-13 19:30                                                           ` Alan Cox
2002-01-13 15:59                                                       ` Alan Cox
2002-01-13 18:20                                                         ` Robert Love
2002-01-14  5:59                                                         ` Daniel Phillips
2002-01-13 22:23                                                           ` Rob Landley
2002-01-13 22:02                                                     ` Daniel Phillips
2002-01-12 20:36                                                 ` Kenneth Johansson
2002-01-12 23:01                                                   ` Robert Love
2002-01-13  0:02                                                     ` J Sloan
2002-01-13  1:38                                                     ` Alan Cox
2002-01-13 15:18                                                       ` Roman Zippel
2002-01-13 15:36                                                         ` Arjan van de Ven
2002-01-14  5:03                                                           ` Daniel Phillips
2002-01-14  5:09                                                             ` Andrew Morton
2002-01-14  9:24                                                               ` Daniel Phillips
2002-01-14  5:34                                                             ` yodaiken
2002-01-14 11:14                                                               ` Roman Zippel
2002-01-14 11:47                                                                 ` Alan Cox
2002-01-14 12:00                                                                   ` Roman Zippel
2002-01-14 12:27                                                                     ` Alan Cox
2002-01-14 13:39                                                                       ` Roman Zippel
2002-01-14 14:35                                                                         ` Alan Cox
2002-01-14 14:30                                                                           ` Roman Zippel
2002-01-14 14:35                                                                         ` Rik van Riel
2002-01-14 16:19                                                                           ` Roman Zippel
2002-01-14 20:05                                                                           ` Robert Love
2002-01-14 20:02                                                                       ` Andrew Morton
2002-01-14 21:19                                                                         ` Alan Cox
2002-01-14 21:11                                                                           ` Andrew Morton
2002-01-14 21:30                                                                             ` Alan Cox
2002-01-14 13:38                                                                 ` yodaiken
2002-01-14 14:40                                                                   ` Roman Zippel
2002-01-14 14:07                                                                 ` Guest section DW
2002-01-14 12:17                                                               ` Momchil Velikov
2002-01-14 12:45                                                                 ` Oliver Neukum
2002-01-14 16:32                                                                   ` Momchil Velikov
2002-01-14 17:43                                                                     ` Alan Cox
2002-01-14 22:34                                                                       ` Momchil Velikov
2002-01-14 22:46                                                                         ` yodaiken
     [not found]                                                                           ` <876664vxm8.fsf@fadata.bg>
2002-01-15 12:31                                                                             ` yodaiken
2002-01-20 10:31                                                                               ` george anzinger
2002-01-20 14:34                                                                                 ` yodaiken
2002-01-14 18:04                                                                     ` Oliver Neukum
2002-01-14 20:09                                                                       ` Robert Love
2002-01-14 20:22                                                                         ` Oliver Neukum
2002-01-14 20:36                                                                           ` Robert Love
2002-01-14 22:46                                                                             ` Oliver Neukum
2002-01-15  3:01                                                                               ` george anzinger
2002-01-14 13:45                                                                 ` yodaiken
2002-01-14 13:48                                                                   ` yodaiken
2002-01-14 14:56                                                                   ` Roman Zippel
2002-01-14 16:18                                                                     ` yodaiken
2002-01-14 18:54                                                                       ` Roman Zippel
2002-01-14 16:36                                                                   ` Momchil Velikov
     [not found]                                                                     ` <20020114030925.A1363@viejo.fsmlabs.com>
2002-01-14 18:43                                                                       ` Daniel Phillips
2002-01-14 18:39                                                                         ` yodaiken
2002-01-14 20:16                                                                           ` Robert Love
2002-01-14 19:16                                                                         ` Rick Stevens
2002-01-15  3:07                                                                         ` george anzinger
2002-01-15  3:31                                                                           ` Daniel Phillips
2002-01-15 12:39                                                                           ` yodaiken
2002-01-21 15:38                                                                             ` Daniel Phillips
2002-01-21 15:43                                                                               ` yodaiken
2002-01-21 16:05                                                                                 ` Daniel Phillips
2002-01-21 16:06                                                                                   ` yodaiken
2002-01-21 16:33                                                                                     ` Peter Wächtler
2002-01-21 16:45                                                                                       ` yodaiken
2002-01-21 17:12                                                                                         ` Peter Wächtler
2002-01-21 17:15                                                                                           ` yodaiken
2002-01-21 16:48                                                                                     ` Daniel Phillips
2002-01-21 16:50                                                                                       ` yodaiken
2002-01-21 17:32                                                                                         ` Chris Friesen
2002-01-21 17:52                                                                                           ` yodaiken
2002-01-21 18:59                                                                                             ` Chris Friesen
2002-01-21 19:00                                                                                             ` Peter Wächtler
2002-01-21 21:22                                                                                         ` Robert Love
2002-01-21 21:54                                                                                           ` yodaiken
2002-01-21 22:19                                                                                             ` Robert Love
2002-01-21 22:18                                                                                           ` Horst von Brand
2002-01-21 22:53                                                                                             ` Chris Friesen
2002-01-29 23:12                                                                                             ` Bill Davidsen
2002-01-21 21:24                                                                                       ` Robert Love
2002-01-21 21:16                                                                                     ` Robert Love
2002-01-21 21:33                                                                                       ` Andrew Morton
2002-01-21 21:59                                                                                         ` J Sloan
2002-01-21 21:49                                                                                       ` yodaiken
2002-01-21 22:01                                                                                         ` Robert Love
2002-01-21 20:52                                                                                           ` Marcelo Tosatti
2002-01-21 22:26                                                                                             ` Robert Love
2002-01-21 23:56                                                                                           ` yodaiken
2002-01-22  0:45                                                                                             ` Roman Zippel
2002-01-22  1:34                                                                                               ` yodaiken
2002-01-22  9:13                                                                                                 ` Roman Zippel
2002-01-22  2:10                                                                                             ` Daniel Phillips
2002-01-24 15:19                                                                                               ` yodaiken
2002-01-24 21:15                                                                                                 ` Roman Zippel
2002-01-26  2:36                                                                                                 ` Jamie Lokier
2002-01-29 23:36                                                                                             ` Bill Davidsen
2002-01-22  0:27                                                                                     ` Roman Zippel
2002-01-21 19:26                                                                                   ` Mark Hahn
2002-01-21 20:16                                                                                     ` Allan Sandfeld
2002-01-22 10:57                                                                                     ` Peter Wächtler
2002-01-21 20:35                                                                                 ` Bill Davidsen
2002-01-21 20:49                                                                                   ` yodaiken
2002-01-21 21:42                                                                                   ` Mark Hahn
2002-01-22  0:58                                                                                     ` Ken Brownfield
2002-01-22 16:51                                                                                     ` Bill Davidsen
2002-01-22 20:50                                                                                       ` Jussi Laako
2002-01-29 23:05                                                                                         ` Bill Davidsen
2002-01-29 23:33                                                                                           ` Alan Cox
2002-01-14 17:36                                                                   ` Daniel Phillips
2002-01-14 15:08                                                               ` Russ Leighton
2002-01-14  8:24                                                             ` Arjan van de Ven
2002-01-13 15:45                                                         ` Alan Cox
2002-01-13 20:25                                                           ` Roman Zippel
2002-01-13 21:11                                                             ` Alan Cox
2002-01-14  0:33                                                               ` Stephan von Krawczynski
2002-01-14  0:50                                                                 ` Alan Cox
2002-01-14  1:17                                                                   ` Robert Love
2002-01-14  9:49                                                                     ` Stephan von Krawczynski
2002-01-14  9:45                                                                   ` Stephan von Krawczynski
2002-01-14 10:04                                                                     ` Andrew Morton
2002-01-14 11:47                                                                       ` Stephan von Krawczynski
2002-01-14 12:29                                                                         ` Alan Cox
2002-01-14 22:20                                                                           ` Jussi Laako
2002-01-15  1:43                                                                             ` Stephan von Krawczynski
2002-01-15 20:29                                                                               ` Jussi Laako
2002-01-14 19:58                                                                         ` george anzinger
2002-01-14 10:09                                                                     ` Alan Cox
2002-01-14 15:02                                                                     ` J.A. Magallon
2002-01-14 15:03                                                                       ` Arjan van de Ven
2002-01-14 19:50                                                                         ` george anzinger
2002-01-14 19:35                                                                       ` Robert Love
2002-01-14 15:46                                                                         ` Rob Landley
2002-01-14 22:03                                                                     ` Jussi Laako
2002-01-15  1:34                                                                       ` Stephan von Krawczynski
2002-01-14  0:10                                                             ` yodaiken
2002-01-14  0:41                                                               ` Roman Zippel
2002-01-14  1:05                                                                 ` yodaiken
2002-01-14 10:16                                                                   ` Roman Zippel
2002-01-14  1:19                                                                 ` Robert Love
2002-01-13 18:13                                                         ` Robert Love
2002-01-14  1:50                                                         ` Rik van Riel
2002-01-14  1:56                                                           ` Robert Love
2002-01-14 10:55                                                           ` Roman Zippel
     [not found]                                                       ` <16QNVQ-2JqEACC@fwd03.sul.t-online.com>
     [not found]                                                         ` <3C43D5E1.6785695C@mvista.com>
2002-01-15  8:32                                                           ` Oliver Neukum
2002-01-13  1:30                                                   ` Alan Cox
2002-01-12 20:53                                             ` Roman Zippel
2002-01-12 23:07                                               ` Rob Landley
2002-01-13 16:03                                                 ` Alan Cox
2002-01-13  1:26                                               ` Alan Cox
2002-01-13 13:34                                                 ` Roman Zippel
2002-01-13 15:19                                                   ` Alan Cox
2002-01-13 22:06                                             ` Daniel Phillips
2002-01-14  7:22                                             ` Alans example against preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable) Roger Larsson
2002-01-14  9:18                                               ` Alan Cox
2002-01-14 12:08                                         ` [2.4.17/18pre] VM and swap - it's really unusable Helge Hafting
2002-01-18 22:41                                           ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable) Pavel Machek
2002-01-20 11:22                                             ` Rob Landley
2002-01-21 21:48                                               ` Alan Cox
2002-01-22 11:52                                                 ` Rob Landley
2002-01-27 20:37                                                   ` Alan Cox
2002-01-27 22:10                                                     ` Nigel Gamble
2002-01-27 22:56                                                       ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre]u Alan Cox
2002-01-20 20:22                                             ` Preempt & how long it takes to interrupt (was Re: [2.4.17/18pre] VM and swap - it's really unusable) Robert Love
2002-01-12 11:13                                     ` [2.4.17/18pre] VM and swap - it's really unusable Andrea Arcangeli
2002-01-12 15:07                                       ` jogi
2002-01-12 16:05                                         ` Andrea Arcangeli
2002-01-13 15:15                                           ` jogi
2002-01-12 16:52                                         ` yodaiken
2002-01-12 17:00                                           ` Andrea Arcangeli
2002-01-12 19:00                                             ` Ed Sweetman
2002-01-12 20:23                                               ` Andrew Morton
2002-01-12 21:02                                                 ` Erik Andersen
2002-01-12 21:18                                                   ` Stephan von Krawczynski
2002-01-12 23:24                                                     ` Erik Andersen
2002-01-12 21:16                                                 ` Stephan von Krawczynski
2002-01-12 22:25                                                 ` Francois Romieu
2002-01-13  1:32                                                 ` Alan Cox
2002-01-13  1:57                                                 ` J.A. Magallon
2002-01-13  8:03                                                 ` Rusty Russell
2002-01-13 17:42                                                 ` jogi
2002-01-13 18:22                                                   ` Robert Love
2002-01-13 19:32                                                     ` Alan Cox
2002-01-14 11:41                                                       ` Andrea Arcangeli
2002-01-13 19:35                                                     ` J Sloan
2002-01-14  6:49                                                       ` Daniel Phillips
2002-01-15  1:31                                                         ` J Sloan
2002-01-13 19:46                                                     ` Andrew Morton
2002-01-13 20:04                                                       ` Robert Love
2002-01-13 20:30                                                         ` Andrew Morton
2002-01-14 11:56                                                         ` Andrea Arcangeli
2002-01-14 13:38                                                           ` Robert Love
2002-01-14 15:45                                                             ` Andrea Arcangeli
2002-01-13 20:17                                                     ` jogi
     [not found]                                                       ` <Pine.LNX.4.33.0201131533530.14774-100000@coffee.psychology.mcmaster.ca>
2002-01-13 22:14                                                         ` jogi
2002-01-13 23:52                                                     ` yodaiken
2002-01-14 11:39                                                     ` Andrea Arcangeli
     [not found]                                                 ` <3C40A6BB.1090100@pobox.com>
2002-01-14 11:34                                                   ` Andrea Arcangeli
2002-01-14 20:27                                                     ` Andrew Morton
2002-01-13 15:22                                               ` jogi
2002-01-14 23:05                                                 ` george anzinger
2002-01-13 15:18                                           ` jogi
2002-01-13 17:51                                             ` yodaiken
2002-01-13 18:10                                               ` jogi
2002-01-13 18:11                                             ` Robert Love
2002-01-14 11:32                                               ` Andrea Arcangeli
2002-01-13 22:55                                         ` Daniel Phillips
2002-01-13 22:56                                           ` Robert Love
2002-01-14  0:11                                             ` yodaiken
2002-01-14 11:18                                           ` Marian Jancar
2002-01-14 14:16                                             ` yodaiken
2002-01-14  2:46                               ` Pavel Machek
2002-01-10  9:59                   ` Ken Brownfield
2002-01-10 11:04                     ` Alan Cox
2002-01-09  0:13               ` Dieter Nützel
2002-01-09  6:26               ` Daniel Phillips
2002-01-09  7:25                 ` Preemtive kernel (Was: Re: [2.4.17/18pre] VM and swap - it's really unusable) Roger Larsson
2002-01-09  7:48                   ` Daniel Phillips
2002-01-08 20:55           ` [2.4.17/18pre] VM and swap - it's really unusable Robert Love
2002-01-09 11:24             ` Andrea Arcangeli
2002-01-09 14:07               ` Ed Sweetman
2002-01-09 14:27                 ` Andrea Arcangeli
2002-01-09 14:51                   ` Arjan van de Ven
2002-01-09 17:02                     ` Roger Larsson
2002-01-09 17:10                       ` Arjan van de Ven
2002-01-09 17:13                     ` Daniel Phillips
2002-01-08 19:47         ` Andrew Morton
2002-01-08 20:13           ` Alan Cox
2002-01-08 22:00             ` Roger Larsson
2002-01-08 20:18           ` Daniel Phillips
2002-01-08 21:19             ` Robert Love
2002-01-14  1:08             ` Bill Davidsen
2002-01-08 20:59           ` Daniel Phillips
2002-01-08 21:08             ` Rik van Riel
2002-01-08 21:15               ` Robert Love
2002-01-08 21:24                 ` Rik van Riel
2002-01-08 21:45                   ` Robert Love
2002-01-08 22:31                     ` Andrew Morton
2002-01-08 21:51               ` Daniel Phillips
2002-01-08 21:10             ` Andrew Morton
2002-01-08 21:17             ` Robert Love
2002-01-08 21:57               ` Daniel Phillips
2002-01-08 22:01                 ` Robert Love
2002-01-14  1:22               ` Bill Davidsen
2002-01-08 22:21             ` Andrew Morton
2002-01-09  9:17               ` Daniel Phillips
2002-01-09  9:26                 ` Andrew Morton
2002-01-09  9:48                   ` Daniel Phillips
2002-01-08 23:26             ` Luigi Genoni
2002-01-09  6:36               ` Daniel Phillips
2002-01-08 21:08           ` Robert Love
2002-01-09  0:31           ` Oliver Xymoron
2002-01-09  9:50             ` Helge Hafting
2002-01-08 17:41     ` Luigi Genoni
2002-01-14  0:46     ` Bill Davidsen
2002-01-14  1:14       ` yodaiken
2002-01-14  2:04       ` Andrew Morton
2002-01-16 15:15         ` Bill Davidsen
2002-01-14  9:08       ` Daniel Phillips
2002-01-14  9:08       ` Daniel Phillips
2002-01-14  9:08       ` Daniel Phillips
2002-01-14 16:40       ` Daniel Phillips
2002-01-14 17:42         ` Alan Cox
2002-01-14 21:28           ` J Sloan
2002-01-08 13:51 ` J.A. Magallon
     [not found] ` <E16OHLf-0000Dn-00@starship.berlin>
     [not found]   ` <20020109145509.G1543@inspiron.school.suse.de>
2002-01-09 14:07     ` Daniel Phillips
2002-01-09 14:22       ` Andrea Arcangeli
