Linux Sound subsystem development
* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
@ 1999-08-28 23:55 yodaiken
  1999-08-29  0:24 ` Alan Cox
                   ` (16 more replies)
  0 siblings, 17 replies; 18+ messages in thread
From: yodaiken @ 1999-08-28 23:55 UTC (permalink / raw)
  To: linux-sound

On Sat, Aug 28, 1999 at 10:40:57PM +0200, Benno Senoner wrote:
> - The disk performance decreases by 10-25% when I increase the CPU load in
> the "latencytest" bench.
> (On light CPU load there are no disk performance differences,
> maybe this is related to higher scheduling overhead)
> 
> I think most of us want to have these "low-latency" features in the upcoming
> 2.4 kernel since it will make Linux a very good _MULTIMEDIA_OS_.


A 25% disk i/o decrease is very serious. Let's get some serious feedback
from people running internet and database servers before we blow off
the server users in order to compete with BeOS.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
@ 1999-08-29  0:24 ` Alan Cox
  1999-08-29  1:59 ` yodaiken
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Alan Cox @ 1999-08-29  0:24 UTC (permalink / raw)
  To: linux-sound

> A 25% disk i/o decrease is very serious. Let's get some serious feedback
> from people running internet and database servers before we blow off
> the server users in order to compete with BeOS.

For most people it's unacceptable in that form, but nobody has yet sat down
and tuned the code to get the disk performance back. I don't think
the two are exclusive.

We also have tons of time. It's not a 2.2 candidate


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
  1999-08-29  0:24 ` Alan Cox
@ 1999-08-29  1:59 ` yodaiken
  1999-08-29  6:21 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), Linus Torvalds
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: yodaiken @ 1999-08-29  1:59 UTC (permalink / raw)
  To: linux-sound

On Sun, Aug 29, 1999 at 01:24:08AM +0100, Alan Cox wrote:
> > A 25% disk i/o decrease is very serious. Let's get some serious feedback
> > from people running internet and database servers before we blow off
> > the server users in order to compete with BeOS.
> 
> For most people it's unacceptable in that form, but nobody has yet sat down
> and tuned the code to get the disk performance back. I don't think
> the two are exclusive.

I'm concerned that the major improvement has come from additional
calls to schedule instead of from some basic improvement in the algorithm.
Calls to schedule are not free, and I'm not smart enough to see an
obvious way to add a reschedule into a loop that dumps all write data
to buffers without damaging throughput. This seems like _another_
tunable parameter:
          start io-loop
             do chunk
             io_resched()
          end loop

io_resched:
       if (really_want_soft_rt > resched_count)
                schedule()



> 
> We also have tons of time. It's not a 2.2 candidate


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency),
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
  1999-08-29  0:24 ` Alan Cox
  1999-08-29  1:59 ` yodaiken
@ 1999-08-29  6:21 ` Linus Torvalds
  1999-08-29  7:13 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl Ingo Molnar
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 1999-08-29  6:21 UTC (permalink / raw)
  To: linux-sound



On Sat, 28 Aug 1999 yodaiken@fsmlabs.com wrote:
> 
> A 25% disk i/o decrease is very serious. Let's get some serious feedback
> from people running internet and database servers before we blow off
> the server users in order to compete with BeOS.

Guys, if anybody thinks we're competing with BeOS, then wake up. BeOS is a
niche OS that isn't worth competing against, and at most we can try to
find out what it's good at and see if we can emulate some of it. But 25%
disk IO decrease is definitely not something we want to even consider.

		Linus


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (2 preceding siblings ...)
  1999-08-29  6:21 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), Linus Torvalds
@ 1999-08-29  7:13 ` Ingo Molnar
  1999-08-29  7:15 ` Ingo Molnar
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 1999-08-29  7:13 UTC (permalink / raw)
  To: linux-sound


On Sat, 28 Aug 1999, Linus Torvalds wrote: 

> Guys, if anybody thinks we're competing with BeOS, then wake up. BeOS is
> a niche OS that isn't worth competing against, and at most we can try to
> find out what it's good at and see if we can emulate some of it. But 25%
> disk IO decrease is definitely not something we want to even consider. 

definitely. The patches are 'work in progress', and i know they are only
acceptable for inclusion if they:

	1) cause no measurable slowdown _anywhere_

	2) fix 'buggy' latencies by redesigning the latency source, not
	   by intruding into the latency core with conditional reschedule
	   points.

i'm quite certain both 1) and 2) are very much possible; Benno's reported
25% slowdown (which btw. i think is not a pure bandwidth degradation, but
a slowdown in certain, not necessarily well understood circumstances) is
simply a bug in my patch.

the patch in its current form is just a 'demo' that shows that good
in-kernel latencies (the example to follow is not BeOS but QNX, i think)
are indeed possible without architectural impact. On my box i've killed
all latencies bigger than 0.5 msec (i'll try to fix the 2.9 msec peak
reported as well). i'll send an updated and fixed patch (in smaller
pieces) for 2.3.

-- mingo


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (3 preceding siblings ...)
  1999-08-29  7:13 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl Ingo Molnar
@ 1999-08-29  7:15 ` Ingo Molnar
  1999-08-29  7:17 ` Ingo Molnar
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 1999-08-29  7:15 UTC (permalink / raw)
  To: linux-sound


On Sat, 28 Aug 1999 yodaiken@fsmlabs.com wrote:

> I'm concerned that the major improvement has come from additional 
> calls to schedule instead of from some basic improvements in algorithm.

there are _no_ extra calls to schedule, only if necessary. Zero, nil,
nada. Please check out the patch.

-- mingo


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (4 preceding siblings ...)
  1999-08-29  7:15 ` Ingo Molnar
@ 1999-08-29  7:17 ` Ingo Molnar
  1999-08-29 13:59 ` yodaiken
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 1999-08-29  7:17 UTC (permalink / raw)
  To: linux-sound


On Sun, 29 Aug 1999, Alan Cox wrote:

> We also have tons of time. It's not a 2.2 candidate

yes. between 2.2 and 2.3 there were so many fundamental changes that i
will not even try to clean up the 2.2 patch - it would be too fundamental
a change anyway. If anyone wants a solution for 2.2, the current patch can
be used (it's 100% stable), but i'm developing the 'clean patch' only for
2.3.

-- mingo


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (5 preceding siblings ...)
  1999-08-29  7:17 ` Ingo Molnar
@ 1999-08-29 13:59 ` yodaiken
  1999-08-29 14:22 ` David Olofson
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: yodaiken @ 1999-08-29 13:59 UTC (permalink / raw)
  To: linux-sound


Here is a fundamental change from Ingo's patch:
 #define __copy_user(to,from,size)                                      \
 do {                                                                   \
        int __d0, __d1;                                                 \
+       conditional_schedule();                                         \
        __asm__ __volatile__(                                           \
                "0:     rep; movsl\n"                                   \
                "       movl %3,%0\n"  

Big change. And I think it is a change that makes a mistaken assumption
about what is more important: terminating an i/o and being able to
release a buffer is not always less critical than running the next
process.


And my candidate for "irreparably breaks Oracle" is
                for (i = nr_buffers_type[BUF_LOCKED]*2 ; i-- > 0 ; bh = next) {
+                       if (current->need_resched) {
+                               bh->b_count++;
+                               schedule();
+                               bh->b_count--;

What does lmbench report on this patch set? I bet the throughput
tests show the difference.

It would be interesting to consider how these changes might interact
with, for example, Stephen's semi-i/o-light changes for direct io or with
new fs designs or changes to the network subsystem. 

Some problems:
1. extra calls to schedule trash the cache and trade bandwidth for latency.
2. assumptions about machine timing become embedded in basic code and
   will cause problems as timing changes.
3. the fundamental technique of this patch is to introduce reschedules
   that hide the problem instead of solving it. Instead of
   start_long_copy
          do a chunk
          conditional_reschedule
          loop
   it's more interesting to think about how to avoid the long copy in
   the first place. A write request that asks to dump a big chunk of
   memory to i/o seems like it could be made lower latency by using a
   kernel buffer to page-align and then doing direct i/o on the user
   pages. Or, alternatively, we could put some smarts in libc for big
   i/o, or make "write" understand something more about the destination
   device so that it can delegate copying to smart devices and use a
   just-in-time copying approach for other devices. Any of these is a
   lot harder than introducing resched calls.


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (6 preceding siblings ...)
  1999-08-29 13:59 ` yodaiken
@ 1999-08-29 14:22 ` David Olofson
  1999-08-29 20:48 ` yodaiken
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: David Olofson @ 1999-08-29 14:22 UTC (permalink / raw)
  To: linux-sound

On Sun, 29 Aug 1999, yodaiken@fsmlabs.com wrote:
> A 25% disk i/o decrease is very serious. Let's get some serious feedback
> from people running internet and database servers before we blow off
> the server users in order to compete with BeOS.

I think it's pretty serious for multimedia users as well... I for one could
use that extra headroom to get multitrack streaming to/from disk working
reliably. The reason for the decrease should be tracked down to see whether
it can be fixed.

BTW, any figures for RTLinux? If I had the time, I'd do some benchmarking, but
that'll have to wait for a few days...


//David


 ·A·U·D·I·A·L·I·T·Y·   P r o f e s s i o n a l   L i n u x   A u d i o
-  - ------------------------------------------------------------- -  -
    ·Rock Solid                                      David Olofson:
    ·Low Latency    www.angelfire.com/or/audiality   ·Audio Hacker
    ·Plug-Ins            audiality@swipnet.se        ·Linux Advocate
    ·Open Source                                     ·Singer/Composer


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (7 preceding siblings ...)
  1999-08-29 14:22 ` David Olofson
@ 1999-08-29 20:48 ` yodaiken
  1999-08-30  6:09 ` yodaiken
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: yodaiken @ 1999-08-29 20:48 UTC (permalink / raw)
  To: linux-sound

On Sun, Aug 29, 1999 at 09:15:19AM +0200, Ingo Molnar wrote:
> 
> On Sat, 28 Aug 1999 yodaiken@fsmlabs.com wrote:
> 
> > I'm concerned that the major improvement has come from additional 
> > calls to schedule instead of from some basic improvements in algorithm.
> 
> there are _no_ extra calls to schedule, only if necessary. Zero, nil,
> nada. Please check out the patch.

Tell me what I misunderstood. As far as I can tell, pre-patch behavior
involves many fewer calls to schedule; post-patch behavior for a write, for
example, can make at least one extra call to the scheduler for every block
copied. If need_resched > 0, it is still not necessarily true that the
call to the scheduler is "necessary". Consider: a long-running database
program is writing backing store out to disk. Old behavior: the write
absorbs as much free memory as possible to optimize disk behavior. New
behavior: a screen saver, which is small and i/o bound, causes need_resched
to be set continually, and the write is segmented into many smaller writes.
We have now, like NT, optimized the screen saver on a server while hammering
file system performance. Isn't that correct?


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (8 preceding siblings ...)
  1999-08-29 20:48 ` yodaiken
@ 1999-08-30  6:09 ` yodaiken
  1999-08-30  6:55 ` Ingo Molnar
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: yodaiken @ 1999-08-30  6:09 UTC (permalink / raw)
  To: linux-sound

On Mon, Aug 30, 1999 at 08:55:45AM +0200, Ingo Molnar wrote:
> > Tell me what I misunderstood. As far as I can tell, pre patch behavior
> > involves many fewer calls to schedule, post patch behavior for a write, for
> > example, can make at least one extra call to the scheduler for every block
> > copied. [...]
> 
> the thing is, we are simply getting what we asked for. In that benchmark

so what happens if we ask for more than one thing?

> we have a soundcard-using RT program that gets ~1000 reschedules a second
> and 'wastes' CPU cycles (artificially) after every reschedule. The first
> benchmark config used an unmodified kernel that has simply violated the
> (very tight, 1msec) RT constraints of the RT process. Then we had a kernel
> modified by lowlatency-N6+patches, which kernel correctly satisfied the RT
> process' requests and rescheduled to it (and away from it) about every
> msec or so. No wonder IMO that disk performance might suffer if such tight
> RT constraints are satisfied accurately. Do you see my point? Disk
> performance does not suffer if 'simple' CPU-using processes are running. 

I don't see how your code avoids reschedules from non-SCHED_FIFO/RR
processes. And I'm not convinced that even then it is reasonable to
allow this. But first explain why a screen saver will not trigger
the same behavior. The screen saver will do fast writes to the screen,
and these will trigger io for X and for the saver itself. Both operations
will set need_resched. So we expect io performance to get worse
in this case. Right?

> 
> >                                                     [...] New
> > behavior: a screen saver, which is small i/o bound, causes needs resched
> > to be set continually, and the write is segmented into many smaller writes.
> 
> do you see where you missed the point? We are talking about _RT, CPU-using
> high-frequency rescheduling_ processes that cause a measured bandwidth
> difference. Not screen savers. Not 'simple' CPU hogs. RT processes.

I know that that is your intention, but I don't understand how you expect
to limit the effects of the changes to RT tasks. That's why I ask for
lmbench data and also database benchmarks.

By the way, I don't mean to be too critical here - this is very important
work, and if we had millisecond soft-rt tasks in Linux, it would really be
useful for RTLinux too -- people ask for it all the time. My caution is
that simply adding new preemption points to the kernel is not simple and
has far-reaching effects.


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (9 preceding siblings ...)
  1999-08-30  6:09 ` yodaiken
@ 1999-08-30  6:55 ` Ingo Molnar
  1999-08-30  7:30 ` Ingo Molnar
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 1999-08-30  6:55 UTC (permalink / raw)
  To: linux-sound


On Sun, 29 Aug 1999 yodaiken@chelm.cs.nmt.edu wrote:

> > there are _no_ extra calls to schedule, only if necessary. Zero, nil,
> 
> Tell me what I misunderstood. As far as I can tell, pre patch behavior
> involves many fewer calls to schedule, post patch behavior for a write, for
> example, can make at least one extra call to the scheduler for every block
> copied. [...]

the thing is, we are simply getting what we asked for. In that benchmark
we have a soundcard-using RT program that gets ~1000 reschedules a second
and 'wastes' CPU cycles (artificially) after every reschedule. The first
benchmark config used an unmodified kernel that simply violated the
(very tight, 1 msec) RT constraints of the RT process. Then we had a kernel
modified by the lowlatency-N6+ patches, which correctly satisfied the RT
process' requests and rescheduled to it (and away from it) about every
msec or so. No wonder, IMO, that disk performance might suffer if such
tight RT constraints are satisfied accurately. Do you see my point? Disk
performance does not suffer if 'simple' CPU-using processes are running.

>                                                     [...] New
> behavior: a screen saver, which is small i/o bound, causes needs resched
> to be set continually, and the write is segmented into many smaller writes.

do you see where you missed the point? We are talking about _RT, CPU-using
high-frequency rescheduling_ processes that cause a measured bandwidth
difference. Not screen savers. Not 'simple' CPU hogs. RT processes.

-- mingo


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (10 preceding siblings ...)
  1999-08-30  6:55 ` Ingo Molnar
@ 1999-08-30  7:30 ` Ingo Molnar
  1999-08-30  8:18 ` yodaiken
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 1999-08-30  7:30 UTC (permalink / raw)
  To: linux-sound


On Mon, 30 Aug 1999 yodaiken@chelm.cs.nmt.edu wrote:

> I don't see how your code avoids reschedules from non SCHED_FIFO/RR
> processes. [...]

i don't really understand your point. _if_ current->need_resched is set we
should reschedule ASAP - that's all. That's a generic kernel rule - it's up
to the scheduling code to balance timeslices and priorities properly. The
patch is only enforcing this rule more accurately than old kernels. But if
you think this is something new then you are wrong.

>              [...] But first explain why a screen saver will not trigger
> the same behavior. The screen saver will do fast writes to the screen,
> and these will trigger io for X and for the saver itself. Both operations
> will set needs_resched. So we expect io performance to get worse 
> in this case. Right?

wrong. The behavior of X & the screensaver does not change in the slightest
from current kernels. The patch adds no additional behavior! We check for
need_resched at _every_ system-call return (or IRQ return to user space,
or signal delivery) anyway. The patch only shortens certain longer
'scheduling atoms', either by splitting them up into smaller pieces or by
redesigning them. But this does not cause any macro-effect - apart, of
course, from situations which now behave correctly.

[btw. 99% of the time the X client gets rescheduled, it is not due to
need_resched but due to the unix-domain socket buffer running out of write
space. And this is true globally: need_resched itself is responsible for
only a small fraction of reschedules.]

-- mingo


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (11 preceding siblings ...)
  1999-08-30  7:30 ` Ingo Molnar
@ 1999-08-30  8:18 ` yodaiken
  1999-08-30  9:45 ` Ingo Molnar
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: yodaiken @ 1999-08-30  8:18 UTC (permalink / raw)
  To: linux-sound

On Mon, Aug 30, 1999 at 09:30:04AM +0200, Ingo Molnar wrote:
> 
> On Mon, 30 Aug 1999 yodaiken@chelm.cs.nmt.edu wrote:
> 
> > I don't see how your code avoids reschedules from non SCHED_FIFO/RR
> > processes. [...]
> 
> i don't really understand your point. _if_ current->need_resched is set we
> should reschedule ASAP - that's all. That's a generic kernel rule - it's up

No. If current->need_resched is set, we should reschedule at a preemption
point. What your patch does is dramatically increase the number of
preemption points -- that changes the meaning.

> to the scheduling code to balance timeslices and priorities properly. The
> patch is only enforcing this rule more accurately than old kernels. But if
> you think this is something new then you are wrong.

It absolutely is something new. In the current kernel, we check for
preemption only at points where we are about to do a context
switch anyway - from kernel to user, modulo some places like mem.c.
That is, the logic is:
                     before committing to a switch to user, see
                     if there is a hint to call the scheduler
This is not the same as:
                     before copying a block, check to see if there
                     is a hint to call the scheduler.



> >              [...] But first explain why a screen saver will not trigger
> > the same behavior. The screen saver will do fast writes to the screen,
> > and these will trigger io for X and for the saver itself. Both operations
> > will set needs_resched. So we expect io performance to get worse 
> > in this case. Right?
> 
> wrong. The behavior of X & screensaver does not change the slightest from
> current kernels. The patch adds no additional behavior! We check for

The patch is not intended to add additional behavior. I know that. I
still don't see why the screen saver i/o ops, which will either set
need_resched directly or schedule a bottom-half task, will not cause
extra context switches.


> need_resched at _every_ system-call return (or IRQ return to user-space,
> or signal delivery) anyway. The patch only shortens certain longer
> 'scheduling atoms' by either splitting them up into smaller pieces or by
> redesigning them. But this does not cause any macro-effect - apart from
> situations of course which are now behaving correctly.

How do you know?

> [btw. 99% of the time the X client gets rescheduled is not due to
> need_resched but due to the unix-domain socket buffer running out of write
> space. And this is true globally, need_resched itself is responsible for a
> small fraction of reschedules only.]

In the current system, yes. After your patch, it is not at all clear.


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (12 preceding siblings ...)
  1999-08-30  8:18 ` yodaiken
@ 1999-08-30  9:45 ` Ingo Molnar
  1999-08-30 11:13 ` Stephen C. Tweedie
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: Ingo Molnar @ 1999-08-30  9:45 UTC (permalink / raw)
  To: linux-sound


On Mon, 30 Aug 1999 yodaiken@chelm.cs.nmt.edu wrote:

> It absolutely is something new. In the current kernel, we check for
> preemption only at points where we are about to do a context
> switch anyway [...]

huh? We have a preemption point after every system call. Typically we
execute tens of thousands of system calls per second on a moderately
loaded system, and only tens or hundreds of context switches per second.
A system call is not a context switch. I don't really see where this
discussion is going. If you believe there is something weird going on,
please download the patch and prove it. The only thing that 'changed' is
the behavior of high-frequency-rescheduling RT tasks, and this is very
much intended.

> That is, the logic is:
>                      before committing to a switch to user, see
>                      if there is a hint to call the scheduler

sorry, a switch to user space is nowhere near a context switch. A context
switch is when we schedule from one process (thread) to another one. There
is about an order of magnitude between their costs, and about two orders
of magnitude between their typical frequencies (user-kernel entries being
much cheaper and much more frequent).

> This is not the same as:
>                      before copying a block, check to see if there
>                      is a hint to call the scheduler.

we have thousands of system calls executed between every typical
reschedule. Go check it yourself. So whether one of those final system
calls is 'partial' or not makes no difference. If it makes a difference,
then the patch has unearthed some kernel bug which we want to fix anyway.

> > [btw. 99% of the time the X client gets rescheduled is not due to
> > need_resched but due to the unix-domain socket buffer running out of write
> > space. And this is true globally, need_resched itself is responsible for a
> > small fraction of reschedules only.]
> 
> In the current system, yes. After your patch, it is not at all clear.

huh? it is absolutely true both for the current kernel and for the patched
kernel. The patch does not generate _any_ new need_resched 'events'. It
only shortens the time in which we 'respond' to need_resched, that's all.
need_resched is rarely triggered in a typical (or benchmarked) system -
only when some process is trying to naturally preempt a currently running
process.

if you think about it, many 'need_resched events' can happen at a large
scale only if there is a higher-static-priority (not necessarily RT)
process around that does high-frequency rescheduling. Nothing in a typical
system does that - and if it does, then the priority difference very much
mandates that the kernel reschedule ASAP. In fact, with the patch i see
much better interactive behavior under X when i load the system;
interactive events (which have higher priority) get executed much faster.

-- mingo


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (13 preceding siblings ...)
  1999-08-30  9:45 ` Ingo Molnar
@ 1999-08-30 11:13 ` Stephen C. Tweedie
  1999-09-04 20:41 ` yodaiken
  1999-09-06  7:43 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), Andrea Arcangeli
  16 siblings, 0 replies; 18+ messages in thread
From: Stephen C. Tweedie @ 1999-08-30 11:13 UTC (permalink / raw)
  To: linux-sound

Hi,

On Mon, 30 Aug 1999 00:09:51 -0600, yodaiken@chelm.cs.nmt.edu said:

> I don't see how your code avoids reschedules from non-SCHED_FIFO/RR
> processes. And I'm not convinced that even then it is reasonable to
> allow this. But first explain why a screen saver will not trigger
> the same behavior. The screen saver will do fast writes to the screen,
> and these will trigger io for X and for the saver itself. Both operations
> will set need_resched. So we expect io performance to get worse
> in this case. Right?

Is the screensaver consuming significant CPU time?  If so, it is running
with fewer scheduling credits than (say) the bdflush code.  A wakeup of
the screensaver will not cause need_resched to be set (reschedule_idle
doesn't set need_resched unless the woken process has significantly more
scheduling credits than the running task).

Is the screensaver using even less CPU than bdflush?  In that case, it
is assumed to be a more latency-critical task, and if woken up, it will
set need_resched.  With Ingo's diff, the only change here is that the
reschedule will now occur sooner rather than later, which is exactly
correct for a mostly-idle task.  If the screensaver is in fact waking up
like this all the time, then it should rapidly consume enough scheduling
credits to fall below the bdflush priority and to stop preempting.

I don't understand why this behaviour is undesirable.

--Stephen


* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (14 preceding siblings ...)
  1999-08-30 11:13 ` Stephen C. Tweedie
@ 1999-09-04 20:41 ` yodaiken
  1999-09-06  7:43 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), Andrea Arcangeli
  16 siblings, 0 replies; 18+ messages in thread
From: yodaiken @ 1999-09-04 20:41 UTC (permalink / raw)
  To: linux-sound


Ok. I ran lmbench and some other tests and failed to find any problems
on a uniprocessor.   sct pointed out that reschedule_idle
is very conservative about setting need_resched, and this makes Ingo
correct when he stated that need_resched > 0 means that we really do need
to resched. I'd be happier with some big database tests, and I really think
that database performance should be checked before any such change goes
into the kernel, but for now, I was flat out wrong.


On Mon, Aug 30, 1999 at 11:45:54AM +0200, Ingo Molnar wrote:
> 
> On Mon, 30 Aug 1999 yodaiken@chelm.cs.nmt.edu wrote:
> 
> > It absolutely is something new. In the current kernel, we check for
> > preemption only at points where we are about to do a context
> > switch anyways [...]
> 
> huh? We have a preemption point after every system call. Typically we
> execute tens of thousands of system calls per second on a moderately
> loaded system, and only tens/hundreds of context switches per second. A
> system call is not a context switch. I don't really see where this
> discussion is going. If you believe there is something weird going on,
> please download the patch and prove it. The only thing that 'changed' is
> the behavior of high-frequency-rescheduling RT tasks, but this is very
> much intended.
> 
> > That is, the logic is:
> >                      before commit to a switch to user, see
> >                      if there is a hint to call the scheduler
> 
> sorry, a switch to user-space is nowhere near a context switch. A context
> switch is when we schedule from one process (thread) to another one. There
> is about an order of magnitude between the cost of them, and about two
> orders of magnitude between the typical frequency of them. (user-kernel
> entries being much cheaper and much more frequent)
> 
> > This is not the same as
> >                      before copy a block check to see if there
> >                      is a hint to call the scheduler.
> 
> we have thousands of system calls executed between every typical
> reschedule. Go check it yourself. So whether one of those final system
> calls is 'partial' or not makes no difference. If it makes a difference
> then the patch has unearthed some kernel bug which we want to fix anyway.
> 
> > > [btw. 99% of the time the X client gets rescheduled is not due to
> > > need_resched but due to the unix-domain socket buffer running out of write
> > > space. And this is true globally, need_resched itself is responsible for a
> > > small fraction of reschedules only.]
> > 
> > In the current system, yes. After your patch, it is not at all clear.
> 
> huh? it is absolutely true both for the current kernel and for the patched
> kernel. The patch does not generate _any_ new need_resched 'events'. It
> only shortens the time we 'respond' to need_resched, that's all. 
> need_resched is rarely set in a typical (or benchmarked) system; it is set
> whenever some process naturally needs to preempt the currently running one.
> 
> if you think about it, many 'need_resched events' can happen at a large
> scale only if there is a higher-priority (not necessarily RT)
> process around that does high frequency rescheduling. Nothing in a typical
> system does that - and if it does then the priority difference very much
> mandates the kernel to reschedule ASAP. In fact, with the patch i see much
> better interactive behavior under X when i load the system, interactive
> events (which have higher priority) get executed much faster.
> 
> -- mingo
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [rtl] Low-latency patches working GREAT (<2.9ms audio latency),
  1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
                   ` (15 preceding siblings ...)
  1999-09-04 20:41 ` yodaiken
@ 1999-09-06  7:43 ` Andrea Arcangeli
  16 siblings, 0 replies; 18+ messages in thread
From: Andrea Arcangeli @ 1999-09-06  7:43 UTC (permalink / raw)
  To: linux-sound

On Sat, 4 Sep 1999 yodaiken@fsmlabs.com wrote:

>on a uniprocessor.   sct pointed out that reschedule_idle
>is very conservative about setting need_resched and this makes Ingo
>correct when he stated that need_resched>0 means that we really do need
>to resched. I'd be happier with some big database tests, and I really think

IMHO this is not the point at all.

need_resched = 1 means you _have_ to reschedule ASAP, regardless of the
scheduler algorithm.

>that database performance should be checked before any such change goes
>into the kernel, but for now, I was flat out wrong.

If honouring the need_resched bit decreases performance, then it means
you _want_ to change the scheduler, not the code that honours the
need_resched bit. Of course I am assuming the checks themselves are not
the source of the slowdown.

Andrea

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~1999-09-06  7:43 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1999-08-28 23:55 [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl yodaiken
1999-08-29  0:24 ` Alan Cox
1999-08-29  1:59 ` yodaiken
1999-08-29  6:21 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), Linus Torvalds
1999-08-29  7:13 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), see testresults ,but ISDN troubl Ingo Molnar
1999-08-29  7:15 ` Ingo Molnar
1999-08-29  7:17 ` Ingo Molnar
1999-08-29 13:59 ` yodaiken
1999-08-29 14:22 ` David Olofson
1999-08-29 20:48 ` yodaiken
1999-08-30  6:09 ` yodaiken
1999-08-30  6:55 ` Ingo Molnar
1999-08-30  7:30 ` Ingo Molnar
1999-08-30  8:18 ` yodaiken
1999-08-30  9:45 ` Ingo Molnar
1999-08-30 11:13 ` Stephen C. Tweedie
1999-09-04 20:41 ` yodaiken
1999-09-06  7:43 ` [rtl] Low-latency patches working GREAT (<2.9ms audio latency), Andrea Arcangeli
