public inbox for linux-kernel@vger.kernel.org
* RFA: Changing scheduler quantum (Was: REQUEST: OpenLDAP 2.3.7)
From: Bernardo Innocenti @ 2005-09-18 11:37 UTC
  To: Arjan van de Ven
  Cc: Development discussions related to Fedora Core, Nalin Dahyabhai,
	lkml, Arjan van de Ven

Arjan van de Ven wrote:

> On Sun, Sep 18, 2005 at 04:27:38AM +0200, Bernardo Innocenti wrote:
> 
>>It's more meaningful to interpret sched_yield() as "give up the processor,
>>as if the scheduler quantum had expired".
> 
> afaik this is *exactly* what the new sched_yield() does ;)

Oops :-)


>>The scheduler wouldn't normally allow a lower priority process to
>>preempt a high-priority ready process for 30+ ms.  Unless I'm
>>mistaken about Linux's scheduling policy...
> 
> if your quantum is up... all other tasks get theirs of course

I assumed dynamic priorities affected the length of the
quantum, but maybe they just change the number of times
the process is scheduled wrt other processes, with the
quantum being fixed at 20-30ms.

(...a few seconds later...)

Skimming through sched.c, it seems my first guess was
right: the quantum varies with the priority from 5ms
to 800ms.

The DEF_TIMESLICE of 400ms looks a bit too coarse for
most applications and the maximum 800ms is just
ridiculously high.

IIRC, the 7.14MHz 68000 in the Amiga 500 did task-switching
at 20ms intervals, with a negligible performance hit.
Couldn't we do much better on today's CPUs?

-- 
  // Bernardo Innocenti - Develer S.r.l., R&D dept.
\X/  http://www.develer.com/



* Re: RFA: Changing scheduler quantum (Was: REQUEST: OpenLDAP 2.3.7)
From: Con Kolivas @ 2005-09-18 11:44 UTC
  To: Bernardo Innocenti
  Cc: Arjan van de Ven, Development discussions related to Fedora Core,
	Nalin Dahyabhai, lkml

On Sun, 18 Sep 2005 21:37, Bernardo Innocenti wrote:
> Arjan van de Ven wrote:
> > On Sun, Sep 18, 2005 at 04:27:38AM +0200, Bernardo Innocenti wrote:
> >>It's more meaningful to interpret sched_yield() as "give up the
> >> processor, as if the scheduler quantum had expired".
> >
> > afaik this is *exactly* what the new sched_yield() does ;)
>
> Oops :-)
>
> >>The scheduler wouldn't normally allow a lower priority process to
> >>preempt a high-priority ready process for 30+ ms.  Unless I'm
> >>mistaken about Linux's scheduling policy...
> >
> > if your quantum is up... all other tasks get theirs of course
>
> I assumed dynamic priorities affected the length of the
> quantum, but maybe it just changes the number of times
> the process is scheduled wrt other processes, with the
> quantum being fixed at 20-30ms.
>
> (...a few seconds later...)
>
> Skimming through sched.c, it seems my first guess was
> right: the quantum varies with the priority from 5ms
> to 800ms.
>
> The DEF_TIMESLICE of 400ms looks a bit too coarse for
> most applications and the maximum 800ms is just
> ridiculously high.
>
> IIRC, the 7.14MHz 68000 in the Amiga 500 did task-switching
> at 20ms intervals, with a negligible performance hit.
> Couldn't we do much better on today's CPUs?

Not quite.

The default timeslice of nice 0 tasks is 100ms. The timeslice is not altered 
the way you have read sched.c. It is altered thus:
1. For 'nice' levels it varies from 5ms at nice 19 to 800ms at nice -20.
2. For interactive tasks, it is cut up into smaller pieces down to 10ms and 
round robins with other tasks at the same dynamic priority, but still is 
based on the nice levels for the full length of cpu time before expiration 
overall.

Cheers,
Con


* Re: RFA: Changing scheduler quantum (Was: REQUEST: OpenLDAP 2.3.7)
From: Bernardo Innocenti @ 2005-09-18 21:53 UTC
  To: Con Kolivas
  Cc: Arjan van de Ven, Development discussions related to Fedora Core,
	Nalin Dahyabhai, lkml

Con Kolivas wrote:
> On Sun, 18 Sep 2005 21:37, Bernardo Innocenti wrote:

>>The DEF_TIMESLICE of 400ms looks a bit too coarse for
>>most applications and the maximum 800ms is just
>>ridiculously high.
> 
> Not quite.
> 
> The default timeslice of nice 0 tasks is 100ms. The timeslice is not altered 
> the way you have read sched.c. It is altered thus:
> 1. For 'nice' levels it varies from 5ms at nice 19 to 800ms at nice -20.
> 2. For interactive tasks, it is cut up into smaller pieces down to 10ms and 
> round robins with other tasks at the same dynamic priority, but still is 
> based on the nice levels for the full length of cpu time before expiration 
> overall.

I see.  Then there must be something else to explain
the behavior I'm observing with slapd.

Each and every call to sched_yield() makes the process
sleep for over *50ms* while a "nice make bootstrap" is
running in the background:

[pid  8780]      0.000033 stat64("gidNumber.dbb", 0xb7b3ebcc) = -1 EACCES (Permission denied)
[pid  8780]      0.000059 pread(20, "\0\0\0\0\1\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\2\0\344\17\2\3"..., 4096, 4096) = 4096
[pid  8780]      0.000083 pread(20, "\0\0\0\0\1\0\0\0\4\0\0\0\3\0\0\0\0\0\0\0\222\0<\7\1\5\370"..., 4096, 16384) = 4096
[pid  8780]      0.000078 time(NULL)    = 1124322520
[pid  8780]      0.000066 pread(11, "\0\0\0\0\1\0\0\0\250\0\0\0\231\0\0\0\235\0\0\0\16\0000"..., 4096, 688128) = 4096
[pid  8780]      0.000241 write(19, "0e\2\1\3d`\4$cn=bernie,ou=group,dc=d"..., 103) = 103
[pid  8780]      0.000137 sched_yield( <unfinished ...>
      ...zzzz...
[pid  8781]      0.050020 <... sched_yield resumed> ) = 0
[pid  8780]      0.000025 <... sched_yield resumed> ) = 0
[pid  8781]      0.000060 futex(0x925ab20, FUTEX_WAIT, 33, NULL <unfinished ...>
[pid  8780]      0.000026 write(19, "0\f\2\1\3e\7\n\1\0\4\0\4\0", 14) = 14
[pid  8774]      0.000774 <... select resumed> ) = 1 (in [19])


Actually, I'm now noticing that several slapd threads were
involved here.  Depending on how strace handles relative
timestamps of multiple processes, it may mean that both 8780
and 8781 slept too long, or that only 8781 did and 8780 was quick.

Any idea?  I'm planning to patch my kernel to print the
time_slice value in /proc/*/stat.  This way I can check
it's being computed as intended for both slapd and gcc.

-- 
  // Bernardo Innocenti - Develer S.r.l., R&D dept.
\X/  http://www.develer.com/



* Re: RFA: Changing scheduler quantum (Was: REQUEST: OpenLDAP 2.3.7)
From: Con Kolivas @ 2005-09-19  0:46 UTC
  To: Bernardo Innocenti
  Cc: Arjan van de Ven, Nalin Dahyabhai, lkml

On Mon, 19 Sep 2005 07:53, Bernardo Innocenti wrote:
> Con Kolivas wrote:
> > On Sun, 18 Sep 2005 21:37, Bernardo Innocenti wrote:
> >>The DEF_TIMESLICE of 400ms looks a bit too coarse for
> >>most applications and the maximum 800ms is just
> >>ridiculously high.
> >
> > Not quite.
> >
> > The default timeslice of nice 0 tasks is 100ms. The timeslice is not
> > altered the way you have read sched.c. It is altered thus:
> > 1. For 'nice' levels it varies from 5ms at nice 19 to 800ms at nice -20.
> > 2. For interactive tasks, it is cut up into smaller pieces down to 10ms
> > and round robins with other tasks at the same dynamic priority, but still
> > is based on the nice levels for the full length of cpu time before
> > expiration overall.

Please do not cc mailing lists that reply with the "your email is awaiting 
moderator approval" to lkml.

> I see.  Then there must be something else to explain
> the behavior I'm observing with slapd.
>
> Each and every call to sched_yield() makes the process
> sleep for over *50ms* while a "nice make bootstrap" is
> running in the background:

Why this preoccupation with how long sched_yield takes? We've already
established that it takes a variable, unpredictable (yet long) time for
SCHED_NORMAL tasks. No, cancel that question or we'll start having people
tell us what the kernel should do all over again.

You're almost certainly seeing the effect of fork during 'make bootstrap':
multiple tasks are running on the active runqueue before they expire.
A SCHED_NORMAL task that has called sched_yield will keep yielding until
nothing left on the active runqueue wants cpu time.

> Actually, I'm now noticing that several slapd threads were
> involved here.  Depending on how strace handles relative
> timestamps of multiple processes, it may mean that both 8780
> and 8781 slept too long, or that only 8781 did and 8780 was quick.
>
> Any idea?  I'm planning to patch my kernel to print the
> time_slice value in /proc/*/stat.  This way I can check
> it's being computed as intended for both slapd and gcc.

Feel free to do as much checking on kernel code as you like.

Cheers,
Con
