* O(1) scheduler starvation
@ 2003-06-18 7:53 Felipe Alfaro Solana
2003-06-18 12:04 ` Mike Galbraith
2003-06-18 16:51 ` William Lee Irwin III
0 siblings, 2 replies; 15+ messages in thread
From: Felipe Alfaro Solana @ 2003-06-18 7:53 UTC (permalink / raw)
To: davidm; +Cc: LKML
[-- Attachment #1: Type: text/plain, Size: 735 bytes --]
Hi!
I've been poking around and found the following link on O(1) scheduler
starvation problems:
http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
The web page contains a small test program which can supposedly cause
starvation between two processes. However, I've been unable to reproduce what
is described in the above link. In fact, the CPU usage ratio ranges
between 60-40% or 70-30% in the worst cases.
I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
small patch I made myself to improve interactivity (mainly, to stop XMMS
from skipping by adjusting some scheduler parameters).
What's going on here? Is the previous article outdated, or have the
starvation problems been definitively fixed?
Thanks!
[-- Attachment #2: galbraith.patch --]
[-- Type: text/x-patch, Size: 7376 bytes --]
diff -prauN linux-2.5.70-bk8/include/linux/sched.h galbraith-2.5.70-bk8-1/include/linux/sched.h
--- linux-2.5.70-bk8/include/linux/sched.h Mon Jun 2 02:15:11 2003
+++ galbraith-2.5.70-bk8-1/include/linux/sched.h Tue Jun 3 07:10:17 2003
@@ -336,7 +336,9 @@
prio_array_t *array;
unsigned long sleep_avg;
- unsigned long last_run;
+ unsigned long long last_run;
+ unsigned int run_nsecs;
+ unsigned int sleep_nsecs;
unsigned long policy;
unsigned long cpus_allowed;
diff -prauN linux-2.5.70-bk8/kernel/sched.c galbraith-2.5.70-bk8-1/kernel/sched.c
--- linux-2.5.70-bk8/kernel/sched.c Tue May 27 04:05:34 2003
+++ galbraith-2.5.70-bk8-1/kernel/sched.c Tue Jun 3 07:10:18 2003
@@ -74,6 +74,12 @@
#define MAX_SLEEP_AVG (10*HZ)
#define STARVATION_LIMIT (10*HZ)
#define NODE_THRESHOLD 125
+#define SCHED_NANOSECOND 1
+#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
+#define SCHED_TICK (SCHED_SECOND / HZ)
+#define TICKS_PER_SECOND (SCHED_SECOND / SCHED_TICK)
+
+extern unsigned long long monotonic_clock(void);
/*
* If a task is 'interactive' then we reinsert it in the active
@@ -342,9 +348,23 @@
*/
static inline void activate_task(task_t *p, runqueue_t *rq)
{
- long sleep_time = jiffies - p->last_run - 1;
+ unsigned long long now = monotonic_clock();
+ long long sleep = now - p->last_run + p->sleep_nsecs;
+ int ticks = 0, requeue_waker = 0;
+
+ if (sleep >= SCHED_TICK) {
+ while (sleep >= SCHED_SECOND) {
+ sleep -= SCHED_SECOND;
+ ticks += TICKS_PER_SECOND;
+ }
+ while (sleep >= SCHED_TICK) {
+ sleep -= SCHED_TICK;
+ ticks++;
+ }
+ p->sleep_nsecs = sleep;
+ } else p->sleep_nsecs += sleep;
- if (sleep_time > 0) {
+ if (ticks > 0) {
int sleep_avg;
/*
@@ -355,7 +375,7 @@
* spends sleeping, the higher the average gets - and the
* higher the priority boost gets as well.
*/
- sleep_avg = p->sleep_avg + sleep_time;
+ sleep_avg = p->sleep_avg + ticks;
/*
* 'Overflow' bonus ticks go to the waker as well, so the
@@ -363,8 +383,10 @@
* boosting tasks that are related to maximum-interactive
* tasks.
*/
- if (sleep_avg > MAX_SLEEP_AVG)
+ if (sleep_avg > MAX_SLEEP_AVG) {
sleep_avg = MAX_SLEEP_AVG;
+ p->sleep_nsecs = 0;
+ }
if (p->sleep_avg != sleep_avg) {
p->sleep_avg = sleep_avg;
p->prio = effective_prio(p);
@@ -548,6 +570,8 @@
current->sleep_avg = current->sleep_avg * PARENT_PENALTY / 100;
p->sleep_avg = p->sleep_avg * CHILD_PENALTY / 100;
p->prio = effective_prio(p);
+ p->run_nsecs = 0;
+ p->sleep_nsecs = 0;
set_task_cpu(p, smp_processor_id());
if (unlikely(!current->array))
@@ -1147,6 +1171,49 @@
(jiffies - (rq)->expired_timestamp >= \
STARVATION_LIMIT * ((rq)->nr_running) + 1)))
+inline void __scheduler_tick(runqueue_t *rq, task_t *p)
+{
+ unsigned long long now = monotonic_clock();
+ prio_array_t *array = rq->active;
+ int ticks;
+
+ p->run_nsecs += now - p->last_run;
+ /* Task might have expired already, but not scheduled off yet */
+ if (p->array != array) {
+ set_tsk_need_resched(p);
+ goto abort;
+ }
+ if (p->run_nsecs < SCHED_TICK || p->policy == SCHED_FIFO )
+ goto abort;
+
+ for (ticks = 0; p->run_nsecs >= SCHED_TICK; ticks++)
+ p->run_nsecs -= SCHED_TICK;
+ if (ticks > p->time_slice)
+ show_task(p);
+ if (p->sleep_avg > ticks)
+ p->sleep_avg -= ticks;
+ else
+ p->sleep_avg = 0;
+ p->time_slice -= ticks;
+
+ if (p->time_slice <= 0) {
+ dequeue_task(p, p->array);
+ p->prio = effective_prio(p);
+ p->time_slice = task_timeslice(p);
+ p->first_time_slice = 0;
+ set_tsk_need_resched(p);
+ if ((EXPIRED_STARVING(rq) && !rt_task(p)) ||
+ !TASK_INTERACTIVE(p)) {
+ array = rq->expired;
+ if (!rq->expired_timestamp)
+ rq->expired_timestamp = jiffies;
+ }
+ enqueue_task(p, array);
+ }
+abort:
+ p->last_run = monotonic_clock();
+}
+
/*
* This function gets called by the timer code, with HZ frequency.
* We call it with interrupts disabled.
@@ -1159,11 +1226,12 @@
int cpu = smp_processor_id();
runqueue_t *rq = this_rq();
task_t *p = current;
+ int idle = p == rq->idle;
if (rcu_pending(cpu))
rcu_check_callbacks(cpu, user_ticks);
- if (p == rq->idle) {
+ if (idle) {
/* note: this timer irq context must be accounted for as well */
if (irq_count() - HARDIRQ_OFFSET >= SOFTIRQ_OFFSET)
kstat_cpu(cpu).cpustat.system += sys_ticks;
@@ -1171,8 +1239,7 @@
kstat_cpu(cpu).cpustat.iowait += sys_ticks;
else
kstat_cpu(cpu).cpustat.idle += sys_ticks;
- rebalance_tick(rq, 1);
- return;
+ goto out;
}
if (TASK_NICE(p) > 0)
kstat_cpu(cpu).cpustat.nice += user_ticks;
@@ -1180,11 +1247,6 @@
kstat_cpu(cpu).cpustat.user += user_ticks;
kstat_cpu(cpu).cpustat.system += sys_ticks;
- /* Task might have expired already, but not scheduled off yet */
- if (p->array != rq->active) {
- set_tsk_need_resched(p);
- goto out;
- }
spin_lock(&rq->lock);
/*
* The task was running during this tick - update the
@@ -1194,42 +1256,10 @@
* it possible for interactive tasks to use up their
* timeslices at their highest priority levels.
*/
- if (p->sleep_avg)
- p->sleep_avg--;
- if (unlikely(rt_task(p))) {
- /*
- * RR tasks need a special form of timeslice management.
- * FIFO tasks have no timeslices.
- */
- if ((p->policy == SCHED_RR) && !--p->time_slice) {
- p->time_slice = task_timeslice(p);
- p->first_time_slice = 0;
- set_tsk_need_resched(p);
-
- /* put it at the end of the queue: */
- dequeue_task(p, rq->active);
- enqueue_task(p, rq->active);
- }
- goto out_unlock;
- }
- if (!--p->time_slice) {
- dequeue_task(p, rq->active);
- set_tsk_need_resched(p);
- p->prio = effective_prio(p);
- p->time_slice = task_timeslice(p);
- p->first_time_slice = 0;
-
- if (!TASK_INTERACTIVE(p) || EXPIRED_STARVING(rq)) {
- if (!rq->expired_timestamp)
- rq->expired_timestamp = jiffies;
- enqueue_task(p, rq->expired);
- } else
- enqueue_task(p, rq->active);
- }
-out_unlock:
+ __scheduler_tick(rq, p);
spin_unlock(&rq->lock);
out:
- rebalance_tick(rq, 0);
+ rebalance_tick(rq, idle);
}
void scheduling_functions_start_here(void) { }
@@ -1264,8 +1294,8 @@
rq = this_rq();
release_kernel_lock(prev);
- prev->last_run = jiffies;
spin_lock_irq(&rq->lock);
+ __scheduler_tick(rq, prev);
/*
* if entering off of a kernel preemption go straight
@@ -1320,6 +1350,7 @@
if (likely(prev != next)) {
rq->nr_switches++;
rq->curr = next;
+ next->last_run = prev->last_run;
prepare_arch_switch(rq, next);
prev = context_switch(rq, prev, next);
diff -prauN linux-2.5.70-bk8/arch/i386/kernel/timers/timer_tsc.c galbraith-2.5.70-bk8-1/arch/i386/kernel/timers/timer_tsc.c
--- linux-2.5.70-bk8/arch/i386/kernel/timers/timer_tsc.c Mon Apr 21 08:11:07 2003
+++ galbraith-2.5.70-bk8-1/arch/i386/kernel/timers/timer_tsc.c Tue Jun 3 07:10:18 2003
@@ -102,12 +102,13 @@
static unsigned long long monotonic_clock_tsc(void)
{
unsigned long long last_offset, this_offset, base;
+ unsigned long flags;
/* atomically read monotonic base & last_offset */
- read_lock_irq(&monotonic_lock);
+ read_lock_irqsave(&monotonic_lock, flags);
last_offset = ((unsigned long long)last_tsc_high<<32)|last_tsc_low;
base = monotonic_base;
- read_unlock_irq(&monotonic_lock);
+ read_unlock_irqrestore(&monotonic_lock, flags);
/* Read the Time Stamp Counter */
rdtscll(this_offset);
[-- Attachment #3: sched.patch --]
[-- Type: text/plain, Size: 695 bytes --]
--- old/kernel/sched.c 2003-06-17 21:04:21.240902000 +0200
+++ new/kernel/sched.c 2003-06-17 20:58:54.840902000 +0200
@@ -66,14 +66,14 @@
* they expire.
*/
#define MIN_TIMESLICE ( 10 * HZ / 1000)
-#define MAX_TIMESLICE (200 * HZ / 1000)
+#define MAX_TIMESLICE ( 20 * HZ / 1000)
#define CHILD_PENALTY 50
#define PARENT_PENALTY 100
#define EXIT_WEIGHT 3
-#define PRIO_BONUS_RATIO 25
+#define PRIO_BONUS_RATIO 20
#define INTERACTIVE_DELTA 2
-#define MAX_SLEEP_AVG (10*HZ)
-#define STARVATION_LIMIT (10*HZ)
+#define MAX_SLEEP_AVG (2*HZ)
+#define STARVATION_LIMIT (2*HZ)
#define NODE_THRESHOLD 125
#define SCHED_NANOSECOND 1
#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: O(1) scheduler starvation
2003-06-18 7:53 O(1) scheduler starvation Felipe Alfaro Solana
@ 2003-06-18 12:04 ` Mike Galbraith
2003-06-18 12:16 ` Helge Hafting
` (2 more replies)
2003-06-18 16:51 ` William Lee Irwin III
1 sibling, 3 replies; 15+ messages in thread
From: Mike Galbraith @ 2003-06-18 12:04 UTC (permalink / raw)
To: Felipe Alfaro Solana; +Cc: davidm, LKML
At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>Hi!
>
>I've been poking around and found the following link on O(1) scheduler
>starvation problems:
>
>http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
>
>The web page contains a small test program which can supposedly cause
>starvation between two processes. However, I've been unable to reproduce what
>is described in the above link. In fact, the CPU usage ratio ranges
>between 60-40% or 70-30% in the worst cases.
(you're talking about running with my monotonic_clock() diff in your kernel, right?)
If you examine the priorities vs cpu usage, therein lies a big fat bug.
I think the fundamental problem is that you can only execute in series, but
can sleep in parallel, which makes for more sleep time existing than all
execution time combined. If you're running test-starve with my
monotonic_clock() diff, you should notice that one task is at maximum
priority and eating ~75% cpu, while the other is at minimum and getting the
rest minus what top gets. In a sane universe, that should be
impossible. In my current tree, this _is_ now impossible, but I haven't
worked out some nasty kinks.
I've got thud licked (without restricting sleep to time_slice),
test-starve works right as well, and interactivity is up. Tasks waking
each other in a loop are a bitch and a half though, and need to be beaten
about the head and shoulders. Going to a synchronous wakeup for pipes
(talking stock kernel now) cures irman process_load's ability to starve...
IFF you're running it from a vt. If you're in an xterm, it'll still climb
up from the bottom (only place where it can't starve anybody) and starve
via pass-the-baton wakeup DoS. That will/does take the joy out of using
xmms. If xmms didn't use multiple threads, it'd be much worse... right
now, you'll lose eye-candy [cpu hungry visualization stuff] before you lose
sound [at next song].
>I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
>small patch I made myself to improve interactivity (mainly, to stop XMMS
>from skipping by adjusting some scheduler parameters).
Please show me your xmms hack, and show me how you make xmms skip without
doing something that puts tons of stress on the cache. I built xmms here,
and the only time the audio thread gets starved is when starting a new
song. That's because of CHILD_PENALTY when starting a new copy of xmms
while something of prio < 20 is hogging cpu (process_load <grrrr>). Once
playing, it's rock solid here.
>What's going on here? Is the previous article outdated, or have the
>starvation problems been definitively fixed?
Nope, definitely not.
(if you like bean-counter diff, run contest/irman's process load:)
-Mike
* Re: O(1) scheduler starvation
2003-06-18 12:04 ` Mike Galbraith
@ 2003-06-18 12:16 ` Helge Hafting
2003-06-18 12:32 ` Mike Galbraith
2003-06-18 14:22 ` Felipe Alfaro Solana
2003-06-18 16:52 ` William Lee Irwin III
2 siblings, 1 reply; 15+ messages in thread
From: Helge Hafting @ 2003-06-18 12:16 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Felipe Alfaro Solana, davidm, LKML
Mike Galbraith wrote:
> At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>
>> Hi!
>>
>> I've been poking around and found the following link on O(1) scheduler
>> starvation problems:
>>
>> http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
>>
>> The web page contains a small test program which can supposedly cause
>> starvation between two processes. However, I've been unable to reproduce what
>> is described in the above link. In fact, the CPU usage ratio ranges
>> between 60-40% or 70-30% in the worst cases.
>
>
> (you're talking about running with my monotonic_clock() diff in your kernel, right?)
>
> If you examine the priorities vs cpu usage, therein lies a big fat bug.
>
> I think the fundamental problem is that you can only execute in series,
> but can sleep in parallel, which makes for more sleep time existing than
> all execution time combined.
Would dividing the sleep time by the number of sleepers fix this?
Or is division a too heavy operation here?
Helge Hafting
* Re: O(1) scheduler starvation
2003-06-18 12:16 ` Helge Hafting
@ 2003-06-18 12:32 ` Mike Galbraith
0 siblings, 0 replies; 15+ messages in thread
From: Mike Galbraith @ 2003-06-18 12:32 UTC (permalink / raw)
To: Helge Hafting; +Cc: Felipe Alfaro Solana, davidm, LKML
At 02:16 PM 6/18/2003 +0200, Helge Hafting wrote:
>Mike Galbraith wrote:
>>At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>>
>>>Hi!
>>>
>>>I've been poking around and found the following link on O(1) scheduler
>>>starvation problems:
>>>
>>>http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
>>>
>>>The web page contains a small test program which can supposedly cause
>>>starvation between two processes. However, I've been unable to reproduce what
>>>is described in the above link. In fact, the CPU usage ratio ranges
>>>between 60-40% or 70-30% in the worst cases.
>>
>>(you're talking about running with my monotonic_clock() diff in your kernel, right?)
>>If you examine the priorities vs cpu usage, therein lies a big fat bug.
>>I think the fundamental problem is that you can only execute in series,
>>but can sleep in parallel, which makes for more sleep time existing than
>>all execution time combined.
>
>Would dividing the sleep time by the number of sleepers fix this?
>Or is division a too heavy operation here?
That won't work. You'd have to plunk it all into a pot, and divvy it up by
%cpu usage or something. I solved it the simple way. I keep a smoothed
run_avg (%cpu * 100), and use that as a sleep_avg limit... i.e. if your
run_avg is 9000 (you're eating 10% cpu), no sleep will push you into
insanity land. I still have the problem that tasks will seek their
appropriate priority level and _stick_ there, so I probably need to add
some decay. (which will no doubt open the next can-O-worms)
-Mike
* Re: O(1) scheduler starvation
2003-06-18 12:04 ` Mike Galbraith
2003-06-18 12:16 ` Helge Hafting
@ 2003-06-18 14:22 ` Felipe Alfaro Solana
2003-06-18 15:54 ` Mike Galbraith
2003-06-18 16:52 ` William Lee Irwin III
2 siblings, 1 reply; 15+ messages in thread
From: Felipe Alfaro Solana @ 2003-06-18 14:22 UTC (permalink / raw)
To: Mike Galbraith; +Cc: davidm, LKML
On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> >Hi!
> >
> >I've been poking around and found the following link on O(1) scheduler
> >starvation problems:
> >
> >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
> >
> >The web page contains a small test program which can supposedly cause
> >starvation between two processes. However, I've been unable to reproduce what
> >is described in the above link. In fact, the CPU usage ratio ranges
> >between 60-40% or 70-30% in the worst cases.
>
> (you're talking about running with my monotonic_clock() diff in your kernel, right?)
Don't know exactly its name. wli posted it to LKML a few days ago. In a few
words, the patch creates
+inline void __scheduler_tick(runqueue_t *rq, task_t *p)
and
+#define SCHED_NANOSECOND 1
+#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
+#define SCHED_TICK (SCHED_SECOND / HZ)
+#define TICKS_PER_SECOND (SCHED_SECOND / SCHED_TICK)
Don't know if this is the patch you're talking about. It's not thud,
anyway.
> If you examine the priorities vs cpu usage, therein lies a big fat bug.
>
> I think the fundamental problem is that you can only execute in series, but
> can sleep in parallel, which makes for more sleep time existing than all
> execution time combined. If you're running test-starve with my
> monotonic_clock() diff, you should notice that one task is at maximum
> > priority and eating ~75% cpu, while the other is at minimum and getting the
> rest minus what top gets. In a sane universe, that should be
> impossible. In my current tree, this _is_ now impossible, but I haven't
> worked out some nasty kinks.
Exactly. This is more or less what was happening, roughly a 70-30
balance of CPU usage.
> >I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
> >small patch I made myself to improve interactivity (mainly, to stop XMMS
> >from skipping by adjusting some scheduler parameters).
>
> Please show me your xmms hack, and show me how you make xmms skip without
> doing something that puts tons of stress on the cache. I built xmms here,
> and the only time the audio thread gets starved is when starting a new
> song. That's because of CHILD_PENALTY when starting a new copy of xmms
> while something of prio < 20 is hogging cpu (process_load <grrrr>). Once
> playing, it's rock solid here.
No, I wasn't talking about an XMMS hack. I was talking about changing
scheduler defaults, as shown in the following patch:
--- old/kernel/sched.c 2003-06-17 21:04:21.240902000 +0200
+++ new/kernel/sched.c 2003-06-17 20:58:54.840902000 +0200
@@ -66,14 +66,14 @@
* they expire.
*/
#define MIN_TIMESLICE ( 10 * HZ / 1000)
-#define MAX_TIMESLICE (200 * HZ / 1000)
+#define MAX_TIMESLICE ( 20 * HZ / 1000)
#define CHILD_PENALTY 50
#define PARENT_PENALTY 100
#define EXIT_WEIGHT 3
-#define PRIO_BONUS_RATIO 25
+#define PRIO_BONUS_RATIO 20
#define INTERACTIVE_DELTA 2
-#define MAX_SLEEP_AVG (10*HZ)
-#define STARVATION_LIMIT (10*HZ)
+#define MAX_SLEEP_AVG (2*HZ)
+#define STARVATION_LIMIT (2*HZ)
#define NODE_THRESHOLD 125
#define SCHED_NANOSECOND 1
#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
To make XMMS skip, just force the X server to do a lot of repainting,
for example, by dragging a big window slowly enough over another one
which requires a lot of painting (Evolution, for example, is a good
candidate as it requires a lot of CPU to repaint uncovered areas). It's
easy to reproduce just after launching XMMS. However, after a while, it
gets difficult to make XMMS skip sound (it seems the scheduler
adjusts priorities well enough). This is on a PIII 700 MHz laptop with no
niced processes at all.
With your patch and my scheduler changes, I've been unable to make XMMS
skip audio, even when the X server is reniced to -20, or another CPU hog
is running.
* Re: O(1) scheduler starvation
2003-06-18 14:22 ` Felipe Alfaro Solana
@ 2003-06-18 15:54 ` Mike Galbraith
2003-06-18 15:59 ` Con Kolivas
2003-06-18 21:30 ` Felipe Alfaro Solana
0 siblings, 2 replies; 15+ messages in thread
From: Mike Galbraith @ 2003-06-18 15:54 UTC (permalink / raw)
To: Felipe Alfaro Solana; +Cc: davidm, LKML
At 04:22 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> > At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > >Hi!
> > >
> > >I've been poking around and found the following link on O(1) scheduler
> > >starvation problems:
> > >
> > >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
> > >
> > >The web page contains a small test program which can supposedly cause
> > >starvation between two processes. However, I've been unable to reproduce what
> > >is described in the above link. In fact, the CPU usage ratio ranges
> > >between 60-40% or 70-30% in the worst cases.
> >
> > (you're talking about running with my monotonic_clock() diff in your kernel, right?)
>
>Don't know exactly its name. wli posted it to LKML a few days ago. In a few
>words, the patch creates
>
>+inline void __scheduler_tick(runqueue_t *rq, task_t *p)
>
>and
>
>+#define SCHED_NANOSECOND 1
>+#define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
>+#define SCHED_TICK (SCHED_SECOND / HZ)
>+#define TICKS_PER_SECOND (SCHED_SECOND / SCHED_TICK)
>
>Don't know if this is the patch you're talking about. It's not thud,
>anyways.
Yeah, that's the diff. (thud is a starvation testcase posted a while back)
> > If you examine the priorities vs cpu usage, therein lies a big fat bug.
> >
> > I think the fundamental problem is that you can only execute in series,
> but
> > can sleep in parallel, which makes for more sleep time existing than all
> > execution time combined. If you're running test-starve with my
> > monotonic_clock() diff, you should notice that one task is at maximum
> > priority and eating ~75% cpu, while the other is at minimum and getting
> the
> > rest minus what top gets. In a sane universe, that should be
> > impossible. In my current tree, this _is_ now impossible, but I haven't
> > worked out some nasty kinks.
>
>Exactly. This is more or less what was happening, roughly a 70-30
>balance of CPU usage.
>
> > >I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
> > >small patch I made myself to improve interactivity (mainly, to stop XMMS
> > >from skipping by adjusting some scheduler parameters).
> >
> > Please show me your xmms hack, and show me how you make xmms skip without
> > doing something that puts tons of stress on the cache. I built xmms here,
> > and the only time the audio thread gets starved is when starting a new
> > song. That's because of CHILD_PENALTY when starting a new copy of xmms
> > while something of prio < 20 is hogging cpu (process_load <grrrr>). Once
> > playing, it's rock solid here.
>
>No, I wasn't talking about an XMMS hack. I was talking about changing
>scheduler defaults, as shown in the following patch:
Ah, ok (darn, was hoping for a fresh idea)
>--- old/kernel/sched.c 2003-06-17 21:04:21.240902000 +0200
>+++ new/kernel/sched.c 2003-06-17 20:58:54.840902000 +0200
>@@ -66,14 +66,14 @@
> * they expire.
> */
> #define MIN_TIMESLICE ( 10 * HZ / 1000)
>-#define MAX_TIMESLICE (200 * HZ / 1000)
>+#define MAX_TIMESLICE ( 20 * HZ / 1000)
(tiny slices make for much context switching, but [alone] don't help low
prio tasks get cpu)
> #define CHILD_PENALTY 50
> #define PARENT_PENALTY 100
> #define EXIT_WEIGHT 3
>-#define PRIO_BONUS_RATIO 25
>+#define PRIO_BONUS_RATIO 20
> #define INTERACTIVE_DELTA 2
>-#define MAX_SLEEP_AVG (10*HZ)
>-#define STARVATION_LIMIT (10*HZ)
>+#define MAX_SLEEP_AVG (2*HZ)
(this will increase cpu multiplexing drastically, but will also make the
desktop feel any background load pretty much instantly. btdt too, ok for
very light load, but bad for balls-to-the-wall throughput here)
>+#define STARVATION_LIMIT (2*HZ)
> #define NODE_THRESHOLD 125
> #define SCHED_NANOSECOND 1
> #define SCHED_SECOND (1000000000 * SCHED_NANOSECOND)
>
>To make XMMS skip, just force the X server to do a lot of repainting,
>for example, by dragging a big window slowly enough over another one
>which requires a lot of painting (Evolution, for example, is a good
>candidate as it requires a lot of CPU to repaint uncovered areas). It's
>easy to reproduce just after launching XMMS. However, after a while, it
>gets difficult to make XMMS skip sound (it seems the scheduler
>adjusts priorities well enough). This is on a PIII 700 MHz laptop with no
>niced processes at all.
Thanks. I don't have evolution on my linux box (pine/vi/procmail
rules). ImageMagick ought to give X more than enough spurts of frenetic
activity though. Do you have that, and does image manipulation make xmms
stutter as well? Just moving windows around and changing backgrounds
doesn't do anything here. (500 MHz PIII/128 MB RAM, btw)
>With your patch and my scheduler changes, I've been unable to make XMMS
>skip audio, even when the X server is reniced to -20, or another CPU hog
>is running.
<g> I can make it do a skip so big that SysRq-E or BRB are the only
recovery options.
-Mike
* Re: O(1) scheduler starvation
2003-06-18 15:54 ` Mike Galbraith
@ 2003-06-18 15:59 ` Con Kolivas
2003-06-18 16:29 ` Mike Galbraith
2003-06-18 21:30 ` Felipe Alfaro Solana
1 sibling, 1 reply; 15+ messages in thread
From: Con Kolivas @ 2003-06-18 15:59 UTC (permalink / raw)
To: Mike Galbraith, Felipe Alfaro Solana; +Cc: davidm, LKML
On Thu, 19 Jun 2003 01:54, Mike Galbraith wrote:
> At 04:22 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> >On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> > > At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > > >Hi!
> > > >
> > > >I've been poking around and found the following link on O(1) scheduler
> > > >starvation problems:
> > > >
> > > >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
<BIG SNIP>
Have you seen this email I just posted to lkml?
[PATCH] 2.5.72 O(1) interactivity bugfix
It may be affecting the scheduler in all sorts of ways.
Con
* Re: O(1) scheduler starvation
2003-06-18 15:59 ` Con Kolivas
@ 2003-06-18 16:29 ` Mike Galbraith
0 siblings, 0 replies; 15+ messages in thread
From: Mike Galbraith @ 2003-06-18 16:29 UTC (permalink / raw)
To: Con Kolivas; +Cc: Felipe Alfaro Solana, davidm, LKML
At 01:59 AM 6/19/2003 +1000, Con Kolivas wrote:
>On Thu, 19 Jun 2003 01:54, Mike Galbraith wrote:
> > At 04:22 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > >On Wed, 2003-06-18 at 14:04, Mike Galbraith wrote:
> > > > At 09:53 AM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
> > > > >Hi!
> > > > >
> > > > >I've been poking around and found the following link on O(1) scheduler
> > > > >starvation problems:
> > > > >
> > > > >http://www.hpl.hp.com/research/linux/kernel/o1-starve.php
><BIG SNIP>
>
>Have you seen this email I just posted to lkml?
(yeah, replied off-line)
* Re: O(1) scheduler starvation
2003-06-18 7:53 O(1) scheduler starvation Felipe Alfaro Solana
2003-06-18 12:04 ` Mike Galbraith
@ 2003-06-18 16:51 ` William Lee Irwin III
1 sibling, 0 replies; 15+ messages in thread
From: William Lee Irwin III @ 2003-06-18 16:51 UTC (permalink / raw)
To: Felipe Alfaro Solana; +Cc: davidm, LKML
On Wed, Jun 18, 2003 at 09:53:28AM +0200, Felipe Alfaro Solana wrote:
> I'm running 2.5.72-mm1 with Mike Galbraith's scheduler patches and a
> small patch I made myself to improve interactivity (mainly, to stop XMMS
> from skipping by adjusting some scheduler parameters).
> What's going on here? Is the previous article outdated, or have the
> starvation problems been definitively fixed?
The MAX_TIMESLICE change is inappropriate.
There's papering over broken algorithms, and then there's getting really
blatant about it -- and MAX_TIMESLICE going from 200 -> 20 is blatant.
-- wli
* Re: O(1) scheduler starvation
2003-06-18 12:04 ` Mike Galbraith
2003-06-18 12:16 ` Helge Hafting
2003-06-18 14:22 ` Felipe Alfaro Solana
@ 2003-06-18 16:52 ` William Lee Irwin III
2003-06-18 17:17 ` Mike Galbraith
2 siblings, 1 reply; 15+ messages in thread
From: William Lee Irwin III @ 2003-06-18 16:52 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Felipe Alfaro Solana, davidm, LKML
On Wed, Jun 18, 2003 at 02:04:45PM +0200, Mike Galbraith wrote:
> I've got thud licked (without the restricting sleep to time_slice),
> test-starve works right as well, and interactivity is up. Tasks waking
> each other in a loop is a bitch and a half though, and need to be beaten
> about the head and shoulders. Going to a synchronous wakeup for pipes
> (talking stock kernel now) cures irman process_load's ability to starve...
> IFF you're running it from a vt. If you're in an xterm, it'll still climb
> up from the bottom (only place where it can't starve anybody) and starve
> via pass-the-baton wakeup DoS. That will/does take the joy out of using
> xmms. If xmms didn't use multiple threads, it'd be much worse... right
> now, you'll lose eye-candy [cpu hungry visualization stuff] before you lose
> sound [at next song].
That's great. I'd love to see the patch.
-- wli
* Re: O(1) scheduler starvation
2003-06-18 16:52 ` William Lee Irwin III
@ 2003-06-18 17:17 ` Mike Galbraith
0 siblings, 0 replies; 15+ messages in thread
From: Mike Galbraith @ 2003-06-18 17:17 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Felipe Alfaro Solana, davidm, LKML
At 09:52 AM 6/18/2003 -0700, William Lee Irwin III wrote:
>On Wed, Jun 18, 2003 at 02:04:45PM +0200, Mike Galbraith wrote:
> > I've got thud licked (without the restricting sleep to time_slice),
> > test-starve works right as well, and interactivity is up. Tasks waking
> > each other in a loop is a bitch and a half though, and need to be beaten
> > about the head and shoulders. Going to a synchronous wakeup for pipes
> > (talking stock kernel now) cures irman process_load's ability to starve...
> > IFF you're running it from a vt. If you're in an xterm, it'll still climb
> > up from the bottom (only place where it can't starve anybody) and starve
> > via pass-the-baton wakeup DoS. That will/does take the joy out of using
> > xmms. If xmms didn't use multiple threads, it'd be much worse... right
> > now, you'll lose eye-candy [cpu hungry visualization stuff] before you
> lose
> > sound [at next song].
>
>That's great. I'd love to see the patch.
If I can get the kinks worked out and still have something left. I'm
gaining on the darn thing, but only a net millimeter at a time... with much
cha-cha-cha action in between ;-)
-Mike
* Re: O(1) scheduler starvation
@ 2003-06-18 20:44 Ricardo Galli
2003-06-18 20:48 ` Marc-Christian Petersen
0 siblings, 1 reply; 15+ messages in thread
From: Ricardo Galli @ 2003-06-18 20:44 UTC (permalink / raw)
To: linux-kernel
> Have you seen this email I just posted to lkml?
>
> [PATCH] 2.5.72 O(1) interactivity bugfix
>
> It may be affecting the scheduler in all sorts of ways.
I've applied the patch you posted on your web page for 2.4.21-ck1. It feels
more responsive, but it has an annoying side effect: it seems to slow down
new processes forked from an interactive program.
I see it when selecting a PGP-signed message in KMail; it takes up to two
seconds to show the message body. The lag wasn't so noticeable before.
Regards,
--
ricardo galli GPG id C8114D34
* Re: O(1) scheduler starvation
2003-06-18 20:44 Ricardo Galli
@ 2003-06-18 20:48 ` Marc-Christian Petersen
0 siblings, 0 replies; 15+ messages in thread
From: Marc-Christian Petersen @ 2003-06-18 20:48 UTC (permalink / raw)
To: Ricardo Galli, linux-kernel
On Wednesday 18 June 2003 22:44, Ricardo Galli wrote:
Hi Ricardo,
> > Have you seen this email I just posted to lkml?
> > [PATCH] 2.5.72 O(1) interactivity bugfix
> > It may be affecting the scheduler in all sorts of ways.
>
> I've applied the patch you posted on your web page for 2.4.21-ck1. It feels
> more responsive, but it has an annoying side effect: it seems to slow down
> new processes forked from an interactive program.
> I see it when selecting a PGP-signed message in KMail; it takes up to two
> seconds to show the message body. The lag wasn't so noticeable before.
Robert already reported this.
ciao, Marc
* Re: O(1) scheduler starvation
2003-06-18 15:54 ` Mike Galbraith
2003-06-18 15:59 ` Con Kolivas
@ 2003-06-18 21:30 ` Felipe Alfaro Solana
2003-06-18 22:32 ` Mike Galbraith
1 sibling, 1 reply; 15+ messages in thread
From: Felipe Alfaro Solana @ 2003-06-18 21:30 UTC (permalink / raw)
To: Mike Galbraith; +Cc: davidm, LKML
On Wed, 2003-06-18 at 17:54, Mike Galbraith wrote:
> >To make XMMS skip, just force the X server to do a lot of repainting,
> >for example, by dragging a big window slowly enough over another one
> >which requires a lot of painting (Evolution, for example, is a good
> >candidate as it requires a lot of CPU to repaint uncovered areas). It's
> >easy to reproduce just after launching XMMS. However, after a while, it
> >gets difficult to make XMMS skip sound (it seems the scheduler
> >adjusts priorities well enough). This is on a 700 MHz PIII laptop with no
> >niced processes at all.
>
> Thanks. I don't have evolution on my linux box (pine/vi/procmail
> rules). ImageMagick ought to give X more than enough spurts of frenetic
> activity though. Do you have that, and does image manipulation make xmms
> stutter as well? Just moving windows around and changing backgrounds
> doesn't do anything here. (500 MHz PIII/128 MB RAM, btw)
In fact, I can only make XMMS skip sound for a very brief period, just
after starting it up. After a few seconds, it seems the dynamic
priorities are adjusted and I can't make XMMS skip sound anymore. To
reproduce it, open up a big window, then launch XMMS and have it play
some MP3 file. Then start moving the big window around. For me, this
causes XMMS to skip sound for a brief period (~5 seconds).
* Re: O(1) scheduler starvation
2003-06-18 21:30 ` Felipe Alfaro Solana
@ 2003-06-18 22:32 ` Mike Galbraith
0 siblings, 0 replies; 15+ messages in thread
From: Mike Galbraith @ 2003-06-18 22:32 UTC (permalink / raw)
To: Felipe Alfaro Solana; +Cc: davidm, LKML
At 11:30 PM 6/18/2003 +0200, Felipe Alfaro Solana wrote:
>On Wed, 2003-06-18 at 17:54, Mike Galbraith wrote:
> > >To make XMMS skip, just force the X server to do a lot of repainting,
> > >for example, by dragging a big window slowly enough over another one
> > >which requires a lot of painting (Evolution, for example, is a good
> > >candidate as it requires a lot of CPU to repaint uncovered areas). It's
> > >easy to reproduce just after launching XMMS. However, after a while, it
> > >gets difficult to make XMMS skip sound (it seems the scheduler
> > >adjusts priorities well enough). This is on a 700 MHz PIII laptop with no
> > >niced processes at all.
> >
> > Thanks. I don't have evolution on my linux box (pine/vi/procmail
> > rules). ImageMagick ought to give X more than enough spurts of frenetic
> > activity though. Do you have that, and does image manipulation make xmms
> > stutter as well? Just moving windows around and changing backgrounds
> > doesn't do anything here. (500mhz piii/128mb ram btw)
>
>In fact, I can only make XMMS skip sound for a very brief period, just
>after starting it up. After a few seconds, it seems the dynamic
>priorities are adjusted and I can't make XMMS skip sound anymore. To
>reproduce it, open up a big window, then launch XMMS and have it play
>some MP3 file. Then start moving the big window around. For me, this
>causes XMMS to skip sound for a brief period (~5 seconds).
If you bump CHILD_PENALTY up (to say 75) you should see a marked improvement.
-Mike
End of thread, other threads: [~2003-06-18 22:14 UTC | newest]
Thread overview: 15+ messages
2003-06-18 7:53 O(1) scheduler starvation Felipe Alfaro Solana
2003-06-18 12:04 ` Mike Galbraith
2003-06-18 12:16 ` Helge Hafting
2003-06-18 12:32 ` Mike Galbraith
2003-06-18 14:22 ` Felipe Alfaro Solana
2003-06-18 15:54 ` Mike Galbraith
2003-06-18 15:59 ` Con Kolivas
2003-06-18 16:29 ` Mike Galbraith
2003-06-18 21:30 ` Felipe Alfaro Solana
2003-06-18 22:32 ` Mike Galbraith
2003-06-18 16:52 ` William Lee Irwin III
2003-06-18 17:17 ` Mike Galbraith
2003-06-18 16:51 ` William Lee Irwin III
-- strict thread matches above, loose matches on Subject: below --
2003-06-18 20:44 Ricardo Galli
2003-06-18 20:48 ` Marc-Christian Petersen