public inbox for linux-kernel@vger.kernel.org
* SD_SHARE_CPUPOWER breaks scheduler fairness
From: Steve Rotolo @ 2005-05-31 17:46 UTC
  To: linux-kernel; +Cc: bugsy

The SD_SHARE_CPUPOWER flag in SMT scheduling domains (hyperthreaded
systems) can starve SCHED_OTHER tasks and even hang the system.  A
long-running (or runaway) SCHED_FIFO task causes SCHED_OTHER tasks to
get stuck on the sibling cpu's runqueue without any chance to run.  The
sibling cpu simply stays idle with tasks on its runqueue for as long as
the SCHED_FIFO task runs on the other sibling cpu.  The culprit is
dependent_sleeper() in sched.c.
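
To make the behavior concrete, here is a toy model of the decision as I
understand it.  It is not the kernel's dependent_sleeper() source; the
struct and function names (toy_task, toy_must_stay_idle) are made up for
illustration, and it only encodes the rule described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_task {
    bool realtime;   /* true for SCHED_FIFO, false for SCHED_OTHER */
};

/*
 * Returns true when a cpu should stay idle instead of running `next`,
 * following the behavior described in this mail: a non-real-time task
 * is held back while the sibling cpu is running a real-time task.
 * `sibling_curr` is NULL when the sibling cpu is idle.
 */
bool toy_must_stay_idle(const struct toy_task *sibling_curr,
                        const struct toy_task *next)
{
    if (sibling_curr == NULL)
        return false;
    return sibling_curr->realtime && !next->realtime;
}
```

In this model the held-back task is never moved anywhere; the function
just keeps answering "stay idle" for as long as the sibling's task is
real-time.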

I guess SD_SHARE_CPUPOWER is supposed to make the scheduler
prohibit non-real-time tasks from running on a cpu while a real-time
task is running on the sibling cpu.  The problem is that the SCHED_OTHER
tasks are not migrated to a different runqueue and essentially get stuck
on a dead runqueue until either the SCHED_FIFO task yields or the
load-balancer moves them.  Unfortunately, the load-balancer will never
migrate a task if the runqueue lengths are not sufficiently out of
balance.  Even more unfortunate, the load-balancer will actually move
tasks *to* the dead runqueue if it is less busy.  And still worse, since
SD_WAKE_IDLE is also set in the scheduling domain, the dead cpu will
actually attract waking tasks to it because it is idle!  The cpu becomes
a sort of black hole, sucking in innocent tasks so they can no longer
run.
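
The black-hole effect can be sketched with another toy model.  Again,
this is only an illustration under my own assumptions: toy_rq,
toy_cpu_looks_idle, toy_wake_cpu and toy_wake_many are hypothetical
names, and the placement rule is a simplified stand-in for wake-idle
behavior, not the real scheduler code.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 2

/* Toy model only; field and function names are hypothetical. */
struct toy_rq {
    int  nr_queued;    /* SCHED_OTHER tasks waiting on this runqueue */
    bool running_rt;   /* a SCHED_FIFO task occupies this cpu */
    bool held_idle;    /* kept idle because the sibling runs a real-time task */
};

/* A cpu looks idle to the wake-up path when nothing is executing on it. */
bool toy_cpu_looks_idle(const struct toy_rq *rq)
{
    return !rq->running_rt && (rq->held_idle || rq->nr_queued == 0);
}

/* Simplified wake-idle placement: prefer a cpu that looks idle. */
int toy_wake_cpu(const struct toy_rq *rqs, int prev_cpu)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (toy_cpu_looks_idle(&rqs[cpu]))
            return cpu;
    return prev_cpu;
}

/*
 * Wake `n` tasks that last ran on prev_cpu.  Returns how many of them
 * were queued on a cpu that is being held idle, i.e. cannot run them.
 */
int toy_wake_many(struct toy_rq *rqs, int prev_cpu, int n)
{
    int stuck = 0;

    for (int i = 0; i < n; i++) {
        int cpu = toy_wake_cpu(rqs, prev_cpu);

        rqs[cpu].nr_queued++;
        if (rqs[cpu].held_idle)
            stuck++;
    }
    return stuck;
}
```

With cpu 0 running a SCHED_FIFO task and cpu 1 held idle, every woken
task in this model lands on cpu 1, because a held-idle cpu keeps looking
idle no matter how many tasks are queued on it.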

The worst-case scenario is N spinning SCHED_FIFO tasks on an N-way
hyperthreaded system.  This hangs the system since nothing else can
run on the remaining virtual cpus.  If you turn off the
SD_SHARE_CPUPOWER flag, the system stays fully functional until you
have N*2 spinners hogging all the virtual cpus.
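
A spinner of the kind described above could look roughly like the
sketch below.  The helper names become_fifo() and spin_for() are my own,
the lowest real-time priority is an arbitrary choice, and the spin is
bounded on purpose: an unbounded SCHED_FIFO spinner is exactly what
hangs the box.  Switching to SCHED_FIFO normally requires root.

```c
#include <sched.h>
#include <time.h>

/*
 * Switch the calling process to SCHED_FIFO at the lowest real-time
 * priority.  Returns 0 on success, -1 on failure (typically because
 * the caller lacks the privilege to use a real-time policy).
 */
int become_fifo(void)
{
    struct sched_param param;
    int prio = sched_get_priority_min(SCHED_FIFO);

    if (prio == -1)
        return -1;
    param.sched_priority = prio;
    return sched_setscheduler(0, SCHED_FIFO, &param);
}

/*
 * Busy-loop until at least `seconds` seconds of wall-clock time have
 * passed (one-second granularity).  Returns the number of iterations.
 */
unsigned long spin_for(int seconds)
{
    time_t start = time(NULL);
    unsigned long iterations = 0;

    while (difftime(time(NULL), start) < (double)seconds)
        iterations++;
    return iterations;
}
```

A caller would invoke become_fifo() and, if it returns 0, spin_for()
with a short duration, running one such process per physical cpu to
approximate the N-spinner case.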

I get the same behavior from 2.6.9 to 2.6.12-rc5.  So is this a bug or a
feature?

-- 
Steve Rotolo
Concurrent Computer Corporation


Thread overview: 15+ messages
2005-05-31 17:46 SD_SHARE_CPUPOWER breaks scheduler fairness Steve Rotolo
2005-06-01  2:49 ` Con Kolivas
2005-06-01 14:29   ` Steve Rotolo
2005-06-01 14:47     ` Con Kolivas
2005-06-01 18:41       ` Steve Rotolo
2005-06-01 21:37         ` Con Kolivas
2005-06-01 21:54           ` Con Kolivas
2005-06-01 22:01           ` Steve Rotolo
2005-06-02  3:01             ` Con Kolivas
2005-06-01 23:16           ` Joe Korty
2005-06-01 23:25             ` Con Kolivas
2005-06-02 13:30               ` Steve Rotolo
2005-06-02 13:34                 ` Con Kolivas
2005-06-02 15:48                   ` Steve Rotolo
2005-06-03  0:43                     ` [PATCH] SCHED: run SCHED_NORMAL tasks with real time tasks on SMT siblings Con Kolivas
