linux-kernel.vger.kernel.org archive mirror
* [QUERY] Confusing usage of rq->nr_running in load balancing
@ 2014-09-03 12:21 Preeti U Murthy
  2014-09-03 15:30 ` Peter Zijlstra
  2014-09-03 16:58 ` Vincent Guittot
  0 siblings, 2 replies; 7+ messages in thread
From: Preeti U Murthy @ 2014-09-03 12:21 UTC
  To: Vincent Guittot, peterz@infradead.org, Ingo Molnar, Rik van Riel,
	Morten Rasmussen
  Cc: LKML, Mike Galbraith, Nicolas Pitre, daniel.lezcano@linaro.org,
	Dietmar Eggemann, Kamalesh Babulal, Srikar Dronamraju

Hi,

There are places in the load balancing part of kernel/sched/fair.c where
rq->nr_running is used instead of cfs_rq->nr_running. At least I could
not make out why the former is used in the scenarios listed below.
It looks to me that it can very well lead to incorrect load balancing.
I did not pay attention to the numa balancing part of the code
while skimming through this file for such cases; there are a
couple of places there too which need to be scrutinized.

1. load_balance(): The check (busiest->nr_running > 1)
If the additional tasks belong to other scheduling classes, there is
nothing for the fair class to move, so the load balancing would be
futile, wouldn't it?

2. active_load_balance_cpu_stop(): A similar check and a similar
consequence as 1 here.

3. nohz_kick_needed(): We check for more than one task on the runqueue
and hence trigger load balancing even if the additional tasks are rt tasks.

4. cpu_avg_load_per_task(): This stands out among the rest as an
incorrect usage of rq->nr_running in place of cfs_rq->nr_running. We
divide the load associated with the cfs_rq by the number of tasks on the
rq. Whenever tasks of other classes are on the rq, this will make the
average load per cfs task look smaller than it is.

5. task_hot(): I am not too sure about the consequences of using
rq->nr_running here.

6. update_sg_lb_stats(): sgs->sum_nr_running is the sum of
rq->nr_running and thus propagates throughout the load balancing code path.

7. sg_capacity_factor(): Returns the capacity factor measured against
the cpu capacity available to fair tasks. We then compare this with the
rq->nr_running in update_sg_lb_stats(), update_sd_pick_busiest() and
calculate_imbalance().

8. find_busiest_queue(): This anomaly shows up when we filter against
rq->nr_running == 1 and the imbalance cannot be taken care of by the
existing task on this rq.

Did I miss something or is it true that the usage of rq->nr_running in
the above places is incorrect?

Thanks

Regards
Preeti U Murthy




Thread overview: 7+ messages
2014-09-03 12:21 [QUERY] Confusing usage of rq->nr_running in load balancing Preeti U Murthy
2014-09-03 15:30 ` Peter Zijlstra
2014-09-03 16:58 ` Vincent Guittot
2014-09-05 12:19   ` Preeti U Murthy
2014-09-05 12:27     ` Vincent Guittot
2014-09-10  8:21       ` Preeti U Murthy
2014-09-15  4:16   ` Preeti U Murthy
