From: Peter Williams <pwil3058@bigpond.net.au>
To: Peter Williams <pwil3058@bigpond.net.au>
Cc: Andrew Morton <akpm@osdl.org>,
suresh.b.siddha@intel.com, kernel@kolivas.org, npiggin@suse.de,
mingo@elte.hu, rostedt@goodmis.org, linux-kernel@vger.kernel.org,
torvalds@osdl.org
Subject: Re: [rfc][patch] sched: remove smpnice
Date: Mon, 13 Feb 2006 10:10:23 +1100 [thread overview]
Message-ID: <43EFC05F.6010204@bigpond.net.au> (raw)
In-Reply-To: <43EE8BAD.9040601@bigpond.net.au>
Peter Williams wrote:
> Andrew Morton wrote:
>
>> Peter Williams <pwil3058@bigpond.net.au> wrote:
>>
>>> I don't think either of these issues warrant abandoning smpnice. The
>>> first is highly unlikely to occur on real systems and the second is
>>> just an example of the patch doing its job (maybe too officiously).
>>> I don't think users would notice either on real systems.
>>>
>>> Even if you pull it from 2.6.16 rather than upgrading it with my
>>> patch can you please leave both in -mm?
>>
>>
>>
>> Yes, I have done that. I currently have:
>>
>> sched-restore-smpnice.patch
>> sched-modified-nice-support-for-smp-load-balancing.patch
>> sched-cleanup_task_activated.patch
>> sched-alter_uninterruptible_sleep_interactivity.patch
>> sched-make_task_noninteractive_use_sleep_type.patch
>> sched-dont_decrease_idle_sleep_avg.patch
>> sched-include_noninteractive_sleep_in_idle_detect.patch
>> sched-new-sched-domain-for-representing-multi-core.patch
>> sched-fix-group-power-for-allnodes_domains.patch
>
>
> OK. Having slept on these problems I am now of the opinion that the
> problems are caused by the use of NICE_TO_BIAS_PRIO(0) to set *imbalance
> inside the (*imbalance < SCHED_LOAD_SCALE) if statement in
> find_busiest_group(). What is happening here is that even though the
> imbalance is less than one (average) task sometimes the decision is made
> to move a task anyway but with the current version this decision can be
> subverted in two ways: 1) if the task to be moved has a nice value less
> than zero the value of *imbalance that is set will be too small for
> move_tasks() to move it; and 2) if there are a number of tasks with nice
> values greater than zero on the "busiest" more than one of them may be
> moved as the value of *imbalance that is set may be big enough to
> include more than one of these tasks.
>
> The fix for this problem is to replace NICE_TO_BIAS_PRIO(0) with the
> "average bias prio per runnable task" on "busiest". This will
> (generally) result in a larger value for *imbalance in case 1. above and
> a smaller one in case 2. and alleviate both problems. A patch to apply
> this fix is attached.
>
> Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
>
> Could you please add this patch to -mm so that it can be tested?
>
> Thanks
> Peter
>
>
> ------------------------------------------------------------------------
>
> Index: MM-2.6.X/kernel/sched.c
> ===================================================================
> --- MM-2.6.X.orig/kernel/sched.c 2006-02-12 11:24:48.000000000 +1100
> +++ MM-2.6.X/kernel/sched.c 2006-02-12 11:35:40.000000000 +1100
> @@ -735,6 +735,19 @@ static inline unsigned long biased_load(
> {
> return (wload * NICE_TO_BIAS_PRIO(0)) / SCHED_LOAD_SCALE;
> }
> +
> +/* get the average biased load per runnable task for a run queue */
> +static inline unsigned long avg_biased_load(runqueue_t *rq)
> +{
> + /*
> + * When I'm convinced that this won't be called with a zero nr_running
> + * and that it can't change during the call this can be simplified.
> + * For the time being and for proof of concept let's paly it safe.
> + */
> + unsigned long n = rq->nr_running;
> +
> + return n ? rq->prio_bias / n : 0;
> +}
> #else
> static inline void set_bias_prio(task_t *p)
> {
> @@ -2116,7 +2129,7 @@ find_busiest_group(struct sched_domain *
> unsigned long tmp;
>
> if (max_load - this_load >= SCHED_LOAD_SCALE*2) {
> - *imbalance = NICE_TO_BIAS_PRIO(0);
> + *imbalance = avg_biased_load(busiest);
> return busiest;
> }
>
> @@ -2149,7 +2162,7 @@ find_busiest_group(struct sched_domain *
> if (pwr_move <= pwr_now)
> goto out_balanced;
>
> - *imbalance = NICE_TO_BIAS_PRIO(0);
> + *imbalance = avg_biased_load(busiest);
> return busiest;
> }
>
Can we pull this one, please? I've mistakenly assumed that busiest was
a run queue when it's actually a sched group. :-(
Peter
--
Peter Williams pwil3058@bigpond.net.au
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
next prev parent reply other threads:[~2006-02-12 23:10 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-02-07 14:28 [rfc][patch] sched: remove smpnice Nick Piggin
2006-02-07 14:57 ` Con Kolivas
2006-02-07 15:05 ` Nick Piggin
2006-02-07 22:15 ` Andrew Morton
2006-02-07 23:11 ` Con Kolivas
2006-02-07 23:36 ` Andrew Morton
2006-02-08 3:28 ` Nick Piggin
2006-02-08 14:56 ` Ingo Molnar
2006-02-10 7:01 ` Siddha, Suresh B
2006-02-10 7:17 ` Andrew Morton
2006-02-10 7:23 ` Con Kolivas
2006-02-10 9:06 ` Ingo Molnar
2006-02-11 1:27 ` Peter Williams
2006-02-11 2:00 ` Andrew Morton
2006-02-12 1:13 ` Peter Williams
2006-02-12 23:10 ` Peter Williams [this message]
2006-02-13 1:06 ` Peter Williams
2006-02-14 0:37 ` Peter Williams
2006-02-14 8:53 ` Siddha, Suresh B
2006-02-11 3:36 ` Peter Williams
2006-02-11 4:04 ` Peter Williams
2006-02-14 9:07 ` Siddha, Suresh B
2006-02-14 22:40 ` Peter Williams
2006-02-14 23:44 ` Paul Jackson
2006-02-15 0:09 ` Peter Williams
2006-02-15 1:00 ` Paul Jackson
2006-02-15 7:07 ` Siddha, Suresh B
2006-02-15 22:36 ` Peter Williams
2006-02-15 23:29 ` Peter Williams
2006-02-13 14:12 ` Con Kolivas
2006-02-07 23:20 ` Peter Williams
2006-02-07 23:29 ` Con Kolivas
2006-02-07 23:36 ` Martin Bligh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=43EFC05F.6010204@bigpond.net.au \
--to=pwil3058@bigpond.net.au \
--cc=akpm@osdl.org \
--cc=kernel@kolivas.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=npiggin@suse.de \
--cc=rostedt@goodmis.org \
--cc=suresh.b.siddha@intel.com \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox