public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Williams <pwil3058@bigpond.net.au>
To: Peter Williams <pwil3058@bigpond.net.au>
Cc: Andrew Morton <akpm@osdl.org>,
	suresh.b.siddha@intel.com, kernel@kolivas.org, npiggin@suse.de,
	mingo@elte.hu, rostedt@goodmis.org, linux-kernel@vger.kernel.org,
	torvalds@osdl.org
Subject: Re: [rfc][patch] sched: remove smpnice
Date: Mon, 13 Feb 2006 10:10:23 +1100	[thread overview]
Message-ID: <43EFC05F.6010204@bigpond.net.au> (raw)
In-Reply-To: <43EE8BAD.9040601@bigpond.net.au>

Peter Williams wrote:
> Andrew Morton wrote:
> 
>> Peter Williams <pwil3058@bigpond.net.au> wrote:
>>
>>> I don't think either of these issues warrant abandoning smpnice.  The 
>>> first is highly unlikely to occur on real systems and the second is 
>>> just an example of the patch doing its job (maybe too officiously).  
>>> I don't think users would notice either on real systems.
>>>
>>> Even if you pull it from 2.6.16 rather than upgrading it with my 
>>> patch can you please leave both in -mm?
>>
>>
>>
>> Yes, I have done that.  I currently have:
>>
>> sched-restore-smpnice.patch
>> sched-modified-nice-support-for-smp-load-balancing.patch
>> sched-cleanup_task_activated.patch
>> sched-alter_uninterruptible_sleep_interactivity.patch
>> sched-make_task_noninteractive_use_sleep_type.patch
>> sched-dont_decrease_idle_sleep_avg.patch
>> sched-include_noninteractive_sleep_in_idle_detect.patch
>> sched-new-sched-domain-for-representing-multi-core.patch
>> sched-fix-group-power-for-allnodes_domains.patch
> 
> 
> OK.  Having slept on these problems I am now of the opinion that the 
> problems are caused by the use of NICE_TO_BIAS_PRIO(0) to set *imbalance 
> inside the (*imbalance < SCHED_LOAD_SCALE) if statement in 
> find_busiest_group().  What is happening here is that even though the 
> imbalance is less than one (average) task sometimes the decision is made 
> to move a task anyway but with the current version this decision can be 
> subverted in two ways: 1) if the task to be moved has a nice value less 
> than zero the value of *imbalance that is set will be too small for 
> move_tasks() to move it; and 2) if there are a number of tasks with nice 
> values greater than zero on the "busiest" more than one of them may be 
> moved as the value of *imbalance that is set may be big enough to 
> include more than one of these tasks.
> 
> The fix for this problem is to replace NICE_TO_BIAS_PRIO(0) with the 
> "average bias prio per runnable task" on "busiest".  This will 
> (generally) result in a larger value for *imbalance in case 1. above and 
> a smaller one in case 2. and alleviate both problems.  A patch to apply 
> this fix is attached.
> 
> Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
> 
> Could you please add this patch to -mm so that it can be tested?
> 
> Thanks
> Peter
> 
> 
> ------------------------------------------------------------------------
> 
> Index: MM-2.6.X/kernel/sched.c
> ===================================================================
> --- MM-2.6.X.orig/kernel/sched.c	2006-02-12 11:24:48.000000000 +1100
> +++ MM-2.6.X/kernel/sched.c	2006-02-12 11:35:40.000000000 +1100
> @@ -735,6 +735,19 @@ static inline unsigned long biased_load(
>  {
>  	return (wload * NICE_TO_BIAS_PRIO(0)) / SCHED_LOAD_SCALE;
>  }
> +
> +/* get the average biased load per runnable task for a run queue */
> +static inline unsigned long avg_biased_load(runqueue_t *rq)
> +{
> +	/*
> +	 * When I'm convinced that this won't be called with a zero nr_running
> +	 * and that it can't change during the call this can be simplified.
> +	 * For the time being and for proof of concept let's paly it safe.
> +	 */
> +	unsigned long n = rq->nr_running;
> +
> +	return n ? rq->prio_bias / n : 0;
> +}
>  #else
>  static inline void set_bias_prio(task_t *p)
>  {
> @@ -2116,7 +2129,7 @@ find_busiest_group(struct sched_domain *
>  		unsigned long tmp;
>  
>  		if (max_load - this_load >= SCHED_LOAD_SCALE*2) {
> -			*imbalance = NICE_TO_BIAS_PRIO(0);
> +			*imbalance = avg_biased_load(busiest);
>  			return busiest;
>  		}
>  
> @@ -2149,7 +2162,7 @@ find_busiest_group(struct sched_domain *
>  		if (pwr_move <= pwr_now)
>  			goto out_balanced;
>  
> -		*imbalance = NICE_TO_BIAS_PRIO(0);
> +		*imbalance = avg_biased_load(busiest);
>  		return busiest;
>  	}
>  

Can we pull this one, please?  I've mistakenly assumed that busiest was 
a run queue when it's actually a sched group. :-(

Peter
-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce

  reply	other threads:[~2006-02-12 23:10 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-02-07 14:28 [rfc][patch] sched: remove smpnice Nick Piggin
2006-02-07 14:57 ` Con Kolivas
2006-02-07 15:05   ` Nick Piggin
2006-02-07 22:15   ` Andrew Morton
2006-02-07 23:11     ` Con Kolivas
2006-02-07 23:36       ` Andrew Morton
2006-02-08  3:28         ` Nick Piggin
2006-02-08 14:56         ` Ingo Molnar
2006-02-10  7:01         ` Siddha, Suresh B
2006-02-10  7:17           ` Andrew Morton
2006-02-10  7:23             ` Con Kolivas
2006-02-10  9:06             ` Ingo Molnar
2006-02-11  1:27             ` Peter Williams
2006-02-11  2:00               ` Andrew Morton
2006-02-12  1:13                 ` Peter Williams
2006-02-12 23:10                   ` Peter Williams [this message]
2006-02-13  1:06                     ` Peter Williams
2006-02-14  0:37                       ` Peter Williams
2006-02-14  8:53                         ` Siddha, Suresh B
2006-02-11  3:36               ` Peter Williams
2006-02-11  4:04               ` Peter Williams
2006-02-14  9:07               ` Siddha, Suresh B
2006-02-14 22:40                 ` Peter Williams
2006-02-14 23:44                   ` Paul Jackson
2006-02-15  0:09                     ` Peter Williams
2006-02-15  1:00                       ` Paul Jackson
2006-02-15  7:07                   ` Siddha, Suresh B
2006-02-15 22:36                     ` Peter Williams
2006-02-15 23:29                       ` Peter Williams
2006-02-13 14:12           ` Con Kolivas
2006-02-07 23:20     ` Peter Williams
2006-02-07 23:29       ` Con Kolivas
2006-02-07 23:36       ` Martin Bligh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43EFC05F.6010204@bigpond.net.au \
    --to=pwil3058@bigpond.net.au \
    --cc=akpm@osdl.org \
    --cc=kernel@kolivas.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@suse.de \
    --cc=rostedt@goodmis.org \
    --cc=suresh.b.siddha@intel.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox