public inbox for linux-kernel@vger.kernel.org
From: Peter Williams <pwil3058@bigpond.net.au>
To: Peter Williams <pwil3058@bigpond.net.au>
Cc: Con Kolivas <kernel@kolivas.org>,
	Martin Bligh <mbligh@google.com>, Andrew Morton <akpm@osdl.org>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	Andy Whitcroft <apw@shadowen.org>
Subject: Re: -mm seems significantly slower than mainline on kernbench
Date: Fri, 13 Jan 2006 23:00:51 +1100	[thread overview]
Message-ID: <43C79673.8040507@bigpond.net.au> (raw)
In-Reply-To: <43C75178.80809@bigpond.net.au>

Peter Williams wrote:
> Peter Williams wrote:
> 
>> Peter Williams wrote:
>>
>>> Martin Bligh wrote:
>>>
>>>>
>>>>>>
>>>>>> But I was thinking more about the code that (in the original) 
>>>>>> handled the case where the number of tasks to be moved was less 
>>>>>> than 1 but more than 0 (i.e. the cases where "imbalance" would 
>>>>>> have been reduced to zero when divided by SCHED_LOAD_SCALE).  I 
>>>>>> think that I got that part wrong and you can end up with a bias 
>>>>>> load to be moved which is less than any of the bias_prio values 
>>>>>> for any queued tasks (in circumstances where the original code 
>>>>>> would have rounded up to 1 and caused a move).  I think that the 
>>>>>> way to handle this problem is to replace 1 with "average bias 
>>>>>> prio" within that logic.  This would guarantee at least one task 
>>>>>> with a bias_prio small enough to be moved.
>>>>>>
>>>>>> I think that this analysis is a strong argument for my original 
>>>>>> patch being the cause of the problem so I'll go ahead and generate 
>>>>>> a fix. I'll try to have a patch available later this morning.
>>>>>
>>>>> Attached is a patch that addresses this problem.  Unlike the 
>>>>> description above, it does not use "average bias prio", as that 
>>>>> solution would be very complicated.  Instead it assumes that 
>>>>> NICE_TO_BIAS_PRIO(0) is "good enough" for this purpose, as it is 
>>>>> highly likely to be the median bias prio and the median is probably 
>>>>> better for this purpose than the average.
>>>>>
>>>>> Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
>>>>
>>>> Doesn't fix the perf issue.
>>>
>>> OK, thanks.  I think there are a few more places where SCHED_LOAD_SCALE 
>>> needs to be multiplied by NICE_TO_BIAS_PRIO(0): basically, anywhere 
>>> that it's added to, subtracted from or compared to a load.  In those 
>>> cases it's being used as a scaled version of 1 and we need a scaled 
>>
>> This would have been better said as "the load generated by 1 task" 
>> rather than just "a scaled version of 1".  Numerically, they're the 
>> same but one is clearer than the other and makes it more obvious why 
>> we need NICE_TO_BIAS_PRIO(0) * SCHED_LOAD_SCALE and where we need it.
>>
>>> version of NICE_TO_BIAS_PRIO(0).  I'll have another patch later today.
>>
>> I'm just testing this at the moment.
> 
> Attached is a new patch to fix the excessive idle problem.  This patch 
> takes a new approach to the problem as it was becoming obvious that 
> trying to alter the load balancing code to cope with biased load was 
> harder than it seemed.
> 
> This approach reverts to the old load values but weights them according 
> to tasks' bias_prio values.  This means that any assumptions by the load 
> balancing code that the load generated by a single task is 
> SCHED_LOAD_SCALE will still hold.  Then, in find_busiest_group(), the 
> imbalance is scaled back up to bias_prio scale so that move_tasks() can 
> move biased load rather than tasks.
> 
> One advantage of this is that when there are no non-zero-niced tasks 
> the processing is mathematically identical to the original code. 
> Kernbench results from a 2-CPU 550MHz Celeron system are:
> 
> Average Optimal -j 8 Load Run:
> Elapsed Time 1056.16 (0.831102)
> User Time 1906.54 (1.38447)
> System Time 182.086 (0.973386)
> Percent CPU 197 (0)
> Context Switches 48727.2 (249.351)
> Sleeps 27623.4 (413.913)
> 
> This indicates that, on average, 98.9% of the total available CPU was 
> used by the build.

Here are the numbers for the same machine with the "improved smp nice 
handling" completely removed, i.e. back to the 2.6.15 version.

Average Optimal -j 8 Load Run:
Elapsed Time 1059.95 (1.19324)
User Time 1914.94 (1.11102)
System Time 181.846 (0.916695)
Percent CPU 197.4 (0.547723)
Context Switches 40917.4 (469.184)
Sleeps 26654 (320.824)

> 
> Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
> 
> BTW I think that we need to consider a slightly more complex nice-to-bias 
> mapping function.  The current one gives a nice==19 task 1/20 of the 
> bias of a nice==0 task but gives a nice==-20 task only twice the bias 
> of a nice==0 task.  I don't think this is a big problem, as the majority 
> of non-nice==0 tasks will have positive nice, but it should be looked 
> at for a future enhancement.
> 
> Peter
> 
> 
> ------------------------------------------------------------------------
> 
> Index: MM-2.6.X/kernel/sched.c
> ===================================================================
> --- MM-2.6.X.orig/kernel/sched.c	2006-01-13 14:53:34.000000000 +1100
> +++ MM-2.6.X/kernel/sched.c	2006-01-13 15:11:19.000000000 +1100
> @@ -1042,7 +1042,8 @@ void kick_process(task_t *p)
>  static unsigned long source_load(int cpu, int type)
>  {
>  	runqueue_t *rq = cpu_rq(cpu);
> -	unsigned long load_now = rq->prio_bias * SCHED_LOAD_SCALE;
> +	unsigned long load_now = (rq->prio_bias * SCHED_LOAD_SCALE) /
> +		NICE_TO_BIAS_PRIO(0);
>  
>  	if (type == 0)
>  		return load_now;
> @@ -1056,7 +1057,8 @@ static unsigned long source_load(int cpu
>  static inline unsigned long target_load(int cpu, int type)
>  {
>  	runqueue_t *rq = cpu_rq(cpu);
> -	unsigned long load_now = rq->prio_bias * SCHED_LOAD_SCALE;
> +	unsigned long load_now = (rq->prio_bias * SCHED_LOAD_SCALE) /
> +		NICE_TO_BIAS_PRIO(0);
>  
>  	if (type == 0)
>  		return load_now;
> @@ -1322,7 +1324,8 @@ static int try_to_wake_up(task_t *p, uns
>  			 * of the current CPU:
>  			 */
>  			if (sync)
> -				tl -= p->bias_prio * SCHED_LOAD_SCALE;
> +				tl -= (p->bias_prio * SCHED_LOAD_SCALE) /
> +					NICE_TO_BIAS_PRIO(0);
>  
>  			if ((tl <= load &&
>  				tl + target_load(cpu, idx) <= SCHED_LOAD_SCALE) ||
> @@ -2159,7 +2162,7 @@ find_busiest_group(struct sched_domain *
>  	}
>  
>  	/* Get rid of the scaling factor, rounding down as we divide */
> -	*imbalance = *imbalance / SCHED_LOAD_SCALE;
> +	*imbalance = (*imbalance * NICE_TO_BIAS_PRIO(0)) / SCHED_LOAD_SCALE;
>  	return busiest;
>  
>  out_balanced:
> @@ -2472,7 +2475,8 @@ static void rebalance_tick(int this_cpu,
>  	struct sched_domain *sd;
>  	int i;
>  
> -	this_load = this_rq->prio_bias * SCHED_LOAD_SCALE;
> +	this_load = (this_rq->prio_bias * SCHED_LOAD_SCALE) /
> +		NICE_TO_BIAS_PRIO(0);
>  	/* Update our load */
>  	for (i = 0; i < 3; i++) {
>  		unsigned long new_load = this_load;


-- 
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
  -- Ambrose Bierce
