From: Rik van Riel <riel@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, mgorman@suse.de,
	chegu_vinod@hp.com
Subject: Re: [PATCH 2/4] sched,numa: weigh nearby nodes for task placement on complex NUMA topologies
Date: Fri, 09 May 2014 11:11:00 -0400
Message-ID: <536CF004.7090102@redhat.com>
In-Reply-To: <20140509101152.GT30445@twins.programming.kicks-ass.net>

On 05/09/2014 06:11 AM, Peter Zijlstra wrote:
> On Thu, May 08, 2014 at 01:23:29PM -0400, riel@redhat.com wrote:
>> @@ -930,7 +987,7 @@ static inline unsigned long group_faults_cpu(struct numa_group *group, int nid)
>>    */
>>   static inline unsigned long task_weight(struct task_struct *p, int nid)
>>   {
>> -	unsigned long total_faults;
>> +	unsigned long total_faults, score;
>>
>>   	if (!p->numa_faults_memory)
>>   		return 0;
>> @@ -940,15 +997,32 @@ static inline unsigned long task_weight(struct task_struct *p, int nid)
>>   	if (!total_faults)
>>   		return 0;
>>
>> -	return 1000 * task_faults(p, nid) / total_faults;
>> +	score = 1000 * task_faults(p, nid);
>> +	score += nearby_nodes_score(p, nid, true);
>> +
>> +	score /= total_faults;
>> +
>> +	return score;
>>   }
>>
>>   static inline unsigned long group_weight(struct task_struct *p, int nid)
>>   {
>> -	if (!p->numa_group || !p->numa_group->total_faults)
>> +	unsigned long total_faults, score;
>> +
>> +	if (!p->numa_group)
>> +		return 0;
>> +
>> +	total_faults = p->numa_group->total_faults;
>> +
>> +	if (!total_faults)
>>   		return 0;
>>
>> -	return 1000 * group_faults(p, nid) / p->numa_group->total_faults;
>> +	score = 1000 * group_faults(p, nid);
>> +	score += nearby_nodes_score(p, nid, false);
>> +
>> +	score /= total_faults;
>> +
>> +	return score;
>>   }
>
> OK, and that's just sad..
>
> See task_numa_placement(), which does:
>
> 	for_each_online_node(nid) {
> 		weight = task_weight(p, nid) + group_weight(p, nid);
> 		if (weight > max_weight) {
> 			max_weight = weight;
> 			max_nid = nid;
> 		}
> 	}
>
> So not only is that loop now O(nr_nodes^2), the inner loops doubly
> iterate all nodes.

I am not too worried about task_numa_placement, but you are
right that this may well be much too expensive for more
frequently called code like migrate_improves_locality.

Having said that, grouping related tasks together on nearby
nodes does seem to bring significant performance gains.

Do you have any ideas on other ways we can achieve that
grouping?

> Also, {task,group}_weight() functions were like cheap-ish (/me mumbles
> something about people using !2^n scaling factors for no sane reason).
> And they're used all over with that in mind.
>
> But look what you did to migrate_improves_locality(), that will now
> iterate all nodes _4_ times, and it's called for every single task we try
> and migrate during load balance, while holding rq->lock.
>
>

