* Re: [PATCH] sched,numa: update migrate_improves/degrades_locality
2014-05-15 17:03 [PATCH] sched,numa: update migrate_improves/degrades_locality Rik van Riel
@ 2014-05-16 13:46 ` Peter Zijlstra
2014-05-19 13:11 ` [tip:sched/core] sched,numa: Update migrate_improves/degrades_locality tip-bot for Rik van Riel
2014-05-22 12:29 ` [tip:sched/core] sched/numa: Update migrate_improves/degrades_locality() tip-bot for Rik van Riel
2 siblings, 0 replies; 4+ messages in thread
From: Peter Zijlstra @ 2014-05-16 13:46 UTC (permalink / raw)
To: Rik van Riel; +Cc: linux-kernel, mgorman, chegu_vinod, mingo
On Thu, May 15, 2014 at 01:03:06PM -0400, Rik van Riel wrote:
> Update the migrate_improves/degrades_locality functions with
> knowledge of pseudo-interleaving.
>
> Do not consider moving tasks around within the set of a group's active
> nodes as improving or degrading locality. Instead, leave the load
> balancer free to balance the load between a numa_group's active nodes.
>
> Also, switch from the group/task_weight functions to the group/task_faults
> functions. The "weight" functions involve a division, but both calls use
> the same divisor, so there's no point in doing that from these functions.
>
> On a 4-node (x10 core) system, SPECjbb2005 performance seems
> unaffected, though the number of migrations with two 8-warehouse-wide
> instances appears to have almost halved, because the scheduler now runs
> each instance on a single node.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
> kernel/sched/fair.c | 42 +++++++++++++++++++++++++++++-------------
> 1 file changed, 29 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6504015..4f01e2f1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4971,6 +4971,7 @@ task_hot(struct task_struct *p, u64 now, struct sched_domain *sd)
> /* Returns true if the destination node has incurred more faults */
> static bool migrate_improves_locality(struct task_struct *p, struct lb_env *env)
> {
> + struct numa_group *numa_group = ACCESS_ONCE(p->numa_group);
That wants to be rcu_dereference() to match the rcu_assign_pointer() we
use to set it.
Same in that wake_numa patch.
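For reference, the pairing looks roughly like this; the struct and field
names below are illustrative stand-ins, not the actual scheduler code,
which publishes p->numa_group with rcu_assign_pointer() and should read
it with rcu_dereference() inside an RCU read-side critical section:

#include <linux/rcupdate.h>

struct foo {
	int data;
};

static struct foo __rcu *global_foo;

/* Update side: initialize the object, then publish it. */
static void publish(struct foo *newf)
{
	rcu_assign_pointer(global_foo, newf);	/* orders init before publish */
}

/* Read side: subscribe under rcu_read_lock(). */
static int read_data(void)
{
	struct foo *f;
	int val = -1;

	rcu_read_lock();
	f = rcu_dereference(global_foo);	/* matches rcu_assign_pointer() */
	if (f)
		val = f->data;
	rcu_read_unlock();

	return val;
}

ACCESS_ONCE() only keeps the compiler from tearing or refetching the
load; rcu_dereference() additionally provides the dependency-ordering
barrier (where the architecture needs one) plus the sparse/lockdep
annotations that pair with rcu_assign_pointer().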
> int src_nid, dst_nid;
>
> if (!sched_feat(NUMA_FAVOUR_HIGHER) || !p->numa_faults_memory ||
* [tip:sched/core] sched,numa: Update migrate_improves/degrades_locality
2014-05-15 17:03 [PATCH] sched,numa: update migrate_improves/degrades_locality Rik van Riel
2014-05-16 13:46 ` Peter Zijlstra
@ 2014-05-19 13:11 ` tip-bot for Rik van Riel
2014-05-22 12:29 ` [tip:sched/core] sched/numa: Update migrate_improves/degrades_locality() tip-bot for Rik van Riel
2 siblings, 0 replies; 4+ messages in thread
From: tip-bot for Rik van Riel @ 2014-05-19 13:11 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, riel, hpa, mingo, peterz, tglx
Commit-ID: f5c1e1af91b2a4238d7c2a6dc4aa0076908b5864
Gitweb: http://git.kernel.org/tip/f5c1e1af91b2a4238d7c2a6dc4aa0076908b5864
Author: Rik van Riel <riel@redhat.com>
AuthorDate: Thu, 15 May 2014 13:03:06 -0400
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 May 2014 22:02:43 +0900
sched,numa: Update migrate_improves/degrades_locality
Update the migrate_improves/degrades_locality functions with
knowledge of pseudo-interleaving.
Do not consider moving tasks around within the set of a group's active
nodes as improving or degrading locality. Instead, leave the load
balancer free to balance the load between a numa_group's active nodes.
Also, switch from the group/task_weight functions to the group/task_faults
functions. The "weight" functions involve a division, but both calls use
the same divisor, so there's no point in doing that from these functions.
On a 4-node (x10 core) system, SPECjbb2005 performance seems
unaffected, though the number of migrations with two 8-warehouse-wide
instances appears to have almost halved, because the scheduler now runs
each instance on a single node.
Cc: mgorman@suse.de
Cc: chegu_vinod@hp.com
Cc: mingo@kernel.org
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140515130306.61aae7db@cuia.bos.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
kernel/sched/fair.c | 42 +++++++++++++++++++++++++++++-------------
1 file changed, 29 insertions(+), 13 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b899613..503f750 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5123,6 +5123,7 @@ task_hot(struct task_struct *p, u64 now)
/* Returns true if the destination node has incurred more faults */
static bool migrate_improves_locality(struct task_struct *p, struct lb_env *env)
{
+ struct numa_group *numa_group = rcu_dereference(p->numa_group);
int src_nid, dst_nid;
if (!sched_feat(NUMA_FAVOUR_HIGHER) || !p->numa_faults_memory ||
@@ -5136,21 +5137,29 @@ static bool migrate_improves_locality(struct task_struct *p, struct lb_env *env)
if (src_nid == dst_nid)
return false;
- /* Always encourage migration to the preferred node. */
- if (dst_nid == p->numa_preferred_nid)
- return true;
+ if (numa_group) {
+ /* Task is already in the group's interleave set. */
+ if (node_isset(src_nid, numa_group->active_nodes))
+ return false;
+
+ /* Task is moving into the group's interleave set. */
+ if (node_isset(dst_nid, numa_group->active_nodes))
+ return true;
- /* If both task and group weight improve, this move is a winner. */
- if (task_weight(p, dst_nid) > task_weight(p, src_nid) &&
- group_weight(p, dst_nid) > group_weight(p, src_nid))
+ return group_faults(p, dst_nid) > group_faults(p, src_nid);
+ }
+
+ /* Encourage migration to the preferred node. */
+ if (dst_nid == p->numa_preferred_nid)
return true;
- return false;
+ return task_faults(p, dst_nid) > task_faults(p, src_nid);
}
static bool migrate_degrades_locality(struct task_struct *p, struct lb_env *env)
{
+ struct numa_group *numa_group = rcu_dereference(p->numa_group);
int src_nid, dst_nid;
if (!sched_feat(NUMA) || !sched_feat(NUMA_RESIST_LOWER))
@@ -5165,16 +5174,23 @@ static bool migrate_degrades_locality(struct task_struct *p, struct lb_env *env)
if (src_nid == dst_nid)
return false;
+ if (numa_group) {
+ /* Task is moving within/into the group's interleave set. */
+ if (node_isset(dst_nid, numa_group->active_nodes))
+ return false;
+
+ /* Task is moving out of the group's interleave set. */
+ if (node_isset(src_nid, numa_group->active_nodes))
+ return true;
+
+ return group_faults(p, dst_nid) < group_faults(p, src_nid);
+ }
+
/* Migrating away from the preferred node is always bad. */
if (src_nid == p->numa_preferred_nid)
return true;
- /* If either task or group weight get worse, don't do it. */
- if (task_weight(p, dst_nid) < task_weight(p, src_nid) ||
- group_weight(p, dst_nid) < group_weight(p, src_nid))
- return true;
-
- return false;
+ return task_faults(p, dst_nid) < task_faults(p, src_nid);
}
#else
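As a side note on the weight -> faults switch described in the changelog
above: the per-node weight is the per-node fault count scaled by a divisor
that is identical for the two nodes being compared, so comparing raw fault
counts gives the same ordering and skips the division. A simplified sketch
(these are illustrative stand-ins, not the exact kernel/sched/fair.c
definitions):

/* Illustrative only: weight(nid) = 1000 * faults(nid) / total. */
static unsigned long example_faults(const unsigned long *faults, int nid)
{
	return faults[nid];
}

static unsigned long example_weight(const unsigned long *faults, int nid,
				    unsigned long total)
{
	if (!total)
		return 0;
	return 1000 * example_faults(faults, nid) / total;
}

/*
 * With the same "total" for src_nid and dst_nid, the comparison
 *
 *	example_weight(f, dst_nid, total) > example_weight(f, src_nid, total)
 *
 * orders the nodes the same way as
 *
 *	example_faults(f, dst_nid) > example_faults(f, src_nid)
 *
 * except that the divided values may compare equal due to integer
 * truncation where the raw counts differ; comparing the raw counts is
 * both cheaper (no division) and at least as precise.
 */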
* [tip:sched/core] sched/numa: Update migrate_improves/degrades_locality()
2014-05-15 17:03 [PATCH] sched,numa: update migrate_improves/degrades_locality Rik van Riel
2014-05-16 13:46 ` Peter Zijlstra
2014-05-19 13:11 ` [tip:sched/core] sched,numa: Update migrate_improves/degrades_locality tip-bot for Rik van Riel
@ 2014-05-22 12:29 ` tip-bot for Rik van Riel
2 siblings, 0 replies; 4+ messages in thread
From: tip-bot for Rik van Riel @ 2014-05-22 12:29 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, riel, hpa, mingo, peterz, tglx
Commit-ID: b1ad065e65f56103db8b97edbd218a271ff5b1bb
Gitweb: http://git.kernel.org/tip/b1ad065e65f56103db8b97edbd218a271ff5b1bb
Author: Rik van Riel <riel@redhat.com>
AuthorDate: Thu, 15 May 2014 13:03:06 -0400
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 22 May 2014 11:16:39 +0200
sched/numa: Update migrate_improves/degrades_locality()
Update the migrate_improves/degrades_locality() functions with
knowledge of pseudo-interleaving.
Do not consider moving tasks around within the set of a group's active
nodes as improving or degrading locality. Instead, leave the load
balancer free to balance the load between a numa_group's active nodes.
Also, switch from the group/task_weight functions to the group/task_faults
functions. The "weight" functions involve a division, but both calls use
the same divisor, so there's no point in doing that from these functions.
On a 4-node (x10 core) system, SPECjbb2005 performance seems
unaffected, though the number of migrations with two 8-warehouse-wide
instances appears to have almost halved, because the scheduler now runs
each instance on a single node.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: mgorman@suse.de
Cc: chegu_vinod@hp.com
Link: http://lkml.kernel.org/r/20140515130306.61aae7db@cuia.bos.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
kernel/sched/fair.c | 42 +++++++++++++++++++++++++++++-------------
1 file changed, 29 insertions(+), 13 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b899613..503f750 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5123,6 +5123,7 @@ task_hot(struct task_struct *p, u64 now)
/* Returns true if the destination node has incurred more faults */
static bool migrate_improves_locality(struct task_struct *p, struct lb_env *env)
{
+ struct numa_group *numa_group = rcu_dereference(p->numa_group);
int src_nid, dst_nid;
if (!sched_feat(NUMA_FAVOUR_HIGHER) || !p->numa_faults_memory ||
@@ -5136,21 +5137,29 @@ static bool migrate_improves_locality(struct task_struct *p, struct lb_env *env)
if (src_nid == dst_nid)
return false;
- /* Always encourage migration to the preferred node. */
- if (dst_nid == p->numa_preferred_nid)
- return true;
+ if (numa_group) {
+ /* Task is already in the group's interleave set. */
+ if (node_isset(src_nid, numa_group->active_nodes))
+ return false;
+
+ /* Task is moving into the group's interleave set. */
+ if (node_isset(dst_nid, numa_group->active_nodes))
+ return true;
- /* If both task and group weight improve, this move is a winner. */
- if (task_weight(p, dst_nid) > task_weight(p, src_nid) &&
- group_weight(p, dst_nid) > group_weight(p, src_nid))
+ return group_faults(p, dst_nid) > group_faults(p, src_nid);
+ }
+
+ /* Encourage migration to the preferred node. */
+ if (dst_nid == p->numa_preferred_nid)
return true;
- return false;
+ return task_faults(p, dst_nid) > task_faults(p, src_nid);
}
static bool migrate_degrades_locality(struct task_struct *p, struct lb_env *env)
{
+ struct numa_group *numa_group = rcu_dereference(p->numa_group);
int src_nid, dst_nid;
if (!sched_feat(NUMA) || !sched_feat(NUMA_RESIST_LOWER))
@@ -5165,16 +5174,23 @@ static bool migrate_degrades_locality(struct task_struct *p, struct lb_env *env)
if (src_nid == dst_nid)
return false;
+ if (numa_group) {
+ /* Task is moving within/into the group's interleave set. */
+ if (node_isset(dst_nid, numa_group->active_nodes))
+ return false;
+
+ /* Task is moving out of the group's interleave set. */
+ if (node_isset(src_nid, numa_group->active_nodes))
+ return true;
+
+ return group_faults(p, dst_nid) < group_faults(p, src_nid);
+ }
+
/* Migrating away from the preferred node is always bad. */
if (src_nid == p->numa_preferred_nid)
return true;
- /* If either task or group weight get worse, don't do it. */
- if (task_weight(p, dst_nid) < task_weight(p, src_nid) ||
- group_weight(p, dst_nid) < group_weight(p, src_nid))
- return true;
-
- return false;
+ return task_faults(p, dst_nid) < task_faults(p, src_nid);
}
#else
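The asymmetry between the two numa_group branches is easier to see in
isolation. Below is a small standalone sketch (userspace C, not kernel
code) of just the decision logic added above, with plain booleans standing
in for the node_isset() checks on the group's active_nodes mask:

#include <stdbool.h>
#include <stdio.h>

/* Mirrors the numa_group branch of migrate_improves_locality(). */
static bool improves(bool src_active, bool dst_active,
		     unsigned long src_faults, unsigned long dst_faults)
{
	if (src_active)
		return false;	/* already inside the interleave set */
	if (dst_active)
		return true;	/* moving into the interleave set */
	return dst_faults > src_faults;
}

/* Mirrors the numa_group branch of migrate_degrades_locality(). */
static bool degrades(bool src_active, bool dst_active,
		     unsigned long src_faults, unsigned long dst_faults)
{
	if (dst_active)
		return false;	/* moving within or into the interleave set */
	if (src_active)
		return true;	/* moving out of the interleave set */
	return dst_faults < src_faults;
}

int main(void)
{
	/*
	 * Moving between two active nodes is reported as neither an
	 * improvement nor a degradation, which leaves the load balancer
	 * free to spread the group's load across its active nodes.
	 */
	printf("active -> active:   improves=%d degrades=%d\n",
	       improves(true, true, 0, 0), degrades(true, true, 0, 0));
	printf("active -> inactive: improves=%d degrades=%d\n",
	       improves(true, false, 0, 0), degrades(true, false, 0, 0));
	printf("inactive -> active: improves=%d degrades=%d\n",
	       improves(false, true, 0, 0), degrades(false, true, 0, 0));
	return 0;
}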