From: Ingo Molnar <mingo@kernel.org>
To: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
LKML <linux-kernel@vger.kernel.org>,
Mel Gorman <mgorman@techsingularity.net>,
Rik van Riel <riel@surriel.com>,
Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 3/6] sched/numa: Avoid task migration for small numa improvement
Date: Mon, 10 Sep 2018 10:46:33 +0200
Message-ID: <20180910084633.GD48257@gmail.com>
In-Reply-To: <1533276841-16341-4-git-send-email-srikar@linux.vnet.ibm.com>
* Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote:
> If numa improvement from the task migration is going to be very
> minimal, then avoid task migration.
>
> specjbb2005 / bops/JVM / higher bops are better
> on 2 Socket/2 Node Intel
> JVMS  Prev    Current  %Change
> 4     200892  210118   4.59252
> 1     325766  313171   -3.86627
>
>
> on 2 Socket/4 Node Power8 (PowerNV)
> JVMS  Prev     Current  %Change
> 8     89011.9  91027.5  2.26442
> 1     211338   216460   2.42361
>
>
> on 2 Socket/2 Node Power9 (PowerNV)
> JVMS  Prev    Current  %Change
> 4     190261  191918   0.870909
> 1     195305  207043   6.01009
>
>
> on 4 Socket/4 Node Power7
> JVMS  Prev     Current  %Change
> 8     57651.1  58462.1  1.40674
> 1     111351   108334   -2.70945
>
>
> dbench / transactions / higher numbers are better
> on 2 Socket/2 Node Intel
> count  Min      Max      Avg      Variance  %Change
> 5      12254.7  12331.9  12297.8  28.1846
> 5      11851.8  11937.3  11890.9  33.5169   -3.30872
>
>
> on 2 Socket/4 Node Power8 (PowerNV)
> count  Min      Max      Avg      Variance  %Change
> 5      4997.83  5030.14  5015.54  12.947
> 5      4791     5016.08  4962.55  85.9625   -1.05652
>
>
> on 2 Socket/2 Node Power9 (PowerNV)
> count  Min      Max      Avg      Variance  %Change
> 5      9331.84  9375.11  9352.04  16.0703
> 5      9353.43  9380.49  9369.6   9.04361   0.187767
>
>
> on 4 Socket/4 Node Power7
> count  Min      Max      Avg      Variance  %Change
> 5      147.55   181.605  168.963  11.3513
> 5      149.518  215.412  179.083  21.5903   5.98948
>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog v1->v2:
> - Handle trivial changes due to variable name change. (Rik Van Riel)
> - Drop changes where subsequent better cpu find was rejected for
> small numa improvement (Rik Van Riel).
>
> kernel/sched/fair.c | 23 ++++++++++++++++++-----
> 1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5cf921a..a717870 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -1568,6 +1568,13 @@ static bool load_too_imbalanced(long src_load, long dst_load,
>  }
>
>  /*
> + * Maximum numa importance can be 1998 (2*999);
> + * SMALLIMP @ 30 would be close to 1998/64.
> + * Used to deter task migration.
> + */
> +#define SMALLIMP	30
> +
> +/*
>   * This checks if the overall compute and NUMA accesses of the system would
>   * be improved if the source tasks was migrated to the target dst_cpu taking
>   * into account that it might be best if task running on the dst_cpu should
> @@ -1600,7 +1607,7 @@ static void task_numa_compare(struct task_numa_env *env,
>  		goto unlock;
>
>  	if (!cur) {
> -		if (maymove || imp > env->best_imp)
> +		if (maymove && moveimp >= env->best_imp)
>  			goto assign;
>  		else
>  			goto unlock;
> @@ -1643,16 +1650,22 @@ static void task_numa_compare(struct task_numa_env *env,
>  			task_weight(cur, env->dst_nid, dist);
>  	}
>
> -	if (imp <= env->best_imp)
> -		goto unlock;
> -
>  	if (maymove && moveimp > imp && moveimp > env->best_imp) {
> -		imp = moveimp - 1;
> +		imp = moveimp;
>  		cur = NULL;
>  		goto assign;
>  	}
>
>  	/*
> +	 * If the numa importance is less than SMALLIMP,
> +	 * task migration might only result in ping pong
> +	 * of tasks and also hurt performance due to cache
> +	 * misses.
> +	 */
> +	if (imp < SMALLIMP || imp <= env->best_imp + SMALLIMP / 2)
> +		goto unlock;
> +
> +	/*
>  	 * In the overloaded case, try and keep the load balanced.
>  	 */
>  	load = task_h_load(env->p) - task_h_load(cur);
So what is this 'NUMA importance'? It seems like just an arbitrary parameter, which generally
isn't a good idea.
Also, same review feedback as I gave for the previous patches.
Thanks,
Ingo
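
For readers trying to follow what the quoted SMALLIMP check actually filters out, here is a minimal standalone sketch of the condition. It is only an illustration: the SMALLIMP value and the comparison are copied from the quoted hunk, while the helper names numa_imp_too_small() and numa_imp_examples() and the sample scores are hypothetical and appear nowhere in the patch or in kernel/sched/fair.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Value taken from the quoted patch. */
#define SMALLIMP 30

/*
 * Hypothetical helper (not part of the patch): returns true when the
 * quoted condition would take the "goto unlock" path, i.e. when the
 * candidate is rejected.
 *
 * imp      - improvement score of the candidate being evaluated
 * best_imp - best score recorded so far (env->best_imp in the patch)
 */
bool numa_imp_too_small(long imp, long best_imp)
{
	return imp < SMALLIMP || imp <= best_imp + SMALLIMP / 2;
}

/* Hypothetical walk-through; the scores are made up for illustration. */
void numa_imp_examples(void)
{
	assert(numa_imp_too_small(29, 0));   /* below the absolute floor of 30 */
	assert(!numa_imp_too_small(30, 0));  /* 30 >= 30 and 30 > 0 + 15 */
	assert(numa_imp_too_small(45, 30));  /* 45 <= 30 + 15: margin not cleared */
	assert(!numa_imp_too_small(46, 30)); /* 46 > 30 + 15: margin cleared */
}
```

Read this way, the check has two parts: an absolute floor (a score below 30 is never acted on) and a relative margin (a new candidate has to beat the current best by more than SMALLIMP / 2, which is 15 with integer division). Against the maximum of 1998 stated in the patch comment, the floor is roughly 1/64 of the range. The sketch does not address where the score itself comes from, which is the question raised above.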
Thread overview: 16+ messages
2018-08-03 6:13 [PATCH 0/6] numa-balancing patches Srikar Dronamraju
2018-08-03 6:13 ` [PATCH 1/6] sched/numa: Stop multiple tasks from moving to the cpu at the same time Srikar Dronamraju
2018-09-10 8:42 ` Ingo Molnar
2018-08-03 6:13 ` [PATCH 2/6] mm/migrate: Use trylock while resetting rate limit Srikar Dronamraju
2018-09-06 11:48 ` Peter Zijlstra
2018-09-10 8:39 ` Ingo Molnar
2018-08-03 6:13 ` [PATCH 3/6] sched/numa: Avoid task migration for small numa improvement Srikar Dronamraju
2018-09-10 8:46 ` Ingo Molnar [this message]
2018-09-12 15:17 ` Srikar Dronamraju
2018-08-03 6:13 ` [PATCH 4/6] sched/numa: Pass destination cpu as a parameter to migrate_task_rq Srikar Dronamraju
2018-08-03 6:14 ` [PATCH 5/6] sched/numa: Reset scan rate whenever task moves across nodes Srikar Dronamraju
2018-09-10 8:48 ` Ingo Molnar
2018-09-12 15:19 ` Srikar Dronamraju
2018-08-03 6:14 ` [PATCH 6/6] sched/numa: Limit the conditions where scan period is reset Srikar Dronamraju
2018-08-21 12:01 ` [PATCH 0/6] numa-balancing patches Srikar Dronamraju
2018-09-06 12:17 ` Peter Zijlstra