From: Mel Gorman <mgorman@techsingularity.net>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Phil Auld <pauld@redhat.com>, Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	Quentin Perret <quentin.perret@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Morten Rasmussen <Morten.Rasmussen@arm.com>,
	Hillf Danton <hdanton@sina.com>, Parth Shah <parth@linux.ibm.com>,
	Rik van Riel <riel@surriel.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched, fair: Allow a small load imbalance between low utilisation SD_NUMA domains v4
Date: Fri, 17 Jan 2020 14:26:28 +0000
Message-ID: <20200117142628.GR3466@techsingularity.net>
In-Reply-To: <CAKfTPtBROKKtTkz55McjJo6b=Qq0QRVckFe2fQS2kdxf8kCJLw@mail.gmail.com>

On Fri, Jan 17, 2020 at 02:16:15PM +0100, Vincent Guittot wrote:
> > A more interesting example is the Facebook schbench which uses a
> > number of messaging threads to communicate with worker threads. In this
> > configuration, one messaging thread is used per NUMA node and the number of
> > worker threads is varied. The 50, 75, 90, 95, 99, 99.5 and 99.9 percentiles
> > for response latency are then reported.
> >
> > Lat 50.00th-qrtle-1        44.00 (   0.00%)       37.00 (  15.91%)
> > Lat 75.00th-qrtle-1        53.00 (   0.00%)       41.00 (  22.64%)
> > Lat 90.00th-qrtle-1        57.00 (   0.00%)       42.00 (  26.32%)
> > Lat 95.00th-qrtle-1        63.00 (   0.00%)       43.00 (  31.75%)
> > Lat 99.00th-qrtle-1        76.00 (   0.00%)       51.00 (  32.89%)
> > Lat 99.50th-qrtle-1        89.00 (   0.00%)       52.00 (  41.57%)
> > Lat 99.90th-qrtle-1        98.00 (   0.00%)       55.00 (  43.88%)
> 
> Which parameter changes between above and below tests ?
> 
> > Lat 50.00th-qrtle-2        42.00 (   0.00%)       42.00 (   0.00%)
> > Lat 75.00th-qrtle-2        48.00 (   0.00%)       47.00 (   2.08%)
> > Lat 90.00th-qrtle-2        53.00 (   0.00%)       52.00 (   1.89%)
> > Lat 95.00th-qrtle-2        55.00 (   0.00%)       53.00 (   3.64%)
> > Lat 99.00th-qrtle-2        62.00 (   0.00%)       60.00 (   3.23%)
> > Lat 99.50th-qrtle-2        63.00 (   0.00%)       63.00 (   0.00%)
> > Lat 99.90th-qrtle-2        68.00 (   0.00%)       66.00 (   2.94%)
> >

The number of worker pool threads. Above is 1 worker thread, below is 2.

> > @@ -8691,16 +8687,37 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
> >                         env->migration_type = migrate_task;
> >                         lsub_positive(&nr_diff, local->sum_nr_running);
> >                         env->imbalance = nr_diff >> 1;
> > -                       return;
> > -               }
> > +               } else {
> >
> > -               /*
> > -                * If there is no overload, we just want to even the number of
> > -                * idle cpus.
> > -                */
> > -               env->migration_type = migrate_task;
> > -               env->imbalance = max_t(long, 0, (local->idle_cpus -
> > +                       /*
> > +                        * If there is no overload, we just want to even the number of
> > +                        * idle cpus.
> > +                        */
> > +                       env->migration_type = migrate_task;
> > +                       env->imbalance = max_t(long, 0, (local->idle_cpus -
> >                                                  busiest->idle_cpus) >> 1);
> > +               }
> > +
> > +               /* Consider allowing a small imbalance between NUMA groups */
> > +               if (env->sd->flags & SD_NUMA) {
> > +                       unsigned int imbalance_min;
> > +
> > +                       /*
> > +                        * Compute an allowed imbalance based on a simple
> > +                        * pair of communicating tasks that should remain
> > +                        * local and ignore them.
> > +                        *
> > +                        * NOTE: Generally this would have been based on
> > +                        * the domain size and this was evaluated. However,
> > +                        * the benefit is similar across a range of workloads
> > +                        * and machines but scaling by the domain size adds
> > +                        * the risk that lower domains have to be rebalanced.
> > +                        */
> > +                       imbalance_min = 2;
> > +                       if (busiest->sum_nr_running <= imbalance_min)
> > +                               env->imbalance = 0;
> 
> Out of curiosity why have you decided to use the above instead of
>   env->imbalance -= min(env->imbalance, imbalance_adj);
> 
> Have you seen perf regression with the min ?
> 

I didn't see a regression with min() but at this point, we're only
dealing with the case of ignoring a small imbalance when the busiest
group is almost completely idle. The distinction between using min and
just ignoring the imbalance is almost irrelevant in that case.
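
To illustrate why the two approaches converge here, below is a rough
stand-alone sketch, not kernel code. The helper names are hypothetical,
and it assumes two equally sized groups, a fully idle local group, one
task per CPU in the busiest group, and imbalance_adj == 2 applied
unconditionally (the thread does not confirm how the min() variant was
guarded). Under those assumptions the idle-CPU imbalance is at most 1
when the busiest group runs two tasks or fewer, so both variants end up
at zero.

```c
#include <assert.h>

/* Idle-CPU imbalance as in the quoted hunk:
 * max(0, (local_idle - busiest_idle) >> 1). */
static long idle_cpu_imbalance(long local_idle, long busiest_idle)
{
	long diff = local_idle - busiest_idle;

	return diff > 0 ? diff >> 1 : 0;
}

/* Variant in the patch: ignore the imbalance entirely when the busiest
 * group runs at most imbalance_min tasks. */
static long apply_ignore(long imbalance, unsigned int busiest_nr_running)
{
	const unsigned int imbalance_min = 2;

	if (busiest_nr_running <= imbalance_min)
		return 0;
	return imbalance;
}

/* Variant raised in review: subtract an adjustment, clamped so that the
 * result never goes negative. Applied unconditionally here (assumption). */
static long apply_min_adjust(long imbalance, long imbalance_adj)
{
	long adj = imbalance < imbalance_adj ? imbalance : imbalance_adj;

	return imbalance - adj;
}

/* Returns 1 if both variants agree for 0, 1 and 2 tasks in the busiest
 * group, given a fully idle local group of the same size. */
static int variants_agree_when_nearly_idle(unsigned int cpus_per_group)
{
	assert(cpus_per_group >= 2);

	for (unsigned int nr = 0; nr <= 2; nr++) {
		long local_idle = cpus_per_group;
		long busiest_idle = cpus_per_group - nr;
		long imb = idle_cpu_imbalance(local_idle, busiest_idle);

		if (apply_ignore(imb, nr) != apply_min_adjust(imb, 2))
			return 0;
	}
	return 1;
}
```

The variants only diverge once the busiest group has more than two
tasks: apply_ignore() then leaves the imbalance untouched, while an
unconditional apply_min_adjust() would still reduce it.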

-- 
Mel Gorman
SUSE Labs
