All of lore.kernel.org
 help / color / mirror / Atom feed
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Rik van Riel <riel@surriel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Michael Neuling <mikey@neuling.org>,
	Gautham R Shenoy <ego@linux.vnet.ibm.com>,
	Parth Shah <parth@linux.ibm.com>
Subject: Re: [PATCH] sched/fair: Prefer idle CPU to cache affinity
Date: Wed, 10 Mar 2021 11:22:41 +0530	[thread overview]
Message-ID: <20210310055241.GO2028034@linux.vnet.ibm.com> (raw)
In-Reply-To: <CAKfTPtA2XSmqt1L2X9WvdtdA5eqNYuhSws8jDOr1HA1xqXWfDQ@mail.gmail.com>

* Vincent Guittot <vincent.guittot@linaro.org> [2021-03-08 14:52:39]:

> On Fri, 26 Feb 2021 at 17:41, Srikar Dronamraju
> <srikar@linux.vnet.ibm.com> wrote:
> >

Thanks Vincent for your review comments.

> > +static int prefer_idler_llc(int this_cpu, int prev_cpu, int sync)
> > +{
> > +       struct sched_domain_shared *tsds, *psds;
> > +       int pnr_busy, pllc_size, tnr_busy, tllc_size, diff;
> > +
> > +       tsds = rcu_dereference(per_cpu(sd_llc_shared, this_cpu));
> > +       tnr_busy = atomic_read(&tsds->nr_busy_cpus);
> > +       tllc_size = per_cpu(sd_llc_size, this_cpu);
> > +
> > +       psds = rcu_dereference(per_cpu(sd_llc_shared, prev_cpu));
> > +       pnr_busy = atomic_read(&psds->nr_busy_cpus);
> > +       pllc_size = per_cpu(sd_llc_size, prev_cpu);
> > +
> > +       /* No need to compare, if both LLCs are fully loaded */
> > +       if (pnr_busy == pllc_size && tnr_busy == pllc_size)
> > +               return nr_cpumask_bits;
> > +
> > +       if (sched_feat(WA_WAKER) && tnr_busy < tllc_size)
> > +               return this_cpu;
> 
> Why have you chosen to favor this_cpu instead of prev_cpu unlike for wake_idle ?

At this point, we know the waker running on this_cpu and wakee which was
running on prev_cpu are affine to each other and this_cpu and prev_cpu dont
share cache. I chose to move them close to each other to benefit from the
cache sharing. Based on feedback from Peter and Rik, I made the check more
conservative i.e tnr_busy <= tllc_size/smt_weight (where smt_weight is the
cpumask weight of smt domain for this_cpu) i.e if we have a free core in
this llc domain, chose this_cpu.  select_idle_sibling() should pick an idle
cpu/core/smt within the llc domain for this_cpu.

Do you feel, this may not be the correct option?

We are also experimenting with another option, were we call prefer_idler_cpu
after wa_weight. I.e 
1. if wake_affine_weight choses this_cpu but llc in prev_cpu has an idle
smt/CPU but there are no idle smt/CPU in this_cpu, then chose idle smt/CPU
in prev_cpu
2. if wake_affine_weight choses nr_cpumask(aka prev_cpu) but llc in this_cpu
has an idle smt/CPU but there are no idle smt/CPU in prev_cpu, then chose
idle smt/CPU in this_cpu


> > +
> > +       /* For better wakeup latency, prefer idler LLC to cache affinity */
> > +       diff = tnr_busy * pllc_size - sync - pnr_busy * tllc_size;
> > +       if (!diff)
> > +               return nr_cpumask_bits;
> > +       if (diff < 0)
> > +               return this_cpu;
> > +
> > +       return prev_cpu;
> > +}
> > +
> >  static int wake_affine(struct sched_domain *sd, struct task_struct *p,
> >                        int this_cpu, int prev_cpu, int sync)
> >  {
> > @@ -5877,6 +5907,10 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
> >         if (sched_feat(WA_IDLE))
> >                 target = wake_affine_idle(this_cpu, prev_cpu, sync);
> >
> > +       if (sched_feat(WA_IDLER_LLC) && target == nr_cpumask_bits &&
> > +                               !cpus_share_cache(this_cpu, prev_cpu))
> > +               target = prefer_idler_llc(this_cpu, prev_cpu, sync);
> 
> could you use the same naming convention as others function ?
> wake_affine_llc as an example

I guess you meant s/prefer_idler_llc/wake_affine_llc/
Sure. I can modify.

> 
> > +
> >         if (sched_feat(WA_WEIGHT) && target == nr_cpumask_bits)
> >                 target = wake_affine_weight(sd, p, this_cpu, prev_cpu, sync);
> >
> > @@ -5884,8 +5918,11 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
> >         if (target == nr_cpumask_bits)
> >                 return prev_cpu;
> >
> > -       schedstat_inc(sd->ttwu_move_affine);
> > -       schedstat_inc(p->se.statistics.nr_wakeups_affine);
> > +       if (target == this_cpu) {
> 
> How is this condition related to $subject ?

Before this change, wake_affine_weight and wake_affine_idle would either
return this_cpu or nr_cpumask_bits. Just before this check, we check if
target is nr_cpumask_bits and return prev_cpu. So the stats were only
incremented when target was this_cpu.

However with prefer_idler_llc, we may return this_cpu, prev_cpu or
nr_cpumask_bits. Now we only to update stats when we have chosen to migrate
the task to this_cpu. Hence I had this check.

If we use the slightly lazier approach which is check for wa_weight first
before wa_idler_llc, then we may not need this change at all.

-- 
Thanks and Regards
Srikar Dronamraju

  reply	other threads:[~2021-03-10  5:54 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26 16:40 [PATCH] sched/fair: Prefer idle CPU to cache affinity Srikar Dronamraju
2021-02-27 19:56 ` Rik van Riel
2021-03-01 13:37   ` Srikar Dronamraju
2021-03-01 15:44   ` Peter Zijlstra
2021-03-01 17:06     ` Srikar Dronamraju
2021-03-01 17:18       ` Peter Zijlstra
2021-03-02  7:39         ` Srikar Dronamraju
2021-03-02  9:10           ` Peter Zijlstra
2021-03-02 10:05             ` Srikar Dronamraju
2021-03-01 15:40 ` Peter Zijlstra
2021-03-01 17:08   ` Srikar Dronamraju
2021-03-02  9:53 ` Dietmar Eggemann
2021-03-02 10:04   ` Srikar Dronamraju
2021-03-08 13:52 ` Vincent Guittot
2021-03-10  5:52   ` Srikar Dronamraju [this message]
2021-03-10 15:37     ` Vincent Guittot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210310055241.GO2028034@linux.vnet.ibm.com \
    --to=srikar@linux.vnet.ibm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=ego@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mikey@neuling.org \
    --cc=mingo@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=parth@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=tglx@linutronix.de \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.