linux-kernel.vger.kernel.org archive mirror
From: Mel Gorman <mgorman@techsingularity.net>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	Phil Auld <pauld@redhat.com>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 4/4] sched/fair: Track possibly overloaded domains and abort a scan if necessary
Date: Fri, 20 Mar 2020 17:43:04 +0000
Message-ID: <20200320174304.GF3818@techsingularity.net>
In-Reply-To: <CAKfTPtBixZKDES_i3Lnsj1eAa_kVi-zHv-0uE8uTsKOBcjmkYg@mail.gmail.com>

On Fri, Mar 20, 2020 at 05:54:57PM +0100, Vincent Guittot wrote:
> On Fri, 20 Mar 2020 at 17:44, Mel Gorman <mgorman@techsingularity.net> wrote:
> >
> > On Fri, Mar 20, 2020 at 04:48:39PM +0100, Vincent Guittot wrote:
> > > > ---
> > > >  include/linux/sched/topology.h |  1 +
> > > >  kernel/sched/fair.c            | 65 +++++++++++++++++++++++++++++++++++++++---
> > > >  kernel/sched/features.h        |  3 ++
> > > >  3 files changed, 65 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> > > > index af9319e4cfb9..76ec7a54f57b 100644
> > > > --- a/include/linux/sched/topology.h
> > > > +++ b/include/linux/sched/topology.h
> > > > @@ -66,6 +66,7 @@ struct sched_domain_shared {
> > > >         atomic_t        ref;
> > > >         atomic_t        nr_busy_cpus;
> > > >         int             has_idle_cores;
> > > > +       int             is_overloaded;
> > >
> > > Can't nr_busy_cpus compared to sd->span_weight give you similar status  ?
> > >
> >
> > It's connected to nohz balancing and I didn't see how I could use that
> > for detecting overload. Also, I don't think it ever can be larger than
> > the sd weight and overload is based on the number of running tasks being
> > greater than the number of available CPUs. Did I miss something obvious?
> 
> IIUC you try to estimate if there is a chance to find an idle cpu
> before starting the loop and scanning the domain and abort early if
> the possibility is low.
> 
> if nr_busy_cpus equals to sd->span_weight it means that there is no
> free cpu so there is no need to scan
> 

Ok, I see what you are getting at but I worry there are multiple
problems there. First, nr_busy_cpus is decremented only when a CPU
enters idle with the tick stopped. If nohz is disabled then this
breaks, no? Secondly, a CPU can be idle but the tick not stopped if
__tick_nohz_idle_stop_tick knows there is an event in the near future
so, using nr_busy_cpus, we potentially miss a sibling that was adequate
for running a task. Finally, the threshold for cutting off the search
entirely seems low. The patch marks a domain as overloaded if there are
twice as many running tasks as runqueues scanned. In that scenario, even
if tasks are rapidly switching between busy/idle, it's still unlikely
the task will go idle. When cutting off at just the fully-busy mark, we
could miss a CPU that is going idle, almost idle or is running SCHED_IDLE
tasks, which are acceptable target candidates for select_idle_sibling. I
think there are too many cases where nr_busy_cpus is problematic to
make it a good alternative.
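
To make the difference between the two cut-offs concrete, here is a
minimal userspace sketch. It is not kernel code and not the patch
itself: struct domain_snapshot, its fields and both helper names are
hypothetical, and treating "twice as many" as ">=" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of one scheduling domain; not a kernel structure. */
struct domain_snapshot {
	int span_weight;	/* number of CPUs in the domain */
	int nr_busy_cpus;	/* CPUs not idle with the tick stopped */
	int nr_running;		/* running tasks seen on scanned runqueues */
	int nr_scanned;		/* runqueues scanned so far */
};

/* Suggested alternative: skip the scan once every CPU counts as busy. */
static bool skip_scan_busy_cpus(const struct domain_snapshot *d)
{
	/* nr_busy_cpus is not expected to exceed the domain weight. */
	assert(d->nr_busy_cpus <= d->span_weight);
	return d->nr_busy_cpus == d->span_weight;
}

/*
 * Heuristic as described above: overloaded when there are twice as many
 * running tasks as runqueues scanned (">=" is an assumption here).
 */
static bool skip_scan_overloaded(const struct domain_snapshot *d)
{
	return d->nr_scanned > 0 && d->nr_running >= 2 * d->nr_scanned;
}
```

With four CPUs each running one task, skip_scan_busy_cpus() already
gives up while skip_scan_overloaded() still allows a scan. That gap is
where a CPU that is about to go idle, or is only running SCHED_IDLE
tasks, could still be found.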

-- 
Mel Gorman
SUSE Labs

Thread overview: 16+ messages
2020-03-20 15:12 [PATCH 0/4] Throttle select_idle_sibling when a target domain is overloaded Mel Gorman
2020-03-20 15:12 ` [PATCH 1/4] sched/fair: Track efficiency of select_idle_sibling Mel Gorman
2020-03-23 13:30   ` Valentin Schneider
2020-03-23 13:55     ` Mel Gorman
2020-03-20 15:12 ` [PATCH 2/4] sched/fair: Track efficiency of task recent_used_cpu Mel Gorman
2020-03-23 13:30   ` Valentin Schneider
2020-03-20 15:12 ` [PATCH 3/4] sched/fair: Clear SMT siblings after determining the core is not idle Mel Gorman
2020-03-23 13:31   ` Valentin Schneider
2020-03-20 15:12 ` [PATCH 4/4] sched/fair: Track possibly overloaded domains and abort a scan if necessary Mel Gorman
2020-03-20 15:48   ` Vincent Guittot
2020-03-20 16:44     ` Mel Gorman
2020-03-20 16:54       ` Vincent Guittot
2020-03-20 17:43         ` Mel Gorman [this message]
2020-03-24 10:35           ` Vincent Guittot
2020-03-24 11:23             ` Mel Gorman
2020-04-02  7:59   ` [sched/fair] 15e7470dfc: hackbench.throughput 11.2% improvement kernel test robot
