public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Ricardo Neri <ricardo.neri@intel.com>,
	"Ravi V . Shankar" <ravi.v.shankar@intel.com>,
	Ben Segall <bsegall@google.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Len Brown <len.brown@intel.com>, Mel Gorman <mgorman@suse.de>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Valentin Schneider <vschneid@redhat.com>,
	Ionela Voinescu <ionela.voinescu@arm.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Shrikanth Hegde <sshegde@linux.vnet.ibm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	naveen.n.rao@linux.vnet.ibm.com,
	Yicong Yang <yangyicong@hisilicon.com>,
	Barry Song <v-songbaohua@oppo.com>,
	Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Subject: Re: [PATCH 4/6] sched/fair: Skip prefer sibling move between SMT group and non-SMT group
Date: Sat, 6 May 2023 02:08:15 +0200	[thread overview]
Message-ID: <20230506000815.GA1824020@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20230505233836.GB1821641@hirez.programming.kicks-ass.net>

On Sat, May 06, 2023 at 01:38:36AM +0200, Peter Zijlstra wrote:
> On Fri, May 05, 2023 at 04:07:39PM -0700, Tim Chen wrote:
> > On Fri, 2023-05-05 at 15:22 +0200, Peter Zijlstra wrote:
> > > On Thu, May 04, 2023 at 09:09:54AM -0700, Tim Chen wrote:
> > > > From: Tim C Chen <tim.c.chen@linux.intel.com>
> > > > 
> > > > Do not try to move tasks between non SMT sched group and SMT sched
> > > > group for "prefer sibling" load balance.
> > > > Let asym_active_balance_busiest() handle that case properly.
> > > > Otherwise we could get task bouncing back and forth between
> > > > the SMT sched group and non SMT sched group.
> > > > 
> > > > Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > > > ---
> > > >  kernel/sched/fair.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > > 
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index 8a325db34b02..58ef7d529731 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
> > > >  	/*
> > > >  	 * Try to move all excess tasks to a sibling domain of the busiest
> > > >  	 * group's child domain.
> > > > +	 *
> > > > +	 * Do not try to move between non smt sched group and smt sched
> > > > +	 * group. Let asym active balance properly handle that case.
> > > >  	 */
> > > >  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> > > > +	    !asymmetric_groups(sds.busiest, sds.local) &&
> > > >  	    busiest->sum_nr_running > local->sum_nr_running + 1)
> > > >  		goto force_balance;
> > > 
> > > This seems to have the hidden assumption that a !SMT core is somehow
> > > 'less' than an SMT core. Should this not also look at
> > > sched_asym_prefer() to establish this is so?
> > > 
> > > I mean, imagine I have a regular system and just offline one smt sibling
> > > for giggles.
> > 
> > I don't quite follow your point, as asymmetric_groups() returns false even
> > if one SMT sibling is offlined.
> > 
> > Even if, say, sds.busiest has 1 SMT and sds.local has 2 SMT, both sched groups
> > still have the SD_SHARE_CPUCAPACITY flag turned on.  So asymmetric_groups()
> > returns false and the load balancing logic is not changed for regular
> > non-hybrid systems.
> > 
> > I may be missing something.
> 
> What's the difference between the two cases? That is, if the remaining
> sibling will have SD_SHARE_CPUCAPACITY from the degenerate SMT domain
> that's been reaped, then why doesn't the same thing apply to the atoms
> in the hybrid muck?
> 
> Those two cases *should* be identical: in both cases you have cores with
> and cores without SMT.

On my alderlake:

[  202.222019] CPU0 attaching sched-domain(s):
[  202.222509]  domain-0: span=0-1 level=SMT
[  202.222707]   groups: 0:{ span=0 }, 1:{ span=1 }
[  202.222945]   domain-1: span=0-23 level=MC
[  202.223148]    groups: 0:{ span=0-1 cap=2048 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 },12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }, 23:{ span=23 }
...
[  202.249979] CPU23 attaching sched-domain(s):
[  202.250127]  domain-0: span=0-23 level=MC
[  202.250198]   groups: 23:{ span=23 }, 0:{ span=0-1 cap=2048 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 }, 12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }

$ echo 0 > /sys/devices/system/cpu/cpu1/online
[  251.213848] CPU0 attaching sched-domain(s):
[  251.214376]  domain-0: span=0,2-23 level=MC
[  251.214580]   groups: 0:{ span=0 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 }, 12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }, 23:{ span=23 }
...
[  251.239511] CPU23 attaching sched-domain(s):
[  251.239656]  domain-0: span=0,2-23 level=MC
[  251.239727]   groups: 23:{ span=23 }, 0:{ span=0 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 }, 12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }

$ cat /debug/sched/domains/cpu0/domain0/groups_flags

$ cat /debug/sched/domains/cpu23/domain0/groups_flags


IOW, neither the big core with SMT with one sibling offline, nor the
little core with no SMT at all, has the relevant flags set on its
domain0 groups.
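For reference, a minimal user-space sketch of the check under discussion. It assumes asymmetric_groups() does nothing more than compare the SD_SHARE_CPUCAPACITY bit of the two groups' flags; the flag value and the struct here are simplified stand-ins, not the kernel's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>

#define SD_SHARE_CPUCAPACITY 0x0001	/* stand-in value, not the kernel's */

struct sched_group {
	unsigned int flags;
};

/*
 * Sketch of asymmetric_groups(): the two groups are "asymmetric" when
 * exactly one of them carries SD_SHARE_CPUCAPACITY, i.e. one is an SMT
 * sched group and the other is not.
 */
static bool asymmetric_groups(const struct sched_group *sg1,
			      const struct sched_group *sg2)
{
	return (sg1->flags & SD_SHARE_CPUCAPACITY) !=
	       (sg2->flags & SD_SHARE_CPUCAPACITY);
}
```

On the alderlake output above, neither the surviving sibling's group nor the little cores' groups carry the flag once the degenerate SMT domain is reaped, so a check of this shape cannot tell the two cases apart.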



---
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 98bfc0f4ec94..e408b2889186 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -427,6 +427,7 @@ static void register_sd(struct sched_domain *sd, struct dentry *parent)
 #undef SDM
 
 	debugfs_create_file("flags", 0444, parent, &sd->flags, &sd_flags_fops);
+	debugfs_create_file("groups_flags", 0444, parent, &sd->groups->flags, &sd_flags_fops);
 }
 
 void update_sched_domain_debugfs(void)
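The one-line debugfs patch above exposes sd->groups->flags through the same sd_flags_fops used for the domain's own "flags" file, which prints the names of the set bits (so an empty file, as in the cat output above, means no flags at all). A rough user-space illustration of that kind of decoding; the bit values and the name table here are illustrative, not the kernel's generated ones:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative bit values; the kernel generates these from sd_flags.h. */
#define SD_BALANCE_NEWIDLE     0x0001
#define SD_SHARE_CPUCAPACITY   0x0002
#define SD_SHARE_PKG_RESOURCES 0x0004

static const struct {
	unsigned int bit;
	const char *name;
} sd_flag_names[] = {
	{ SD_BALANCE_NEWIDLE,     "SD_BALANCE_NEWIDLE" },
	{ SD_SHARE_CPUCAPACITY,   "SD_SHARE_CPUCAPACITY" },
	{ SD_SHARE_PKG_RESOURCES, "SD_SHARE_PKG_RESOURCES" },
};

/* Write the names of the set bits in @flags into @buf, space separated. */
static void show_sd_flags(unsigned int flags, char *buf, size_t len)
{
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(sd_flag_names) / sizeof(sd_flag_names[0]); i++) {
		if (!(flags & sd_flag_names[i].bit))
			continue;
		if (buf[0])
			strncat(buf, " ", len - strlen(buf) - 1);
		strncat(buf, sd_flag_names[i].name, len - strlen(buf) - 1);
	}
}
```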

Thread overview: 22+ messages
2023-05-04 16:09 [PATCH 0/6] Enable Cluster Scheduling for x86 Hybrid CPUs Tim Chen
2023-05-04 16:09 ` [PATCH 1/6] sched/topology: Propagate SMT flags when removing degenerate domain Tim Chen
2023-05-04 16:09 ` [PATCH 2/6] sched/fair: Check whether active load balance is needed in busiest group Tim Chen
2023-05-05 12:16   ` Peter Zijlstra
2023-05-05 22:29     ` Tim Chen
2023-05-05 23:44       ` Peter Zijlstra
2023-05-09 13:31   ` Vincent Guittot
2023-05-04 16:09 ` [PATCH 3/6] sched/fair: Fix busiest group selection for asym groups Tim Chen
2023-05-05 13:19   ` Peter Zijlstra
2023-05-05 22:36     ` Tim Chen
2023-05-04 16:09 ` [PATCH 4/6] sched/fair: Skip prefer sibling move between SMT group and non-SMT group Tim Chen
2023-05-05 13:22   ` Peter Zijlstra
2023-05-05 23:07     ` Tim Chen
2023-05-05 23:38       ` Peter Zijlstra
2023-05-06  0:08         ` Peter Zijlstra [this message]
2023-05-09 13:36   ` Vincent Guittot
2023-05-09 23:35     ` Tim Chen
2023-05-04 16:09 ` [PATCH 5/6] sched/fair: Consider the idle state of the whole core for load balance Tim Chen
2023-05-05 13:23   ` Peter Zijlstra
2023-05-05 22:51     ` Tim Chen
2023-05-04 16:09 ` [PATCH 6/6] sched/x86: Add cluster topology to hybrid CPU Tim Chen
     [not found] ` <20230505071735.4083-1-hdanton@sina.com>
2023-05-05 22:49   ` [PATCH 5/6] sched/fair: Consider the idle state of the whole core for load balance Tim Chen
