Re: [Patch v2 3/6] sched/fair: Implement prefer sibling imbalance calculation between asymmetric groups

From: Tim Chen <tim.c.chen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Ricardo Neri <ricardo.neri@intel.com>,
	"Ravi V . Shankar" <ravi.v.shankar@intel.com>,
	Ben Segall <bsegall@google.com>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Len Brown <len.brown@intel.com>, Mel Gorman <mgorman@suse.de>,
	"Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Valentin Schneider <vschneid@redhat.com>,
	Ionela Voinescu <ionela.voinescu@arm.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Shrikanth Hegde <sshegde@linux.vnet.ibm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
	naveen.n.rao@linux.vnet.ibm.com,
	Yicong Yang <yangyicong@hisilicon.com>,
	Barry Song <v-songbaohua@oppo.com>, Chen Yu <yu.c.chen@intel.com>,
	Hillf Danton <hdanton@sina.com>
Subject: Re: [Patch v2 3/6] sched/fair: Implement prefer sibling imbalance calculation between asymmetric groups
Date: Tue, 13 Jun 2023 10:46:36 -0700
Message-ID: <321a474bfa562164a56f504144d6b33eb2f7acbd.camel@linux.intel.com>
In-Reply-To: <20230612120528.GL4253@hirez.programming.kicks-ass.net>

On Mon, 2023-06-12 at 14:05 +0200, Peter Zijlstra wrote:
> On Thu, Jun 08, 2023 at 03:32:29PM -0700, Tim Chen wrote:
> 
> > 
> > +	if (env->sd->flags & SD_ASYM_PACKING) {
> > +		int limit;
> > +
> > +		if (!busiest->sum_nr_running)
> > +			goto out;
> 
> This seems out-of-place, shouldn't we have terminated sooner if busiest
> is empty?

Yes, I should move this check to the beginning.

> 
> > +
> > +		if (sched_asym_prefer(env->dst_cpu, sds->busiest->asym_prefer_cpu)) {
> > +			/* Don't leave preferred core idle */
> > +			if (imbalance == 0 && local->sum_nr_running < ncores_local)
> > +				imbalance = 1;
> > +			goto out;
> > +		}
> > +
> > +		/* Limit tasks moved from preferred group, don't leave cores idle */
> > +		limit = busiest->sum_nr_running;
> > +		lsub_positive(&limit, ncores_busiest);
> > +		if (imbalance > limit)
> > +			imbalance = limit;
> 
> How does this affect the server parts that have larger than single core
> turbo domains?

Are you thinking about the case where the local group is completely empty,
so there is turbo headroom and we should move at least one task, even though
the CPUs in the busiest group have higher priority?

In other words, are you suggesting we should add the following?

		if (imbalance == 0 && busiest->sum_nr_running > 0 &&
			local->sum_nr_running == 0)
			imbalance = 1;


> 
> > +
> > +		goto out;
> > +	}
> > +
> > +	/* Take advantage of resource in an empty sched group */
> > +	if (imbalance == 0 && local->sum_nr_running == 0 &&
> > +	    busiest->sum_nr_running > 1)
> > +		imbalance = 1;
> > +out:
> > +	return imbalance << 1;
> > +}
> 
> 
> But basically you have:
> 
>         LcBn - BcLn
>   imb = -----------
>            LcBc
> 
> Which makes sense, except you then return:
> 
>   imb * 2
> 
> which then made me wonder about rounding.
> 
> Do we want to add (LcBc - 1) or (LcBc/2) to resp. ceil() or round()
> the thing before division? Because currently it uses floor().
> 
> If you evaluate it like:
> 
> 
>         2 * (LcBn - BcLn)
>   imb = -----------------
>               LcBc
> 
> The result is different from what you have now.

If I do the rounding after multiplying imb by two (call it imb_new),
the difference from the imb I am returning now (call it imb_old)
will be at most 1.  Note that the imb_old returned is always a multiple of 2.

I will be using imb in calculate_imbalance() and dividing it
by 2 there to get the number of tasks to move from the busiest group.
So when there is a difference of 1 between imb_old and imb_new,
the difference will be trimmed off after the division by 2.

We will get the same number of tasks to move with either
imb_old or imb_new in calculate_imbalance() so the two
computations will arrive at the same result eventually.
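
A minimal sketch of the argument above, assuming plain truncating
division on non-negative operands. The helper names imb_old(),
imb_new() and count_mismatches() are hypothetical and exist only for
this illustration; num stands for (LcBn - BcLn) and den for LcBc.
The sketch covers the floor() case only and does not model the
ceil() or round() variants raised in the review.

```c
/* imb_old: floor the quotient first, then double it (always even). */
static unsigned long imb_old(unsigned long num, unsigned long den)
{
	return (num / den) << 1;
}

/* imb_new: double the numerator first, then floor the quotient. */
static unsigned long imb_new(unsigned long num, unsigned long den)
{
	return (2 * num) / den;
}

/*
 * Count the (num, den) pairs for which halving imb_old and imb_new
 * gives a different number of tasks to move.
 */
static unsigned long count_mismatches(unsigned long max_num,
				      unsigned long max_den)
{
	unsigned long num, den, bad = 0;

	for (den = 1; den <= max_den; den++)
		for (num = 0; num <= max_num; num++)
			if (imb_old(num, den) / 2 != imb_new(num, den) / 2)
				bad++;
	return bad;
}
```

For example, with num = 7 and den = 4, imb_old() gives 2 and imb_new()
gives 3, and both halve to 1 task.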

> 
> What actual behaviour is desired in these low imbalance cases? and can
> you write a comment as to why we do as we do etc..?

I do not keep imb as 

           2 * (LcBn - BcLn)
   imb = -----------------
               LcBc

because it is easier to leave out the factor of 2
in the middle of the sibling_imbalance() computation,
so that I can directly interpret imb as the number
of tasks to move, and add the factor of two only
when I actually need to return the imbalance.

Would you like me to add this reasoning in the comments?
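
A rough sketch of that ordering, following the LcBn/BcLn formula quoted
above rather than the patch itself. struct group_stat and
sketch_sibling_imbalance() are hypothetical names for this
illustration; the asym-packing and empty-group special cases are
left out.

```c
/* Hypothetical stand-in for the per-group statistics used above. */
struct group_stat {
	unsigned int nr_running;	/* Ln or Bn */
	unsigned int ncores;		/* Lc or Bc */
};

/*
 * Keep imb in units of "tasks to move" throughout, and apply the
 * factor of two only at the return.
 */
static long sketch_sibling_imbalance(const struct group_stat *local,
				     const struct group_stat *busiest)
{
	long lhs = (long)local->ncores * busiest->nr_running;	/* LcBn */
	long rhs = (long)busiest->ncores * local->nr_running;	/* BcLn */
	long den = (long)local->ncores * busiest->ncores;	/* LcBc */
	long imb;

	/* Nothing to compute without cores on both sides. */
	if (den == 0)
		return 0;

	/* Clamp at zero, in the spirit of lsub_positive(). */
	imb = lhs > rhs ? lhs - rhs : 0;

	/* Truncating division: imb is now the number of tasks to move. */
	imb /= den;

	/* The caller is expected to divide by 2 again. */
	return imb << 1;
}
```

With a 4-core empty local group and a 2-core busiest group running 6
tasks, this gives imb = 24 / 8 = 3 and returns 6.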

Thanks.

Tim

