From: Tim Chen <tim.c.chen@linux.intel.com>
To: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>,
	Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	 Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	 Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Valentin Schneider <vschneid@redhat.com>
Subject: Re: [PATCH] sched/fair: Enable group_asym_packing in find_idlest_group
Date: Tue, 09 Jan 2024 16:58:27 -0800
Message-ID: <a100b38341e13afbb5f8753b731c9e469e704667.camel@linux.intel.com>
In-Reply-To: <ea049b25-ba49-4790-8b79-05078adbfc77@linux.vnet.ibm.com>

On Thu, 2024-01-04 at 21:20 +0530, Shrikanth Hegde wrote:
> On 10/18/23 9:20 PM, Srikar Dronamraju wrote:
> 
> Hi Srikar, 
> 
> > Current scheduler code doesn't handle SD_ASYM_PACKING in the
> > find_idlest_cpu path. On few architectures, like Powerpc, cache is at a
> > core. Moving threads across cores may end up in cache misses.
> > 
> > While asym_packing can be enabled above SMT level, enabling Asym packing
> > across cores could result in poorer performance due to cache misses.
> > However if the initial task placement via find_idlest_cpu does take
> > Asym_packing into consideration, then scheduler can avoid asym_packing
> > migrations. This will result in lesser migrations and better packing and
> > better overall performance.
> > 
> 
> This would handle the asym packing case when finding the idle CPU for a newly
> woken task, thereby reducing the number of migrations if the task is placed
> correctly in the first place. I think that's helpful.
>
> Currently Intel cluster and PowerVM shared LPARs are the two cases where
> ASYM_PACKING is enabled at a domain higher than SMT. Is that correct, or is
> there any other topology?
> 
> +tim 
> 
> > Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> > ---
> >  kernel/sched/fair.c | 33 ++++++++++++++++++++++++++++++---
> >  1 file changed, 30 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index cb225921bbca..7164f79a3d13 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -9931,11 +9931,13 @@ static int idle_cpu_without(int cpu, struct task_struct *p)
> >   * @group: sched_group whose statistics are to be updated.
> >   * @sgs: variable to hold the statistics for this group.
> >   * @p: The task for which we look for the idlest group/CPU.
> > + * @this_cpu: current cpu
> >   */
> >  static inline void update_sg_wakeup_stats(struct sched_domain *sd,
> >  					  struct sched_group *group,
> >  					  struct sg_lb_stats *sgs,
> > -					  struct task_struct *p)
> > +					  struct task_struct *p,
> > +					  int this_cpu)
> >  {
> >  	int i, nr_running;
> >  
> > @@ -9972,6 +9974,11 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
> >  
> >  	}
> >  
> > +	if (sd->flags & SD_ASYM_PACKING && sgs->sum_h_nr_running &&
> > +			sched_asym_prefer(group->asym_prefer_cpu, this_cpu)) {
> > +		sgs->group_asym_packing = 1;

I disagree with the above criteria for doing asym_packing.

I think asym packing only makes sense if the group has an idle CPU
that is preferred over this_cpu, and the group has fewer tasks than
CPUs.  Using group->asym_prefer_cpu is inappropriate, because that
most preferred CPU may be busy.  You should be migrating the task
from this_cpu to the highest-priority idle CPU identified in the group.

If the group is fully busy or overloaded, we should stick with the original
logic of picking the most lightly loaded group and not use asym_packing.

You may want to record the highest-priority idle CPU in each group (the
most preferred one, if the group has more than one idle CPU), so that
two groups that both have idle CPUs can be compared.
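
As a rough illustration of the criteria above, here is a userspace model,
not kernel code. The names struct model_cpu, struct model_group,
best_idle_cpu(), asym_packing_candidate() and prefer_group() are
hypothetical and exist only for this sketch, and CPU priority is modelled
as a plain integer where a higher value means more preferred. It assumes
the per-group statistics would carry the running-task count and the set
of idle CPUs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one CPU; not a kernel structure. */
struct model_cpu {
	int prio;	/* higher value = more preferred for asym packing */
	bool idle;
};

/* Hypothetical model of a sched group and the stats gathered for it. */
struct model_group {
	const struct model_cpu *cpus;
	size_t nr_cpus;
	unsigned int nr_running;	/* tasks currently running in the group */
};

/*
 * Return the index of the highest-priority idle CPU in the group,
 * or -1 if no CPU in the group is idle.
 */
static int best_idle_cpu(const struct model_group *g)
{
	int best = -1;

	for (size_t i = 0; i < g->nr_cpus; i++) {
		if (!g->cpus[i].idle)
			continue;
		if (best < 0 || g->cpus[i].prio > g->cpus[best].prio)
			best = (int)i;
	}
	return best;
}

/*
 * The group is an asym-packing candidate only if it has fewer tasks than
 * CPUs and one of its idle CPUs is preferred over this_cpu.
 */
static bool asym_packing_candidate(const struct model_group *g,
				   int this_cpu_prio)
{
	int best = best_idle_cpu(g);

	if (best < 0 || g->nr_running >= g->nr_cpus)
		return false;
	return g->cpus[best].prio > this_cpu_prio;
}

/* Between two groups, prefer the one whose best idle CPU ranks higher. */
static bool prefer_group(const struct model_group *a,
			 const struct model_group *b)
{
	int ia = best_idle_cpu(a), ib = best_idle_cpu(b);

	if (ia < 0)
		return false;
	if (ib < 0)
		return true;
	return a->cpus[ia].prio > b->cpus[ib].prio;
}
```

Note that a group whose most preferred CPU is busy is judged by its best
idle CPU rather than by its overall most preferred CPU, and a group with
no idle CPU never qualifies, so it falls back to the existing
lightest-load comparison.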

Tim

> > +	}
> > +
> 
> 
> I think there is a corner case here that could be handled. Please correct me
> if I am wrong.
>
> Assume there are four sched groups, sg1, sg2, sg3 and sg4, and asym packing is
> enabled at sd. sg1 and sg3 have one task each, and a new task is being created,
> so find_idlest_cpu is called for this new task.
>
> Because of the sgs->sum_h_nr_running check, sg1 and sg3 will be
> group_asym_packing, while sg2 and sg4 will be group_has_spare.
> update_pick_idlest will choose the lowest type, i.e. group_has_spare, so the
> tie is between sg2 and sg4. Because of asym packing (at least for the powerpc
> shared LPAR case), sg4 will have lower utilization than sg2, and hence the
> idlest CPU will be picked from sg4. On the next load balance, sg2 will pull
> the task from sg4 due to asym packing.
>
> The additional migration may be avoided if we omit the sum_h_nr_running check?
> 
> 
> >  	sgs->group_capacity = group->sgc->capacity;
> >  
> >  	sgs->group_weight = group->group_weight;
> > @@ -10012,8 +10019,17 @@ static bool update_pick_idlest(struct sched_group *idlest,
> >  			return false;
> >  		break;
> >  
> > -	case group_imbalanced:
> >  	case group_asym_packing:
> > +		if (sched_asym_prefer(group->asym_prefer_cpu, idlest->asym_prefer_cpu)) {
> > +			int busy_cpus = idlest_sgs->group_weight - idlest_sgs->idle_cpus;
> > +
> > +			busy_cpus -= (sgs->group_weight - sgs->idle_cpus);
> > +			if (busy_cpus >= 0)
> > +				return true;
> 
> 
> Wouldn't using idle_cpus be simpler? Something like:
>
> if (sgs->idle_cpus > idlest_sgs->idle_cpus)
> 	return true;
> 
> > +		}
> > +		return false;
> > +
> > +	case group_imbalanced:
> >  	case group_smt_balance:
> >  		/* Those types are not used in the slow wakeup path */
> >  		return false;
> > @@ -10080,7 +10096,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
> >  			sgs = &tmp_sgs;
> >  		}
> >  
> > -		update_sg_wakeup_stats(sd, group, sgs, p);
> > +		update_sg_wakeup_stats(sd, group, sgs, p, this_cpu);
> >  
> >  		if (!local_group && update_pick_idlest(idlest, &idlest_sgs, group, sgs)) {
> >  			idlest = group;
> > @@ -10112,6 +10128,17 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p, int this_cpu)
> >  	if (local_sgs.group_type > idlest_sgs.group_type)
> >  		return idlest;
> >  
> > +	if (idlest_sgs.group_type == group_asym_packing) {
> > +		if (sched_asym_prefer(idlest->asym_prefer_cpu, local->asym_prefer_cpu)) {
> > +			int busy_cpus = local_sgs.group_weight - local_sgs.idle_cpus;
> > +
> > +			busy_cpus -= (idlest_sgs.group_weight - idlest_sgs.idle_cpus);
> > +			if (busy_cpus >= 0)
> > +				return idlest;
> > +		}
> > +		return NULL;
> > +	}
> 
> Same comment about using idle_cpus.
> 
> > +
> >  	switch (local_sgs.group_type) {
> >  	case group_overloaded:
> >  	case group_fully_busy:
> 


Thread overview: 4+ messages
2023-10-18 15:50 [PATCH] sched/fair: Enable group_asym_packing in find_idlest_group Srikar Dronamraju
2023-12-15  4:10 ` Srikar Dronamraju
2024-01-04 15:50 ` Shrikanth Hegde
2024-01-10  0:58   ` Tim Chen [this message]