Re: [PATCH v4] sched/fair: Consider cpu affinity when allowing NUMA imbalance in find_idlest_group

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Mel Gorman <mgorman@techsingularity.net>
To: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: peterz@infradead.org, aubrey.li@linux.intel.com, efault@gmx.de,
	gautham.shenoy@amd.com, linux-kernel@vger.kernel.org,
	mingo@kernel.org, song.bao.hua@hisilicon.com,
	srikar@linux.vnet.ibm.com, valentin.schneider@arm.com,
	vincent.guittot@linaro.org
Subject: Re: [PATCH v4] sched/fair: Consider cpu affinity when allowing NUMA imbalance in find_idlest_group
Date: Thu, 17 Feb 2022 13:15:12 +0000	[thread overview]
Message-ID: <20220217131512.GW3366@techsingularity.net> (raw)
In-Reply-To: <a57dbac2-1b8e-ea5c-8b6c-3a97ac186ad9@amd.com>

On Thu, Feb 17, 2022 at 04:53:51PM +0530, K Prateek Nayak wrote:
> Hello Mel,
> 
> 
> Thank you for looking into the patch.
> 
> On 2/17/2022 3:35 PM, Mel Gorman wrote:
> > Thanks Prateek,
> >
> > On Thu, Feb 17, 2022 at 11:24:08AM +0530, K Prateek Nayak wrote:
> >> [..snip..]
> >>
> >> Eg: numactl -C 0,16,32,48,64,80,96,112 ./stream8
> >>
> > In this case the stream threads can use any CPU of the subset, presumably
> > this is parallelised with OpenMP without specifying spread or bind
> > directives.
> Yes it is parallelized using OpenMP without specifying any directive.
> > [..snip..]
> > One concern I have is that we incur a cpumask setup and cpumask_weight
> > cost on every clone whether a restricted CPU mask is used or not.  Peter,
> > is it acceptable to avoid the cpumask check if there is no restrictions
> > on allowed cpus like this?
> >
> > 	imb = sd->imb_numa_nr;
> > 	if (p->nr_cpus_allowed != num_online_cpus())
> > 		struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> >
> > 		cpumask_and(cpus, sched_group_span(local), p->cpus_ptr);
> > 		imb = min(cpumask_weight(cpus), imb);
> > 	}
> Can we optimize this further as:
> 
> 	imb = sd->imb_numa_nr;
> 	if (unlikely(p->nr_cpus_allowed != num_online_cpus()))
> 		struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
> 
> 		cpumask_and(cpus, sched_group_span(local), p->cpus_ptr);
> 		imb = min(cpumask_weight(cpus), imb);
> 	}
> 
> As for most part, p->nr_cpus_allowed will be equal to num_online_cpus()
> unless user has specifically pinned the task.
> 

I'm a little wary due to https://lwn.net/Articles/420019/ raising concerns
from people that feel more strongly about likely/unlikely use. Whether that
branch is likely true or not is specific to the deployment. On my desktop
and most tests I run, the branch is very unlikely because most workloads
I run are usually not CPU-constrained and not fork-intensive. Even those
that are CPU contrained are generally not fork intensive. For a setup with
lots of containers, virtual machines, locality-aware applications etc,
the path is potentially very likely and harder to detect in the future.

I don't object to the change but I would wonder if it's measurable for
anything other than a fork-intensive microbenchmark given it's one branch
in a relatively heavy operation.

I think a relatively harmless micro-optimisation would be

-		imb = min(cpumask_weight(cpus), imb);
+		imb = cpumask_weight(cpus);

It assumes that the constrained cpus_allowed would have a lower imb
than one calculated based on all cpus allowed which sounds like a safe
assumption other than racing with hot-onlining a bunch of CPUs.

I think both micro-optimisations are negligible in comparison to avoiding
an unecessary cpumask_and cpumask_weight call. FWIW, I looked at my own
use of likely/unlikely recently and it's

c49c2c47dab6b8d45022b3fabf0642a0e62e3109 unlikely that memory hotplug operation is in progress
3b12e7e97938424de2bb1b95ba0bd6a49bad39f9 hotplug active or machine booting
df1acc856923c0a65c28b588585449106c316b71 memory isolated for hotplug or CMA attempt in progress
56f0e661ea8c0178e80048df7166653a51ef2c3d memory isolated for hotplug or CMA attempt in progress
b3b64ebd38225d8032b5db42938d969b602040c2 bulk allocation request with an array that already has pages

Of those, the last one is the most marginal because it really depends
on whether network core or NFS is the heavy user of the interface and
I made a guess that high-speed networks are more common critical paths
than NFS servers.

-- 
Mel Gorman
SUSE Labs

next prev parent reply	other threads:[~2022-02-17 13:15 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-17  5:54 [PATCH v4] sched/fair: Consider cpu affinity when allowing NUMA imbalance in find_idlest_group K Prateek Nayak
2022-02-17 10:05 ` Mel Gorman
2022-02-17 11:23   ` K Prateek Nayak
2022-02-17 13:15     ` Mel Gorman [this message]
2022-02-18  3:55       ` K Prateek Nayak
2022-02-22  6:00       ` K Prateek Nayak
2022-02-22  8:45         ` Mel Gorman
2022-02-22  9:39           ` K Prateek Nayak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220217131512.GW3366@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=aubrey.li@linux.intel.com \
    --cc=efault@gmx.de \
    --cc=gautham.shenoy@amd.com \
    --cc=kprateek.nayak@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=song.bao.hua@hisilicon.com \
    --cc=srikar@linux.vnet.ibm.com \
    --cc=valentin.schneider@arm.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox