public inbox for linux-kernel@vger.kernel.org
From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
To: Ankit Jain <ankitja@vmware.com>
Cc: peterz@infradead.org, yury.norov@gmail.com,
	linux@rasmusvillemoes.dk, qyousef@layalina.io, pjt@google.com,
	joshdon@google.com, bristot@redhat.com, vschneid@redhat.com,
	linux-kernel@vger.kernel.org, namit@vmware.com,
	amakhalov@vmware.com, srinidhir@vmware.com,
	vsirnapalli@vmware.com, vbrahmajosyula@vmware.com,
	akaher@vmware.com, srivatsa@csail.mit.edu
Subject: Re: [PATCH RFC] cpumask: Randomly distribute the tasks within affinity mask
Date: Wed, 11 Oct 2023 13:39:12 +0300
Message-ID: <ZSZ7UOBupdHHB24h@smile.fi.intel.com>
In-Reply-To: <20231011071925.761590-1-ankitja@vmware.com>

On Wed, Oct 11, 2023 at 12:49:25PM +0530, Ankit Jain wrote:
> commit 46a87b3851f0 ("sched/core: Distribute tasks within affinity masks")
> and commit 14e292f8d453 ("sched,rt: Use cpumask_any*_distribute()")
> introduced logic to distribute tasks at initial wakeup across cpus
> where load balancing works poorly or is disabled entirely (isolated cpus).
> 
> There are cases in which tasks spawned on isolcpus are not
> distributed properly. In production deployments, the initial wakeup of
> tasks spawned from housekeeping cpus onto isolcpus [nohz_full cpus]
> happens on the first cpu within the isolcpus range instead of being
> distributed across the isolcpus.
> 
> Usage of distribute_cpu_mask_prev by one process group
> clobbers the value previously stored by other groups, and vice versa.
> 
> When housekeeping cpus spawn multiple child tasks to wake up on
> isolcpus [nohz_full cpus], using cpuset.cpus/sched_setaffinity(),
> distribution is currently performed based on the per-cpu
> distribute_cpu_mask_prev counter.
> At the same time, housekeeping cpus also run per-cpu bound timer
> interrupts, rcu threads and other system/user tasks whose affinity
> is the housekeeping cpus. In a real-life environment, housekeeping
> cpus are far fewer and heavily loaded.
> So the distribute_cpu_mask_prev value left by these tasks affects
> the offset used for the tasks being spawned to wake up on isolcpus,
> and thus most of the tasks end up waking up on the first cpu within
> the isolcpus set.
> 
> Steps to reproduce:
> Kernel cmdline parameters:
> isolcpus=2-5 skew_tick=1 nohz=on nohz_full=2-5
> rcu_nocbs=2-5 rcu_nocb_poll idle=poll irqaffinity=0-1
> 
> * pid=$(echo $$)
> * taskset -pc 0 $pid
> * cat loop-normal.c
> int main(void)
> {
>         while (1)
>                 ;
>         return 0;
> }
> * gcc -o loop-normal loop-normal.c
> * for i in {1..50}; do ./loop-normal & done
> * pids=$(ps -a | grep loop-normal | cut -d' ' -f5)
> * for i in $pids; do taskset -pc 2-5 $i ; done
> 
> Expected output:
> * All 50 “loop-normal” tasks should wake up on cpu2-5
> equally distributed.
> * ps -eLo cpuid,pid,tid,ppid,cls,psr,cmd | grep "^    [2345]"
> 
> Actual output:
> * All 50 “loop-normal” tasks got woken up on cpu2 only
> 
> Analysis:
> There are per-cpu bound timer interrupts and rcu thread activity
> every few microseconds on housekeeping cpus, exercising
> find_lowest_rq() -> cpumask_any_and_distribute()/cpumask_any_distribute().
> So the per-cpu variable distribute_cpu_mask_prev on housekeeping cpus
> keeps getting set to a housekeeping cpu. Bash/docker processes
> share the same per-cpu variable, as they run on housekeeping cpus.
> Thus combining the clobbered distribute_cpu_mask_prev with the
> new mask (isolcpus) always returns the first cpu within the new mask,
> in accordance with the logic introduced in the commits above.
> 
> Fix the issue by using random cores out of the applicable CPU set
> instead of relying on distribute_cpu_mask_prev.
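
For reference, the clobbering described in the analysis can be modelled
in user space. This is only a sketch: it assumes the helper resumes its
search just after the previously returned CPU and wraps around, as the
quoted analysis describes. All model_* names, the 8-CPU limit and the
plain-integer masks are hypothetical stand-ins, not kernel code.

```c
#define MODEL_NR_CPUS 8

/* Stand-in for distribute_cpu_mask_prev of a single housekeeping CPU. */
int model_prev;

/* Next set bit in mask at or after start, wrapping around; -1 if none. */
int model_next_wrap(unsigned int mask, int start)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		int cpu = (start + i) % MODEL_NR_CPUS;

		if (mask & (1u << cpu))
			return cpu;
	}
	return -1;
}

/* Simplified model of the distribute helper: continue after the last pick. */
int model_any_distribute(unsigned int mask)
{
	int next = model_next_wrap(mask, model_prev + 1);

	if (next >= 0)
		model_prev = next;
	return next;
}
```

With a housekeeping mask of 0x03 and an isolcpus mask of 0x3c, back-to-back
calls on 0x3c rotate through 2, 3, 4, 5. As soon as a call on 0x03 lands
between two calls on 0x3c, model_prev is pulled back to 0 or 1 and the
next pick from 0x3c is 2 again.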

> Fixes: 46a87b3851f0 ("sched/core: Distribute tasks within affinity masks")
> Fixes: 14e292f8d453 ("sched,rt: Use cpumask_any*_distribute()")

> 

Blank lines are not allowed in the tag block.

> Signed-off-by: Ankit Jain <ankitja@vmware.com>

...

> +/**
> + * Returns an arbitrary cpu within srcp.
> + *
> + * Iterated calls using the same srcp will be randomly distributed
> + */

This is not valid kernel-doc. Always run

	scripts/kernel-doc -v -none -Wall ...

against the file of interest and fix all warnings and errors reported.
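
For illustration, a comment that follows the kernel-doc layout has
roughly the shape below: a name line with a short summary, one line per
parameter, a description, and a Return: section. The function name, its
parameter and the user-space body are hypothetical stand-ins and are not
taken from the patch.

```c
#include <stdlib.h>

/**
 * example_pick_random_cpu() - Pick an arbitrary CPU from a bitmask.
 * @mask: Candidate CPUs; bit N set means CPU N may be chosen.
 *
 * Iterated calls using the same @mask are randomly distributed over
 * the set bits. Hypothetical user-space stand-in, shown only to
 * demonstrate the kernel-doc layout.
 *
 * Return: The chosen CPU number, or -1 if @mask is empty.
 */
int example_pick_random_cpu(unsigned long mask)
{
	unsigned long m;
	int weight = 0, target, cpu;

	for (m = mask; m; m &= m - 1)
		weight++;	/* count the set bits */
	if (!weight)
		return -1;

	target = rand() % weight;	/* index of the set bit to return */
	for (cpu = 0; ; cpu++) {
		if (!(mask & (1UL << cpu)))
			continue;
		if (target-- == 0)
			return cpu;
	}
}
```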

-- 
With Best Regards,
Andy Shevchenko


