All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Shi <alex.shi@intel.com>
To: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: rob@landley.net, mingo@redhat.com, peterz@infradead.org,
	gregkh@linuxfoundation.org, andre.przywara@amd.com, rjw@sisk.pl,
	paul.gortmaker@windriver.com, akpm@linux-foundation.org,
	paulmck@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
	pjt@google.com, vincent.guittot@linaro.org
Subject: Re: [PATCH 01/18] sched: select_task_rq_fair clean up
Date: Tue, 11 Dec 2012 19:53:13 +0800	[thread overview]
Message-ID: <50C71EA9.3060507@intel.com> (raw)
In-Reply-To: <50C6D31F.1030408@linux.vnet.ibm.com>

On 12/11/2012 02:30 PM, Preeti U Murthy wrote:
> On 12/11/2012 10:58 AM, Alex Shi wrote:
>> On 12/11/2012 12:23 PM, Preeti U Murthy wrote:
>>> Hi Alex,
>>>
>>> On 12/10/2012 01:52 PM, Alex Shi wrote:
>>>> It is impossible to miss a task allowed cpu in a eligible group.
>>>
>>> The one thing I am concerned with here is if there is a possibility of
>>> the task changing its tsk_cpus_allowed() while this code is running.
>>>
>>> i.e find_idlest_group() finds an idle group,then the tsk_cpus_allowed()
>>> for the task changes,perhaps by the user himself,which might not include
>>> the cpus in the idle group.After this find_idlest_cpu() is called.I mean
>>> a race condition in short.Then we might not have an eligible cpu in that
>>> group right?
>>
>> your worry make sense, but the code handle the situation, in
>> select_task_rq(), it will check the cpu allowed again. if the answer is
>> no, it will fallback to old cpu.
>>>
>>>> And since find_idlest_group only return a different group which
>>>> excludes old cpu, it's also imporissible to find a new cpu same as old
>>>> cpu.
> 
> I doubt this will work correctly.Consider the following situation:sched
> domain begins with sd that encloses both socket1 and socket2
> 
> cpu0 cpu1  | cpu2 cpu3
> -----------|-------------
>  socket1   |  socket2
> 
> old cpu = cpu1
> 
> Iteration1:
> 1.find_idlest_group() returns socket2 to be idlest.
> 2.task changes tsk_allowed_cpus to 0,1
> 3.find_idlest_cpu() returns cpu2
> 
> * without your patch
>    1.the condition after find_idlest_cpu() returns -1,and sd->child is
> chosen which happens to be socket1
>    2.in the next iteration, find_idlest_group() and find_idlest_cpu()
> will probably choose cpu0 which happens to be idler than cpu1,which is
> in tsk_allowed_cpu.

Thanks for question Preeti! :)

Yes, with more iteration you has more possibility to get task allowed
cpu in select_task_rq_fair. but how many opportunity the situation
happened?  how much gain you get here?
With LCPU increasing many many iterations cause scalability issue. that
is the simplified forking patchset for. and that why 10% performance
gain on hackbench process/thread.

and if you insist want not to miss your chance in strf(), the current
iteration is still not enough. How you know the idlest cpu is still
idlest after this function finished? how to ensure the allowed cpu won't
be changed again?

A quick snapshot is enough in balancing here. we still has periodic
balacning.

> 
> * with your patch
>    1.the condition after find_idlest_cpu() does not exist,therefore
> a sched domain to which cpu2 belongs to is chosen.this is socket2.(under
> the for_each_domain() loop).
>    2.in the next iteration, find_idlest_group() return NULL,because
> there is no cpu which intersects with tsk_allowed_cpus.
>    3.in select task rq,the fallback cpu is chosen even when an idle cpu
> existed.
> 
> So my concern is though select_task_rq() checks the
> tsk_allowed_cpus(),you might end up choosing a different path of
> sched_domains compared to without this patch as shown above.
> 
> In short without the "if(new_cpu==-1)" condition we might get misled
> doing unnecessary iterations over the wrong sched domains in
> select_task_rq_fair().(Think about situations when not all the cpus of
> socket2 are disallowed by the task,then there will more iterations in

After read the first 4 patches, believe you will find the patchset is
trying to reduce iterations, not increase them.

> the wrong path of sched_domains before exit,compared to what is shown
> above.)
> 
> Regards
> Preeti U Murthy
> 
> 


-- 
Thanks
    Alex

  reply	other threads:[~2012-12-11 11:53 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-10  8:22 [PATCH 0/18] sched: simplified fork, enable load average into LB and power awareness scheduling Alex Shi
2012-12-10  8:22 ` [PATCH 01/18] sched: select_task_rq_fair clean up Alex Shi
2012-12-11  4:23   ` Preeti U Murthy
2012-12-11  5:28     ` Alex Shi
2012-12-11  6:30       ` Preeti U Murthy
2012-12-11 11:53         ` Alex Shi [this message]
2012-12-12  5:26           ` Preeti U Murthy
2012-12-21  4:28         ` Namhyung Kim
2012-12-23 12:17           ` Alex Shi
2012-12-10  8:22 ` [PATCH 02/18] sched: fix find_idlest_group mess logical Alex Shi
2012-12-11  5:08   ` Preeti U Murthy
2012-12-11  5:29     ` Alex Shi
2012-12-11  5:50       ` Preeti U Murthy
2012-12-11 11:55         ` Alex Shi
2012-12-10  8:22 ` [PATCH 03/18] sched: don't need go to smaller sched domain Alex Shi
2012-12-10  8:22 ` [PATCH 04/18] sched: remove domain iterations in fork/exec/wake Alex Shi
2012-12-10  8:22 ` [PATCH 05/18] sched: load tracking bug fix Alex Shi
2012-12-10  8:22 ` [PATCH 06/18] sched: set initial load avg of new forked task as its load weight Alex Shi
2012-12-21  4:33   ` Namhyung Kim
2012-12-23 12:00     ` Alex Shi
2012-12-10  8:22 ` [PATCH 07/18] sched: compute runnable load avg in cpu_load and cpu_avg_load_per_task Alex Shi
2012-12-12  3:57   ` Preeti U Murthy
2012-12-12  5:52     ` Alex Shi
2012-12-13  8:45     ` Alex Shi
2012-12-21  4:35       ` Namhyung Kim
2012-12-23 11:42         ` Alex Shi
2012-12-10  8:22 ` [PATCH 08/18] sched: consider runnable load average in move_tasks Alex Shi
2012-12-12  4:41   ` Preeti U Murthy
2012-12-12  6:26     ` Alex Shi
2012-12-21  4:43       ` Namhyung Kim
2012-12-23 12:29         ` Alex Shi
2012-12-10  8:22 ` [PATCH 09/18] Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking" Alex Shi
2012-12-10  8:22 ` [PATCH 10/18] sched: add sched_policy in kernel Alex Shi
2012-12-10  8:22 ` [PATCH 11/18] sched: add sched_policy and it's sysfs interface Alex Shi
2012-12-10  8:22 ` [PATCH 12/18] sched: log the cpu utilization at rq Alex Shi
2012-12-10  8:22 ` [PATCH 13/18] sched: add power aware scheduling in fork/exec/wake Alex Shi
2012-12-10  8:22 ` [PATCH 14/18] sched: add power/performance balance allowed flag Alex Shi
2012-12-10  8:22 ` [PATCH 15/18] sched: don't care if the local group has capacity Alex Shi
2012-12-10  8:22 ` [PATCH 16/18] sched: pull all tasks from source group Alex Shi
2012-12-10  8:22 ` [PATCH 17/18] sched: power aware load balance, Alex Shi
2012-12-10  8:22 ` [PATCH 18/18] sched: lazy powersaving balance Alex Shi
2012-12-11  0:51 ` [PATCH 0/18] sched: simplified fork, enable load average into LB and power awareness scheduling Alex Shi
2012-12-11 12:10   ` Alex Shi
2012-12-11 15:48     ` Borislav Petkov
2012-12-11 16:03       ` Arjan van de Ven
2012-12-11 16:13         ` Borislav Petkov
2012-12-11 16:40           ` Arjan van de Ven
2012-12-12  9:52             ` Amit Kucheria
2012-12-12 13:55               ` Alex Shi
2012-12-12 14:21                 ` Vincent Guittot
2012-12-13  2:51                   ` Alex Shi
2012-12-12 14:41             ` Borislav Petkov
2012-12-13  3:07               ` Alex Shi
2012-12-13 11:35                 ` Borislav Petkov
2012-12-14  1:56                   ` Alex Shi
2012-12-12  1:14           ` Alex Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50C71EA9.3060507@intel.com \
    --to=alex.shi@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=andre.przywara@amd.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul.gortmaker@windriver.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=preeti@linux.vnet.ibm.com \
    --cc=rjw@sisk.pl \
    --cc=rob@landley.net \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.