linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Alex Shi <alex.shi@intel.com>
Cc: rob@landley.net, mingo@redhat.com, peterz@infradead.org,
	gregkh@linuxfoundation.org, andre.przywara@amd.com, rjw@sisk.pl,
	paul.gortmaker@windriver.com, akpm@linux-foundation.org,
	paulmck@linux.vnet.ibm.com, linux-kernel@vger.kernel.org,
	pjt@google.com, vincent.guittot@linaro.org
Subject: Re: [PATCH 01/18] sched: select_task_rq_fair clean up
Date: Wed, 12 Dec 2012 10:56:32 +0530	[thread overview]
Message-ID: <50C81588.20901@linux.vnet.ibm.com> (raw)
In-Reply-To: <50C71EA9.3060507@intel.com>

On 12/11/2012 05:23 PM, Alex Shi wrote:
> On 12/11/2012 02:30 PM, Preeti U Murthy wrote:
>> On 12/11/2012 10:58 AM, Alex Shi wrote:
>>> On 12/11/2012 12:23 PM, Preeti U Murthy wrote:
>>>> Hi Alex,
>>>>
>>>> On 12/10/2012 01:52 PM, Alex Shi wrote:
>>>>> It is impossible to miss a task allowed cpu in a eligible group.
>>>>
>>>> The one thing I am concerned with here is if there is a possibility of
>>>> the task changing its tsk_cpus_allowed() while this code is running.
>>>>
>>>> i.e find_idlest_group() finds an idle group,then the tsk_cpus_allowed()
>>>> for the task changes,perhaps by the user himself,which might not include
>>>> the cpus in the idle group.After this find_idlest_cpu() is called.I mean
>>>> a race condition in short.Then we might not have an eligible cpu in that
>>>> group right?
>>>
>>> your worry make sense, but the code handle the situation, in
>>> select_task_rq(), it will check the cpu allowed again. if the answer is
>>> no, it will fallback to old cpu.
>>>>
>>>>> And since find_idlest_group only return a different group which
>>>>> excludes old cpu, it's also imporissible to find a new cpu same as old
>>>>> cpu.
>>
>> I doubt this will work correctly.Consider the following situation:sched
>> domain begins with sd that encloses both socket1 and socket2
>>
>> cpu0 cpu1  | cpu2 cpu3
>> -----------|-------------
>>  socket1   |  socket2
>>
>> old cpu = cpu1
>>
>> Iteration1:
>> 1.find_idlest_group() returns socket2 to be idlest.
>> 2.task changes tsk_allowed_cpus to 0,1
>> 3.find_idlest_cpu() returns cpu2
>>
>> * without your patch
>>    1.the condition after find_idlest_cpu() returns -1,and sd->child is
>> chosen which happens to be socket1
>>    2.in the next iteration, find_idlest_group() and find_idlest_cpu()
>> will probably choose cpu0 which happens to be idler than cpu1,which is
>> in tsk_allowed_cpu.
> 
> Thanks for question Preeti! :)
> 
> Yes, with more iteration you has more possibility to get task allowed
> cpu in select_task_rq_fair. but how many opportunity the situation
> happened?  how much gain you get here?
> With LCPU increasing many many iterations cause scalability issue. that
> is the simplified forking patchset for. and that why 10% performance
> gain on hackbench process/thread.
> 
> and if you insist want not to miss your chance in strf(), the current
> iteration is still not enough. How you know the idlest cpu is still
> idlest after this function finished? how to ensure the allowed cpu won't
> be changed again?
> 
> A quick snapshot is enough in balancing here. we still has periodic
> balacning.

Hmm ok,let me look at this more closely.
> 
>>
>> * with your patch
>>    1.the condition after find_idlest_cpu() does not exist,therefore
>> a sched domain to which cpu2 belongs to is chosen.this is socket2.(under
>> the for_each_domain() loop).
>>    2.in the next iteration, find_idlest_group() return NULL,because
>> there is no cpu which intersects with tsk_allowed_cpus.
>>    3.in select task rq,the fallback cpu is chosen even when an idle cpu
>> existed.
>>
>> So my concern is though select_task_rq() checks the
>> tsk_allowed_cpus(),you might end up choosing a different path of
>> sched_domains compared to without this patch as shown above.
>>
>> In short without the "if(new_cpu==-1)" condition we might get misled
>> doing unnecessary iterations over the wrong sched domains in
>> select_task_rq_fair().(Think about situations when not all the cpus of
>> socket2 are disallowed by the task,then there will more iterations in
> 
> After read the first 4 patches, believe you will find the patchset is
> trying to reduce iterations, not increase them.

Right,sorry about not noticing this.
> 
>> the wrong path of sched_domains before exit,compared to what is shown
>> above.)
>>
Regards
Preeti U Murthy


  reply	other threads:[~2012-12-12  5:27 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-10  8:22 [PATCH 0/18] sched: simplified fork, enable load average into LB and power awareness scheduling Alex Shi
2012-12-10  8:22 ` [PATCH 01/18] sched: select_task_rq_fair clean up Alex Shi
2012-12-11  4:23   ` Preeti U Murthy
2012-12-11  5:28     ` Alex Shi
2012-12-11  6:30       ` Preeti U Murthy
2012-12-11 11:53         ` Alex Shi
2012-12-12  5:26           ` Preeti U Murthy [this message]
2012-12-21  4:28         ` Namhyung Kim
2012-12-23 12:17           ` Alex Shi
2012-12-10  8:22 ` [PATCH 02/18] sched: fix find_idlest_group mess logical Alex Shi
2012-12-11  5:08   ` Preeti U Murthy
2012-12-11  5:29     ` Alex Shi
2012-12-11  5:50       ` Preeti U Murthy
2012-12-11 11:55         ` Alex Shi
2012-12-10  8:22 ` [PATCH 03/18] sched: don't need go to smaller sched domain Alex Shi
2012-12-10  8:22 ` [PATCH 04/18] sched: remove domain iterations in fork/exec/wake Alex Shi
2012-12-10  8:22 ` [PATCH 05/18] sched: load tracking bug fix Alex Shi
2012-12-10  8:22 ` [PATCH 06/18] sched: set initial load avg of new forked task as its load weight Alex Shi
2012-12-21  4:33   ` Namhyung Kim
2012-12-23 12:00     ` Alex Shi
2012-12-10  8:22 ` [PATCH 07/18] sched: compute runnable load avg in cpu_load and cpu_avg_load_per_task Alex Shi
2012-12-12  3:57   ` Preeti U Murthy
2012-12-12  5:52     ` Alex Shi
2012-12-13  8:45     ` Alex Shi
2012-12-21  4:35       ` Namhyung Kim
2012-12-23 11:42         ` Alex Shi
2012-12-10  8:22 ` [PATCH 08/18] sched: consider runnable load average in move_tasks Alex Shi
2012-12-12  4:41   ` Preeti U Murthy
2012-12-12  6:26     ` Alex Shi
2012-12-21  4:43       ` Namhyung Kim
2012-12-23 12:29         ` Alex Shi
2012-12-10  8:22 ` [PATCH 09/18] Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking" Alex Shi
2012-12-10  8:22 ` [PATCH 10/18] sched: add sched_policy in kernel Alex Shi
2012-12-10  8:22 ` [PATCH 11/18] sched: add sched_policy and it's sysfs interface Alex Shi
2012-12-10  8:22 ` [PATCH 12/18] sched: log the cpu utilization at rq Alex Shi
2012-12-10  8:22 ` [PATCH 13/18] sched: add power aware scheduling in fork/exec/wake Alex Shi
2012-12-10  8:22 ` [PATCH 14/18] sched: add power/performance balance allowed flag Alex Shi
2012-12-10  8:22 ` [PATCH 15/18] sched: don't care if the local group has capacity Alex Shi
2012-12-10  8:22 ` [PATCH 16/18] sched: pull all tasks from source group Alex Shi
2012-12-10  8:22 ` [PATCH 17/18] sched: power aware load balance, Alex Shi
2012-12-10  8:22 ` [PATCH 18/18] sched: lazy powersaving balance Alex Shi
2012-12-11  0:51 ` [PATCH 0/18] sched: simplified fork, enable load average into LB and power awareness scheduling Alex Shi
2012-12-11 12:10   ` Alex Shi
2012-12-11 15:48     ` Borislav Petkov
2012-12-11 16:03       ` Arjan van de Ven
2012-12-11 16:13         ` Borislav Petkov
2012-12-11 16:40           ` Arjan van de Ven
2012-12-12  9:52             ` Amit Kucheria
2012-12-12 13:55               ` Alex Shi
2012-12-12 14:21                 ` Vincent Guittot
2012-12-13  2:51                   ` Alex Shi
2012-12-12 14:41             ` Borislav Petkov
2012-12-13  3:07               ` Alex Shi
2012-12-13 11:35                 ` Borislav Petkov
2012-12-14  1:56                   ` Alex Shi
2012-12-12  1:14           ` Alex Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50C81588.20901@linux.vnet.ibm.com \
    --to=preeti@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@intel.com \
    --cc=andre.przywara@amd.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=paul.gortmaker@windriver.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=rjw@sisk.pl \
    --cc=rob@landley.net \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).