From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Jason Low <jason.low2@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
mingo@kernel.org, linux-kernel@vger.kernel.org,
daniel.lezcano@linaro.org, alex.shi@linaro.org, efault@gmx.de,
vincent.guittot@linaro.org, morten.rasmussen@arm.com,
aswin@hp.com, chegu_vinod@hp.com
Subject: Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted
Date: Fri, 25 Apr 2014 10:42:30 +0530
Message-ID: <5359EEBE.2030808@linux.vnet.ibm.com>
In-Reply-To: <1398377917.3509.32.camel@j-VirtualBox>
Hi Jason,
On 04/25/2014 03:48 AM, Jason Low wrote:
> On Thu, 2014-04-24 at 19:14 +0200, Peter Zijlstra wrote:
>> On Thu, Apr 24, 2014 at 09:53:37AM -0700, Jason Low wrote:
>>>
>>> So I thought that the original rationale (commit 1bd77f2d) behind
>>> updating rq->next_balance in idle_balance() is that, if we are going
>>> idle (!pulled_task), we want to ensure that the next_balance gets
>>> calculated without the busy_factor.
>>>
>>> If the rq is busy, then rq->next_balance gets updated based on
>>> sd->interval * busy_factor. However, when the rq goes from "busy"
>>> to idle, rq->next_balance might still have been calculated under
>>> the assumption that the rq is busy. Thus, if we are going idle, we
>>> would then properly update next_balance without the busy factor
>>> if we update when !pulled_task.
>>>
>>
>> It's late here and I'm confused!
>>
>> So the for_each_domain() loop calculates a new next_balance based on
>> ->balance_interval (which has that busy_factor on, right).
>>
>> But if it fails to pull anything, we'll (potentially) iterate the entire
>> tree up to the largest domain; and supposedly set next_balance to the
>> largest possible interval.
>>
>> So when we go from busy to idle (!pulled_task), we actually set
>> ->next_balance to the longest interval. Whereas the commit you
>> referenced says it sets it to a shorter while.
>>
>> Not seeing it.
>
> So this is the way I understand that code:
>
> In rebalance_domains(), next_balance is supposed to be set to the
> minimum of all sd->last_balance + interval so that we properly call
> into rebalance_domains() if one of the domains is due for a balance.
>
> In the domain traversals:
>
> 	if (time_after(next_balance, sd->last_balance + interval))
> 		next_balance = sd->last_balance + interval;
>
> we update next_balance only if the current next_balance is after the
> new value, so next_balance only ever moves to an earlier time.
>
> In rebalance_domains, we have code:
>
> 	interval = sd->balance_interval;
> 	if (idle != CPU_IDLE)
> 		interval *= sd->busy_factor;
>
> ...
>
> 	if (time_after(next_balance, sd->last_balance + interval)) {
> 		next_balance = sd->last_balance + interval;
> 	}
>
> ...
>
> 	rq->next_balance = next_balance;
>
> In the CPU_IDLE case, interval would not include the busy factor,
> whereas in the !CPU_IDLE case, we multiply the interval by the
> sd->busy_factor.
>
> So as an example, if a CPU is not idle and we run this:
>
> rebalance_domains()
> 	interval = 1 ms;
> 	if (idle != CPU_IDLE)
> 		interval *= 64;
>
> 	next_balance = sd->last_balance + 64 ms
>
> 	rq->next_balance = next_balance
>
> The rq->next_balance is set to a large value since the CPU is not idle.
>
> Then, let's say the CPU goes idle 1 ms later. The
> rq->next_balance can be up to 63 ms later, because we computed
> it when the CPU was not idle. Now that we are going idle,
> we would have to wait a long time for the next balance.
>
> So I believe that the initial reason why rq->next_balance was
> updated in idle_balance is that if the CPU is in the process
> of going idle (!pulled_task in idle_balance()), we can reset the
> rq->next_balance based on the interval = 1 ms, as opposed to
> having it remain up to 64 ms later (in idle_balance(), interval
> doesn't get multiplied by sd->busy_factor).
I agree with this. However, I am concerned about an additional point,
which I mentioned in my reply to Peter's mail on this thread.
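To make the busy-to-idle case above concrete, here is a rough userspace
sketch of the arithmetic in Jason's example. It is not the kernel code:
struct toy_domain and toy_next_balance() are hypothetical names, time is
in plain milliseconds rather than jiffies, and the 60 s starting horizon
is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Wrap-safe "a is after b", written in the style of the kernel's time_after(). */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* Hypothetical, trimmed-down stand-in for struct sched_domain. */
struct toy_domain {
	unsigned long last_balance;     /* time of the last balance, in ms */
	unsigned long balance_interval; /* base interval, in ms */
	unsigned int busy_factor;       /* multiplier applied when not idle */
};

/*
 * Return the earliest time at which any domain is due for a balance.
 * When the CPU is busy, each interval is stretched by busy_factor.
 */
static unsigned long toy_next_balance(const struct toy_domain *sd, size_t n,
				      unsigned long now, int cpu_is_idle)
{
	/* Start far in the future (assumed horizon of 60 s). */
	unsigned long next_balance = now + 60 * 1000UL;
	size_t i;

	assert(sd != NULL && n > 0);

	for (i = 0; i < n; i++) {
		unsigned long interval = sd[i].balance_interval;

		if (!cpu_is_idle)
			interval *= sd[i].busy_factor;

		/* Only ever move next_balance to an earlier time. */
		if (time_after(next_balance, sd[i].last_balance + interval))
			next_balance = sd[i].last_balance + interval;
	}
	return next_balance;
}
```

With one domain whose last_balance is 1000, balance_interval is 1 and
busy_factor is 64, the busy computation gives 1064. Recomputing as idle
at time 1001 gives 1001, which is the 63 ms gap from the example.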
Should we check whether the rq->next_balance update ought to be
independent of pulled_task? sd->balance_interval is changed during
load_balance(), and rq->next_balance should perhaps take that into account.
Regards
Preeti U Murthy