From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Jason Low <jason.low2@hp.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
mingo@kernel.org, linux-kernel@vger.kernel.org,
daniel.lezcano@linaro.org, alex.shi@linaro.org, efault@gmx.de,
vincent.guittot@linaro.org, morten.rasmussen@arm.com,
aswin@hp.com, chegu_vinod@hp.com
Subject: Re: [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted
Date: Fri, 25 Apr 2014 10:42:30 +0530 [thread overview]
Message-ID: <5359EEBE.2030808@linux.vnet.ibm.com> (raw)
In-Reply-To: <1398377917.3509.32.camel@j-VirtualBox>
Hi Jason,
On 04/25/2014 03:48 AM, Jason Low wrote:
> On Thu, 2014-04-24 at 19:14 +0200, Peter Zijlstra wrote:
>> On Thu, Apr 24, 2014 at 09:53:37AM -0700, Jason Low wrote:
>>>
>>> So I thought that the original rationale (commit 1bd77f2d) behind
>>> updating rq->next_balance in idle_balance() is that, if we are going
>>> idle (!pulled_task), we want to ensure that the next_balance gets
>>> calculated without the busy_factor.
>>>
>>> If the rq is busy, then rq->next_balance gets updated based on
>>> sd->interval * busy_factor. However, when the rq goes from "busy"
>>> to idle, rq->next_balance might still have been calculated under
>>> the assumption that the rq is busy. Thus, if we are going idle, we
>>> would then properly update next_balance without the busy factor
>>> if we update when !pulled_task.
>>>
>>
>> Its late here and I'm confused!
>>
>> So the for_each_domain() loop calculates a new next_balance based on
>> ->balance_interval (which has that busy_factor on, right).
>>
>> But if it fails to pull anything, we'll (potentially) iterate the entire
>> tree up to the largest domain; and supposedly set next_balanced to the
>> largest possible interval.
>>
>> So when we go from busy to idle (!pulled_task), we actually set
>> ->next_balance to the longest interval. Whereas the commit you
>> referenced says it sets it to a shorter while.
>>
>> Not seeing it.
>
> So this is the way I understand that code:
>
> In rebalance_domain, next_balance is suppose to be set to the
> minimum of all sd->last_balance + interval so that we properly call
> into rebalance_domains() if one of the domains is due for a balance.
>
> In the domain traversals:
>
> if (time_after(next_balance, sd->last_balance + interval))
> next_balance = sd->last_balance + interval;
>
> we update next_balance to a new value if the current next_balance
> is after, and we only update next_balance to a smaller value.
>
> In rebalance_domains, we have code:
>
> interval = sd->balance_interval;
> if (idle != CPU_IDLE)
> interval *= sd->busy_factor;
>
> ...
>
> if (time_after(next_balance, sd->last_balance + interval)) {
> next_balance = sd->last_balance + interval;
>
> ...
>
> rq->next_balance = next_balance;
>
> In the CPU_IDLE case, interval would not include the busy factor,
> whereas in the !CPU_IDLE case, we multiply the interval by the
> sd->busy_factor.
>
> So as an example, if a CPU is not idle and we run this:
>
> rebalance_domain()
> interval = 1 ms;
> if (idle != CPU_IDLE)
> interval *= 64;
>
> next_balance = sd->last_balance + 64 ms
>
> rq->next_balance = next_balance
>
> The rq->next_balance is set to a large value since the CPU is not idle.
>
> Then, let's say the CPU then goes idle 1 ms later. The
> rq->next_balance can be up to 63 ms later, because we computed
> it when the CPU is not idle. Now that we are going idle,
> we would have to wait a long time for the next balance.
>
> So I believe that the initial reason why rq->next_balance was
> updated in idle_balance is that if the CPU is in the process
> of going idle (!pulled_task in idle_balance()), we can reset the
> rq->next_balance based on the interval = 1 ms, as oppose to
> having it remain up to 64 ms later (in idle_balance(), interval
> doesn't get multiplied by sd->busy_factor).
I agree with this. However I am concerned with an additional point that
I have mentioned in my reply to Peter's mail on this thread.
Should we verify if rq->next_balance update is independent of
pulled_tasks? sd->balance_interval is changed during load_balance() and
rq->next_balance should perhaps consider that?
Regards
Preeti U Murthy
>
>
>
next prev parent reply other threads:[~2014-04-25 5:17 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-24 1:30 [PATCH 0/3] sched: Idle balance patches Jason Low
2014-04-24 1:30 ` [PATCH 1/3] sched, balancing: Update rq->max_idle_balance_cost whenever newidle balance is attempted Jason Low
2014-04-24 10:14 ` Preeti U Murthy
2014-04-24 12:04 ` Peter Zijlstra
2014-04-24 12:44 ` Peter Zijlstra
2014-04-24 16:53 ` Jason Low
2014-04-24 17:14 ` Peter Zijlstra
2014-04-24 17:29 ` Peter Zijlstra
2014-04-24 22:18 ` Jason Low
2014-04-25 5:12 ` Preeti U Murthy [this message]
2014-04-25 7:13 ` Jason Low
2014-04-25 7:58 ` Mike Galbraith
2014-04-25 17:03 ` Jason Low
2014-04-25 5:08 ` Preeti U Murthy
2014-04-25 9:43 ` Peter Zijlstra
2014-04-25 19:54 ` Jason Low
2014-04-26 14:50 ` Peter Zijlstra
2014-04-28 16:42 ` Jason Low
2014-04-27 8:31 ` Preeti Murthy
2014-04-28 9:24 ` Peter Zijlstra
2014-04-29 3:10 ` Preeti U Murthy
2014-04-28 18:04 ` Jason Low
2014-04-29 3:52 ` Preeti U Murthy
2014-04-24 1:30 ` [PATCH 2/3] sched: Initialize newidle balance stats in sd_numa_init() Jason Low
2014-04-24 12:18 ` Peter Zijlstra
2014-04-25 5:57 ` Preeti U Murthy
2014-05-08 10:42 ` [tip:sched/core] sched/numa: " tip-bot for Jason Low
2014-04-24 1:30 ` [PATCH 3/3] sched, fair: Stop searching for tasks in newidle balance if there are runnable tasks Jason Low
2014-04-24 2:51 ` Mike Galbraith
2014-04-24 8:28 ` Mike Galbraith
2014-04-24 16:37 ` Jason Low
2014-04-24 19:07 ` Mike Galbraith
2014-04-24 7:15 ` Peter Zijlstra
2014-04-24 16:43 ` Jason Low
2014-04-24 16:52 ` Peter Zijlstra
2014-04-25 1:24 ` Jason Low
2014-04-25 2:45 ` Mike Galbraith
2014-04-25 3:33 ` Jason Low
2014-04-25 5:46 ` Mike Galbraith
2014-04-24 16:54 ` Peter Zijlstra
2014-04-24 10:30 ` Morten Rasmussen
2014-04-24 11:32 ` Peter Zijlstra
2014-04-24 14:08 ` Morten Rasmussen
2014-04-24 14:59 ` Peter Zijlstra
2014-05-08 10:44 ` [tip:sched/core] sched/fair: " tip-bot for Jason Low
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5359EEBE.2030808@linux.vnet.ibm.com \
--to=preeti@linux.vnet.ibm.com \
--cc=alex.shi@linaro.org \
--cc=aswin@hp.com \
--cc=chegu_vinod@hp.com \
--cc=daniel.lezcano@linaro.org \
--cc=efault@gmx.de \
--cc=jason.low2@hp.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=morten.rasmussen@arm.com \
--cc=peterz@infradead.org \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.