From: "Li, Aubrey" <aubrey.li@linux.intel.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>, Mike Galbraith <efault@gmx.de>
Cc: Aubrey Li <aubrey.li@intel.com>,
tglx@linutronix.de, peterz@infradead.org, len.brown@intel.com,
ak@linux.intel.com, tim.c.chen@linux.intel.com,
linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v2 2/8] cpuidle: record the overhead of idle entry
Date: Tue, 17 Oct 2017 15:04:11 +0800
Message-ID: <fd85c52f-98f2-e847-b534-30f7907ecf11@linux.intel.com>
In-Reply-To: <10643458.iWc7GTROAz@aspire.rjw.lan>
On 2017/10/17 8:05, Rafael J. Wysocki wrote:
> On Monday, October 16, 2017 5:11:57 AM CEST Li, Aubrey wrote:
>> On 2017/10/14 8:35, Rafael J. Wysocki wrote:
>>> On Saturday, September 30, 2017 9:20:28 AM CEST Aubrey Li wrote:
>>>> Record the overhead of idle entry in micro-second
>>>>
>>>
>>> What is this needed for?
>>
>> We need to figure out how long an idle period counts as a short idle, and
>> recording the overhead is for this purpose. The short idle threshold is based
>> on this overhead.
>
> I don't really understand this statement.
>
> Pretend I'm not familiar with this stuff and try to explain it to me. :-)
>
Okay, let me try. :-)

Today, the idle loop does the following:

do_idle {
    idle_entry {
        - deferrable stuff like quiet_vmstat
        - turn off the tick (without looking at the historical/predicted idle interval)
        - rcu idle enter, c-state selection, etc.
    }
    idle_call {
        - poll or halt or mwait
    }
    idle_exit {
        - rcu idle exit
        - restore the tick if it was stopped before entering idle
    }
}

We have already measured that idle_entry and idle_exit cost several microseconds,
say 10us.

Now if idle_call is 1000us, much larger than idle_entry and idle_exit, we can
ignore the time spent in idle_entry and idle_exit.

But for some workloads with a short idle pattern, like netperf, where idle_call
is 2us, idle_entry and idle_exit start to dominate. If we can reduce the
time spent in idle_entry and idle_exit, workload performance improves
significantly.

Modern high-speed networks and low-latency I/O devices such as NVMe disks have
this requirement.

Mike's patch was made several years ago, though I don't know the details. Here is
an article related to this:
https://cacm.acm.org/magazines/2017/4/215032-attack-of-the-killer-microseconds/fulltext
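
To make the arithmetic above concrete, here is a minimal user-space sketch. It assumes the 10us figure is the combined cost of idle_entry and idle_exit, and the helper name idle_overhead_permille() is made up for illustration; it is not part of the patch.

```c
#include <stdint.h>

/*
 * Share of one pass through the idle loop that is spent in
 * idle_entry + idle_exit, in parts per thousand (rounded down).
 *
 * overhead_us:  assumed combined idle_entry + idle_exit cost
 * idle_call_us: time actually spent in poll/halt/mwait
 *
 * Hypothetical helper for illustration only.
 */
static unsigned int idle_overhead_permille(uint64_t overhead_us,
					   uint64_t idle_call_us)
{
	uint64_t total = overhead_us + idle_call_us;

	if (total == 0)
		return 0;
	return (unsigned int)(overhead_us * 1000 / total);
}
```

With 10us of overhead, a 1000us idle_call gives 9 permille (under 1%), while a 2us idle_call gives 833 permille (about 83%), which is why the entry/exit path matters for the short idle case.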
>>>
>>>> +void cpuidle_entry_end(void)
>>>> +{
>>>> + struct cpuidle_device *dev = cpuidle_get_device();
>>>> + u64 overhead;
>>>> + s64 diff;
>>>> +
>>>> + if (dev) {
>>>> + dev->idle_stat.entry_end = local_clock();
>>>> + overhead = div_u64(dev->idle_stat.entry_end -
>>>> + dev->idle_stat.entry_start, NSEC_PER_USEC);
>>>
>>> Is the conversion really necessary?
>>>
>>> If so, then why?
>>
>> We can choose either nanoseconds or microseconds. Given that the workload
>> results in a short idle pattern, I think microseconds are good enough for
>> real workloads.
>>
>> Another reason is that the prediction from the idle governor is in
>> microseconds, so I convert it for comparison purposes.
>>>
>>> And if there is a good reason, what about using right shift to do
>>> an approximate conversion to avoid the extra division here?
>>
>> Sure, >> 10 works for me, as I don't think precision is a big deal here.
>>
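
For reference, a small user-space sketch of the two conversions being discussed. The function names are hypothetical, and NSEC_PER_USEC is redefined locally so the snippet stands alone. The shift divides by 1024 instead of 1000, so it reads roughly 2.3% low.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL	/* local definition for this sketch */

/* Exact conversion: costs a 64-bit division. */
static uint64_t ns_to_us_exact(uint64_t ns)
{
	return ns / NSEC_PER_USEC;
}

/*
 * Approximate conversion: ns >> 10 is ns / 1024, so the result is
 * about 2.3% smaller than the exact value, but needs no division.
 */
static uint64_t ns_to_us_approx(uint64_t ns)
{
	return ns >> 10;
}
```

For a 10000ns entry overhead the exact result is 10us and the shifted result is 9us, which should be acceptable if the value only feeds a short idle threshold.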
>>>
>>>> + diff = overhead - dev->idle_stat.overhead;
>>>> + dev->idle_stat.overhead += diff >> 3;
>>>
>>> Can you please explain what happens in the two lines above?
>>
>> Online average computing algorithm, stolen from update_avg() @ kernel/sched/core.c.
>
> OK
>
> Maybe care to add a comment to that effect?
Sure, I'll add it in the next version.
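
Something along these lines, as a user-space sketch of the averaging step with the comment spelled out. It uses int64_t throughout rather than the kernel's u64/s64 types, and it relies on right shift of a negative value being an arithmetic shift, which is what gcc and clang do.

```c
#include <stdint.h>

/*
 * Online (running) average, same idea as update_avg() in
 * kernel/sched/core.c: move the average 1/8 of the way towards
 * each new sample, i.e. avg = avg + (sample - avg) / 8.
 *
 * Assumes ">>" on a negative int64_t is an arithmetic shift
 * (implementation-defined in C, true for gcc and clang).
 */
static void update_avg(int64_t *avg, int64_t sample)
{
	int64_t diff = sample - *avg;

	*avg += diff >> 3;
}
```

Starting from an average of 0, two samples of 80us move the average to 10 and then 18, so a single outlier only shifts the recorded overhead by one eighth of its distance from the current average.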
Thanks,
-Aubrey