From: Ingo Molnar <mingo@kernel.org>
To: Len Brown <lenb@kernel.org>
Cc: x86@kernel.org, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org, Len Brown <len.brown@intel.com>,
stable@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
"H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Mike Galbraith <efault@gmx.de>, Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH] x86 idle: repair large-server 50-watt idle-power regression
Date: Thu, 19 Dec 2013 13:22:57 +0100 [thread overview]
Message-ID: <20131219122257.GC11279@gmail.com> (raw)
In-Reply-To: <baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com>
* Len Brown <lenb@kernel.org> wrote:
> From: Len Brown <len.brown@intel.com>
>
> Linux 3.10 changed the timing of how thread_info->flags is touched:
>
> x86: Use generic idle loop
> (7d1a941731fabf27e5fb6edbebb79fe856edb4e5)
>
> This caused Intel NHM-EX and WSM-EX servers to experience a large number
> of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote
> from deep C-states to shallow C-states, which caused these platforms
> to experience a significant increase in idle power.
>
> Note that this issue was already present before the commit above,
> however, it wasn't seen often enough to be noticed in power measurements.
>
> Here we extend an errata workaround from the Core2 EX "Dunnington"
> to extend to NHM-EX and WSM-EX, to prevent these immediate
> returns from MWAIT, reducing idle power on these platforms.
>
> While only acpi_idle ran on Dunnington, intel_idle
> may also run on these two newer systems.
> As of today, there are no other models that are known
> to need this tweak.
>
> ref: https://lkml.org/lkml/2013/12/7/22
> Signed-off-by: Len Brown <len.brown@intel.com>
> Cc: <stable@vger.kernel.org> # 3.12.x, 3.11.x, 3.10.x
> ---
> arch/x86/kernel/cpu/intel.c | 3 ++-
> drivers/idle/intel_idle.c | 3 +++
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index dc1ec0d..ea04b34 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -387,7 +387,8 @@ static void init_intel(struct cpuinfo_x86 *c)
> set_cpu_cap(c, X86_FEATURE_PEBS);
> }
>
> - if (c->x86 == 6 && c->x86_model == 29 && cpu_has_clflush)
> + if (c->x86 == 6 && cpu_has_clflush &&
> + (c->x86_model == 29 || c->x86_model == 46 || c->x86_model == 47))
> set_cpu_cap(c, X86_FEATURE_CLFLUSH_MONITOR);
>
> #ifdef CONFIG_X86_64
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 92d1206..f80b700 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -377,6 +377,9 @@ static int intel_idle(struct cpuidle_device *dev,
>
> if (!current_set_polling_and_test()) {
>
> + if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR))
> + clflush((void *)¤t_thread_info()->flags);
> +
> __monitor((void *)¤t_thread_info()->flags, 0, 0);
I don't think either of these casts to '(void *)' is needed, both the
clflush() and __monitor() will take pointers.
Looks good to me otherwise - except that maybe the best way to
represent this quirk would be for the CLFLUSH+MONITOR sequence to be a
single 'instruction' which is patched in dynamically during bootup,
using our usual alternatives framework.
On non-affected CPUs a NOP would remain in place of the CLFLUSH,
eliminating the branch above.
So the whole thing could be thought of as a slightly more complex
'monitor' instruction - not exposing the quirk details to actual usage
sites.
Thanks,
Ingo
next prev parent reply other threads:[~2013-12-19 12:23 UTC|newest]
Thread overview: 94+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-12-07 8:00 50 Watt idle power regression bisected to Linux-3.10 Len Brown
2013-12-07 8:39 ` Mike Galbraith
2013-12-07 16:01 ` Len Brown
2013-12-07 16:45 ` Len Brown
2013-12-07 19:17 ` Mike Galbraith
2013-12-10 11:41 ` Ingo Molnar
2013-12-07 12:54 ` Thomas Gleixner
2013-12-08 4:57 ` Mike Galbraith
2013-12-08 20:40 ` Len Brown
2013-12-09 3:16 ` Mike Galbraith
2013-12-10 5:17 ` Mike Galbraith
2013-12-10 11:45 ` Ingo Molnar
2013-12-10 14:29 ` Thomas Gleixner
2013-12-10 15:06 ` Ingo Molnar
2013-12-11 2:05 ` Thomas Gleixner
2013-12-11 3:21 ` Mike Galbraith
2013-12-11 11:28 ` Thomas Gleixner
2013-12-11 11:38 ` Borislav Petkov
2013-12-11 11:52 ` Peter Zijlstra
2013-12-11 12:29 ` Mike Galbraith
2013-12-11 12:43 ` Peter Zijlstra
2013-12-11 13:10 ` Mike Galbraith
2013-12-11 13:40 ` Borislav Petkov
2013-12-11 14:56 ` Ingo Molnar
2013-12-11 16:02 ` Borislav Petkov
2013-12-11 16:43 ` Peter Zijlstra
2013-12-11 17:50 ` Ingo Molnar
2013-12-11 23:08 ` H. Peter Anvin
2013-12-11 23:14 ` Borislav Petkov
2013-12-12 0:52 ` H. Peter Anvin
2013-12-12 4:25 ` Mike Galbraith
2013-12-12 4:49 ` H. Peter Anvin
2013-12-12 4:59 ` Mike Galbraith
2013-12-12 5:37 ` Mike Galbraith
2013-12-12 5:45 ` H. Peter Anvin
2013-12-12 5:57 ` Mike Galbraith
2013-12-12 6:05 ` Mike Galbraith
2013-12-12 7:57 ` H. Peter Anvin
2013-12-12 8:51 ` Peter Zijlstra
2013-12-12 13:28 ` Ingo Molnar
2013-12-12 15:06 ` H. Peter Anvin
2013-12-12 15:51 ` Peter Zijlstra
2013-12-11 14:42 ` Ingo Molnar
2013-12-11 15:02 ` Thomas Gleixner
2013-12-11 15:09 ` Ingo Molnar
2013-12-11 16:44 ` Peter Zijlstra
2013-12-11 17:48 ` Ingo Molnar
2013-12-11 16:44 ` Peter Zijlstra
2013-12-11 17:47 ` Ingo Molnar
2013-12-11 21:43 ` Len Brown
2013-12-11 22:22 ` Thomas Gleixner
2013-12-18 21:44 ` [PATCH] x86 idle: repair large-server 50-watt idle-power regression Len Brown
2013-12-19 12:22 ` Ingo Molnar [this message]
2013-12-19 14:40 ` H. Peter Anvin
2013-12-19 15:45 ` Borislav Petkov
2013-12-19 15:55 ` H. Peter Anvin
2013-12-19 16:02 ` Ingo Molnar
2013-12-19 16:09 ` H. Peter Anvin
2013-12-19 16:13 ` H. Peter Anvin
2013-12-19 16:21 ` Peter Zijlstra
2013-12-19 16:50 ` H. Peter Anvin
2013-12-19 17:07 ` Ingo Molnar
2013-12-19 17:25 ` Peter Zijlstra
2013-12-19 17:36 ` Peter Zijlstra
2013-12-19 18:05 ` H. Peter Anvin
2013-12-19 18:14 ` Ingo Molnar
2013-12-19 17:50 ` Peter Zijlstra
2013-12-19 18:18 ` Ingo Molnar
2013-12-19 21:05 ` H. Peter Anvin
2013-12-19 21:17 ` Ingo Molnar
2013-12-19 18:10 ` Ingo Molnar
2013-12-19 18:09 ` H. Peter Anvin
2013-12-19 18:19 ` H. Peter Anvin
2013-12-19 18:23 ` Ingo Molnar
[not found] ` <CA+55aFzGxcML7j8CEvQPYzh0W81uVoAAVmGctMOUZ7CZ1yYd2A@mail.gmail.com>
2013-12-19 18:43 ` Ingo Molnar
2013-12-19 20:09 ` [tip:x86/idle] x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers tip-bot for H. Peter Anvin
2013-12-19 20:40 ` Ingo Molnar
2013-12-19 20:46 ` Linus Torvalds
2013-12-19 21:14 ` Ingo Molnar
2013-12-19 21:25 ` Linus Torvalds
2013-12-19 21:55 ` Peter Zijlstra
2013-12-20 8:47 ` Ingo Molnar
2013-12-19 20:33 ` [tip:x86/idle] x86, idle: Add memory barriers around clflush in mwait_play_dead() tip-bot for H. Peter Anvin
2013-12-19 18:19 ` [PATCH] x86 idle: repair large-server 50-watt idle-power regression Ingo Molnar
2013-12-19 19:22 ` H. Peter Anvin
2013-12-19 19:27 ` Peter Zijlstra
2013-12-19 19:51 ` [tip:x86/urgent] x86 idle: Repair " tip-bot for Len Brown
2014-03-18 0:20 ` Davidlohr Bueso
2014-03-18 9:16 ` Peter Zijlstra
2014-03-19 2:14 ` Jason Low
2014-03-19 6:42 ` Peter Zijlstra
2014-04-08 21:43 ` Brown, Len
2014-04-09 8:18 ` Peter Zijlstra
2014-04-15 3:27 ` Davidlohr Bueso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131219122257.GC11279@gmail.com \
--to=mingo@kernel.org \
--cc=a.p.zijlstra@chello.nl \
--cc=bp@alien8.de \
--cc=efault@gmx.de \
--cc=hpa@zytor.com \
--cc=len.brown@intel.com \
--cc=lenb@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pm@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox