From: Ingo Molnar <mingo@kernel.org>
To: Len Brown <lenb@kernel.org>
Cc: x86@kernel.org, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org, Len Brown <len.brown@intel.com>,
stable@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>,
"H. Peter Anvin" <hpa@zytor.com>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Mike Galbraith <efault@gmx.de>, Borislav Petkov <bp@alien8.de>
Subject: Re: [PATCH] x86 idle: repair large-server 50-watt idle-power regression
Date: Thu, 19 Dec 2013 13:22:57 +0100
Message-ID: <20131219122257.GC11279@gmail.com>
In-Reply-To: <baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com>
* Len Brown <lenb@kernel.org> wrote:
> From: Len Brown <len.brown@intel.com>
>
> Linux 3.10 changed the timing of how thread_info->flags is touched:
>
> x86: Use generic idle loop
> (7d1a941731fabf27e5fb6edbebb79fe856edb4e5)
>
> This caused Intel NHM-EX and WSM-EX servers to experience a large number
> of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote
> from deep C-states to shallow C-states, which caused these platforms
> to experience a significant increase in idle power.
>
> Note that this issue was already present before the commit above,
> however, it wasn't seen often enough to be noticed in power measurements.
>
> Here we extend an erratum workaround from the Core2 EX "Dunnington"
> to NHM-EX and WSM-EX, to prevent these immediate
> returns from MWAIT, reducing idle power on these platforms.
>
> While only acpi_idle ran on Dunnington, intel_idle
> may also run on these two newer systems.
> As of today, there are no other models that are known
> to need this tweak.
>
> ref: https://lkml.org/lkml/2013/12/7/22
> Signed-off-by: Len Brown <len.brown@intel.com>
> Cc: <stable@vger.kernel.org> # 3.12.x, 3.11.x, 3.10.x
> ---
> arch/x86/kernel/cpu/intel.c | 3 ++-
> drivers/idle/intel_idle.c | 3 +++
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index dc1ec0d..ea04b34 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -387,7 +387,8 @@ static void init_intel(struct cpuinfo_x86 *c)
> set_cpu_cap(c, X86_FEATURE_PEBS);
> }
>
> - if (c->x86 == 6 && c->x86_model == 29 && cpu_has_clflush)
> + if (c->x86 == 6 && cpu_has_clflush &&
> + (c->x86_model == 29 || c->x86_model == 46 || c->x86_model == 47))
> set_cpu_cap(c, X86_FEATURE_CLFLUSH_MONITOR);
>
> #ifdef CONFIG_X86_64
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 92d1206..f80b700 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -377,6 +377,9 @@ static int intel_idle(struct cpuidle_device *dev,
>
> if (!current_set_polling_and_test()) {
>
> + if (this_cpu_has(X86_FEATURE_CLFLUSH_MONITOR))
> + clflush((void *)&current_thread_info()->flags);
> +
> __monitor((void *)&current_thread_info()->flags, 0, 0);
I don't think either of these casts to '(void *)' is needed; both
clflush() and __monitor() take void pointers, so the conversion is implicit.
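To illustrate the point about the casts, here is a minimal user-space sketch. Everything in it is a stand-in: thread_info_model, clflush_stub(), monitor_stub() and arm_monitor() are hypothetical names, and the stub parameter types are only assumed to mirror the kernel helpers. The stubs record their argument instead of executing any instruction. It shows that C converts an object pointer to a (qualified) void pointer without an explicit cast.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the monitored field; not the kernel's struct. */
struct thread_info_model {
	unsigned long flags;
};

/* Stubs that record their argument instead of executing CLFLUSH/MONITOR. */
static volatile void *last_flushed;
static const void *last_monitored;

static void clflush_stub(volatile void *p)
{
	last_flushed = p;
}

static void monitor_stub(const void *addr, unsigned long ecx, unsigned long edx)
{
	(void)ecx;
	(void)edx;
	last_monitored = addr;
}

/* No '(void *)' casts: both conversions below are implicit in C. */
static void arm_monitor(struct thread_info_model *ti)
{
	assert(ti != NULL);
	clflush_stub(&ti->flags);       /* unsigned long * -> volatile void * */
	monitor_stub(&ti->flags, 0, 0); /* unsigned long * -> const void * */
}
```

Calling arm_monitor() on a local thread_info_model leaves both recorded pointers equal to the address of its flags field.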
Looks good to me otherwise - except that maybe the best way to
represent this quirk would be for the CLFLUSH+MONITOR sequence to be a
single 'instruction' which is patched in dynamically during bootup,
using our usual alternatives framework.
On non-affected CPUs a NOP would remain in place of the CLFLUSH,
eliminating the branch above.
So the whole thing could be thought of as a slightly more complex
'monitor' instruction - not exposing the quirk details to actual usage
sites.
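The idea of hiding the quirk behind a "slightly more complex monitor" can be modelled in user space roughly as follows. This is a sketch under loud assumptions, not the kernel implementation: the real alternatives framework patches instruction bytes at boot, whereas this model picks a function pointer once, which costs an indirect call. All names here (monitor_flags, select_monitor_impl(), needs_clflush_monitor(), the fake_* stubs and the call log) are hypothetical. The family/model test copies the condition from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Log of simulated instructions: 'C' = CLFLUSH, 'M' = MONITOR. */
static char call_log[16];
static size_t call_log_len;

static void log_op(char op)
{
	if (call_log_len < sizeof(call_log) - 1)
		call_log[call_log_len++] = op;
}

static void reset_log(void)
{
	memset(call_log, 0, sizeof(call_log));
	call_log_len = 0;
}

/* Stand-ins for the real instructions; they only log. */
static void fake_clflush(volatile void *addr) { (void)addr; log_op('C'); }
static void fake_monitor(const void *addr)    { (void)addr; log_op('M'); }

/* Unaffected CPUs: MONITOR only (the "NOP in place of CLFLUSH" case). */
static void monitor_plain(void *addr)
{
	fake_monitor(addr);
}

/* Affected CPUs: CLFLUSH followed by MONITOR. */
static void monitor_quirk(void *addr)
{
	fake_clflush(addr);
	fake_monitor(addr);
}

/* What usage sites call; they never see the quirk. */
static void (*monitor_flags)(void *addr) = monitor_plain;

/* Same condition as the quoted patch: family 6, models 29, 46, 47. */
static bool needs_clflush_monitor(unsigned family, unsigned model,
				  bool has_clflush)
{
	return family == 6 && has_clflush &&
	       (model == 29 || model == 46 || model == 47);
}

/* Run once at "boot"; models the one-time patching step. */
static void select_monitor_impl(unsigned family, unsigned model,
				bool has_clflush)
{
	monitor_flags = needs_clflush_monitor(family, model, has_clflush)
			? monitor_quirk : monitor_plain;
}
```

A caller would run select_monitor_impl() once and then use monitor_flags(&flags) everywhere; on an affected model the log reads "CM", otherwise just "M".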
Thanks,
Ingo