stable.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	alan@lxorguk.ukuu.org.uk, Fenghua Yu <fenghua.yu@intel.com>,
	Borislav Petkov <bp@amd64.org>, Tony Luck <tony.luck@intel.com>,
	"H. Peter Anvin" <hpa@linux.intel.com>,
	maximilian attems <max@stro.at>
Subject: [ 35/37] x86, mce, therm_throt: Dont report power limit and package level thermal throttle events in mcelog
Date: Fri, 30 Nov 2012 10:46:22 -0800
Message-ID: <20121130183901.006382287@linuxfoundation.org>
In-Reply-To: <20121130183857.166228045@linuxfoundation.org>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Fenghua Yu <fenghua.yu@intel.com>

commit 29e9bf1841e4f9df13b4992a716fece7087dd237 upstream.

Thermal throttle and power limit events are not defined as MCE errors in the
x86 architecture and should not generate MCE errors in mcelog.

The current kernel generates fake, software-defined MCE errors for these
events. This may confuse users, who may think the machine has real MCE errors
when only thermal throttle or power limit events have occurred.

To make it worse, buggy firmware on some platforms may falsely generate
the events. The kernel then reports MCE errors that users take for real
hardware errors. Although the firmware bugs should be fixed, the kernel
should not report MCE errors for these events either.

So mcelog is not a good mechanism to report these events. To report the
events, we count them in respective counters (core_power_limit_count,
package_power_limit_count, core_throttle_count, and package_throttle_count) in
/sys/devices/system/cpu/cpu#/thermal_throttle/. Users can check the counters
for each event on each CPU. Please note that all CPUs on one package report
duplicate counters. It is the user application's responsibility to retrieve a
package level counter once for each package.
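
As a rough illustration of how a user application might read these counters,
here is a minimal userspace sketch. It is not part of this patch. The helper
names parse_counter() and read_throttle_counter() are hypothetical, and the
"base" parameter (normally /sys/devices/system/cpu) exists only so the helper
can be pointed at another directory; the counter file names are the ones
listed above.

```c
#include <assert.h>
#include <stdio.h>

/* Parse a single unsigned decimal counter from an open stream.
 * Returns 0 on success, -1 if no number could be read. */
int parse_counter(FILE *f, unsigned long long *val)
{
	assert(f && val);
	return fscanf(f, "%llu", val) == 1 ? 0 : -1;
}

/*
 * Read one thermal_throttle counter for one CPU.
 * "base" is assumed to be the sysfs CPU directory, e.g.
 * "/sys/devices/system/cpu"; "name" is one of the counter file names
 * from the commit message, e.g. "package_throttle_count".
 * Returns 0 on success, -1 if the file is missing or unparsable
 * (for example when the CPU does not expose that counter).
 */
int read_throttle_counter(const char *base, unsigned int cpu,
			  const char *name, unsigned long long *val)
{
	char path[256];
	FILE *f;
	int n, ret;

	assert(base && name && val);
	n = snprintf(path, sizeof(path), "%s/cpu%u/thermal_throttle/%s",
		     base, cpu, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	ret = parse_counter(f, val);
	fclose(f);
	return ret;
}
```

A caller would use something like
read_throttle_counter("/sys/devices/system/cpu", 0, "package_throttle_count", &v).
For the package level counters, reading from only one CPU per package (for
instance by grouping CPUs on topology/physical_package_id, if that file is
available) is one way to avoid counting the duplicates mentioned above.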

This patch stops reporting package level power limit, core level power limit,
and package level thermal throttle events in mcelog. When these events happen,
they are only reported in the respective counters in sysfs.

Since core level thermal throttle reporting has been legacy code in the kernel
for a while and users have accepted it as an MCE error in mcelog, core level
thermal throttle is still reported in mcelog. In the meantime, the event is
counted in a counter in sysfs as well.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Borislav Petkov <bp@amd64.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/20111215001945.GA21009@linux-os.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: maximilian attems <max@stro.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


---
 arch/x86/kernel/cpu/mcheck/therm_throt.c |   29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -322,17 +322,6 @@ device_initcall(thermal_throttle_init_de
 
 #endif /* CONFIG_SYSFS */
 
-/*
- * Set up the most two significant bit to notify mce log that this thermal
- * event type.
- * This is a temp solution. May be changed in the future with mce log
- * infrasture.
- */
-#define CORE_THROTTLED		(0)
-#define CORE_POWER_LIMIT	((__u64)1 << 62)
-#define PACKAGE_THROTTLED	((__u64)2 << 62)
-#define PACKAGE_POWER_LIMIT	((__u64)3 << 62)
-
 static void notify_thresholds(__u64 msr_val)
 {
 	/* check whether the interrupt handler is defined;
@@ -362,27 +351,23 @@ static void intel_thermal_interrupt(void
 	if (therm_throt_process(msr_val & THERM_STATUS_PROCHOT,
 				THERMAL_THROTTLING_EVENT,
 				CORE_LEVEL) != 0)
-		mce_log_therm_throt_event(CORE_THROTTLED | msr_val);
+		mce_log_therm_throt_event(msr_val);
 
 	if (this_cpu_has(X86_FEATURE_PLN))
-		if (therm_throt_process(msr_val & THERM_STATUS_POWER_LIMIT,
+		therm_throt_process(msr_val & THERM_STATUS_POWER_LIMIT,
 					POWER_LIMIT_EVENT,
-					CORE_LEVEL) != 0)
-			mce_log_therm_throt_event(CORE_POWER_LIMIT | msr_val);
+					CORE_LEVEL);
 
 	if (this_cpu_has(X86_FEATURE_PTS)) {
 		rdmsrl(MSR_IA32_PACKAGE_THERM_STATUS, msr_val);
-		if (therm_throt_process(msr_val & PACKAGE_THERM_STATUS_PROCHOT,
+		therm_throt_process(msr_val & PACKAGE_THERM_STATUS_PROCHOT,
 					THERMAL_THROTTLING_EVENT,
-					PACKAGE_LEVEL) != 0)
-			mce_log_therm_throt_event(PACKAGE_THROTTLED | msr_val);
+					PACKAGE_LEVEL);
 		if (this_cpu_has(X86_FEATURE_PLN))
-			if (therm_throt_process(msr_val &
+			therm_throt_process(msr_val &
 					PACKAGE_THERM_STATUS_POWER_LIMIT,
 					POWER_LIMIT_EVENT,
-					PACKAGE_LEVEL) != 0)
-				mce_log_therm_throt_event(PACKAGE_POWER_LIMIT
-							  | msr_val);
+					PACKAGE_LEVEL);
 	}
 }
 



