* Re: [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch
2009-05-19 21:56 [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch Andi Kleen
@ 2009-05-20 9:41 ` Hidetoshi Seto
2009-05-20 11:34 ` Andi Kleen
2009-05-20 9:41 ` [PATCH -tip 1/2] x86, mce: Revert "add mce_threshold option for intel cmci" Hidetoshi Seto
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Hidetoshi Seto @ 2009-05-20 9:41 UTC (permalink / raw)
To: Andi Kleen; +Cc: mingo, tglx, hpa, linux-kernel
Andi Kleen wrote:
> I also reverted one patch where the patch author agreed it should
> be reverted.
> Revert "x86, mce: Add mce=nopoll option to disable timer polling"
That is:
> commit 41e654ade96b57c2dca4cea3e21f75819ab9696c
> Author: Andi Kleen <ak@linux.intel.com>
> Date: Wed Apr 29 10:17:01 2009 +0200
>
> Revert "x86, mce: Add mce=nopoll option to disable timer polling"
>
> This reverts commit f5b3ca6efb64aa3ab5fa32f0fc9cc0f6cfedee95.
>
> Seto-san has an updated patch following recent discussion that
> should be used instead.
OK. Please use following patches with my Signed-off-by instead.
And an another revert patch should be applied to tip tree too.
Thanks,
H.Seto
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch
2009-05-20 9:41 ` Hidetoshi Seto
@ 2009-05-20 11:34 ` Andi Kleen
0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2009-05-20 11:34 UTC (permalink / raw)
To: Hidetoshi Seto; +Cc: Andi Kleen, mingo, tglx, hpa, linux-kernel
> OK. Please use following patches with my Signed-off-by instead.
> And an another revert patch should be applied to tip tree too.
Thanks.
Ideally these patches should be just removed before upstreaming. But I didn't
want to touch the original branch for now so it's clear what I changed.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH -tip 1/2] x86, mce: Revert "add mce_threshold option for intel cmci"
2009-05-19 21:56 [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch Andi Kleen
2009-05-20 9:41 ` Hidetoshi Seto
@ 2009-05-20 9:41 ` Hidetoshi Seto
2009-05-20 9:42 ` [PATCH -tip 2/2] x86, mce: Revert "add mce=nopoll option to disable timer polling" Hidetoshi Seto
2009-05-22 8:21 ` [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch Hidetoshi Seto
3 siblings, 0 replies; 7+ messages in thread
From: Hidetoshi Seto @ 2009-05-20 9:41 UTC (permalink / raw)
To: Andi Kleen; +Cc: mingo, tglx, hpa, linux-kernel
This reverts commit 0b66224ac2fd303cd2858bf313058c555a555642.
After having some discussion with Andi Kleen, we have concluded
that setting threshold >1 will not work properly.
This patch reverts the previous patch that introduces mce_threshold
option.
However cmci is a new feature so having boot controls to disable it
is generally a good idea, and it might be handy if the hw is
misbehaving.
So instead of mce_threshold, I will introduce "mce=no_cmci" option
to support cmci disablement in later patch.
Compare with mce_threshold, it lacks threshold >1 support but it
does not matter because it will not work.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
---
Documentation/kernel-parameters.txt | 5 ---
arch/x86/include/asm/msr-index.h | 1 -
arch/x86/kernel/cpu/mcheck/mce_intel_64.c | 56 ++---------------------------
3 files changed, 3 insertions(+), 59 deletions(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 5e966a9..6172e43 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1208,11 +1208,6 @@ and is between 256 and 4096 characters. It is defined in the file
mce=option [X86-64] See Documentation/x86/x86_64/boot-options.txt
- mce_threshold= [X86-64,intel] Default CMCI threshold
- Should be unsigned integer. Setting 0 disables cmci.
- Format: <integer>
- Default: 1
-
md= [HW] RAID subsystems devices and level
See Documentation/md.txt.
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 1ea4055..bc2f045 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -85,7 +85,6 @@
#define MSR_IA32_MC0_CTL2 0x00000280
#define CMCI_EN (1ULL << 30)
#define CMCI_THRESHOLD_MASK 0xffffULL
-#define CMCI_THRESHOLD_VAL_MASK 0x7fffULL
#define MSR_P6_PERFCTR0 0x000000c1
#define MSR_P6_PERFCTR1 0x000000c2
diff --git a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
index 2970e14..e46c608 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_intel_64.c
@@ -51,6 +51,8 @@ static DEFINE_PER_CPU(mce_banks_t, mce_banks_owned);
*/
static DEFINE_SPINLOCK(cmci_discover_lock);
+#define CMCI_THRESHOLD 1
+
static int cmci_supported(int *banks)
{
u64 cap;
@@ -81,51 +83,6 @@ static void intel_threshold_interrupt(void)
mce_notify_user();
}
-/*
- * Default threshold, setting 0 disables cmci
- */
-static unsigned long threshold_limit = 1;
-
-static int __init mcheck_threshold(char *str)
-{
- int val;
-
- get_option(&str, &val);
- if (val < 0) {
- printk(KERN_INFO "mce_threshold argument ignored.\n");
- return 0;
- }
- threshold_limit = val;
-
- return 1;
-}
-__setup("mce_threshold=", mcheck_threshold);
-
-void static cmci_set_threshold(int bank)
-{
- u64 val, limit, max, new;
-
- rdmsrl(MSR_IA32_MC0_CTL2 + bank, val);
- limit = val & CMCI_THRESHOLD_VAL_MASK;
-
- /* Thresholding available? */
- if (!limit)
- return;
- /* Return if no need to change */
- if (limit == threshold_limit)
- return;
-
- /* Find the maximum threshold value */
- max = (val & ~CMCI_THRESHOLD_MASK) | CMCI_THRESHOLD_VAL_MASK;
- wrmsrl(MSR_IA32_MC0_CTL2 + bank, max);
- rdmsrl(MSR_IA32_MC0_CTL2 + bank, max);
- max &= CMCI_THRESHOLD_VAL_MASK;
- max = (threshold_limit > max ? max : threshold_limit);
-
- new = (val & ~CMCI_THRESHOLD_MASK) | max;
- wrmsrl(MSR_IA32_MC0_CTL2 + bank, new);
-}
-
static void print_update(char *type, int *hdr, int num)
{
if (*hdr == 0)
@@ -134,9 +91,6 @@ static void print_update(char *type, int *hdr, int num)
printk(KERN_CONT " %s:%d", type, num);
}
-/* Used to determine whether thresholding is available or not */
-#define CMCI_THRESHOLD_FIRST 1
-
/*
* Enable CMCI (Corrected Machine Check Interrupt) for available MCE banks
* on this CPU. Use the algorithm recommended in the SDM to discover shared
@@ -148,9 +102,6 @@ static void cmci_discover(int banks, int boot)
int hdr = 0;
int i;
- if (!threshold_limit)
- return;
-
spin_lock(&cmci_discover_lock);
for (i = 0; i < banks; i++) {
u64 val;
@@ -168,7 +119,7 @@ static void cmci_discover(int banks, int boot)
continue;
}
- val |= CMCI_EN | CMCI_THRESHOLD_FIRST;
+ val |= CMCI_EN | CMCI_THRESHOLD;
wrmsrl(MSR_IA32_MC0_CTL2 + i, val);
rdmsrl(MSR_IA32_MC0_CTL2 + i, val);
@@ -177,7 +128,6 @@ static void cmci_discover(int banks, int boot)
if (!test_and_set_bit(i, owned) || boot)
print_update("CMCI", &hdr, i);
__clear_bit(i, __get_cpu_var(mce_poll_banks));
- cmci_set_threshold(i);
} else {
WARN_ON(!test_bit(i, __get_cpu_var(mce_poll_banks)));
}
--
1.6.3
^ permalink raw reply related [flat|nested] 7+ messages in thread* [PATCH -tip 2/2] x86, mce: Revert "add mce=nopoll option to disable timer polling"
2009-05-19 21:56 [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch Andi Kleen
2009-05-20 9:41 ` Hidetoshi Seto
2009-05-20 9:41 ` [PATCH -tip 1/2] x86, mce: Revert "add mce_threshold option for intel cmci" Hidetoshi Seto
@ 2009-05-20 9:42 ` Hidetoshi Seto
2009-05-22 8:21 ` [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch Hidetoshi Seto
3 siblings, 0 replies; 7+ messages in thread
From: Hidetoshi Seto @ 2009-05-20 9:42 UTC (permalink / raw)
To: Andi Kleen; +Cc: mingo, tglx, hpa, linux-kernel
This reverts commit f5b3ca6efb64aa3ab5fa32f0fc9cc0f6cfedee95.
Disabling only polling but not cmci is pointless setting.
Instead of "mce=nopoll" which should be paired with cmci disablement,
it rather make sense to have a "mce=ignore_ce" option that disable
both of polling and cmci at once. A patch for this replacement
will follow this reverting patch.
OTOH, once booted, we can disable polling by setting check_interval
to 0, but there are no mention about the fact. Later Andi will post
updated documents that can respond this issue.
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
---
Documentation/x86/x86_64/boot-options.txt | 2 --
arch/x86/kernel/cpu/mcheck/mce.c | 10 ++--------
2 files changed, 2 insertions(+), 10 deletions(-)
diff --git a/Documentation/x86/x86_64/boot-options.txt b/Documentation/x86/x86_64/boot-options.txt
index 5d55158..34c1304 100644
--- a/Documentation/x86/x86_64/boot-options.txt
+++ b/Documentation/x86/x86_64/boot-options.txt
@@ -13,8 +13,6 @@ Machine check
in a reboot. On Intel systems it is enabled by default.
mce=nobootlog
Disable boot machine check logging.
- mce=nopoll
- Disable timer polling for corrected errors
mce=tolerancelevel (number)
0: always panic on uncorrected errors, log corrected errors
1: panic or SIGBUS on uncorrected errors, log corrected errors
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index d7dccfe..d3628f3 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -465,8 +465,6 @@ void mce_log_therm_throt_event(__u64 status)
* Periodic polling timer for "silent" machine check errors. If the
* poller finds an MCE, poll 2x faster. When the poller finds no more
* errors, poll 2x slower (up to check_interval seconds).
- *
- * If check_interval is 0, polling is disabled.
*/
static int check_interval = 5 * 60; /* 5 minutes */
@@ -665,12 +663,11 @@ static void mce_init_timer(void)
{
struct timer_list *t = &__get_cpu_var(mce_timer);
- /* Disable polling if check_interval is 0 */
- if (!check_interval)
- return;
/* data race harmless because everyone sets to the same value */
if (!next_interval)
next_interval = check_interval * HZ;
+ if (!next_interval)
+ return;
setup_timer(t, mcheck_timer, smp_processor_id());
t->expires = round_jiffies(jiffies + next_interval);
add_timer(t);
@@ -885,14 +882,11 @@ __setup("nomce", mcheck_disable);
* mce=TOLERANCELEVEL (number, see above)
* mce=bootlog Log MCEs from before booting. Disabled by default on AMD.
* mce=nobootlog Don't log MCEs from before booting.
- * mce=nopoll Disable timer polling for corrected errors
*/
static int __init mcheck_enable(char *str)
{
if (!strcmp(str, "off"))
mce_dont_init = 1;
- else if (!strcmp(str, "nopoll"))
- check_interval = 0;
else if (!strcmp(str, "bootlog") || !strcmp(str, "nobootlog"))
mce_bootlog = (str[0] == 'b');
else if (isdigit(str[0]))
--
1.6.3
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch
2009-05-19 21:56 [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch Andi Kleen
` (2 preceding siblings ...)
2009-05-20 9:42 ` [PATCH -tip 2/2] x86, mce: Revert "add mce=nopoll option to disable timer polling" Hidetoshi Seto
@ 2009-05-22 8:21 ` Hidetoshi Seto
2009-05-22 10:16 ` Andi Kleen
3 siblings, 1 reply; 7+ messages in thread
From: Hidetoshi Seto @ 2009-05-22 8:21 UTC (permalink / raw)
To: Andi Kleen; +Cc: mingo, tglx, hpa, linux-kernel
Andi Kleen wrote:
> The following changes since commit dd9869965a301d0f35e32c42eb87d5f94883443a:
> Ingo Molnar (1):
> x86, mce: print number of MCE banks
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6.git mce-32bit-merge
>
> Andi Kleen (20):
> x86: MCE: Make mce_amd_64.c compile again
> Revert "x86, mce: Add mce=nopoll option to disable timer polling"
> x86: MCE: Move ifdef X86_64 to the beginning of the file
> x86: MCE: Initial steps to make 64bit mce code 32bit clean
> x86: MCE: Implement the PPro bank 0 quirk in the 64bit machine check code
> x86: MCE: Port K7 bank 0 quirk to 64bit mce code
> x86: MCE: Use a call vector to call the 64bit mce handler
> x86: MCE: Rename 64bit mce_dont_init to mce_disabled
> x86: MCE: Move mce_disabled option into common 64bit/64bit code
> x86: MCE: Remove machine check handler idle notify on 64bit
> x86: MCE: Remove oops_begin() use in 64bit machine check
> x86: MCE: Remove unused stop/restart_mce on 32bit
> x86: MCE: Use 64bit machine check code on 32bit
> x86: MCE: Deprecate old 32bit machine check code
> x86: MCE: Enable MCE_INTEL for 32bit new MCE
> x86: MCE: Enable MCE_AMD for 32bit NEW_MCE
> x86: MCE: Document new 32bit mcelog requirement in Documentation/Changes
> Export add_timer_on for modules
> x86: MCE: Add MSR read wrappers for easier error injection
> x86: MCE: Add basic error injection infrastructure
I reviewed them again...
#####
[1]: in x86: MCE: Deprecate old 32bit machine check code
> @@ -1,4 +1,4 @@
> -The following is a list of files and features that are going to be
> +he following is a list of files and features that are going to be
> removed in the kernel source tree. Every entry should contain what
> exactly is going away, why it is happening, and who is going to be doing
> the work. When the feature is removed from the kernel, it should also
Is it necessary?
#####
[2]: in x86: MCE: Enable MCE_INTEL for 32bit new MCE
> @@ -88,12 +88,13 @@
> #define THERMAL_APIC_VECTOR 0xfa
>
> #ifdef CONFIG_X86_32
> -/* 0xf8 - 0xf9 : free */
> +/* 0xf9 : free */
> #else
> -# define THRESHOLD_APIC_VECTOR 0xf9
> # define UV_BAU_MESSAGE 0xf8
> #endif
>
> +#define THRESHOLD_APIC_VECTOR 0xf9
> +
> /* f0-f7 used for spreading out TLB flushes: */
> #define INVALIDATE_TLB_VECTOR_END 0xf7
> #define INVALIDATE_TLB_VECTOR_START 0xf0
"/* 0xf8 : free :/" ?
And please place vectors in numerical order, like:
> #define THERMAL_APIC_VECTOR 0xfa
> +#define THRESHOLD_APIC_VECTOR 0xf9
>
> #ifdef CONFIG_X86_32
> -/* 0xf8 - 0xf9 : free */
> +/* 0xf8 : free */
> #else
> -# define THRESHOLD_APIC_VECTOR 0xf9
> # define UV_BAU_MESSAGE 0xf8
> #endif
>
> /* f0-f7 used for spreading out TLB flushes: */
#####
[3]: in x86: MCE: Add basic error injection infrastructure
> @@ -744,6 +778,7 @@ void __cpuinit mcheck_init(struct cpuinfo_x86 *c)
> mce_cpu_features(c);
> mce_init_timer();
> }
> +EXPORT_SYMBOL_GPL(do_machine_check);
>
> /*
> * Character device to read and clear the MCE log.
> @@ -870,6 +905,7 @@ timeout:
>
> return err ? -EFAULT : buf - ubuf;
> }
> +EXPORT_SYMBOL_GPL(mce_notify_user);
>
> static unsigned int mce_poll(struct file *file, poll_table *wait)
> {
These EXPORT_SYMBOL_GPLs are located in wrong place.
#####
[trivial]: in x86: MCE: Use 64bit machine check code on 32bit
> @@ -793,6 +809,15 @@ config X86_MCE_AMD
> Additional support for AMD specific MCE features such as
> the DRAM Error Threshold.
>
> +config X86_ANCIENT_MCE
> + def_bool n
> + depends on X86_32
> + prompt "Support for old Pentium 5 / WinChip machine checks"
> + help
> + Include support for machine check handling on old Pentium 5 or WinChip
> + systems. These typically need to be enabled explicitely on the command
> + line.
> +
> config X86_MCE_THRESHOLD
> depends on X86_MCE_AMD || X86_MCE_INTEL
> bool
It would be better to use "---help---" instead of "help".
And also it would be better to clean trailing spaces and spaces before tabs etc.,
that can be seen here and there.
Thanks,
H.Seto
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch
2009-05-22 8:21 ` [x86] [MCE] Updated 32bit merge for tip x86 mce2 branch Hidetoshi Seto
@ 2009-05-22 10:16 ` Andi Kleen
0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2009-05-22 10:16 UTC (permalink / raw)
To: Hidetoshi Seto; +Cc: Andi Kleen, mingo, tglx, hpa, linux-kernel
On Fri, May 22, 2009 at 05:21:53PM +0900, Hidetoshi Seto wrote:
> > @@ -1,4 +1,4 @@
> > -The following is a list of files and features that are going to be
> > +he following is a list of files and features that are going to be
> > removed in the kernel source tree. Every entry should contain what
> > exactly is going away, why it is happening, and who is going to be doing
> > the work. When the feature is removed from the kernel, it should also
>
> Is it necessary?
I fixed all that and regenerated the tree.
Hopefully the tree can be properly redone now, it's quite messy
that mce2 is still missing all the rc6 mce_64.c changes. I have
to patch around because of this on my followup trees.
-Andi
--
ak@linux.intel.com -- Speaking for myself only.
^ permalink raw reply [flat|nested] 7+ messages in thread