* [PATCH -rt 0/4] nmi_watchdog fixes for -rt
@ 2008-04-28 18:10 Hiroshi Shimamoto
2008-04-28 18:14 ` [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on Hiroshi Shimamoto
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Hiroshi Shimamoto @ 2008-04-28 18:10 UTC
To: Ingo Molnar, Steven Rostedt, Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
Hi,
Here is a patchset of nmi_watchdog fixes for the -rt kernel.
These patches are against 2.6.24.4-rt4.
Could you please review?
thanks,
Hiroshi Shimamoto
* [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on
2008-04-28 18:10 [PATCH -rt 0/4] nmi_watchdog fixes for -rt Hiroshi Shimamoto
@ 2008-04-28 18:14 ` Hiroshi Shimamoto
2008-04-28 19:00 ` Steven Rostedt
2008-04-28 18:16 ` [PATCH -rt 2/4] x86: return true for NMI handled Hiroshi Shimamoto
` (3 subsequent siblings)
4 siblings, 1 reply; 8+ messages in thread
From: Hiroshi Shimamoto @ 2008-04-28 18:14 UTC
To: Ingo Molnar, Steven Rostedt, Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
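The nmi_show_regs flags should be set before the NMI is sent.
Otherwise a CPU can handle the NMI before its flag is set, and
irq_show_regs_callback() returns without showing anything.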
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
arch/x86/kernel/nmi_64.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
index d187ab9..69cc737 100644
--- a/arch/x86/kernel/nmi_64.c
+++ b/arch/x86/kernel/nmi_64.c
@@ -327,11 +327,11 @@ void nmi_show_all_regs(void)
if (system_state == SYSTEM_BOOTING)
return;
- smp_send_nmi_allbutself();
-
for_each_online_cpu(i)
nmi_show_regs[i] = 1;
+ smp_send_nmi_allbutself();
+
for_each_online_cpu(i) {
while (nmi_show_regs[i] == 1)
barrier();
--
1.5.4.1
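
Note (illustration only, not part of the patch): the ordering problem can be sketched with a small single-threaded model. Everything here is an assumption made for the sketch: `NCPUS`, `show_flag[]` and the `model_*` helpers are hypothetical stand-ins for `nmi_show_regs[]`, the NMI handler and `smp_send_nmi_allbutself()`, and the remote handlers are assumed to run the instant the NMI is sent, which is the worst case for the old ordering.

```c
#include <assert.h>

#define NCPUS 4	/* hypothetical CPU count for this model */

static int show_flag[NCPUS];	/* stands in for nmi_show_regs[] */

/* Modelled handler: acts only if its flag is already set. */
static void model_nmi_handler(int cpu)
{
	if (!show_flag[cpu])
		return;		/* no request visible yet: NMI is ignored */
	show_flag[cpu] = 0;	/* request consumed */
}

/* Modelled smp_send_nmi_allbutself(): remote handlers run immediately. */
static void model_send_nmi_allbutself(int self)
{
	for (int i = 0; i < NCPUS; i++)
		if (i != self)
			model_nmi_handler(i);
}

static void model_set_flags(void)
{
	for (int i = 0; i < NCPUS; i++)
		show_flag[i] = 1;
}

/*
 * Returns how many remote CPUs still have their flag set afterwards,
 * i.e. how many flags the sender would keep spinning on.
 */
int remote_flags_left(int self, int set_flags_first)
{
	int left = 0;

	assert(self >= 0 && self < NCPUS);
	for (int i = 0; i < NCPUS; i++)
		show_flag[i] = 0;

	if (set_flags_first) {
		model_set_flags();
		model_send_nmi_allbutself(self);
	} else {
		model_send_nmi_allbutself(self);
		model_set_flags();
	}

	for (int i = 0; i < NCPUS; i++)
		if (i != self && show_flag[i])
			left++;
	return left;
}
```

In this model, setting the flags first leaves no remote flag pending, while sending first leaves every remote flag set. On real hardware the old ordering is a race rather than a certain failure.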
* [PATCH -rt 2/4] x86: return true for NMI handled
2008-04-28 18:10 [PATCH -rt 0/4] nmi_watchdog fixes for -rt Hiroshi Shimamoto
2008-04-28 18:14 ` [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on Hiroshi Shimamoto
@ 2008-04-28 18:16 ` Hiroshi Shimamoto
2008-04-28 18:17 ` [PATCH -rt 3/4] x86: nmi_watchdog NMI needed for irq_show_regs_callback() Hiroshi Shimamoto
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Hiroshi Shimamoto @ 2008-04-28 18:16 UTC
To: Ingo Molnar, Steven Rostedt, Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
An NMI sent for show_regs causes an unknown NMI when nmi_watchdog is in
local APIC mode, because lapic_wd_event() fails while the perfctr is still
running. If the NMI is for show_regs, nmi_watchdog_tick() should return 1.
On x86_32, the call to irq_show_regs_callback() is moved to the top of
nmi_watchdog_tick(), the same as on x86_64.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
arch/x86/kernel/nmi_32.c | 10 +++++-----
arch/x86/kernel/nmi_64.c | 9 +++++----
include/linux/sched.h | 2 +-
3 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/nmi_32.c b/arch/x86/kernel/nmi_32.c
index da9deb3..d1f92ca 100644
--- a/arch/x86/kernel/nmi_32.c
+++ b/arch/x86/kernel/nmi_32.c
@@ -350,10 +350,10 @@ void nmi_show_all_regs(void)
static DEFINE_RAW_SPINLOCK(nmi_print_lock);
-notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
+notrace int irq_show_regs_callback(int cpu, struct pt_regs *regs)
{
if (!nmi_show_regs[cpu])
- return;
+ return 0;
nmi_show_regs[cpu] = 0;
spin_lock(&nmi_print_lock);
@@ -362,6 +362,7 @@ notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
per_cpu(irq_stat, cpu).apic_timer_irqs);
show_regs(regs);
spin_unlock(&nmi_print_lock);
+ return 1;
}
notrace __kprobes int
@@ -376,8 +377,9 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
unsigned int sum;
int touched = 0;
int cpu = smp_processor_id();
- int rc=0;
+ int rc;
+ rc = irq_show_regs_callback(cpu, regs);
__profile_tick(CPU_PROFILING, regs);
/* check for other users first */
@@ -404,8 +406,6 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
sum = per_cpu(irq_stat, cpu).apic_timer_irqs +
per_cpu(irq_stat, cpu).irq0_irqs;
- irq_show_regs_callback(cpu, regs);
-
/* if the apic timer isn't firing, this cpu isn't doing much */
/* if the none of the timers isn't firing, this cpu isn't doing much */
if (!touched && last_irq_sums[cpu] == sum) {
diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
index 5d3073c..afc0317 100644
--- a/arch/x86/kernel/nmi_64.c
+++ b/arch/x86/kernel/nmi_64.c
@@ -340,10 +340,10 @@ void nmi_show_all_regs(void)
static DEFINE_RAW_SPINLOCK(nmi_print_lock);
-notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
+notrace int irq_show_regs_callback(int cpu, struct pt_regs *regs)
{
if (!nmi_show_regs[cpu])
- return;
+ return 0;
nmi_show_regs[cpu] = 0;
spin_lock(&nmi_print_lock);
@@ -351,6 +351,7 @@ notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
printk(KERN_WARNING "apic_timer_irqs: %d\n", read_pda(apic_timer_irqs));
show_regs(regs);
spin_unlock(&nmi_print_lock);
+ return 1;
}
notrace int __kprobes
@@ -359,9 +360,9 @@ nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
int sum;
int touched = 0;
int cpu = smp_processor_id();
- int rc = 0;
+ int rc;
- irq_show_regs_callback(cpu, regs);
+ rc = irq_show_regs_callback(cpu, regs);
__profile_tick(CPU_PROFILING, regs);
/* check for other users first */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4176f87..a37200a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -292,7 +292,7 @@ static inline void show_state(void)
}
extern void show_regs(struct pt_regs *);
-extern void irq_show_regs_callback(int cpu, struct pt_regs *regs);
+extern int irq_show_regs_callback(int cpu, struct pt_regs *regs);
/*
* TASK is a pointer to the task whose backtrace we want to see (or NULL for current
--
1.5.4.1
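
Note (illustration only, not part of the patch): the return-value change can be sketched as below. This is a simplified model under stated assumptions, not the kernel code: `show_regs_requested`, `model_show_regs_callback()`, `model_watchdog_tick()` and `model_nmi_is_handled()` are hypothetical names, and the `perfctr_overflowed` parameter stands in for lapic_wd_event() succeeding.

```c
#include <assert.h>
#include <stdbool.h>

static bool show_regs_requested;	/* stands in for nmi_show_regs[cpu] */

/* Returns 1 if this NMI was a show_regs request, 0 otherwise. */
static int model_show_regs_callback(void)
{
	if (!show_regs_requested)
		return 0;
	show_regs_requested = false;
	return 1;
}

/*
 * Modelled watchdog tick: the NMI counts as handled if it was a
 * show_regs request or if the watchdog counter really overflowed.
 */
static int model_watchdog_tick(bool perfctr_overflowed)
{
	int rc = model_show_regs_callback();

	if (perfctr_overflowed)
		rc = 1;
	return rc;
}

/* Returns 1 if the NMI is claimed, 0 if it would be reported as unknown. */
int model_nmi_is_handled(bool requested, bool perfctr_overflowed)
{
	int rc;

	show_regs_requested = requested;
	rc = model_watchdog_tick(perfctr_overflowed);
	assert(!show_regs_requested);	/* a request never stays pending */
	return rc;
}
```

The case the patch addresses is a show_regs request arriving while the counter has not overflowed: with the callback's result feeding `rc`, that NMI is claimed instead of falling through as unknown.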
* [PATCH -rt 3/4] x86: nmi_watchdog NMI needed for irq_show_regs_callback()
2008-04-28 18:10 [PATCH -rt 0/4] nmi_watchdog fixes for -rt Hiroshi Shimamoto
2008-04-28 18:14 ` [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on Hiroshi Shimamoto
2008-04-28 18:16 ` [PATCH -rt 2/4] x86: return true for NMI handled Hiroshi Shimamoto
@ 2008-04-28 18:17 ` Hiroshi Shimamoto
2008-04-28 18:19 ` [PATCH -rt 4/4] wait for finish show_regs() before panic Hiroshi Shimamoto
2008-04-28 19:03 ` [PATCH -rt 0/4] nmi_watchdog fixes for -rt Steven Rostedt
4 siblings, 0 replies; 8+ messages in thread
From: Hiroshi Shimamoto @ 2008-04-28 18:17 UTC
To: Ingo Molnar, Steven Rostedt, Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
The -rt kernel doesn't panic immediately when an NMI lockup is detected,
because the kernel waits for show_regs on all CPUs, but NMIs do not arrive
that frequently. Send an NMI to the other CPUs so that they run
irq_show_regs_callback() right away.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
arch/x86/kernel/nmi_32.c | 7 +++++++
arch/x86/kernel/nmi_64.c | 8 +++++++-
2 files changed, 14 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/nmi_32.c b/arch/x86/kernel/nmi_32.c
index f55f05b..da9deb3 100644
--- a/arch/x86/kernel/nmi_32.c
+++ b/arch/x86/kernel/nmi_32.c
@@ -428,6 +428,13 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
if (i == cpu)
continue;
nmi_show_regs[i] = 1;
+ }
+
+ smp_send_nmi_allbutself();
+
+ for_each_online_cpu(i) {
+ if (i == cpu)
+ continue;
while (nmi_show_regs[i] == 1)
cpu_relax();
}
diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
index 69cc737..5d3073c 100644
--- a/arch/x86/kernel/nmi_64.c
+++ b/arch/x86/kernel/nmi_64.c
@@ -412,10 +412,16 @@ nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
if (i == cpu)
continue;
nmi_show_regs[i] = 1;
+ }
+
+ smp_send_nmi_allbutself();
+
+ for_each_online_cpu(i) {
+ if (i == cpu)
+ continue;
while (nmi_show_regs[i] == 1)
cpu_relax();
}
-
die_nmi("NMI Watchdog detected LOCKUP on CPU %d\n", regs,
panic_on_timeout);
}
--
1.5.4.1
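
Note (illustration only, not part of the patch): a rough model of why the explicit NMI shortens the wait. The numbers are made up for the sketch: `next_periodic_nmi[]` is a hypothetical table of how many ticks remain until each CPU's next periodic watchdog NMI, and an explicitly sent NMI is assumed to be handled with no delay.

```c
#include <assert.h>

#define NCPUS 4	/* hypothetical CPU count for this model */

/* Hypothetical: ticks until each CPU's next periodic watchdog NMI. */
static const int next_periodic_nmi[NCPUS] = { 0, 300, 600, 900 };

/*
 * Returns how many ticks the detecting CPU spins before every remote
 * CPU has run its handler and cleared its flag.
 */
int ticks_spent_waiting(int self, int send_nmi)
{
	int waited = 0;

	assert(self >= 0 && self < NCPUS);
	for (int i = 0; i < NCPUS; i++) {
		int t;

		if (i == self)
			continue;	/* never wait on ourselves */
		/*
		 * With an explicit NMI the handler runs at once; without
		 * it the flag stays set until the periodic NMI arrives.
		 */
		t = send_nmi ? 0 : next_periodic_nmi[i];
		if (t > waited)
			waited = t;
	}
	return waited;
}
```

Without the explicit NMI the wait is bounded by the slowest remote CPU's next periodic NMI; with it, the model waits zero ticks.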
* [PATCH -rt 4/4] wait for finish show_regs() before panic
2008-04-28 18:10 [PATCH -rt 0/4] nmi_watchdog fixes for -rt Hiroshi Shimamoto
` (2 preceding siblings ...)
2008-04-28 18:17 ` [PATCH -rt 3/4] x86: nmi_watchdog NMI needed for irq_show_regs_callback() Hiroshi Shimamoto
@ 2008-04-28 18:19 ` Hiroshi Shimamoto
2008-04-28 19:03 ` [PATCH -rt 0/4] nmi_watchdog fixes for -rt Steven Rostedt
4 siblings, 0 replies; 8+ messages in thread
From: Hiroshi Shimamoto @ 2008-04-28 18:19 UTC
To: Ingo Molnar, Steven Rostedt, Thomas Gleixner; +Cc: linux-kernel, linux-rt-users
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
A kdump failure can occur because the kernel doesn't wait for show_regs()
to finish. The nmi_show_regs flag for show_regs() is cleared before
show_regs() is actually called. This flag should be cleared after
show_regs().
kdump stops all CPUs other than the crashing CPU from the NMI handler, but
if show_regs() takes some time, kdump cannot wait and continues processing.
This means that the 2nd kernel and the old kernel run simultaneously, which
might cause unexpected behavior such as random reboots.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Maxim Uvarov <muvarov@ru.mvista.com>
---
arch/x86/kernel/nmi_32.c | 2 +-
arch/x86/kernel/nmi_64.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/nmi_32.c b/arch/x86/kernel/nmi_32.c
index d1f92ca..f46bb6e 100644
--- a/arch/x86/kernel/nmi_32.c
+++ b/arch/x86/kernel/nmi_32.c
@@ -355,13 +355,13 @@ notrace int irq_show_regs_callback(int cpu, struct pt_regs *regs)
if (!nmi_show_regs[cpu])
return 0;
- nmi_show_regs[cpu] = 0;
spin_lock(&nmi_print_lock);
printk(KERN_WARNING "NMI show regs on CPU#%d:\n", cpu);
printk(KERN_WARNING "apic_timer_irqs: %d\n",
per_cpu(irq_stat, cpu).apic_timer_irqs);
show_regs(regs);
spin_unlock(&nmi_print_lock);
+ nmi_show_regs[cpu] = 0;
return 1;
}
diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
index afc0317..8bc2328 100644
--- a/arch/x86/kernel/nmi_64.c
+++ b/arch/x86/kernel/nmi_64.c
@@ -345,12 +345,12 @@ notrace int irq_show_regs_callback(int cpu, struct pt_regs *regs)
if (!nmi_show_regs[cpu])
return 0;
- nmi_show_regs[cpu] = 0;
spin_lock(&nmi_print_lock);
printk(KERN_WARNING "NMI show regs on CPU#%d:\n", cpu);
printk(KERN_WARNING "apic_timer_irqs: %d\n", read_pda(apic_timer_irqs));
show_regs(regs);
spin_unlock(&nmi_print_lock);
+ nmi_show_regs[cpu] = 0;
return 1;
}
--
1.5.4.1
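
Note (illustration only, not part of the patch): the effect of moving the flag clear can be sketched with a single-threaded model in which a waiter samples the flag while the dump is in progress. All names (`flag`, `dump_finished`, `released_early`, `waiter_samples_flag()`, `model_show_regs()`) are hypothetical and exist only for this sketch.

```c
#include <assert.h>

static int flag;		/* stands in for nmi_show_regs[cpu] */
static int dump_finished;	/* set once the modelled show_regs() returns */
static int released_early;	/* waiter saw flag == 0 before the dump ended */

/* What a spinning waiter would conclude at this instant. */
static void waiter_samples_flag(void)
{
	if (flag == 0 && !dump_finished)
		released_early = 1;
}

/* Modelled show_regs(): the waiter may look while it is still printing. */
static void model_show_regs(void)
{
	waiter_samples_flag();
	dump_finished = 1;
}

/* Returns 1 if the waiter could proceed before the dump was complete. */
int waiter_released_early(int clear_flag_first)
{
	flag = 1;
	dump_finished = 0;
	released_early = 0;

	if (clear_flag_first) {
		flag = 0;		/* old order: clear, then dump */
		model_show_regs();
	} else {
		model_show_regs();	/* new order: dump, then clear */
		flag = 0;
	}
	waiter_samples_flag();
	assert(flag == 0);		/* the handler always ends with the flag cleared */
	return released_early;
}
```

With the old order the waiter can observe a cleared flag mid-dump and move on; with the new order a cleared flag implies the dump has finished.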
* Re: [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on
2008-04-28 18:14 ` [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on Hiroshi Shimamoto
@ 2008-04-28 19:00 ` Steven Rostedt
2008-04-28 21:34 ` Hiroshi Shimamoto
0 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2008-04-28 19:00 UTC
To: Hiroshi Shimamoto
Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, linux-rt-users
On Mon, 28 Apr 2008, Hiroshi Shimamoto wrote:
> diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
> index d187ab9..69cc737 100644
> --- a/arch/x86/kernel/nmi_64.c
> +++ b/arch/x86/kernel/nmi_64.c
> @@ -327,11 +327,11 @@ void nmi_show_all_regs(void)
> if (system_state == SYSTEM_BOOTING)
> return;
>
> - smp_send_nmi_allbutself();
> -
> for_each_online_cpu(i)
> nmi_show_regs[i] = 1;
Hi Hiroshi,
I know this wasn't your code to begin with, but how does this function
exit? I mean, we set an array where each index per online cpu is set to
one, then do an "nmi_allbutself", and then wait on those indexes to turn
zero, one at a time. If we are CPU 0 here, we set that index to 1, then
enter the loop, and will block forever on this "while" loop below.
Am I missing something?
Thanks,
-- Steve
> + smp_send_nmi_allbutself();
> +
> for_each_online_cpu(i) {
> while (nmi_show_regs[i] == 1)
> barrier();
> --
> 1.5.4.1
>
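
Note (illustration only, not from the thread): the concern raised above can be sketched with a small model. It assumes that no NMI ever reaches the sending CPU, so nothing clears the sender's own flag; `NCPUS`, `flags[]` and `flags_never_cleared()` are hypothetical names for this sketch.

```c
#include <assert.h>

#define NCPUS 4	/* hypothetical CPU count for this model */

static int flags[NCPUS];	/* stands in for nmi_show_regs[] */

/*
 * Returns how many flags remain set after the NMI round, i.e. how many
 * entries the wait loop would spin on forever in this model.
 */
int flags_never_cleared(int self, int skip_self)
{
	int stuck = 0;

	assert(self >= 0 && self < NCPUS);

	/* Request a dump from every CPU, optionally not from ourselves. */
	for (int i = 0; i < NCPUS; i++)
		flags[i] = (skip_self && i == self) ? 0 : 1;

	/* NMI to all but self: each remote handler clears its own flag. */
	for (int i = 0; i < NCPUS; i++)
		if (i != self)
			flags[i] = 0;

	for (int i = 0; i < NCPUS; i++)
		if (flags[i])
			stuck++;
	return stuck;
}
```

In this model the sender's own entry is the only one left set unless the sender is skipped, as the nmi_watchdog_tick() loops do with `if (i == cpu) continue;`. If the sender does take an NMI from another source, its handler would clear the flag and the loop would exit.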
* Re: [PATCH -rt 0/4] nmi_watchdog fixes for -rt
2008-04-28 18:10 [PATCH -rt 0/4] nmi_watchdog fixes for -rt Hiroshi Shimamoto
` (3 preceding siblings ...)
2008-04-28 18:19 ` [PATCH -rt 4/4] wait for finish show_regs() before panic Hiroshi Shimamoto
@ 2008-04-28 19:03 ` Steven Rostedt
4 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2008-04-28 19:03 UTC
To: Hiroshi Shimamoto
Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, linux-rt-users
On Mon, 28 Apr 2008, Hiroshi Shimamoto wrote:
> Hi,
>
> Here is a patchset of nmi_watchdog fixes for -rt kernel.
> These patches are against 2.6.24.4-rt4.
> Could you please review?
Patches look good!
I'll queue them up for the next release.
Thanks,
-- Steve
* Re: [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on
2008-04-28 19:00 ` Steven Rostedt
@ 2008-04-28 21:34 ` Hiroshi Shimamoto
0 siblings, 0 replies; 8+ messages in thread
From: Hiroshi Shimamoto @ 2008-04-28 21:34 UTC
To: Steven Rostedt; +Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, linux-rt-users
Steven Rostedt wrote:
>
> On Mon, 28 Apr 2008, Hiroshi Shimamoto wrote:
>> diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
>> index d187ab9..69cc737 100644
>> --- a/arch/x86/kernel/nmi_64.c
>> +++ b/arch/x86/kernel/nmi_64.c
>> @@ -327,11 +327,11 @@ void nmi_show_all_regs(void)
>> if (system_state == SYSTEM_BOOTING)
>> return;
>>
>> - smp_send_nmi_allbutself();
>> -
>> for_each_online_cpu(i)
>> nmi_show_regs[i] = 1;
>
> Hi Hiroshi,
>
> I know this wasn't your code to begin with but, how does this function
> exit? I mean, we set an array where each index per online cpu is set to
> one, then do an "nmi_allbutself", and then wait on those indexes to turn
> zero, one at a time. If we are CPU 0 here, we set that index to 1, then
> enter the loop, and will block forever on this "while" loop below.
Hm, I'm not quite sure what happens when NMI is disabled.
If NMI is working, the issuing CPU will receive an NMI and the flag is
turned off in the NMI handler.
I'll look into it again and will work on it if needed.
thanks,
Hiroshi Shimamoto