From: Don Zickus <dzickus@redhat.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Shaohua Li <shaohua.li@intel.com>,
Miles Lane <miles.lane@gmail.com>, Andrew Morton <akpm@osdl.org>,
linux-kernel@vger.kernel.org, ak@suse.de
Subject: Re: [2.6.17-rc5-mm2] crash when doing second suspend: BUG in arch/i386/kernel/nmi.c:174
Date: Tue, 6 Jun 2006 22:49:38 -0400
Message-ID: <20060607024938.GG11696@redhat.com>
In-Reply-To: <4485AC1F.9050001@goop.org>
Make the start/stop paths of the NMI watchdog more robust so that they
handle the suspend/resume cases gracefully.
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
On Tue, Jun 06, 2006 at 09:23:59AM -0700, Jeremy Fitzhardinge wrote:
> Shaohua Li wrote:
> >Does below patch help? The nmi suspend/resume doesn't look good to me.
> >Only CPU0 uses the suspend/resume code path. Other CPUs run the CPU
> >hotplug code path.
> >
> Unfortunately this just oopses immediately on the first suspend. The
> stack trace is unclear (and I'm just going from memory at the moment),
> but it looked like it got an invalid op. I'll try to get a clearer idea
> of the crash later today.
>
> J
Can you apply this patch on top of Shaohua's? It should fix all your
suspend problems.
The patch includes a small hack for the case where the NMI watchdog was
disabled when entering suspend: coming out of resume we do _not_ want it
enabled, so that the state after resume matches the state before suspend.
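To illustrate the intent of the guards, here is a minimal userspace
sketch. It is not the kernel code: NR_CPUS, struct wd_ctl, wd_state,
wd_start() and wd_stop() are hypothetical names used only for this
example, a plain array stands in for the per-cpu nmi_watchdog_ctlblk,
and the vendor-specific hardware setup is left out entirely. It only
models the bookkeeping: start and stop become idempotent per cpu, and a
non-boot cpu refuses to start while nmi_active is not positive.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4			/* hypothetical cpu count for the sketch */

/* Stand-in for the per-cpu nmi_watchdog_ctlblk; only the flag is modelled. */
struct wd_ctl {
	bool enabled;
};

static struct wd_ctl wd_state[NR_CPUS];
static atomic_int nmi_active;		/* number of cpus with the watchdog running */

/* Returns true if this call actually started the watchdog on 'cpu'. */
static bool wd_start(int cpu)
{
	struct wd_ctl *wd = &wd_state[cpu];

	/* already running on this cpu: do not count it twice */
	if (wd->enabled)
		return false;

	/* if cpu0 is not active, neither should the other cpus be */
	if (cpu != 0 && atomic_load(&nmi_active) <= 0)
		return false;

	/* (hardware setup would happen here in the real code) */
	wd->enabled = true;
	atomic_fetch_add(&nmi_active, 1);
	return true;
}

/* Returns true if this call actually stopped the watchdog on 'cpu'. */
static bool wd_stop(int cpu)
{
	struct wd_ctl *wd = &wd_state[cpu];

	/* not running on this cpu: nothing to undo */
	if (!wd->enabled)
		return false;

	/* (hardware teardown would happen here in the real code) */
	wd->enabled = false;
	atomic_fetch_sub(&nmi_active, 1);
	return true;
}
```

With these guards a repeated start or stop on the same cpu leaves
nmi_active unchanged, so the counter cannot drift across repeated
suspend/resume cycles in this model.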
Compiled but not tested, as I don't have easy access to my test machines
right now. Mainly posted for Andrew to pick up for rc6-mm1.
Cheers,
Don
Index: linux-don/arch/i386/kernel/nmi.c
===================================================================
--- linux-don.orig/arch/i386/kernel/nmi.c
+++ linux-don/arch/i386/kernel/nmi.c
@@ -745,6 +745,7 @@ static void stop_intel_arch_watchdog(voi
void setup_apic_nmi_watchdog (void *unused)
{
+ struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
#ifdef CONFIG_LOCKDEP
/*
* The NMI watchdog uses spinlocks (notifier chains, etc.),
@@ -761,6 +762,14 @@ void setup_apic_nmi_watchdog (void *unus
(nmi_watchdog != NMI_IO_APIC))
return;
+ if (wd->enabled == 1)
+ return;
+
+ /* cheap hack to support suspend/resume */
+ /* if cpu0 is not active neither should the other cpus */
+ if ((smp_processor_id() != 0) && (atomic_read(&nmi_active) <= 0))
+ return;
+
if (nmi_watchdog == NMI_LOCAL_APIC) {
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_AMD:
@@ -798,17 +807,22 @@ void setup_apic_nmi_watchdog (void *unus
return;
}
}
- __get_cpu_var(nmi_watchdog_ctlblk.enabled) = 1;
+ wd->enabled = 1;
atomic_inc(&nmi_active);
}
void stop_apic_nmi_watchdog(void *unused)
{
+ struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
+
/* only support LOCAL and IO APICs for now */
if ((nmi_watchdog != NMI_LOCAL_APIC) &&
(nmi_watchdog != NMI_IO_APIC))
return;
+ if (wd->enabled == 0)
+ return;
+
if (nmi_watchdog == NMI_LOCAL_APIC) {
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_AMD:
@@ -836,7 +850,7 @@ void stop_apic_nmi_watchdog(void *unused
return;
}
}
- __get_cpu_var(nmi_watchdog_ctlblk.enabled) = 0;
+ wd->enabled = 0;
atomic_dec(&nmi_active);
}
Index: linux-don/arch/x86_64/kernel/nmi.c
===================================================================
--- linux-don.orig/arch/x86_64/kernel/nmi.c
+++ linux-don/arch/x86_64/kernel/nmi.c
@@ -672,6 +672,7 @@ static void stop_intel_arch_watchdog(voi
void setup_apic_nmi_watchdog(void *unused)
{
+ struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
#ifdef CONFIG_LOCKDEP
/*
* The NMI watchdog uses spinlocks (notifier chains, etc.),
@@ -688,6 +689,14 @@ void setup_apic_nmi_watchdog(void *unuse
(nmi_watchdog != NMI_IO_APIC))
return;
+ if (wd->enabled == 1)
+ return;
+
+ /* cheap hack to support suspend/resume */
+ /* if cpu0 is not active neither should the other cpus */
+ if ((smp_processor_id() != 0) && (atomic_read(&nmi_active) <= 0))
+ return;
+
if (nmi_watchdog == NMI_LOCAL_APIC) {
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_AMD:
@@ -709,17 +718,22 @@ void setup_apic_nmi_watchdog(void *unuse
return;
}
}
- __get_cpu_var(nmi_watchdog_ctlblk.enabled) = 1;
+ wd->enabled = 1;
atomic_inc(&nmi_active);
}
void stop_apic_nmi_watchdog(void *unused)
{
+ struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
+
/* only support LOCAL and IO APICs for now */
if ((nmi_watchdog != NMI_LOCAL_APIC) &&
(nmi_watchdog != NMI_IO_APIC))
return;
+ if (wd->enabled == 0)
+ return;
+
if (nmi_watchdog == NMI_LOCAL_APIC) {
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_AMD:
@@ -738,7 +752,7 @@ void stop_apic_nmi_watchdog(void *unused
return;
}
}
- __get_cpu_var(nmi_watchdog_ctlblk.enabled) = 0;
+ wd->enabled = 0;
atomic_dec(&nmi_active);
}