public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] watchdog: reduce "NMI watchdog enabled, takes one hw-pmu counter." messages
@ 2012-06-06 18:09 Nathan Zimmer
  2012-06-06 21:15 ` Don Zickus
  0 siblings, 1 reply; 3+ messages in thread
From: Nathan Zimmer @ 2012-06-06 18:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton, Don Zickus, Peter Zijlstra

 watchdog: reduces some noise on a large system
 The printk buffer can be flooded with with redundant
 "NMI watchdog enabled, takes one hw-pmu counter." messages.
 It doesn't add any value beyond the first.

 Note the message needs logged a second time if the watchdog was disabled then
 reenabled.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
---
 kernel/watchdog.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index e5e1d85..404b141 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -372,6 +372,8 @@ static int watchdog(void *unused)
 
 
 #ifdef CONFIG_HARDLOCKUP_DETECTOR
+static bool already_notified;
+
 static int watchdog_nmi_enable(int cpu)
 {
 	struct perf_event_attr *wd_attr;
@@ -391,7 +393,10 @@ static int watchdog_nmi_enable(int cpu)
 	/* Try to register using hardware perf events */
 	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
 	if (!IS_ERR(event)) {
-		pr_info("enabled, takes one hw-pmu counter.\n");
+		if (!already_notified) {
+			pr_info("enabled, takes one hw-pmu counter.\n");
+			already_notified = true;
+		}
 		goto out_save;
 	}
 
@@ -426,6 +431,7 @@ static void watchdog_nmi_disable(int cpu)
 
 		/* should be in cleanup, but blocks oprofile */
 		perf_event_release_kernel(event);
+		already_notified = false;
 	}
 	return;
 }
-- 
1.6.0.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] watchdog: reduce "NMI watchdog enabled, takes one hw-pmu counter." messages
  2012-06-06 18:09 [PATCH] watchdog: reduce "NMI watchdog enabled, takes one hw-pmu counter." messages Nathan Zimmer
@ 2012-06-06 21:15 ` Don Zickus
  2012-06-07 17:14   ` Nathan Zimmer
  0 siblings, 1 reply; 3+ messages in thread
From: Don Zickus @ 2012-06-06 21:15 UTC (permalink / raw)
  To: Nathan Zimmer; +Cc: linux-kernel, Andrew Morton, Peter Zijlstra

On Wed, Jun 06, 2012 at 01:09:46PM -0500, Nathan Zimmer wrote:
>  watchdog: reduces some noise on a large system
>  The printk buffer can be flooded with with redundant
>  "NMI watchdog enabled, takes one hw-pmu counter." messages.
>  It doesn't add any value beyond the first.
> 
>  Note the message needs logged a second time if the watchdog was disabled then
>  reenabled.

Hi Nathan,

Thanks for the patch.  I added something similar to RHEL-6 a while ago
that solved the same problem in a more robust way (I think).  IOW, I dealt
with the watchdog failures too (for virt and bios issues).

It doesn't cover the nmi_disable case like your patch does, but is easy to
add.

I attached it below.  Let me know if this meets your needs or not?

Cheers,
Don

-----------------------------8<-----------------------
From: Don Zickus <dzickus@redhat.com>
Date: Wed, 6 Jun 2012 15:17:19 -0400
Subject: [PATCH] watchdog: quiet down the boot messages

A bunch of bugzillas have complained how noisy the nmi_watchdog is during
boot-up especially with its expected failure cases (like virt and bios
resource contention).

This is my attempt to quiet them down and keep it less confusing for the end
user.  What I did is print the message for cpu0 and save it for future
comparisions.  If future cpus have an identical message as cpu0, then don't
print the redundant info.  However, if a future cpu has a different message,
happily print that loudly.

Before the change, you would see something like:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog enabled, takes one hw-pmu counter.
    Booting Node   0, Processors  #1
    NMI watchdog enabled, takes one hw-pmu counter.
     #2
    NMI watchdog enabled, takes one hw-pmu counter.
     #3 Ok.
    NMI watchdog enabled, takes one hw-pmu counter.
    Brought up 4 CPUs
    Total of 4 processors activated (22607.24 BogoMIPS).

After the change, it is simlified to:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog enabled, takes one hw-pmu counter.
    Booting Node   0, Processors  #1 #2 #3 Ok.
    Brought up 4 CPUs

Signed-off-by: Don Zickus <dzickus@redhat.com>
---
 kernel/watchdog.c |   20 +++++++++++++++++++-
 1 files changed, 19 insertions(+), 1 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index e5e1d85..79ff671 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -377,6 +377,14 @@ static int watchdog_nmi_enable(int cpu)
 	struct perf_event_attr *wd_attr;
 	struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
+	/*
+	 * People like the simple clean cpu node info
+	 * on boot.  Simplify the noise from the watchdog
+	 * by only printing messages that are different than
+	 * what cpu0 displayed
+	 */
+	static unsigned long err0 = 0;
+
 	/* is it already setup and enabled? */
 	if (event && event->state > PERF_EVENT_STATE_OFF)
 		goto out;
@@ -390,11 +398,21 @@ static int watchdog_nmi_enable(int cpu)
 
 	/* Try to register using hardware perf events */
 	event = perf_event_create_kernel_counter(wd_attr, cpu, NULL, watchdog_overflow_callback, NULL);
+
+	/* save cpu0 error for future comparision */
+	if (!cpu)
+		err0 = (IS_ERR(event) ? PTR_ERR(event) : 0);
+
 	if (!IS_ERR(event)) {
-		pr_info("enabled, takes one hw-pmu counter.\n");
+		/* only print for cpu0 or different than cpu0 */
+		if (!cpu || err0)
+			pr_info("enabled, takes one hw-pmu counter.\n");
 		goto out_save;
 	}
 
+	/* skip displaying the same error again */
+	if ((PTR_ERR(event) == err0) && cpu)
+		return PTR_ERR(event);
 
 	/* vary the KERN level based on the returned errno */
 	if (PTR_ERR(event) == -EOPNOTSUPP)
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] watchdog: reduce "NMI watchdog enabled, takes one hw-pmu counter." messages
  2012-06-06 21:15 ` Don Zickus
@ 2012-06-07 17:14   ` Nathan Zimmer
  0 siblings, 0 replies; 3+ messages in thread
From: Nathan Zimmer @ 2012-06-07 17:14 UTC (permalink / raw)
  To: Don Zickus; +Cc: Nathan Zimmer, linux-kernel, Andrew Morton, Peter Zijlstra

On Wed, Jun 06, 2012 at 05:15:23PM -0400, Don Zickus wrote:
> On Wed, Jun 06, 2012 at 01:09:46PM -0500, Nathan Zimmer wrote:
> >  watchdog: reduces some noise on a large system
> >  The printk buffer can be flooded with with redundant
> >  "NMI watchdog enabled, takes one hw-pmu counter." messages.
> >  It doesn't add any value beyond the first.
> > 
> >  Note the message needs logged a second time if the watchdog was disabled then
> >  reenabled.
> 
> Hi Nathan,
> 
> Thanks for the patch.  I added something similar to RHEL-6 a while ago
> that solved the same problem in a more robust way (I think).  IOW, I dealt
> with the watchdog failures too (for virt and bios issues).
> 
> It doesn't cover the nmi_disable case like your patch does, but is easy to
> add.
> 
> I attached it below.  Let me know if this meets your needs or not?
> 
> Cheers,
> Don

Yes it does seem to meet my needs quite well.

Thanks,
Nate 

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-06-07 17:14 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-06 18:09 [PATCH] watchdog: reduce "NMI watchdog enabled, takes one hw-pmu counter." messages Nathan Zimmer
2012-06-06 21:15 ` Don Zickus
2012-06-07 17:14   ` Nathan Zimmer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox