From: Don Zickus <dzickus@redhat.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Pan, Zhenjie" <zhenjie.pan@intel.com>,
	Stephane Eranian <eranian@google.com>,
	"paulus@samba.org" <paulus@samba.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"acme@ghostprotocols.net" <acme@ghostprotocols.net>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"Liu, Chuansheng" <chuansheng.liu@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2] NMI: fix NMI period is not correct when cpu frequency changes issue.
Date: Tue, 23 Apr 2013 14:14:23 -0400
Message-ID: <20130423181423.GC79013@redhat.com>
In-Reply-To: <1366663056.8337.7.camel@laptop>

On Mon, Apr 22, 2013 at 10:37:36PM +0200, Peter Zijlstra wrote:
> On Mon, 2013-04-22 at 00:50 +0000, Pan, Zhenjie wrote:
> > This make watchdog reset happen before hard lockup detect.
> 
> Doesn't your watchdog trigger an NMI you can use to print the panic?
> 
> ISTR some people (hi Don!) spending quite a lot of time to make this
> work for some other platforms.
> 
> IIRC those things would fire an NMI at some point and then hard-reset
> the machine not much later.. the difficulty was detecting this
> 'unclaimed' nmi and allowing drivers to register for it.
> 
> NMI_UNKNOWN and unknown_nmi_panic are the result of that.

I think you are confusing the hard lockup detector watchdog (which uses
the perf counters) with a physical hardware watchdog (which just resets
the cpu if it is not kicked frequently enough; e.g.
drivers/watchdog/intel_scu_watchdog.c).

I believe Zhenjie's problem is that the hard lockup detector (i.e.
nmi_watchdog) becomes useless because sometimes it correctly fires
before the hardware watchdog expires, and other times it does not.

In order for the hard lockup detector to be useful, it should be reliable.
Today it isn't, because its period varies inversely with cpu frequency.
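
To make the inverse relationship concrete, here is a minimal sketch of the
arithmetic. It assumes the detector programs a cycle counter once, from a
reference frequency and a threshold in seconds. The helper names, the
frequencies, and the hardware watchdog timeout below are all hypothetical
and are not taken from the patch or from the kernel source.

```c
#include <stdint.h>

/*
 * Cycles the counter is programmed to count before raising an NMI.
 * Assumption: computed once from a reference frequency (ref_khz) and
 * a threshold in seconds, then left alone when the cpu frequency changes.
 */
static uint64_t nmi_sample_cycles(uint64_t ref_khz, unsigned int thresh_sec)
{
	return ref_khz * 1000ULL * thresh_sec;
}

/*
 * Wall-clock time, in milliseconds, that the same cycle count takes when
 * the cpu actually runs at cur_khz (cycles / kHz == milliseconds).
 */
static uint64_t nmi_period_ms(uint64_t cycles, uint64_t cur_khz)
{
	return cycles / cur_khz;
}

/* Nonzero if the NMI would fire before the hardware watchdog resets. */
static int nmi_fires_first(uint64_t cycles, uint64_t cur_khz,
			   uint64_t hw_timeout_ms)
{
	return nmi_period_ms(cycles, cur_khz) < hw_timeout_ms;
}
```

With illustrative numbers: a count sized for 10 seconds at 2 GHz takes 10
seconds at 2 GHz but 25 seconds at 800 MHz. Against a made-up 15 second
hardware watchdog timeout, the detector wins the race at full speed and
loses it once the cpu is clocked down.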

I don't have a real issue with his patch.  I was just concerned about how
often the changes occur (10-15 times a second seems like a lot).

Cheers,
Don

