* introduce NMI_AUTO as nmi_watchdog option
@ 2010-01-11 19:16 Don Zickus
2010-01-11 20:27 ` Cyrill Gorcunov
0 siblings, 1 reply; 10+ messages in thread
From: Don Zickus @ 2010-01-11 19:16 UTC (permalink / raw)
To: mingo; +Cc: aris, linux-kernel
Hi Ingo,
To dig up an old thread from last November:
======
* Aristeu Rozanski <aris@redhat.com> wrote:
> > > > > NMI_AUTO is a new nmi_watchdog option that makes LAPIC be tried first
> > > > > and if the CPU isn't supported, IOAPIC will be used. It's useful in
> > > > > cases where NMI watchdog is enabled by default in a kernel built for
> > > > > different machines. It can be configured by default or selected with
> > > > > nmi_watchdog=3 or nmi_watchdog=auto parameters.
> > > >
> > > > What i'd like to see for the NMI watchdog is much more ambitious than
> > > > this: the use of perf events to run a periodic NMI callback.
> > > >
> > > > The NMI watchdog would cause the creation of a per-cpu perf_event
> > > > structure (in-kernel). All x86 CPUs that have perf event support (the
> > > > majority of them) will thus be able to have an NMI watchdog using a
> > > > nice, generic piece of code and we'd be able to phase out the open-coded
> > > > NMI watchdog code.
> > > >
> > > > The user would not notice much from this: we'd still have the
> > > > /proc/sys/kernel/nmi_watchdog toggle to turn it on/off, and we'd still
> > > > have the nmi_watchdog= boot parameter as well. But the underlying
> > > > implementation would be far more generic and far more usable than the
> > > > current code.
> > > >
> > > > Would you be interested in moving the NMI watchdog code in this
> > > > direction? Most of the perf events changes (callbacks, helpers for
> > > > in-kernel event allocations, etc.) are in latest -tip already, so you
> > > > could use that as a base.
> > >
> > > but that would work only for LAPIC. You're suggesting killing IOAPIC
> > > mode too?
> >
> > Would it be a big loss, with all modern systems expected to have a
> > working lapic based NMI source? I wrote the IOAPIC mode originally but i
> > dont feel too attached to it ;-)
>
> ok, fair enough. but since it'll be another implementation, do you
> mind applying the patches I submitted so they can be used until the
> new implementation is in place?
For that i need to see at least an RFC v1 version series of the new
implementation - otherwise we might end up sitting on this interim
version with no-one doing the better variant.
========
I was going to jump in and try to do this work. I wanted to make sure
what you were looking for here. When you say convert nmi watchdog to perf
events, I assume you mean merging over the bits of perfctr-watchdog.c to
perf_events.c, modify nmi.c to just register as a normal perf event and
probably cleanup the oprofile stuff to match, correct?
Cheers,
Don
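For readers following along, the NMI_AUTO behaviour described in the quoted
patch summary (try LAPIC first, fall back to IOAPIC when the CPU is not
supported) could be sketched roughly as below. This is only an illustration
and not the submitted patch: the function names, the mode numbering for 0-2,
and the cpu_supports_lapic_nmi flag are assumptions made for the example.
Only the values "3" and "auto" come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed numbering: 0-2 are illustrative, 3 is the proposed NMI_AUTO. */
enum nmi_mode {
    NMI_NONE = 0,
    NMI_IO_APIC = 1,
    NMI_LOCAL_APIC = 2,
    NMI_AUTO = 3
};

/* Hypothetical parser for the value part of "nmi_watchdog=<value>". */
enum nmi_mode parse_nmi_watchdog(const char *val)
{
    assert(val != NULL);
    if (strcmp(val, "auto") == 0 || strcmp(val, "3") == 0)
        return NMI_AUTO;
    if (strcmp(val, "2") == 0)
        return NMI_LOCAL_APIC;
    if (strcmp(val, "1") == 0)
        return NMI_IO_APIC;
    return NMI_NONE; /* "0" and anything unrecognised disable the watchdog */
}

/*
 * Resolve the requested mode to the one actually used. Only NMI_AUTO is
 * rewritten: LAPIC is preferred, IOAPIC is the fallback.
 */
enum nmi_mode resolve_nmi_mode(enum nmi_mode requested,
                               bool cpu_supports_lapic_nmi)
{
    if (requested != NMI_AUTO)
        return requested;
    return cpu_supports_lapic_nmi ? NMI_LOCAL_APIC : NMI_IO_APIC;
}
```

With this shape, a kernel built for many different machines could default to
NMI_AUTO and let each machine settle on whichever source it supports.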
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-11 19:16 introduce NMI_AUTO as nmi_watchdog option Don Zickus
@ 2010-01-11 20:27 ` Cyrill Gorcunov
2010-01-11 20:33 ` Don Zickus
0 siblings, 1 reply; 10+ messages in thread
From: Cyrill Gorcunov @ 2010-01-11 20:27 UTC (permalink / raw)
To: Don Zickus; +Cc: mingo, aris, linux-kernel
On Mon, Jan 11, 2010 at 02:16:33PM -0500, Don Zickus wrote:
> Hi Ingo,
>
...
> I was going to jump in and try to do this work. I wanted to make sure
> what you were looking for here. When you say convert nmi watchdog to perf
> events, I assume you mean merging over the bits of perfctr-watchdog.c to
> perf_events.c, modify nmi.c to just register as a normal perf event and
> probably cleanup the oprofile stuff to match, correct?
>
> Cheers,
> Don
>
As far as I know -- converting perfctr-watchdog.c into perfevents
style would be quite a desirable feature. But I still didn't manage to
find time for this task :( If you're interested to start this work
-- that would be just great!
-- Cyrill
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-11 20:27 ` Cyrill Gorcunov
@ 2010-01-11 20:33 ` Don Zickus
2010-01-11 20:51 ` Cyrill Gorcunov
2010-01-13 9:32 ` Ingo Molnar
0 siblings, 2 replies; 10+ messages in thread
From: Don Zickus @ 2010-01-11 20:33 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: mingo, aris, linux-kernel
On Mon, Jan 11, 2010 at 11:27:29PM +0300, Cyrill Gorcunov wrote:
> On Mon, Jan 11, 2010 at 02:16:33PM -0500, Don Zickus wrote:
> > Hi Ingo,
> >
> ...
> > I was going to jump in and try to do this work. I wanted to make sure
> > what you were looking for here. When you say convert nmi watchdog to perf
> > events, I assume you mean merging over the bits of perfctr-watchdog.c to
> > perf_events.c, modify nmi.c to just register as a normal perf event and
> > probably cleanup the oprofile stuff to match, correct?
> >
> > Cheers,
> > Don
> >
>
> As far as I know -- converting perfctr-watchdog.c into perfevents
> style would be quite a desirable feature. But I still didn't manage to
> find time for this task :( If you're interested to start this work
> -- that would be just great!
After looking through the code I just had some questions, perhaps you have
thought about this longer than me, what to do with the reservation code
(just remove it I assume and let perf_events _be_ the only code that
handles perf events) and what to do with some of the cpu quirks as noted
in perfctr-watchdog.c (notably some of the Intel errata for the Core
chipsets).
Cheers,
Don
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-11 20:33 ` Don Zickus
@ 2010-01-11 20:51 ` Cyrill Gorcunov
2010-01-13 9:32 ` Ingo Molnar
1 sibling, 0 replies; 10+ messages in thread
From: Cyrill Gorcunov @ 2010-01-11 20:51 UTC (permalink / raw)
To: Don Zickus; +Cc: mingo, aris, linux-kernel, Frederic Weisbecker
On Mon, Jan 11, 2010 at 03:33:56PM -0500, Don Zickus wrote:
> On Mon, Jan 11, 2010 at 11:27:29PM +0300, Cyrill Gorcunov wrote:
> > On Mon, Jan 11, 2010 at 02:16:33PM -0500, Don Zickus wrote:
> > > Hi Ingo,
> > >
> > ...
> > > I was going to jump in and try to do this work. I wanted to make sure
> > > what you were looking for here. When you say convert nmi watchdog to perf
> > > events, I assume you mean merging over the bits of perfctr-watchdog.c to
> > > perf_events.c, modify nmi.c to just register as a normal perf event and
> > > probably cleanup the oprofile stuff to match, correct?
> > >
> > > Cheers,
> > > Don
> > >
> >
> > As far as I know -- converting perfctr-watchdog.c into perfevents
> > style would be quite a desirable feature. But I still didn't manage to
> > find time for this task :( If you're interested to start this work
> > -- that would be just great!
>
> After looking through the code I just had some questions, perhaps you have
> thought about this longer than me, what to do with the reservation code
> (just remove it I assume and let perf_events _be_ the only code that
> handles perf events) and what to do with some of the cpu quirks as noted
> in perfctr-watchdog.c (notably some of the Intel errata for the Core
> chipsets).
>
> Cheers,
> Don
>
Hi Don,
well I must admit I didn't look too closely at this code (if I had I would
have sent some patch for review at least :). But it was suggested that I take
a look at hw_breakpoint.c (Frederic worked on it iirc, CC'ed) as an example
of perfevent'ed code. So converting to perf events is not a trivial task
and I fear I can't give any useful advice at the moment since, as I said, I
didn't manage to find time for this task and as a result didn't read the code
byte by byte, sorry. But if I get some idea -- will share!
-- Cyrill
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-11 20:33 ` Don Zickus
2010-01-11 20:51 ` Cyrill Gorcunov
@ 2010-01-13 9:32 ` Ingo Molnar
2010-01-13 13:13 ` Peter Zijlstra
2010-01-13 16:23 ` Don Zickus
1 sibling, 2 replies; 10+ messages in thread
From: Ingo Molnar @ 2010-01-13 9:32 UTC (permalink / raw)
To: Don Zickus; +Cc: Cyrill Gorcunov, aris, linux-kernel
* Don Zickus <dzickus@redhat.com> wrote:
> On Mon, Jan 11, 2010 at 11:27:29PM +0300, Cyrill Gorcunov wrote:
> > On Mon, Jan 11, 2010 at 02:16:33PM -0500, Don Zickus wrote:
> > > Hi Ingo,
> > >
> > ...
> > > I was going to jump in and try to do this work. I wanted to make sure
> > > what you were looking for here. When you say convert nmi watchdog to perf
> > > events, I assume you mean merging over the bits of perfctr-watchdog.c to
> > > perf_events.c, modify nmi.c to just register as a normal perf event and
> > > probably cleanup the oprofile stuff to match, correct?
> > >
> > > Cheers,
> > > Don
> > >
> >
> > As far as I know -- converting perfctr-watchdog.c into perfevents
> > style would be quite a desirable feature. But I still didn't manage to
> > find time for this task :( If you're interested to start this work
> > -- that would be just great!
>
> After looking through the code I just had some questions, perhaps you have
> thought about this longer than me, what to do with the reservation code
> (just remove it I assume and let perf_events _be_ the only code that
> handles perf events) and what to do with some of the cpu quirks as noted in
> perfctr-watchdog.c (notably some of the Intel errata for the Core chipsets).
Given the number of quirks in the perfctr code it might make sense to shape
this as a new feature initially: introduce a new NMI watchdog that is perf
based and has a different codebase.
Then, once it's capable enough and has been in circulation long enough we can
simply drop the old NMI watchdog. (without users noticing anything [modulo
bugs])
v1 should concentrate on x86 CPUs that are supported by perf currently. Note,
it _might_ make sense to do it via a new kernel/nmi_watchdog.c file - other
architectures have NMI concepts as well, such as Sparc64. A further idea would
be to maybe even merge it with the softlockup code in kernel/softlockup.c - so
that we dont have two sets of apis like touch_nmi_watchdog and
touch_softlockup_watchdog.
So there's a wide spectrum of possibilities - the important thing is to start
small :-)
Ingo
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-13 9:32 ` Ingo Molnar
@ 2010-01-13 13:13 ` Peter Zijlstra
2010-01-13 16:25 ` Don Zickus
2010-01-13 16:35 ` Ingo Molnar
2010-01-13 16:23 ` Don Zickus
1 sibling, 2 replies; 10+ messages in thread
From: Peter Zijlstra @ 2010-01-13 13:13 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Don Zickus, Cyrill Gorcunov, aris, linux-kernel
On Wed, 2010-01-13 at 10:32 +0100, Ingo Molnar wrote:
> other architectures have NMI concepts as well, such as Sparc64.
I think both sparc64 and ppc64 fake NMIs by playing games with hw IRQ
priorities and partial masks. But yes.
One interesting 'feature' for the perf-nmi interaction is creating an
idle scheduling class for counters, because as long as there is a
counter present you can use its NMIs to drive the watchdog, but as soon
as there are none left, you need to install one.
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-13 9:32 ` Ingo Molnar
2010-01-13 13:13 ` Peter Zijlstra
@ 2010-01-13 16:23 ` Don Zickus
1 sibling, 0 replies; 10+ messages in thread
From: Don Zickus @ 2010-01-13 16:23 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Cyrill Gorcunov, aris, linux-kernel
On Wed, Jan 13, 2010 at 10:32:40AM +0100, Ingo Molnar wrote:
> > After looking through the code I just had some questions, perhaps you have
> > thought about this longer than me, what to do with the reservation code
> > (just remove it I assume and let perf_events _be_ the only code that
> > handles perf events) and what to do with some of the cpu quirks as noted in
> > perfctr-watchdog.c (notably some of the Intel errata for the Core chipsets).
>
> Given the number of quirks in the perfctr code it might make sense to shape
> this as a new feature initially: introduce a new NMI watchdog that is perf
> based and has a different codebase.
>
> Then, once it's capable enough and has been in circulation long enough we can
> simply drop the old NMI watchdog. (without users noticing anything [modulo
> bugs])
>
> v1 should concentrate on x86 CPUs that are supported by perf currently. Note,
> it _might_ make sense to do it via a new kernel/nmi_watchdog.c file - other
> architectures have NMI concepts as well, such as Sparc64. A further idea would
> be to maybe even merge it with the softlockup code in kernel/softlockup.c - so
> that we dont have two sets of apis like touch_nmi_watchdog and
> touch_softlockup_watchdog.
Ok, interesting. Right now I am working on making sure I know how to
register something with the perf event framework (from kernel space).
Once I can do that, I'll expand it outward and see where it goes. :-)
>
> So there's a wide spectrum of possibilities - the important thing is to start
> small :-)
I see. Thanks.
Cheers,
Don
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-13 13:13 ` Peter Zijlstra
@ 2010-01-13 16:25 ` Don Zickus
2010-01-13 16:42 ` Peter Zijlstra
2010-01-13 16:35 ` Ingo Molnar
1 sibling, 1 reply; 10+ messages in thread
From: Don Zickus @ 2010-01-13 16:25 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Ingo Molnar, Cyrill Gorcunov, aris, linux-kernel
On Wed, Jan 13, 2010 at 02:13:42PM +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-13 at 10:32 +0100, Ingo Molnar wrote:
> > other architectures have NMI concepts as well, such as Sparc64.
>
> I think both sparc64 and ppc64 fake NMIs by playing games with hw IRQ
> priorities and partial masks. But yes.
>
> One interesting 'feature' for the perf-nmi interaction is creating an
> idle scheduling class for counters, because as long as there is a
> counter present you can use its NMIs to drive the watchdog, but as soon
> as there are none left, you need to install one.
Interesting idea. How can I guarantee the frequency of the NMI I want to
piggyback off of? A breakpoint that takes an hour to trigger may not be
the best NMI to use? Then again I am still trying to understand the perf
event code a little better.
Cheers,
Don
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-13 13:13 ` Peter Zijlstra
2010-01-13 16:25 ` Don Zickus
@ 2010-01-13 16:35 ` Ingo Molnar
1 sibling, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2010-01-13 16:35 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Don Zickus, Cyrill Gorcunov, aris, linux-kernel
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, 2010-01-13 at 10:32 +0100, Ingo Molnar wrote:
> > other architectures have NMI concepts as well, such as Sparc64.
>
> I think both sparc64 and ppc64 fake NMIs by playing games with hw IRQ
> priorities and partial masks. But yes.
>
> One interesting 'feature' for the perf-nmi interaction is creating an idle
> scheduling class for counters, because as long as there is a counter present
> you can use its NMIs to drive the watchdog, but as soon as there are none
> left, you need to install one.
Yeah. I'd suggest to not complicate things with that initially - but to simply
create a standalone event for it and 'waste' a counter on NMI generation.
Later on it can indeed be a good feature to make the NMI watchdog 'seamless'
in the sense of it not causing any wasted hw resources - it can piggyback on
any existing NMI event. (as long as that event is at least ~1 HZ strong or so)
Ingo
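As a rough illustration of the piggyback condition above (reuse an existing
NMI event only if it fires at about 1 Hz or more, otherwise 'waste' a counter
on a standalone event), one could imagine a check like the following. It is a
user-space sketch under assumptions: struct nmi_event, its fields and both
function names are hypothetical, and the expected event rate is taken as a
given input rather than something perf reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum NMI rate the watchdog needs; "~1 Hz or so" in the mail above. */
#define WATCHDOG_MIN_NMI_HZ 1u

/* Hypothetical description of an already-active NMI-generating event. */
struct nmi_event {
    uint64_t events_per_sec; /* assumed expected rate of the counted event */
    uint64_t sample_period;  /* events counted between two NMIs */
};

/* True if this event alone produces NMIs often enough for the watchdog. */
bool event_can_drive_watchdog(const struct nmi_event *ev)
{
    assert(ev != NULL);
    if (ev->sample_period == 0)
        return false;
    return ev->events_per_sec / ev->sample_period >= WATCHDOG_MIN_NMI_HZ;
}

/* True if no active event qualifies, so a standalone event is required. */
bool watchdog_needs_own_event(const struct nmi_event *evs, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (event_can_drive_watchdog(&evs[i]))
            return false;
    return true;
}
```

For example, a cycles event on a 2 GHz CPU with a sample period of 10^9
yields about two NMIs per second and would qualify, while an event expected
once an hour would not.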
* Re: introduce NMI_AUTO as nmi_watchdog option
2010-01-13 16:25 ` Don Zickus
@ 2010-01-13 16:42 ` Peter Zijlstra
0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2010-01-13 16:42 UTC (permalink / raw)
To: Don Zickus; +Cc: Ingo Molnar, Cyrill Gorcunov, aris, linux-kernel
On Wed, 2010-01-13 at 11:25 -0500, Don Zickus wrote:
> On Wed, Jan 13, 2010 at 02:13:42PM +0100, Peter Zijlstra wrote:
> > On Wed, 2010-01-13 at 10:32 +0100, Ingo Molnar wrote:
> > > other architectures have NMI concepts as well, such as Sparc64.
> >
> > I think both sparc64 and ppc64 fake NMIs by playing games with hw IRQ
> > priorities and partial masks. But yes.
> >
> > One interesting 'feature' for the perf-nmi interaction is creating an
> > idle scheduling class for counters, because as long as there is a
> > counter present you can use its NMIs to drive the watchdog, but as soon
> > as there are none left, you need to install one.
>
> Interesting idea. How can I guarantee the frequency of the NMI I want to
> piggyback off of? A breakpoint that takes an hour to trigger may not be
> the best NMI to use? Then again I am still trying to understand the perf
> event code a little better.
You could play games with the period, we can handle getting more NMIs
than are needed. This is how we implement a period larger than the
physical counter for example.
But yeah, it's a tricky game since a tight loop might never generate the
event we're counting.. we could limit this to things like
cycles/ins/bus-cycles etc.. those will always tick.
Anyway, it's all an optimization, the simple/first implementation would
simply install a kernel cpu perf counter and hook the overflow handler.