From: Randy Dunlap <rdunlap-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
To: Ricardo Neri
<ricardo.neri-calderon-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
Cc: "Rafael J. Wysocki"
<rafael.j.wysocki-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
Peter Zijlstra <peterz-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>,
Alexei Starovoitov <ast-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Kai-Heng Feng
<kai.heng.feng-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>,
"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>,
sparclinux-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Christoffer Dall <cdall-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>,
Davidlohr Bueso <dave-h16yJtLeMjHk1uMJSBkQmQ@public.gmane.org>,
Ashok Raj <ashok.raj-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
Michael Ellerman <mpe-Gsx/Oe8HsFggBc27wqDAHg@public.gmane.org>,
x86-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
David Rientjes <rientjes-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
Andi Kleen <andi.kleen-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
Waiman Long <longman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
Borislav Petkov <bp-l3A5Bk7waGM@public.gmane.org>,
Masami Hiramatsu
<mhiramat-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Don Zickus <dzickus-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
"Ravi V. Shankar"
<ravi.v.shankar-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
Konrad Rzeszutek Wilk
<konrad.wilk-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
Marc Zyngier <marc.zyngier-5wv7dgnIgG8@public.gmane.org>,
Frederic Weisbecker
<frederic-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Nicholas Piggin <npiggin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI
Date: Tue, 19 Jun 2018 17:25:09 -0700
Message-ID: <c040d4e7-f33b-9c93-6f5a-6bf960943e36@infradead.org>
In-Reply-To: <20180620001535.GA27531@voyager>

On 06/19/2018 05:15 PM, Ricardo Neri wrote:
> On Sat, Jun 16, 2018 at 03:24:49PM +0200, Thomas Gleixner wrote:
>> On Fri, 15 Jun 2018, Ricardo Neri wrote:
>>> On Fri, Jun 15, 2018 at 11:19:09AM +0200, Thomas Gleixner wrote:
>>>> On Thu, 14 Jun 2018, Ricardo Neri wrote:
>>>>> Alternatively, there could be a counter that skips reading the HPET status
>>>>> register (and the detection of hardlockups) for every X NMIs. This would
>>>>> reduce the overall frequency of HPET register reads.
>>>>
>>>> Great plan. So if the watchdog is the only NMI (because perf is off) then
>>>> you delay the watchdog detection by that count.
>>>
>>> OK. This was a bad idea. Then, is it acceptable to have a read to an HPET
>>> register per NMI just to check in the status register if the HPET timer
>>> caused the NMI?
>>
>> The status register is useless in case of MSI. MSI is edge triggered ....
>>
>> The only register which gives you proper information is the counter
>> register itself. That adds a massive overhead to each NMI, because the
>> counter register access is synchronized to the HPET clock with hardware
>> magic. Plus on larger systems, the HPET access is cross node and even
>> slower.
>
> It starts to sound like the HPET is too slow to drive the hardlockup detector.
>
> Would it be possible to envision a variant of this implementation? In this
> variant, the HPET only targets a single CPU. The actual hardlockup detector
> is implemented by this single CPU sending interprocessor interrupts to the
> rest of the CPUs.
>
> In this manner only one CPU has to deal with the slowness of the HPET; the
> rest of the CPUs don't have to read or write any HPET registers. A sysfs
> entry could be added to configure which CPU will have to deal with the HPET
> timer. However, profiling could not be done accurately on such a CPU.
Please forgive my simple question:
What happens when this one CPU is the one that locks up?
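
To make the concern concrete, here is a toy Python model. It is not taken from the patch series: the function name and its parameters are made up for illustration, and it assumes the worst case, in which a locked-up monitoring CPU never runs its timer handler. Whether that actually holds depends on the kind of lockup and on how the HPET NMI is delivered.

```python
def reported_lockups(num_cpus, monitor_cpu, locked_up):
    """Return the CPUs that a toy single-monitor scheme would report.

    Hypothetical model: only monitor_cpu is targeted by the HPET timer, and
    it checks every other CPU by sending it an interprocessor interrupt.
    """
    if monitor_cpu in locked_up:
        # Assumption (worst case): a locked-up monitor never runs its timer
        # handler, so no IPIs are sent and nothing is reported at all.
        return set()
    # The monitor pings every other CPU; the locked-up ones are reported.
    others = set(range(num_cpus)) - {monitor_cpu}
    return others & set(locked_up)


print(reported_lockups(4, 0, {2}))     # monitor healthy: CPU 2 is reported
print(reported_lockups(4, 0, {0, 2}))  # monitor locked up: nothing is reported
```

Under that assumption, the second call returns an empty set even though two CPUs are stuck, which is the single point of failure the question is about.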
thnx,
--
~Randy