linux-arm-kernel.lists.infradead.org archive mirror
From: dzickus@redhat.com (Don Zickus)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH] hardlockup: detect hard lockups without NMIs using secondary cpus
Date: Thu, 10 Jan 2013 13:17:51 -0500
Message-ID: <20130110181751.GR88797@redhat.com>
In-Reply-To: <CAMbhsRT7q+DSKOdPMtUqtPZJrB_z-ixmv09TkT2ZweUJGXjkYg@mail.gmail.com>

On Thu, Jan 10, 2013 at 09:27:28AM -0800, Colin Cross wrote:
> On Thu, Jan 10, 2013 at 6:02 AM, Don Zickus <dzickus@redhat.com> wrote:
> > On Wed, Jan 09, 2013 at 05:57:39PM -0800, Colin Cross wrote:
> >> Emulate NMIs on systems where they are not available by using timer
> >> interrupts on other cpus.  Each cpu will use its softlockup hrtimer
> >> to check that the next cpu is processing hrtimer interrupts by
> >> verifying that a counter is increasing.
> >>
> >> This patch is useful on systems where the hardlockup detector is not
> >> available due to a lack of NMIs, for example most ARM SoCs.
> >
> > I have seen other cpus, like Sparc I think, create a 'virtual NMI' by
> > reserving an IRQ line as 'special' (can not be masked).  Not sure if that
> > is something worth looking at here (or even possible).
> >
> >> Without this patch any cpu stuck with interrupts disabled can
> >> cause a hardware watchdog reset with no debugging information,
> >> but with this patch the kernel can detect the lockup and panic,
> >> which can result in useful debugging info.
> >
> > <SNIP>
> >> +#ifdef CONFIG_HARDLOCKUP_DETECTOR_OTHER_CPU
> >> +static int is_hardlockup_other_cpu(int cpu)
> >> +{
> >> +     unsigned long hrint = per_cpu(hrtimer_interrupts, cpu);
> >> +
> >> +     if (per_cpu(hrtimer_interrupts_saved, cpu) == hrint)
> >> +             return 1;
> >> +
> >> +     per_cpu(hrtimer_interrupts_saved, cpu) = hrint;
> >> +     return 0;
> >
> > Will this race with the other cpu you are checking?  For example if cpuA
> > just updated its hrtimer_interrupts_saved and cpuB goes to check cpuA's
> > hrtimer_interrupts_saved, it seems possible that cpuB could falsely assume
> > cpuA is stuck?
> 
> cpuA doesn't update its own hrtimer_interrupts_saved, cpuB does.
> However, there may be a similar race condition during hotplug if cpuB
> updates hrtimer_interrupts_saved for cpuA, then goes offline, then
> cpuC may try to check cpuA and see that hrtimer_interrupts_saved ==
> hrtimer_interrupts.  I think this can be solved by setting
> watchdog_nmi_touch for the next cpu when a cpu goes online or offline.

Ah, that is where my misunderstanding was.  I overlooked the fact that it
was only updated by the other cpu.  Sorry about that.
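
To spell out the ownership rule for anyone following along, here is a
minimal userspace sketch, assuming plain arrays stand in for the per-cpu
variables.  NR_CPUS_SIM, timer_tick() and the array layout are
illustrative stand-ins of mine, not code from the patch; only the body of
the check mirrors the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SIM 4	/* hypothetical cpu count for this simulation */

/* Plain arrays standing in for the per-cpu variables in the quoted hunk. */
static unsigned long hrtimer_interrupts[NR_CPUS_SIM];
static unsigned long hrtimer_interrupts_saved[NR_CPUS_SIM];

/* Runs on `cpu` itself: the only writer of hrtimer_interrupts[cpu]. */
static void timer_tick(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	hrtimer_interrupts[cpu]++;
}

/*
 * Runs on the cpu doing the checking, never on `cpu` itself: the only
 * writer of hrtimer_interrupts_saved[cpu].  Reports a lockup when the
 * counter has not moved since the previous check.
 */
static bool is_hardlockup_other_cpu(int cpu)
{
	unsigned long hrint;

	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	hrint = hrtimer_interrupts[cpu];

	if (hrtimer_interrupts_saved[cpu] == hrint)
		return true;

	hrtimer_interrupts_saved[cpu] = hrint;
	return false;
}
```

In this model the checked cpu only ever calls timer_tick(), so it cannot
overwrite the saved value behind the checker's back, which is the case I
was worried about.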

I'll review it again with that in mind.
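
The hotplug case you describe can be modelled the same way.  This is a
rough userspace sketch under my own assumptions: the arrays, the helper
names and the "skip one check when touched" behaviour are hypothetical,
and I have not verified that this matches how watchdog_nmi_touch would
actually be consumed in the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SIM 4	/* hypothetical cpu count for this simulation */

static unsigned long interrupts[NR_CPUS_SIM];		/* bumped by the cpu itself */
static unsigned long interrupts_saved[NR_CPUS_SIM];	/* written by its checker */
static bool nmi_touch[NR_CPUS_SIM];			/* models watchdog_nmi_touch */

/* Runs on `cpu` itself from its timer interrupt. */
static void timer_tick(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	interrupts[cpu]++;
}

/* Assumed hotplug hook: the checker of `cpu` is about to change. */
static void touch_on_hotplug(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	nmi_touch[cpu] = true;
}

/* Runs on whichever cpu is currently responsible for checking `cpu`. */
static bool check_other_cpu(int cpu)
{
	unsigned long seen;

	assert(cpu >= 0 && cpu < NR_CPUS_SIM);

	/* A touched cpu gets one free pass, so a stale saved value is ignored. */
	if (nmi_touch[cpu]) {
		nmi_touch[cpu] = false;
		return false;
	}

	seen = interrupts[cpu];
	if (interrupts_saved[cpu] == seen)
		return true;

	interrupts_saved[cpu] = seen;
	return false;
}
```

Without the touch, a new checker that runs right after the old one saved
the counter would see saved == current and report a lockup on a healthy
cpu.  With it, the first check after the hand-over is skipped, and a cpu
that is really stuck is still reported on the following check.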

Cheers,
Don
