From: Frederic Weisbecker <fweisbec@gmail.com>
To: Don Zickus <dzickus@redhat.com>
Cc: mingo@elte.hu, peterz@infradead.org, gorcunov@gmail.com,
aris@redhat.com, linux-kernel@vger.kernel.org,
randy.dunlap@oracle.com
Subject: Re: [PATCH 1/8] [watchdog] combine nmi_watchdog and softlockup
Date: Wed, 28 Apr 2010 14:36:54 +0200
Message-ID: <20100428123645.GA12017@nowhere>
In-Reply-To: <1272039216-8890-2-git-send-email-dzickus@redhat.com>
On Fri, Apr 23, 2010 at 12:13:29PM -0400, Don Zickus wrote:
> +void watchdog_overflow_callback(struct perf_event *event, int nmi,
> + struct perf_sample_data *data,
> + struct pt_regs *regs)
> +{
> + int this_cpu = smp_processor_id();
> + unsigned long touch_ts = per_cpu(watchdog_touch_ts, this_cpu);
> + char warn = __get_cpu_var(watchdog_warn);
> +
> + if (touch_ts == 0) {
> + __touch_watchdog();
> + return;
> + }
> +
> + /* check for a hardlockup
> + * This is done by making sure our timer interrupt
> + * is incrementing. The timer interrupt should have
> + * fired multiple times before we overflow'd. If it hasn't
> + * then this is a good indication the cpu is stuck
> + */
> + if (is_hardlockup(this_cpu)) {
> + /* only print hardlockups once */
> + if (warn & HARDLOCKUP)
> + return;
> +
> + if (hardlockup_panic)
> + panic("Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> + else
> + WARN(1, "Watchdog detected hard LOCKUP on cpu %d", this_cpu);
> +
> + __get_cpu_var(watchdog_warn) = warn | HARDLOCKUP;
> + return;
> + }
> +
> + __get_cpu_var(watchdog_warn) = warn & ~HARDLOCKUP;
> + return;
> +}
[...]
> +static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> +{
> + int this_cpu = smp_processor_id();
> + unsigned long touch_ts = __get_cpu_var(watchdog_touch_ts);
> + char warn = __get_cpu_var(watchdog_warn);
> + struct pt_regs *regs = get_irq_regs();
> + int duration;
> +
> + /* kick the hardlockup detector */
> + watchdog_interrupt_count();
> +
> + /* kick the softlockup detector */
> + wake_up_process(__get_cpu_var(softlockup_watchdog));
> +
> + /* .. and repeat */
> + hrtimer_forward_now(hrtimer, ns_to_ktime(get_sample_period()));
> +
> + if (touch_ts == 0) {
> + __touch_watchdog();
> + return HRTIMER_RESTART;
> + }
> +
> + /* check for a softlockup
> + * This is done by making sure a high priority task is
> + * being scheduled. The task touches the watchdog to
> + * indicate it is getting cpu time. If it hasn't then
> + * this is a good indication some task is hogging the cpu
> + */
> + duration = is_softlockup(touch_ts, this_cpu);
> + if (unlikely(duration)) {
> + /* only warn once */
> + if (warn & SOFTLOCKUP)
> + return HRTIMER_RESTART;
> +
> + printk(KERN_ERR "BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
> + this_cpu, duration,
> + current->comm, task_pid_nr(current));
> + print_modules();
> + print_irqtrace_events(current);
> + if (regs)
> + show_regs(regs);
> + else
> + dump_stack();
> +
> + if (softlockup_panic)
> + panic("softlockup: hung tasks");
> + __get_cpu_var(watchdog_warn) = warn | SOFTLOCKUP;
> + } else
> + __get_cpu_var(watchdog_warn) = warn & ~SOFTLOCKUP;
Note that these watchdog_warn modifications race against the HARDLOCKUP
updates done from the NMI: this read-modify-write works on a copy taken
earlier, so it might clear the bit the NMI has just set.
The race is harmless enough that I don't think we need to care much, but
that's why it would have made sense to keep separate watchdog_warn
tracking space for the two.
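
To make the interleaving concrete, here is a minimal userspace sketch, not
kernel code. It replays by hand the order in which the hrtimer handler and
the NMI would touch the flags. The flag values and the names
shared_flags_race(), split_flags(), soft_warned and hard_warned are
hypothetical, made up for this illustration; only watchdog_warn, SOFTLOCKUP
and HARDLOCKUP come from the patch.

```c
#define SOFTLOCKUP 1	/* assumed bit values, for illustration only */
#define HARDLOCKUP 2

/* Stand-in for the per-cpu watchdog_warn byte shared by both paths. */
static char watchdog_warn;

/* Hypothetical split tracking: each flag has a single writer. */
static char soft_warned;
static char hard_warned;

/* Shared byte: returns the final flags after the racy interleaving. */
static char shared_flags_race(void)
{
	char timer_copy, nmi_copy;

	watchdog_warn = 0;

	/* hrtimer handler starts and snapshots the flags */
	timer_copy = watchdog_warn;

	/* NMI fires here, reports a hard lockup and records it */
	nmi_copy = watchdog_warn;
	watchdog_warn = nmi_copy | HARDLOCKUP;

	/* hrtimer handler resumes, sees no soft lockup and writes back
	 * its stale copy: the HARDLOCKUP bit is lost */
	watchdog_warn = timer_copy & ~SOFTLOCKUP;

	return watchdog_warn;
}

/* Split flags: the same interleaving, returns the hard lockup flag. */
static char split_flags(void)
{
	soft_warned = 0;
	hard_warned = 0;

	/* NMI fires in the middle of the hrtimer handler */
	hard_warned = 1;

	/* hrtimer handler only ever writes its own flag */
	soft_warned = 0;

	return hard_warned;
}
```

With the shared byte, shared_flags_race() ends with HARDLOCKUP cleared, so
the next NMI would warn again despite the "only print hardlockups once"
check. With one flag per detector, split_flags() keeps the hard lockup
flag set.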