From: Jet Chen <jet.chen@intel.com>
To: Viresh Kumar <viresh.kumar@linaro.org>
Cc: "Wu, Fengguang" <fengguang.wu@intel.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [clocksource] INFO: possible irq lock inversion dependency detected
Date: Wed, 09 Apr 2014 12:34:19 +0800 [thread overview]
Message-ID: <5344CDCB.20707@intel.com> (raw)
In-Reply-To: <CAKohpo=t2Bb7EO3J36i9iRn_nHbtPvwzBpswVSb8xQXT7kk03g@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 549 bytes --]
On 04/09/2014 12:25 PM, Viresh Kumar wrote:
> On 9 April 2014 06:51, Jet Chen <jet.chen@intel.com> wrote:
>> spin_lock_irqsave() does fix this issue.
>>
>> Tested-by: Jet Chen <jet.chen@intel.com>
>
> Thanks a lot :)
>
Welcome.
> How did you got this in cc list ?
>
> "abd38155f8293923de5953cc063f9e2d7ecb3f04.1396679170.git.viresh.kumar@linaro.org"
> <abd38155f8293923de5953cc063f9e2d7ecb3f04.1396679170.git.viresh.kumar@linaro.org>
>
I got it from the patch you sent to me before. attach it again.
Apologizes if it's improper to cc this list.
[-- Attachment #2: 0001-clocksource-register-cpu-notifier-to-remove-timer-fr.patch --]
[-- Type: text/x-patch, Size: 6109 bytes --]
>From abd38155f8293923de5953cc063f9e2d7ecb3f04 Mon Sep 17 00:00:00 2001
Message-Id: <abd38155f8293923de5953cc063f9e2d7ecb3f04.1396679170.git.viresh.kumar@linaro.org>
From: Viresh Kumar <viresh.kumar@linaro.org>
Date: Sat, 5 Apr 2014 11:43:25 +0530
Subject: [PATCH] clocksource: register cpu notifier to remove timer from
dying CPU
clocksource core is using add_timer_on() to run clocksource_watchdog() on all
CPUs one by one. But when a core is brought down, clocksource core doesn't
remove this timer from the dying CPU. And in this case timer core gives this
(Gives this only with unmerged code, anyway in the current code as well timer
core is migrating a pinned timer to other CPUs, which is also wrong:
http://www.gossamer-threads.com/lists/linux/kernel/1898117)
migrate_timer_list: can't migrate pinned timer: ffffffff81f06a60,
timer->function: ffffffff810d7010,deactivating it Modules linked in:
CPU: 0 PID: 1932 Comm: 01-cpu-hotplug Not tainted 3.14.0-rc1-00088-gab3c4fd #4
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
0000000000000009 ffff88001d407c38 ffffffff817237bd ffff88001d407c80
ffff88001d407c70 ffffffff8106a1dd 0000000000000010 ffffffff81f06a60
ffff88001e04d040 ffffffff81e3d4c0 ffff88001e04d030 ffff88001d407cd0
Call Trace:
[<ffffffff817237bd>] dump_stack+0x4d/0x66
[<ffffffff8106a1dd>] warn_slowpath_common+0x7d/0xa0
[<ffffffff8106a24c>] warn_slowpath_fmt+0x4c/0x50
[<ffffffff810761c3>] ? __internal_add_timer+0x113/0x130
[<ffffffff810d7010>] ? clocksource_watchdog_kthread+0x40/0x40
[<ffffffff8107753b>] migrate_timer_list+0xdb/0xf0
[<ffffffff810782dc>] timer_cpu_notify+0xfc/0x1f0
[<ffffffff8173046c>] notifier_call_chain+0x4c/0x70
[<ffffffff8109340e>] __raw_notifier_call_chain+0xe/0x10
[<ffffffff8106a3f3>] cpu_notify+0x23/0x50
[<ffffffff8106a44e>] cpu_notify_nofail+0xe/0x20
[<ffffffff81712a5d>] _cpu_down+0x1ad/0x2e0
[<ffffffff81712bc4>] cpu_down+0x34/0x50
[<ffffffff813fec54>] cpu_subsys_offline+0x14/0x20
[<ffffffff813f9f65>] device_offline+0x95/0xc0
[<ffffffff813fa060>] online_store+0x40/0x90
[<ffffffff813f75d8>] dev_attr_store+0x18/0x30
[<ffffffff8123309d>] sysfs_kf_write+0x3d/0x50
This patch tries to fix this by registering cpu notifiers from clocksource core,
only when we start clocksource-watchdog. And if on the CPU_DEAD notification it
is found that dying CPU was the CPU on which this timer is queued on, then it is
removed from that CPU and queued to next CPU.
Reported-by: Jet Chen <jet.chen@intel.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
kernel/time/clocksource.c | 64 +++++++++++++++++++++++++++++++++++++++--------
1 file changed, 53 insertions(+), 11 deletions(-)
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index ba3e502..9e96853 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -23,16 +23,21 @@
* o Allow clocksource drivers to be unregistered
*/
+#include <linux/cpu.h>
#include <linux/device.h>
#include <linux/clocksource.h>
#include <linux/init.h>
#include <linux/module.h>
+#include <linux/notifier.h>
#include <linux/sched.h> /* for spin_unlock_irq() using preempt_count() m68k */
#include <linux/tick.h>
#include <linux/kthread.h>
#include "tick-internal.h"
+/* Tracks next CPU to queue watchdog timer on */
+static int timer_cpu;
+
void timecounter_init(struct timecounter *tc,
const struct cyclecounter *cc,
u64 start_tstamp)
@@ -246,12 +251,25 @@ void clocksource_mark_unstable(struct clocksource *cs)
spin_unlock_irqrestore(&watchdog_lock, flags);
}
+void queue_timer_on_next_cpu(void)
+{
+ /*
+ * Cycle through CPUs to check if the CPUs stay synchronized to each
+ * other.
+ */
+ timer_cpu = cpumask_next(timer_cpu, cpu_online_mask);
+ if (timer_cpu >= nr_cpu_ids)
+ timer_cpu = cpumask_first(cpu_online_mask);
+ watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
+ add_timer_on(&watchdog_timer, timer_cpu);
+}
+
static void clocksource_watchdog(unsigned long data)
{
struct clocksource *cs;
cycle_t csnow, wdnow;
int64_t wd_nsec, cs_nsec;
- int next_cpu, reset_pending;
+ int reset_pending;
spin_lock(&watchdog_lock);
if (!watchdog_running)
@@ -336,27 +354,50 @@ static void clocksource_watchdog(unsigned long data)
if (reset_pending)
atomic_dec(&watchdog_reset_pending);
- /*
- * Cycle through CPUs to check if the CPUs stay synchronized
- * to each other.
- */
- next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
- if (next_cpu >= nr_cpu_ids)
- next_cpu = cpumask_first(cpu_online_mask);
- watchdog_timer.expires += WATCHDOG_INTERVAL;
- add_timer_on(&watchdog_timer, next_cpu);
+ queue_timer_on_next_cpu();
out:
spin_unlock(&watchdog_lock);
}
+static int clocksource_cpu_notify(struct notifier_block *self,
+ unsigned long action, void *hcpu)
+{
+ long cpu = (long)hcpu;
+
+ spin_lock(&watchdog_lock);
+ if (!watchdog_running)
+ goto notify_out;
+
+ switch (action) {
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
+ if (cpu != timer_cpu)
+ break;
+ del_timer(&watchdog_timer);
+ queue_timer_on_next_cpu();
+ break;
+ }
+
+notify_out:
+ spin_unlock(&watchdog_lock);
+ return NOTIFY_OK;
+}
+
+static struct notifier_block clocksource_nb = {
+ .notifier_call = clocksource_cpu_notify,
+ .priority = 1,
+};
+
static inline void clocksource_start_watchdog(void)
{
if (watchdog_running || !watchdog || list_empty(&watchdog_list))
return;
+ timer_cpu = cpumask_first(cpu_online_mask);
+ register_cpu_notifier(&clocksource_nb);
init_timer(&watchdog_timer);
watchdog_timer.function = clocksource_watchdog;
watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL;
- add_timer_on(&watchdog_timer, cpumask_first(cpu_online_mask));
+ add_timer_on(&watchdog_timer, timer_cpu);
watchdog_running = 1;
}
@@ -365,6 +406,7 @@ static inline void clocksource_stop_watchdog(void)
if (!watchdog_running || (watchdog && !list_empty(&watchdog_list)))
return;
del_timer(&watchdog_timer);
+ unregister_cpu_notifier(&clocksource_nb);
watchdog_running = 0;
}
--
1.7.12.rc2.18.g61b472e
next prev parent reply other threads:[~2014-04-09 4:35 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-08 3:59 [clocksource] INFO: possible irq lock inversion dependency detected Jet Chen
2014-04-08 5:21 ` Viresh Kumar
2014-04-09 1:21 ` Jet Chen
2014-04-09 4:25 ` Viresh Kumar
2014-04-09 4:34 ` Jet Chen [this message]
2014-04-09 4:40 ` Viresh Kumar
2014-04-09 4:42 ` Jet Chen
-- strict thread matches above, loose matches on Subject: below --
2014-04-07 14:57 Jet Chen
2014-04-08 2:55 ` Fengguang Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5344CDCB.20707@intel.com \
--to=jet.chen@intel.com \
--cc=fengguang.wu@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=viresh.kumar@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).