From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: linux-tip-commits@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@redhat.com,
a.p.zijlstra@chello.nl, tglx@linutronix.de, mingo@elte.hu
Subject: [tip:core/locking] hrtimer: fix rq->lock inversion (again)
Date: Fri, 13 Mar 2009 14:57:28 GMT [thread overview]
Message-ID: <tip-84648567ee357abd8879257018f38f8b6b35334d@git.kernel.org> (raw)
In-Reply-To: <20090313112301.096138802@chello.nl>
Commit-ID: 84648567ee357abd8879257018f38f8b6b35334d
Gitweb: http://git.kernel.org/tip/84648567ee357abd8879257018f38f8b6b35334d
Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
AuthorDate: Fri, 13 Mar 2009 12:21:27 +0100
Commit: Ingo Molnar <mingo@elte.hu>
CommitDate: Fri, 13 Mar 2009 15:31:28 +0100
hrtimer: fix rq->lock inversion (again)
It appears I inadvertly introduced rq->lock recursion to the
hrtimer_start() path when I delegated running already expired
timers to softirq context.
This patch fixes it by introducing a __hrtimer_start_range_ns()
method that will not use raise_softirq_irqoff() but
__raise_softirq_irqoff() which avoids the wakeup.
It then also changes schedule() to check for pending softirqs and
do the wakeup then, I'm not quite sure I like this last bit, nor
am I convinced its really needed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
LKML-Reference: <20090313112301.096138802@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
include/linux/hrtimer.h | 5 ++++
include/linux/interrupt.h | 1 +
kernel/hrtimer.c | 55 +++++++++++++++++++++++++++-----------------
kernel/sched.c | 14 +++++++++--
kernel/softirq.c | 2 +-
5 files changed, 52 insertions(+), 25 deletions(-)
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index bd37078..0d2f7c8 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -336,6 +336,11 @@ extern int hrtimer_start(struct hrtimer *timer, ktime_t tim,
const enum hrtimer_mode mode);
extern int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
unsigned long range_ns, const enum hrtimer_mode mode);
+extern int
+__hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+ unsigned long delta_ns,
+ const enum hrtimer_mode mode, int wakeup);
+
extern int hrtimer_cancel(struct hrtimer *timer);
extern int hrtimer_try_to_cancel(struct hrtimer *timer);
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 169db98..663b8bc 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -274,6 +274,7 @@ extern void softirq_init(void);
#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
extern void raise_softirq_irqoff(unsigned int nr);
extern void raise_softirq(unsigned int nr);
+extern void wakeup_softirqd(void);
/* This is the worklist that queues up per-cpu softirq work.
*
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index f394d2a..cb8a15c 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -651,14 +651,20 @@ static inline void hrtimer_init_timer_hres(struct hrtimer *timer)
* and expiry check is done in the hrtimer_interrupt or in the softirq.
*/
static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
- struct hrtimer_clock_base *base)
+ struct hrtimer_clock_base *base,
+ int wakeup)
{
if (base->cpu_base->hres_active && hrtimer_reprogram(timer, base)) {
- spin_unlock(&base->cpu_base->lock);
- raise_softirq_irqoff(HRTIMER_SOFTIRQ);
- spin_lock(&base->cpu_base->lock);
+ if (wakeup) {
+ spin_unlock(&base->cpu_base->lock);
+ raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+ spin_lock(&base->cpu_base->lock);
+ } else
+ __raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+
return 1;
}
+
return 0;
}
@@ -703,7 +709,8 @@ static inline int hrtimer_is_hres_enabled(void) { return 0; }
static inline int hrtimer_switch_to_hres(void) { return 0; }
static inline void hrtimer_force_reprogram(struct hrtimer_cpu_base *base) { }
static inline int hrtimer_enqueue_reprogram(struct hrtimer *timer,
- struct hrtimer_clock_base *base)
+ struct hrtimer_clock_base *base,
+ int wakeup)
{
return 0;
}
@@ -886,20 +893,9 @@ remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base)
return 0;
}
-/**
- * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
- * @timer: the timer to be added
- * @tim: expiry time
- * @delta_ns: "slack" range for the timer
- * @mode: expiry mode: absolute (HRTIMER_ABS) or relative (HRTIMER_REL)
- *
- * Returns:
- * 0 on success
- * 1 when the timer was active
- */
-int
-hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, unsigned long delta_ns,
- const enum hrtimer_mode mode)
+int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+ unsigned long delta_ns, const enum hrtimer_mode mode,
+ int wakeup)
{
struct hrtimer_clock_base *base, *new_base;
unsigned long flags;
@@ -940,12 +936,29 @@ hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, unsigned long delta_n
* XXX send_remote_softirq() ?
*/
if (leftmost && new_base->cpu_base == &__get_cpu_var(hrtimer_bases))
- hrtimer_enqueue_reprogram(timer, new_base);
+ hrtimer_enqueue_reprogram(timer, new_base, wakeup);
unlock_hrtimer_base(timer, &flags);
return ret;
}
+
+/**
+ * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
+ * @timer: the timer to be added
+ * @tim: expiry time
+ * @delta_ns: "slack" range for the timer
+ * @mode: expiry mode: absolute (HRTIMER_ABS) or relative (HRTIMER_REL)
+ *
+ * Returns:
+ * 0 on success
+ * 1 when the timer was active
+ */
+int hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+ unsigned long delta_ns, const enum hrtimer_mode mode)
+{
+ return __hrtimer_start_range_ns(timer, tim, delta_ns, mode, 1);
+}
EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
/**
@@ -961,7 +974,7 @@ EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
int
hrtimer_start(struct hrtimer *timer, ktime_t tim, const enum hrtimer_mode mode)
{
- return hrtimer_start_range_ns(timer, tim, 0, mode);
+ return __hrtimer_start_range_ns(timer, tim, 0, mode, 1);
}
EXPORT_SYMBOL_GPL(hrtimer_start);
diff --git a/kernel/sched.c b/kernel/sched.c
index 01275cb..1f43d81 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -231,13 +231,20 @@ static void start_rt_bandwidth(struct rt_bandwidth *rt_b)
spin_lock(&rt_b->rt_runtime_lock);
for (;;) {
+ unsigned long delta;
+ ktime_t soft, hard;
+
if (hrtimer_active(&rt_b->rt_period_timer))
break;
now = hrtimer_cb_get_time(&rt_b->rt_period_timer);
hrtimer_forward(&rt_b->rt_period_timer, now, rt_b->rt_period);
- hrtimer_start_expires(&rt_b->rt_period_timer,
- HRTIMER_MODE_ABS);
+
+ soft = hrtimer_get_softexpires(&rt_b->rt_period_timer);
+ hard = hrtimer_get_expires(&rt_b->rt_period_timer);
+ delta = ktime_to_ns(ktime_sub(hard, soft));
+ __hrtimer_start_range_ns(&rt_b->rt_period_timer, soft, delta,
+ HRTIMER_MODE_ABS, 0);
}
spin_unlock(&rt_b->rt_runtime_lock);
}
@@ -1129,7 +1136,8 @@ static __init void init_hrtick(void)
*/
static void hrtick_start(struct rq *rq, u64 delay)
{
- hrtimer_start(&rq->hrtick_timer, ns_to_ktime(delay), HRTIMER_MODE_REL);
+ __hrtimer_start_range_ns(&rq->hrtick_timer, ns_to_ktime(delay), 0,
+ HRTIMER_MODE_REL, 0);
}
static inline void init_hrtick(void)
diff --git a/kernel/softirq.c b/kernel/softirq.c
index f813122..34c309d 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -58,7 +58,7 @@ static DEFINE_PER_CPU(struct task_struct *, ksoftirqd);
* to the pending events, so lets the scheduler to balance
* the softirq load for us.
*/
-static inline void wakeup_softirqd(void)
+void wakeup_softirqd(void)
{
/* Interrupts are disabled: no need to stop preemption */
struct task_struct *tsk = __get_cpu_var(ksoftirqd);
next prev parent reply other threads:[~2009-03-13 14:58 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-13 11:21 [PATCH 00/11] generic software counters -v2 Peter Zijlstra
2009-03-13 11:21 ` [PATCH 01/11] sched: remove extra call overhead for schedule() Peter Zijlstra
2009-03-13 13:00 ` [tip:core/locking] " Peter Zijlstra
2009-04-20 19:00 ` [tip:sched/core] " tip-bot for Peter Zijlstra
2009-03-13 11:21 ` [PATCH 02/11] hrtimer: fix rq->lock inversion (again) Peter Zijlstra
2009-03-13 13:00 ` [tip:core/locking] " Peter Zijlstra
2009-03-13 13:27 ` Peter Zijlstra
2009-03-13 14:57 ` Peter Zijlstra [this message]
2009-03-31 12:57 ` Peter Zijlstra
2009-04-02 19:45 ` [tip:timers/urgent] " Peter Zijlstra
2009-03-13 11:21 ` [PATCH 03/11] perf_counter: x86: fix 32bit irq_period assumption Peter Zijlstra
2009-03-13 13:00 ` [tip:perfcounters/core] perf_counter: x86: fix 32-bit " Peter Zijlstra
2009-03-13 13:06 ` Peter Zijlstra
2009-03-13 11:21 ` [PATCH 04/11] perf_counter: use list_move_tail Peter Zijlstra
2009-03-13 13:00 ` [tip:perfcounters/core] perf_counter: use list_move_tail() Peter Zijlstra
2009-03-13 13:06 ` Peter Zijlstra
2009-03-13 11:21 ` [PATCH 05/11] perf_counter: add comment to barrier Peter Zijlstra
2009-03-13 13:01 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 13:06 ` Peter Zijlstra
2009-03-13 11:21 ` [PATCH 06/11] perf_counter: x86: use ULL postfix for 64bit constants Peter Zijlstra
2009-03-13 13:01 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 13:06 ` Peter Zijlstra
2009-03-13 11:21 ` [PATCH 07/11] perf_counter: software counter event infrastructure Peter Zijlstra
2009-03-13 13:01 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 13:07 ` Peter Zijlstra
2009-03-13 11:21 ` [PATCH 08/11] perf_counter: provide pagefault software events Peter Zijlstra
2009-03-13 13:01 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 13:07 ` Peter Zijlstra
2009-03-13 11:21 ` [PATCH 09/11] perf_counter: provide major/minor page fault " Peter Zijlstra
2009-03-13 13:01 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 13:07 ` Peter Zijlstra
2009-03-13 11:21 ` [PATCH 10/11] perf_counter: hrtimer based sampling for software time events Peter Zijlstra
2009-03-13 13:01 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 13:07 ` Peter Zijlstra
2009-03-13 15:43 ` [PATCH 10.5/11] perf_counter: fix hrtimer sampling Peter Zijlstra
2009-03-13 16:09 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 11:21 ` [PATCH 11/11] perf_counter: add an event_list Peter Zijlstra
2009-03-13 13:02 ` [tip:perfcounters/core] " Peter Zijlstra
2009-03-13 13:07 ` Peter Zijlstra
2009-03-13 13:07 ` [PATCH 00/11] generic software counters -v2 Ingo Molnar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=tip-84648567ee357abd8879257018f38f8b6b35334d@git.kernel.org \
--to=a.p.zijlstra@chello.nl \
--cc=hpa@zytor.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-tip-commits@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=mingo@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.