linux-rt-users.vger.kernel.org archive mirror
* [ANNOUNCE] 3.12.6-rt9
@ 2013-12-23 22:50 Sebastian Andrzej Siewior
  2013-12-24 15:15 ` 3.12.6-rt9 build failure Nicholas Mc Guire
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2013-12-23 22:50 UTC
  To: linux-rt-users; +Cc: LKML, Thomas Gleixner, rostedt, John Kacur

Dear RT folks!

I'm pleased to announce the v3.12.6-rt9 patch set.

Changes since v3.12.6-rt8
- ARM's mach-sti now uses a raw spinlock as boot_lock (like the other
  mach-*)
- There was a call path to rcu_preempt_qs() with interrupts enabled. Tiejun
  Chen posted a patch to call it with interrupts disabled, like we always
  do.
- A patch from Paul E. McKenney to not activate RCU core on NO_HZ_FULL
  CPUs
- A patch from Thomas Gleixner not to raise the timer softirq
  unconditionally (only if a timer is pending)
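
As a rough illustration of the NO_HZ_FULL item, the check added by the
delta patch can be modelled in user space as below. This is only a sketch
under stated assumptions: may_skip_rcu_core() and its parameters are
made-up names standing in for tick_nohz_full_cpu(), rcu_gp_in_progress(),
jiffies, rsp->gp_start and HZ, and ULONG_CMP_LT() is redefined locally on
the assumption that it matches the kernel's wrap-safe comparison.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Wrap-safe "a < b" for free-running unsigned long counters such as jiffies. */
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))

/*
 * Hypothetical user-space model of the new check: a NO_HZ_FULL CPU may
 * leave RCU core processing alone while no grace period is in progress,
 * or while the current grace period is younger than one second (hz ticks).
 */
static bool may_skip_rcu_core(bool nohz_full_cpu, bool gp_in_progress,
			      unsigned long now, unsigned long gp_start,
			      unsigned long hz)
{
	assert(hz > 0);
	if (!nohz_full_cpu)
		return false;
	return !gp_in_progress || ULONG_CMP_LT(now, gp_start + hz);
}
```

With hz = 100, a grace period that started at tick 1000 is ignored up to
tick 1099 and handled again from tick 1100 on. Because the comparison
works on the unsigned difference, it also behaves when the counter wraps.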


There is also a patch in the queue from Paul E. McKenney to move RCU
processing from softirq into its own thread. After Mike Galbraith
reported a few RCU stalls I decided to keep it disabled for now until I
have some time to look at it.

Known issues:

      - bcache is disabled.

      - Brian Silverman reported a BUG (via Debian BTS) where gdb's
        record command does something nasty and causes a double fault on
        an x86-64 kernel with 32bit userland (the debugged application).
        Pure 32bit and pure 64bit setups are not affected. The
        problem is limited to x86.

      - Sami Pietikäinen reported a crash in __ip_make_skb(). Nicholas
        Mc Guire is preparing a patch for it.

The delta patch against v3.12.6-rt8 is appended below and can be found
here:
   https://www.kernel.org/pub/linux/kernel/projects/rt/3.12/incr/patch-3.12.6-rt8-rt9.patch.xz

The RT patch against 3.12.6 can be found here:

   https://www.kernel.org/pub/linux/kernel/projects/rt/3.12/patch-3.12.6-rt9.patch.xz

The split quilt queue is available at:

   https://www.kernel.org/pub/linux/kernel/projects/rt/3.12/patches-3.12.6-rt9.tar.xz

Sebastian

diff --git a/arch/arm/mach-sti/platsmp.c b/arch/arm/mach-sti/platsmp.c
index dce50d9..c05b764 100644
--- a/arch/arm/mach-sti/platsmp.c
+++ b/arch/arm/mach-sti/platsmp.c
@@ -35,7 +35,7 @@ static void write_pen_release(int val)
 	outer_clean_range(__pa(&pen_release), __pa(&pen_release + 1));
 }
 
-static DEFINE_SPINLOCK(boot_lock);
+static DEFINE_RAW_SPINLOCK(boot_lock);
 
 void sti_secondary_init(unsigned int cpu)
 {
@@ -50,8 +50,8 @@ void sti_secondary_init(unsigned int cpu)
 	/*
 	 * Synchronise with the boot thread.
 	 */
-	spin_lock(&boot_lock);
-	spin_unlock(&boot_lock);
+	raw_spin_lock(&boot_lock);
+	raw_spin_unlock(&boot_lock);
 }
 
 int sti_boot_secondary(unsigned int cpu, struct task_struct *idle)
@@ -62,7 +62,7 @@ int sti_boot_secondary(unsigned int cpu, struct task_struct *idle)
 	 * set synchronisation state between this boot processor
 	 * and the secondary one
 	 */
-	spin_lock(&boot_lock);
+	raw_spin_lock(&boot_lock);
 
 	/*
 	 * The secondary processor is waiting to be released from
@@ -93,7 +93,7 @@ int sti_boot_secondary(unsigned int cpu, struct task_struct *idle)
 	 * now the secondary core is starting up let it run its
 	 * calibrations, then wait for it to finish
 	 */
-	spin_unlock(&boot_lock);
+	raw_spin_unlock(&boot_lock);
 
 	return pen_release != -1 ? -ENOSYS : 0;
 }
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 79a7a35..bdbf77db 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -461,9 +461,8 @@ extern int schedule_hrtimeout_range_clock(ktime_t *expires,
 		unsigned long delta, const enum hrtimer_mode mode, int clock);
 extern int schedule_hrtimeout(ktime_t *expires, const enum hrtimer_mode mode);
 
-/* Soft interrupt function to run the hrtimer queues: */
+/* Called from the periodic timer tick */
 extern void hrtimer_run_queues(void);
-extern void hrtimer_run_pending(void);
 
 /* Bootup initialization: */
 extern void __init hrtimers_init(void);
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index c383841..7aa442e 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1694,30 +1694,6 @@ static void run_hrtimer_softirq(struct softirq_action *h)
 }
 
 /*
- * Called from timer softirq every jiffy, expire hrtimers:
- *
- * For HRT its the fall back code to run the softirq in the timer
- * softirq context in case the hrtimer initialization failed or has
- * not been done yet.
- */
-void hrtimer_run_pending(void)
-{
-	if (hrtimer_hres_active())
-		return;
-
-	/*
-	 * This _is_ ugly: We have to check in the softirq context,
-	 * whether we can switch to highres and / or nohz mode. The
-	 * clocksource switch happens in the timer interrupt with
-	 * xtime_lock held. Notification from there only sets the
-	 * check bit in the tick_oneshot code, otherwise we might
-	 * deadlock vs. xtime_lock.
-	 */
-	if (tick_check_oneshot_change(!hrtimer_is_hres_enabled()))
-		hrtimer_switch_to_hres();
-}
-
-/*
  * Called from hardirq context every jiffy
  */
 void hrtimer_run_queues(void)
@@ -1730,6 +1706,13 @@ void hrtimer_run_queues(void)
 	if (hrtimer_hres_active())
 		return;
 
+	/*
+	 * Check whether we can switch to highres mode.
+	 */
+	if (tick_check_oneshot_change(!hrtimer_is_hres_enabled())
+	    && hrtimer_switch_to_hres())
+		return;
+
 	for (index = 0; index < HRTIMER_MAX_CLOCK_BASES; index++) {
 		base = &cpu_base->clock_base[index];
 		if (!timerqueue_getnext(&base->active))
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 10365be..f4f61bb 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -204,7 +204,12 @@ static void rcu_preempt_qs(int cpu);
 
 void rcu_bh_qs(int cpu)
 {
+	unsigned long flags;
+
+	/* Callers to this function, rcu_preempt_qs(), must disable irqs. */
+	local_irq_save(flags);
 	rcu_preempt_qs(cpu);
+	local_irq_restore(flags);
 }
 #else
 void rcu_bh_qs(int cpu)
@@ -2674,6 +2679,10 @@ static int __rcu_pending(struct rcu_state *rsp, struct rcu_data *rdp)
 	/* Check for CPU stalls, if enabled. */
 	check_cpu_stall(rsp, rdp);
 
+	/* Is this CPU a NO_HZ_FULL CPU that should ignore RCU? */
+	if (rcu_nohz_full_cpu(rsp))
+		return 0;
+
 	/* Is the RCU core waiting for a quiescent state from this CPU? */
 	if (rcu_scheduler_fully_active &&
 	    rdp->qs_pending && !rdp->passed_quiesce) {
diff --git a/kernel/rcutree.h b/kernel/rcutree.h
index c36d59a..eb4fe67 100644
--- a/kernel/rcutree.h
+++ b/kernel/rcutree.h
@@ -563,6 +563,7 @@ static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
 				  unsigned long maxj);
 static void rcu_bind_gp_kthread(void);
 static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp);
+static bool rcu_nohz_full_cpu(struct rcu_state *rsp);
 
 #endif /* #ifndef RCU_TREE_NONCORE */
 
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 05bcc6f..c1735a1 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -2801,3 +2801,23 @@ static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp)
 }
 
 #endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
+
+/*
+ * Is this CPU a NO_HZ_FULL CPU that should ignore RCU so that the
+ * grace-period kthread will do force_quiescent_state() processing?
+ * The idea is to avoid waking up RCU core processing on such a
+ * CPU unless the grace period has extended for too long.
+ *
+ * This code relies on the fact that all NO_HZ_FULL CPUs are also
+ * CONFIG_RCU_NOCB_CPUs.
+ */
+static bool rcu_nohz_full_cpu(struct rcu_state *rsp)
+{
+#ifdef CONFIG_NO_HZ_FULL
+	if (tick_nohz_full_cpu(smp_processor_id()) &&
+	    (!rcu_gp_in_progress(rsp) ||
+	     ULONG_CMP_LT(jiffies, ACCESS_ONCE(rsp->gp_start) + HZ)))
+		return 1;
+#endif /* #ifdef CONFIG_NO_HZ_FULL */
+	return 0;
+}
diff --git a/kernel/timer.c b/kernel/timer.c
index b06c647..46467be 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1443,8 +1443,6 @@ static void run_timer_softirq(struct softirq_action *h)
 	irq_work_run();
 #endif
 
-	hrtimer_run_pending();
-
 	if (time_after_eq(jiffies, base->timer_jiffies))
 		__run_timers(base);
 }
@@ -1454,8 +1452,27 @@ static void run_timer_softirq(struct softirq_action *h)
  */
 void run_local_timers(void)
 {
+	struct tvec_base *base = __this_cpu_read(tvec_bases);
+
 	hrtimer_run_queues();
-	raise_softirq(TIMER_SOFTIRQ);
+	/*
+	 * We can access this lockless as we are in the timer
+	 * interrupt. If there are no timers queued, nothing to do in
+	 * the timer softirq.
+	 */
+	if (!spin_do_trylock(&base->lock)) {
+		raise_softirq(TIMER_SOFTIRQ);
+		return;
+	}
+	if (!base->active_timers)
+		goto out;
+
+	/* Check whether the next pending timer has expired */
+	if (time_before_eq(base->next_timer, jiffies))
+		raise_softirq(TIMER_SOFTIRQ);
+out:
+	rt_spin_unlock_after_trylock_in_irq(&base->lock);
+
 }
 
 #ifdef __ARCH_WANT_SYS_ALARM
diff --git a/localversion-rt b/localversion-rt
index 700c857..22746d6 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt8
+-rt9

* 3.12.6-rt9 build failure
  2013-12-23 22:50 [ANNOUNCE] 3.12.6-rt9 Sebastian Andrzej Siewior
@ 2013-12-24 15:15 ` Nicholas Mc Guire
  2013-12-24 15:47 ` [ANNOUNCE] 3.12.6-rt9 Mike Galbraith
  2013-12-27 20:00 ` Nicholas Mc Guire
  2 siblings, 0 replies; 21+ messages in thread
From: Nicholas Mc Guire @ 2013-12-24 15:15 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, Thomas Gleixner, rostedt, John Kacur

Hi !

in 3.12.6-rt9:
 patch timers-do-not-raise-softirq-unconditionally.patch
 fails compile test for: 
  CONFIG_PREEMPT_RT_BASE=y
  # CONFIG_PREEMPT_RT_FULL is not set
 full config with seed 0xE3A03BD0.

<snip>
  CC      kernel/timer.o
kernel/timer.c: In function 'run_local_timers':
kernel/timer.c:1463: error: implicit declaration of function 'spin_do_trylock'
kernel/timer.c:1474: error: implicit declaration of function 'rt_spin_unlock_after_trylock_in_irq'
make[1]: *** [kernel/timer.o] Error 1
make: *** [kernel] Error 2

 ... RT-specific API not ifdef'ed; a brute-force ifdefification "fix" is below.

 But what this might actually point to is that testing of patches with RT off
 (and other configs?) is being neglected a bit. rt6 also had a problem (seed
 was 0xBE96A834 file:kernel/rtmutex.c:__mutex_lock_check_stamp:1007).

 Maybe a set of minimum test configs should be defined to at least catch
 build failures.

thx!
hofrat

Signed-off-by: Nicholas Mc Guire <der.herr@hofr.at>
---
 kernel/timer.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 46467be..88951cb 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1452,9 +1452,13 @@ static void run_timer_softirq(struct softirq_action *h)
  */
 void run_local_timers(void)
 {
+#ifdef CONFIG_PREEMPT_RT_FULL
 	struct tvec_base *base = __this_cpu_read(tvec_bases);
+#endif
 
 	hrtimer_run_queues();
+
+#ifdef CONFIG_PREEMPT_RT_FULL
 	/*
 	 * We can access this lockless as we are in the timer
 	 * interrupt. If there are no timers queued, nothing to do in
@@ -1472,7 +1476,9 @@ void run_local_timers(void)
 		raise_softirq(TIMER_SOFTIRQ);
 out:
 	rt_spin_unlock_after_trylock_in_irq(&base->lock);

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-23 22:50 [ANNOUNCE] 3.12.6-rt9 Sebastian Andrzej Siewior
  2013-12-24 15:15 ` 3.12.6-rt9 build failure Nicholas Mc Guire
@ 2013-12-24 15:47 ` Mike Galbraith
  2013-12-24 16:39   ` Pavel Vasilyev
  2014-01-17 17:00   ` Sebastian Andrzej Siewior
  2013-12-27 20:00 ` Nicholas Mc Guire
  2 siblings, 2 replies; 21+ messages in thread
From: Mike Galbraith @ 2013-12-24 15:47 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, LKML, Thomas Gleixner, rostedt, John Kacur

On Mon, 2013-12-23 at 23:50 +0100, Sebastian Andrzej Siewior wrote: 
> Dear RT folks!
> 
> I'm pleased to announce the v3.12.6-rt9 patch set.
> 
> Changes since v3.12.6-rt8
> - ARM's mach-sti is now using rawlock as boot_lock (like the other
>   mach-*)
> - There was a callpath to rcu_preempt_qs() with interrupts enabled. Tiejun
>   Chen posted a patch to call it with interrupt disabled like we always
>   do.
> - A patch from Paul E. McKenney to not activate RCU core on NO_HZ_FULL
>   CPUs
> - A patch from Thomas Gleixner not to raise the timer softirq
>   unconditionally (only if a timer is pending)
> 
> 
> There is also a patch in the queue from Paul E. McKenney to move RCU
> processing from softirq into its own thread. After Mike Galbraith
> reported a few RCU stalls I decided to keep it disabled for now until I
> have some time to look at it.

I built this kernel with Paul's patch and NO_HZ_FULL enabled again on a 64
core box.  I haven't seen RCU gripe yet, but I just checked on it after
3.5 hours into this boot/beat (after fixing crash+kdump setup), and
found it in the process of dumping.

crash> bt
PID: 508    TASK: ffff8802739ba340  CPU: 16  COMMAND: "ksoftirqd/16"
 #0 [ffff880276806a40] machine_kexec at ffffffff8103bc07
 #1 [ffff880276806aa0] crash_kexec at ffffffff810d56b3
 #2 [ffff880276806b70] panic at ffffffff815bf8b0
 #3 [ffff880276806bf0] watchdog_overflow_callback at ffffffff810fed3d
 #4 [ffff880276806c10] __perf_event_overflow at ffffffff81131928
 #5 [ffff880276806ca0] perf_event_overflow at ffffffff81132254
 #6 [ffff880276806cb0] intel_pmu_handle_irq at ffffffff8102078f
 #7 [ffff880276806de0] perf_event_nmi_handler at ffffffff815c5825
 #8 [ffff880276806e10] nmi_handle at ffffffff815c4ed3
 #9 [ffff880276806ea0] default_do_nmi at ffffffff815c5063
#10 [ffff880276806ed0] do_nmi at ffffffff815c5388
#11 [ffff880276806ef0] end_repeat_nmi at ffffffff815c4371
    [exception RIP: _raw_spin_trylock+48]
    RIP: ffffffff815c3790  RSP: ffff880276803e28  RFLAGS: 00000002
    RAX: 0000000000000010  RBX: 0000000000000010  RCX: 0000000000000002
    RDX: ffff880276803e28  RSI: 0000000000000018  RDI: 0000000000000001
    RBP: ffffffff815c3790   R8: ffffffff815c3790   R9: 0000000000000018
    R10: ffff880276803e28  R11: 0000000000000002  R12: ffffffffffffffff
    R13: ffff880273a0c000  R14: ffff8802739ba340  R15: ffff880273a03fd8
    ORIG_RAX: ffff880273a03fd8  CS: 0010  SS: 0018
--- <RT exception stack> ---
#12 [ffff880276803e28] _raw_spin_trylock at ffffffff815c3790
#13 [ffff880276803e30] rt_spin_lock_slowunlock_hirq at ffffffff815c2cc8
#14 [ffff880276803e50] rt_spin_unlock_after_trylock_in_irq at ffffffff815c3425
#15 [ffff880276803e60] get_next_timer_interrupt at ffffffff810684a7
#16 [ffff880276803ed0] tick_nohz_stop_sched_tick at ffffffff810c5f2e
#17 [ffff880276803f50] tick_nohz_irq_exit at ffffffff810c6333
#18 [ffff880276803f70] irq_exit at ffffffff81060065
#19 [ffff880276803f90] smp_apic_timer_interrupt at ffffffff810358f5
#20 [ffff880276803fb0] apic_timer_interrupt at ffffffff815cbf9d
--- <IRQ stack> ---
#21 [ffff880273a03b28] apic_timer_interrupt at ffffffff815cbf9d
    [exception RIP: _raw_spin_lock+50]
    RIP: ffffffff815c3642  RSP: ffff880273a03bd8  RFLAGS: 00000202
    RAX: 0000000000008b49  RBX: ffff880272157290  RCX: ffff8802739ba340
    RDX: 0000000000008b4a  RSI: 0000000000000010  RDI: ffff880273a0c000
    RBP: ffff880273a03bd8   R8: 0000000000000001   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000001  R12: ffffffff810927b5
    R13: ffff880273a03b68  R14: 0000000000000010  R15: 0000000000000010
    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
#22 [ffff880273a03be0] rt_spin_lock_slowlock at ffffffff815c2591
#23 [ffff880273a03cc0] rt_spin_lock at ffffffff815c3362
#24 [ffff880273a03cd0] run_timer_softirq at ffffffff81069002
#25 [ffff880273a03d70] handle_softirq at ffffffff81060d0f
#26 [ffff880273a03db0] do_current_softirqs at ffffffff81060f3c
#27 [ffff880273a03e20] run_ksoftirqd at ffffffff81061045
#28 [ffff880273a03e40] smpboot_thread_fn at ffffffff81089c31
#29 [ffff880273a03ec0] kthread at ffffffff810807fe
#30 [ffff880273a03f50] ret_from_fork at ffffffff815cb28c
crash> gdb list *0xffffffff815c2591
0xffffffff815c2591 is in rt_spin_lock_slowlock (kernel/rtmutex.c:109).
104     }
105     #endif
106     
107     static inline void init_lists(struct rt_mutex *lock)
108     {
109             if (unlikely(!lock->wait_list.node_list.prev))
110                     plist_head_init(&lock->wait_list);
111     }
112     
113     /*
crash> gdb list *0xffffffff815c2590
0xffffffff815c2590 is in rt_spin_lock_slowlock (kernel/rtmutex.c:744).
739             struct rt_mutex_waiter waiter, *top_waiter;
740             int ret;
741     
742             rt_mutex_init_waiter(&waiter, true);
743     
744             raw_spin_lock(&lock->wait_lock);
745             init_lists(lock);
746     
747             if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
748                     raw_spin_unlock(&lock->wait_lock);
crash> gdb list *0xffffffff815c2cc8
0xffffffff815c2cc8 is in rt_spin_lock_slowunlock_hirq (kernel/rtmutex.c:851).
846     {
847             int ret;
848     
849             do {
850                     ret = raw_spin_trylock(&lock->wait_lock);
851             } while (!ret);
852     
853             __rt_spin_lock_slowunlock(lock);
854     }
855

Dang, Santa might have delivered a lock pick set in a few more hours.
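
One possible reading of the backtrace above (an interpretation, not
something stated in the report): the task was inside
raw_spin_lock(&lock->wait_lock) when the timer interrupt hit the same
CPU, and the interrupt path then spins on raw_spin_trylock() for that same
wait_lock. The interrupted task cannot make progress until the interrupt
returns, so the trylock can never succeed. The user-space sketch below
models that with a simplified ticket lock. All names are hypothetical, the
ticket lock is only assumed to resemble the x86 raw spinlock of that era,
and the retry loop is bounded so the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal ticket lock: the lock is free only when head == tail. */
struct ticket_lock {
	unsigned short head;	/* ticket currently being served */
	unsigned short tail;	/* next ticket to hand out */
};

/* First half of a blocking lock: queue up and remember our ticket. */
static unsigned short ticket_take(struct ticket_lock *l)
{
	return l->tail++;
}

/* Succeeds only if nobody holds the lock or is queued for it. */
static bool ticket_trylock(struct ticket_lock *l)
{
	if (l->head != l->tail)
		return false;
	l->tail++;
	return true;
}

static void ticket_unlock(struct ticket_lock *l)
{
	assert(l->head != l->tail);	/* must be held or queued for */
	l->head++;
}

/*
 * Models the "do { trylock } while (!ret)" loop of the interrupt path,
 * bounded so the simulation ends. Returns true if the lock was obtained.
 */
static bool irq_path_gets_wait_lock(struct ticket_lock *wait_lock,
				    unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		if (ticket_trylock(wait_lock)) {
			ticket_unlock(wait_lock);
			return true;
		}
	}
	return false;
}
```

If the "task" calls ticket_take() and the "interrupt" then runs
irq_path_gets_wait_lock() on the same lock, the loop exhausts its budget
no matter how large it is. In the kernel the loop is unbounded, which
would match the hard lockup the NMI watchdog reported.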

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-24 15:47 ` [ANNOUNCE] 3.12.6-rt9 Mike Galbraith
@ 2013-12-24 16:39   ` Pavel Vasilyev
  2013-12-25  3:24     ` Mike Galbraith
  2014-01-17 17:00   ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 21+ messages in thread
From: Pavel Vasilyev @ 2013-12-24 16:39 UTC
  To: Mike Galbraith
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

24.12.2013 19:47, Mike Galbraith wrote:
> On Mon, 2013-12-23 at 23:50 +0100, Sebastian Andrzej Siewior wrote: 

> crash> bt
> PID: 508    TASK: ffff8802739ba340  CPU: 16  COMMAND: "ksoftirqd/16"

YES!!! And ARM code broke :)



-- 

                                                         Pavel.


* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-24 16:39   ` Pavel Vasilyev
@ 2013-12-25  3:24     ` Mike Galbraith
  0 siblings, 0 replies; 21+ messages in thread
From: Mike Galbraith @ 2013-12-25  3:24 UTC
  To: pavel
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

On Tue, 2013-12-24 at 20:39 +0400, Pavel Vasilyev wrote: 
> 24.12.2013 19:47, Mike Galbraith wrote:
> > On Mon, 2013-12-23 at 23:50 +0100, Sebastian Andrzej Siewior wrote: 
> 
> > crash> bt
> > PID: 508    TASK: ffff8802739ba340  CPU: 16  COMMAND: "ksoftirqd/16"
> 
> YES!!! And ARM code broke :)

And NO_HZ_TICK config survived for only 4.5 hours.

PID: 6948   TASK: ffff880272d1f1c0  CPU: 29  COMMAND: "tbench"
 #0 [ffff8802769a6a40] machine_kexec at ffffffff8103bc07
 #1 [ffff8802769a6aa0] crash_kexec at ffffffff810d3e93
 #2 [ffff8802769a6b70] panic at ffffffff815bce70
 #3 [ffff8802769a6bf0] watchdog_overflow_callback at ffffffff810fd51d
 #4 [ffff8802769a6c10] __perf_event_overflow at ffffffff8112f1f8
 #5 [ffff8802769a6ca0] perf_event_overflow at ffffffff8112fb14
 #6 [ffff8802769a6cb0] intel_pmu_handle_irq at ffffffff8102078f
 #7 [ffff8802769a6de0] perf_event_nmi_handler at ffffffff815c2de5
 #8 [ffff8802769a6e10] nmi_handle at ffffffff815c2493
 #9 [ffff8802769a6ea0] default_do_nmi at ffffffff815c2623
#10 [ffff8802769a6ed0] do_nmi at ffffffff815c2948
#11 [ffff8802769a6ef0] end_repeat_nmi at ffffffff815c1931
    [exception RIP: preempt_schedule+36]
    RIP: ffffffff815be944  RSP: ffff8802769a3d98  RFLAGS: 00000002
    RAX: 0000000000000010  RBX: 0000000000000010  RCX: 0000000000000002
    RDX: ffff8802769a3d98  RSI: 0000000000000018  RDI: 0000000000000001
    RBP: ffffffff815be944   R8: ffffffff815be944   R9: 0000000000000018
    R10: ffff8802769a3d98  R11: 0000000000000002  R12: ffffffffffffffff
    R13: ffff880273f74000  R14: ffff880272d1f1c0  R15: ffff880269cedfd8
    ORIG_RAX: ffff880269cedfd8  CS: 0010  SS: 0018
--- <RT exception stack> ---
#12 [ffff8802769a3d98] preempt_schedule at ffffffff815be944
#13 [ffff8802769a3db0] _raw_spin_trylock at ffffffff815c0d6e
#14 [ffff8802769a3dc0] rt_spin_lock_slowunlock_hirq at ffffffff815c0288
#15 [ffff8802769a3de0] rt_spin_unlock_after_trylock_in_irq at ffffffff815c09e5
#16 [ffff8802769a3df0] run_local_timers at ffffffff81068025
#17 [ffff8802769a3e10] update_process_times at ffffffff810680ac
#18 [ffff8802769a3e40] tick_sched_handle at ffffffff810c3a92
#19 [ffff8802769a3e60] tick_sched_timer at ffffffff810c3d2f
#20 [ffff8802769a3e90] __run_hrtimer at ffffffff8108471d
#21 [ffff8802769a3ed0] hrtimer_interrupt at ffffffff8108497a
#22 [ffff8802769a3f70] local_apic_timer_interrupt at ffffffff810349e6
#23 [ffff8802769a3f90] smp_apic_timer_interrupt at ffffffff810358ee
#24 [ffff8802769a3fb0] apic_timer_interrupt at ffffffff815c955d
--- <IRQ stack> ---
#25 [ffff880269ced848] apic_timer_interrupt at ffffffff815c955d
    [exception RIP: _raw_spin_lock+53]
    RIP: ffffffff815c0c05  RSP: ffff880269ced8f8  RFLAGS: 00000202
    RAX: 0000000000000b7b  RBX: 0000000000000282  RCX: ffff880272d1f1c0
    RDX: 0000000000000b7d  RSI: ffff880269ceda38  RDI: ffff880273f74000
    RBP: ffff880269ced8f8   R8: 0000000000000001   R9: 00000000b54d13a4
    R10: 0000000000000001  R11: 0000000000000001  R12: ffff880269ced910
    R13: ffff880276d32170  R14: ffffffff810c9030  R15: ffff880269ced8b8
    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
#26 [ffff880269ced900] rt_spin_lock_slowlock at ffffffff815bfb51
#27 [ffff880269ced9e0] rt_spin_lock at ffffffff815c0922
#28 [ffff880269ced9f0] lock_timer_base at ffffffff81067f92
#29 [ffff880269ceda20] mod_timer at ffffffff81069bcb
#30 [ffff880269ceda70] sk_reset_timer at ffffffff814d1e57
#31 [ffff880269ceda90] inet_csk_reset_xmit_timer at ffffffff8152d4a8
#32 [ffff880269cedac0] tcp_rearm_rto at ffffffff8152d583
#33 [ffff880269cedae0] tcp_ack at ffffffff81534085
#34 [ffff880269cedb60] tcp_rcv_established at ffffffff8153443d
#35 [ffff880269cedbb0] tcp_v4_do_rcv at ffffffff8153f56a
#36 [ffff880269cedbe0] __release_sock at ffffffff814d3891
#37 [ffff880269cedc10] release_sock at ffffffff814d3942
#38 [ffff880269cedc30] tcp_sendmsg at ffffffff8152b955
#39 [ffff880269cedd00] inet_sendmsg at ffffffff8155350e
#40 [ffff880269cedd30] sock_sendmsg at ffffffff814cea87
#41 [ffff880269cede40] sys_sendto at ffffffff814cebdf
#42 [ffff880269cedf80] tracesys at ffffffff815c8b09 (via system_call)
    RIP: 00007f0441a1fc35  RSP: 00007fffdea86130  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: ffffffff815c8b09  RCX: ffffffffffffffff
    RDX: 000000000000248d  RSI: 0000000000607260  RDI: 0000000000000004
    RBP: 000000000000248d   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000246  R12: 00007fffdea86a10
    R13: 00007fffdea86414  R14: 0000000000000004  R15: 0000000000607260
    ORIG_RAX: 000000000000002c  CS: 0033  SS: 002b


* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-23 22:50 [ANNOUNCE] 3.12.6-rt9 Sebastian Andrzej Siewior
  2013-12-24 15:15 ` 3.12.6-rt9 build failure Nicholas Mc Guire
  2013-12-24 15:47 ` [ANNOUNCE] 3.12.6-rt9 Mike Galbraith
@ 2013-12-27 20:00 ` Nicholas Mc Guire
  2013-12-28  3:30   ` Mike Galbraith
                     ` (3 more replies)
  2 siblings, 4 replies; 21+ messages in thread
From: Nicholas Mc Guire @ 2013-12-27 20:00 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, LKML, Thomas Gleixner, rostedt, John Kacur

On Mon, 23 Dec 2013, Sebastian Andrzej Siewior wrote:

> Dear RT folks!
> 
> I'm pleased to announce the v3.12.6-rt9 patch set.
> 
> Changes since v3.12.6-rt8
<snip>
> - A patch from Thomas Gleixner not to raise the timer softirq
>   unconditionally (only if a timer is pending)
> 

This one seems to deadlock early in the boot sequence on x86
(i3/i7/Phenom-4x here and Carsten Emde also had boot failures)

after dropping this patch with:
patch -p1 -R < ../paches/timers-do-not-raise-softirq-unconditionally.patch
3.12.6-rt9 boots up fine. cyclictest seems to be back to what it was before
(only ran for a few minutes idle and 1h with load on an i3).

The main problem with this patch, though, is procedural:
the commit note - which is a mail exchange - does not actually explain what
the rationale for the changes is (...well I don't understand the logic of
run_local_timers - if someone can explain - please do) and notably:

from timers-do-not-raise-softirq-unconditionally.patch
<snip>
well, that very same problem is in mainline if you add "threadirqs" to
the command line. But we can be smart about this. The untested patch
                                                    ^^^^^^^^^^^^^^^^^^
below should address that issue. If that works on mainline we can
adapt it for RT (needs a trylock(&base->lock) there).
<snip>

 does make me wonder why this went into -rt9?
 It also fails to build with CONFIG_PREEMPT_RT_FULL not set.

 Since, with this patch, systems that booted just fine with 3.12.5-rt7 don't
 even boot (at least my 3 x86 test boxes here did not), this raises some
 questions regarding the process of getting patches into -rtX - are
 we going too fast here?

 I would prefer it if such patches went out with a request for testing
 or at least a "might blow up your system" note in them...

thx!
hofrat

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-27 20:00 ` Nicholas Mc Guire
@ 2013-12-28  3:30   ` Mike Galbraith
  2013-12-28  3:48     ` Mike Galbraith
  2013-12-28  4:33     ` Mike Galbraith
  2014-01-11 20:25   ` Joakim Hernberg
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 21+ messages in thread
From: Mike Galbraith @ 2013-12-28  3:30 UTC
  To: Nicholas Mc Guire
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

On Fri, 2013-12-27 at 21:00 +0100, Nicholas Mc Guire wrote: 
> On Mon, 23 Dec 2013, Sebastian Andrzej Siewior wrote:
> 
> > Dear RT folks!
> > 
> > I'm pleased to announce the v3.12.6-rt9 patch set.
> > 
> > Changes since v3.12.6-rt8
> <snip>
> > - A patch from Thomas Gleixner not to raise the timer softirq
> >   unconditionally (only if a timer is pending)
> > 
> 
> This one seems to deadlock early in the boot sequence on x86
> (i3/i7/Phenom-4x here and Carsten Emde also had boot failures)
> 
> after dropping this patch with:
> patch -p1 -R < ../paches/timers-do-not-raise-softirq-unconditionally.patch
> 3.12.6-rt9 boots up fine. cyclictest seems to be back to what it was before
> (only ran for a few minutes idle and 1h with load on an i3).
> 
> The main problem with this patch, though, is procedural:
> the commit note - which is a mail exchange - does not actually explain what
> the rationale for the changes is

Raising the timer softirq unconditionally wakes ksoftirqd at every tick,
so the only time the no_hz_full "one and only one task is runnable" tick
shutdown criteria can be met is when the box has zero other runnable
tasks.. i.e. when box is idle.
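
For illustration, the decision the patch adds to run_local_timers() can be
sketched in user space roughly as follows. This is a simplified model, not
the kernel code: struct timer_base_model, should_raise_timer_softirq() and
the trylock_ok flag are hypothetical stand-ins for struct tvec_base, the
new logic, and the result of the trylock on base->lock; before_eq() is
assumed to mirror the kernel's time_before_eq().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * True if tick a is at or before tick b, even across counter wrap-around.
 * Relies on the usual two's-complement conversion to long, as the kernel
 * macros do.
 */
static bool before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Hypothetical model of the per-CPU timer base fields the check reads. */
struct timer_base_model {
	unsigned long active_timers;	/* number of queued timers */
	unsigned long next_timer;	/* expiry tick of the earliest timer */
};

/* Decide whether the timer tick needs to wake the timer softirq. */
static bool should_raise_timer_softirq(const struct timer_base_model *base,
				       bool trylock_ok, unsigned long now)
{
	assert(base != NULL);
	if (!trylock_ok)
		return true;	/* cannot inspect the base: play it safe */
	if (!base->active_timers)
		return false;	/* nothing queued, nothing to run */
	return before_eq(base->next_timer, now);
}
```

With no timers queued the function returns false, so the softirq thread is
not woken on that tick. That is the case that lets a CPU with a single
runnable task meet the tick shutdown criteria described above.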

Here, patch works fine boot wise, and no_hz_full tick shutdown works as
well, but there are a couple spots where taking an interrupt is a bad
idea as things sit.  Watchdog barked at two such spots, and there's a
"you _will_ hit this warning in -rt" spot as well.  

With bandaids on the sore spots, my 64 core box survives.

-Mike

(Less than wonderful changelogs probably come from the fact that
maintaining -rt out of tree is time consuming as all hell.  Everybody
gets to break it, a couple guys get to fix it up again and again.)


* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-28  3:30   ` Mike Galbraith
@ 2013-12-28  3:48     ` Mike Galbraith
  2013-12-28  7:43       ` Nicholas Mc Guire
  2013-12-28  4:33     ` Mike Galbraith
  1 sibling, 1 reply; 21+ messages in thread
From: Mike Galbraith @ 2013-12-28  3:48 UTC
  To: Nicholas Mc Guire
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

On Sat, 2013-12-28 at 04:30 +0100, Mike Galbraith wrote:

> (Less than wonderful changelogs probably come from the fact that
> maintaining -rt out of tree is time consuming as all hell.  Everybody
> gets to break it, a couple guys get to fix it up again and again.)

P.S.  try rolling your tree forward to master or tip for entertainment,
you'll see what I mean.  Hi Peter, Rik.. other breakers of worlds :)

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-28  3:30   ` Mike Galbraith
  2013-12-28  3:48     ` Mike Galbraith
@ 2013-12-28  4:33     ` Mike Galbraith
  1 sibling, 0 replies; 21+ messages in thread
From: Mike Galbraith @ 2013-12-28  4:33 UTC
  To: Nicholas Mc Guire
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

On Sat, 2013-12-28 at 04:30 +0100, Mike Galbraith wrote:

> Watchdog barked at two such spots..

btw, lockdep doesn't grumble about that (didn't stare at annotation,
don't speak lockdep well).  I fixed it up to not take its toys and go
home in a snit at boot (rt_mutex debug offends it methinks), but it
didn't gripe.

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-28  3:48     ` Mike Galbraith
@ 2013-12-28  7:43       ` Nicholas Mc Guire
  2013-12-28 13:57         ` Mike Galbraith
  0 siblings, 1 reply; 21+ messages in thread
From: Nicholas Mc Guire @ 2013-12-28  7:43 UTC
  To: Mike Galbraith
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

On Sat, 28 Dec 2013, Mike Galbraith wrote:

> On Sat, 2013-12-28 at 04:30 +0100, Mike Galbraith wrote:
> 
> > (Less than wonderful changelogs probably come from the fact that
> > maintaining -rt out of tree is time consuming as all hell.  Everybody
> > gets to break it, a couple guys get to fix it up again and again.)
> 
> P.S.  try rolling your tree forward to master or tip for entertainment,
> you'll see what I mean.  Hi Peter, Rik.. other breakers of worlds :)
>
protesting external breakage by amending -rt with home-made landmines
does sound like an optimized entertainment strategy...

This type of blowup will not help to go mainline (referring to 3.12.X here,
3.4/6/8/10 is a different story).

thx!
hofrat

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-28  7:43       ` Nicholas Mc Guire
@ 2013-12-28 13:57         ` Mike Galbraith
  0 siblings, 0 replies; 21+ messages in thread
From: Mike Galbraith @ 2013-12-28 13:57 UTC
  To: Nicholas Mc Guire
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

On Sat, 2013-12-28 at 08:43 +0100, Nicholas Mc Guire wrote:

> This type of blowup will not help to go mainline (referring to 3.12.X here,
> 3.4/6/8/10 is a different story).

Nah.  Breakage is a vital sign.  When breakage stops, bury it.

-Mike

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-27 20:00 ` Nicholas Mc Guire
  2013-12-28  3:30   ` Mike Galbraith
@ 2014-01-11 20:25   ` Joakim Hernberg
  2014-01-17 16:10   ` Sebastian Andrzej Siewior
  2014-01-19 20:54   ` Fernando Lopez-Lezcano
  3 siblings, 0 replies; 21+ messages in thread
From: Joakim Hernberg @ 2014-01-11 20:25 UTC
  To: Nicholas Mc Guire
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	rostedt, John Kacur

On Fri, 27 Dec 2013 21:00:24 +0100
Nicholas Mc Guire <der.herr@hofr.at> wrote:

> On Mon, 23 Dec 2013, Sebastian Andrzej Siewior wrote:
> 
> > Dear RT folks!
> > 
> > I'm pleased to announce the v3.12.6-rt9 patch set.
> > 
> > Changes since v3.12.6-rt8
> <snip>
> > - A patch from Thomas Gleixner not to raise the timer softirq
> >   unconditionally (only if a timer is pending)
> > 
> 
> This one seems to deadlock early in the boot sequence on x86
> (i3/i7/Phenom-4x here and Carsten Emde also had boot failures)

This patch seems to frequently make the kernel hang hard early in the
boot process on my i7-2600k too. Reverting
timers-do-not-raise-softirq-unconditionally.patch appears to fix the
problem.

-- 

   Joakim

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-27 20:00 ` Nicholas Mc Guire
  2013-12-28  3:30   ` Mike Galbraith
  2014-01-11 20:25   ` Joakim Hernberg
@ 2014-01-17 16:10   ` Sebastian Andrzej Siewior
  2014-01-19 20:54   ` Fernando Lopez-Lezcano
  3 siblings, 0 replies; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-01-17 16:10 UTC (permalink / raw)
  To: Nicholas Mc Guire
  Cc: linux-rt-users, LKML, Thomas Gleixner, rostedt, John Kacur

* Nicholas Mc Guire | 2013-12-27 21:00:24 [+0100]:

>> - A patch from Thomas Gleixner not to raise the timer softirq
>>   unconditionally (only if a timer is pending)
>> 
>
>This one seems to deadlock early in the boot sequence on x86
>(i3/i7/Phenom-4x here and Carsten Emde also had boot failures)
>
>after dropping this patch with:
>patch -p1 -R < ../paches/timers-do-not-raise-softirq-unconditionally.patch
>3.12.6-rt9 boots up fine. cyclictest seems to be back to what it was before
>(only ran for a few minutes idle and 1h with load on an i3).
>
>The main problem with this patch, though, is procedural:
>the commit note - which is a mail exchange - actually does not explain what
>the rationale for the changes is (...well I don't understand the logic of
>run_local_timers - if someone can explain - please do) and notably:
>
>from timers-do-not-raise-softirq-unconditionally.patch
><snip>
>well, that very same problem is in mainline if you add "threadirqs" to
>the command line. But we can be smart about this. The untested patch
>                                                    ^^^^^^^^^^^^^^^^^^
>below should address that issue. If that works on mainline we can
>adapt it for RT (needs a trylock(&base->lock) there).
><snip>
>
> does make me wonder why this went into -rt9 ?

It was on the mailing list for a few weeks. My understanding was that
Mike Galbraith tested it on mainline and then I added the RT specific
pieces and added it to the tree.

> It also fails to build with CONFIG_PREEMPT_RT_FULL not set.

I will add a non-RT based config to my compile tests.

> as with this patch, systems that booted just fine with 3.12.5-rt7 don't
> even boot (at least my 3 x86 test boxes here did not) this raises some
> questions regarding the process of getting patches into -rtX - are
> we going too fast here ?
>
> I would prefer if such patches would go out with a request for testing
> or at least a "might blow up your system" note in them...

I didn't expect that much trouble. In general I try to avoid adding
explosives unless marked as such.

>thx!
>hofrat

Sebastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-24 15:47 ` [ANNOUNCE] 3.12.6-rt9 Mike Galbraith
  2013-12-24 16:39   ` Pavel Vasilyev
@ 2014-01-17 17:00   ` Sebastian Andrzej Siewior
  2014-01-18  3:15     ` Mike Galbraith
  1 sibling, 1 reply; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-01-17 17:00 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: linux-rt-users, LKML, Thomas Gleixner, rostedt, John Kacur

* Mike Galbraith | 2013-12-24 16:47:47 [+0100]:

>I built this kernel with Paul's patch and NO_HZ_FULL enabled again on 64
>core box.  I haven't seen RCU grip yet, but I just checked on it after
>3.5 hours into this boot/beat (after fixing crash+kdump setup), and
>found it in the process of dumping. 

So you also have the timers-do-not-raise-softirq-unconditionally.patch?

>crash> bt
>PID: 508    TASK: ffff8802739ba340  CPU: 16  COMMAND: "ksoftirqd/16"
> #0 [ffff880276806a40] machine_kexec at ffffffff8103bc07
> #1 [ffff880276806aa0] crash_kexec at ffffffff810d56b3
> #2 [ffff880276806b70] panic at ffffffff815bf8b0
> #3 [ffff880276806bf0] watchdog_overflow_callback at ffffffff810fed3d
> #4 [ffff880276806c10] __perf_event_overflow at ffffffff81131928
> #5 [ffff880276806ca0] perf_event_overflow at ffffffff81132254
> #6 [ffff880276806cb0] intel_pmu_handle_irq at ffffffff8102078f
> #7 [ffff880276806de0] perf_event_nmi_handler at ffffffff815c5825
> #8 [ffff880276806e10] nmi_handle at ffffffff815c4ed3
> #9 [ffff880276806ea0] default_do_nmi at ffffffff815c5063
>#10 [ffff880276806ed0] do_nmi at ffffffff815c5388
>#11 [ffff880276806ef0] end_repeat_nmi at ffffffff815c4371
>    [exception RIP: _raw_spin_trylock+48]
>    RIP: ffffffff815c3790  RSP: ffff880276803e28  RFLAGS: 00000002
>    RAX: 0000000000000010  RBX: 0000000000000010  RCX: 0000000000000002
>    RDX: ffff880276803e28  RSI: 0000000000000018  RDI: 0000000000000001
>    RBP: ffffffff815c3790   R8: ffffffff815c3790   R9: 0000000000000018
>    R10: ffff880276803e28  R11: 0000000000000002  R12: ffffffffffffffff
>    R13: ffff880273a0c000  R14: ffff8802739ba340  R15: ffff880273a03fd8
>    ORIG_RAX: ffff880273a03fd8  CS: 0010  SS: 0018
>--- <RT exception stack> ---
>#12 [ffff880276803e28] _raw_spin_trylock at ffffffff815c3790
>#13 [ffff880276803e30] rt_spin_lock_slowunlock_hirq at ffffffff815c2cc8
>#14 [ffff880276803e50] rt_spin_unlock_after_trylock_in_irq at ffffffff815c3425
>#15 [ffff880276803e60] get_next_timer_interrupt at ffffffff810684a7
>#16 [ffff880276803ed0] tick_nohz_stop_sched_tick at ffffffff810c5f2e
>#17 [ffff880276803f50] tick_nohz_irq_exit at ffffffff810c6333
>#18 [ffff880276803f70] irq_exit at ffffffff81060065
>#19 [ffff880276803f90] smp_apic_timer_interrupt at ffffffff810358f5
>#20 [ffff880276803fb0] apic_timer_interrupt at ffffffff815cbf9d
>--- <IRQ stack> ---
>#21 [ffff880273a03b28] apic_timer_interrupt at ffffffff815cbf9d
>    [exception RIP: _raw_spin_lock+50]
>    RIP: ffffffff815c3642  RSP: ffff880273a03bd8  RFLAGS: 00000202
>    RAX: 0000000000008b49  RBX: ffff880272157290  RCX: ffff8802739ba340
>    RDX: 0000000000008b4a  RSI: 0000000000000010  RDI: ffff880273a0c000
>    RBP: ffff880273a03bd8   R8: 0000000000000001   R9: 0000000000000000
>    R10: 0000000000000000  R11: 0000000000000001  R12: ffffffff810927b5
>    R13: ffff880273a03b68  R14: 0000000000000010  R15: 0000000000000010
>    ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
>#22 [ffff880273a03be0] rt_spin_lock_slowlock at ffffffff815c2591
>#23 [ffff880273a03cc0] rt_spin_lock at ffffffff815c3362
>#24 [ffff880273a03cd0] run_timer_softirq at ffffffff81069002
>#25 [ffff880273a03d70] handle_softirq at ffffffff81060d0f
>#26 [ffff880273a03db0] do_current_softirqs at ffffffff81060f3c
>#27 [ffff880273a03e20] run_ksoftirqd at ffffffff81061045
>#28 [ffff880273a03e40] smpboot_thread_fn at ffffffff81089c31
>#29 [ffff880273a03ec0] kthread at ffffffff810807fe
>#30 [ffff880273a03f50] ret_from_fork at ffffffff815cb28c
>crash> gdb list *0xffffffff815c2591
>0xffffffff815c2591 is in rt_spin_lock_slowlock (kernel/rtmutex.c:109).
>104     }
>105     #endif
>106     
>107     static inline void init_lists(struct rt_mutex *lock)
>108     {
>109             if (unlikely(!lock->wait_list.node_list.prev))
>110                     plist_head_init(&lock->wait_list);
>111     }
>112     
>113     /*
>crash> gdb list *0xffffffff815c2590
>0xffffffff815c2590 is in rt_spin_lock_slowlock (kernel/rtmutex.c:744).
>739             struct rt_mutex_waiter waiter, *top_waiter;
>740             int ret;
>741     
>742             rt_mutex_init_waiter(&waiter, true);
>743     
>744             raw_spin_lock(&lock->wait_lock);
>745             init_lists(lock);
>746     
>747             if (__try_to_take_rt_mutex(lock, self, NULL, STEAL_LATERAL)) {
>748                     raw_spin_unlock(&lock->wait_lock);
>crash> gdb list *0xffffffff815c2cc8
>0xffffffff815c2cc8 is in rt_spin_lock_slowunlock_hirq (kernel/rtmutex.c:851).
>846     {
>847             int ret;
>848     
>849             do {
>850                     ret = raw_spin_trylock(&lock->wait_lock);
>851             } while (!ret);
>852     
>853             __rt_spin_lock_slowunlock(lock);
>854     }
>855
>
>Dang, Santa might have delivered a lock pick set in a few more hours.

I have a small problem with understanding this…

|#24 [ffff880273a03cd0] run_timer_softirq at ffffffff81069002

Here we obtain wait_lock from tvec_base of _this_ CPU. And we get to
init_lists() before the apic timer kicks in. So we have the wait_lock.
In the hard interrupt triggered by the apic timer we get to
get_next_timer_interrupt() and go again for the same wait_lock. Here we
have the try_lock so we avoid this deadlock.
The odd part: we get the lock. It should be the same lock because both use
| struct tvec_base *base = __this_cpu_read(tvec_bases);
to get it. And we shouldn't get it because the lock is already held.
We get into trouble in the unlock path where we spin forever:

|#14 [ffff880276803e50] rt_spin_unlock_after_trylock_in_irq at ffffffff815c3425
|#12 [ffff880276803e28] _raw_spin_trylock at ffffffff815c3790

which releases the lock with a trylock in order to keep lockdep happy.
My understanding was that we should be able to obtain the wait_lock here
since we were able to obtain it in the lock path and in irq off context
there is nothing that could take the lock in the meantime.
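
The expectation above can be illustrated with a minimal userspace
sketch. It is not kernel code: struct demo_lock and the demo_* helpers
are hypothetical names, and a plain C11 test-and-set flag stands in for
the raw spinlock. The loop has the same shape as the
do { trylock } while (!ret) loop shown at kernel/rtmutex.c:849-851, but
it is bounded so that the demonstration terminates.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a raw spinlock; not the kernel's type. */
struct demo_lock {
	atomic_flag locked;
};

static bool demo_trylock(struct demo_lock *l)
{
	/* test_and_set returns the old value: false means we took the lock. */
	return !atomic_flag_test_and_set_explicit(&l->locked,
						   memory_order_acquire);
}

static void demo_unlock(struct demo_lock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/*
 * Bounded version of the trylock spin loop.
 * Returns the attempt number that succeeded, or -1 if none did.
 */
static int demo_spin_trylock(struct demo_lock *l, int max_attempts)
{
	for (int i = 1; i <= max_attempts; i++) {
		if (demo_trylock(l))
			return i;
	}
	return -1;
}

/* Lock is free: the first attempt succeeds, as the analysis expects. */
static int demo_free_case(void)
{
	struct demo_lock l = { ATOMIC_FLAG_INIT };
	int n = demo_spin_trylock(&l, 1000);

	if (n > 0)
		demo_unlock(&l);
	return n;
}

/*
 * Lock is held by the context that was interrupted. That context cannot
 * run (and so cannot unlock) until the "handler" returns, so no number
 * of attempts helps.
 */
static int demo_held_case(void)
{
	struct demo_lock l = { ATOMIC_FLAG_INIT };
	bool got = demo_trylock(&l);	/* the "interrupted task" */
	int n;

	assert(got);
	(void)got;
	n = demo_spin_trylock(&l, 1000);	/* the "irq handler" */
	demo_unlock(&l);
	return n;
}
```

In this model demo_free_case() succeeds on the first attempt, while
demo_held_case() never succeeds. An unbounded loop in the second case
would spin forever, which is the symptom in frames #12 to #14.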

Sebastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2014-01-17 17:00   ` Sebastian Andrzej Siewior
@ 2014-01-18  3:15     ` Mike Galbraith
  2014-01-21  2:17       ` Steven Rostedt
  0 siblings, 1 reply; 21+ messages in thread
From: Mike Galbraith @ 2014-01-18  3:15 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, LKML, Thomas Gleixner, rostedt, John Kacur

On Fri, 2014-01-17 at 18:00 +0100, Sebastian Andrzej Siewior wrote: 
> * Mike Galbraith | 2013-12-24 16:47:47 [+0100]:
> 
> >I built this kernel with Paul's patch and NO_HZ_FULL enabled again on 64
> >core box.  I haven't seen RCU grip yet, but I just checked on it after
> >3.5 hours into this boot/beat (after fixing crash+kdump setup), and
> >found it in the process of dumping. 
> 
> So you also have the timers-do-not-raise-softirq-unconditionally.patch?

Oh dear, there's holidays, vacation, and massive turkey overdose between
then and now, but I'm almost positive that the tree was virgin $subject,
with only Paul's patch enabled, that being what I wanted to beat on.

> I have a small problem with understanding this…
> 
> |#24 [ffff880273a03cd0] run_timer_softirq at ffffffff81069002
> 
> Here we obtain wait_lock from tvec_base of _this_ CPU. And we get to
> init_lists() before the apic timer kicks in. So we have the wait_lock.

gdb fibs a little, we're acquiring.

>--- <IRQ stack> ---
> >#21 [ffff880273a03b28] apic_timer_interrupt at ffffffff815cbf9d
> >    [exception RIP: _raw_spin_lock+50]

> In the hard interrupt triggered by the apic timer we get to
> get_next_timer_interrupt() and go again for the same wait_lock. Here we
> have the try_lock so we avoid this deadlock.
> The odd part: we get the lock. It should be the same lock because both use
> | struct tvec_base *base = __this_cpu_read(tvec_bases);
> to get it. And we shouldn't get it because the lock is already held.
> We get into trouble in the unlock path where we spin forever:
> 
> |#14 [ffff880276803e50] rt_spin_unlock_after_trylock_in_irq at ffffffff815c3425
> |#12 [ffff880276803e28] _raw_spin_trylock at ffffffff815c3790
> 
> which releases the lock with a trylock in order to keep lockdep happy.
> My understanding was that we should be able to obtain the wait_lock here
> since we were able to obtain it in the lock path and in irq off context
> there is nothing that could take the lock in the meantime.

IIRC, we were endlessly trying, but with an un-punched ticket under us,
and no Xen like evilness to save the day.
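
One possible reading of the "un-punched ticket" remark, assuming the
raw spinlock is a ticket lock (as on x86 in this kernel series), is
sketched below. It is a single-threaded model, not kernel code: struct
ticket_model and the model_* helpers are hypothetical names, and the
fields are plain integers rather than atomics. The point it tries to
show is that a trylock can fail forever even though nobody holds the
lock, because the interrupted task has already drawn a ticket that only
it can redeem.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded model of a ticket lock; names are illustrative. */
struct ticket_model {
	unsigned int head;	/* ticket currently being served */
	unsigned int tail;	/* next ticket to hand out */
};

/* Blocking lock, step 1: draw a ticket. Step 2 is waiting for head. */
static unsigned int model_take_ticket(struct ticket_model *l)
{
	return l->tail++;
}

static bool model_is_my_turn(const struct ticket_model *l, unsigned int t)
{
	return l->head == t;
}

/* trylock only succeeds when no ticket is outstanding at all. */
static bool model_trylock(struct ticket_model *l)
{
	if (l->head != l->tail)
		return false;
	l->tail++;
	return true;
}

static void model_unlock(struct ticket_model *l)
{
	assert(l->head != l->tail);	/* there must be an owner */
	l->head++;
}

/*
 * Returns true if a trylock issued from "interrupt context" succeeds
 * within max_attempts while the interrupted task sits on a drawn but
 * not yet redeemed ticket.
 */
static bool model_irq_trylock_succeeds(int max_attempts)
{
	struct ticket_model l = { 0, 0 };
	unsigned int other = model_take_ticket(&l);	/* other CPU owns lock */
	unsigned int mine = model_take_ticket(&l);	/* this task queues up */

	if (!model_is_my_turn(&l, other))
		return false;		/* not reached in this model */
	model_unlock(&l);		/* other CPU releases the lock */
	if (!model_is_my_turn(&l, mine))
		return false;		/* not reached in this model */

	/* The interrupt hits before the task notices that it is its turn. */
	for (int i = 0; i < max_attempts; i++) {
		if (model_trylock(&l))
			return true;
	}
	return false;			/* head != tail for as long as we spin */
}
```

In this model model_irq_trylock_succeeds() returns false for any bound:
head stays one behind tail until the interrupted task runs again, and
it cannot run while the handler spins on the same CPU.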

I've since cleaned out my crashdump directory and moved on to frolicking
with hotplug gremlins, so don't have that one to revisit, but the don't
unconditionally raise timer softirq patch is the bad guy.

-Mike

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2013-12-27 20:00 ` Nicholas Mc Guire
                     ` (2 preceding siblings ...)
  2014-01-17 16:10   ` Sebastian Andrzej Siewior
@ 2014-01-19 20:54   ` Fernando Lopez-Lezcano
  3 siblings, 0 replies; 21+ messages in thread
From: Fernando Lopez-Lezcano @ 2014-01-19 20:54 UTC (permalink / raw)
  To: Nicholas Mc Guire, Sebastian Andrzej Siewior
  Cc: nando, linux-rt-users, LKML, Thomas Gleixner, rostedt, John Kacur

[-- Attachment #1: Type: text/plain, Size: 23005 bytes --]

On 12/27/2013 12:00 PM, Nicholas Mc Guire wrote:
> On Mon, 23 Dec 2013, Sebastian Andrzej Siewior wrote:
>
>> Dear RT folks!
>>
>> I'm pleased to announce the v3.12.6-rt9 patch set.
>>
>> Changes since v3.12.6-rt8
> <snip>
>> - A patch from Thomas Gleixner not to raise the timer softirq
>>    unconditionally (only if a timer is pending)
>>
>
> This one seems to deadlock early in the boot sequence on x86
> (i3/i7/Phenom-4x here and Carsten Emde also had boot failures)
>
> after dropping this patch with:
> patch -p1 -R < ../paches/timers-do-not-raise-softirq-unconditionally.patch
> 3.12.6-rt9 boots up fine. cyclictest seems to be back to what it was before
> (only ran for a few minutes idle and 1h with load on an i3).

I can confirm that dropping this patch makes the kernel bootable in my 
tests (it would hang very early before, no clues left behind).

On a boot this morning I got a couple of oops (I'm attaching the 
configuration for this kernel), see below.
-- Fernando

--------
Jan 19 12:42:11 localhost kernel: [    2.721316] BUG: unable to handle 
kernel paging request at 0000000000841f0f
Jan 19 12:42:11 localhost kernel: [    2.721320] IP: 
[<ffffffff81331f3b>] __list_add+0x1b/0xc0
Jan 19 12:42:11 localhost kernel: [    2.721321] PGD 0
Jan 19 12:42:11 localhost kernel: [    2.721322] Oops: 0000 [#1] PREEMPT 
SMP
Jan 19 12:42:11 localhost kernel: [    2.721330] Modules linked in: 
snd_hda_codec_realtek lpc_ich(+) e1000e(+) mfd_core mei_me ptp mei 
pps_core snd_hda_intel(+) snd_hda_codec snd_hdsp(+) snd_rawmidi 
snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd 
soundcore shpchp firewire_ohci firewire_core nouveau crc_itu_t 
i2c_algo_bit drm_kms_helper ttm drm i2c_core mxm_wmi wmi video
Jan 19 12:42:11 localhost kernel: [    2.721332] CPU: 6 PID: 427 Comm: 
systemd-udevd Not tainted 3.12.6-300.rt9.1.fc20.ccrma.x86_64+rt #1
Jan 19 12:42:11 localhost kernel: [    2.721333] Hardware name: 
          /DZ77GA-70K, BIOS GAZ7711H.86A.0061.2012.1228.1110 12/28/2012
Jan 19 12:42:11 localhost kernel: [    2.721333] task: ffff8800372fdc80 
ti: ffff8807d57ea000 task.ti: ffff8807d57ea000
Jan 19 12:42:11 localhost kernel: [    2.721334] RIP: 
0010:[<ffffffff81331f3b>]  [<ffffffff81331f3b>] __list_add+0x1b/0xc0
Jan 19 12:42:11 localhost kernel: [    2.721335] RSP: 
0018:ffff8807d57ebd28  EFLAGS: 00010046
Jan 19 12:42:11 localhost kernel: [    2.721336] RAX: 0000000000000000 
RBX: ffff8807d57ebe30 RCX: ffffffff81353ba8
Jan 19 12:42:11 localhost kernel: [    2.721336] RDX: ffffffff81353bb0 
RSI: 0000000000841f0f RDI: ffff8807d57ebe30
Jan 19 12:42:11 localhost kernel: [    2.721337] RBP: ffff8807d57ebd40 
R08: 0000000000841f0f R09: 0000000000000000
Jan 19 12:42:11 localhost kernel: [    2.721337] R10: ffffffff81ca67c9 
R11: 0000000000000246 R12: ffffffff81353bb0
Jan 19 12:42:11 localhost kernel: [    2.721338] R13: 0000000000841f0f 
R14: ffff8807d57ebe40 R15: ffff8807d57ebe30
Jan 19 12:42:11 localhost kernel: [    2.721338] FS: 
00007fb3e891c880(0000) GS:ffff8807fe500000(0000) knlGS:0000000000000000
Jan 19 12:42:11 localhost kernel: [    2.721339] CS:  0010 DS: 0000 ES: 
0000 CR0: 0000000080050033
Jan 19 12:42:11 localhost kernel: [    2.721339] CR2: 0000000000841f0f 
CR3: 00000007d50a1000 CR4: 00000000001407e0
Jan 19 12:42:11 localhost kernel: [    2.721340] Stack:
Jan 19 12:42:11 localhost kernel: [    2.721342]  ffff8807d57ebe28 
ffffffff81ca6e68 ffffffff81353bc0 ffff8807d57ebd78
Jan 19 12:42:11 localhost kernel: [    2.721343]  ffffffff8131e122 
ffffffff81ca67c8 ffff8807d57ebe18 ffff8807d57ebe00
Jan 19 12:42:11 localhost kernel: [    2.721343]  ffffffff81ca67b0 
ffff8800372fdc80 ffff8807d57ebdd8 ffffffff810db933
Jan 19 12:42:11 localhost kernel: [    2.721344] Call Trace:
Jan 19 12:42:11 localhost kernel: [    2.721347]  [<ffffffff81353bc0>] ? 
pci_dev_attrs_are_visible+0x40/0x40
Jan 19 12:42:11 localhost kernel: [    2.721349]  [<ffffffff8131e122>] 
plist_add+0x82/0xd0
Jan 19 12:42:11 localhost kernel: [    2.721352]  [<ffffffff810db933>] 
task_blocks_on_rt_mutex+0x1e3/0x260
Jan 19 12:42:11 localhost kernel: [    2.721354]  [<ffffffff8168a951>] 
rt_mutex_slowlock+0x111/0x260
Jan 19 12:42:11 localhost kernel: [    2.721356]  [<ffffffff8168ab51>] 
rt_mutex_lock+0x31/0x40
Jan 19 12:42:11 localhost kernel: [    2.721358]  [<ffffffff8168b4be>] 
_mutex_lock+0xe/0x10
Jan 19 12:42:11 localhost kernel: [    2.721360]  [<ffffffff81232f96>] 
sysfs_read_file+0x36/0x1a0
Jan 19 12:42:11 localhost kernel: [    2.721363]  [<ffffffff811bae6e>] 
vfs_read+0x9e/0x170
Jan 19 12:42:11 localhost kernel: [    2.721364]  [<ffffffff811bb999>] 
SyS_read+0x49/0xa0
Jan 19 12:42:11 localhost kernel: [    2.721366]  [<ffffffff81693529>] 
system_call_fastpath+0x16/0x1b
Jan 19 12:42:11 localhost kernel: [    2.721374] Code: 5e 41 5f 5d c3 66 
2e 0f 1f 84 00 00 00 00 00 90 55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 
53 4c 8b 42 08 48 89 fb 49 39 f0 75 2a <4d> 8b 45 00 4d 39 c4 75 68 4c 
39 e3 74 3e 4c 39 eb 74 39 49 89
Jan 19 12:42:11 localhost kernel: [    2.721375] RIP 
[<ffffffff81331f3b>] __list_add+0x1b/0xc0
Jan 19 12:42:11 localhost kernel: [    2.721375]  RSP <ffff8807d57ebd28>
Jan 19 12:42:11 localhost kernel: [    2.721375] CR2: 0000000000841f0f
Jan 19 12:42:11 localhost kernel: [    2.726383] ---[ end trace 
0000000000000002 ]---
Jan 19 12:42:11 localhost kernel: [    2.726385] note: 
systemd-udevd[427] exited with preempt_count 2
Jan 19 12:42:11 localhost kernel: [    2.726534] ------------[ cut here 
]------------
Jan 19 12:42:11 localhost kernel: [    2.726536] WARNING: CPU: 6 PID: 
427 at lib/list_debug.c:62 __list_del_entry+0x82/0xd0()
Jan 19 12:42:11 localhost kernel: [    2.726536] list_del corruption. 
next->prev should be ffffffff81ca67e0, but was ffffffff81ca67d8
Jan 19 12:42:11 localhost kernel: [    2.726541] Modules linked in: 
ac97_bus(+) snd_hda_codec_realtek lpc_ich(+) e1000e(+) mfd_core mei_me 
ptp mei pps_core snd_hda_intel(+) snd_hda_codec snd_hdsp(+) snd_rawmidi 
snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd 
soundcore shpchp firewire_ohci firewire_core nouveau crc_itu_t 
i2c_algo_bit drm_kms_helper ttm drm i2c_core mxm_wmi wmi video
Jan 19 12:42:11 localhost kernel: [    2.726542] CPU: 6 PID: 427 Comm: 
systemd-udevd Tainted: G      D 
3.12.6-300.rt9.1.fc20.ccrma.x86_64+rt #1
Jan 19 12:42:11 localhost kernel: [    2.726543] Hardware name: 
          /DZ77GA-70K, BIOS GAZ7711H.86A.0061.2012.1228.1110 12/28/2012
Jan 19 12:42:11 localhost kernel: [    2.726544]  0000000000000009 
ffff8807d57eb8a0 ffffffff8168536e ffff8807d57eb8e8
Jan 19 12:42:11 localhost kernel: [    2.726545]  ffff8807d57eb8d8 
ffffffff8106d21d ffffffff81ca67e0 ffffffff81ca6790
Jan 19 12:42:11 localhost kernel: [    2.726546]  ffff8807d866e770 
ffff8807d83aa690 ffff8807d59d2920 ffff8807d57eb938
Jan 19 12:42:11 localhost kernel: [    2.726546] Call Trace:
Jan 19 12:42:11 localhost kernel: [    2.726547]  [<ffffffff8168536e>] 
dump_stack+0x54/0x9a
Jan 19 12:42:11 localhost kernel: [    2.726549]  [<ffffffff8106d21d>] 
warn_slowpath_common+0x7d/0xc0
Jan 19 12:42:11 localhost kernel: [    2.726550]  [<ffffffff8106d2ac>] 
warn_slowpath_fmt+0x4c/0x50
Jan 19 12:42:11 localhost kernel: [    2.726551]  [<ffffffff81332062>] 
__list_del_entry+0x82/0xd0
Jan 19 12:42:11 localhost kernel: [    2.726552]  [<ffffffff813320bd>] 
list_del+0xd/0x30
Jan 19 12:42:11 localhost kernel: [    2.726553]  [<ffffffff81232bdf>] 
sysfs_release+0x3f/0xa0
Jan 19 12:42:11 localhost kernel: [    2.726554]  [<ffffffff811bc970>] 
__fput+0xd0/0x220
Jan 19 12:42:11 localhost kernel: [    2.726555]  [<ffffffff811bcb0e>] 
____fput+0xe/0x10
Jan 19 12:42:11 localhost kernel: [    2.726556]  [<ffffffff8108e514>] 
task_work_run+0xc4/0xe0
Jan 19 12:42:11 localhost kernel: [    2.726558]  [<ffffffff810700e8>] 
do_exit+0x2c8/0xae0
Jan 19 12:42:11 localhost kernel: [    2.726560]  [<ffffffff81681516>] ? 
printk+0x67/0x69
Jan 19 12:42:11 localhost kernel: [    2.726562]  [<ffffffff810c30e1>] ? 
kmsg_dump+0xc1/0xd0
Jan 19 12:42:11 localhost kernel: [    2.726563]  [<ffffffff8168ccd2>] 
oops_end+0xa2/0xe0
Jan 19 12:42:11 localhost kernel: [    2.726564]  [<ffffffff81680985>] 
no_context+0x263/0x270
Jan 19 12:42:11 localhost kernel: [    2.726565]  [<ffffffff81680a05>] 
__bad_area_nosemaphore+0x73/0x1ca
Jan 19 12:42:11 localhost kernel: [    2.726567]  [<ffffffff810ada22>] ? 
enqueue_task_fair+0x412/0x660
Jan 19 12:42:11 localhost kernel: [    2.726567]  [<ffffffff81680b6f>] 
bad_area_nosemaphore+0x13/0x15
Jan 19 12:42:11 localhost kernel: [    2.726568]  [<ffffffff8168f104>] 
__do_page_fault+0xf4/0x600
Jan 19 12:42:11 localhost kernel: [    2.726569]  [<ffffffff810aade0>] ? 
update_curr+0x70/0x1f0
Jan 19 12:42:11 localhost kernel: [    2.726570]  [<ffffffff810ab386>] ? 
dequeue_entity+0x106/0x560
Jan 19 12:42:11 localhost kernel: [    2.726571]  [<ffffffff810abbfe>] ? 
dequeue_task_fair+0x41e/0x620
Jan 19 12:42:11 localhost kernel: [    2.726572]  [<ffffffff81353bb0>] ? 
pci_dev_attrs_are_visible+0x30/0x40
Jan 19 12:42:11 localhost kernel: [    2.726573]  [<ffffffff8168f61e>] 
do_page_fault+0xe/0x10
Jan 19 12:42:11 localhost kernel: [    2.726574]  [<ffffffff8168c118>] 
page_fault+0x28/0x30
Jan 19 12:42:11 localhost kernel: [    2.726575]  [<ffffffff81353bb0>] ? 
pci_dev_attrs_are_visible+0x30/0x40
Jan 19 12:42:11 localhost kernel: [    2.726576]  [<ffffffff81353ba8>] ? 
pci_dev_attrs_are_visible+0x28/0x40
Jan 19 12:42:11 localhost kernel: [    2.726576]  [<ffffffff81353bb0>] ? 
pci_dev_attrs_are_visible+0x30/0x40
Jan 19 12:42:11 localhost kernel: [    2.726577]  [<ffffffff81331f3b>] ? 
__list_add+0x1b/0xc0
Jan 19 12:42:11 localhost kernel: [    2.726578]  [<ffffffff81353bc0>] ? 
pci_dev_attrs_are_visible+0x40/0x40
Jan 19 12:42:11 localhost kernel: [    2.726580]  [<ffffffff8131e122>] 
plist_add+0x82/0xd0
Jan 19 12:42:11 localhost kernel: [    2.726581]  [<ffffffff810db933>] 
task_blocks_on_rt_mutex+0x1e3/0x260
Jan 19 12:42:11 localhost kernel: [    2.726583]  [<ffffffff8168a951>] 
rt_mutex_slowlock+0x111/0x260
Jan 19 12:42:11 localhost kernel: [    2.726584]  [<ffffffff8168ab51>] 
rt_mutex_lock+0x31/0x40
Jan 19 12:42:11 localhost kernel: [    2.726586]  [<ffffffff8168b4be>] 
_mutex_lock+0xe/0x10
Jan 19 12:42:11 localhost kernel: [    2.726586]  [<ffffffff81232f96>] 
sysfs_read_file+0x36/0x1a0
Jan 19 12:42:11 localhost kernel: [    2.726588]  [<ffffffff811bae6e>] 
vfs_read+0x9e/0x170
Jan 19 12:42:11 localhost kernel: [    2.726589]  [<ffffffff811bb999>] 
SyS_read+0x49/0xa0
Jan 19 12:42:11 localhost kernel: [    2.726590]  [<ffffffff81693529>] 
system_call_fastpath+0x16/0x1b
Jan 19 12:42:11 localhost kernel: [    2.726591] ---[ end trace 
0000000000000003 ]---
Jan 19 12:42:11 localhost kernel: [    2.726594] ------------[ cut here 
]------------
Jan 19 12:42:11 localhost kernel: [    2.726594] kernel BUG at 
mm/slub.c:3398!
Jan 19 12:42:11 localhost kernel: [    2.726595] invalid opcode: 0000 
[#2] PREEMPT SMP
Jan 19 12:42:11 localhost kernel: [    2.726599] Modules linked in: 
ac97_bus(+) snd_hda_codec_realtek lpc_ich(+) e1000e(+) mfd_core mei_me 
ptp mei pps_core snd_hda_intel(+) snd_hda_codec snd_hdsp(+) snd_rawmidi 
snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd 
soundcore shpchp firewire_ohci firewire_core nouveau crc_itu_t 
i2c_algo_bit drm_kms_helper ttm drm i2c_core mxm_wmi wmi video
Jan 19 12:42:11 localhost kernel: [    2.726600] CPU: 6 PID: 427 Comm: 
systemd-udevd Tainted: G      D W 
3.12.6-300.rt9.1.fc20.ccrma.x86_64+rt #1
Jan 19 12:42:11 localhost kernel: [    2.726600] Hardware name: 
          /DZ77GA-70K, BIOS GAZ7711H.86A.0061.2012.1228.1110 12/28/2012
Jan 19 12:42:11 localhost kernel: [    2.726600] task: ffff8800372fdc80 
ti: ffff8807d57ea000 task.ti: ffff8807d57ea000
Jan 19 12:42:11 localhost kernel: [    2.726602] RIP: 
0010:[<ffffffff811a4237>]  [<ffffffff811a4237>] kfree+0x1f7/0x200
Jan 19 12:42:11 localhost kernel: [    2.726603] RSP: 
0018:ffff8807d57eb928  EFLAGS: 00010046
Jan 19 12:42:11 localhost kernel: [    2.726603] RAX: 003ff00000000400 
RBX: ffffffff81ca6790 RCX: 0000000000000000
Jan 19 12:42:11 localhost kernel: [    2.726604] RDX: 0000000000000000 
RSI: 0000000000000000 RDI: ffffffff81ca6790
Jan 19 12:42:11 localhost kernel: [    2.726604] RBP: ffff8807d57eb960 
R08: ffffffff81f3dd9c R09: ffffffff81f558b0
Jan 19 12:42:11 localhost kernel: [    2.726605] R10: 0000000000028520 
R11: 0000000000040000 R12: ffffffff81ca6790
Jan 19 12:42:11 localhost kernel: [    2.726606] R13: ffffea0000072980 
R14: ffff8807d83aa690 R15: ffff8807d59d2920
Jan 19 12:42:11 localhost kernel: [    2.726606] FS: 
0000000000000000(0000) GS:ffff8807fe500000(0000) knlGS:0000000000000000
Jan 19 12:42:11 localhost kernel: [    2.726607] CS:  0010 DS: 0000 ES: 
0000 CR0: 0000000080050033
Jan 19 12:42:11 localhost kernel: [    2.726607] CR2: 0000000000841f0f 
CR3: 0000000001c0f000 CR4: 00000000001407e0
Jan 19 12:42:11 localhost kernel: [    2.726607] Stack:
Jan 19 12:42:11 localhost kernel: [    2.726608]  ffffffff81ca67d8 
0000000000000000 ffffffff81ca6790 ffffffff81ca6790
Jan 19 12:42:11 localhost kernel: [    2.726609]  ffff8807d866e770 
ffff8807d83aa690 ffff8807d59d2920 ffff8807d57eb988
Jan 19 12:42:11 localhost kernel: [    2.726610]  ffffffff81232c0a 
ffff8807d61d5180 0000000000000010 ffff8807d83ad600
Jan 19 12:42:11 localhost kernel: [    2.726610] Call Trace:
Jan 19 12:42:11 localhost kernel: [    2.726611]  [<ffffffff81232c0a>] 
sysfs_release+0x6a/0xa0
Jan 19 12:42:11 localhost kernel: [    2.726611]  [<ffffffff811bc970>] 
__fput+0xd0/0x220
Jan 19 12:42:11 localhost kernel: [    2.726612]  [<ffffffff811bcb0e>] 
____fput+0xe/0x10
Jan 19 12:42:11 localhost kernel: [    2.726613]  [<ffffffff8108e514>] 
task_work_run+0xc4/0xe0
Jan 19 12:42:11 localhost kernel: [    2.726614]  [<ffffffff810700e8>] 
do_exit+0x2c8/0xae0
Jan 19 12:42:11 localhost kernel: [    2.726615]  [<ffffffff81681516>] ? 
printk+0x67/0x69
Jan 19 12:42:11 localhost kernel: [    2.726616]  [<ffffffff810c30e1>] ? 
kmsg_dump+0xc1/0xd0
Jan 19 12:42:11 localhost kernel: [    2.726617]  [<ffffffff8168ccd2>] 
oops_end+0xa2/0xe0
Jan 19 12:42:11 localhost kernel: [    2.726617]  [<ffffffff81680985>] 
no_context+0x263/0x270
Jan 19 12:42:11 localhost kernel: [    2.726618]  [<ffffffff81680a05>] 
__bad_area_nosemaphore+0x73/0x1ca
Jan 19 12:42:11 localhost kernel: [    2.726619]  [<ffffffff810ada22>] ? 
enqueue_task_fair+0x412/0x660
Jan 19 12:42:11 localhost kernel: [    2.726620]  [<ffffffff81680b6f>] 
bad_area_nosemaphore+0x13/0x15
Jan 19 12:42:11 localhost kernel: [    2.726621]  [<ffffffff8168f104>] 
__do_page_fault+0xf4/0x600
Jan 19 12:42:11 localhost kernel: [    2.726621]  [<ffffffff810aade0>] ? 
update_curr+0x70/0x1f0
Jan 19 12:42:11 localhost kernel: [    2.726622]  [<ffffffff810ab386>] ? 
dequeue_entity+0x106/0x560
Jan 19 12:42:11 localhost kernel: [    2.726623]  [<ffffffff810abbfe>] ? 
dequeue_task_fair+0x41e/0x620
Jan 19 12:42:11 localhost kernel: [    2.726625]  [<ffffffff81353bb0>] ? 
pci_dev_attrs_are_visible+0x30/0x40
Jan 19 12:42:11 localhost kernel: [    2.726626]  [<ffffffff8168f61e>] 
do_page_fault+0xe/0x10
Jan 19 12:42:11 localhost kernel: [    2.726627]  [<ffffffff8168c118>] 
page_fault+0x28/0x30
Jan 19 12:42:11 localhost kernel: [    2.726628]  [<ffffffff81353bb0>] ? 
pci_dev_attrs_are_visible+0x30/0x40
Jan 19 12:42:11 localhost kernel: [    2.726629]  [<ffffffff81353ba8>] ? 
pci_dev_attrs_are_visible+0x28/0x40
Jan 19 12:42:11 localhost kernel: [    2.726630]  [<ffffffff81353bb0>] ? 
pci_dev_attrs_are_visible+0x30/0x40
Jan 19 12:42:11 localhost kernel: [    2.726631]  [<ffffffff81331f3b>] ? 
__list_add+0x1b/0xc0
Jan 19 12:42:11 localhost kernel: [    2.726632]  [<ffffffff81353bc0>] ? 
pci_dev_attrs_are_visible+0x40/0x40
Jan 19 12:42:11 localhost kernel: [    2.726633]  [<ffffffff8131e122>] 
plist_add+0x82/0xd0
Jan 19 12:42:11 localhost kernel: [    2.726634]  [<ffffffff810db933>] 
task_blocks_on_rt_mutex+0x1e3/0x260
Jan 19 12:42:11 localhost kernel: [    2.726635]  [<ffffffff8168a951>] 
rt_mutex_slowlock+0x111/0x260
Jan 19 12:42:11 localhost kernel: [    2.726637]  [<ffffffff8168ab51>] 
rt_mutex_lock+0x31/0x40
Jan 19 12:42:11 localhost kernel: [    2.726638]  [<ffffffff8168b4be>] 
_mutex_lock+0xe/0x10
Jan 19 12:42:11 localhost kernel: [    2.726638]  [<ffffffff81232f96>] 
sysfs_read_file+0x36/0x1a0
Jan 19 12:42:11 localhost kernel: [    2.726640]  [<ffffffff811bae6e>] 
vfs_read+0x9e/0x170
Jan 19 12:42:11 localhost kernel: [    2.726641]  [<ffffffff811bb999>] 
SyS_read+0x49/0xa0
Jan 19 12:42:11 localhost kernel: [    2.726642]  [<ffffffff81693529>] 
system_call_fastpath+0x16/0x1b
Jan 19 12:42:11 localhost kernel: [    2.726648] Code: 00 c0 00 00 74 25 
49 8b 45 00 31 f6 f6 c4 40 74 04 41 8b 75 68 4c 89 ef e8 37 4c fb ff e9 
14 ff ff ff 4d 8b 6d 30 e9 74 fe ff ff <0f> 0b 0f 1f 80 00 00 00 00 0f 
1f 44 00 00 55 48 89 e5 41 57 41
Jan 19 12:42:11 localhost kernel: [    2.726649] RIP 
[<ffffffff811a4237>] kfree+0x1f7/0x200
Jan 19 12:42:11 localhost kernel: [    2.726650]  RSP <ffff8807d57eb928>
Jan 19 12:42:11 localhost kernel: [    2.736576] usb 3-1.2: new 
low-speed USB device number 4 using xhci_hcd
Jan 19 12:42:11 localhost kernel: [    2.752839] usb 3-1.2: New USB 
device found, idVendor=046d, idProduct=c00e
Jan 19 12:42:11 localhost kernel: [    2.752841] usb 3-1.2: New USB 
device strings: Mfr=1, Product=2, SerialNumber=0
Jan 19 12:42:11 localhost kernel: [    2.752842] usb 3-1.2: Product: 
USB-PS/2 Optical Mouse
Jan 19 12:42:11 localhost kernel: [    2.752843] usb 3-1.2: 
Manufacturer: Logitech
Jan 19 12:42:11 localhost kernel: [    2.785648] ---[ end trace 
0000000000000004 ]---
Jan 19 12:42:11 localhost kernel: [    2.785649] Fixing recursive fault 
but reboot is needed!
Jan 19 12:42:11 localhost kernel: [    2.785650] BUG: unable to handle 
kernel paging request at 0000000100000010
Jan 19 12:42:11 localhost kernel: [    2.785653] IP: 
[<ffffffff8134734c>] pinctrl_get+0x4c/0x470
Jan 19 12:42:11 localhost kernel: [    2.785654] PGD 0
Jan 19 12:42:11 localhost kernel: [    2.785654] Oops: 0000 [#3] PREEMPT 
SMP
Jan 19 12:42:11 localhost kernel: [    2.785663] Modules linked in: 
ac97_bus(+) snd_hda_codec_realtek lpc_ich(+) e1000e(+) mfd_core mei_me 
ptp mei pps_core snd_hda_intel(+) snd_hda_codec snd_hdsp(+) snd_rawmidi 
snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer snd 
soundcore shpchp firewire_ohci firewire_core nouveau crc_itu_t 
i2c_algo_bit drm_kms_helper ttm drm i2c_core mxm_wmi wmi video
Jan 19 12:42:11 localhost kernel: [    2.785664] CPU: 4 PID: 75 Comm: 
khubd Tainted: G      D W    3.12.6-300.rt9.1.fc20.ccrma.x86_64+rt #1
Jan 19 12:42:11 localhost kernel: [    2.785664] Hardware name: 
          /DZ77GA-70K, BIOS GAZ7711H.86A.0061.2012.1228.1110 12/28/2012
Jan 19 12:42:11 localhost kernel: [    2.785665] task: ffff8807d84b5340 
ti: ffff8807d86ce000 task.ti: ffff8807d86ce000
Jan 19 12:42:11 localhost kernel: [    2.785666] RIP: 
0010:[<ffffffff8134734c>]  [<ffffffff8134734c>] pinctrl_get+0x4c/0x470
Jan 19 12:42:11 localhost kernel: [    2.785667] RSP: 
0018:ffff8807d86cfb98  EFLAGS: 00010203
Jan 19 12:42:11 localhost kernel: [    2.785667] RAX: 0000000000000000 
RBX: 0000000100000000 RCX: 0000000000000000
Jan 19 12:42:11 localhost kernel: [    2.785668] RDX: ffff8807d84b5340 
RSI: 0000000000000000 RDI: ffffffff81ca6840
Jan 19 12:42:11 localhost kernel: [    2.785668] RBP: ffff8807d86cfbe0 
R08: 00000000000664e0 R09: ffff8807cd8ca1c0
Jan 19 12:42:11 localhost kernel: [    2.785669] R10: ffff8807fdc03b00 
R11: 0000000000000000 R12: ffff8807d5260088
Jan 19 12:42:11 localhost kernel: [    2.785669] R13: ffff8807d5260088 
R14: ffff8807d5260088 R15: ffff8807d5a3a800
Jan 19 12:42:11 localhost kernel: [    2.785670] FS: 
0000000000000000(0000) GS:ffff8807fe400000(0000) knlGS:0000000000000000
Jan 19 12:42:11 localhost kernel: [    2.785670] CS:  0010 DS: 0000 ES: 
0000 CR0: 0000000080050033
Jan 19 12:42:11 localhost kernel: [    2.785671] CR2: 0000000100000010 
CR3: 00000000df05b000 CR4: 00000000001407e0
Jan 19 12:42:11 localhost kernel: [    2.785671] Stack:
Jan 19 12:42:11 localhost kernel: [    2.785673]  ffffffff81a6afa4 
ffffffff813471c0 ffff8807d86cfbe0 ffffffff8141bc13
Jan 19 12:42:11 localhost kernel: [    2.785674]  ffff8807cd8ca1e8 
ffff8807d46d2e08 ffff8807d5260088 ffff8807d5260088
Jan 19 12:42:11 localhost kernel: [    2.785674]  ffff8807d5a3a800 
ffff8807d86cfc08 ffffffff813477b5 ffff8807d5260088
Jan 19 12:42:11 localhost kernel: [    2.785675] Call Trace:
Jan 19 12:42:11 localhost kernel: [    2.785676]  [<ffffffff813471c0>] ? 
pinctrl_put+0x30/0x30
Jan 19 12:42:11 localhost kernel: [    2.785678]  [<ffffffff8141bc13>] ? 
__devres_alloc+0x43/0x70
Jan 19 12:42:11 localhost kernel: [    2.785679]  [<ffffffff813477b5>] 
devm_pinctrl_get+0x45/0x80
Jan 19 12:42:11 localhost kernel: [    2.785681]  [<ffffffff8143205b>] 
pinctrl_bind_pins+0x3b/0x220
Jan 19 12:42:11 localhost kernel: [    2.785682]  [<ffffffff814189ba>] 
driver_probe_device+0x6a/0x3a0
Jan 19 12:42:11 localhost kernel: [    2.785684]  [<ffffffff81418cf0>] ? 
driver_probe_device+0x3a0/0x3a0
Jan 19 12:42:11 localhost kernel: [    2.785685]  [<ffffffff81418d2b>] 
__device_attach+0x3b/0x40
Jan 19 12:42:11 localhost kernel: [    2.785686]  [<ffffffff81416893>] 
bus_for_each_drv+0x63/0xa0
Jan 19 12:42:11 localhost kernel: [    2.785687]  [<ffffffff814188d8>] 
device_attach+0x88/0xa0
Jan 19 12:42:11 localhost kernel: [    2.785688]  [<ffffffff81417bd8>] 
bus_probe_device+0xa8/0xd0
Jan 19 12:42:11 localhost kernel: [    2.785689]  [<ffffffff81415794>] 
device_add+0x4c4/0x7a0
Jan 19 12:42:11 localhost kernel: [    2.785691]  [<ffffffff8148b000>] 
usb_new_device+0x220/0x3b0
Jan 19 12:42:11 localhost kernel: [    2.785692]  [<ffffffff8148dd08>] 
hub_thread+0x8b8/0x1710
Jan 19 12:42:11 localhost kernel: [    2.785695]  [<ffffffff81092b30>] ? 
wake_up_atomic_t+0x30/0x30
Jan 19 12:42:11 localhost kernel: [    2.785696]  [<ffffffff8148d450>] ? 
usb_reset_device+0x1d0/0x1d0
Jan 19 12:42:11 localhost kernel: [    2.785697]  [<ffffffff81091f52>] 
kthread+0xb2/0xc0
Jan 19 12:42:11 localhost kernel: [    2.785698]  [<ffffffff81091ea0>] ? 
kthread_worker_fn+0x180/0x180
Jan 19 12:42:11 localhost kernel: [    2.785700]  [<ffffffff8169347c>] 
ret_from_fork+0x7c/0xb0
Jan 19 12:42:11 localhost kernel: [    2.785701]  [<ffffffff81091ea0>] ? 
kthread_worker_fn+0x180/0x180
Jan 19 12:42:11 localhost kernel: [    2.785708] Code: 00 48 c7 c7 40 68 
ca 81 e8 82 41 34 00 48 8b 1d 5b f4 95 00 48 81 fb 90 67 ca 81 75 0e eb 
60 48 8b 1b 48 81 fb 90 67 ca 81 74 54 <4c> 3b 63 10 75 ee 48 c7 c7 40 
68 ca 81 e8 b2 41 34 00 48 85 db
Jan 19 12:42:11 localhost kernel: [    2.785709] RIP 
[<ffffffff8134734c>] pinctrl_get+0x4c/0x470
Jan 19 12:42:11 localhost kernel: [    2.785709]  RSP <ffff8807d86cfb98>
Jan 19 12:42:11 localhost kernel: [    2.785709] CR2: 0000000100000010
Jan 19 12:42:11 localhost kernel: [    2.820180] ---[ end trace 
0000000000000005 ]---


[-- Attachment #2: config-3.12.6-300.rt9.1.fc20.ccrma.x86_64+rt.bz2 --]
[-- Type: application/x-bzip, Size: 31439 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2014-01-18  3:15     ` Mike Galbraith
@ 2014-01-21  2:17       ` Steven Rostedt
  2014-01-21  6:39         ` Muli Baron
                           ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Steven Rostedt @ 2014-01-21  2:17 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sebastian Andrzej Siewior, linux-rt-users, LKML, Thomas Gleixner,
	John Kacur

On Sat, 18 Jan 2014 04:15:29 +0100
Mike Galbraith <bitbucket@online.de> wrote:

 
> > So you also have the timers-do-not-raise-softirq-unconditionally.patch?
> 

People have been complaining that the latest 3.12-rt does not boot on
Intel i7 boxes, and that it boots fine once this patch is reverted.

I happen to have an i7 box to test on, and sure enough, the latest
3.12-rt locks up on boot. With
timers-do-not-raise-softirq-unconditionally.patch reverted, it boots fine.

Looking into it, I made this small update, and the box boots. Seems
checking "active_timers" is not enough to skip raising softirqs. I
haven't looked at why yet, but I would like others to test this patch
too.

I'll leave why this lets i7 boxes boot as an exercise for Thomas ;-)

-- Steve

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/timer.c b/kernel/timer.c
index 46467be..8212c10 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1464,13 +1464,11 @@ void run_local_timers(void)
 		raise_softirq(TIMER_SOFTIRQ);
 		return;
 	}
-	if (!base->active_timers)
-		goto out;
 
 	/* Check whether the next pending timer has expired */
 	if (time_before_eq(base->next_timer, jiffies))
 		raise_softirq(TIMER_SOFTIRQ);
-out:
+
 	rt_spin_unlock_after_trylock_in_irq(&base->lock);
 
 }

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2014-01-21  2:17       ` Steven Rostedt
@ 2014-01-21  6:39         ` Muli Baron
  2014-01-21 15:40           ` Joe Korty
  2014-01-22 21:27         ` Joakim Hernberg
  2014-01-24 11:19         ` Sebastian Andrzej Siewior
  2 siblings, 1 reply; 21+ messages in thread
From: Muli Baron @ 2014-01-21  6:39 UTC (permalink / raw)
  To: linux-rt-users; +Cc: linux-kernel

On 21/1/2014 04:17, Steven Rostedt wrote:
> On Sat, 18 Jan 2014 04:15:29 +0100
> Mike Galbraith <bitbucket@online.de> wrote:
>
>
>>> So you also have the timers-do-not-raise-softirq-unconditionally.patch?
>>
>
> People have been complaining that the latest 3.12-rt does not boot on
> intel i7 boxes. And by reverting this patch, it boots fine.
>
> I happen to have an i7 box to test on, and sure enough, the latest
> 3.12-rt locks up on boot and reverting the
> timers-do-not-raise-softirq-unconditionally.patch, it boots fine.
>
> Looking into it, I made this small update, and the box boots. Seems
> checking "active_timers" is not enough to skip raising softirqs. I
> haven't looked at why yet, but I would like others to test this patch
> too.
>
> I'll leave why this lets i7 boxes boot as an exercise for Thomas ;-)
>
> -- Steve
>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
>
> diff --git a/kernel/timer.c b/kernel/timer.c
> index 46467be..8212c10 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -1464,13 +1464,11 @@ void run_local_timers(void)
>   		raise_softirq(TIMER_SOFTIRQ);
>   		return;
>   	}
> -	if (!base->active_timers)
> -		goto out;
>
>   	/* Check whether the next pending timer has expired */
>   	if (time_before_eq(base->next_timer, jiffies))
>   		raise_softirq(TIMER_SOFTIRQ);
> -out:
> +
>   	rt_spin_unlock_after_trylock_in_irq(&base->lock);
>
>   }
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

While this might fix booting on i7 machines, it kind of defeats the
original purpose of this patch, which was to let NO_HZ_FULL work 
properly with threaded interrupts. With the active_timers check removed 
the timer interrupt keeps firing even though there is only one task 
running on a specific processor, since it can't shut down the tick 
because the ksoftirqd thread keeps getting scheduled (see the previous 
thread "CONFIG_NO_HZ_FULL + CONFIG_PREEMPT_RT_FULL = nogo" for the full 
discussion).
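
To make the point above concrete, here is a rough user-space sketch of the
decision at the tail of run_local_timers(), with and without the
active_timers check. It is only a model: struct model_base,
would_raise_softirq() and count_raises() are made-up names, and it assumes
that base->next_timer is left at a stale, already-passed value once the last
timer is gone. The real code also updates next_timer when the softirq runs,
which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Wraparound-safe "a <= b" for jiffies-style counters, in the spirit of time_before_eq(). */
static bool time_before_eq_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Hypothetical stand-in for the per-CPU timer base; only the two fields discussed here. */
struct model_base {
	unsigned long active_timers;	/* timers counted by the base */
	unsigned long next_timer;	/* expiry of the earliest counted timer, may be stale */
};

/* Tail of run_local_timers() as a pure function; check_active selects the variant with the check. */
static bool would_raise_softirq(const struct model_base *base,
				unsigned long jiffies_now, bool check_active)
{
	if (check_active && !base->active_timers)
		return false;
	return time_before_eq_ul(base->next_timer, jiffies_now);
}

/* Count how many of nticks consecutive ticks would raise the timer softirq. */
static unsigned long count_raises(const struct model_base *base,
				  unsigned long start, unsigned long nticks,
				  bool check_active)
{
	unsigned long i, raised = 0;

	assert(base != NULL);
	for (i = 0; i < nticks; i++)
		if (would_raise_softirq(base, start + i, check_active))
			raised++;
	return raised;
}
```

With an idle base (active_timers == 0, stale next_timer) the variant without
the check raises the softirq on every tick in this model, which is the
behaviour that keeps ksoftirqd runnable; the variant with the check raises it
on none.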

-- Muli



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2014-01-21  6:39         ` Muli Baron
@ 2014-01-21 15:40           ` Joe Korty
  0 siblings, 0 replies; 21+ messages in thread
From: Joe Korty @ 2014-01-21 15:40 UTC (permalink / raw)
  To: Muli Baron; +Cc: linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org

On Tue, Jan 21, 2014 at 01:39:10AM -0500, Muli Baron wrote:
> On 21/1/2014 04:17, Steven Rostedt wrote:
> > On Sat, 18 Jan 2014 04:15:29 +0100
> > Mike Galbraith <bitbucket@online.de> wrote:
> >
> >
> >>> So you also have the timers-do-not-raise-softirq-unconditionally.patch?
> >>
> >
> > People have been complaining that the latest 3.12-rt does not boot on
> > intel i7 boxes. And by reverting this patch, it boots fine.
> >
> > > I happen to have an i7 box to test on, and sure enough, the latest
> > 3.12-rt locks up on boot and reverting the
> > timers-do-not-raise-softirq-unconditionally.patch, it boots fine.
> >
> > Looking into it, I made this small update, and the box boots. Seems
> > checking "active_timers" is not enough to skip raising softirqs. I
> > haven't looked at why yet, but I would like others to test this patch
> > too.
> >
> > I'll leave why this lets i7 boxes boot as an exercise for Thomas ;-)
> >
> > -- Steve
> >
> > Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> >
> > diff --git a/kernel/timer.c b/kernel/timer.c
> > index 46467be..8212c10 100644
> > --- a/kernel/timer.c
> > +++ b/kernel/timer.c
> > @@ -1464,13 +1464,11 @@ void run_local_timers(void)
> >   		raise_softirq(TIMER_SOFTIRQ);
> >   		return;
> >   	}
> > -	if (!base->active_timers)
> > -		goto out;
> >
> >   	/* Check whether the next pending timer has expired */
> >   	if (time_before_eq(base->next_timer, jiffies))
> >   		raise_softirq(TIMER_SOFTIRQ);
> > -out:
> > +
> >   	rt_spin_unlock_after_trylock_in_irq(&base->lock);
> >
> >   }
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> 
> While this might fix booting on i7 machines, it kind of defeats the
> original purpose of this patch, which was to let NO_HZ_FULL work 
> properly with threaded interrupts. With the active_timers check removed 
> the timer interrupt keeps firing even though there is only one task 
> running on a specific processor, since it can't shut down the tick 
> because the ksoftirqd thread keeps getting scheduled (see the previous 
> thread "CONFIG_NO_HZ_FULL + CONFIG_PREEMPT_RT_FULL = nogo" for the full 
> discussion).
> 
> -- Muli


Would something like this work?  This would get us past boot, which has
always been this strange, half-initialized thing one has to tiptoe around.

-	if (!base->active_timers)
+	if (!base->active_timers && system_state == SYSTEM_RUNNING)
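
For illustration only, the proposed condition can be written as a small
stand-alone predicate. The enum and function names below are hypothetical
stand-ins (the kernel has its own system_state values), and the sketch says
nothing about whether the boot hang is really confined to the pre-RUNNING
phase.

```c
#include <stdbool.h>

/* Hypothetical two-state stand-in for the kernel's system_state. */
enum model_system_state {
	MODEL_BOOTING,
	MODEL_RUNNING,
};

/*
 * True when the early exit would be taken, i.e. the softirq is NOT raised:
 * only when nothing is counted as active and boot has completed.
 */
static bool skip_softirq(unsigned long active_timers,
			 enum model_system_state state)
{
	return !active_timers && state == MODEL_RUNNING;
}
```

During boot the predicate is always false in this model, so the code falls
through to the expiry test as it does with Steven's patch; after boot it
behaves like the original active_timers check.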

Joe

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2014-01-21  2:17       ` Steven Rostedt
  2014-01-21  6:39         ` Muli Baron
@ 2014-01-22 21:27         ` Joakim Hernberg
  2014-01-24 11:19         ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 21+ messages in thread
From: Joakim Hernberg @ 2014-01-22 21:27 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mike Galbraith, Sebastian Andrzej Siewior, linux-rt-users, LKML,
	Thomas Gleixner, John Kacur

On Mon, 20 Jan 2014 21:17:36 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> I happen to have an i7 box to test on, and sure enough, the latest
> 3.12-rt locks up on boot and reverting the
> timers-do-not-raise-softirq-unconditionally.patch, it boots fine.

> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> diff --git a/kernel/timer.c b/kernel/timer.c
> index 46467be..8212c10 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -1464,13 +1464,11 @@ void run_local_timers(void)
>  		raise_softirq(TIMER_SOFTIRQ);
>  		return;
>  	}
> -	if (!base->active_timers)
> -		goto out;
>  
>  	/* Check whether the next pending timer has expired */
>  	if (time_before_eq(base->next_timer, jiffies))
>  		raise_softirq(TIMER_SOFTIRQ);
> -out:
> +
>  	rt_spin_unlock_after_trylock_in_irq(&base->lock);
>  
>  }

This fixes the problem on my i7-2600k.

-- 

   Joakim

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [ANNOUNCE] 3.12.6-rt9
  2014-01-21  2:17       ` Steven Rostedt
  2014-01-21  6:39         ` Muli Baron
  2014-01-22 21:27         ` Joakim Hernberg
@ 2014-01-24 11:19         ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 21+ messages in thread
From: Sebastian Andrzej Siewior @ 2014-01-24 11:19 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mike Galbraith, linux-rt-users, LKML, Thomas Gleixner, John Kacur

On 01/21/2014 03:17 AM, Steven Rostedt wrote:
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> 
> diff --git a/kernel/timer.c b/kernel/timer.c
> index 46467be..8212c10 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -1464,13 +1464,11 @@ void run_local_timers(void)
>  		raise_softirq(TIMER_SOFTIRQ);
>  		return;
>  	}
> -	if (!base->active_timers)
> -		goto out;
>  
>  	/* Check whether the next pending timer has expired */
>  	if (time_before_eq(base->next_timer, jiffies))
>  		raise_softirq(TIMER_SOFTIRQ);

Hmmm. If active_timers is 0 and "time_before_eq(base->next_timer,
jiffies)" is true, then that timer should have been initialized with
init_timer_deferrable(), or we have a serious bug here where
active_timers isn't properly synchronized anymore.

Now, if there is really just a deferrable timer that expired and nothing
else, then this would explain it.
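
A toy model of that state, purely to illustrate the hypothesis: it assumes
that deferrable timers are queued without being counted in active_timers and
without touching next_timer. All toy_* names are invented for this sketch and
are not the kernel's structures or functions.

```c
#include <stdbool.h>

/* Hypothetical, simplified timer; not the kernel's struct timer_list. */
struct toy_timer {
	unsigned long expires;
	bool deferrable;
};

/* Hypothetical, simplified base; not the kernel's struct tvec_base. */
struct toy_base {
	unsigned long active_timers;	/* non-deferrable timers only (assumption) */
	unsigned long next_timer;	/* earliest non-deferrable expiry, possibly stale */
	unsigned long pending;		/* every queued timer, deferrable or not */
};

/* Wraparound-safe "a <= b" on jiffies-style counters. */
static bool toy_time_before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Queue a timer; deferrable ones leave active_timers and next_timer alone. */
static void toy_add_timer(struct toy_base *base, const struct toy_timer *t)
{
	base->pending++;
	if (t->deferrable)
		return;
	if (!base->active_timers ||
	    !toy_time_before_eq(base->next_timer, t->expires))
		base->next_timer = t->expires;
	base->active_timers++;
}

/* The state in question: nothing counted as active, yet the expiry test passes. */
static bool toy_only_deferrable_due(const struct toy_base *base,
				    unsigned long now)
{
	return base->active_timers == 0 && base->pending > 0 &&
	       toy_time_before_eq(base->next_timer, now);
}
```

In this model, a base holding only a deferrable timer reports
toy_only_deferrable_due() as true, and a "!base->active_timers" early exit
would then skip the softirq for it.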

> -out:
> +
>  	rt_spin_unlock_after_trylock_in_irq(&base->lock);
>  
>  }

Sebastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2014-01-24 11:19 UTC | newest]

Thread overview: 21+ messages
2013-12-23 22:50 [ANNOUNCE] 3.12.6-rt9 Sebastian Andrzej Siewior
2013-12-24 15:15 ` 3.12.6-rt9 build failure Nicholas Mc Guire
2013-12-24 15:47 ` [ANNOUNCE] 3.12.6-rt9 Mike Galbraith
2013-12-24 16:39   ` Pavel Vasilyev
2013-12-25  3:24     ` Mike Galbraith
2014-01-17 17:00   ` Sebastian Andrzej Siewior
2014-01-18  3:15     ` Mike Galbraith
2014-01-21  2:17       ` Steven Rostedt
2014-01-21  6:39         ` Muli Baron
2014-01-21 15:40           ` Joe Korty
2014-01-22 21:27         ` Joakim Hernberg
2014-01-24 11:19         ` Sebastian Andrzej Siewior
2013-12-27 20:00 ` Nicholas Mc Guire
2013-12-28  3:30   ` Mike Galbraith
2013-12-28  3:48     ` Mike Galbraith
2013-12-28  7:43       ` Nicholas Mc Guire
2013-12-28 13:57         ` Mike Galbraith
2013-12-28  4:33     ` Mike Galbraith
2014-01-11 20:25   ` Joakim Hernberg
2014-01-17 16:10   ` Sebastian Andrzej Siewior
2014-01-19 20:54   ` Fernando Lopez-Lezcano
