* [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
@ 2005-03-19 19:16 Ingo Molnar
2005-03-20 0:24 ` Lee Revell
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Ingo Molnar @ 2005-03-19 19:16 UTC (permalink / raw)
To: linux-kernel; +Cc: Paul E. McKenney
i have released the -V0.7.41-00 Real-Time Preemption patch (merged to
2.6.12-rc1), which can be downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
the biggest change in this patch is the merge of Paul E. McKenney's
preemptable RCU code. The new RCU code is active on PREEMPT_RT. While it
is still quite experimental at this stage, it allowed the removal of
locking cruft (mainly in the networking code), so it could solve some of
the longstanding netfilter/networking deadlocks/crashes reported by a
number of people. Be careful nevertheless.
there are a couple of minor changes relative to Paul's latest
preemptable-RCU code drop:
- made the two variants two #ifdef blocks - this is sufficient for now
and we'll see what the best way is in the longer run.
- moved rcu_check_callbacks() from the timer IRQ to ksoftirqd. (the
timer IRQ still runs in hardirq context on PREEMPT_RT.)
- changed the irq-flags method to a preempt_disable()-based method, and
moved the lock taking outside of the critical sections. (due to locks
potentially sleeping on PREEMPT_RT).
to create a -V0.7.41-00 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.11.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.12-rc1.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.12-rc1-V0.7.41-00
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-19 19:16 Ingo Molnar
@ 2005-03-20 0:24 ` Lee Revell
2005-03-21 15:42 ` K.R. Foley
2005-03-20 1:33 ` Lee Revell
2005-03-20 17:45 ` Paul E. McKenney
2 siblings, 1 reply; 11+ messages in thread
From: Lee Revell @ 2005-03-20 0:24 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, Paul E. McKenney
On Sat, 2005-03-19 at 20:16 +0100, Ingo Molnar wrote:
> i have released the -V0.7.41-00 Real-Time Preemption patch (merged to
> 2.6.12-rc1), which can be downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
3ms latency in the NFS client code. Workload was a kernel compile over
NFS.
preemption latency trace v1.1.4 on 2.6.12-rc1-RT-V0.7.41-00
--------------------------------------------------------------------
latency: 3178 µs, #4095/14224, CPU#0 | (M:preempt VP:0, KP:1, SP:1 HP:1 #P:1)
-----------------
| task: ksoftirqd/0-2 (uid:0 nice:-10 policy:0 rt_prio:0)
-----------------
_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| /
||||| delay
cmd pid ||||| time | caller
\ / ||||| \ | /
(T1/#0) <...> 32105 0 3 00000004 00000000 [0011939614227867] 0.000ms (+4137027.445ms): <6500646c> (<61000000>)
(T1/#2) <...> 32105 0 3 00000004 00000002 [0011939614228097] 0.000ms (+0.000ms): __trace_start_sched_wakeup+0x9a/0xd0 <c013150a> (try_to_wake_up+0x94/0x140 <c0110474>)
(T1/#3) <...> 32105 0 3 00000003 00000003 [0011939614228436] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02b57c1> (try_to_wake_up+0x94/0x140 <c0110474>)
(T3/#4) <...>-32105 0dn.3 0µs : try_to_wake_up+0x11e/0x140 <c01104fe> <<...>-2> (69 76):
(T1/#5) <...> 32105 0 3 00000002 00000005 [0011939614228942] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02b57c1> (try_to_wake_up+0xf8/0x140 <c01104d8>)
(T1/#6) <...> 32105 0 3 00000002 00000006 [0011939614229130] 0.000ms (+0.000ms): wake_up_process+0x35/0x40 <c0110555> (do_softirq+0x3f/0x50 <c011b05f>)
(T6/#7) <...>-32105 0dn.1 1µs < (1)
(T1/#8) <...> 32105 0 2 00000001 00000008 [0011939614229782] 0.001ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
(T1/#9) <...> 32105 0 2 00000001 00000009 [0011939614229985] 0.001ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
(T1/#10) <...> 32105 0 2 00000001 0000000a [0011939614230480] 0.001ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
(T1/#11) <...> 32105 0 2 00000001 0000000b [0011939614230634] 0.002ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
(T1/#12) <...> 32105 0 2 00000001 0000000c [0011939614230889] 0.002ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
(T1/#13) <...> 32105 0 2 00000001 0000000d [0011939614231034] 0.002ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
(T1/#14) <...> 32105 0 2 00000001 0000000e [0011939614231302] 0.002ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
(T1/#15) <...> 32105 0 2 00000001 0000000f [0011939614231419] 0.002ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
(last two lines just repeat)
This is probably not a regression; I had never tested this with NFS
before.
Lee
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-19 19:16 Ingo Molnar
2005-03-20 0:24 ` Lee Revell
@ 2005-03-20 1:33 ` Lee Revell
2005-03-20 1:50 ` K.R. Foley
2005-03-20 17:45 ` Paul E. McKenney
2 siblings, 1 reply; 11+ messages in thread
From: Lee Revell @ 2005-03-20 1:33 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, Paul E. McKenney
On Sat, 2005-03-19 at 20:16 +0100, Ingo Molnar wrote:
> the biggest change in this patch is the merge of Paul E. McKenney's
> preemptable RCU code. The new RCU code is active on PREEMPT_RT. While it
> is still quite experimental at this stage, it allowed the removal of
> locking cruft (mainly in the networking code), so it could solve some of
> the longstanding netfilter/networking deadlocks/crashes reported by a
> number of people. Be careful nevertheless.
With PREEMPT_RT my machine deadlocked within 20 minutes of boot.
"apt-get dist-upgrade" seemed to trigger the crash. I did not see any
Oops unfortunately.
Lee
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-20 1:33 ` Lee Revell
@ 2005-03-20 1:50 ` K.R. Foley
2005-03-20 4:32 ` Lee Revell
0 siblings, 1 reply; 11+ messages in thread
From: K.R. Foley @ 2005-03-20 1:50 UTC (permalink / raw)
To: Lee Revell; +Cc: Ingo Molnar, linux-kernel, Paul E. McKenney
Lee Revell wrote:
> On Sat, 2005-03-19 at 20:16 +0100, Ingo Molnar wrote:
>
>>the biggest change in this patch is the merge of Paul E. McKenney's
>>preemptable RCU code. The new RCU code is active on PREEMPT_RT. While it
>>is still quite experimental at this stage, it allowed the removal of
>>locking cruft (mainly in the networking code), so it could solve some of
>>the longstanding netfilter/networking deadlocks/crashes reported by a
>>number of people. Be careful nevertheless.
>
>
> With PREEMPT_RT my machine deadlocked within 20 minutes of boot.
> "apt-get dist-upgrade" seemed to trigger the crash. I did not see any
> Oops unfortunately.
>
> Lee
>
Lee,
Just curious. Is this with UP or SMP? I currently have my UP box running
PREEMPT_RT, with no problems thus far. However, my SMP box dies while
booting (with an oops). I am working on trying to get set up to capture
the oops, although it might be tomorrow before I get that done.
--
kr
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-20 1:50 ` K.R. Foley
@ 2005-03-20 4:32 ` Lee Revell
2005-03-20 22:40 ` Paul E. McKenney
0 siblings, 1 reply; 11+ messages in thread
From: Lee Revell @ 2005-03-20 4:32 UTC (permalink / raw)
To: K.R. Foley; +Cc: Ingo Molnar, linux-kernel, Paul E. McKenney
On Sat, 2005-03-19 at 19:50 -0600, K.R. Foley wrote:
> Lee Revell wrote:
> > On Sat, 2005-03-19 at 20:16 +0100, Ingo Molnar wrote:
> >
> >>the biggest change in this patch is the merge of Paul E. McKenney's
> >>preemptable RCU code. The new RCU code is active on PREEMPT_RT. While it
> >>is still quite experimental at this stage, it allowed the removal of
> >>locking cruft (mainly in the networking code), so it could solve some of
> >>the longstanding netfilter/networking deadlocks/crashes reported by a
> >>number of people. Be careful nevertheless.
> >
> >
> > With PREEMPT_RT my machine deadlocked within 20 minutes of boot.
> > "apt-get dist-upgrade" seemed to trigger the crash. I did not see any
> > Oops unfortunately.
> >
> > Lee
> >
>
> Lee,
>
> Just curious. Is this with UP or SMP? I currently have my UP box running
> PREEMPT_RT, with no problems thus far. However, my SMP box dies while
> booting (with an oops). I am working on trying to get set up to capture
> the oops, although it might be tomorrow before I get that done.
>
UP. It's 100% reproducible, this machine locks up over and over. Seems
to be associated with network activity by multiple processes.
Lee
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-19 19:16 Ingo Molnar
2005-03-20 0:24 ` Lee Revell
2005-03-20 1:33 ` Lee Revell
@ 2005-03-20 17:45 ` Paul E. McKenney
2005-03-21 8:53 ` Ingo Molnar
2 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2005-03-20 17:45 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel
On Sat, Mar 19, 2005 at 08:16:58PM +0100, Ingo Molnar wrote:
>
> i have released the -V0.7.41-00 Real-Time Preemption patch (merged to
> 2.6.12-rc1), which can be downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> the biggest change in this patch is the merge of Paul E. McKenney's
> preemptable RCU code. The new RCU code is active on PREEMPT_RT. While it
> is still quite experimental at this stage, it allowed the removal of
> locking cruft (mainly in the networking code), so it could solve some of
> the longstanding netfilter/networking deadlocks/crashes reported by a
> number of people. Be careful nevertheless.
>
> there are a couple of minor changes relative to Paul's latest
> preemptable-RCU code drop:
>
> - made the two variants two #ifdef blocks - this is sufficient for now
> and we'll see what the best way is in the longer run.
>
> - moved rcu_check_callbacks() from the timer IRQ to ksoftirqd. (the
> timer IRQ still runs in hardirq context on PREEMPT_RT.)
>
> - changed the irq-flags method to a preempt_disable()-based method, and
> moved the lock taking outside of the critical sections. (due to locks
> potentially sleeping on PREEMPT_RT).
>
> to create a -V0.7.41-00 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.11.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.12-rc1.bz2
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.12-rc1-V0.7.41-00
Some proposed fixes from a quick scan (untested, probably does not even
compile). These proposed fixes fall into the following categories:
o Some functions that should be static.
o Introduced a synchronize_kernel_barrier() for a number of
uses of synchronize_kernel() that are broken by the new
implementation. Note that synchronize_kernel_barrier() is
the same as synchronize_kernel() in non-CONFIG_PREEMPT_RT
kernels. Not clear that synchronize_kernel_barrier()
is strong enough for some uses, may need another API
(synchronize_kernel_barrier_voluntary()???) that waits for all
tasks to -voluntary- context switch or be executing in user
space (these are marked with FIXME in the attached patch).
Dipankar and/or Rusty put out a patch that did this some time
back -- this was when we were trying to make an RCU that worked
in CONFIG_PREEMPT kernels, but did not want preempt_disable()
on the read side.
That said, some of the synchronize_kernel_barrier()s marked
with FIXME may be fixable more simply by inserting
rcu_read_lock()/rcu_read_unlock() pairs in appropriate
places.
o Merged the two identical implementations each of
rcu_dereference() and rcu_assign_pointer().
o Added an rcu_read_lock() or two. Clearly need to be searching
for patches containing "synchronize_kernel" in addition to
patches containing "rcu"...
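The rcu_assign_pointer()/rcu_dereference() pair mentioned above is a
publish/subscribe idiom: the updater fully initializes a structure and only
then makes the pointer visible, and the reader fetches the pointer once, with
ordering, before dereferencing it. A minimal userspace sketch follows, assuming
C11 atomics as stand-ins for the kernel barriers (a release store in place of
smp_wmb(), and an acquire load as a conservative substitute for
smp_read_barrier_depends()). The names struct config, global_cfg, publish()
and subscribe() are hypothetical and do not appear in the patch. Reclamation
of old versions, which is the grace-period half of RCU, is deliberately left
out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical payload; stands in for any RCU-protected structure. */
struct config {
    int a;
    int b;
};

/* Shared pointer that readers follow; NULL until something is published. */
static _Atomic(struct config *) global_cfg = NULL;

/*
 * Updater side: initialize the structure first, then publish the pointer
 * with release ordering. This plays the role of rcu_assign_pointer().
 * Any previously published structure is not freed here (no grace period).
 */
static void publish(int a, int b)
{
    struct config *c = malloc(sizeof(*c));

    assert(c != NULL);
    c->a = a;
    c->b = b;
    atomic_store_explicit(&global_cfg, c, memory_order_release);
}

/*
 * Reader side: fetch the pointer exactly once with acquire ordering, so
 * the fields written before the publish are visible through it. This
 * plays the role of rcu_dereference(). May return NULL.
 */
static const struct config *subscribe(void)
{
    return atomic_load_explicit(&global_cfg, memory_order_acquire);
}
```

A reader would call subscribe() once, check the result for NULL, and use only
that local copy of the pointer for the rest of its critical section, rather
than re-reading global_cfg.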
Thoughts?
Thanx, Paul
Signed-off-by: <paulmck@us.ibm.com>
diff -urpN -X dontdiff linux-2.6.11/arch/i386/oprofile/nmi_timer_int.c linux-2.6.11.fixes/arch/i386/oprofile/nmi_timer_int.c
--- linux-2.6.11/arch/i386/oprofile/nmi_timer_int.c Tue Mar 1 23:37:52 2005
+++ linux-2.6.11.fixes/arch/i386/oprofile/nmi_timer_int.c Sun Mar 20 08:40:31 2005
@@ -36,7 +36,7 @@ static void timer_stop(void)
{
enable_timer_nmi_watchdog();
unset_nmi_callback();
- synchronize_kernel();
+ synchronize_kernel_barrier();
}
diff -urpN -X dontdiff linux-2.6.11/arch/ppc64/kernel/ItLpQueue.c linux-2.6.11.fixes/arch/ppc64/kernel/ItLpQueue.c
--- linux-2.6.11/arch/ppc64/kernel/ItLpQueue.c Tue Mar 1 23:37:48 2005
+++ linux-2.6.11.fixes/arch/ppc64/kernel/ItLpQueue.c Sun Mar 20 08:48:29 2005
@@ -142,7 +142,9 @@ unsigned ItLpQueue_process( struct ItLpQ
lpQueue->xLpIntCountByType[nextLpEvent->xType]++;
if ( nextLpEvent->xType < HvLpEvent_Type_NumTypes &&
- lpEventHandler[nextLpEvent->xType] )
+ lpEventHandler[nextLpEvent->xType] ) {
+ rcu_read_lock();
lpEventHandler[nextLpEvent->xType](nextLpEvent, regs);
+ rcu_read_unlock();
- else
+ } else
printk(KERN_INFO "Unexpected Lp Event type=%d\n", nextLpEvent->xType );
diff -urpN -X dontdiff linux-2.6.11/arch/x86_64/kernel/mce.c linux-2.6.11.fixes/arch/x86_64/kernel/mce.c
--- linux-2.6.11/arch/x86_64/kernel/mce.c Tue Mar 1 23:37:52 2005
+++ linux-2.6.11.fixes/arch/x86_64/kernel/mce.c Sun Mar 20 08:49:45 2005
@@ -392,7 +392,7 @@ static ssize_t mce_read(struct file *fil
memset(mcelog.entry, 0, next * sizeof(struct mce));
mcelog.next = 0;
- synchronize_kernel();
+ synchronize_kernel_barrier();
/* Collect entries that were still getting written before the synchronize. */
diff -urpN -X dontdiff linux-2.6.11/drivers/acpi/processor_idle.c linux-2.6.11.fixes/drivers/acpi/processor_idle.c
--- linux-2.6.11/drivers/acpi/processor_idle.c Tue Mar 1 23:38:25 2005
+++ linux-2.6.11.fixes/drivers/acpi/processor_idle.c Sun Mar 20 09:01:44 2005
@@ -838,7 +838,7 @@ int acpi_processor_cst_has_changed (stru
/* Fall back to the default idle loop */
pm_idle = pm_idle_save;
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: strong enough? */
pr->flags.power = 0;
result = acpi_processor_get_power_info(pr);
diff -urpN -X dontdiff linux-2.6.11/drivers/char/ipmi/ipmi_si_intf.c linux-2.6.11.fixes/drivers/char/ipmi/ipmi_si_intf.c
--- linux-2.6.11/drivers/char/ipmi/ipmi_si_intf.c Sat Mar 19 14:04:13 2005
+++ linux-2.6.11.fixes/drivers/char/ipmi/ipmi_si_intf.c Sun Mar 20 08:39:49 2005
@@ -2194,7 +2194,7 @@ static int init_one_smi(int intf_num, st
/* Wait until we know that we are out of any interrupt
handlers might have been running before we freed the
interrupt. */
- synchronize_kernel();
+ synchronize_kernel_barrier();
if (new_smi->si_sm) {
if (new_smi->handlers)
@@ -2307,7 +2307,7 @@ static void __exit cleanup_one_si(struct
/* Wait until we know that we are out of any interrupt
handlers might have been running before we freed the
interrupt. */
- synchronize_kernel();
+ synchronize_kernel_barrier();
/* Wait for the timer to stop. This avoids problems with race
conditions removing the timer here. */
diff -urpN -X dontdiff linux-2.6.11/drivers/input/keyboard/atkbd.c linux-2.6.11.fixes/drivers/input/keyboard/atkbd.c
--- linux-2.6.11/drivers/input/keyboard/atkbd.c Sat Mar 19 14:04:16 2005
+++ linux-2.6.11.fixes/drivers/input/keyboard/atkbd.c Sun Mar 20 09:02:33 2005
@@ -678,7 +678,7 @@ static void atkbd_disconnect(struct seri
atkbd_disable(atkbd);
/* make sure we don't have a command in flight */
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: Strong enough? */
flush_scheduled_work();
device_remove_file(&serio->dev, &atkbd_attr_extra);
diff -urpN -X dontdiff linux-2.6.11/drivers/input/serio/i8042.c linux-2.6.11.fixes/drivers/input/serio/i8042.c
--- linux-2.6.11/drivers/input/serio/i8042.c Sat Mar 19 14:04:16 2005
+++ linux-2.6.11.fixes/drivers/input/serio/i8042.c Sun Mar 20 09:27:35 2005
@@ -396,7 +396,7 @@ static void i8042_stop(struct serio *ser
struct i8042_port *port = serio->port_data;
port->exists = 0;
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: Strong enough? */
port->serio = NULL;
}
diff -urpN -X dontdiff linux-2.6.11/drivers/net/r8169.c linux-2.6.11.fixes/drivers/net/r8169.c
--- linux-2.6.11/drivers/net/r8169.c Sat Mar 19 14:04:19 2005
+++ linux-2.6.11.fixes/drivers/net/r8169.c Sun Mar 20 09:09:06 2005
@@ -2385,7 +2385,7 @@ core_down:
}
/* Give a racing hard_start_xmit a few cycles to complete. */
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: Strong enough? */
/*
* And now for the 50k$ question: are IRQ disabled or not ?
diff -urpN -X dontdiff linux-2.6.11/drivers/s390/cio/airq.c linux-2.6.11.fixes/drivers/s390/cio/airq.c
--- linux-2.6.11/drivers/s390/cio/airq.c Tue Mar 1 23:38:17 2005
+++ linux-2.6.11.fixes/drivers/s390/cio/airq.c Sun Mar 20 09:11:57 2005
@@ -45,7 +45,7 @@ s390_register_adapter_interrupt (adapter
else
ret = (cmpxchg(&adapter_handler, NULL, handler) ? -EBUSY : 0);
if (!ret)
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: Strong enough? */
sprintf (dbf_txt, "ret:%d", ret);
CIO_TRACE_EVENT (4, dbf_txt);
@@ -65,7 +65,7 @@ s390_unregister_adapter_interrupt (adapt
ret = -EINVAL;
else {
adapter_handler = NULL;
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: Strong enough? */
ret = 0;
}
sprintf (dbf_txt, "ret:%d", ret);
diff -urpN -X dontdiff linux-2.6.11/include/linux/rcupdate.h linux-2.6.11.fixes/include/linux/rcupdate.h
--- linux-2.6.11/include/linux/rcupdate.h Sat Mar 19 14:09:52 2005
+++ linux-2.6.11.fixes/include/linux/rcupdate.h Sun Mar 20 09:24:20 2005
@@ -222,6 +222,8 @@ static inline int rcu_pending(int cpu)
*/
#define rcu_read_unlock_bh() local_bh_enable()
+#endif /* CONFIG_PREEMPT_RT */
+
/**
* rcu_dereference - fetch an RCU-protected pointer in an
* RCU read-side critical section. This pointer may later
@@ -256,6 +258,22 @@ static inline int rcu_pending(int cpu)
(p) = (v); \
})
+#ifndef CONFIG_PREEMPT_RT
+
+/**
+ * synchronize_kernel_barrier - block until each CPU executes a
+ * context switch, appears in the idle loop, or otherwise exits
+ * kernel execution. This is synonymous with synchronize_kernel()
+ * in the classic RCU implementation, but not in some RCU
+ * implementations optimized for realtime use. In these realtime
+ * uses, synchronize_kernel() can potentially return immediately,
+ * even on SMP systems.
+ *
+ * NMI-related uses of RCU need to use synchronize_kernel_barrier().
+ */
+
+#define synchronize_kernel_barrier() synchronize_kernel()
+
extern void rcu_init(void);
extern void rcu_check_callbacks(int cpu, int user);
extern void rcu_restart_cpu(int cpu);
@@ -275,40 +293,6 @@ extern void synchronize_kernel(void);
#define rcu_bh_qsctr_inc(cpu)
#define rcu_qsctr_inc(cpu)
-/**
- * rcu_dereference - fetch an RCU-protected pointer in an
- * RCU read-side critical section. This pointer may later
- * be safely dereferenced.
- *
- * Inserts memory barriers on architectures that require them
- * (currently only the Alpha), and, more importantly, documents
- * exactly which pointers are protected by RCU.
- */
-
-#define rcu_dereference(p) ({ \
- typeof(p) _________p1 = p; \
- smp_read_barrier_depends(); \
- (_________p1); \
- })
-
-/**
- * rcu_assign_pointer - assign (publicize) a pointer to a newly
- * initialized structure that will be dereferenced by RCU read-side
- * critical sections. Returns the value assigned.
- *
- * Inserts memory barriers on architectures that require them
- * (pretty much all of them other than x86), and also prevents
- * the compiler from reordering the code that initializes the
- * structure after the pointer assignment. More importantly, this
- * call documents which pointers will be dereferenced by RCU read-side
- * code.
- */
-
-#define rcu_assign_pointer(p, v) ({ \
- smp_wmb(); \
- (p) = (v); \
- })
-
extern void rcu_init(void);
/* Exported interfaces */
@@ -317,6 +301,7 @@ extern void FASTCALL(call_rcu(struct rcu
extern void rcu_read_lock(void);
extern void rcu_read_unlock(void);
extern void synchronize_kernel(void);
+extern void synchronize_kernel_barrier(void);
extern int rcu_pending(int cpu);
extern void rcu_check_callbacks(int cpu, int user);
diff -urpN -X dontdiff linux-2.6.11/kernel/module.c linux-2.6.11.fixes/kernel/module.c
--- linux-2.6.11/kernel/module.c Sat Mar 19 14:09:51 2005
+++ linux-2.6.11.fixes/kernel/module.c Sun Mar 20 09:13:23 2005
@@ -1812,7 +1812,7 @@ sys_init_module(void __user *umod,
/* Init routine failed: abort. Try to protect us from
buggy refcounters. */
mod->state = MODULE_STATE_GOING;
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: Strong enough? */
if (mod->unsafe)
printk(KERN_ERR "%s: module is now stuck!\n",
mod->name);
diff -urpN -X dontdiff linux-2.6.11/kernel/profile.c linux-2.6.11.fixes/kernel/profile.c
--- linux-2.6.11/kernel/profile.c Sat Mar 19 14:09:51 2005
+++ linux-2.6.11.fixes/kernel/profile.c Sun Mar 20 09:18:05 2005
@@ -194,7 +194,7 @@ void unregister_timer_hook(int (*hook)(s
WARN_ON(hook != timer_hook);
timer_hook = NULL;
/* make sure all CPUs see the NULL hook */
- synchronize_kernel();
+ synchronize_kernel_barrier(); /* FIXME: Strong enough? */
}
EXPORT_SYMBOL_GPL(register_timer_hook);
diff -urpN -X dontdiff linux-2.6.11/kernel/rcupdate.c linux-2.6.11.fixes/kernel/rcupdate.c
--- linux-2.6.11/kernel/rcupdate.c Sat Mar 19 14:09:51 2005
+++ linux-2.6.11.fixes/kernel/rcupdate.c Sun Mar 20 09:32:13 2005
@@ -548,7 +548,37 @@ void synchronize_kernel(void)
}
}
-void rcu_advance_callbacks(void)
+/*
+ * FIXME: Note that this implementation might not be strong enough
+ * for a number of driver uses of synchronize_kernel. Some of these
+ * uses seem to assume a non-CONFIG_PREEMPT kernel, so may need
+ * to come up with a different approach. Note that these uses
+ * are -not- waiting to free memory, but rather to ensure that
+ * a change is seen by all future driver invocations.
+ *
+ * The correct implementation is likely to be a tasklist scan,
+ * which blocks until all tasks encounter a voluntary context switch.
+ * If so, this implementation is required in CONFIG_PREEMPT
+ * kernels as well as CONFIG_PREEMPT_RT kernels.
+ */
+
+void synchronize_kernel_barrier(void)
+{
+ cpumask_t oldmask;
+ cpumask_t curmask;
+ int cpu;
+
+ if (sched_getaffinity(0, &oldmask) < 0) {
+ oldmask = cpu_possible_map;
+ }
+ for_each_cpu(cpu) {
+ sched_setaffinity(0, cpumask_of_cpu(cpu));
+ schedule();
+ }
+ sched_setaffinity(0, oldmask);
+}
+
+static void rcu_advance_callbacks(void)
{
struct rcu_data *rdp;
@@ -578,7 +608,7 @@ void fastcall call_rcu(struct rcu_head *
put_cpu_var(rcu_data);
}
-void rcu_process_callbacks(void)
+static void rcu_process_callbacks(void)
{
struct rcu_head *next, *list;
struct rcu_data *rdp;
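The synchronize_kernel_barrier() in the patch above works by migrating the
calling task onto each CPU in turn and calling schedule() there, so that every
CPU has gone through at least one context switch by the time it returns. A
rough userspace analogue of that "run on every CPU in turn" loop is sketched
below, assuming Linux with glibc. The function name visit_each_allowed_cpu()
is hypothetical. The glibc wrappers take a mask size and a pointer, unlike the
in-kernel calls used in the patch, and the sketch only walks the CPUs the
thread was already allowed to run on rather than all possible CPUs. It
illustrates the loop structure only and provides none of the kernel-side
guarantees.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for cpu_set_t and sched_setaffinity() */
#endif
#include <assert.h>
#include <sched.h>

/*
 * Hop the calling thread across every CPU in its current affinity mask,
 * yielding once on each, then restore the original mask.
 * Returns the number of CPUs visited, or -1 on error.
 */
static int visit_each_allowed_cpu(void)
{
    cpu_set_t oldmask, one;
    int cpu, visited = 0;

    if (sched_getaffinity(0, sizeof(oldmask), &oldmask) < 0)
        return -1;

    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &oldmask))
            continue;
        CPU_ZERO(&one);
        CPU_SET(cpu, &one);
        /* The CPU may have gone offline meanwhile; skip it, do not fail. */
        if (sched_setaffinity(0, sizeof(one), &one) < 0)
            continue;
        sched_yield();         /* userspace stand-in for schedule() */
        visited++;
    }

    /* Put the thread back where the caller had it. */
    if (sched_setaffinity(0, sizeof(oldmask), &oldmask) < 0)
        return -1;
    return visited;
}
```

Restoring the saved mask at the end matters for the same reason as in the
patch: without it the caller would stay pinned to the last CPU visited.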
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-20 4:32 ` Lee Revell
@ 2005-03-20 22:40 ` Paul E. McKenney
0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2005-03-20 22:40 UTC (permalink / raw)
To: Lee Revell; +Cc: K.R. Foley, Ingo Molnar, linux-kernel
On Sat, Mar 19, 2005 at 11:32:59PM -0500, Lee Revell wrote:
> On Sat, 2005-03-19 at 19:50 -0600, K.R. Foley wrote:
> > Lee Revell wrote:
> > > On Sat, 2005-03-19 at 20:16 +0100, Ingo Molnar wrote:
> > >
> > >>the biggest change in this patch is the merge of Paul E. McKenney's
> > >>preemptable RCU code. The new RCU code is active on PREEMPT_RT. While it
> > >>is still quite experimental at this stage, it allowed the removal of
> > >>locking cruft (mainly in the networking code), so it could solve some of
> > >>the longstanding netfilter/networking deadlocks/crashes reported by a
> > >>number of people. Be careful nevertheless.
> > >
> > >
> > > With PREEMPT_RT my machine deadlocked within 20 minutes of boot.
> > > "apt-get dist-upgrade" seemed to trigger the crash. I did not see any
> > > Oops unfortunately.
> > >
> > > Lee
> > >
> >
> > Lee,
> >
> > Just curious. Is this with UP or SMP? I currently have my UP box running
> > PREEMPT_RT, with no problems thus far. However, my SMP box dies while
> > > booting (with an oops). I am working on trying to get set up to capture
> > the oops, although it might be tomorrow before I get that done.
> >
>
> UP. It's 100% reproducible, this machine locks up over and over. Seems
> to be associated with network activity by multiple processes.
OK, guess I need to go inspect the uses of synchronize_net() in addition
to synchronize_kernel...
If you do manage to get any additional info, please let me know...
Thanx, Paul
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-20 17:45 ` Paul E. McKenney
@ 2005-03-21 8:53 ` Ingo Molnar
2005-03-21 9:01 ` Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2005-03-21 8:53 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: linux-kernel
got this early-bootup crash on an SMP box:
BUG: Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c0131aec
*pde = 00000000
Oops: 0002 [#1]
PREEMPT SMP
Modules linked in:
CPU: 1
EIP: 0060:[<c0131aec>] Not tainted VLI
EFLAGS: 00010293 (2.6.12-rc1-RT-V0.7.41-00)
EIP is at rcu_advance_callbacks+0x3c/0x80
eax: 00000000 ebx: c050f280 ecx: c12191e0 edx: 00000000
esi: cfd2e560 edi: cfd2e4e0 ebp: cfd31dd0 esp: cfd31dc8
ds: 007b es: 007b ss: 0068 preempt: 00000003
Process khelper (pid: 60, threadinfo=cfd30000 task=cfd106a0)
Stack: 00000001 c12191e0 cfd31de4 c0131b67 00000001 cfd2e4d8 c13004d8 cfd31e00
c017e449 cfd2e4d8 c04d6e80 cfd32006 fffffffe cfd31e54 cfd31e70 c01749cc
cfd2e4d8 cfd31e50 cfd31e4c 00000001 cfd32001 cfd2e4d8 c03dd41f c04cf920
Call Trace:
[<c010412f>] show_stack+0x7f/0xa0 (28)
[<c01042da>] show_registers+0x16a/0x1e0 (56)
[<c0104511>] die+0x101/0x190 (64)
[<c0115862>] do_page_fault+0x442/0x680 (216)
[<c0103d9b>] error_code+0x2b/0x30 (68)
[<c0131b67>] call_rcu+0x37/0x70 (20)
[<c017e449>] dput+0x139/0x210 (28)
[<c01749cc>] __link_path_walk+0x9fc/0xf80 (112)
[<c0174f9a>] link_path_walk+0x4a/0x130 (100)
[<c017538e>] path_lookup+0x9e/0x1c0 (32)
[<c01707e8>] open_exec+0x28/0x100 (100)
[<c0171a04>] do_execve+0x44/0x220 (36)
[<c0101da2>] sys_execve+0x42/0xa0 (36)
[<c0103315>] syscall_call+0x7/0xb (-8096)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c0131b4f>] .... call_rcu+0x1f/0x70
.....[<c017e449>] .. ( <= dput+0x139/0x210)
.. [<c0131ac3>] .... rcu_advance_callbacks+0x13/0x80
.....[<c0131b67>] .. ( <= call_rcu+0x37/0x70)
.. [<c03dddca>] .... _raw_spin_lock_irqsave+0x1a/0xa0
.....[<c010444f>] .. ( <= die+0x3f/0x190)
.. [<c013b9e6>] .... print_traces+0x16/0x50
.....[<c010412f>] .. ( <= show_stack+0x7f/0xa0)
Code: 00 00 e8 78 2d 0a 00 8b 0c 85 20 20 51 c0 bb 80 f2 50 c0 01 d9 f0 83 44 24 00 00 a1 88 19 52 c0 39 41 40 74 23 8b 41 44 8b 51 50 <89> 02 8b 41 48 c7 41 44 00 00 00 00 89 41 50 8d 41 44 89 41 48
<6>note: khelper[60] exited with preempt_count 2
(gdb) list *0xc0131aec
0xc0131aec is in rcu_advance_callbacks (kernel/rcupdate.c:558).
553 struct rcu_data *rdp;
554
555 rdp = &get_cpu_var(rcu_data);
556 smp_mb(); /* prevent sampling batch # before list removal. */
557 if (rdp->batch != rcu_ctrlblk.batch) {
558 *rdp->donetail = rdp->waitlist;
559 rdp->donetail = rdp->waittail;
560 rdp->waitlist = NULL;
561 rdp->waittail = &rdp->waitlist;
562 rdp->batch = rcu_ctrlblk.batch;
(gdb)
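Reading the dump above: the faulting bytes in the Code: line are <89> 02, a
store of %eax through %edx, and edx is 00000000, which matches a write through
a NULL rdp->donetail at line 558 of the listing. The wait/done lists use the
tail-pointer idiom, where each tail must point at its own list head while the
list is empty. A minimal userspace sketch of that idiom follows. The names
struct cb, struct cb_lists, cb_lists_init(), cb_enqueue() and cb_advance() are
hypothetical stand-ins, not the kernel's. Why donetail was NULL on CPU 1 is
not established in this thread; missing per-CPU initialization is only one
possible explanation and is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rcu_head. */
struct cb {
    struct cb *next;
};

/* Hypothetical stand-in for the per-CPU struct rcu_data. */
struct cb_lists {
    struct cb *waitlist;    /* callbacks waiting for a grace period */
    struct cb **waittail;   /* address of the NULL slot ending waitlist */
    struct cb *donelist;    /* callbacks ready to be invoked */
    struct cb **donetail;   /* address of the NULL slot ending donelist */
};

/*
 * Each tail pointer must point at its own list head while the list is
 * empty. A zero-filled structure leaves both tails NULL instead.
 */
static void cb_lists_init(struct cb_lists *l)
{
    l->waitlist = NULL;
    l->waittail = &l->waitlist;
    l->donelist = NULL;
    l->donetail = &l->donelist;
}

/* Append one callback to the wait list in O(1) via the tail pointer. */
static void cb_enqueue(struct cb_lists *l, struct cb *c)
{
    c->next = NULL;
    *l->waittail = c;
    l->waittail = &c->next;
}

/* Splice the whole wait list onto the end of the done list. */
static void cb_advance(struct cb_lists *l)
{
    assert(l->donetail != NULL);   /* a NULL tail here is the oops case */
    if (l->waitlist == NULL)
        return;                    /* nothing to move; leave tails alone */
    *l->donetail = l->waitlist;
    l->donetail = l->waittail;
    l->waitlist = NULL;
    l->waittail = &l->waitlist;
}
```

With the initializer skipped, the first store in cb_advance() goes through a
NULL pointer, which is the shape of the fault reported above.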
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-21 8:53 ` Ingo Molnar
@ 2005-03-21 9:01 ` Ingo Molnar
0 siblings, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2005-03-21 9:01 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: linux-kernel
* Ingo Molnar <mingo@elte.hu> wrote:
> got this early-bootup crash on an SMP box:
the same kernel image boots fine on an UP box, so it's an SMP bug.
note that the same occurs with your latest (synchronization barrier)
fixes applied as well.
Ingo
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
2005-03-20 0:24 ` Lee Revell
@ 2005-03-21 15:42 ` K.R. Foley
0 siblings, 0 replies; 11+ messages in thread
From: K.R. Foley @ 2005-03-21 15:42 UTC (permalink / raw)
To: Lee Revell; +Cc: Ingo Molnar, linux-kernel, Paul E. McKenney
Lee Revell wrote:
> On Sat, 2005-03-19 at 20:16 +0100, Ingo Molnar wrote:
>
>>i have released the -V0.7.41-00 Real-Time Preemption patch (merged to
>>2.6.12-rc1), which can be downloaded from the usual place:
>>
>> http://redhat.com/~mingo/realtime-preempt/
>>
>
>
> 3ms latency in the NFS client code. Workload was a kernel compile over
> NFS.
>
> preemption latency trace v1.1.4 on 2.6.12-rc1-RT-V0.7.41-00
> --------------------------------------------------------------------
> latency: 3178 µs, #4095/14224, CPU#0 | (M:preempt VP:0, KP:1, SP:1 HP:1 #P:1)
> -----------------
> | task: ksoftirqd/0-2 (uid:0 nice:-10 policy:0 rt_prio:0)
> -----------------
>
> _------=> CPU#
> / _-----=> irqs-off
> | / _----=> need-resched
> || / _---=> hardirq/softirq
> ||| / _--=> preempt-depth
> |||| /
> ||||| delay
> cmd pid ||||| time | caller
> \ / ||||| \ | /
> (T1/#0) <...> 32105 0 3 00000004 00000000 [0011939614227867] 0.000ms (+4137027.445ms): <6500646c> (<61000000>)
> (T1/#2) <...> 32105 0 3 00000004 00000002 [0011939614228097] 0.000ms (+0.000ms): __trace_start_sched_wakeup+0x9a/0xd0 <c013150a> (try_to_wake_up+0x94/0x140 <c0110474>)
> (T1/#3) <...> 32105 0 3 00000003 00000003 [0011939614228436] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02b57c1> (try_to_wake_up+0x94/0x140 <c0110474>)
> (T3/#4) <...>-32105 0dn.3 0µs : try_to_wake_up+0x11e/0x140 <c01104fe> <<...>-2> (69 76):
> (T1/#5) <...> 32105 0 3 00000002 00000005 [0011939614228942] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02b57c1> (try_to_wake_up+0xf8/0x140 <c01104d8>)
> (T1/#6) <...> 32105 0 3 00000002 00000006 [0011939614229130] 0.000ms (+0.000ms): wake_up_process+0x35/0x40 <c0110555> (do_softirq+0x3f/0x50 <c011b05f>)
> (T6/#7) <...>-32105 0dn.1 1µs < (1)
> (T1/#8) <...> 32105 0 2 00000001 00000008 [0011939614229782] 0.001ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
> (T1/#9) <...> 32105 0 2 00000001 00000009 [0011939614229985] 0.001ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
> (T1/#10) <...> 32105 0 2 00000001 0000000a [0011939614230480] 0.001ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
> (T1/#11) <...> 32105 0 2 00000001 0000000b [0011939614230634] 0.002ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
> (T1/#12) <...> 32105 0 2 00000001 0000000c [0011939614230889] 0.002ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
> (T1/#13) <...> 32105 0 2 00000001 0000000d [0011939614231034] 0.002ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
> (T1/#14) <...> 32105 0 2 00000001 0000000e [0011939614231302] 0.002ms (+0.000ms): radix_tree_gang_lookup+0xe/0x70 <c01e05ee> (nfs_wait_on_requests+0x6d/0x110 <c01c744d>)
> (T1/#15) <...> 32105 0 2 00000001 0000000f [0011939614231419] 0.002ms (+0.000ms): __lookup+0xe/0xd0 <c01e051e> (radix_tree_gang_lookup+0x52/0x70 <c01e0632>)
>
> (last two lines just repeat)
>
> This is probably not a regression; I had never tested this with NFS
> before.
>
> Lee
>
Lee,
I did some testing with NFS quite a while ago. Actually it was the NFS
compile within the stress-kernel pkg. I had similar crappy performance
problems. Ingo pointed out that there were serious locking issues with
NFS and suggested that I report the problems to the NFS folks, which I
did. The reports seemed to fall mostly on deaf ears, at least that was
my perspective. I decided to move on and took the NFS compile out of my
stress testing to be revisited at a later time.
--
kr
* Re: [patch] Real-Time Preemption, -RT-2.6.12-rc1-V0.7.41-00
@ 2005-03-21 16:45 Paul Mckenney
0 siblings, 0 replies; 11+ messages in thread
From: Paul Mckenney @ 2005-03-21 16:45 UTC (permalink / raw)
To: mingo; +Cc: rlrevell, linux-kernel
> got this early-bootup crash on an SMP box:
>
> BUG: Unable to handle kernel NULL pointer dereference at virtual address 00000000
> printing eip:
> c0131aec
> *pde = 00000000
> Oops: 0002 [#1]
> PREEMPT SMP
> Modules linked in:
> CPU: 1
> EIP: 0060:[<c0131aec>] Not tainted VLI
> EFLAGS: 00010293 (2.6.12-rc1-RT-V0.7.41-00)
> EIP is at rcu_advance_callbacks+0x3c/0x80
> eax: 00000000 ebx: c050f280 ecx: c12191e0 edx: 00000000
> esi: cfd2e560 edi: cfd2e4e0 ebp: cfd31dd0 esp: cfd31dc8
> ds: 007b es: 007b ss: 0068 preempt: 00000003
> Process khelper (pid: 60, threadinfo=cfd30000 task=cfd106a0)
> Stack: 00000001 c12191e0 cfd31de4 c0131b67 00000001 cfd2e4d8 c13004d8 cfd31e00
> c017e449 cfd2e4d8 c04d6e80 cfd32006 fffffffe cfd31e54 cfd31e70 c01749cc
> cfd2e4d8 cfd31e50 cfd31e4c 00000001 cfd32001 cfd2e4d8 c03dd41f c04cf920
> Call Trace:
> [<c010412f>] show_stack+0x7f/0xa0 (28)
> [<c01042da>] show_registers+0x16a/0x1e0 (56)
> [<c0104511>] die+0x101/0x190 (64)
> [<c0115862>] do_page_fault+0x442/0x680 (216)
> [<c0103d9b>] error_code+0x2b/0x30 (68)
> [<c0131b67>] call_rcu+0x37/0x70 (20)
> [<c017e449>] dput+0x139/0x210 (28)
> [<c01749cc>] __link_path_walk+0x9fc/0xf80 (112)
> [<c0174f9a>] link_path_walk+0x4a/0x130 (100)
> [<c017538e>] path_lookup+0x9e/0x1c0 (32)
> [<c01707e8>] open_exec+0x28/0x100 (100)
> [<c0171a04>] do_execve+0x44/0x220 (36)
> [<c0101da2>] sys_execve+0x42/0xa0 (36)
> [<c0103315>] syscall_call+0x7/0xb (-8096)
> ---------------------------
> | preempt count: 00000004 ]
> | 4-level deep critical section nesting:
> ----------------------------------------
> .. [<c0131b4f>] .... call_rcu+0x1f/0x70
> .....[<c017e449>] .. ( <= dput+0x139/0x210)
> .. [<c0131ac3>] .... rcu_advance_callbacks+0x13/0x80
> .....[<c0131b67>] .. ( <= call_rcu+0x37/0x70)
> .. [<c03dddca>] .... _raw_spin_lock_irqsave+0x1a/0xa0
> .....[<c010444f>] .. ( <= die+0x3f/0x190)
> .. [<c013b9e6>] .... print_traces+0x16/0x50
> .....[<c010412f>] .. ( <= show_stack+0x7f/0xa0)
> Code: 00 00 e8 78 2d 0a 00 8b 0c 85 20 20 51 c0 bb 80 f2 50 c0 01 d9 f0 83 44 24 00 00 a1 88 19 52 c0 39 41 40 74 23 8b 41 44 8b 51 50 <89> 02 8b 41 48 c7 41 44 00 00 00 00 89 41 50 8d 41 44 89 41 48
> <6>note: khelper[60] exited with preempt_count 2
>
> (gdb) list *0xc0131aec
> 0xc0131aec is in rcu_advance_callbacks (kernel/rcupdate.c:558).
>
> 553 struct rcu_data *rdp;
> 554
> 555 rdp = &get_cpu_var(rcu_data);
> 556 smp_mb(); /* prevent sampling batch # before list removal. */
> 557 if (rdp->batch != rcu_ctrlblk.batch) {
> 558 *rdp->donetail = rdp->waitlist;
> 559 rdp->donetail = rdp->waittail;
> 560 rdp->waitlist = NULL;
> 561 rdp->waittail = &rdp->waitlist;
> 562 rdp->batch = rcu_ctrlblk.batch;
> (gdb)
Does the following help?
Thanx, Paul
diff -urpN -X dontdiff linux-2.6.11.fixes/kernel/rcupdate.c linux-2.6.11.fixes2/kernel/rcupdate.c
--- linux-2.6.11.fixes/kernel/rcupdate.c Mon Mar 21 08:14:47 2005
+++ linux-2.6.11.fixes2/kernel/rcupdate.c Mon Mar 21 08:17:00 2005
@@ -620,7 +620,7 @@ static void rcu_process_callbacks(void)
 		return;
 	}
 	rdp->donelist = NULL;
-	rdp->donetail = &rdp->waitlist;
+	rdp->donetail = &rdp->donelist;
 	put_cpu_var(rcu_data);
 	while (list) {
 		next = list->next;
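
For readers following the one-line change above, here is a minimal user-space sketch (not kernel code) of why the tail pointer of an emptied list has to point back at that list's own head. The struct layout and every function name below are assumptions made up for illustration; only the field names donelist, donetail, waitlist and waittail, and the four splice assignments, come from the quoted rcupdate.c lines. The empty-list guard in advance() is also an addition for this sketch. It does not attempt to reproduce the oops itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued callback. */
struct cb {
	struct cb *next;
	int id;
};

/* Two singly linked lists, each with a pointer to its last "next" slot. */
struct cb_lists {
	struct cb *donelist, **donetail;
	struct cb *waitlist, **waittail;
};

static void lists_init(struct cb_lists *l)
{
	l->donelist = NULL;
	l->donetail = &l->donelist;
	l->waitlist = NULL;
	l->waittail = &l->waitlist;
}

/* Append one callback to the wait list through its tail pointer. */
static void enqueue_wait(struct cb_lists *l, struct cb *c)
{
	assert(c != NULL);
	c->next = NULL;
	*l->waittail = c;
	l->waittail = &c->next;
}

/* Splice the wait list onto the end of the done list, using the same
 * four assignments as the quoted lines 558-561. */
static void advance(struct cb_lists *l)
{
	if (l->waitlist == NULL)	/* guard added for this sketch only */
		return;
	*l->donetail = l->waitlist;
	l->donetail = l->waittail;
	l->waitlist = NULL;
	l->waittail = &l->waitlist;
}

/* Detach the done list for processing. buggy != 0 selects the
 * pre-patch tail reset, buggy == 0 the patched one. */
static struct cb *take_done(struct cb_lists *l, int buggy)
{
	struct cb *list = l->donelist;

	l->donelist = NULL;
	l->donetail = buggy ? &l->waitlist : &l->donelist;
	return list;
}

static int count(const struct cb *c)
{
	int n = 0;

	for (; c != NULL; c = c->next)
		n++;
	return n;
}

/* Process an empty done list, queue two callbacks, advance, then count
 * how many callbacks actually arrive on the done list. */
int callbacks_reaching_done_list(int buggy)
{
	struct cb_lists l;
	struct cb a = { NULL, 1 }, b = { NULL, 2 };

	lists_init(&l);
	(void)take_done(&l, buggy);
	enqueue_wait(&l, &a);
	enqueue_wait(&l, &b);
	advance(&l);
	return count(take_done(&l, buggy));
}
```

In this sketch callbacks_reaching_done_list(0) returns 2, while callbacks_reaching_done_list(1) returns 0: with donetail left pointing at waitlist, the splice in advance() writes the wait list back onto itself and both callbacks are dropped before they are ever invoked.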