* [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
@ 2010-03-09 11:13 Lai Jiangshan
2010-03-09 12:52 ` Steven Rostedt
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Lai Jiangshan @ 2010-03-09 11:13 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, Mathieu Desnoyers,
josh, LKML, Frederic Weisbecker
Current, synchronize_sched() ignores preempt-disable()
sequences in the idle loop. It makes synchronize_sched()
is not so pure, and it hurts tracing.
Paul have a proposal before:
http://lkml.org/lkml/2009/4/5/140
http://lkml.org/lkml/2009/4/6/496
But old fix needs to hack into all architectures' idle loops.
This is another try, it uses the fact that idle loops
are executing with preept_count()=1.
But I didn't look deep into all idle loops.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 3ec8160..0761723 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -80,6 +80,10 @@ DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh_state);
DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
+#ifndef IDLE_CORE_LOOP_PREEMPT_COUNT
+#define IDLE_CORE_LOOP_PREEMPT_COUNT (1)
+#endif
+
/*
* Return true if an RCU grace period is in progress. The ACCESS_ONCE()s
* permit this function to be invoked without holding the root rcu_node
@@ -1114,6 +1118,26 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
raise_softirq(RCU_SOFTIRQ);
}
+static inline int rcu_idle_qs(int cpu)
+{
+ if (!idle_cpu(cpu))
+ return 0;
+
+ if (!rcu_scheduler_active)
+ return 0;
+
+ if (in_softirq())
+ return 0;
+
+ if (hardirq_count() > (1 << HARDIRQ_SHIFT))
+ return 0;
+
+ if ((preempt_count() & PREEMPT_MASK) > IDLE_CORE_LOOP_PREEMPT_COUNT)
+ return 0;
+
+ return 1;
+}
+
/*
* Check to see if this CPU is in a non-context-switch quiescent state
* (user mode or idle loop for rcu, non-softirq execution for rcu_bh).
@@ -1127,9 +1151,7 @@ void rcu_check_callbacks(int cpu, int user)
{
if (!rcu_pending(cpu))
return; /* if nothing for RCU to do. */
- if (user ||
- (idle_cpu(cpu) && rcu_scheduler_active &&
- !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
+ if (user || rcu_idle_qs(cpu)) {
/*
* Get here if this CPU took its interrupt from user
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
2010-03-09 11:13 [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop Lai Jiangshan
@ 2010-03-09 12:52 ` Steven Rostedt
2010-03-10 0:57 ` Lai Jiangshan
2010-03-10 0:54 ` Frederic Weisbecker
2010-03-10 1:30 ` Paul E. McKenney
2 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2010-03-09 12:52 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Paul E. McKenney, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
josh, LKML, Frederic Weisbecker
On Tue, 2010-03-09 at 19:13 +0800, Lai Jiangshan wrote:
> Current, synchronize_sched() ignores preempt-disable()
> sequences in the idle loop. It makes synchronize_sched()
> is not so pure, and it hurts tracing.
>
> Paul have a proposal before:
> http://lkml.org/lkml/2009/4/5/140
> http://lkml.org/lkml/2009/4/6/496
> But old fix needs to hack into all architectures' idle loops.
>
> This is another try, it uses the fact that idle loops
> are executing with preept_count()=1.
> But I didn't look deep into all idle loops.
Lai,
Does this (with your patch) fix the bug you were seeing with the ring
buffer code?
-- Steve
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 3ec8160..0761723 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -80,6 +80,10 @@ DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
> struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh_state);
> DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
>
> +#ifndef IDLE_CORE_LOOP_PREEMPT_COUNT
> +#define IDLE_CORE_LOOP_PREEMPT_COUNT (1)
> +#endif
> +
> /*
> * Return true if an RCU grace period is in progress. The ACCESS_ONCE()s
> * permit this function to be invoked without holding the root rcu_node
> @@ -1114,6 +1118,26 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
> raise_softirq(RCU_SOFTIRQ);
> }
>
> +static inline int rcu_idle_qs(int cpu)
> +{
> + if (!idle_cpu(cpu))
> + return 0;
> +
> + if (!rcu_scheduler_active)
> + return 0;
> +
> + if (in_softirq())
> + return 0;
> +
> + if (hardirq_count() > (1 << HARDIRQ_SHIFT))
> + return 0;
> +
> + if ((preempt_count() & PREEMPT_MASK) > IDLE_CORE_LOOP_PREEMPT_COUNT)
> + return 0;
> +
> + return 1;
> +}
> +
> /*
> * Check to see if this CPU is in a non-context-switch quiescent state
> * (user mode or idle loop for rcu, non-softirq execution for rcu_bh).
> @@ -1127,9 +1151,7 @@ void rcu_check_callbacks(int cpu, int user)
> {
> if (!rcu_pending(cpu))
> return; /* if nothing for RCU to do. */
> - if (user ||
> - (idle_cpu(cpu) && rcu_scheduler_active &&
> - !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> + if (user || rcu_idle_qs(cpu)) {
>
> /*
> * Get here if this CPU took its interrupt from user
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
2010-03-09 11:13 [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop Lai Jiangshan
2010-03-09 12:52 ` Steven Rostedt
@ 2010-03-10 0:54 ` Frederic Weisbecker
2010-03-10 1:30 ` Paul E. McKenney
2 siblings, 0 replies; 8+ messages in thread
From: Frederic Weisbecker @ 2010-03-10 0:54 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Paul E. McKenney, Ingo Molnar, Peter Zijlstra, Steven Rostedt,
Mathieu Desnoyers, josh, LKML
On Tue, Mar 09, 2010 at 07:13:27PM +0800, Lai Jiangshan wrote:
>
> Current, synchronize_sched() ignores preempt-disable()
> sequences in the idle loop. It makes synchronize_sched()
> is not so pure, and it hurts tracing.
>
> Paul have a proposal before:
> http://lkml.org/lkml/2009/4/5/140
> http://lkml.org/lkml/2009/4/6/496
> But old fix needs to hack into all architectures' idle loops.
>
> This is another try, it uses the fact that idle loops
> are executing with preept_count()=1.
> But I didn't look deep into all idle loops.
>
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 3ec8160..0761723 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -80,6 +80,10 @@ DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
> struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh_state);
> DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
>
> +#ifndef IDLE_CORE_LOOP_PREEMPT_COUNT
> +#define IDLE_CORE_LOOP_PREEMPT_COUNT (1)
> +#endif
> +
> /*
> * Return true if an RCU grace period is in progress. The ACCESS_ONCE()s
> * permit this function to be invoked without holding the root rcu_node
> @@ -1114,6 +1118,26 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
> raise_softirq(RCU_SOFTIRQ);
> }
>
> +static inline int rcu_idle_qs(int cpu)
> +{
> + if (!idle_cpu(cpu))
> + return 0;
> +
> + if (!rcu_scheduler_active)
> + return 0;
> +
> + if (in_softirq())
> + return 0;
> +
> + if (hardirq_count() > (1 << HARDIRQ_SHIFT))
> + return 0;
> +
> + if ((preempt_count() & PREEMPT_MASK) > IDLE_CORE_LOOP_PREEMPT_COUNT)
> + return 0;
This is neat. But I wonder about something. It means that in most of
the idle loop, we won't be able to schedule the rcu callbacks
So if I understand well, this is going to needlessly delay
rcu callbacks for no strong reason most of the time. I'm not sure
if this is going to have any bad impact, but if so may be
do we want this as a feature, and check we currently have
running users of this feature (I only have function tracers in mind,
may be I'm missing some others). Altough I wonder how racy such
a check could be. At least we can make it a CONFIG.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
2010-03-09 12:52 ` Steven Rostedt
@ 2010-03-10 0:57 ` Lai Jiangshan
2010-03-10 13:51 ` Steven Rostedt
0 siblings, 1 reply; 8+ messages in thread
From: Lai Jiangshan @ 2010-03-10 0:57 UTC (permalink / raw)
To: rostedt
Cc: Paul E. McKenney, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
josh, LKML, Frederic Weisbecker
Steven Rostedt wrote:
> On Tue, 2010-03-09 at 19:13 +0800, Lai Jiangshan wrote:
>> Current, synchronize_sched() ignores preempt-disable()
>> sequences in the idle loop. It makes synchronize_sched()
>> is not so pure, and it hurts tracing.
>>
>> Paul have a proposal before:
>> http://lkml.org/lkml/2009/4/5/140
>> http://lkml.org/lkml/2009/4/6/496
>> But old fix needs to hack into all architectures' idle loops.
>>
>> This is another try, it uses the fact that idle loops
>> are executing with preept_count()=1.
>> But I didn't look deep into all idle loops.
>
> Lai,
>
> Does this (with your patch) fix the bug you were seeing with the ring
> buffer code?
>
No, this can not fix the bug we found with the ring buffer code.
I think the bug is not come from this issue or from RCU.
Lai
>
>> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
>> ---
>> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
>> index 3ec8160..0761723 100644
>> --- a/kernel/rcutree.c
>> +++ b/kernel/rcutree.c
>> @@ -80,6 +80,10 @@ DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
>> struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh_state);
>> DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
>>
>> +#ifndef IDLE_CORE_LOOP_PREEMPT_COUNT
>> +#define IDLE_CORE_LOOP_PREEMPT_COUNT (1)
>> +#endif
>> +
>> /*
>> * Return true if an RCU grace period is in progress. The ACCESS_ONCE()s
>> * permit this function to be invoked without holding the root rcu_node
>> @@ -1114,6 +1118,26 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
>> raise_softirq(RCU_SOFTIRQ);
>> }
>>
>> +static inline int rcu_idle_qs(int cpu)
>> +{
>> + if (!idle_cpu(cpu))
>> + return 0;
>> +
>> + if (!rcu_scheduler_active)
>> + return 0;
>> +
>> + if (in_softirq())
>> + return 0;
>> +
>> + if (hardirq_count() > (1 << HARDIRQ_SHIFT))
>> + return 0;
>> +
>> + if ((preempt_count() & PREEMPT_MASK) > IDLE_CORE_LOOP_PREEMPT_COUNT)
>> + return 0;
>> +
>> + return 1;
>> +}
>> +
>> /*
>> * Check to see if this CPU is in a non-context-switch quiescent state
>> * (user mode or idle loop for rcu, non-softirq execution for rcu_bh).
>> @@ -1127,9 +1151,7 @@ void rcu_check_callbacks(int cpu, int user)
>> {
>> if (!rcu_pending(cpu))
>> return; /* if nothing for RCU to do. */
>> - if (user ||
>> - (idle_cpu(cpu) && rcu_scheduler_active &&
>> - !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
>> + if (user || rcu_idle_qs(cpu)) {
>>
>> /*
>> * Get here if this CPU took its interrupt from user
>>
>
>
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
2010-03-09 11:13 [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop Lai Jiangshan
2010-03-09 12:52 ` Steven Rostedt
2010-03-10 0:54 ` Frederic Weisbecker
@ 2010-03-10 1:30 ` Paul E. McKenney
2010-03-10 2:13 ` Lai Jiangshan
2 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2010-03-10 1:30 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, Mathieu Desnoyers,
josh, LKML, Frederic Weisbecker
On Tue, Mar 09, 2010 at 07:13:27PM +0800, Lai Jiangshan wrote:
>
> Current, synchronize_sched() ignores preempt-disable()
> sequences in the idle loop. It makes synchronize_sched()
> is not so pure, and it hurts tracing.
>
> Paul have a proposal before:
> http://lkml.org/lkml/2009/4/5/140
> http://lkml.org/lkml/2009/4/6/496
> But old fix needs to hack into all architectures' idle loops.
>
> This is another try, it uses the fact that idle loops
> are executing with preept_count()=1.
> But I didn't look deep into all idle loops.
Hello, Lai!
One question below...
Thanx, Paul
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 3ec8160..0761723 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -80,6 +80,10 @@ DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
> struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh_state);
> DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
>
> +#ifndef IDLE_CORE_LOOP_PREEMPT_COUNT
> +#define IDLE_CORE_LOOP_PREEMPT_COUNT (1)
> +#endif
> +
> /*
> * Return true if an RCU grace period is in progress. The ACCESS_ONCE()s
> * permit this function to be invoked without holding the root rcu_node
> @@ -1114,6 +1118,26 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
> raise_softirq(RCU_SOFTIRQ);
> }
>
> +static inline int rcu_idle_qs(int cpu)
> +{
> + if (!idle_cpu(cpu))
> + return 0;
> +
> + if (!rcu_scheduler_active)
> + return 0;
> +
> + if (in_softirq())
> + return 0;
> +
> + if (hardirq_count() > (1 << HARDIRQ_SHIFT))
> + return 0;
> +
> + if ((preempt_count() & PREEMPT_MASK) > IDLE_CORE_LOOP_PREEMPT_COUNT)
> + return 0;
How does this work in CONFIG_PREEMPT=n kernels? I don't see how it
does, regardless of what preempt_count() returns in this case.
Any enlightenment?
> +
> + return 1;
> +}
> +
> /*
> * Check to see if this CPU is in a non-context-switch quiescent state
> * (user mode or idle loop for rcu, non-softirq execution for rcu_bh).
> @@ -1127,9 +1151,7 @@ void rcu_check_callbacks(int cpu, int user)
> {
> if (!rcu_pending(cpu))
> return; /* if nothing for RCU to do. */
> - if (user ||
> - (idle_cpu(cpu) && rcu_scheduler_active &&
> - !in_softirq() && hardirq_count() <= (1 << HARDIRQ_SHIFT))) {
> + if (user || rcu_idle_qs(cpu)) {
>
> /*
> * Get here if this CPU took its interrupt from user
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
2010-03-10 1:30 ` Paul E. McKenney
@ 2010-03-10 2:13 ` Lai Jiangshan
2010-03-10 2:28 ` Paul E. McKenney
0 siblings, 1 reply; 8+ messages in thread
From: Lai Jiangshan @ 2010-03-10 2:13 UTC (permalink / raw)
To: paulmck
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, Mathieu Desnoyers,
josh, LKML, Frederic Weisbecker
Paul E. McKenney wrote:
>>
>> This is another try, it uses the fact that idle loops
>> are executing with preept_count()=1.
>> But I didn't look deep into all idle loops.
>
> Hello, Lai!
>
> One question below...
>
> Thanx, Paul
>
[...]
>> +
>> + if ((preempt_count() & PREEMPT_MASK) > IDLE_CORE_LOOP_PREEMPT_COUNT)
>> + return 0;
>
> How does this work in CONFIG_PREEMPT=n kernels? I don't see how it
> does, regardless of what preempt_count() returns in this case.
>
>
You are right, It cannot work in CONFIG_PREEMPT=n kernels.
ignore this stupid patch.
Thanx, Lai
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
2010-03-10 2:13 ` Lai Jiangshan
@ 2010-03-10 2:28 ` Paul E. McKenney
0 siblings, 0 replies; 8+ messages in thread
From: Paul E. McKenney @ 2010-03-10 2:28 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Ingo Molnar, Peter Zijlstra, Steven Rostedt, Mathieu Desnoyers,
josh, LKML, Frederic Weisbecker
On Wed, Mar 10, 2010 at 10:13:56AM +0800, Lai Jiangshan wrote:
> Paul E. McKenney wrote:
> >>
> >> This is another try, it uses the fact that idle loops
> >> are executing with preept_count()=1.
> >> But I didn't look deep into all idle loops.
> >
> > Hello, Lai!
> >
> > One question below...
> >
> > Thanx, Paul
> >
>
> [...]
>
> >> +
> >> + if ((preempt_count() & PREEMPT_MASK) > IDLE_CORE_LOOP_PREEMPT_COUNT)
> >> + return 0;
> >
> > How does this work in CONFIG_PREEMPT=n kernels? I don't see how it
> > does, regardless of what preempt_count() returns in this case.
>
> You are right, It cannot work in CONFIG_PREEMPT=n kernels.
> ignore this stupid patch.
Don't be too hard on yourself -- it is an interesting idea, just doesn't
seem to quite work out. Perhaps it will lead you to another idea that
does work.
Thanx, Paul
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop
2010-03-10 0:57 ` Lai Jiangshan
@ 2010-03-10 13:51 ` Steven Rostedt
0 siblings, 0 replies; 8+ messages in thread
From: Steven Rostedt @ 2010-03-10 13:51 UTC (permalink / raw)
To: Lai Jiangshan
Cc: Paul E. McKenney, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers,
josh, LKML, Frederic Weisbecker
On Wed, 2010-03-10 at 08:57 +0800, Lai Jiangshan wrote:
> No, this can not fix the bug we found with the ring buffer code.
> I think the bug is not come from this issue or from RCU.
Looking at the stress test, I believe the bug is not at the ring buffer
layer, but the ftrace layer above it.
There are some restrictions that are suppose to be implemented with
various actions of the ring buffer.
1) When using the iterator (reading the non-consuming trace file), the
ring buffer is not to be modified.
2) Same is true with resize and reset. The buffer should not be modified
during these actions, but more importantly, they should not happen while
an iterator is active.
3) Ftrace uses two ring buffers (one to store max latencies), when a max
is hit it switches buffers. I'm thinking that there may be cases we are
starting an operation on one and finishing it on another.
I've already found a few cases that ftrace does not protect against the
above, with the operations that this stress test does.
The main issue I see is that it is switching the tracer (which resets)
and it is also resizing the buffer, while the trace file is being read.
This may cause the iterator to be pointing to false pages. I'm going see
if disabling resizing and switching tracers when the trace file is open
prevents the crash. If this does prevent the crashes, then I'll see if I
can make the iterator more robust, if not, then we will have to prevent
these actions while someone has the trace file opened.
Thanks,
-- Steve
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-03-10 13:52 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-09 11:13 [RFC PATCH] rcu: don't ignore preempt_disable() in the idle loop Lai Jiangshan
2010-03-09 12:52 ` Steven Rostedt
2010-03-10 0:57 ` Lai Jiangshan
2010-03-10 13:51 ` Steven Rostedt
2010-03-10 0:54 ` Frederic Weisbecker
2010-03-10 1:30 ` Paul E. McKenney
2010-03-10 2:13 ` Lai Jiangshan
2010-03-10 2:28 ` Paul E. McKenney
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox