* [PATCH v2 1/2] workqueue: add quiescent state between work items
       [not found] <1412529862-17954-1-git-send-email-joe.lawrence@stratus.com>
@ 2014-10-05 17:24 ` Joe Lawrence
  2014-10-05 19:21   ` Tejun Heo
From: Joe Lawrence @ 2014-10-05 17:24 UTC
To: linux-kernel; +Cc: tj, paulmck, jiri, Joe Lawrence, stable

Similar to the stop_machine deadlock scenario on !PREEMPT kernels
addressed in b22ce2785d97 "workqueue: cond_resched() after processing
each work item", kworker threads requeueing back-to-back with zero jiffy
delay can stall RCU.  The cond_resched call introduced in that fix will
yield only iff there are other higher priority tasks to run, so force a
quiescent RCU state between work items.

Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
Cc: <stable@vger.kernel.org>
---
 kernel/workqueue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 5dbe22a..345bec9 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2043,8 +2043,10 @@ __acquires(&pool->lock)
 	 * kernels, where a requeueing work item waiting for something to
 	 * happen could deadlock with stop_machine as such work item could
 	 * indefinitely requeue itself while all other CPUs are trapped in
-	 * stop_machine.
+	 * stop_machine. At the same time, report a quiescent RCU state so
+	 * the same condition doesn't freeze RCU.
 	 */
+	rcu_note_voluntary_context_switch(current);
 	cond_resched();
 
 	spin_lock_irq(&pool->lock);
--
1.7.10.4

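The reasoning in the commit message above can be sketched with a toy user-space model. This is not kernel code: every `toy_` name is hypothetical, and the model simply assumes that a context switch counts as a quiescent state and that the stand-in for cond_resched() switches only when another task is runnable. Under those assumptions, a self-requeueing work item on an otherwise idle CPU never passes through a quiescent state unless one is reported explicitly.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_cpu {
	unsigned long qs_seen;		/* quiescent states reported so far */
	bool other_task_runnable;	/* would a reschedule actually happen? */
};

/*
 * Stand-in for cond_resched(): it only switches tasks (and so only
 * passes through a quiescent state) when something else is runnable.
 */
static bool toy_cond_resched(struct toy_cpu *cpu)
{
	if (!cpu->other_task_runnable)
		return false;
	cpu->qs_seen++;
	return true;
}

/* Stand-in for the explicit report added by the patch: unconditional. */
static void toy_note_quiescent_state(struct toy_cpu *cpu)
{
	cpu->qs_seen++;
}

/*
 * Run `rounds` iterations of a work item that requeues itself with zero
 * delay. Returns how many quiescent states were seen during the run.
 */
static unsigned long toy_worker_loop(struct toy_cpu *cpu, int rounds,
				     bool note_qs)
{
	unsigned long before = cpu->qs_seen;

	assert(rounds >= 0);
	for (int i = 0; i < rounds; i++) {
		/* ... the work item would run and requeue itself here ... */
		if (note_qs)
			toy_note_quiescent_state(cpu);
		toy_cond_resched(cpu);
	}
	return cpu->qs_seen - before;
}
```

With `other_task_runnable` left false, calling `toy_worker_loop(&cpu, 1000, false)` reports no quiescent states at all, while the same call with `note_qs` set to true reports one per work item. That is the behaviour difference the patch is after; the real RCU bookkeeping is considerably more involved than a counter.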
* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-05 17:24 ` [PATCH v2 1/2] workqueue: add quiescent state between work items Joe Lawrence
@ 2014-10-05 19:21   ` Tejun Heo
  2014-10-05 19:47     ` Tejun Heo
From: Tejun Heo @ 2014-10-05 19:21 UTC
To: Joe Lawrence; +Cc: linux-kernel, paulmck, jiri, stable

On Sun, Oct 05, 2014 at 01:24:21PM -0400, Joe Lawrence wrote:
> Similar to the stop_machine deadlock scenario on !PREEMPT kernels
> addressed in b22ce2785d97 "workqueue: cond_resched() after processing
> each work item", kworker threads requeueing back-to-back with zero jiffy
> delay can stall RCU.  The cond_resched call introduced in that fix will
> yield only iff there are other higher priority tasks to run, so force a
> quiescent RCU state between work items.
>
> Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
> Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
> Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
> Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
> Cc: <stable@vger.kernel.org>

Applied to wq/for-3.17-fixes.  If 3.17 comes out before this gets
merged, I'll send it as for-3.18.

Thanks.

--
tejun

* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-05 19:21 ` Tejun Heo
@ 2014-10-05 19:47   ` Tejun Heo
  2014-10-06 4:21     ` Paul E. McKenney
From: Tejun Heo @ 2014-10-05 19:47 UTC
To: Joe Lawrence; +Cc: linux-kernel, paulmck, jiri, stable

On Sun, Oct 05, 2014 at 03:21:19PM -0400, Tejun Heo wrote:
> On Sun, Oct 05, 2014 at 01:24:21PM -0400, Joe Lawrence wrote:
> > Similar to the stop_machine deadlock scenario on !PREEMPT kernels
> > addressed in b22ce2785d97 "workqueue: cond_resched() after processing
> > each work item", kworker threads requeueing back-to-back with zero jiffy
> > delay can stall RCU.  The cond_resched call introduced in that fix will
> > yield only iff there are other higher priority tasks to run, so force a
> > quiescent RCU state between work items.
> >
> > Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
> > Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
> > Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
> > Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
> > Cc: <stable@vger.kernel.org>
>
> Applied to wq/for-3.17-fixes.  If 3.17 comes out before this gets
> merged, I'll send it as for-3.18.

Oops, the rcu calls aren't in mainline yet.  I think it'd be best to
route these through the RCU tree.  Paul, can you please route these
two patches?

 Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

--
tejun

* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-05 19:47 ` Tejun Heo
@ 2014-10-06 4:21   ` Paul E. McKenney
  2014-10-07 7:29     ` Jiri Pirko
From: Paul E. McKenney @ 2014-10-06 4:21 UTC
To: Tejun Heo; +Cc: Joe Lawrence, linux-kernel, jiri, stable

On Sun, Oct 05, 2014 at 03:47:48PM -0400, Tejun Heo wrote:
> On Sun, Oct 05, 2014 at 03:21:19PM -0400, Tejun Heo wrote:
> > On Sun, Oct 05, 2014 at 01:24:21PM -0400, Joe Lawrence wrote:
> > > Similar to the stop_machine deadlock scenario on !PREEMPT kernels
> > > addressed in b22ce2785d97 "workqueue: cond_resched() after processing
> > > each work item", kworker threads requeueing back-to-back with zero jiffy
> > > delay can stall RCU.  The cond_resched call introduced in that fix will
> > > yield only iff there are other higher priority tasks to run, so force a
> > > quiescent RCU state between work items.
> > >
> > > Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
> > > Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
> > > Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
> > > Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
> > > Cc: <stable@vger.kernel.org>
> >
> > Applied to wq/for-3.17-fixes.  If 3.17 comes out before this gets
> > merged, I'll send it as for-3.18.
>
> Oops, the rcu calls aren't in mainline yet.  I think it'd be best to
> route these through the RCU tree.  Paul, can you please route these
> two patches?
>
>  Acked-by: Tejun Heo <tj@kernel.org>

Will do!

I will try 3.17, failing that, 3.18.

Thanx, Paul

* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-06 4:21 ` Paul E. McKenney
@ 2014-10-07 7:29   ` Jiri Pirko
  2014-10-07 13:43     ` Paul E. McKenney
From: Jiri Pirko @ 2014-10-07 7:29 UTC
To: Paul E. McKenney; +Cc: Tejun Heo, Joe Lawrence, linux-kernel, stable

Mon, Oct 06, 2014 at 06:21:58AM CEST, paulmck@linux.vnet.ibm.com wrote:
>On Sun, Oct 05, 2014 at 03:47:48PM -0400, Tejun Heo wrote:
>> On Sun, Oct 05, 2014 at 03:21:19PM -0400, Tejun Heo wrote:
>> > On Sun, Oct 05, 2014 at 01:24:21PM -0400, Joe Lawrence wrote:
>> > > Similar to the stop_machine deadlock scenario on !PREEMPT kernels
>> > > addressed in b22ce2785d97 "workqueue: cond_resched() after processing
>> > > each work item", kworker threads requeueing back-to-back with zero jiffy
>> > > delay can stall RCU.  The cond_resched call introduced in that fix will
>> > > yield only iff there are other higher priority tasks to run, so force a
>> > > quiescent RCU state between work items.
>> > >
>> > > Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
>> > > Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
>> > > Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
>> > > Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
>> > > Cc: <stable@vger.kernel.org>
>> >
>> > Applied to wq/for-3.17-fixes.  If 3.17 comes out before this gets
>> > merged, I'll send it as for-3.18.
>>
>> Oops, the rcu calls aren't in mainline yet.  I think it'd be best to
>> route these through the RCU tree.  Paul, can you please route these
>> two patches?
>>
>>  Acked-by: Tejun Heo <tj@kernel.org>
>
>Will do!
>
>I will try 3.17, failing that, 3.18.

Paul, Tejun, how do you propose to fix this on older kernels which do
not have rcu_note_voluntary_context_switch? I'm particularly interested
in 3.10.

Thanks.

* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-07 7:29 ` Jiri Pirko
@ 2014-10-07 13:43   ` Paul E. McKenney
  2014-10-07 17:45     ` Joe Lawrence
From: Paul E. McKenney @ 2014-10-07 13:43 UTC
To: Jiri Pirko; +Cc: Tejun Heo, Joe Lawrence, linux-kernel, stable

On Tue, Oct 07, 2014 at 09:29:42AM +0200, Jiri Pirko wrote:
> Mon, Oct 06, 2014 at 06:21:58AM CEST, paulmck@linux.vnet.ibm.com wrote:
> >On Sun, Oct 05, 2014 at 03:47:48PM -0400, Tejun Heo wrote:
> >> On Sun, Oct 05, 2014 at 03:21:19PM -0400, Tejun Heo wrote:
> >> > On Sun, Oct 05, 2014 at 01:24:21PM -0400, Joe Lawrence wrote:
> >> > > Similar to the stop_machine deadlock scenario on !PREEMPT kernels
> >> > > addressed in b22ce2785d97 "workqueue: cond_resched() after processing
> >> > > each work item", kworker threads requeueing back-to-back with zero jiffy
> >> > > delay can stall RCU.  The cond_resched call introduced in that fix will
> >> > > yield only iff there are other higher priority tasks to run, so force a
> >> > > quiescent RCU state between work items.
> >> > >
> >> > > Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
> >> > > Link: https://lkml.kernel.org/r/20140926105227.01325697@jlaw-desktop.mno.stratus.com
> >> > > Link: https://lkml.kernel.org/r/20140929115445.40221d8e@jlaw-desktop.mno.stratus.com
> >> > > Fixes: b22ce2785d97 ("workqueue: cond_resched() after processing each work item")
> >> > > Cc: <stable@vger.kernel.org>
> >> >
> >> > Applied to wq/for-3.17-fixes.  If 3.17 comes out before this gets
> >> > merged, I'll send it as for-3.18.
> >>
> >> Oops, the rcu calls aren't in mainline yet.  I think it'd be best to
> >> route these through the RCU tree.  Paul, can you please route these
> >> two patches?
> >>
> >>  Acked-by: Tejun Heo <tj@kernel.org>
> >
> >Will do!
> >
> >I will try 3.17, failing that, 3.18.
>
>
> Paul, Tejun, how do you propose to fix this on older kernels which do
> not have rcu_note_voluntary_context_switch? I'm particularly interested
> in 3.10.

Hello, Jiri,

Older kernels can instead use rcu_note_context_switch().

Thanx, Paul

* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-07 13:43 ` Paul E. McKenney
@ 2014-10-07 17:45   ` Joe Lawrence
  2014-10-08 3:24     ` Paul E. McKenney
From: Joe Lawrence @ 2014-10-07 17:45 UTC
To: paulmck; +Cc: Jiri Pirko, Tejun Heo, linux-kernel, stable

On Tue, 7 Oct 2014 06:43:29 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Tue, Oct 07, 2014 at 09:29:42AM +0200, Jiri Pirko wrote:
[ ... snip ... ]
> >
> > Paul, Tejun, how do you propose to fix this on older kernels which do
> > not have rcu_note_voluntary_context_switch? I'm particularly interested
> > in 3.10.
>
> Hello, Jiri,
>
> Older kernels can instead use rcu_note_context_switch().

Hi Paul,

Does 4a81e8328d37 ("rcu: Reduce overhead of cond_resched() checks for
RCU") affect a backport to 3.10?

I noticed that rcu_note_context_switch added a call to
rcu_momentary_dyntick_idle in that change, which is only present in
v3.16+.

Would rcu_note_context_switch be effective by itself on a 3.10 kernel?

Thanks,

-- Joe

* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-07 17:45 ` Joe Lawrence
@ 2014-10-08 3:24   ` Paul E. McKenney
  2014-10-08 11:54     ` Jiri Pirko
From: Paul E. McKenney @ 2014-10-08 3:24 UTC
To: Joe Lawrence; +Cc: Jiri Pirko, Tejun Heo, linux-kernel, stable

On Tue, Oct 07, 2014 at 01:45:28PM -0400, Joe Lawrence wrote:
> On Tue, 7 Oct 2014 06:43:29 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>
> > On Tue, Oct 07, 2014 at 09:29:42AM +0200, Jiri Pirko wrote:
> [ ... snip ... ]
> > >
> > > Paul, Tejun, how do you propose to fix this on older kernels which do
> > > not have rcu_note_voluntary_context_switch? I'm particularly interested
> > > in 3.10.
> >
> > Hello, Jiri,
> >
> > Older kernels can instead use rcu_note_context_switch().
>
> Hi Paul,
>
> Does 4a81e8328d37 ("rcu: Reduce overhead of cond_resched() checks for
> RCU") affect a backport to 3.10?
>
> I noticed that rcu_note_context_switch added a call to
> rcu_momentary_dyntick_idle in that change, which is only present in
> v3.16+.
>
> Would rcu_note_context_switch be effective by itself on a 3.10 kernel?

Should be fine.  There is more overhead than current mainline, but that
should not be in the noise compared to executing a work-queue item.

Thanx, Paul

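The backport choice discussed above (report via rcu_note_voluntary_context_switch() where it exists, fall back to rcu_note_context_switch() on older trees such as 3.10) could be sketched as a small compatibility wrapper. The following is a user-space illustration only: `TOY_HAVE_VOLUNTARY_QS` and every `toy_` name are hypothetical, and the stubs just count calls so the selection is observable. How a real backport would detect which function is available is not something this thread settles.

```c
#include <assert.h>

/*
 * User-space sketch only. TOY_HAVE_VOLUNTARY_QS and every toy_ name are
 * hypothetical; the stubs count calls instead of talking to RCU.
 */
#ifndef TOY_HAVE_VOLUNTARY_QS
#define TOY_HAVE_VOLUNTARY_QS 0	/* 0 models an older tree such as 3.10 */
#endif

static int toy_voluntary_calls;	/* calls to the newer-style stub */
static int toy_ctxsw_calls;	/* calls to the older-style stub */

#if TOY_HAVE_VOLUNTARY_QS
/* Stand-in for rcu_note_voluntary_context_switch(current). */
static void toy_note_voluntary_context_switch(void)
{
	toy_voluntary_calls++;
}
#define toy_wq_report_qs(cpu) \
	((void)(cpu), toy_note_voluntary_context_switch())
#else
/* Stand-in for rcu_note_context_switch(cpu) as used in the 3.10 patch. */
static void toy_note_context_switch(int cpu)
{
	assert(cpu >= 0);
	toy_ctxsw_calls++;
}
#define toy_wq_report_qs(cpu) toy_note_context_switch(cpu)
#endif

/*
 * Models the tail of one worker iteration: report a quiescent state,
 * then (in the real code path) call cond_resched().
 * Returns the total number of reports made so far.
 */
static int toy_finish_work_item(int cpu)
{
	toy_wq_report_qs(cpu);
	/* cond_resched() would follow here in the kernel. */
	return toy_voluntary_calls + toy_ctxsw_calls;
}
```

With the default setting, each call to `toy_finish_work_item()` goes through the older-style stub, mirroring the 3.10 variant; building with `-DTOY_HAVE_VOLUNTARY_QS=1` routes the same call site to the newer-style stub without touching the caller.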
* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-08 3:24 ` Paul E. McKenney
@ 2014-10-08 11:54   ` Jiri Pirko
  2014-10-08 12:19     ` Paul E. McKenney
From: Jiri Pirko @ 2014-10-08 11:54 UTC
To: Paul E. McKenney; +Cc: Joe Lawrence, Tejun Heo, linux-kernel, stable

Wed, Oct 08, 2014 at 05:24:11AM CEST, paulmck@linux.vnet.ibm.com wrote:
>On Tue, Oct 07, 2014 at 01:45:28PM -0400, Joe Lawrence wrote:
>> On Tue, 7 Oct 2014 06:43:29 -0700
>> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
>>
>> > On Tue, Oct 07, 2014 at 09:29:42AM +0200, Jiri Pirko wrote:
>> [ ... snip ... ]
>> > >
>> > > Paul, Tejun, how do you propose to fix this on older kernels which do
>> > > not have rcu_note_voluntary_context_switch? I'm particularly interested
>> > > in 3.10.
>> >
>> > Hello, Jiri,
>> >
>> > Older kernels can instead use rcu_note_context_switch().
>>
>> Hi Paul,
>>
>> Does 4a81e8328d37 ("rcu: Reduce overhead of cond_resched() checks for
>> RCU") affect a backport to 3.10?
>>
>> I noticed that rcu_note_context_switch added a call to
>> rcu_momentary_dyntick_idle in that change, which is only present in
>> v3.16+.
>>
>> Would rcu_note_context_switch be effective by itself on a 3.10 kernel?
>
>Should be fine.  There is more overhead than current mainline, but that
>should not be in the noise compared to executing a work-queue item.
>
>	Thanx, Paul
>

I cooked up the following patch. Please tell me if it is fine or not. I can
also send it officially so it can be included into stable trees:

Subject: workqueue: Add quiescent state between work items

Similar to the stop_machine deadlock scenario on !PREEMPT kernels
addressed in b22ce2785d97 "workqueue: cond_resched() after processing
each work item", kworker threads requeueing back-to-back with zero jiffy
delay can stall RCU. The cond_resched call introduced in that fix will
yield only iff there are other higher priority tasks to run, so force a
quiescent RCU state between work items.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 kernel/workqueue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e9719c7..14a7163 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2196,8 +2196,10 @@ __acquires(&pool->lock)
 	 * kernels, where a requeueing work item waiting for something to
 	 * happen could deadlock with stop_machine as such work item could
 	 * indefinitely requeue itself while all other CPUs are trapped in
-	 * stop_machine.
+	 * stop_machine. At the same time, report a quiescent RCU state so
+	 * the same condition doesn't freeze RCU.
 	 */
+	rcu_note_context_switch(raw_smp_processor_id());
 	cond_resched();
 
 	spin_lock_irq(&pool->lock);
--
1.9.3

* Re: [PATCH v2 1/2] workqueue: add quiescent state between work items
  2014-10-08 11:54 ` Jiri Pirko
@ 2014-10-08 12:19   ` Paul E. McKenney
From: Paul E. McKenney @ 2014-10-08 12:19 UTC
To: Jiri Pirko; +Cc: Joe Lawrence, Tejun Heo, linux-kernel, stable

On Wed, Oct 08, 2014 at 01:54:28PM +0200, Jiri Pirko wrote:
> Wed, Oct 08, 2014 at 05:24:11AM CEST, paulmck@linux.vnet.ibm.com wrote:
> >On Tue, Oct 07, 2014 at 01:45:28PM -0400, Joe Lawrence wrote:
> >> On Tue, 7 Oct 2014 06:43:29 -0700
> >> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >>
> >> > On Tue, Oct 07, 2014 at 09:29:42AM +0200, Jiri Pirko wrote:
> >> [ ... snip ... ]
> >> > >
> >> > > Paul, Tejun, how do you propose to fix this on older kernels which do
> >> > > not have rcu_note_voluntary_context_switch? I'm particularly interested
> >> > > in 3.10.
> >> >
> >> > Hello, Jiri,
> >> >
> >> > Older kernels can instead use rcu_note_context_switch().
> >>
> >> Hi Paul,
> >>
> >> Does 4a81e8328d37 ("rcu: Reduce overhead of cond_resched() checks for
> >> RCU") affect a backport to 3.10?
> >>
> >> I noticed that rcu_note_context_switch added a call to
> >> rcu_momentary_dyntick_idle in that change, which is only present in
> >> v3.16+.
> >>
> >> Would rcu_note_context_switch be effective by itself on a 3.10 kernel?
> >
> >Should be fine.  There is more overhead than current mainline, but that
> >should not be in the noise compared to executing a work-queue item.
> >
> >	Thanx, Paul
> >
>
> I cooked up the following patch. Please tell me if it is fine or not. I can
> also send it officially so it can be included into stable trees:

Looks good!

Thanx, Paul

> Subject: workqueue: Add quiescent state between work items
>
> Similar to the stop_machine deadlock scenario on !PREEMPT kernels
> addressed in b22ce2785d97 "workqueue: cond_resched() after processing
> each work item", kworker threads requeueing back-to-back with zero jiffy
> delay can stall RCU. The cond_resched call introduced in that fix will
> yield only iff there are other higher priority tasks to run, so force a
> quiescent RCU state between work items.
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
>  kernel/workqueue.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index e9719c7..14a7163 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2196,8 +2196,10 @@ __acquires(&pool->lock)
>  	 * kernels, where a requeueing work item waiting for something to
>  	 * happen could deadlock with stop_machine as such work item could
>  	 * indefinitely requeue itself while all other CPUs are trapped in
> -	 * stop_machine.
> +	 * stop_machine. At the same time, report a quiescent RCU state so
> +	 * the same condition doesn't freeze RCU.
>  	 */
> +	rcu_note_context_switch(raw_smp_processor_id());
>  	cond_resched();
>
>  	spin_lock_irq(&pool->lock);
> --
> 1.9.3
>
