From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
laijs@cn.fujitsu.com, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
josh@joshtriplett.org, tglx@linutronix.de, rostedt@goodmis.org,
dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com,
fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com
Subject: Re: [PATCH RFC tip/core/rcu 1/9] rcu: Add call_rcu_tasks()
Date: Tue, 29 Jul 2014 09:33:12 -0700
Message-ID: <20140729163312.GR11241@linux.vnet.ibm.com>
In-Reply-To: <20140729160754.GW20603@laptop.programming.kicks-ass.net>

On Tue, Jul 29, 2014 at 06:07:54PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 29, 2014 at 08:57:47AM -0700, Paul E. McKenney wrote:
> > On Tue, Jul 29, 2014 at 09:50:55AM +0200, Peter Zijlstra wrote:
> > > On Mon, Jul 28, 2014 at 03:56:12PM -0700, Paul E. McKenney wrote:
> > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > > index bc1638b33449..a0d2f3a03566 100644
> > > > --- a/kernel/sched/core.c
> > > > +++ b/kernel/sched/core.c
> > > > @@ -2762,6 +2762,7 @@ need_resched:
> > > >  		} else {
> > > >  			deactivate_task(rq, prev, DEQUEUE_SLEEP);
> > > >  			prev->on_rq = 0;
> > > > +			rcu_note_voluntary_context_switch(prev);
> > > >
> > > >  			/*
> > > >  			 * If a worker went to sleep, notify and ask workqueue
> > > > @@ -2828,6 +2829,7 @@ asmlinkage __visible void __sched schedule(void)
> > > >  	struct task_struct *tsk = current;
> > > >
> > > >  	sched_submit_work(tsk);
> > > > +	rcu_note_voluntary_context_switch(tsk);
> > > >  	__schedule();
> > > >  }
> > >
> > > Yeah, not entirely happy with that, you add two calls into one of the
> > > hottest paths of the kernel.
> >
> > I did look into leveraging counters, but cannot remember why I decided
> > that this was a bad idea. I guess it is time to recheck...
> >
> > The ->nvcsw field in the task_struct structure looks promising:
> >
> > o	Looks like it does in fact get incremented in __schedule() via
> > 	the switch_count pointer.
> >
> > o	Looks like it is unconditionally compiled in.
> >
> > o	There are no memory barriers, but a synchronize_sched()
> > 	should take care of that, given that this counter is
> > 	incremented with interrupts disabled.
>
> Well, there's obviously the actual context switch, which should imply an
> actual MB such that tasks are self ordered even when execution continues
> on another cpu etc..

True enough, except that it appears that the context switch happens
after the ->nvcsw increment, which means that it doesn't help RCU-tasks
guarantee that if it has seen the increment, then all prior processing
has completed.  There might be enough stuff prior to the increment, but I
don't see anything that I feel comfortable relying on.  Am I missing
some ordering?

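For concreteness, the ->nvcsw snapshot idea under discussion might look
something like the user-space sketch below.  Everything except the
->nvcsw name is hypothetical (struct task_sketch, struct nvcsw_snap, and
the three helpers are illustration only, not kernel APIs), and the
sketch deliberately ignores the ordering question raised above: it shows
only the snapshot-and-compare step, not the memory barriers that a real
implementation would need.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one task_struct field that matters here. */
struct task_sketch {
	unsigned long nvcsw;	/* count of voluntary context switches */
};

/* Hypothetical per-task record kept by the grace-period scan. */
struct nvcsw_snap {
	const struct task_sketch *t;
	unsigned long nvcsw_at_start;
};

/* Record the task's counter at the start of the grace period. */
static void snap_take(struct nvcsw_snap *s, const struct task_sketch *t)
{
	s->t = t;
	s->nvcsw_at_start = t->nvcsw;
}

/*
 * Return true once the counter differs from the snapshot, meaning the
 * task has done at least one voluntary context switch since snap_take().
 * Comparing for inequality keeps this correct across counter wrap.
 * NOTE: no ordering is provided here; that is the open question.
 */
static bool snap_passed_qs(const struct nvcsw_snap *s)
{
	return s->t->nvcsw != s->nvcsw_at_start;
}

/* Simulate the increment done on a voluntary context switch. */
static void sketch_voluntary_switch(struct task_sketch *t)
{
	t->nvcsw++;
}
```

The point of this shape is that the scheduler fastpath is left alone:
the only work is done by the (rare) updater, which takes one snapshot
per task and then polls snap_passed_qs() until every task has moved on.
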
> > So I should be able to snapshot the task_struct structure's ->nvcsw
> > field and avoid the added code in the fastpaths.
> >
> > Seem plausible, or am I confused about the role of ->nvcsw?
>
> Nope, that's the 'I scheduled to go to sleep' counter.

I am assuming that the "Nope" goes with "am I confused" rather than
"Seem plausible" -- if not, please let me know.  ;-)

> There is of course the 'polling' issue I raised in a further email...

Yep, and other flavors of RCU go to lengths to avoid scanning the
task_struct lists.  Steven said that updates will be rare and that it
is OK for them to have high latency and overhead.  Thus far, I am taking
him at his word.  ;-)

I considered interrupting the task_struct polling loop periodically,
and would add that if needed.  That said, this requires nailing down the
task_struct at which the vacation is taken.  Here "nailing down" does not
simply mean "prevent from being freed", but rather "prevent from being
removed from the lists traversed by do_each_thread/while_each_thread."

Of course, if there is some easy way of doing this, please let me know!

> > > And I'm still not entirely sure why, your 0/x babbled something about
> > > trampolines, but I'm not sure I understand how those lead to this.
> >
> > Steven Rostedt sent an email recently giving more detail. And of course
> > now I am having trouble finding it. Maybe he will take pity on us and
> > send along a pointer to it. ;-)
>
> Yah, would make good Changelog material that ;-)

;-) ;-) ;-)

							Thanx, Paul