Consolidating RCU-bh, RCU-preempt, and RCU-sched

public inbox for linux-kernel@vger.kernel.org

* Consolidating RCU-bh, RCU-preempt, and RCU-sched
@ 2018-07-13  0:02 Paul E. McKenney
  2018-07-13  3:47 ` Lai Jiangshan
  0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2018-07-13  0:02 UTC
  To: josh, rostedt, mathieu.desnoyers, jiangshanlai
  Cc: linux-kernel, mingo, torvalds, peterz, oleg, edumazet, davem,
	tglx

Hello!

I now have a semi-reasonable prototype of changes consolidating the
RCU-bh, RCU-preempt, and RCU-sched update-side APIs in my -rcu tree.
There are likely still bugs to be fixed and probably other issues as well,
but a prototype does exist.

Assuming continued good rcutorture results and no objections, I am
thinking in terms of this timeline:

o	Preparatory work and cleanups are slated for the v4.19 merge window.

o	The actual consolidation and post-consolidation cleanup is slated
	for the merge window after v4.19 (v5.0?).  These cleanups include
	the replacements called out below within the RCU implementation
	itself (but excluding kernel/rcu/sync.c, see question below).

o	Replacement of now-obsolete update APIs is slated for the second
	merge window after v4.19 (v5.1?).  The replacements are currently
	expected to be as follows:

	synchronize_rcu_bh() -> synchronize_rcu()
	synchronize_rcu_bh_expedited() -> synchronize_rcu_expedited()
	call_rcu_bh() -> call_rcu()
	rcu_barrier_bh() -> rcu_barrier()
	synchronize_sched() -> synchronize_rcu()
	synchronize_sched_expedited() -> synchronize_rcu_expedited()
	call_rcu_sched() -> call_rcu()
	rcu_barrier_sched() -> rcu_barrier()
	get_state_synchronize_sched() -> get_state_synchronize_rcu()
	cond_synchronize_sched() -> cond_synchronize_rcu()
	synchronize_rcu_mult() -> synchronize_rcu()

	I have done light testing of these replacements with good results.
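
As an illustration only (this sketch is not part of the -rcu tree or
the original message), the replacement list above can be written down
as a small user-space lookup table, for example as the core of a
script that flags obsolete calls.  The struct name, table name, and
the rcu_replacement_for() helper are hypothetical; only the API names
in the strings come from the list.

```c
#include <stddef.h>
#include <string.h>

/* One row per obsolete update-side API named in the list above. */
struct rcu_api_map {
	const char *obsolete;
	const char *replacement;
};

static const struct rcu_api_map rcu_api_map[] = {
	{ "synchronize_rcu_bh",           "synchronize_rcu" },
	{ "synchronize_rcu_bh_expedited", "synchronize_rcu_expedited" },
	{ "call_rcu_bh",                  "call_rcu" },
	{ "rcu_barrier_bh",               "rcu_barrier" },
	{ "synchronize_sched",            "synchronize_rcu" },
	{ "synchronize_sched_expedited",  "synchronize_rcu_expedited" },
	{ "call_rcu_sched",               "call_rcu" },
	{ "rcu_barrier_sched",            "rcu_barrier" },
	{ "get_state_synchronize_sched",  "get_state_synchronize_rcu" },
	{ "cond_synchronize_sched",       "cond_synchronize_rcu" },
	{ "synchronize_rcu_mult",         "synchronize_rcu" },
};

/*
 * Hypothetical helper: return the proposed replacement for an obsolete
 * update-side API name, or NULL if the name is not in the table.
 */
const char *rcu_replacement_for(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(rcu_api_map) / sizeof(rcu_api_map[0]); i++) {
		if (strcmp(rcu_api_map[i].obsolete, name) == 0)
			return rcu_api_map[i].replacement;
	}
	return NULL;
}
```

A caller would pass each identifier found in a source file to
rcu_replacement_for() and report any non-NULL result.  The table covers
names only; it says nothing about whether arguments need to change.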

Any objections to this timeline?

I also have some questions on the ultimate end point.  I have default
choices, which I will likely take if there is no discussion.

o	Currently, I am thinking in terms of keeping the per-flavor
	read-side functions.  For example, rcu_read_lock_bh() would
	continue to disable softirq, and would also continue to tell
	lockdep about the RCU-bh read-side critical section.  However,
	synchronize_rcu() will wait for all flavors of read-side critical
	sections, including those introduced by (say) preempt_disable(),
	so there will no longer be any possibility of mismatching (say)
	RCU-bh readers with RCU-sched updaters.

	I could imagine other ways of handling this, including:

	a.	Eliminate rcu_read_lock_bh() in favor of
		local_bh_disable() and so on.  Rely on lockdep
		instrumentation of these other functions to identify RCU
		readers, introducing such instrumentation as needed.  I am
		not a fan of this approach because of the large number of
		places in the Linux kernel where interrupts, preemption,
		and softirqs are enabled or disabled "behind the scenes".

	b.	Eliminate rcu_read_lock_bh() in favor of rcu_read_lock(),
		and require callers to also disable softirqs, preemption,
		or whatever as needed.  I am not a fan of this approach
		because it seems a lot less convenient to users of RCU-bh
		and RCU-sched.

	At the moment, I therefore favor keeping the RCU-bh and RCU-sched
	read-side APIs.  But are there better approaches?

o	How should kernel/rcu/sync.c be handled?  Here are some
	possibilities:

	a.	Leave the full gp_ops[] array and simply translate
		the obsolete update-side functions to their RCU
		equivalents.

	b.	Leave the current gp_ops[] array, but only have
		the RCU_SYNC entry.  The __INIT_HELD field would
		be set to a function that was OK with being in an
		RCU read-side critical section, an interrupt-disabled
		section, etc.

		This allows for possible addition of SRCU functionality.
		It is also a trivial change.  Note that the sole user
		of sync.c uses RCU_SCHED_SYNC, and this would need to
		be changed to RCU_SYNC.

		But is it likely that we will ever add SRCU?

	c.	Eliminate that gp_ops[] array, hard-coding the function
		pointers into their call sites.

	I don't really have a preference.  Left to myself, I will be lazy
	and take option #a.  Are there better approaches?

o	Currently, if a lock related to the scheduler's rq or pi locks is
	held across rcu_read_unlock(), that lock must be held across the
	entire read-side critical section in order to avoid deadlock.
	Now that the end of the RCU read-side critical section is
	deferred until sometime after interrupts are re-enabled, this
	requirement could be lifted.  However, because the end of the RCU
	read-side critical section is detected sometime after interrupts
	are re-enabled, this means that a low-priority RCU reader might
	remain priority-boosted longer than need be, which could be a
	problem when running real-time workloads.

	My current thought is therefore to leave this constraint in
	place.  Thoughts?

Anything else that I should be worried about?  ;-)

							Thanx, Paul



* Re: Consolidating RCU-bh, RCU-preempt, and RCU-sched
  2018-07-13  0:02 Consolidating RCU-bh, RCU-preempt, and RCU-sched Paul E. McKenney
@ 2018-07-13  3:47 ` Lai Jiangshan
  2018-07-13  3:58   ` Paul E. McKenney
  2018-07-23 20:10   ` Steven Rostedt
  0 siblings, 2 replies; 5+ messages in thread
From: Lai Jiangshan @ 2018-07-13  3:47 UTC
  To: Paul E. McKenney
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, LKML,
	Ingo Molnar, Linus Torvalds, Peter Zijlstra, oleg, Eric Dumazet,
	davem, Thomas Gleixner

On Fri, Jul 13, 2018 at 8:02 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> Hello!
>
> I now have a semi-reasonable prototype of changes consolidating the
> RCU-bh, RCU-preempt, and RCU-sched update-side APIs in my -rcu tree.
> There are likely still bugs to be fixed and probably other issues as well,
> but a prototype does exist.
>
> Assuming continued good rcutorture results and no objections, I am
> thinking in terms of this timeline:
>
> o       Preparatory work and cleanups are slated for the v4.19 merge window.
>
> o       The actual consolidation and post-consolidation cleanup is slated
>         for the merge window after v4.19 (v5.0?).  These cleanups include
>         the replacements called out below within the RCU implementation
>         itself (but excluding kernel/rcu/sync.c, see question below).
>
> o       Replacement of now-obsolete update APIs is slated for the second
>         merge window after v4.19 (v5.1?).  The replacements are currently
>         expected to be as follows:
>
>         synchronize_rcu_bh() -> synchronize_rcu()
>         synchronize_rcu_bh_expedited() -> synchronize_rcu_expedited()
>         call_rcu_bh() -> call_rcu()
>         rcu_barrier_bh() -> rcu_barrier()
>         synchronize_sched() -> synchronize_rcu()
>         synchronize_sched_expedited() -> synchronize_rcu_expedited()
>         call_rcu_sched() -> call_rcu()
>         rcu_barrier_sched() -> rcu_barrier()
>         get_state_synchronize_sched() -> get_state_synchronize_rcu()
>         cond_synchronize_sched() -> cond_synchronize_rcu()
>         synchronize_rcu_mult() -> synchronize_rcu()
>
>         I have done light testing of these replacements with good results.
>
> Any objections to this timeline?
>
> I also have some questions on the ultimate end point.  I have default
> choices, which I will likely take if there is no discussion.
>
> o
>         Currently, I am thinking in terms of keeping the per-flavor
>         read-side functions.  For example, rcu_read_lock_bh() would
>         continue to disable softirq, and would also continue to tell
>         lockdep about the RCU-bh read-side critical section.  However,
>         synchronize_rcu() will wait for all flavors of read-side critical
>         sections, including those introduced by (say) preempt_disable(),
>         so there will no longer be any possibility of mismatching (say)
>         RCU-bh readers with RCU-sched updaters.
>
>         I could imagine other ways of handling this, including:
>
>         a.      Eliminate rcu_read_lock_bh() in favor of
>                 local_bh_disable() and so on.  Rely on lockdep
>                 instrumentation of these other functions to identify RCU
>                 readers, introducing such instrumentation as needed.  I am
>                 not a fan of this approach because of the large number of
>                 places in the Linux kernel where interrupts, preemption,
>                 and softirqs are enabled or disabled "behind the scenes".
>
>         b.      Eliminate rcu_read_lock_bh() in favor of rcu_read_lock(),
>                 and require callers to also disable softirqs, preemption,
>                 or whatever as needed.  I am not a fan of this approach
>                 because it seems a lot less convenient to users of RCU-bh
>                 and RCU-sched.
>
>         At the moment, I therefore favor keeping the RCU-bh and RCU-sched
>         read-side APIs.  But are there better approaches?

Hello, Paul

Since local_bh_disable() will be guaranteed to be protected by RCU
and is more general, I'm afraid it will be preferred over
rcu_read_lock_bh(), which will then gradually be phased out.

In other words, keeping the RCU-bh read-side APIs will be a slower
version of option A, and the same goes for RCU-sched.
But it will still be better than hurrying option A, IMHO.

Thanks,
Lai

>
> o       How should kernel/rcu/sync.c be handled?  Here are some
>         possibilities:
>
>         a.      Leave the full gp_ops[] array and simply translate
>                 the obsolete update-side functions to their RCU
>                 equivalents.
>
>         b.      Leave the current gp_ops[] array, but only have
>                 the RCU_SYNC entry.  The __INIT_HELD field would
>                 be set to a function that was OK with being in an
>                 RCU read-side critical section, an interrupt-disabled
>                 section, etc.
>
>                 This allows for possible addition of SRCU functionality.
>                 It is also a trivial change.  Note that the sole user
>                 of sync.c uses RCU_SCHED_SYNC, and this would need to
>                 be changed to RCU_SYNC.
>
>                 But is it likely that we will ever add SRCU?
>
>         c.      Eliminate that gp_ops[] array, hard-coding the function
>                 pointers into their call sites.
>
>         I don't really have a preference.  Left to myself, I will be lazy
>         and take option #a.  Are there better approaches?
>
> o       Currently, if a lock related to the scheduler's rq or pi locks is
>         held across rcu_read_unlock(), that lock must be held across the
>         entire read-side critical section in order to avoid deadlock.
>         Now that the end of the RCU read-side critical section is
>         deferred until sometime after interrupts are re-enabled, this
>         requirement could be lifted.  However, because the end of the RCU
>         read-side critical section is detected sometime after interrupts
>         are re-enabled, this means that a low-priority RCU reader might
>         remain priority-boosted longer than need be, which could be a
>         problem when running real-time workloads.
>
>         My current thought is therefore to leave this constraint in
>         place.  Thoughts?
>
> Anything else that I should be worried about?  ;-)
>
>                                                         Thanx, Paul
>


* Re: Consolidating RCU-bh, RCU-preempt, and RCU-sched
  2018-07-13  3:47 ` Lai Jiangshan
@ 2018-07-13  3:58   ` Paul E. McKenney
  2018-07-23 20:10   ` Steven Rostedt
  1 sibling, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2018-07-13  3:58 UTC
  To: Lai Jiangshan
  Cc: Josh Triplett, Steven Rostedt, Mathieu Desnoyers, LKML,
	Ingo Molnar, Linus Torvalds, Peter Zijlstra, oleg, Eric Dumazet,
	davem, Thomas Gleixner

On Fri, Jul 13, 2018 at 11:47:18AM +0800, Lai Jiangshan wrote:
> On Fri, Jul 13, 2018 at 8:02 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > Hello!
> >
> > I now have a semi-reasonable prototype of changes consolidating the
> > RCU-bh, RCU-preempt, and RCU-sched update-side APIs in my -rcu tree.
> > There are likely still bugs to be fixed and probably other issues as well,
> > but a prototype does exist.
> >
> > Assuming continued good rcutorture results and no objections, I am
> > thinking in terms of this timeline:
> >
> > o       Preparatory work and cleanups are slated for the v4.19 merge window.
> >
> > o       The actual consolidation and post-consolidation cleanup is slated
> >         for the merge window after v4.19 (v5.0?).  These cleanups include
> >         the replacements called out below within the RCU implementation
> >         itself (but excluding kernel/rcu/sync.c, see question below).
> >
> > o       Replacement of now-obsolete update APIs is slated for the second
> >         merge window after v4.19 (v5.1?).  The replacements are currently
> >         expected to be as follows:
> >
> >         synchronize_rcu_bh() -> synchronize_rcu()
> >         synchronize_rcu_bh_expedited() -> synchronize_rcu_expedited()
> >         call_rcu_bh() -> call_rcu()
> >         rcu_barrier_bh() -> rcu_barrier()
> >         synchronize_sched() -> synchronize_rcu()
> >         synchronize_sched_expedited() -> synchronize_rcu_expedited()
> >         call_rcu_sched() -> call_rcu()
> >         rcu_barrier_sched() -> rcu_barrier()
> >         get_state_synchronize_sched() -> get_state_synchronize_rcu()
> >         cond_synchronize_sched() -> cond_synchronize_rcu()
> >         synchronize_rcu_mult() -> synchronize_rcu()
> >
> >         I have done light testing of these replacements with good results.
> >
> > Any objections to this timeline?
> >
> > I also have some questions on the ultimate end point.  I have default
> > choices, which I will likely take if there is no discussion.
> >
> > o
> >         Currently, I am thinking in terms of keeping the per-flavor
> >         read-side functions.  For example, rcu_read_lock_bh() would
> >         continue to disable softirq, and would also continue to tell
> >         lockdep about the RCU-bh read-side critical section.  However,
> >         synchronize_rcu() will wait for all flavors of read-side critical
> >         sections, including those introduced by (say) preempt_disable(),
> >         so there will no longer be any possibility of mismatching (say)
> >         RCU-bh readers with RCU-sched updaters.
> >
> >         I could imagine other ways of handling this, including:
> >
> >         a.      Eliminate rcu_read_lock_bh() in favor of
> >                 local_bh_disable() and so on.  Rely on lockdep
> >                 instrumentation of these other functions to identify RCU
> >                 readers, introducing such instrumentation as needed.  I am
> >                 not a fan of this approach because of the large number of
> >                 places in the Linux kernel where interrupts, preemption,
> >                 and softirqs are enabled or disabled "behind the scenes".
> >
> >         b.      Eliminate rcu_read_lock_bh() in favor of rcu_read_lock(),
> >                 and require callers to also disable softirqs, preemption,
> >                 or whatever as needed.  I am not a fan of this approach
> >                 because it seems a lot less convenient to users of RCU-bh
> >                 and RCU-sched.
> >
> >         At the moment, I therefore favor keeping the RCU-bh and RCU-sched
> >         read-side APIs.  But are there better approaches?
> 
> Hello, Paul
> 
> Since local_bh_disable() will be guaranteed to be protected by RCU
> and is more general, I'm afraid it will be preferred over
> rcu_read_lock_bh(), which will then gradually be phased out.
> 
> In other words, keeping the RCU-bh read-side APIs will be a slower
> version of option A, and the same goes for RCU-sched.
> But it will still be better than hurrying option A, IMHO.

I am OK with the read-side RCU-bh and RCU-sched interfaces going away,
it is just that I am not willing to put all that much effort into
it myself.  ;-)

Unless there is a good reason for me to hurry it along, of course.

							Thanx, Paul

> Thanks,
> Lai
> 
> >
> > o       How should kernel/rcu/sync.c be handled?  Here are some
> >         possibilities:
> >
> >         a.      Leave the full gp_ops[] array and simply translate
> >                 the obsolete update-side functions to their RCU
> >                 equivalents.
> >
> >         b.      Leave the current gp_ops[] array, but only have
> >                 the RCU_SYNC entry.  The __INIT_HELD field would
> >                 be set to a function that was OK with being in an
> >                 RCU read-side critical section, an interrupt-disabled
> >                 section, etc.
> >
> >                 This allows for possible addition of SRCU functionality.
> >                 It is also a trivial change.  Note that the sole user
> >                 of sync.c uses RCU_SCHED_SYNC, and this would need to
> >                 be changed to RCU_SYNC.
> >
> >                 But is it likely that we will ever add SRCU?
> >
> >         c.      Eliminate that gp_ops[] array, hard-coding the function
> >                 pointers into their call sites.
> >
> >         I don't really have a preference.  Left to myself, I will be lazy
> >         and take option #a.  Are there better approaches?
> >
> > o       Currently, if a lock related to the scheduler's rq or pi locks is
> >         held across rcu_read_unlock(), that lock must be held across the
> >         entire read-side critical section in order to avoid deadlock.
> >         Now that the end of the RCU read-side critical section is
> >         deferred until sometime after interrupts are re-enabled, this
> >         requirement could be lifted.  However, because the end of the RCU
> >         read-side critical section is detected sometime after interrupts
> >         are re-enabled, this means that a low-priority RCU reader might
> >         remain priority-boosted longer than need be, which could be a
> >         problem when running real-time workloads.
> >
> >         My current thought is therefore to leave this constraint in
> >         place.  Thoughts?
> >
> > Anything else that I should be worried about?  ;-)
> >
> >                                                         Thanx, Paul
> >
> 



* Re: Consolidating RCU-bh, RCU-preempt, and RCU-sched
  2018-07-13  3:47 ` Lai Jiangshan
  2018-07-13  3:58   ` Paul E. McKenney
@ 2018-07-23 20:10   ` Steven Rostedt
  2018-07-23 20:25     ` Paul E. McKenney
  1 sibling, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2018-07-23 20:10 UTC
  To: Lai Jiangshan
  Cc: Paul E. McKenney, Josh Triplett, Mathieu Desnoyers, LKML,
	Ingo Molnar, Linus Torvalds, Peter Zijlstra, oleg, Eric Dumazet,
	davem, Thomas Gleixner


Sorry for the late reply, just came back from the Caribbean :-) :-) :-)

On Fri, 13 Jul 2018 11:47:18 +0800
Lai Jiangshan <jiangshanlai@gmail.com> wrote:

> On Fri, Jul 13, 2018 at 8:02 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > Hello!
> >
> > I now have a semi-reasonable prototype of changes consolidating the
> > RCU-bh, RCU-preempt, and RCU-sched update-side APIs in my -rcu tree.
> > There are likely still bugs to be fixed and probably other issues as well,
> > but a prototype does exist.

What's the rationale for all this churn? Linus's complaining that there
are too many RCU variants?


> >
> > Assuming continued good rcutorture results and no objections, I am
> > thinking in terms of this timeline:
> >
> > o       Preparatory work and cleanups are slated for the v4.19 merge window.
> >
> > o       The actual consolidation and post-consolidation cleanup is slated
> >         for the merge window after v4.19 (v5.0?).  These cleanups include
> >         the replacements called out below within the RCU implementation
> >         itself (but excluding kernel/rcu/sync.c, see question below).
> >
> > o       Replacement of now-obsolete update APIs is slated for the second
> >         merge window after v4.19 (v5.1?).  The replacements are currently
> >         expected to be as follows:
> >
> >         synchronize_rcu_bh() -> synchronize_rcu()
> >         synchronize_rcu_bh_expedited() -> synchronize_rcu_expedited()
> >         call_rcu_bh() -> call_rcu()
> >         rcu_barrier_bh() -> rcu_barrier()
> >         synchronize_sched() -> synchronize_rcu()
> >         synchronize_sched_expedited() -> synchronize_rcu_expedited()
> >         call_rcu_sched() -> call_rcu()
> >         rcu_barrier_sched() -> rcu_barrier()
> >         get_state_synchronize_sched() -> get_state_synchronize_rcu()
> >         cond_synchronize_sched() -> cond_synchronize_rcu()
> >         synchronize_rcu_mult() -> synchronize_rcu()
> >
> >         I have done light testing of these replacements with good results.
> >
> > Any objections to this timeline?
> >
> > I also have some questions on the ultimate end point.  I have default
> > choices, which I will likely take if there is no discussion.
> >
> > o
> >         Currently, I am thinking in terms of keeping the per-flavor
> >         read-side functions.  For example, rcu_read_lock_bh() would
> >         continue to disable softirq, and would also continue to tell
> >         lockdep about the RCU-bh read-side critical section.  However,
> >         synchronize_rcu() will wait for all flavors of read-side critical
> >         sections, including those introduced by (say) preempt_disable(),
> >         so there will no longer be any possibility of mismatching (say)
> >         RCU-bh readers with RCU-sched updaters.
> >
> >         I could imagine other ways of handling this, including:
> >
> >         a.      Eliminate rcu_read_lock_bh() in favor of
> >                 local_bh_disable() and so on.  Rely on lockdep
> >                 instrumentation of these other functions to identify RCU
> >                 readers, introducing such instrumentation as needed.  I am
> >                 not a fan of this approach because of the large number of
> >                 places in the Linux kernel where interrupts, preemption,
> >                 and softirqs are enabled or disabled "behind the scenes".
> >
> >         b.      Eliminate rcu_read_lock_bh() in favor of rcu_read_lock(),
> >                 and require callers to also disable softirqs, preemption,
> >                 or whatever as needed.  I am not a fan of this approach
> >                 because it seems a lot less convenient to users of RCU-bh
> >                 and RCU-sched.
> >
> >         At the moment, I therefore favor keeping the RCU-bh and RCU-sched
> >         read-side APIs.  But are there better approaches?  
> 
> Hello, Paul
> 
> Since local_bh_disable() will be guaranteed to be protected by RCU
> and is more general, I'm afraid it will be preferred over
> rcu_read_lock_bh(), which will then gradually be phased out.
> 
> In other words, keeping the RCU-bh read-side APIs will be a slower
> version of option A, and the same goes for RCU-sched.
> But it will still be better than hurrying option A, IMHO.

Now when all this gets done, is synchronize_rcu() going to just wait
for everything to pass? (scheduling, RCU readers, softirqs, etc) Is
there any worry about lengthening the time of synchronize_rcu?

-- Steve


> >
> > o       How should kernel/rcu/sync.c be handled?  Here are some
> >         possibilities:
> >
> >         a.      Leave the full gp_ops[] array and simply translate
> >                 the obsolete update-side functions to their RCU
> >                 equivalents.
> >
> >         b.      Leave the current gp_ops[] array, but only have
> >                 the RCU_SYNC entry.  The __INIT_HELD field would
> >                 be set to a function that was OK with being in an
> >                 RCU read-side critical section, an interrupt-disabled
> >                 section, etc.
> >
> >                 This allows for possible addition of SRCU functionality.
> >                 It is also a trivial change.  Note that the sole user
> >                 of sync.c uses RCU_SCHED_SYNC, and this would need to
> >                 be changed to RCU_SYNC.
> >
> >                 But is it likely that we will ever add SRCU?
> >
> >         c.      Eliminate that gp_ops[] array, hard-coding the function
> >                 pointers into their call sites.
> >
> >         I don't really have a preference.  Left to myself, I will be lazy
> >         and take option #a.  Are there better approaches?
> >
> > o       Currently, if a lock related to the scheduler's rq or pi locks is
> >         held across rcu_read_unlock(), that lock must be held across the
> >         entire read-side critical section in order to avoid deadlock.
> >         Now that the end of the RCU read-side critical section is
> >         deferred until sometime after interrupts are re-enabled, this
> >         requirement could be lifted.  However, because the end of the RCU
> >         read-side critical section is detected sometime after interrupts
> >         are re-enabled, this means that a low-priority RCU reader might
> >         remain priority-boosted longer than need be, which could be a
> >         problem when running real-time workloads.
> >
> >         My current thought is therefore to leave this constraint in
> >         place.  Thoughts?
> >
> > Anything else that I should be worried about?  ;-)
> >
> >                                                         Thanx, Paul
> >  



* Re: Consolidating RCU-bh, RCU-preempt, and RCU-sched
  2018-07-23 20:10   ` Steven Rostedt
@ 2018-07-23 20:25     ` Paul E. McKenney
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2018-07-23 20:25 UTC
  To: Steven Rostedt
  Cc: Lai Jiangshan, Josh Triplett, Mathieu Desnoyers, LKML,
	Ingo Molnar, Linus Torvalds, Peter Zijlstra, oleg, Eric Dumazet,
	davem, Thomas Gleixner

On Mon, Jul 23, 2018 at 04:10:41PM -0400, Steven Rostedt wrote:
> 
> Sorry for the late reply, just came back from the Caribbean :-) :-) :-)

Welcome back, and I hope that the Caribbean trip was a good one!

> On Fri, 13 Jul 2018 11:47:18 +0800
> Lai Jiangshan <jiangshanlai@gmail.com> wrote:
> 
> > On Fri, Jul 13, 2018 at 8:02 AM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > > Hello!
> > >
> > > I now have a semi-reasonable prototype of changes consolidating the
> > > RCU-bh, RCU-preempt, and RCU-sched update-side APIs in my -rcu tree.
> > > There are likely still bugs to be fixed and probably other issues as well,
> > > but a prototype does exist.
> 
> What's the rationale for all this churn? Linus's complaining that there
> are too many RCU variants?

A CVE stemming from someone getting confused between the different flavors
of RCU.  The churn is large, as you say, but it does have the benefit of
making RCU a bit smaller.

Not necessarily simpler, but smaller.

> > > Assuming continued good rcutorture results and no objections, I am
> > > thinking in terms of this timeline:
> > >
> > > o       Preparatory work and cleanups are slated for the v4.19 merge window.
> > >
> > > o       The actual consolidation and post-consolidation cleanup is slated
> > >         for the merge window after v4.19 (v5.0?).  These cleanups include
> > >         the replacements called out below within the RCU implementation
> > >         itself (but excluding kernel/rcu/sync.c, see question below).
> > >
> > > o       Replacement of now-obsolete update APIs is slated for the second
> > >         merge window after v4.19 (v5.1?).  The replacements are currently
> > >         expected to be as follows:
> > >
> > >         synchronize_rcu_bh() -> synchronize_rcu()
> > >         synchronize_rcu_bh_expedited() -> synchronize_rcu_expedited()
> > >         call_rcu_bh() -> call_rcu()
> > >         rcu_barrier_bh() -> rcu_barrier()
> > >         synchronize_sched() -> synchronize_rcu()
> > >         synchronize_sched_expedited() -> synchronize_rcu_expedited()
> > >         call_rcu_sched() -> call_rcu()
> > >         rcu_barrier_sched() -> rcu_barrier()
> > >         get_state_synchronize_sched() -> get_state_synchronize_rcu()
> > >         cond_synchronize_sched() -> cond_synchronize_rcu()
> > >         synchronize_rcu_mult() -> synchronize_rcu()
> > >
> > >         I have done light testing of these replacements with good results.
> > >
> > > Any objections to this timeline?
> > >
> > > I also have some questions on the ultimate end point.  I have default
> > > choices, which I will likely take if there is no discussion.
> > >
> > > o
> > >         Currently, I am thinking in terms of keeping the per-flavor
> > >         read-side functions.  For example, rcu_read_lock_bh() would
> > >         continue to disable softirq, and would also continue to tell
> > >         lockdep about the RCU-bh read-side critical section.  However,
> > >         synchronize_rcu() will wait for all flavors of read-side critical
> > >         sections, including those introduced by (say) preempt_disable(),
> > >         so there will no longer be any possibility of mismatching (say)
> > >         RCU-bh readers with RCU-sched updaters.
> > >
> > >         I could imagine other ways of handling this, including:
> > >
> > >         a.      Eliminate rcu_read_lock_bh() in favor of
> > >                 local_bh_disable() and so on.  Rely on lockdep
> > >                 instrumentation of these other functions to identify RCU
> > >                 readers, introducing such instrumentation as needed.  I am
> > >                 not a fan of this approach because of the large number of
> > >                 places in the Linux kernel where interrupts, preemption,
> > >                 and softirqs are enabled or disabled "behind the scenes".
> > >
> > >         b.      Eliminate rcu_read_lock_bh() in favor of rcu_read_lock(),
> > >                 and require callers to also disable softirqs, preemption,
> > >                 or whatever as needed.  I am not a fan of this approach
> > >                 because it seems a lot less convenient to users of RCU-bh
> > >                 and RCU-sched.
> > >
> > >         At the moment, I therefore favor keeping the RCU-bh and RCU-sched
> > >         read-side APIs.  But are there better approaches?  
> > 
> > Hello, Paul
> > 
> > Since local_bh_disable() will be guaranteed to be protected by RCU
> > and is more general, I'm afraid it will be preferred over
> > rcu_read_lock_bh(), which will then gradually be phased out.
> > 
> > In other words, keeping the RCU-bh read-side APIs will be a slower
> > version of option A, and the same goes for RCU-sched.
> > But it will still be better than hurrying option A, IMHO.
> 
> Now when all this gets done, is synchronize_rcu() going to just wait
> for everything to pass? (scheduling, RCU readers, softirqs, etc) Is
> there any worry about lengthening the time of synchronize_rcu?

Yes, when all is said and done, synchronize_rcu() will wait for everything
to get done.  I am not too worried about PREEMPT=y synchronize_rcu()'s
latency because the kernel usually doesn't spend that large a fraction
of its time with interrupts, preemption, or softirqs disabled.  I am not
worried at all about PREEMPT=n synchronize_rcu()'s latency because it
will if anything be slightly faster due to being able to take advantage
of some softirq transitions.

But one reason for feeding this in over three successive merge windows
is to get more time on it before it all goes in.
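
To make "wait for everything" concrete, here is a toy user-space model
(not kernel code, and not taken from the -rcu tree) of when a single
CPU could be treated as quiescent once the flavors are consolidated.
The struct, its fields, and toy_cpu_is_quiescent() are all hypothetical
names.  The model is deliberately simplified: real RCU tracks
preemptible readers per task and only needs each CPU to pass through a
quiescent state at some point after the grace period starts.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the reader-like state on one CPU. */
struct toy_cpu_state {
	int rcu_read_lock_nesting;   /* depth of rcu_read_lock() */
	int softirq_disable_depth;   /* depth of local_bh_disable() */
	int preempt_disable_depth;   /* depth of preempt_disable() */
	bool irqs_disabled;          /* interrupts currently disabled */
};

/*
 * With consolidated flavors, every one of these regions counts as a
 * read-side critical section, so the CPU is quiescent only when it is
 * in none of them.
 */
bool toy_cpu_is_quiescent(const struct toy_cpu_state *s)
{
	return s->rcu_read_lock_nesting == 0 &&
	       s->softirq_disable_depth == 0 &&
	       s->preempt_disable_depth == 0 &&
	       !s->irqs_disabled;
}
```

In this model a CPU that has only called local_bh_disable() is not
quiescent, which is the sense in which a consolidated synchronize_rcu()
can no longer be mismatched with RCU-bh or RCU-sched readers.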

							Thanx, Paul

> -- Steve
> 
> 
> > >
> > > o       How should kernel/rcu/sync.c be handled?  Here are some
> > >         possibilities:
> > >
> > >         a.      Leave the full gp_ops[] array and simply translate
> > >                 the obsolete update-side functions to their RCU
> > >                 equivalents.
> > >
> > >         b.      Leave the current gp_ops[] array, but only have
> > >                 the RCU_SYNC entry.  The __INIT_HELD field would
> > >                 be set to a function that was OK with being in an
> > >                 RCU read-side critical section, an interrupt-disabled
> > >                 section, etc.
> > >
> > >                 This allows for possible addition of SRCU functionality.
> > >                 It is also a trivial change.  Note that the sole user
> > >                 of sync.c uses RCU_SCHED_SYNC, and this would need to
> > >                 be changed to RCU_SYNC.
> > >
> > >                 But is it likely that we will ever add SRCU?
> > >
> > >         c.      Eliminate that gp_ops[] array, hard-coding the function
> > >                 pointers into their call sites.
> > >
> > >         I don't really have a preference.  Left to myself, I will be lazy
> > >         and take option #a.  Are there better approaches?
> > >
> > > o       Currently, if a lock related to the scheduler's rq or pi locks is
> > >         held across rcu_read_unlock(), that lock must be held across the
> > >         entire read-side critical section in order to avoid deadlock.
> > >         Now that the end of the RCU read-side critical section is
> > >         deferred until sometime after interrupts are re-enabled, this
> > >         requirement could be lifted.  However, because the end of the RCU
> > >         read-side critical section is detected sometime after interrupts
> > >         are re-enabled, this means that a low-priority RCU reader might
> > >         remain priority-boosted longer than need be, which could be a
> > >         problem when running real-time workloads.
> > >
> > >         My current thought is therefore to leave this constraint in
> > >         place.  Thoughts?
> > >
> > > Anything else that I should be worried about?  ;-)
> > >
> > >                                                         Thanx, Paul
> > >  
> 

