public inbox for linux-kernel@vger.kernel.org
* [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU
@ 2018-07-27 15:49 Paul E. McKenney
  2018-07-30  9:25 ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2018-07-27 15:49 UTC (permalink / raw)
  To: peterz; +Cc: linux-kernel

Hello, Peter,

It occurred to me that it is wasteful to let resched_cpu() acquire
->pi_lock when doing something like resched_cpu(smp_processor_id()),
and that it would be better to instead use set_tsk_need_resched(current)
and set_preempt_need_resched().

But is doing so really worthwhile?  For that matter, are there some
constraints on the use of those two functions that I am failing to
allow for in the patch below?

							Thanx, Paul

------------------------------------------------------------------------

commit e95e2d26fff60af9bb4111a9c17461ecd5e17a7d
Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Date:   Thu Jul 26 13:44:00 2018 -0700

    rcu: Avoid resched_cpu() when rescheduling the current CPU
    
    The resched_cpu() interface is quite handy, but it does acquire the
    specified CPU's runqueue lock, which does not come for free.  This
    commit therefore substitutes the following when directing resched_cpu()
    at the current CPU:
    
            set_tsk_need_resched(current);
            set_preempt_need_resched();
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 595059141c40..061ceb171d8e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1353,7 +1353,8 @@ static void print_cpu_stall(void)
 	 * progress and it could be we're stuck in kernel space without context
 	 * switches for an entirely unreasonable amount of time.
 	 */
-	resched_cpu(smp_processor_id());
+	set_tsk_need_resched(current);
+	set_preempt_need_resched();
 }
 
 static void check_cpu_stall(struct rcu_data *rdp)
@@ -2674,10 +2675,12 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused
 	WARN_ON_ONCE(!rdp->beenonline);
 
 	/* Report any deferred quiescent states if preemption enabled. */
-	if (!(preempt_count() & PREEMPT_MASK))
+	if (!(preempt_count() & PREEMPT_MASK)) {
 		rcu_preempt_deferred_qs(current);
-	else if (rcu_preempt_need_deferred_qs(current))
-		resched_cpu(rdp->cpu); /* Provoke future context switch. */
+	} else if (rcu_preempt_need_deferred_qs(current)) {
+		set_tsk_need_resched(current);
+		set_preempt_need_resched();
+	}
 
 	/* Update RCU state based on any recent quiescent states. */
 	rcu_check_quiescent_state(rdp);
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index b3e2c873b8e4..62d363d7fab2 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -672,7 +672,8 @@ static void sync_rcu_exp_handler(void *unused)
 			rcu_report_exp_rdp(rdp);
 		} else {
 			rdp->deferred_qs = true;
-			resched_cpu(rdp->cpu);
+			set_tsk_need_resched(t);
+			set_preempt_need_resched();
 		}
 		return;
 	}
@@ -710,15 +711,16 @@ static void sync_rcu_exp_handler(void *unused)
 	 * because we are in an interrupt handler, which will cause that
 	 * function to take an early exit without doing anything.
 	 *
-	 * Otherwise, use resched_cpu() to force a context switch after
-	 * the CPU enables everything.
+	 * Otherwise, force a context switch after the CPU enables everything.
 	 */
 	rdp->deferred_qs = true;
 	if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) ||
-	    WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs()))
+	    WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs())) {
 		rcu_preempt_deferred_qs(t);
-	else
-		resched_cpu(rdp->cpu);
+	} else {
+		set_tsk_need_resched(t);
+		set_preempt_need_resched();
+	}
 }
 
 /* PREEMPT=y, so no PREEMPT=n expedited grace period to clean up after. */
@@ -779,7 +781,8 @@ static void sync_sched_exp_handler(void *unused)
 	__this_cpu_write(rcu_data.cpu_no_qs.b.exp, true);
 	/* Store .exp before .rcu_urgent_qs. */
 	smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true);
-	resched_cpu(smp_processor_id());
+	set_tsk_need_resched(current);
+	set_preempt_need_resched();
 }
 
 /* Send IPI for expedited cleanup if needed at end of CPU-hotplug operation. */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 5f4c8bab7c72..d3ccf4389a67 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -791,8 +791,10 @@ static void rcu_flavor_check_callbacks(int user)
 	if (t->rcu_read_lock_nesting > 0 ||
 	    (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) {
 		/* No QS, force context switch if deferred. */
-		if (rcu_preempt_need_deferred_qs(t))
-			resched_cpu(smp_processor_id());
+		if (rcu_preempt_need_deferred_qs(t)) {
+			set_tsk_need_resched(t);
+			set_preempt_need_resched();
+		}
 	} else if (rcu_preempt_need_deferred_qs(t)) {
 		rcu_preempt_deferred_qs(t); /* Report deferred QS. */
 		return;


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU
  2018-07-27 15:49 [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU Paul E. McKenney
@ 2018-07-30  9:25 ` Peter Zijlstra
  2018-07-30 14:59   ` Paul E. McKenney
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2018-07-30  9:25 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Fri, Jul 27, 2018 at 08:49:31AM -0700, Paul E. McKenney wrote:
> Hello, Peter,
> 
> It occurred to me that it is wasteful to let resched_cpu() acquire
> ->pi_lock when doing something like resched_cpu(smp_processor_id()),

rq->lock

> and that it would be better to instead use set_tsk_need_resched(current)
> and set_preempt_need_resched().
> 
> But is doing so really worthwhile?  For that matter, are there some
> constraints on the use of those two functions that I am failing to
> allow for in the patch below?


>     The resched_cpu() interface is quite handy, but it does acquire the
>     specified CPU's runqueue lock, which does not come for free.  This
>     commit therefore substitutes the following when directing resched_cpu()
>     at the current CPU:
>     
>             set_tsk_need_resched(current);
>             set_preempt_need_resched();

That is only a valid substitute for resched_cpu(smp_processor_id()).

But also note that this can cause more context switches than
resched_curr(), because it does not check whether TIF_NEED_RESCHED was
already set.

Something that might be more in line with resched_curr() on the
current CPU would be:

	preempt_disable();
	if (!test_tsk_need_resched(current)) {
		set_tsk_need_resched(current);
		set_preempt_need_resched();
	}
	preempt_enable();

Where the preempt_enable() could of course instantly trigger the
reschedule if it was the outermost one.

> @@ -2674,10 +2675,12 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused

> -		resched_cpu(rdp->cpu); /* Provoke future context switch. */

> +		set_tsk_need_resched(current);
> +		set_preempt_need_resched();

That's not obviously correct. rdp->cpu had better be smp_processor_id().

> @@ -672,7 +672,8 @@ static void sync_rcu_exp_handler(void *unused)
>  			rcu_report_exp_rdp(rdp);
>  		} else {
>  			rdp->deferred_qs = true;
> -			resched_cpu(rdp->cpu);
> +			set_tsk_need_resched(t);
> +			set_preempt_need_resched();

That only works if @t == current.

>  		}
>  		return;
>  	}

> -	else
> -		resched_cpu(rdp->cpu);
> +	} else {
> +		set_tsk_need_resched(t);
> +		set_preempt_need_resched();

Similar...

>  }

> @@ -791,8 +791,10 @@ static void rcu_flavor_check_callbacks(int user)
>  	if (t->rcu_read_lock_nesting > 0 ||
>  	    (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) {
>  		/* No QS, force context switch if deferred. */
> -		if (rcu_preempt_need_deferred_qs(t))
> -			resched_cpu(smp_processor_id());
> +		if (rcu_preempt_need_deferred_qs(t)) {
> +			set_tsk_need_resched(t);
> +			set_preempt_need_resched();
> +		}

And another dodgy one..


* Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU
  2018-07-30  9:25 ` Peter Zijlstra
@ 2018-07-30 14:59   ` Paul E. McKenney
  2018-07-30 16:42     ` Peter Zijlstra
  0 siblings, 1 reply; 5+ messages in thread
From: Paul E. McKenney @ 2018-07-30 14:59 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel

On Mon, Jul 30, 2018 at 11:25:13AM +0200, Peter Zijlstra wrote:
> On Fri, Jul 27, 2018 at 08:49:31AM -0700, Paul E. McKenney wrote:
> > Hello, Peter,
> > 
> > It occurred to me that it is wasteful to let resched_cpu() acquire
> > ->pi_lock when doing something like resched_cpu(smp_processor_id()),
> 
> rq->lock

Good catch, will fix.  And thank you for looking this over!

> > and that it would be better to instead use set_tsk_need_resched(current)
> > and set_preempt_need_resched().
> > 
> > But is doing so really worthwhile?  For that matter, are there some
> > constraints on the use of those two functions that I am failing to
> > allow for in the patch below?
> 
> 
> >     The resched_cpu() interface is quite handy, but it does acquire the
> >     specified CPU's runqueue lock, which does not come for free.  This
> >     commit therefore substitutes the following when directing resched_cpu()
> >     at the current CPU:
> >     
> >             set_tsk_need_resched(current);
> >             set_preempt_need_resched();
> 
> That is only a valid substitute for resched_cpu(smp_processor_id()).

Understood.

> But also note that this can cause more context switches than
> resched_curr(), because it does not check whether TIF_NEED_RESCHED was
> already set.
> 
> Something that might be more in line with resched_curr() on the
> current CPU would be:
> 
> 	preempt_disable();
> 	if (!test_tsk_need_resched(current)) {
> 		set_tsk_need_resched(current);
> 		set_preempt_need_resched();
> 	}
> 	preempt_enable();
> 
> Where the preempt_enable() could of course instantly trigger the
> reschedule if it was the outermost one.

Ah.  So should I use resched_curr() from rcu_check_callbacks(), which
is invoked from the scheduling-clock interrupt?  Right now I have calls
to set_tsk_need_resched() and set_preempt_need_resched().

> > @@ -2674,10 +2675,12 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused
> 
> > -		resched_cpu(rdp->cpu); /* Provoke future context switch. */
> 
> > +		set_tsk_need_resched(current);
> > +		set_preempt_need_resched();
> 
> That's not obviously correct. rdp->cpu had better be smp_processor_id().

At the beginning of the function, we have:

	struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);

And this is in a softirq handler, so we are OK.

> > @@ -672,7 +672,8 @@ static void sync_rcu_exp_handler(void *unused)
> >  			rcu_report_exp_rdp(rdp);
> >  		} else {
> >  			rdp->deferred_qs = true;
> > -			resched_cpu(rdp->cpu);
> > +			set_tsk_need_resched(t);
> > +			set_preempt_need_resched();
> 
> That only works if @t == current.

At the beginning of the function, we have:

	struct task_struct *t = current;

So we should be OK.

> >  		}
> >  		return;
> >  	}
> 
> > -	else
> > -		resched_cpu(rdp->cpu);
> > +	} else {
> > +		set_tsk_need_resched(t);
> > +		set_preempt_need_resched();
> 
> Similar...

Same function, so we should be good here as well.

> >  }
> 
> > @@ -791,8 +791,10 @@ static void rcu_flavor_check_callbacks(int user)
> >  	if (t->rcu_read_lock_nesting > 0 ||
> >  	    (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) {
> >  		/* No QS, force context switch if deferred. */
> > -		if (rcu_preempt_need_deferred_qs(t))
> > -			resched_cpu(smp_processor_id());
> > +		if (rcu_preempt_need_deferred_qs(t)) {
> > +			set_tsk_need_resched(t);
> > +			set_preempt_need_resched();
> > +		}
> 
> And another dodgy one..

And the beginning of this function also has:

	struct task_struct *t = current;

So good there as well.

Should I be instead using resched_curr() on some or all of these?

kernel/rcu/tiny.c rcu_check_callbacks():

	Interrupts disabled (scheduling clock interrupt), so no
	point in preempt_disable().  It would make sense to check
	test_tsk_need_resched().  This is handling the case where someone
	disabled something over rcu_read_unlock(), but got preempted
	within (or had an overly long) RCU read-side critical section.
	This used to result in deadlock, but now just messes up real-time
	response.

kernel/rcu/tree.c print_cpu_stall():

	Interrupts disabled, so no point in preempt_disable().
	It might make sense to check test_tsk_need_resched(), but
	on the other hand at this point this CPU has gone for
	tens of seconds without a quiescent state.  Wouldn't hurt
	to check, though.

kernel/rcu/tree.c rcu_check_callbacks():

	Interrupts disabled (scheduling clock interrupt), so no
	point in preempt_disable().  It would make sense to check
	test_tsk_need_resched().  This is handling the case where someone
	disabled something over rcu_read_unlock(), but got preempted
	within (or had an overly long) RCU read-side critical section.
	This used to result in deadlock, but now just messes up real-time
	response.

kernel/rcu/tree.c rcu_process_callbacks():

	Softirqs disabled (softirq handler), so no point
	in preempt_disable().  It might make sense to check
	test_tsk_need_resched().  This is handling the case where someone
	disabled something over rcu_read_unlock(), but got preempted
	within (or had an overly long) RCU read-side critical section.
	This used to result in deadlock, but now just messes up real-time
	response.

kernel/rcu/tree_exp.h sync_rcu_exp_handler():
kernel/rcu/tree_exp.h sync_sched_exp_handler():

	Interrupts disabled (IPI handler), so no point in
	preempt_disable().  It might make sense to check
	test_tsk_need_resched().  This is the expedited
	grace-period case.  (The first is PREEMPT, the second
	!PREEMPT.)

kernel/rcu/tree_plugin.h rcu_flavor_check_callbacks():

	Interrupts disabled (scheduling clock interrupt), so no
	point in preempt_disable().  It would make sense to check
	test_tsk_need_resched().  This is handling the case where someone
	disabled something over rcu_read_unlock(), but got preempted
	within (or had an overly long) RCU read-side critical section.
	This used to result in deadlock, but now just messes up real-time
	response.

So it looks safe for me to invoke resched_curr() in all cases.  I don't
believe that the extra nested preempt_disable() will be a performance
problem.  Anything that I am missing here?

							Thanx, Paul



* Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU
  2018-07-30 14:59   ` Paul E. McKenney
@ 2018-07-30 16:42     ` Peter Zijlstra
  2018-07-30 17:14       ` Paul E. McKenney
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2018-07-30 16:42 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel

On Mon, Jul 30, 2018 at 07:59:33AM -0700, Paul E. McKenney wrote:

> > Something that might be more in line with resched_curr() on the
> > current CPU would be:
> > 
> > 	preempt_disable();
> > 	if (!test_tsk_need_resched(current)) {
> > 		set_tsk_need_resched(current);
> > 		set_preempt_need_resched();
> > 	}
> > 	preempt_enable();
> > 
> > Where the preempt_enable() could of course instantly trigger the
> > reschedule if it was the outermost one.
> 
> Ah.  So should I use resched_curr() from rcu_check_callbacks(), which
> is invoked from the scheduling-clock interrupt?  Right now I have calls
> to set_tsk_need_resched() and set_preempt_need_resched().
> 
> > > @@ -2674,10 +2675,12 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused
> > 
> > > -		resched_cpu(rdp->cpu); /* Provoke future context switch. */
> > 
> > > +		set_tsk_need_resched(current);
> > > +		set_preempt_need_resched();
> > 
> > That's not obviously correct. rdp->cpu had better be smp_processor_id().
> 
> At the beginning of the function, we have:
> 
> 	struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
> 
> And this is in a softirq handler, so we are OK.

Agreed.

> > > @@ -672,7 +672,8 @@ static void sync_rcu_exp_handler(void *unused)
> > >  			rcu_report_exp_rdp(rdp);
> > >  		} else {
> > >  			rdp->deferred_qs = true;
> > > -			resched_cpu(rdp->cpu);
> > > +			set_tsk_need_resched(t);
> > > +			set_preempt_need_resched();
> > 
> > That only works if @t == current.
> 
> At the beginning of the function, we have:
> 
> 	struct task_struct *t = current;
> 
> So we should be OK.

Ah, the scheduler and locking code typically call that curr, to be
more explicit that it is the current task.


> Should I be instead using resched_curr() on some or all of these?

If, as it seems is the case, they are all targeting the current cpu and
have (soft) interrupts disabled, then what you propose is indeed fine.


* Re: [PATCH RFC tip/core/rcu] Avoid resched_cpu() when rescheduling the current CPU
  2018-07-30 16:42     ` Peter Zijlstra
@ 2018-07-30 17:14       ` Paul E. McKenney
  0 siblings, 0 replies; 5+ messages in thread
From: Paul E. McKenney @ 2018-07-30 17:14 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel

On Mon, Jul 30, 2018 at 06:42:47PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 30, 2018 at 07:59:33AM -0700, Paul E. McKenney wrote:
> 
> > > Something that might be more in line with resched_curr() on the
> > > current CPU would be:
> > > 
> > > 	preempt_disable();
> > > 	if (!test_tsk_need_resched(current)) {
> > > 		set_tsk_need_resched(current);
> > > 		set_preempt_need_resched();
> > > 	}
> > > 	preempt_enable();
> > > 
> > > Where the preempt_enable() could of course instantly trigger the
> > > reschedule if it was the outermost one.
> > 
> > Ah.  So should I use resched_curr() from rcu_check_callbacks(), which
> > is invoked from the scheduling-clock interrupt?  Right now I have calls
> > to set_tsk_need_resched() and set_preempt_need_resched().
> > 
> > > > @@ -2674,10 +2675,12 @@ static __latent_entropy void rcu_process_callbacks(struct softirq_action *unused
> > > 
> > > > -		resched_cpu(rdp->cpu); /* Provoke future context switch. */
> > > 
> > > > +		set_tsk_need_resched(current);
> > > > +		set_preempt_need_resched();
> > > 
> > > That's not obviously correct. rdp->cpu had better be smp_processor_id().
> > 
> > At the beginning of the function, we have:
> > 
> > 	struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
> > 
> > And this is in a softirq handler, so we are OK.
> 
> Agreed.
> 
> > > > @@ -672,7 +672,8 @@ static void sync_rcu_exp_handler(void *unused)
> > > >  			rcu_report_exp_rdp(rdp);
> > > >  		} else {
> > > >  			rdp->deferred_qs = true;
> > > > -			resched_cpu(rdp->cpu);
> > > > +			set_tsk_need_resched(t);
> > > > +			set_preempt_need_resched();
> > > 
> > > That only works if @t == current.
> > 
> > At the beginning of the function, we have:
> > 
> > 	struct task_struct *t = current;
> > 
> > So we should be OK.
> 
> Ah, the scheduler and locking code typically call that curr, to be
> more explicit that it is the current task.

I cargo-culted the "t" from somewhere a very long time ago, and of course
I have no idea from where.  Now I have hundreds of them in RCU.  :-/

Then again, if I am to change, doing it now when I have other full-source
changes makes sense...

> > Should I be instead using resched_curr() on some or all of these?
> 
> If, as it seems is the case, they are all targeting the current cpu and
> have (soft) interrupts disabled, then what you propose is indeed fine.

Very good, I will leave them as is, then.  Thank you for the review!
May I add your Reviewed-by, Acked-by, or some such?

							Thanx, Paul


