All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: linux-kernel@vger.kernel.org,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Josh Triplett <josh@joshtriplett.org>,
	kernel-team@android.com, Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-doc@vger.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	rcu@vger.kernel.org, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [RFC v1 1/2] rcu/tree: Clean up dynticks counter usage
Date: Wed, 28 Aug 2019 13:13:44 -0700	[thread overview]
Message-ID: <20190828201344.GR26530@linux.ibm.com> (raw)
In-Reply-To: <5d648895.1c69fb81.5e60a.fc6f@mx.google.com>

On Mon, Aug 26, 2019 at 09:33:53PM -0400, Joel Fernandes (Google) wrote:
> The dynticks counter are confusing due to crowbar writes of
> DYNTICK_IRQ_NONIDLE whose purpose is to detect half-interrupts (i.e. we
> see rcu_irq_enter() but not rcu_irq_exit() due to a usermode upcall) and
> if so then do a reset of the dyntick_nmi_nesting counters. This patch
> tries to get rid of DYNTICK_IRQ_NONIDLE while still keeping the code
> working, fully functional, and less confusing. The confusion recently
> has even led to patches forgetting that DYNTICK_IRQ_NONIDLE was written
> to which wasted lots of time.
> 
> The patch has the following changes:
> 
> (1) Use dynticks_nesting instead of dynticks_nmi_nesting for determining
> outer most "EQS exit". This is needed to detect in
> rcu_nmi_enter_common() if we have already EQS-exited, such as because of
> a syscall. Currently we rely on a forced write of DYNTICK_IRQ_NONIDLE
> from rcu_eqs_exit() for this purpose. This is one purpose of the
> DYNTICK_IRQ_NONIDLE write (other than detecting half-interrupts).
> However, we do not need to do that. dyntick_nesting already tells us that
> we have EQS-exited so just use that thus removing the dependence of
> dynticks_nmi_nesting for this purpose.
> 
> (2) Keep dynticks_nmi_nesting around because:
> 
>   (a) rcu_is_cpu_rrupt_from_idle() needs to be able to detect first
>       interrupt nesting level.
> 
>   (b) We need to detect half-interrupts till we are sure they're not an
>       issue. However, change the comparison to DYNTICK_IRQ_NONIDLE with 0.
> 
> (3) Since we got rid of DYNTICK_IRQ_NONIDLE, we also do cheaper
> comparisons with zero instead for the code that keeps the tick on in
> rcu_nmi_enter_common().
> 
> In the next patch, both of the concerns of (2) will be addressed and
> then we can get rid of dynticks_nmi_nesting, however one step at a time.

Postponing discussion of the commit log for the moment.

> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  kernel/rcu/rcu.h  |  4 ----
>  kernel/rcu/tree.c | 60 ++++++++++++++++++++++++++++-------------------
>  2 files changed, 36 insertions(+), 28 deletions(-)
> 
> diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> index aeec70fda82c..046833f3784b 100644
> --- a/kernel/rcu/rcu.h
> +++ b/kernel/rcu/rcu.h
> @@ -12,10 +12,6 @@
>  
>  #include <trace/events/rcu.h>
>  
> -/* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
> -#define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
> -
> -

OK.

>  /*
>   * Grace-period counter management.
>   */
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 68ebf0eb64c8..255cd6835526 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -81,7 +81,7 @@
>  
>  static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
>  	.dynticks_nesting = 1,
> -	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
> +	.dynticks_nmi_nesting = 0,

C initializes to zero by default, so this can simply be deleted.

>  	.dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR),
>  };
>  struct rcu_state rcu_state = {
> @@ -558,17 +558,18 @@ EXPORT_SYMBOL_GPL(rcutorture_get_gp_data);
>  /*
>   * Enter an RCU extended quiescent state, which can be either the
>   * idle loop or adaptive-tickless usermode execution.
> - *
> - * We crowbar the ->dynticks_nmi_nesting field to zero to allow for
> - * the possibility of usermode upcalls having messed up our count
> - * of interrupt nesting level during the prior busy period.
>   */
>  static void rcu_eqs_enter(bool user)
>  {
>  	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
>  
> -	WARN_ON_ONCE(rdp->dynticks_nmi_nesting != DYNTICK_IRQ_NONIDLE);
> -	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
> +	/* Entering usermode/idle from interrupt is not handled. These would
> +	 * mean usermode upcalls or idle entry happened from interrupts. But,
> +	 * reset the counter if we warn.
> +	 */

Please either put the "/*" on its own line or use "//"-style comments.

> +	if (WARN_ON_ONCE(rdp->dynticks_nmi_nesting != 0))
> +		WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);
> +

@@@

>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
>  		     rdp->dynticks_nesting == 0);
>  	if (rdp->dynticks_nesting != 1) {
> @@ -642,23 +643,27 @@ static __always_inline void rcu_nmi_exit_common(bool irq)
>  	 * (We are exiting an NMI handler, so RCU better be paying attention
>  	 * to us!)
>  	 */
> +	WARN_ON_ONCE(rdp->dynticks_nesting <= 0);

This is fine.

>  	WARN_ON_ONCE(rdp->dynticks_nmi_nesting <= 0);
>  	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
>  
> +	WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
> +		   rdp->dynticks_nmi_nesting - 1);

This is problematic.  The +/-1 and +/-2 dance is specifically for NMIs, so...

>  	/*
>  	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
>  	 * leave it in non-RCU-idle state.
>  	 */
> -	if (rdp->dynticks_nmi_nesting != 1) {
> -		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2, rdp->dynticks);
> -		WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
> -			   rdp->dynticks_nmi_nesting - 2);
> +	if (rdp->dynticks_nesting != 1) {
> +		trace_rcu_dyntick(TPS("--="), rdp->dynticks_nesting,
> +				  rdp->dynticks_nesting - 2, rdp->dynticks);
> +		WRITE_ONCE(rdp->dynticks_nesting, /* No store tearing. */
> +			   rdp->dynticks_nesting - 2);

Making the dancer's name be ->dynticks_nesting instead of
->dynticks_nmi_nesting is going to be trouble.  (Yes, I did
take a quick look at the next patch, more on that when I get
there.)

>  		return;
>  	}
>  
>  	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
> -	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, rdp->dynticks);
> -	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
> +	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nesting, 0, rdp->dynticks);
> +	WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid store tearing. */

Same here...

>  	if (irq)
>  		rcu_prepare_for_idle();
> @@ -723,10 +728,6 @@ void rcu_irq_exit_irqson(void)
>  /*
>   * Exit an RCU extended quiescent state, which can be either the
>   * idle loop or adaptive-tickless usermode execution.
> - *
> - * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
> - * allow for the possibility of usermode upcalls messing up our count of
> - * interrupt nesting level during the busy period that is just now starting.
>   */
>  static void rcu_eqs_exit(bool user)
>  {
> @@ -747,8 +748,13 @@ static void rcu_eqs_exit(bool user)
>  	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, rdp->dynticks);
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
>  	WRITE_ONCE(rdp->dynticks_nesting, 1);
> -	WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
> -	WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
> +
> +	/* Exiting usermode/idle from interrupt is not handled. These would
> +	 * mean usermode upcalls or idle exit happened from interrupts. But,
> +	 * reset the counter if we warn.
> +	 */
> +	if (WARN_ON_ONCE(rdp->dynticks_nmi_nesting != 0))
> +		WRITE_ONCE(rdp->dynticks_nmi_nesting, 0);

And here.  Plus this is adding a test and branch in the common case.
Given that the location being written to should be hot in the cache,
it is not clear that this is a win.

>  }
>  
>  /**
> @@ -804,6 +810,7 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
>  	long incby = 2;
>  
>  	/* Complain about underflow. */
> +	WARN_ON_ONCE(rdp->dynticks_nesting < 0);

OK.

>  	WARN_ON_ONCE(rdp->dynticks_nmi_nesting < 0);
>  
>  	/*
> @@ -826,16 +833,21 @@ static __always_inline void rcu_nmi_enter_common(bool irq)
>  
>  		incby = 1;
>  	} else if (tick_nohz_full_cpu(rdp->cpu) &&
> -		   rdp->dynticks_nmi_nesting == DYNTICK_IRQ_NONIDLE &&
> -		   rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
> +		   !rdp->dynticks_nmi_nesting && rdp->rcu_urgent_qs &&
> +		   !rdp->rcu_forced_tick) {

OK.  Though you should be able to save a line by pulling the
"rdp->rcu_urgent_qs &&" onto the first line.

>  		rdp->rcu_forced_tick = true;
>  		tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
>  	}
> +

Not clear that the added blank line is a win, here or below.

>  	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
> -			  rdp->dynticks_nmi_nesting,
> -			  rdp->dynticks_nmi_nesting + incby, rdp->dynticks);
> +			  rdp->dynticks_nesting,
> +			  rdp->dynticks_nesting + incby, rdp->dynticks);
> +
> +	WRITE_ONCE(rdp->dynticks_nesting, /* Prevent store tearing. */
> +		   rdp->dynticks_nesting + incby);
> +
>  	WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */
> -		   rdp->dynticks_nmi_nesting + incby);
> +		   rdp->dynticks_nmi_nesting + 1);

And same naming issue here.

							Thanx, Paul

>  	barrier();
>  }
>  
> -- 
> 2.23.0.187.g17f5b7556c-goog
> 

  reply	other threads:[~2019-08-28 20:15 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-27  1:33 [RFC v1 1/2] rcu/tree: Clean up dynticks counter usage Joel Fernandes (Google)
2019-08-28 20:13 ` Paul E. McKenney [this message]
2019-08-28 21:56   ` Joel Fernandes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190828201344.GR26530@linux.ibm.com \
    --to=paulmck@kernel.org \
    --cc=corbet@lwn.net \
    --cc=fweisbec@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=kernel-team@android.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mchehab+samsung@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.