Re: [PATCH RFC tip/core/rcu 1/5] rcu: Reduce overhead of cond_resched() checks for RCU

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oleg Nesterov <oleg@redhat.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, dhowells@redhat.com,
	edumazet@google.com, dvhart@linux.intel.com, fweisbec@gmail.com,
	sbw@mit.edu, Andi Kleen <ak@linux.intel.com>,
	Christoph Lameter <cl@gentwo.org>,
	Mike Galbraith <umgwanakikbuti@gmail.com>,
	Eric Dumazet <eric.dumazet@gmail.com>
Subject: Re: [PATCH RFC tip/core/rcu 1/5] rcu: Reduce overhead of cond_resched() checks for RCU
Date: Mon, 23 Jun 2014 18:43:21 +0200	[thread overview]
Message-ID: <20140623164321.GA5543@redhat.com> (raw)
In-Reply-To: <1403289203-6371-1-git-send-email-paulmck@linux.vnet.ibm.com>

On 06/20, Paul E. McKenney wrote:
>
> This commit takes a different approach to fixing this bug, mainly by
> avoiding having cond_resched() do an RCU-visible quiescent state unless
> there is a grace period that has been in flight for a significant period
> of time.  This commit also reduces the common-case cond_resched() overhead
> to a check of a single per-CPU variable.

I can't say I fully understand this change, but I think it is fine.
Just a really stupid question below.

> +void rcu_resched(void)
> +{
> +	unsigned long flags;
> +	struct rcu_data *rdp;
> +	struct rcu_dynticks *rdtp;
> +	int resched_mask;
> +	struct rcu_state *rsp;
> +
> +	local_irq_save(flags);
> +
> +	/*
> +	 * Yes, we can lose flag-setting operations.  This is OK, because
> +	 * the flag will be set again after some delay.
> +	 */
> +	resched_mask = raw_cpu_read(rcu_cond_resched_mask);
> +	raw_cpu_write(rcu_cond_resched_mask, 0);
> +
> +	/* Find the flavor that needs a quiescent state. */
> +	for_each_rcu_flavor(rsp) {
> +		rdp = raw_cpu_ptr(rsp->rda);
> +		if (!(resched_mask & rsp->flavor_mask))
> +			continue;
> +		smp_mb(); /* ->flavor_mask before ->cond_resched_completed. */
> +		if (ACCESS_ONCE(rdp->mynode->completed) !=
> +		    ACCESS_ONCE(rdp->cond_resched_completed))
> +			continue;

Probably the comment above mb() meant "rcu_cond_resched_mask before
->cond_resched_completed" ? Otherwise I can't see why do we need any
barrier.

> @@ -893,13 +946,20 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp,
>  	}
>
>  	/*
> -	 * There is a possibility that a CPU in adaptive-ticks state
> -	 * might run in the kernel with the scheduling-clock tick disabled
> -	 * for an extended time period.  Invoke rcu_kick_nohz_cpu() to
> -	 * force the CPU to restart the scheduling-clock tick in this
> -	 * CPU is in this state.
> +	 * A CPU running for an extended time within the kernel can
> +	 * delay RCU grace periods.  When the CPU is in NO_HZ_FULL mode,
> +	 * even context-switching back and forth between a pair of
> +	 * in-kernel CPU-bound tasks cannot advance grace periods.
> +	 * So if the grace period is old enough, make the CPU pay attention.
>  	 */
> -	rcu_kick_nohz_cpu(rdp->cpu);
> +	if (ULONG_CMP_GE(jiffies, rdp->rsp->gp_start + 7)) {
> +		rcrmp = &per_cpu(rcu_cond_resched_mask, rdp->cpu);
> +		ACCESS_ONCE(rdp->cond_resched_completed) =
> +			ACCESS_ONCE(rdp->mynode->completed);
> +		smp_mb(); /* ->cond_resched_completed before *rcrmp. */
> +		ACCESS_ONCE(*rcrmp) =
> +			ACCESS_ONCE(*rcrmp) + rdp->rsp->flavor_mask;
> +	}

OK, in this case I guess we need a full barrier because we need to read
rcu_cond_resched_mask before updating it...

But, I am just curious, is there any reason to use ACCESS_ONCE() twice?

	ACCESS_ONCE(*rcrmp) |= rdp->rsp->flavor_mask;

or even

	ACCESS_ONCE(per_cpu(rcu_cond_resched_mask, rdp->cpu)) |=
		rdp->rsp->flavor_mask;

should equally work, or ACCESS_ONCE() can't be used to RMW ?

(and in fact at least the 2nd ACCESS_ONCE() (load) looks unnecessary anyway
 because of smp_mb() above).


Once again, of course I am not arguing if there is no "real" reason and you
just prefer it this way. But the kernel has more and more ACESS_ONCE() users
and sometime I simply do not understand why it is needed. For example,
cyc2ns_write_end().

Or even INIT_LIST_HEAD_RCU(). The comment in list_splice_init_rcu() says:

	/*
	 * "first" and "last" tracking list, so initialize it.  RCU readers
	 * have access to this list, so we must use INIT_LIST_HEAD_RCU()
	 * instead of INIT_LIST_HEAD().
	 */

	INIT_LIST_HEAD_RCU(list);

but we are going to call synchronize_rcu() or something similar, this should
act as compiler barrier too?

Oleg.

next prev parent reply	other threads:[~2014-06-23 16:45 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-20 18:32 [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression Paul E. McKenney
2014-06-20 18:33 ` [PATCH RFC tip/core/rcu 1/5] rcu: Reduce overhead of cond_resched() checks for RCU Paul E. McKenney
2014-06-20 18:33   ` [PATCH RFC tip/core/rcu 2/5] rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops Paul E. McKenney
2014-06-20 18:33   ` [PATCH RFC tip/core/rcu 3/5] rcu: Add RCU_COND_RESCHED_QS for large systems Paul E. McKenney
2014-06-20 18:33   ` [PATCH RFC tip/core/rcu 4/5] rcutorture: Suppress spurious RCU CPU stall warnings Paul E. McKenney
2014-06-20 18:33   ` [PATCH RFC tip/core/rcu 5/5] rcu: Add boot/sysfs control for RCU cond_resched() help solicitation Paul E. McKenney
2014-06-23 16:43   ` Oleg Nesterov [this message]
2014-06-23 17:36     ` [PATCH RFC tip/core/rcu 1/5] rcu: Reduce overhead of cond_resched() checks for RCU Paul E. McKenney
2014-06-23 18:35       ` Oleg Nesterov
2014-06-24  0:18         ` Paul E. McKenney
2014-07-22  4:35   ` Pranith Kumar
2014-07-22  4:52     ` Pranith Kumar
2014-07-22 11:07       ` Paul E. McKenney
2014-07-22 11:06     ` Paul E. McKenney
2014-06-20 19:04 ` [PATCH tip/core/rcu 0/5] Fix for cond_resched performance regression josh
2014-06-20 22:04   ` Paul E. McKenney
2014-06-20 19:12 ` Paul E. McKenney
2014-06-20 21:24   ` josh
2014-06-20 22:11     ` Paul E. McKenney
2014-06-20 22:39       ` josh
2014-06-20 23:30         ` Paul E. McKenney
2014-06-20 23:52           ` josh
2014-06-21  0:14             ` Paul E. McKenney
2014-06-21  0:36               ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140623164321.GA5543@redhat.com \
    --to=oleg@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@gentwo.org \
    --cc=dhowells@redhat.com \
    --cc=dipankar@in.ibm.com \
    --cc=dvhart@linux.intel.com \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=fweisbec@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=niv@us.ibm.com \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sbw@mit.edu \
    --cc=tglx@linutronix.de \
    --cc=umgwanakikbuti@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.