linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-kernel@vger.kernel.org, Boqun Feng <boqun.feng@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	tglx@linutronix.de, Steven Rostedt <rostedt@goodmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	tj@kernel.org
Subject: Re: [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask
Date: Tue, 11 Sep 2018 10:02:22 -0700	[thread overview]
Message-ID: <20180911170222.GO4225@linux.vnet.ibm.com> (raw)
In-Reply-To: <20180911162142.cc3vgook2gctus4c@linutronix.de>

On Tue, Sep 11, 2018 at 06:21:42PM +0200, Sebastian Andrzej Siewior wrote:
> On 2018-09-11 09:05:32 [-0700], Paul E. McKenney wrote:
> > On Mon, Sep 10, 2018 at 03:56:16PM +0200, Sebastian Andrzej Siewior wrote:
> > > It was possible that sync_rcu_exp_select_cpus() enqueued something on
> > > CPU0 while CPU0 was offline. Such a work item wouldn't be processed
> > > until CPU0 gets back online. This problem was addressed in commit
> > > fcc6354365015 ("rcu: Make expedited GPs handle CPU 0 being offline"). I
> > > don't think the issue fully addressed.
> > > 
> > > Assume grplo = 0 and grphi = 7 and sync_rcu_exp_select_cpus() is invoked
> > > on CPU1. The preempt_disable() section on CPU1 won't ensure that CPU0
> > > remains online between looking at cpu_online_mask and invoking
> > > queue_work_on() on CPU1.
> > > 
> > > Use cpus_read_lock() to ensure that `cpu' is not going down between
> > > looking at cpu_online_mask at invoking queue_work_on() and waiting for
> > > its completion. It is added around the loop + flush_work() which is
> > > similar to work_on_cpu_safe() (and we can have multiple jobs running on
> > > NUMA systems).
> > 
> > Is this experimental or theoretical?
> 
> theoretical. I saw that hunk on RT and I can't have queue_work() within
> a preempt_disable() section here.

OK, I feel better now.  ;-)

> > If theoretical, the counter-theory
> > is that the stop-machine processing prevents any of the cpu_online_mask
> > bits from changing, though, yes, we would like to get rid of the
> > stop-machine processing.  So either way, yes, the current state could
> > use some improvement.
> > 
> > But one problem with the patch below is that sync_rcu_exp_select_cpus()
> > can be called while the cpu_hotplug_lock is write-held.  Or is that
> > somehow OK these days?  
> 
> depends. Is it okay to wait until the write-lock is dropped? If it is,
> then it is okay. If not…

The problem is that people wait for grace periods within CPU hotplug
notifiers, so the lock won't be dropped until long after that notifier
returns.

> > Assuming not, how about the (untested) patch
> > below?
> 
> Doesn't work for me because it is still within the preempt-disable
> section :/.
> Would it work to use WORK_CPU_UNBOUND? As far as I understand it, the
> CPU number does not matter, you just want to spread it across multiple
> CPUs in the NUMA case.

Locality is a good thing, but yes, something like this?

	if (!IS_ENABLED(CONFIG_PREEMPT_RT) && /* or whatever it is called */
	    unlikely(cpu > rnp->grphi - rnp->grplo))

Another approach that might be better longer term would be to have a
workqueue interface that treats the specified CPU as a suggestion,
and silently switches to WORK_CPU_UNBOUND if there is any problem
whatsoever with the specified CPU.  Tejun, Lai, thoughts?

> > commit 5214cbbfe6a5d6b92c76c4e411a049fe57245d4a
> > Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Date:   Tue Sep 11 08:57:48 2018 -0700
> > 
> >     rcu: Stop expedited grace periods from relying on stop-machine
> >     
> >     The CPU-selection code in sync_rcu_exp_select_cpus() disables preemption
> >     to prevent the cpu_online_mask from changing.  However, this relies on
> >     the stop-machine mechanism in the CPU-hotplug offline code, which is not
> >     desirable (it would be good to someday remove the stop-machine mechanism).
> 
> not that I tested it, but I still don't understand how a
> preempt_disable() section on CPU1 can ensure that CPU3 won't go down. Is
> there some code that invokes stop_cpus() for each CPU or so?

Yes, there is a stop_machine_cpuslocked() in takedown_cpu().

There is also a synchronize_rcu_mult(call_rcu, call_rcu_sched) in
sched_cpu_deactivate().

The old code relies on the stop_machine_cpuslocked() and my proposed
patch relies on the synchronize_rcu_mult().

							Thanx, Paul


  reply	other threads:[~2018-09-11 17:02 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-10 13:56 [PATCH] rcu: Use cpus_read_lock() while looking at cpu_online_mask Sebastian Andrzej Siewior
2018-09-11 16:05 ` Paul E. McKenney
2018-09-11 16:21   ` Sebastian Andrzej Siewior
2018-09-11 17:02     ` Paul E. McKenney [this message]
2018-09-19 20:55       ` Tejun Heo
2018-09-19 22:11         ` Paul E. McKenney
2018-10-12 18:41           ` Sebastian Andrzej Siewior
2018-10-13 13:48             ` Paul E. McKenney
2018-10-15 14:42               ` Sebastian Andrzej Siewior
2018-10-15 15:07                 ` Boqun Feng
2018-10-15 15:09                   ` Sebastian Andrzej Siewior
2018-10-15 15:33                     ` Boqun Feng
2018-10-15 16:36                       ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180911170222.GO4225@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=bigeasy@linutronix.de \
    --cc=boqun.feng@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).