From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Lameter <cl@gentwo.org>,
	linux-kernel@vger.kernel.org, mingo@kernel.org,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, dhowells@redhat.com,
	edumazet@google.com, dvhart@linux.intel.com, oleg@redhat.com,
	sbw@mit.edu
Subject: Re: [PATCH tip/core/rcu 11/17] rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs
Date: Fri, 11 Jul 2014 12:08:16 -0700
Message-ID: <20140711190816.GR16041@linux.vnet.ibm.com>
In-Reply-To: <20140711185731.GG26045@localhost.localdomain>

On Fri, Jul 11, 2014 at 08:57:33PM +0200, Frederic Weisbecker wrote:
> On Fri, Jul 11, 2014 at 11:45:28AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 11, 2014 at 08:25:43PM +0200, Frederic Weisbecker wrote:
> > > On Fri, Jul 11, 2014 at 01:10:41PM -0500, Christoph Lameter wrote:
> > > > On Tue, 8 Jul 2014, Frederic Weisbecker wrote:
> > > > 
> > > > > > I was figuring that a fair number of the kthreads might eventually
> > > > > > be using this, not just for the grace-period kthreads.
> > > > >
> > > > > Ok makes sense. But can we just rename the cpumask to housekeeping_mask?
> > > > 
> > > > That would imply that all no-nohz processors are housekeeping? So all
> > > > processors with a tick are housekeeping?
> > > 
> > > Well, now that I think about it again, I would really like to keep housekeeping
> > > to CPU 0 when nohz_full= is passed.
> > 
> > When CONFIG_NO_HZ_FULL_SYSIDLE=y, then housekeeping kthreads are bound to
> > CPU 0.  However, doing this causes significant slowdowns according to
> > Fengguang's testing, so when CONFIG_NO_HZ_FULL_SYSIDLE=n, I bind the
> > housekeeping kthreads to the set of non-nohz_full CPUs.
> 
> But did he see these slowdowns with nohz_full= parameter passed? I doubt he
> tested that. And I'm not sure that people who need full dynticks will run
> the usecases that trigger slowdowns with grace period kthreads.
> 
> I also doubt that people will often omit CPUs other than CPU 0 from the
> nohz_full= range.

Agreed, this is only a problem when people run workloads for which
NO_HZ_FULL is not well-suited.  Which is why I settled on designating
the non-nohz_full= CPUs as the housekeeping CPUs -- people wanting to
run general workloads not suited to NO_HZ_FULL probably won't specify
nohz_full=.  If they don't, then any CPU can be a housekeeping CPU.
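
For illustration only, here is a minimal userspace sketch of the selection
policy described above. It is not the kernel code: the type cpumask_model_t,
the helpers all_cpus() and housekeeping_cpus(), and the "sysidle" flag (which
stands in for CONFIG_NO_HZ_FULL_SYSIDLE=y) are hypothetical names, and a
plain 64-bit word stands in for a real cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model: bit i set means CPU i is in the mask. */
typedef uint64_t cpumask_model_t;

/* Mask covering CPUs 0..nr_cpus-1 (at most 64 CPUs in this model). */
static cpumask_model_t all_cpus(unsigned int nr_cpus)
{
	assert(nr_cpus >= 1 && nr_cpus <= 64);
	return nr_cpus == 64 ? UINT64_MAX : ((UINT64_C(1) << nr_cpus) - 1);
}

/*
 * Choose the housekeeping CPUs as described in the message:
 *  - sysidle == true (models CONFIG_NO_HZ_FULL_SYSIDLE=y): CPU 0 only.
 *  - otherwise: every CPU that is not listed in nohz_full=.
 * With an empty nohz_full mask, every CPU qualifies.
 */
static cpumask_model_t housekeeping_cpus(cpumask_model_t nohz_full,
					 unsigned int nr_cpus, bool sysidle)
{
	if (sysidle)
		return UINT64_C(1);
	return all_cpus(nr_cpus) & ~nohz_full;
}
```

On a 4-CPU system booted with nohz_full=2-3 (mask 0xC), this model yields
CPUs 0-1 as housekeeping CPUs in the non-sysidle case and CPU 0 alone in the
sysidle case.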

> > > > Could we make that set configurable? Ideally I'd like to have the ability
> > > > to restrict the housekeeping to one processor.
> > > 
> > > Ah, I'm curious about your usecase. But I think we can do that. And we should.
> > > 
> > > In fact I think that Paul could keep affining grace period kthread to CPU 0
> > > for the sole case when we have nohz_full= parameter passed.
> > > 
> > > I think the performance issues reported to him refer to CONFIG_NO_HZ_FULL=y
> > > config without nohz_full= parameter passed. That's the most important to address.
> > > 
> > > Optimizing the "nohz_full= passed" case is probably not very useful and, worse,
> > > it complicates things a lot.
> > > 
> > > What do you think Paul? Can we simplify things that way? I'm pretty sure that
> > > nobody cares about optimizing the nohz_full= case. That would really simplify
> > > things to stick to CPU 0.
> > 
> > When we have CONFIG_NO_HZ_FULL_SYSIDLE=y, agreed.  In that case, having
> > housekeeping CPUs on CPUs other than CPU 0 means that you never reach
> > full-system-idle state.
> 
> That said, I expect CONFIG_NO_HZ_FULL_SYSIDLE=y to always be enabled for those
> who run NO_HZ_FULL in the long run.

Hmmm...  That probably means that we need boot-time parameters to
make sysidle detection really happen.  Otherwise, many users will
get a nasty surprise once CONFIG_NO_HZ_FULL_SYSIDLE=y is enabled on
systems that really aren't running HPC or RT workloads.

I suppose that I could confine SYSIDLE's attention to the nohz_full=
CPUs -- that might actually make things work nicely in all cases with
no configuration of any sort required.  I will need to give this some
thought.

> > But in other cases, we appear to need more than one housekeeping CPU.
> > This is especially the case when people run general workloads on systems
> > that have NO_HZ_FULL=y, which appears to be a significant fraction of
> > the systems these days.
> 
> Yeah NO_HZ_FULL=y is likely to be enabled in many distros. But you know the
> amount of nohz_full= users.

Indeed!  ;-)

								Thanx, Paul

