From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Christoph Lameter <cl@gentwo.org>,
	linux-kernel@vger.kernel.org, mingo@kernel.org,
	laijs@cn.fujitsu.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
	peterz@infradead.org, rostedt@goodmis.org, dhowells@redhat.com,
	edumazet@google.com, dvhart@linux.intel.com, oleg@redhat.com,
	sbw@mit.edu
Subject: Re: [PATCH tip/core/rcu 11/17] rcu: Bind grace-period kthreads to non-NO_HZ_FULL CPUs
Date: Fri, 11 Jul 2014 12:43:14 -0700
Message-ID: <20140711194314.GU16041@linux.vnet.ibm.com>
In-Reply-To: <20140711192612.GJ26045@localhost.localdomain>

On Fri, Jul 11, 2014 at 09:26:14PM +0200, Frederic Weisbecker wrote:
> On Fri, Jul 11, 2014 at 12:08:16PM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 11, 2014 at 08:57:33PM +0200, Frederic Weisbecker wrote:
> > > On Fri, Jul 11, 2014 at 11:45:28AM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jul 11, 2014 at 08:25:43PM +0200, Frederic Weisbecker wrote:
> > > > > On Fri, Jul 11, 2014 at 01:10:41PM -0500, Christoph Lameter wrote:
> > > > > > On Tue, 8 Jul 2014, Frederic Weisbecker wrote:
> > > > > > 
> > > > > > > > I was figuring that a fair number of the kthreads might eventually
> > > > > > > > be using this, not just for the grace-period kthreads.
> > > > > > >
> > > > > > > Ok makes sense. But can we just rename the cpumask to housekeeping_mask?
> > > > > > 
> > > > > > > That would imply that all non-nohz processors are housekeeping? So all
> > > > > > > processors with a tick are housekeeping?
> > > > > 
> > > > > Well, now that I think about it again, I would really like to keep housekeeping
> > > > > to CPU 0 when nohz_full= is passed.
> > > > 
> > > > When CONFIG_NO_HZ_FULL_SYSIDLE=y, then housekeeping kthreads are bound to
> > > > CPU 0.  However, doing this causes significant slowdowns according to
> > > > Fengguang's testing, so when CONFIG_NO_HZ_FULL_SYSIDLE=n, I bind the
> > > > housekeeping kthreads to the set of non-nohz_full CPUs.
> > > 
> > > But did he see these slowdowns with nohz_full= parameter passed? I doubt he
> > > tested that. And I'm not sure that people who need full dynticks will run
> > > the usecases that trigger slowdowns with grace period kthreads.
> > > 
> > > > I also doubt that people will often omit CPUs other than CPU 0 from the
> > > > nohz_full= range.
> > 
> > Agreed, this is only a problem when people run workloads for which
> > NO_HZ_FULL is not well-suited.  Which is why I settled on designating
> > the non-nohz_full= CPUs as the housekeeping CPUs -- people wanting to
> > run general workloads not suited to NO_HZ_FULL probably won't specify
> > nohz_full=.  If they don't, then any CPU can be a housekeeping CPU.
> 
> Right. So affining the GP kthread to all non-nohz-full CPUs works in all cases. It's convenient
> but it requires some plumbing:
> 
> * add a housekeeping cpumask and implement housekeeping_affine on top
> * add kthread_bind_cpumask()

Yep.
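
For concreteness, here is a rough user-space sketch of what the housekeeping
mask amounts to: the set of CPUs that are not in the nohz_full= range. The
cpu_bits type and the all_cpus() and housekeeping_bits() helpers are made up
for illustration (a 64-bit word standing in for a cpumask); they are not the
kernel's cpumask API and not the code from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is in the set. */
typedef uint64_t cpu_bits;

/* Mask covering CPUs 0..nr_cpus-1.  Assumes 1 <= nr_cpus <= 64. */
static cpu_bits all_cpus(unsigned int nr_cpus)
{
	assert(nr_cpus >= 1 && nr_cpus <= 64);
	return nr_cpus == 64 ? UINT64_MAX : (UINT64_C(1) << nr_cpus) - 1;
}

/*
 * Housekeeping CPUs are the ones not listed in nohz_full=.  If nohz_full=
 * was not passed, the nohz_full mask is empty and every CPU qualifies.
 */
static cpu_bits housekeeping_bits(cpu_bits nohz_full, unsigned int nr_cpus)
{
	return all_cpus(nr_cpus) & ~nohz_full;
}
```

On an 8-CPU system booted with nohz_full=1-7 this leaves only CPU 0, and
with no nohz_full= at all it leaves all eight CPUs.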

> So what I propose is to skip these complications and just do:
> 
>         if (tick_nohz_full_enabled()) // means that somebody passed nohz_full= kernel parameter
>             kthread_bind_cpu(GP kthread, 0)
> 
> Moreover Thomas didn't like the idea of extending housekeeping duty beyond CPU 0, arguing that
> it's too early for that. He meant that for timekeeping but the idea is expandable.

Although I agree that we can get away with a single timekeeping CPU, I
don't believe that we can get away with having only a single housekeeping CPU.
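
To put the two policies side by side, a user-space sketch follows. It
assumes a 64-bit word as a stand-in for a cpumask and assumes CPU 0 is
online; gp_kthread_bits() and gp_kthread_bits_simple() are hypothetical
names for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is in the set. */
typedef uint64_t cpu_bits;

/*
 * Policy described above: with SYSIDLE, bind to CPU 0 only; otherwise
 * bind to every online CPU that is not in the nohz_full= range.
 */
static cpu_bits gp_kthread_bits(cpu_bits online, cpu_bits nohz_full,
				bool sysidle)
{
	assert(online & 1);	/* illustration assumes CPU 0 is online */
	if (sysidle)
		return 1;
	return online & ~nohz_full;
}

/*
 * Simpler proposal: bind to CPU 0 whenever nohz_full= was passed,
 * and leave the kthread free to run anywhere otherwise.
 */
static cpu_bits gp_kthread_bits_simple(cpu_bits online, cpu_bits nohz_full)
{
	assert(online & 1);	/* illustration assumes CPU 0 is online */
	if (nohz_full)
		return 1;
	return online;
}
```

The two only differ when nohz_full= leaves more than one CPU out of its
range: with 8 CPUs and nohz_full=4-7, the first policy allows CPUs 0-3
when SYSIDLE is off, while the second allows only CPU 0.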

> > > > > > > Could we make that set configurable? Ideally I'd like to have the ability
> > > > > > > to restrict the housekeeping to one processor.
> > > > > 
> > > > > Ah, I'm curious about your usecase. But I think we can do that. And we should.
> > > > > 
> > > > > In fact I think that Paul could keep affining grace period kthread to CPU 0
> > > > > for the sole case when we have nohz_full= parameter passed.
> > > > > 
> > > > > I think the performance issues reported to him refer to CONFIG_NO_HZ_FULL=y
> > > > > config without nohz_full= parameter passed. That's the most important to address.
> > > > > 
> > > > > > Optimizing the "nohz_full= passed" case is probably not very useful and worse
> > > > > > it complicates things a lot.
> > > > > 
> > > > > What do you think Paul? Can we simplify things that way? I'm pretty sure that
> > > > > nobody cares about optimizing the nohz_full= case. That would really simplify
> > > > > things to stick to CPU 0.
> > > > 
> > > > When we have CONFIG_NO_HZ_FULL_SYSIDLE=y, agreed.  In that case, having
> > > > housekeeping CPUs on CPUs other than CPU 0 means that you never reach
> > > > full-system-idle state.
> > > 
> > > > That said I expect CONFIG_NO_HZ_FULL_SYSIDLE=y to be always enabled for those
> > > > who run NO_HZ_FULL in the long run.
> > 
> > Hmmm...  That probably means that we need boot-time parameters to
> > make sysidle detection really happen.  Otherwise, many users will
> > get a nasty surprise once CONFIG_NO_HZ_FULL_SYSIDLE=y is enabled on
> > systems that really aren't running HPC or RT workloads.
> > 
> > I suppose that I could confine SYSIDLE's attention to the nohz_full=
> > CPUs -- that might actually make things work nicely in all cases with
> > no configuration of any sort required.  I will need to give this some
> > thought.
> 
> Exactly, nohz_full= gives all the information we need for sysidle.

Famous last words!  ;-)

But it does look good thus far.

							Thanx, Paul

> > > > But in other cases, we appear to need more than one housekeeping CPU.
> > > > This is especially the case when people run general workloads on systems
> > > > that have NO_HZ_FULL=y, which appears to be a significant fraction of
> > > > the systems these days.
> > > 
> > > Yeah NO_HZ_FULL=y is likely to be enabled in many distros. But you know the
> > > amount of nohz_full= users.
> > 
> > Indeed!  ;-)
> > 
> > 								Thanx, Paul
> > 
> 