From: Frederic Weisbecker <frederic@kernel.org>
To: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Christoph Lameter <cl@linux.com>,
	"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
	Wanpeng Li <kernellwp@gmail.com>, Mike Galbraith <efault@gmx.de>,
	Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH 4/5] sched/isolation: Residual 1Hz scheduler tick offload
Date: Tue, 16 Jan 2018 16:57:45 +0100
Message-ID: <20180116155743.GA27567@lerouge>
In-Reply-To: <20180112142258.31e7a24c@redhat.com>

On Fri, Jan 12, 2018 at 02:22:58PM -0500, Luiz Capitulino wrote:
> On Thu,  4 Jan 2018 05:25:36 +0100
> Frederic Weisbecker <frederic@kernel.org> wrote:
> 
> > When a CPU runs in full dynticks mode, a 1Hz tick remains in order to
> > keep the scheduler stats alive. However this residual tick is a burden
> > for bare metal tasks that can't stand any interruption at all, or want
> > to minimize them.
> > 
> > Adding the boot parameter "isolcpus=nohz_offload" will now outsource
> > these scheduler ticks to the global workqueue so that a housekeeping CPU
> > handles that tick remotely.
> > 
> > Note it's still up to the user to affine the global workqueues to the
> > housekeeping CPUs through /sys/devices/virtual/workqueue/cpumask or
> > domains isolation.
> > 
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Chris Metcalf <cmetcalf@mellanox.com>
> > Cc: Christoph Lameter <cl@linux.com>
> > Cc: Luiz Capitulino <lcapitulino@redhat.com>
> > Cc: Mike Galbraith <efault@gmx.de>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Rik van Riel <riel@redhat.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Wanpeng Li <kernellwp@gmail.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > ---
> >  kernel/sched/core.c      | 88 ++++++++++++++++++++++++++++++++++++++++++++++--
> >  kernel/sched/isolation.c |  4 +++
> >  kernel/sched/sched.h     |  2 ++
> >  3 files changed, 91 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index d72d0e9..b964890 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3052,9 +3052,14 @@ void scheduler_tick(void)
> >   */
> >  u64 scheduler_tick_max_deferment(void)
> >  {
> > -	struct rq *rq = this_rq();
> > -	unsigned long next, now = READ_ONCE(jiffies);
> > +	struct rq *rq;
> > +	unsigned long next, now;
> >  
> > +	if (!housekeeping_cpu(smp_processor_id(), HK_FLAG_TICK_SCHED))
> > +		return ktime_to_ns(KTIME_MAX);
> > +
> > +	rq = this_rq();
> > +	now = READ_ONCE(jiffies);
> >  	next = rq->last_sched_tick + HZ;
> >  
> >  	if (time_before_eq(next, now))
> > @@ -3062,7 +3067,82 @@ u64 scheduler_tick_max_deferment(void)
> >  
> >  	return jiffies_to_nsecs(next - now);
> >  }
> > -#endif
> > +
> > +struct tick_work {
> > +	int			cpu;
> > +	struct delayed_work	work;
> > +};
> > +
> > +static struct tick_work __percpu *tick_work_cpu;
> > +
> > +static void sched_tick_remote(struct work_struct *work)
> > +{
> > +	struct delayed_work *dwork = to_delayed_work(work);
> > +	struct tick_work *twork = container_of(dwork, struct tick_work, work);
> > +	int cpu = twork->cpu;
> > +	struct rq *rq = cpu_rq(cpu);
> > +	struct rq_flags rf;
> > +
> > +	/*
> > +	 * Handle the tick only if it appears the remote CPU is running
> > +	 * in full dynticks mode. The check is racy by nature, but
> > +	 * missing a tick or having one too much is no big deal.
> > +	 */
> > +	if (!idle_cpu(cpu) && tick_nohz_tick_stopped_cpu(cpu)) {
> > +		rq_lock_irq(rq, &rf);
> > +		update_rq_clock(rq);
> > +		rq->curr->sched_class->task_tick(rq, rq->curr, 0);
> > +		rq_unlock_irq(rq, &rf);
> > +	}
> 
> OK, so this executes task_tick() remotely. What about account_process_tick()?
> Don't we need it as well?

Nope, tasks in nohz_full mode have their special accounting that doesn't
rely on the tick.
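
To illustrate what tick-independent accounting can look like, here is a toy
userspace model. It assumes the accounting charges elapsed time at each
user/kernel transition instead of sampling on the tick. All names
(vtime_sim, vtime_sim_transition, ...) are hypothetical and are not the
kernel's actual API; this is a sketch of the idea only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not kernel code. */
enum ctx_state { CTX_USER, CTX_KERNEL };

struct vtime_sim {
	enum ctx_state state;	/* context the task is currently in */
	uint64_t start_ns;	/* timestamp of the last transition */
	uint64_t utime_ns;	/* accumulated user time */
	uint64_t stime_ns;	/* accumulated kernel time */
};

static void vtime_sim_init(struct vtime_sim *vt, enum ctx_state state,
			   uint64_t now_ns)
{
	vt->state = state;
	vt->start_ns = now_ns;
	vt->utime_ns = 0;
	vt->stime_ns = 0;
}

/*
 * Charge the time spent since the last transition to the context that
 * is being left, then switch to the next context. No periodic tick is
 * involved: the accounting is driven purely by the transitions.
 */
static void vtime_sim_transition(struct vtime_sim *vt, enum ctx_state next,
				 uint64_t now_ns)
{
	uint64_t delta;

	assert(now_ns >= vt->start_ns);
	delta = now_ns - vt->start_ns;

	if (vt->state == CTX_USER)
		vt->utime_ns += delta;
	else
		vt->stime_ns += delta;

	vt->state = next;
	vt->start_ns = now_ns;
}
```

In this model a task that runs in userspace from t=0 to t=900, then in the
kernel until t=1000, is charged 900 units of user time and 100 of kernel
time without any tick having fired.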

> 
> In particular, when I run a hog application on a nohz_full core configured
> with tick offload, I can see in top that the CPU usage goes from 100%
> to idle for a few seconds every couple of seconds. Could this be related?
> 
> Also, in my testing I'm sometimes still seeing the tick, at intervals of
> 10 or 20 seconds. Is this expected? I'll dig deeper next week.

That's expected, see the changelog: the offloaded tick work is not affine
to the housekeeping CPUs by default. You need to either also isolate the
domains:

    isolcpus=nohz_offload,domain

or tweak the workqueue cpumask through:

    /sys/devices/virtual/workqueue/cpumask
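
As a rough sketch of how a value for that file could be computed: the file
takes a hexadecimal CPU bitmask, so the housekeeping mask is "all CPUs minus
the isolated ones". The helper name housekeeping_mask() is hypothetical, and
the sketch assumes at most 32 CPUs so that the mask fits in a single hex
word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel or libc API: return a bitmask with
 * one bit set per housekeeping CPU. Assumes 1 <= nr_cpus <= 32.
 */
static uint32_t housekeeping_mask(unsigned int nr_cpus,
				  const unsigned int *isolated,
				  size_t nr_isolated)
{
	uint32_t mask;
	size_t i;

	assert(nr_cpus >= 1 && nr_cpus <= 32);

	/* Start with every CPU set: bits 0 .. nr_cpus - 1. */
	mask = (nr_cpus == 32) ? UINT32_MAX : ((UINT32_C(1) << nr_cpus) - 1);

	/* Clear the bit of each isolated CPU. */
	for (i = 0; i < nr_isolated; i++) {
		assert(isolated[i] < nr_cpus);
		mask &= ~(UINT32_C(1) << isolated[i]);
	}

	return mask;
}
```

On an 8-CPU machine with CPUs 2 and 3 isolated, this gives 0xf3; printed
with "%x" that is the string "f3", which could then be written to the
cpumask file as root.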

Thanks.

