From: Frederic Weisbecker <frederic@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Chris Metcalf <cmetcalf@mellanox.com>,
Thomas Gleixner <tglx@linutronix.de>,
Luiz Capitulino <lcapitulino@redhat.com>,
Christoph Lameter <cl@linux.com>,
"Paul E . McKenney" <paulmck@linux.vnet.ibm.com>,
Wanpeng Li <kernellwp@gmail.com>, Mike Galbraith <efault@gmx.de>,
Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH 4/6] sched/isolation: Residual 1Hz scheduler tick offload
Date: Sat, 10 Feb 2018 11:29:09 +0100
Message-ID: <20180210102908.GC14047@lerouge>
In-Reply-To: <20180209071612.uubujtfjxidrad5r@gmail.com>

On Fri, Feb 09, 2018 at 08:16:12AM +0100, Ingo Molnar wrote:
>
> * Frederic Weisbecker <frederic@kernel.org> wrote:
>
> > When a CPU runs in full dynticks mode, a 1Hz tick remains in order to
> > keep the scheduler stats alive. However this residual tick is a burden
> > for bare metal tasks that can't stand any interruption at all, or want
> > to minimize them.
> >
> > The usual boot parameters "nohz_full=" or "isolcpus=nohz" will now
> > outsource these scheduler ticks to the global workqueue so that a
> > housekeeping CPU handles those remotely. The sched_class::task_tick()
> > implementations have been audited and look safe to be called remotely
> > as the target runqueue and its current task are passed in parameter
> > and don't seem to be accessed locally.
> >
> > Note that in the case of using isolcpus, it's still up to the user to
> > affine the global workqueues to the housekeeping CPUs through
> > /sys/devices/virtual/workqueue/cpumask or domains isolation
> > "isolcpus=nohz,domain".
> >
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Chris Metcalf <cmetcalf@mellanox.com>
> > Cc: Christoph Lameter <cl@linux.com>
> > Cc: Luiz Capitulino <lcapitulino@redhat.com>
> > Cc: Mike Galbraith <efault@gmx.de>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Rik van Riel <riel@redhat.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Wanpeng Li <kernellwp@gmail.com>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > ---
> > kernel/sched/core.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++-
> > kernel/sched/isolation.c | 4 +++
> > kernel/sched/sched.h | 2 ++
> > 3 files changed, 96 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index fc9fa25..5c0e8b6 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3120,7 +3120,94 @@ u64 scheduler_tick_max_deferment(void)
> >
> > return jiffies_to_nsecs(next - now);
> > }
> > -#endif
> > +
> > +struct tick_work {
> > +	int cpu;
> > +	struct delayed_work work;
> > +};
> > +
> > +static struct tick_work __percpu *tick_work_cpu;
> > +
> > +static void sched_tick_remote(struct work_struct *work)
> > +{
> > +	struct delayed_work *dwork = to_delayed_work(work);
> > +	struct tick_work *twork = container_of(dwork, struct tick_work, work);
> > +	int cpu = twork->cpu;
> > +	struct rq *rq = cpu_rq(cpu);
> > +	struct rq_flags rf;
> > +
> > +	/*
> > +	 * Handle the tick only if it appears the remote CPU is running
> > +	 * in full dynticks mode. The check is racy by nature, but
> > +	 * missing a tick or having one too much is no big deal.
>
> I'd suggest pointing out why it's no big deal:
>
> * missing a tick or having one too much is no big deal,
> * because the scheduler tick updates statistics and checks
> * timeslices in a time-independent way, regardless of when
> * exactly it is running.
>
> > +	 */
> > +	if (!idle_cpu(cpu) && tick_nohz_tick_stopped_cpu(cpu)) {
> > +		struct task_struct *curr;
> > +		u64 delta;
> > +
> > +		rq_lock_irq(rq, &rf);
> > +		update_rq_clock(rq);
> > +		curr = rq->curr;
> > +		delta = rq_clock_task(rq) - curr->se.exec_start;
> > +		/* Make sure we tick in a reasonable amount of time */
> > +		WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
>
>
> Please add a newline before the comment, and I'd also suggest this wording:
>
> /* Make sure the next tick runs within a reasonable amount of time: */
>
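
For illustration only, here is a minimal userspace sketch of the threshold check quoted above. It shows why the constant is widened to 64 bits before the multiplication: if the constant is a plain long, as assumed here, three billion does not fit where long is 32 bits wide. The names SKETCH_NSEC_PER_SEC and tick_overdue() are hypothetical and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a nanoseconds-per-second constant typed as long. */
#define SKETCH_NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: returns true when more than three seconds have
 * elapsed between exec_start_ns and clock_task_ns.
 */
static bool tick_overdue(uint64_t clock_task_ns, uint64_t exec_start_ns)
{
	uint64_t delta = clock_task_ns - exec_start_ns;

	/* Widen first so the product cannot overflow a 32-bit long. */
	return delta > (uint64_t)SKETCH_NSEC_PER_SEC * 3;
}
```

The comparison is strict, so a delta of exactly three seconds does not count as overdue in this sketch.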
> > +	/*
> > +	 * Perform remote tick every second. The arbitrary frequence is
> > +	 * large enough to avoid overload and short enough to keep sched
> > +	 * internal stats alive.
> > +	 */
> > +	queue_delayed_work(system_unbound_wq, dwork, HZ);
> > +}
>
> Typo. I'd also suggest somewhat clearer wording:
>
> /*
> * Run the remote tick once per second (1Hz). This arbitrary
> * frequency is large enough to avoid overload but short enough
> * to keep scheduler internal stats reasonably up to date.
> */
>
> > +#ifdef CONFIG_HOTPLUG_CPU
> > +static void sched_tick_stop(int cpu)
> > +{
> > +	struct tick_work *twork;
> > +
> > +	if (housekeeping_cpu(cpu, HK_FLAG_TICK))
> > +		return;
> > +
> > +	WARN_ON_ONCE(!tick_work_cpu);
> > +
> > +	twork = per_cpu_ptr(tick_work_cpu, cpu);
> > +	cancel_delayed_work_sync(&twork->work);
> > +}
> > +#endif /* CONFIG_HOTPLUG_CPU */
> > +
> > +int __init sched_tick_offload_init(void)
> > +{
> > +	tick_work_cpu = alloc_percpu(struct tick_work);
> > +	if (!tick_work_cpu) {
> > +		pr_err("Can't allocate remote tick struct\n");
> > +		return -ENOMEM;
>
> Printing a warning is not enough. If tick_work_cpu ends up being NULL, then the
> tick will crash AFAICS, due to:
>
> > +	twork = per_cpu_ptr(tick_work_cpu, cpu);
> > +	cancel_delayed_work_sync(&twork->work);
>
> ... it's much better to crash straight away - i.e. we should use panic().
>
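
As a rough userspace analogy of the fail-fast approach suggested above, the sketch below aborts at allocation time instead of logging and letting a later dereference of a NULL pointer crash. Here calloc() and abort() only stand in for alloc_percpu() and panic(); the struct layout and the name tick_work_alloc_or_die() are hypothetical, and nr_cpus is assumed to be greater than zero.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct tick_work. */
struct tick_work {
	int cpu;
};

/*
 * Allocate one zeroed slot per CPU. On failure, stop right away rather
 * than returning NULL for a later user to trip over.
 * Assumption: nr_cpus > 0.
 */
static struct tick_work *tick_work_alloc_or_die(size_t nr_cpus)
{
	struct tick_work *tw = calloc(nr_cpus, sizeof(*tw));

	if (!tw) {
		fprintf(stderr, "Can't allocate remote tick struct\n");
		abort();
	}
	return tw;
}
```

Callers of such a helper never need their own NULL check, which is the property the review comment is after.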
> > +#else
> > +static void sched_tick_start(int cpu) { }
> > +static void sched_tick_stop(int cpu) { }
> > +#endif /* CONFIG_NO_HZ_FULL */
>
> So if we are using #if/else/endif markers, please use them in the #else branch
> when it's so short, where they are actually useful:
>
> > +#else /* !CONFIG_NO_HZ_FULL: */
> > +static void sched_tick_start(int cpu) { }
> > +static void sched_tick_stop(int cpu) { }
> > +#endif
>
> (also note the inversion)

OK for everything there, I'll fix it all.

Thanks!
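
For readers following the quoted sched_tick_remote(), here is a minimal userspace sketch of how the handler gets from the work pointer it receives back to the CPU number stored next to it. The container_of() below is a simplified stand-in for the kernel macro (no type checking), the two work structs are hypothetical placeholders for the real workqueue types, and tick_work_to_cpu() is an illustrative name that does not appear in the patch.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical placeholders for the kernel workqueue types. */
struct work_struct {
	int pending;
};

struct delayed_work {
	struct work_struct work;
	long delay;
};

/* Same shape as in the patch: the CPU number sits beside the work item. */
struct tick_work {
	int cpu;
	struct delayed_work work;
};

/* Walk outwards: work_struct -> delayed_work -> tick_work -> cpu. */
static int tick_work_to_cpu(struct work_struct *work)
{
	struct delayed_work *dwork = container_of(work, struct delayed_work, work);
	struct tick_work *twork = container_of(dwork, struct tick_work, work);

	return twork->cpu;
}
```

Embedding the work item in the per-CPU structure means the callback needs no extra argument or lookup table to find its target CPU.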