From: Frederic Weisbecker <fweisbec@gmail.com>
To: Kevin Hilman <khilman@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, linaro-kernel@lists.linaro.org,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH 1/2] sched/nohz: add debugfs control over sched_tick_max_deferment
Date: Fri, 10 Jan 2014 16:17:56 +0100
Message-ID: <20140110151753.GA15280@localhost.localdomain>
In-Reply-To: <87eh4lez2w.fsf@linaro.org>

On Mon, Jan 06, 2014 at 10:37:27AM -0800, Kevin Hilman wrote:
> Frederic Weisbecker <fweisbec@gmail.com> writes:
> 
> > On Tue, Dec 17, 2013 at 01:23:07PM -0800, Kevin Hilman wrote:
> >> Allow debugfs override of sched_tick_max_deferment in order to ease
> >> finding/fixing the remaining issues with full nohz.
> >> 
> >> The value to be written is in jiffies, and -1 means the max deferment
> >> is disabled (scheduler_tick_max_deferment() returns KTIME_MAX.)
> >> 
> >> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> >> Signed-off-by: Kevin Hilman <khilman@linaro.org>
> >> ---
> >>  kernel/sched/core.c | 16 +++++++++++++++-
> >>  1 file changed, 15 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index 5ac63c9a995a..4b1fe3e69fe4 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -2175,6 +2175,8 @@ void scheduler_tick(void)
> >>  }
> >>  
> >>  #ifdef CONFIG_NO_HZ_FULL
> >> +static u32 sched_tick_max_deferment = HZ;
> >> +
> >>  /**
> >>   * scheduler_tick_max_deferment
> >>   *
> >> @@ -2193,13 +2195,25 @@ u64 scheduler_tick_max_deferment(void)
> >>  	struct rq *rq = this_rq();
> >>  	unsigned long next, now = ACCESS_ONCE(jiffies);
> >>  
> >> -	next = rq->last_sched_tick + HZ;
> >> +	if (sched_tick_max_deferment == -1)
> >> +		return KTIME_MAX;
> >> +
> >> +	next = rq->last_sched_tick + sched_tick_max_deferment;
> >>  
> >>  	if (time_before_eq(next, now))
> >>  		return 0;
> >>  
> >>  	return jiffies_to_usecs(next - now) * NSEC_PER_USEC;
> >>  }
> >> +
> >> +static __init int sched_nohz_full_init_debug(void)
> >> +{
> >> +	debugfs_create_u32("sched_tick_max_deferment", 0644, NULL,
> >> +			   &sched_tick_max_deferment);
> >> +
> >> +	return 0;
> >> +}
> >> +late_initcall(sched_nohz_full_init_debug);
> >
> > If the goal is mostly to turn off sched_tick_max_deferment (set to -1), we should
> > perhaps make it a boolean sched feature (see kernel/sched/features.h) as it's a pretty
> > well consolidated interface.
> 
> Well, I suspect folks may want to set it to various values, depending on
> workload to experiment with the results.

Another option is to add an integer file in the sched_features/ debugfs directory. But all other
files there are boolean, so it wouldn't integrate there very well.

One of the things I would like to try is to offload sched_class(current[$CPU])::scheduler_tick()
to the timekeeper or any housekeeping CPU.

So the housekeeper could handle the periodic tick on behalf of full dynticks CPUs. And then
being able to tune the frequency of this sounds interesting.

So yeah, having a tunable integer makes sense after all.

> 
> Also, my first attempt was to add control over this via sysctl[1] (though
> not sched_features) and you suggested[2] I use debugfs instead since this
> should be a temporary hack until we can remove the 1Hz residual tick.

Right, but SCHED_FEAT are debugfs :)  And I thought we could either reuse it
or reuse the sched feature debugfs directory. But again I realize it's all made of bool values
so it's not very welcoming for consistency.

Anyway, thinking about it more, perhaps we should actually use your patch that uses sysctl, since
the rest of the scheduler does that for tunable numbers.

Now since it's sysctl, I'm kind of more picky about correctness limits: what if people set high values,
thinking the kernel can handle them just fine, while it obviously can't yet? Should we ignore values that
go too far? And how do we draw the line?

Thoughts?
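
To make the limit question concrete, here is a minimal user-space sketch of one
possible policy: clamp out-of-range writes instead of rejecting them, and keep -1
as the "disabled" sentinel. This is a model, not kernel code. MODEL_HZ, DEFER_MIN,
DEFER_MAX and both function names are hypothetical, UINT64_MAX merely stands in
for KTIME_MAX, and the 10 second ceiling is an arbitrary placeholder since nothing
in this thread settles where the line should be. In an actual sysctl handler the
bounds would more likely be enforced through proc_dointvec_minmax() with
extra1/extra2.

```c
#include <stdint.h>

#define MODEL_HZ        250u               /* assumed tick rate for the model */
#define NSEC_PER_JIFFY  (1000000000ull / MODEL_HZ)
#define DEFER_DISABLED  UINT32_MAX         /* what -1 becomes when stored in a u32 */
#define DEFER_MIN       1u                 /* hypothetical lower bound, in jiffies */
#define DEFER_MAX       (10u * MODEL_HZ)   /* hypothetical upper bound: 10 seconds */

/* Clamp a requested deferment (in jiffies) into [DEFER_MIN, DEFER_MAX],
 * letting the "disabled" sentinel pass through unchanged. */
static uint32_t clamp_deferment(uint32_t requested)
{
	if (requested == DEFER_DISABLED)
		return DEFER_DISABLED;
	if (requested < DEFER_MIN)
		return DEFER_MIN;
	if (requested > DEFER_MAX)
		return DEFER_MAX;
	return requested;
}

/* Model of scheduler_tick_max_deferment() from the quoted patch:
 * nanoseconds until the next forced tick, 0 if it is already due,
 * or UINT64_MAX (standing in for KTIME_MAX) when disabled. */
static uint64_t max_deferment_ns(uint32_t setting, unsigned long last_tick,
				 unsigned long now)
{
	unsigned long next;

	if (setting == DEFER_DISABLED)
		return UINT64_MAX;

	next = last_tick + setting;

	/* Same wrap-safe comparison as time_before_eq(next, now). */
	if ((long)(next - now) <= 0)
		return 0;

	return (uint64_t)(next - now) * NSEC_PER_JIFFY;
}
```

With this policy a write of 0 is raised to DEFER_MIN and an oversized value is
silently lowered to DEFER_MAX. Returning an error for out-of-range writes would
be the stricter alternative.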

> Kevin
> 
> [1] http://marc.info/?l=linux-kernel&m=137159992306877&w=2
> [2] http://marc.info/?l=linux-kernel&m=137166737830821&w=2
