Re: [RFC/RFT][PATCH v4 1/2] cpufreq: New governor using utilization data from the scheduler

linux-pm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Steve Muckle <steve.muckle@linaro.org>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Linux PM list <linux-pm@vger.kernel.org>,
	Juri Lelli <juri.lelli@arm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [RFC/RFT][PATCH v4 1/2] cpufreq: New governor using utilization data from the scheduler
Date: Mon, 29 Feb 2016 20:10:07 -0800	[thread overview]
Message-ID: <56D5161F.1030701@linaro.org> (raw)
In-Reply-To: <2754630.1sRldKdOu8@vostro.rjw.lan>

On 02/27/2016 07:24 AM, Rafael J. Wysocki wrote:
>>> +static unsigned int sugov_next_freq(struct policy_dbs_info *policy_dbs,
>>> +				    unsigned long util, unsigned long max,
>>> +				    u64 last_sample_time)
>>> +{
>>> +	struct cpufreq_policy *policy = policy_dbs->policy;
>>> +	unsigned int min_f = policy->min;
>>> +	unsigned int max_f = policy->max;
>>> +	unsigned int j;
>>> +
>>> +	if (util > max || min_f >= max_f)
>>> +		return max_f;
>>> +
>>> +	for_each_cpu(j, policy->cpus) {
>>> +		struct sugov_cpu *j_sg_cpu;
>>> +		unsigned long j_util, j_max;
>>> +
>>> +		if (j == smp_processor_id())
>>> +			continue;
>>> +
>>> +		j_sg_cpu = &per_cpu(sugov_cpu, j);
>>> +		/*
>>> +		 * If the CPU was last updated before the previous sampling
>>> +		 * time, we don't take it into account as it probably is idle
>>> +		 * now.
>>> +		 */
>>
>> If the sampling rate is less than the tick, it seems possible a busy CPU
>> may not manage to get an update in within the current sampling period.
> 
> Right.
> 
> sampling_rate is more of a rate limit here, though, because the governor doesn't
> really do any sampling (I was talking about that in the intro message at
> http://marc.info/?l=linux-pm&m=145609673008122&w=2).
> 
> It was easy to re-use the existing show/store for that, so I took the shortcut,
> but I'm considering to introduce a new attribute with a more suitable name for
> that.  That would help to avoid a couple of unclean things in patch [2/2] as
> well if I'm not mistaken.
> 
>> What about checking to see if there was an update within the last tick
>> period, rather than sample period?
> 
> If your rate limit is set below the rate of the updates (scheduling rate more
> or less), it simply has no effect.  To me, that case doesn't require any
> special care.

I'm specifically worried about the check below where we omit a CPU's
capacity request if its last update came before the last sample time.

Say there are 2 CPUs in a frequency domain, HZ is 100 and the sample
delay here is 4ms. At t=0ms CPU0 starts running a CPU hog and reports
being fully busy. At t=5ms CPU1 starts running a task - CPU0's vote is
kept and factored in and last_sample_time is updated to t=5ms. At t=9ms
CPU1 stops running the task and updates again, but this time CPU0's vote
is discarded because its last update is t=0ms, and last_sample_time is
5ms. But CPU0 is fully busy and the freq is erroneously dropped.

> 
>>> +		if (j_sg_cpu->last_update < last_sample_time)
>>> +			continue;
>>> +
>>> +		j_util = j_sg_cpu->util;
>>> +		j_max = j_sg_cpu->max;
>>> +		if (j_util > j_max)
>>> +			return max_f;
>>> +
>>> +		if (j_util * max > j_max * util) {
>>> +			util = j_util;
>>> +			max = j_max;
>>> +		}
>>> +	}
>>> +
>>> +	return min_f + util * (max_f - min_f) / max;
>>> +}
...
>>> +static unsigned int sugov_set_freq(struct cpufreq_policy *policy)
>>> +{
>>> +	struct policy_dbs_info *policy_dbs = policy->governor_data;
>>> +	struct sugov_policy *sg_policy = to_sg_policy(policy_dbs);
>>> +
>>> +	__cpufreq_driver_target(policy, sg_policy->next_freq, CPUFREQ_RELATION_C);
>>
>> Similar to the concern raised in the acpi changes I think this should be
>> CPUFREQ_RELATION_L, otherwise you may not run fast enough to meet demand.
> 
> "ondemand" uses CPUFREQ_RELATION_C when powersave_bias is unset and that's why
> I used it here.
> 
> Like I said in my reply to Peter in that thread, using RELATION_L here is likely
> to make us avoid the min frequency almost entirely even if the system is almost
> completely idle.  I don't think that would be OK really.
> 
> That said my opinion about this particular item isn't really strong.

I think the calculation for required CPU bandwidth needs tweaking.

Aside from always wanting something past fmin, currently the amount of
extra CPU capacity given for a particular % utilization depends on how
high the platform's fmin happens to be, even if the fmax speeds are the
same. For example given two platforms with the following available
frequencies (MHz):

platform A: 100, 300, 500, 700, 900, 1100
platform B: 500, 700, 900, 1100

A 50% utilization load on platform A will want 600 MHz (rounding up to
700 MHz perhaps) whereas platform B will want 800 MHz (again likely
rounding up to 900 MHz), even though the load consumes 550 MHz on both
platforms.

One possibility would be something like we had in schedfreq, getting the
absolute CPU bw requirement (util/max) * fmax and then adding some %
margin, which I think is more consistent. It is true that it means
figuring out what the right margin is and now there's a magic number
(and potentially a tunable), but it would be more consistent.

thanks,
Steve

next prev parent reply	other threads:[~2016-03-01  4:10 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-21 23:16 [RFC/RFT][PATCH 0/1] cpufreq: New governor based on scheduler-provided utilization data Rafael J. Wysocki
2016-02-21 23:18 ` [RFC/RFT][PATCH 1/1] cpufreq: New governor using utilization data from the scheduler Rafael J. Wysocki
2016-02-22 14:16   ` Juri Lelli
2016-02-22 23:02     ` Rafael J. Wysocki
2016-02-23  7:20       ` Steve Muckle
2016-02-24  1:38         ` Rafael J. Wysocki
2016-02-25 11:01       ` Juri Lelli
2016-02-26  2:36         ` Rafael J. Wysocki
2016-03-01 14:56           ` Juri Lelli
2016-03-01 20:26             ` Rafael J. Wysocki
2016-02-24  1:20 ` [RFC/RFT][PATCH v2 0/2] cpufreq: New governor based on scheduler-provided utilization data Rafael J. Wysocki
2016-02-24  1:22   ` [RFC/RFT][PATCH v2 1/2] cpufreq: New governor using utilization data from the scheduler Rafael J. Wysocki
2016-02-25 21:14     ` [RFC/RFT][PATCH v4 " Rafael J. Wysocki
2016-02-27  0:21       ` Rafael J. Wysocki
2016-02-27  4:33       ` Steve Muckle
2016-02-27 15:24         ` Rafael J. Wysocki
2016-03-01  4:10           ` Steve Muckle [this message]
2016-03-01 20:20             ` Rafael J. Wysocki
2016-03-03  3:20               ` Steve Muckle
2016-03-03  3:35                 ` Steve Muckle
2016-03-03 19:20                 ` Rafael J. Wysocki
2016-02-24  1:28   ` [RFC/RFT][PATCH v2 2/2] cpufreq: schedutil: Switching frequencies from interrupt context Rafael J. Wysocki
2016-02-24 23:30     ` [RFC/RFT][PATCH v3 " Rafael J. Wysocki
2016-02-25  9:08       ` Peter Zijlstra
2016-02-25  9:12         ` Peter Zijlstra
2016-02-25 11:11           ` Rafael J. Wysocki
2016-02-25 11:10         ` Rafael J. Wysocki
2016-02-25 11:52           ` Peter Zijlstra
2016-02-25 20:54             ` Rafael J. Wysocki
2016-02-25 21:20     ` [RFC/RFT][PATCH v4 " Rafael J. Wysocki
2016-03-03 14:27 ` [RFC/RFT][PATCH 0/1] cpufreq: New governor based on scheduler-provided utilization data Ingo Molnar
2016-03-03 17:15   ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56D5161F.1030701@linaro.org \
    --to=steve.muckle@linaro.org \
    --cc=juri.lelli@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rjw@rjwysocki.net \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).