Re: [PATCH RFC 0/1] cpufreq/x86: Add P-state driver for sandy bridge.


From: Arjan van de Ven <arjan@linux.intel.com>
To: David C Niemi <dniemi@verisign.com>
Cc: dirk.brandewie@gmail.com, cpufreq@vger.kernel.org, rjw@sisk.pl,
	deneen.t.dock@intel.com
Subject: Re: [PATCH RFC 0/1] cpufreq/x86: Add P-state driver for sandy bridge.
Date: Thu, 06 Dec 2012 09:41:12 -0800
Message-ID: <50C0D8B8.1060301@linux.intel.com>
In-Reply-To: <50C0D64E.8050005@verisign.com>

On 12/6/2012 9:30 AM, David C Niemi wrote:
> On 12/06/12 11:27, Arjan van de Ven wrote:
>> ...
>>> The exposed configuration interface might be as simple as choosing one of several discrete settings:
>>> - max single-threaded performance
>>> - max multi-threaded performance
>>
>>
>> these are identical on today's silicon btw; or rather, this is not a P state choice item, but a task scheduler policy item.
>
> Here's where there is a difference in power management:
> if you want to maximize single-thread performance, you're willing to enable power-expensive boost
> modes on behalf of a thread.

sure

> You don't want to do that for multithreaded performance because your thermal envelope may not let
> you boost them all at once.  Or at least that is what I was thinking.

this part I don't buy, at least on current hw... the boost code will deal with this quite well;
there's no knob that can do better than that.

>
> Also some people will be all about I/O throughput, and others will care more about latency than anything else, and percentages for those people may be wildly different than for general computation.  So we can't guarantee any particular percentage outside some well-defined benchmarks.  But we could try to lump them all together as best we can and have a couple of knobs on the side like the current "io_is_busy", perhaps.
>>> - "server" setting -- save power but only in ways that do not affect performance
>>
>> this is a fiction btw... if there was a way to reduce power and not affect performance, that's your "max performance" setting.
>> anything else will sacrifice SOME performance from max...

> I know people who don't pay for electricity or cooling and think max performance == run every thread at maximum possible speed all the time, even if it is idle.
> But boost modes mean "maximum possible speed" is a fluid concept.

my point was that this is no different than "max single/multi performance" above.. unless you can make tradeoffs
(which means performance impact).

>> and defining a common policy interface I'm quite fine with (not quite in the way you defined it, but ok...)
>> But that's not going to lead to a common implementation as a "governor" ;-(
>>
>> My idea for a policy "dial" is mostly
>>
>> * Uncompromised performance
>> * Balanced - biased towards performance     (say, defined to be lowest power at most a 2 1/2% perf hit)
>> * Balanced                                  (say, at most a 5% perf hit)
>> * Balanced - biased towards lower power     (say, at most a 10% perf hit)
>> * Uncompromised lowest power
>>
>> we can argue about the exact %ages, but the idea is to give at least some reasonable definition that people can understand,
>> but that also can be measured

> I am quite happy with your definitions above.  It is the same in spirit as what I was trying for, just better stated.
>
> I expect the performance degradation percentages are going to vary a lot depending on what
> techniques are available in the hardware. If we want to generalize this to encompass older
> hardware too (which I think is a good idea), I could see percentages being, say, <3% <10% <20% to
> give more room to work with, and nicer newer hardware being able to do better as your percentages indicate.

I'm quite ok to add other steps... my point was to get an explicit/clear expectation of what a setting means
in a way that you can measure (and thus validate/etc)
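
To illustrate what "a setting you can measure and validate" could look like, here is a minimal sketch in Python. The limits are the example percentages from the dial quoted above; the names (Policy, MAX_PERF_HIT_PERCENT, perf_hit_percent, meets_policy) are hypothetical and come from neither the patch nor any kernel interface. It assumes a benchmark score where higher is better and a baseline taken at uncompromised performance.

```python
from enum import Enum


class Policy(Enum):
    """Steps of the policy dial (names are illustrative only)."""
    PERFORMANCE = 1
    BALANCED_PERFORMANCE = 2
    BALANCED = 3
    BALANCED_POWER = 4
    LOWEST_POWER = 5


# Maximum tolerated performance loss, in percent, relative to the
# uncompromised-performance baseline. None means "no bound".
# The 2.5/5/10 figures are the example values from the thread.
MAX_PERF_HIT_PERCENT = {
    Policy.PERFORMANCE: 0.0,
    Policy.BALANCED_PERFORMANCE: 2.5,
    Policy.BALANCED: 5.0,
    Policy.BALANCED_POWER: 10.0,
    Policy.LOWEST_POWER: None,
}


def perf_hit_percent(baseline_score, measured_score):
    """Performance loss in percent; assumes a higher score is better."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return 100.0 * (baseline_score - measured_score) / baseline_score


def meets_policy(policy, baseline_score, measured_score):
    """True if the measured score stays within the policy's perf budget."""
    limit = MAX_PERF_HIT_PERCENT[policy]
    if limit is None:
        return True
    return perf_hit_percent(baseline_score, measured_score) <= limit
```

With a baseline of 1000 and a measured score of 960 (a 4% hit), the run would pass "Balanced" but fail "Balanced - biased towards performance". A real validation would also need to allow for run-to-run noise, which this sketch ignores.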

>
> On reporting frequency: would it be practical to report some sort of medium-term average frequency,

so there are counters in the cpus about what we ran at, and you do a delta over a time that you pick to get
an average. (if you pick too short a time, say, 100 cycles, obviously the division gives you a mostly noise number due
to quantization, since you are dividing a small number by a small noisy number)
so reporting in hindsight over a reasonable time (say a few dozen milliseconds) is not too hard as
long as you can define a point in the past where you took a measurement
to start the delta from... ideally we don't wake up the cpu to do this.. because then we're wasting power for it :-(
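
The delta calculation described above can be sketched as follows. This is an illustration only: it assumes the counters in question behave like the APERF/MPERF pair on Intel CPUs (one counts at the actual clock, the other at a fixed reference clock, both only while the core is not idle), and the names Snapshot, average_freq_khz and min_mperf_delta are hypothetical. How the snapshots are taken is left out.

```python
from dataclasses import dataclass
from typing import Optional

COUNTER_MASK = (1 << 64) - 1  # counters assumed to be 64 bits wide


@dataclass(frozen=True)
class Snapshot:
    """One reading of the two counters (hypothetical container)."""
    aperf: int  # ticks at the actual running frequency
    mperf: int  # ticks at the fixed reference frequency


def average_freq_khz(start: Snapshot, end: Snapshot, base_khz: int,
                     min_mperf_delta: int = 1_000_000) -> Optional[float]:
    """Average frequency between two snapshots, or None if the window
    is too short for the ratio to be meaningful."""
    # Masking makes the subtraction safe across a counter wraparound.
    d_aperf = (end.aperf - start.aperf) & COUNTER_MASK
    d_mperf = (end.mperf - start.mperf) & COUNTER_MASK
    # Too few reference ticks: dividing two small, quantized numbers
    # would mostly report noise, so refuse to answer.
    if d_mperf < min_mperf_delta:
        return None
    return base_khz * d_aperf / d_mperf
```

For example, with a 2 GHz reference clock, 30 million actual ticks against 20 million reference ticks works out to an average of 3 GHz over the window. The result only covers time the core was actually running, and the threshold of one million reference ticks is an arbitrary placeholder, not a recommendation.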



> or if that is not available, to just report the max freq that the hardware thread is currently eligible to use?

this part is not available at all..... so no we cannot do this.
(well, we do have the maximum the chip can do... but that's a constant number.. might as well report "42")

