xen-devel.lists.xenproject.org archive mirror
* Performance difference between Xen versions
@ 2011-04-29 12:32 Juergen Gross
  2011-04-29 13:28 ` Keir Fraser
  2011-04-29 16:10 ` Jan Beulich
  0 siblings, 2 replies; 28+ messages in thread
From: Juergen Gross @ 2011-04-29 12:32 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com

Hi,

comparing performance of different Xen versions with BS2000 as HVM guest
showed some weird data I'd like to understand.

All measurements were done on an Intel Xeon E7220 box. We ran a disk
benchmark and found that CPU utilization was much higher with Xen 4.0 than
with Xen 3.3. I investigated further and narrowed things down to calls into
the hypervisor (implicit or explicit).

The following table shows timing data for different low-level functions; all
timing values are TSC ticks obtained via rdtsc:

  Xen 3.3    Xen 4.0   Function
       88        165   just the measurement overhead
      176        330   rdtsc instruction + cli/sti
     5896      11044   lapic timer query
     7381      13519   setting lapic timer
     4653       8987   reload of cr3
     3124       5709   invlpg instruction
   792253     792264   wbinvd instruction
      748       1375   int + iret
     5203       9317   hypervisor yield call
 12598102   12597882   memory access loop

All operations involving the hypervisor take nearly twice the time on 4.0.
Operations not involving the hypervisor (wbinvd and memory access loop) are
the same on both systems (this rules out the possibility of different rdtsc
behavior).
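
As a quick cross-check of the "nearly twice" statement, the ratios can be
computed directly from the table. This is only an illustrative sketch in
Python; the names TIMINGS, slowdown and report are made up for this example
and are not part of the measurement code.

```python
# Timing values in TSC ticks, copied from the table:
# (Xen 3.3, Xen 4.0, function)
TIMINGS = [
    (88, 165, "just the measurement overhead"),
    (176, 330, "rdtsc instruction + cli/sti"),
    (5896, 11044, "lapic timer query"),
    (7381, 13519, "setting lapic timer"),
    (4653, 8987, "reload of cr3"),
    (3124, 5709, "invlpg instruction"),
    (792253, 792264, "wbinvd instruction"),
    (748, 1375, "int + iret"),
    (5203, 9317, "hypervisor yield call"),
    (12598102, 12597882, "memory access loop"),
]


def slowdown(xen33, xen40):
    """Return how many times slower Xen 4.0 is than Xen 3.3."""
    return xen40 / xen33


def report(timings=TIMINGS):
    """Return one formatted line per function with its 4.0/3.3 ratio."""
    return [f"{slowdown(a, b):5.2f}x  {name}" for a, b, name in timings]


if __name__ == "__main__":
    print("\n".join(report()))
```

With these numbers, every row except wbinvd and the memory access loop comes
out between roughly 1.8x and 1.9x, while those two rows stay at about 1.0x.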

Is there any easy explanation for this? Both Xen versions are from SLES
(SLES11 or SLES11 SP1).


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html


* Re: Performance difference between Xen versions
  2011-04-29 12:32 Performance difference between Xen versions Juergen Gross
@ 2011-04-29 13:28 ` Keir Fraser
  2011-04-29 13:35   ` Juergen Gross
  2011-04-29 16:10 ` Jan Beulich
  1 sibling, 1 reply; 28+ messages in thread
From: Keir Fraser @ 2011-04-29 13:28 UTC (permalink / raw)
  To: Juergen Gross, xen-devel@lists.xensource.com

Are you sure TSC runs at the same rate in the guest on both hypervisor
versions? Xen 4.0 might trap and emulate a more consistent but slower rate
TSC by default. 'tsc_mode=2' in your domain config file on 4.0 might be a
quick fix.

 -- Keir


* Re: Performance difference between Xen versions
  2011-04-29 13:28 ` Keir Fraser
@ 2011-04-29 13:35   ` Juergen Gross
  2011-04-29 14:58     ` Keir Fraser
  0 siblings, 1 reply; 28+ messages in thread
From: Juergen Gross @ 2011-04-29 13:35 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel@lists.xensource.com

On 04/29/11 15:28, Keir Fraser wrote:
> Are you sure TSC runs at the same rate in the guest on both hypervisor
> versions? Xen 4.0 might trap and emulate a more consistent but slower rate
> TSC by default. 'tsc_mode=2' in your domain config file on 4.0 might be a
> quick fix.
Already done :-), so yes, I am sure the tsc rate is the same. The debug key
's' (softTSC stats) shows that no tsc is emulated.

BTW: different tsc rate is improbable as the memory access loop shows
nearly the same tsc difference...

Juergen


* Re: Performance difference between Xen versions
  2011-04-29 13:35   ` Juergen Gross
@ 2011-04-29 14:58     ` Keir Fraser
  0 siblings, 0 replies; 28+ messages in thread
From: Keir Fraser @ 2011-04-29 14:58 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

On 29/04/2011 14:35, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> On 04/29/11 15:28, Keir Fraser wrote:
>> Are you sure TSC runs at the same rate in the guest on both hypervisor
>> versions? Xen 4.0 might trap and emulate a more consistent but slower rate
>> TSC by default. 'tsc_mode=2' in your domain config file on 4.0 might be a
>> quick fix.
> Already done :-), so yes, I am sure the tsc rate is the same. The debug key
> 's' (softTSC stats) shows that no tsc is emulated.
> 
> BTW: different tsc rate is improbable as the memory access loop shows
> nearly the same tsc difference...

Then I'm not sure. Maybe something got added to the VMEXIT/VMENTRY path that
is unexpectedly slow. You'll have to do a bit of digging.

 -- Keir


* Re: Performance difference between Xen versions
  2011-04-29 12:32 Performance difference between Xen versions Juergen Gross
  2011-04-29 13:28 ` Keir Fraser
@ 2011-04-29 16:10 ` Jan Beulich
  2011-05-02  5:31   ` Juergen Gross
  1 sibling, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2011-04-29 16:10 UTC (permalink / raw)
  To: Juergen Gross; +Cc: xen-devel@lists.xensource.com

>>> On 29.04.11 at 14:32, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
> Is there any easy explanation for this? Both Xen versions are from SLES
> (SLES11 or SLES11 SP1).

I think cpufreq handling was off by default in 3.3, and is on by
default on 4.0. Try turning this off, or using the performance
governor.


* Re: Performance difference between Xen versions
  2011-04-29 16:10 ` Jan Beulich
@ 2011-05-02  5:31   ` Juergen Gross
  2011-05-02  6:41     ` Keir Fraser
  0 siblings, 1 reply; 28+ messages in thread
From: Juergen Gross @ 2011-05-02  5:31 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel@lists.xensource.com

On 04/29/11 18:10, Jan Beulich wrote:
>>>> On 29.04.11 at 14:32, Juergen Gross<juergen.gross@ts.fujitsu.com>  wrote:
>> Is there any easy explanation for this? Both Xen versions are from SLES
>> (SLES11 or SLES11 SP1).
> I think cpufreq handling was off by default in 3.3, and is on by
> default on 4.0. Try turning this off, or using the performance
> governor.
Jan, you got it! With cpufreq=none Xen 4.0 has more or less the same numbers
as 3.3. Now I wonder why the default is so much slower. It looks as if the
hypervisor would run at a lower speed. I can't believe it should behave like that!


Juergen


* Re: Performance difference between Xen versions
  2011-05-02  5:31   ` Juergen Gross
@ 2011-05-02  6:41     ` Keir Fraser
  2011-05-02  7:23       ` Jan Beulich
  0 siblings, 1 reply; 28+ messages in thread
From: Keir Fraser @ 2011-05-02  6:41 UTC (permalink / raw)
  To: Juergen Gross, Jan Beulich; +Cc: xen-devel@lists.xensource.com

On 02/05/2011 06:31, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

>>> Is there any easy explanation for this? Both Xen versions are from SLES
>>> (SLES11 or SLES11 SP1).
>> I think cpufreq handling was off by default in 3.3, and is on by
>> default on 4.0. Try turning this off, or using the performance
>> governor.
> Jan, you got it! With cpufreq=none Xen 4.0 has more or less the same numbers
> as 3.3. Now I wonder why the default is so much slower. It looks as if the
> hypervisor would run at a lower speed. I can't believe it should behave like
> that!

It runs at lower frequency unless your test offers sufficient load over a
long enough time period. Short microbenchmarks are probably finished before
the frequency governor can react.

 -- Keir


* Re: Performance difference between Xen versions
  2011-05-02  6:41     ` Keir Fraser
@ 2011-05-02  7:23       ` Jan Beulich
  2011-05-02  8:00         ` Juergen Gross
                           ` (2 more replies)
  0 siblings, 3 replies; 28+ messages in thread
From: Jan Beulich @ 2011-05-02  7:23 UTC (permalink / raw)
  To: Keir Fraser, Juergen Gross; +Cc: xen-devel@lists.xensource.com

>>> On 02.05.11 at 08:41, Keir Fraser <keir.xen@gmail.com> wrote:
> On 02/05/2011 06:31, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
>>>> Is there any easy explanation for this? Both Xen versions are from SLES
>>>> (SLES11 or SLES11 SP1).
>>> I think cpufreq handling was off by default in 3.3, and is on by
>>> default on 4.0. Try turning this off, or using the performance
>>> governor.
>> Jan, you got it! With cpufreq=none Xen 4.0 has more or less the same numbers
>> as 3.3. Now I wonder why the default is so much slower. It looks as if the
>> hypervisor would run at a lower speed. I can't believe it should behave like
>> that!
> 
> It runs at lower frequency unless your test offers sufficient load over a
> long enough time period. Short microbenchmarks are probably finished before
> the frequency governor can react.

Correct. I generally found the default threshold of the ondemand
governor not very suitable for optimal performance of short-lived
jobs, and boot all of my systems with "cpufreq=xen:ondemand,threshold=20".
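
To illustrate why the threshold matters, here is a deliberately simplified
toy model in Python, not the actual governor code. It assumes only two
frequencies (1600 and 2268 MHz, chosen for illustration), a default
up_threshold of 80, and a load that does not depend on the current
frequency; the function name simulate and its parameters are hypothetical.

```python
def simulate(busy_fraction, windows, up_threshold, f_min=1600, f_max=2268):
    """Toy ondemand model: return the frequency (MHz) used in each window.

    The first sampling window always runs at f_min because the governor
    has not sampled any load yet. After each window the governor compares
    the load seen in that window (in percent) with up_threshold and picks
    the frequency for the next window.
    """
    freqs = []
    freq = f_min
    for _ in range(windows):
        freqs.append(freq)
        load_percent = busy_fraction * 100
        # Assumed rule: jump to f_max above the threshold, else use f_min.
        freq = f_max if load_percent > up_threshold else f_min
    return freqs


if __name__ == "__main__":
    print(simulate(0.5, 4, up_threshold=80))  # 50% load, assumed default
    print(simulate(0.5, 4, up_threshold=20))  # 50% load, threshold=20
    print(simulate(1.0, 1, up_threshold=80))  # job shorter than one window
```

In this model a job that is busy only half of the time never leaves the
lowest frequency with a threshold of 80, but is sped up from the second
window on with a threshold of 20. A job that finishes within the first
window runs at the lowest frequency in either case.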

Jan


* Re: Performance difference between Xen versions
  2011-05-02  7:23       ` Jan Beulich
@ 2011-05-02  8:00         ` Juergen Gross
  2011-05-02  8:15           ` Jan Beulich
  2011-05-02 17:52         ` John Weekes
       [not found]         ` <4DBF13BB.3000309@nuclearfallout.net>
  2 siblings, 1 reply; 28+ messages in thread
From: Juergen Gross @ 2011-05-02  8:00 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir Fraser, xen-devel@lists.xensource.com

On 05/02/11 09:23, Jan Beulich wrote:
>>>> On 02.05.11 at 08:41, Keir Fraser<keir.xen@gmail.com>  wrote:
>> On 02/05/2011 06:31, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>>
>>>>> Is there any easy explanation for this? Both Xen versions are from SLES
>>>>> (SLES11 or SLES11 SP1).
>>>> I think cpufreq handling was off by default in 3.3, and is on by
>>>> default on 4.0. Try turning this off, or using the performance
>>>> governor.
>>> Jan, you got it! With cpufreq=none Xen 4.0 has more or less the same numbers
>>> as 3.3. Now I wonder why the default is so much slower. It looks as if the
>>> hypervisor would run at a lower speed. I can't believe it should behave like
>>> that!
>> It runs at lower frequency unless your test offers sufficient load over a
>> long enough time period. Short microbenchmarks are probably finished before
>> the frequency governor can react.
> Correct. I generally found the default threshold of the ondemand
> governor not very suitable for optimal performance of short-lived
> jobs, and boot all of my systems with "cpufreq=xen:ondemand,threshold=20".

Thanks, Keir and Jan! You both helped me a lot!

I think the short term solution for our problem is to disable the cpufreq
governor on our BS2000 machines.

In the long run I'd like to make the cpufreq governor a feature of the
cpupool. This would enable an administrator of a large Xen machine
with a heterogeneous load to specify which domains should run at
full speed and which are allowed to save energy at the cost of latency.

What do you think?


Juergen


* Re: Performance difference between Xen versions
  2011-05-02  8:00         ` Juergen Gross
@ 2011-05-02  8:15           ` Jan Beulich
  2011-05-02  8:23             ` Juergen Gross
  0 siblings, 1 reply; 28+ messages in thread
From: Jan Beulich @ 2011-05-02  8:15 UTC (permalink / raw)
  To: Juergen Gross; +Cc: Keir Fraser, xen-devel@lists.xensource.com

>>> On 02.05.11 at 10:00, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
> In the long run I'd like to make the cpufreq governor a feature of the
> cpupool. This would enable an administrator of a large Xen machine
> with a heterogeneous load to specify which domains should run at
> full speed and which are allowed to save energy at the cost of latency.
> 
> What do you think?

Certainly an interesting idea; the question is what an implementation
of this would look like.

Jan


* Re: Performance difference between Xen versions
  2011-05-02  8:15           ` Jan Beulich
@ 2011-05-02  8:23             ` Juergen Gross
  2011-05-02  8:49               ` Keir Fraser
  0 siblings, 1 reply; 28+ messages in thread
From: Juergen Gross @ 2011-05-02  8:23 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Keir Fraser, xen-devel@lists.xensource.com

On 05/02/11 10:15, Jan Beulich wrote:
>>>> On 02.05.11 at 10:00, Juergen Gross<juergen.gross@ts.fujitsu.com>  wrote:
>> In the long run I'd like to make the cpufreq governor a feature of the
>> cpupool. This would enable an administrator of a large Xen machine
>> with a heterogeneous load to specify which domains should run at
>> full speed and which are allowed to save energy at the cost of latency.
>>
>> What do you think?
> Certainly an interesting idea; the question is what an implementation
> of this would look like.

Let me do some research work first :-)
I hope to make a proposal soon.


Juergen


* Re: Performance difference between Xen versions
  2011-05-02  8:23             ` Juergen Gross
@ 2011-05-02  8:49               ` Keir Fraser
  2011-05-03  3:06                 ` Tian, Kevin
  0 siblings, 1 reply; 28+ messages in thread
From: Keir Fraser @ 2011-05-02  8:49 UTC (permalink / raw)
  To: Juergen Gross, Jan Beulich; +Cc: Keir Fraser, xen-devel@lists.xensource.com

On 02/05/2011 09:23, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:

> On 05/02/11 10:15, Jan Beulich wrote:
>>>>> On 02.05.11 at 10:00, Juergen Gross<juergen.gross@ts.fujitsu.com>  wrote:
>>> In the long run I'd like to make the cpufreq governor a feature of the
>>> cpupool. This would enable an administrator of a large Xen machine
>>> with a heterogeneous load to specify which domains should run at
>>> full speed and which are allowed to save energy at the cost of latency.
>>> 
>>> What do you think?
>> Certainly an interesting idea; the question is what an implementation
>> of this would look like.
> 
> Let me do some research work first :-)
> I hope to make a proposal soon.

I think it's a good idea, and it should be quite possible to implement
cleanly.

 -- Keir


* Re: Performance difference between Xen versions
  2011-05-02  7:23       ` Jan Beulich
  2011-05-02  8:00         ` Juergen Gross
@ 2011-05-02 17:52         ` John Weekes
  2011-05-02 18:12           ` Konrad Rzeszutek Wilk
       [not found]         ` <4DBF13BB.3000309@nuclearfallout.net>
  2 siblings, 1 reply; 28+ messages in thread
From: John Weekes @ 2011-05-02 17:52 UTC (permalink / raw)
  To: xen-devel@lists.xensource.com; +Cc: mark.langsdorf, winston.l.wang, gang.wei

On 5/2/2011 12:23 AM, Jan Beulich wrote:
> Correct. I generally found the default threshold of the ondemand
> governor not very suitable for optimal performance of short-lived
> jobs, and boot all of my systems with "cpufreq=xen:ondemand,threshold=20".

These pm comments made me wonder about turbo mode, which I've never seen 
working, and the fact that xenpm doesn't work for me either (for 
instance, trying to turn on turbo with it causes Xen to freeze). So, I 
started digging a bit.

I'm testing with 4.1. I started by setting my line to include the one 
that you gave as an example, but adding ",verbose=1" to the end in order 
to see more output. Strangely, I didn't see any, and turbo mode was 
still not being set (and frequencies weren't changing).

I added some further debug code and found that cpufreq_add_cpu was 
aborting because of its "if (!processor_pminfo[cpu])" check at the 
beginning. I can't find where processor_pminfo[cpu] would be set 
anywhere but in the set_px_pminfo hypercall (via copying), and I can't 
find a caller of that function anywhere in the Xen source or 
2.6.32-stable kernel source. I do see it mentioned in the old 2.6.18. Is 
this something that has yet to be ported to pv_ops, and are there plans 
to do so? Is there also the possibility of initializing it internally 
without dom0 interaction, when "xen" is chosen as the cpufreq scheduler?

Or am I just missing something entirely here?

-John


* Re: Performance difference between Xen versions
  2011-05-02 17:52         ` John Weekes
@ 2011-05-02 18:12           ` Konrad Rzeszutek Wilk
  2011-05-02 18:43             ` John Weekes
  0 siblings, 1 reply; 28+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-05-02 18:12 UTC (permalink / raw)
  To: John Weekes
  Cc: winston.l.wang, xen-devel@lists.xensource.com, mark.langsdorf,
	gang.wei

On Mon, May 02, 2011 at 10:52:45AM -0700, John Weekes wrote:
> On 5/2/2011 12:23 AM, Jan Beulich wrote:
> >Correct. I generally found the default threshold of the ondemand
> >governor not very suitable for optimal performance of short-lived
> >jobs, and boot all of my systems with "cpufreq=xen:ondemand,threshold=20".
> 
> These pm comments made me wonder about turbo mode, which I've never
> seen working, and the fact that xenpm doesn't work for me either
> (for instance, trying to turn on turbo with it causes Xen to
> freeze). So, I started digging a bit.
> 
> I'm testing with 4.1. I started by setting my line to include the
> one that you gave as an example, but adding ",verbose=1" to the end
> in order to see more output. Strangely, I didn't see any, and turbo
> mode was still not being set (and frequencies weren't changing).
> 
> I added some further debug code and found that cpufreq_add_cpu was
> aborting because of its "if (!processor_pminfo[cpu])" check at the
> beginning. I can't find where processor_pminfo[cpu] would be set
> anywhere but in the set_px_pminfo hypercall (via copying), and I
> can't find a caller of that function anywhere in the Xen source or
> 2.6.32-stable kernel source. I do see it mentioned in the old

Oh? I think git commit a3ca5a20ec9d5c4917271021d49768961e7a8421
Author: Yu Ke <ke.yu@intel.com>
Date:   Thu Apr 28 09:50:55 2011 -0400

    xen/acpi: add xen acpi processor driver
    
    Xen hypervisor need parsed acpi processor info for CPU Cx/Px power management,
    so this patch introduces xen acpi processor driver to parse the acpi info,
    and notify the hypervisor upon receiving the info.
    
    This patch has two components:
     - driver/acpi/processor_xen.c: implement the xen acpi processor driver
     - drivers/xen/acpi_processor.c: provide the interface to notify Xen hypervisor
    
adds it in the 2.6.32 tree?


* Re: Performance difference between Xen versions
  2011-05-02 18:12           ` Konrad Rzeszutek Wilk
@ 2011-05-02 18:43             ` John Weekes
  2011-05-02 19:16               ` John Weekes
  0 siblings, 1 reply; 28+ messages in thread
From: John Weekes @ 2011-05-02 18:43 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: mark.langsdorf, xen-devel@lists.xensource.com, winston.l.wang,
	gang.wei

On 5/2/2011 11:12 AM, Konrad Rzeszutek Wilk wrote:
> On Mon, May 02, 2011 at 10:52:45AM -0700, John Weekes wrote:
>> Or am I just missing something entirely here?
> Oh? I think git commit a3ca5a20ec9d5c4917271021d49768961e7a8421
> Author: Yu Ke<ke.yu@intel.com>
> Date:   Thu Apr 28 09:50:55 2011 -0400
>
>      xen/acpi: add xen acpi processor driver
>
>      Xen hypervisor need parsed acpi processor info for CPU Cx/Px power management,
>      so this patch introduces xen acpi processor driver to parse the acpi info,
>      and notify the hypervisor upon receiving the info.
>
>      This patch has two components:
>       - driver/acpi/processor_xen.c: implement the xen acpi processor driver
>       - drivers/xen/acpi_processor.c: provide the interface to notify Xen hypervisor
>
> adds it in the 2.6.32 tree?

Thanks, Konrad.

It looks like the commit in question is actually this one, from March 
2010: 
http://git.kernel.org/?p=linux/kernel/git/jeremy/xen.git;a=commit;h=7529008534a694c5784dfb11aa46184e33adc227. 
My search didn't find it because the files don't use the string 
"px_pminfo" anywhere.

The important thing that I missed on my end was not having the ACPI 
processor driver selected (for some reason). I had cpufreq and ACPI 
enabled, but I needed that, as well.

-John


* Re: Performance difference between Xen versions
  2011-05-02 18:43             ` John Weekes
@ 2011-05-02 19:16               ` John Weekes
  2011-05-02 19:36                 ` Konrad Rzeszutek Wilk
  2011-05-03  3:04                 ` Tian, Kevin
  0 siblings, 2 replies; 28+ messages in thread
From: John Weekes @ 2011-05-02 19:16 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: winston.l.wang, xen-devel@lists.xensource.com, mark.langsdorf,
	gang.wei

On 5/2/2011 11:43 AM, John Weekes wrote:
>
> The important thing that I missed on my end was not having the ACPI 
> processor driver selected (for some reason). I had cpufreq and ACPI 
> enabled, but I needed that, as well. 

cpufreq seems to be working now, as is xenpm (and using xenpm is much 
easier than setting the Xen command line), but I'm still not seeing 
signs that turbo mode is bumping up my CPU speed beyond the standard 
value, as I would expect it to.

Here's what it looks like when I start a single process that spins and 
gobbles down a core:

# xenpm get-cpufreq-states | grep current
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 2268 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
current frequency    : 1600 MHz
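
For machines with many cores, a listing like this is easier to read when
summarized. The following is a small sketch in Python that only relies on
the "current frequency    : N MHz" line format shown in this output; the
names FREQ_LINE and count_frequencies are made up for this example and are
not part of xenpm.

```python
import re
import sys
from collections import Counter

# Matches lines such as "current frequency    : 1600 MHz".
FREQ_LINE = re.compile(r"current frequency\s*:\s*(\d+)\s*MHz")


def count_frequencies(xenpm_output):
    """Count how many CPUs report each 'current frequency' value (MHz)."""
    return Counter(int(m.group(1)) for m in FREQ_LINE.finditer(xenpm_output))


if __name__ == "__main__":
    # Read the xenpm output from standard input and print a summary.
    for mhz, cpus in sorted(count_frequencies(sys.stdin.read()).items()):
        print(f"{mhz} MHz: {cpus} CPU(s)")
```

It could be fed with "xenpm get-cpufreq-states | python3 SCRIPT", where
SCRIPT is whatever file the sketch is saved as. For the 24 lines of this
listing it would count 23 CPUs at 1600 MHz and one at 2268 MHz.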

And looking at the core when running at the higher speed, I see:

# xenpm get-cpufreq-para 5
cpu id               : 5
affected_cpus        : 5
cpuinfo frequency    : max [2268000] min [1600000] cur [2268000]
scaling_driver       : acpi-cpufreq
scaling_avail_gov    : userspace performance powersave ondemand
current_governor     : ondemand
   ondemand specific  :
     sampling_rate    : max [10000000] min [10000] cur [20000]
     up_threshold     : 80
scaling_avail_freq   : *2268000 2267000 2133000 2000000 1867000 1733000 1600000
scaling frequency    : max [2268000] min [1600000] cur [2268000]
turbo mode           : enabled

Does it do it silently? If so, how can I see the true frequency?

-John

* Re: Performance difference between Xen versions
  2011-05-02 19:16               ` John Weekes
@ 2011-05-02 19:36                 ` Konrad Rzeszutek Wilk
  2011-05-02 19:54                   ` John Weekes
  2011-05-03  2:16                   ` Tian, Kevin
  2011-05-03  3:04                 ` Tian, Kevin
  1 sibling, 2 replies; 28+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-05-02 19:36 UTC (permalink / raw)
  To: John Weekes, ke.yu, kevin.tian
  Cc: mark.langsdorf, xen-devel@lists.xensource.com, winston.l.wang,
	gang.wei

> # xenpm get-cpufreq-para 5
> cpu id               : 5
> affected_cpus        : 5
> cpuinfo frequency    : max [2268000] min [1600000] cur [2268000]
> scaling_driver       : acpi-cpufreq
> scaling_avail_gov    : userspace performance powersave ondemand
> current_governor     : ondemand
>   ondemand specific  :
>     sampling_rate    : max [10000000] min [10000] cur [20000]
>     up_threshold     : 80
> scaling_avail_freq   : *2268000 2267000 2133000 2000000 1867000 1733000 1600000
> scaling frequency    : max [2268000] min [1600000] cur [2268000]
> turbo mode           : enabled
> 
> Does it do it silently? If so, how can I see the true frequency?

<looks around> You are asking me I presume?

Ummm, no idea. I would actually email the authors of those patches (CC-ed here).

* Re: Performance difference between Xen versions
  2011-05-02 19:36                 ` Konrad Rzeszutek Wilk
@ 2011-05-02 19:54                   ` John Weekes
  2011-05-03  2:16                   ` Tian, Kevin
  1 sibling, 0 replies; 28+ messages in thread
From: John Weekes @ 2011-05-02 19:54 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: kevin.tian, xen-devel@lists.xensource.com, winston.l.wang,
	mark.langsdorf, ke.yu, gang.wei

On 5/2/2011 12:36 PM, Konrad Rzeszutek Wilk wrote:
> <looks around>  You are asking me I presume?
>
> Ummm, no idea. I would actually email the authors of those patches (CC-ed here).
Not asking you specifically, no. I just did a reply-all. I would imagine 
that the maintainers of the cpufreq code who I originally CC'd when 
sending to the list would know best, but I might just be missing 
something obvious again..

-John

* RE: Performance difference between Xen versions
  2011-05-02 19:36                 ` Konrad Rzeszutek Wilk
  2011-05-02 19:54                   ` John Weekes
@ 2011-05-03  2:16                   ` Tian, Kevin
  1 sibling, 0 replies; 28+ messages in thread
From: Tian, Kevin @ 2011-05-03  2:16 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, John Weekes, Yu, Ke
  Cc: mark.langsdorf@amd.com, xen-devel@lists.xensource.com,
	Wang, Winston L, Wei, Gang

> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: Tuesday, May 03, 2011 3:36 AM
> 
> > # xenpm get-cpufreq-para 5
> > cpu id               : 5
> > affected_cpus        : 5
> > cpuinfo frequency    : max [2268000] min [1600000] cur [2268000]
> > scaling_driver       : acpi-cpufreq
> > scaling_avail_gov    : userspace performance powersave ondemand
> > current_governor     : ondemand
> >   ondemand specific  :
> >     sampling_rate    : max [10000000] min [10000] cur [20000]
> >     up_threshold     : 80
> > scaling_avail_freq   : *2268000 2267000 2133000 2000000 1867000 1733000 1600000
> > scaling frequency    : max [2268000] min [1600000] cur [2268000]
> > turbo mode           : enabled
> >
> > Does it do it silently? If so, how can I see the true frequency?
> 
> <looks around> You are asking me I presume?
> 
> Ummm, no idea. I would actually email the authors of those patches (CC-ed
> here).

The actual scaling governor runs in the Xen hypervisor. The default governor is
ondemand, which means that Xen periodically (e.g. at a 20ms interval) rescales
the frequency based on a workload heuristic.

scaling_avail_freq lists the frequency steps available on the current cpu, and
the one marked with "*" is the current freq. But since this call is asynchronous
to the ongoing ondemand decisions, a single call probably won't show the latest
value. If you invoke the call multiple times and average the sampled freq, the
result should be close.

Or you can use "xenpm set-scaling-governor" to choose a different governor, such
as "userspace", if you want to deploy your own policy from userland.
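
The averaging approach described above can be sketched in Python. This is only
a sketch under assumptions: the helper names (parse_cur_khz, average_khz,
sample_cpu) are hypothetical, the regular expression matches the
"scaling frequency" line exactly as it appears in the output quoted in this
thread, and sample_cpu assumes that xenpm is on the PATH and is run as root in
dom0.

```python
import re
import subprocess
import time

# Matches the "scaling frequency" line in the format quoted in this thread.
_CUR_RE = re.compile(r"^scaling frequency\s*:.*cur \[(\d+)\]", re.MULTILINE)


def parse_cur_khz(output):
    """Return the 'cur' value (in kHz) from one get-cpufreq-para output."""
    match = _CUR_RE.search(output)
    if match is None:
        raise ValueError("no 'scaling frequency' line found")
    return int(match.group(1))


def average_khz(outputs):
    """Average the current frequency over several captured outputs."""
    values = [parse_cur_khz(text) for text in outputs]
    return sum(values) / len(values)


def sample_cpu(cpu, count=50, interval=0.02):
    """Capture `count` outputs of `xenpm get-cpufreq-para <cpu>`.

    Assumption: must be run as root in dom0; count and interval are
    arbitrary illustrative values.
    """
    outputs = []
    for _ in range(count):
        result = subprocess.run(
            ["xenpm", "get-cpufreq-para", str(cpu)],
            capture_output=True, text=True, check=True,
        )
        outputs.append(result.stdout)
        time.sleep(interval)
    return outputs
```

For example, average_khz(sample_cpu(5)) would give a rough average for cpu 5
over about one second. Each sample is still only the value Xen last requested,
not what the CPU actually delivered.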

Thanks
Kevin

* RE: Performance difference between Xen versions
  2011-05-02 19:16               ` John Weekes
  2011-05-02 19:36                 ` Konrad Rzeszutek Wilk
@ 2011-05-03  3:04                 ` Tian, Kevin
  2011-05-03  3:39                   ` John Weekes
  1 sibling, 1 reply; 28+ messages in thread
From: Tian, Kevin @ 2011-05-03  3:04 UTC (permalink / raw)
  To: John Weekes, Konrad Rzeszutek Wilk
  Cc: mark.langsdorf@amd.com, xen-devel@lists.xensource.com,
	Wang, Winston L, Wei, Gang

> From: John Weekes
> Sent: Tuesday, May 03, 2011 3:16 AM
> 
> On 5/2/2011 11:43 AM, John Weekes wrote:
> >
> > The important thing that I missed on my end was not having the ACPI
> > processor driver selected (for some reason). I had cpufreq and ACPI
> > enabled, but I needed that, as well.
> 
> cpufreq seems to be working now, as is xenpm (and using xenpm is much easier
> than setting the Xen command line), but I'm still not seeing signs that turbo
> mode is bumping up my CPU speed beyond the standard value, as I would
> expect it to.

You won't know the exact frequency reached in turbo mode, as that is all handled
by the CPU itself. All Xen can do is tell the CPU that it is OK to enter turbo
mode, which is the 2268000 entry (1 MHz higher than the normal P0). The CPU then
decides whether the current core can be overclocked based on various conditions,
such as TDP, activity of other cores in the same package, ...

One way to verify that turbo mode works is to run a CPU-intensive workload
on one core while keeping the other cores mostly idle. Set the cpufreq governor
to performance, then compare your benchmark results with BIOS turbo mode on
and off. This should give you some feeling for whether turbo mode works on your
platform.
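
A minimal sketch of a single-threaded, CPU-bound benchmark for this kind of
comparison, in Python. The function names and iteration counts are
illustrative assumptions, and any fixed CPU-bound workload would serve equally
well.

```python
import time


def spin(iterations):
    """Fixed amount of integer work that keeps one core busy."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def best_time(iterations=5_000_000, repeats=5):
    """Return the shortest wall-clock time (seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        spin(iterations)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"best of 5 runs: {best_time():.3f} s")
```

Run it once with BIOS turbo mode enabled and once with it disabled, with the
performance governor selected and the other cores idle. A consistently shorter
best time with turbo enabled would suggest that the boost is taking effect.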

Thanks
Kevin

> 
> Here's what it looks like when I start a single process that spins and gobbles
> down a core:
> 
> # xenpm get-cpufreq-states | grep current
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 2268 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> current frequency    : 1600 MHz
> 
> And looking at the core when running at the higher speed, I see:
> 
> # xenpm get-cpufreq-para 5
> cpu id               : 5
> affected_cpus        : 5
> cpuinfo frequency    : max [2268000] min [1600000] cur [2268000]
> scaling_driver       : acpi-cpufreq
> scaling_avail_gov    : userspace performance powersave ondemand
> current_governor     : ondemand
>    ondemand specific  :
>      sampling_rate    : max [10000000] min [10000] cur [20000]
>      up_threshold     : 80
> scaling_avail_freq   : *2268000 2267000 2133000 2000000 1867000 1733000 1600000
> scaling frequency    : max [2268000] min [1600000] cur [2268000]
> turbo mode           : enabled
> 
> Does it do it silently? If so, how can I see the true frequency?
> 
> -John
> 

* RE: Performance difference between Xen versions
  2011-05-02  8:49               ` Keir Fraser
@ 2011-05-03  3:06                 ` Tian, Kevin
  2011-05-06 13:49                   ` Juergen Gross
  0 siblings, 1 reply; 28+ messages in thread
From: Tian, Kevin @ 2011-05-03  3:06 UTC (permalink / raw)
  To: Keir Fraser, Juergen Gross, Jan Beulich
  Cc: Keir Fraser, xen-devel@lists.xensource.com

> From: Keir Fraser
> Sent: Monday, May 02, 2011 4:49 PM
> 
> On 02/05/2011 09:23, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:
> 
> > On 05/02/11 10:15, Jan Beulich wrote:
> >>>>> On 02.05.11 at 10:00, Juergen Gross<juergen.gross@ts.fujitsu.com> wrote:
> >>> On the long run I'd like to make the cpufreq governor a feature of
> >>> the cpupool. This would enable an administrator of a large Xen
> >>> machine with a heterogeneous load to specify which domains should
> >>> run at full speed and which are allowed to save energy at the cost of
> >>> latency.
> >>>
> >>> What do you think?
> >> Certainly an interesting idea, with the question of how an
> >> implementation of this would look like.
> >
> > Let me do some research work first :-) I hope to make a proposal soon.
> 
> I think it's a good idea, and it should be quite possible to implement cleanly.
> 

Yes, this is a good direction. There have actually been several papers on this
topic before. Basically it's a reasonable choice to combine higher-level knowledge
with VMM heuristics, since in virtualization or cloud we usually have an intelligent
stack which already needs to understand many high-level requirements/characteristics. :-)

Thanks
Kevin

* Re: Performance difference between Xen versions
  2011-05-03  3:04                 ` Tian, Kevin
@ 2011-05-03  3:39                   ` John Weekes
  2011-05-03  7:23                     ` Tian, Kevin
  0 siblings, 1 reply; 28+ messages in thread
From: John Weekes @ 2011-05-03  3:39 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: mark.langsdorf@amd.com, xen-devel@lists.xensource.com,
	Wang, Winston L, Wei, Gang, Konrad Rzeszutek Wilk

On 5/2/2011 8:04 PM, Tian, Kevin wrote:
>
> you won't know the exact frequency bumped up in the turbo mode, as it's all handled
> by the CPU itself. what xen can do is just to tell the cpu now I'm OK to enter turbo
> mode, which is 2268000 (1M higher than normal P0). Then CPU will decide whether
> current code can be overclocked based on various conditions, such as TDP, other
> core activities in the same package, ...
>
> One possibility to verify that turbo mode does work is to run a CPU intensive workload
> on one core, while keeping other cores mostly idle. Then choose cpufreq governor
> to be performance, and then compare your benchmark when BIOS turbo mode is
> on/off. This should give you some feeling whether turbo mode works on your platform.

Thanks for the response, Kevin.

It's good to know that I can check turbo by looking to see if it's in 
the 1 MHz-higher mode. It's a little strange to me that the true MHz 
level of the turbo wouldn't be known/shown, but I can live with that, as 
the actual performance is what counts. I'll run those benches.

-John

* Re: Performance difference between Xen versions
       [not found]         ` <4DBF13BB.3000309@nuclearfallout.net>
@ 2011-05-03  7:23           ` Jan Beulich
  0 siblings, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2011-05-03  7:23 UTC (permalink / raw)
  To: John Weekes; +Cc: xen-devel@lists.xensource.com

>>> On 02.05.11 at 22:27, John Weekes <lists.xen@nuclearfallout.net> wrote:
> On 5/2/2011 12:23 AM, Jan Beulich wrote:
>> Correct. I generally found the default threshold of the ondemand
>> governor not very suitable for optimal performance of short lived
>> jobs, and boot all of my systems with "cpufreq=xen:ondemand,threshold=20".
> 
> Just sending you a quick note here to let you know that "threshold=20" 
> apparently isn't a valid option in 4.1. "up_threshold=20" is what's 
> needed there, instead. It looks like this was changed in 2009: 
> http://xenbits.xensource.com/hg/staging/xen-4.1-testing.hg/rev/ce391986ce35 
> 
> The threshold can also be changed with "xenpm set-up-threshold" at runtime.

Indeed, thanks for pointing that out. I must not have updated my
boot loader settings in that respect for a very long time... And
Xen should probably warn about unrecognized options (preparing
a patch as I write this).
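
To illustrate the kind of check meant here, a small Python sketch that flags
unrecognized tokens in a "cpufreq=" value. It is not Xen's parser: the
function name is hypothetical, the governor names are taken from the
scaling_avail_gov output quoted in this thread, and KNOWN_OPTIONS contains
only the one option name mentioned here, so it is deliberately incomplete.

```python
# Governor names as listed by scaling_avail_gov in this thread.
KNOWN_GOVERNORS = {"userspace", "performance", "powersave", "ondemand"}
# Assumption: illustrative only, not the hypervisor's full option list.
KNOWN_OPTIONS = {"up_threshold"}


def unrecognized_cpufreq_options(value):
    """Return tokens of 'xen:<governor>,<opt>=<val>,...' that are not known."""
    _, _, rest = value.partition(":")
    tokens = [token for token in rest.split(",") if token]
    unknown = []
    for index, token in enumerate(tokens):
        key = token.split("=", 1)[0]
        if index == 0 and key in KNOWN_GOVERNORS:
            continue  # first token names the governor
        if key not in KNOWN_OPTIONS:
            unknown.append(token)
    return unknown


for token in unrecognized_cpufreq_options("xen:ondemand,threshold=20"):
    print(f"warning: unrecognized cpufreq option '{token}'")
```

With the old spelling the sketch reports "threshold=20", while
"xen:ondemand,up_threshold=20" passes without a warning.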

Jan

* RE: Performance difference between Xen versions
  2011-05-03  3:39                   ` John Weekes
@ 2011-05-03  7:23                     ` Tian, Kevin
  0 siblings, 0 replies; 28+ messages in thread
From: Tian, Kevin @ 2011-05-03  7:23 UTC (permalink / raw)
  To: John Weekes
  Cc: mark.langsdorf@amd.com, xen-devel@lists.xensource.com,
	Wang, Winston L, Wei, Gang, Konrad Rzeszutek Wilk

> From: John Weekes [mailto:lists.xen@nuclearfallout.net]
> Sent: Tuesday, May 03, 2011 11:40 AM
> 
> On 5/2/2011 8:04 PM, Tian, Kevin wrote:
> >
> > you won't know the exact frequency bumped up in the turbo mode, as
> > it's all handled by the CPU itself. what xen can do is just to tell
> > the cpu now I'm OK to enter turbo mode, which is 2268000 (1M higher
> > than normal P0). Then CPU will decide whether current code can be
> > overclocked based on various conditions, such as TDP, other core
> > activities in the same package, ...
> >
> > One possibility to verify that turbo mode does work is to run a CPU
> > intensive workload on one core, while keeping other cores mostly idle.
> > Then choose cpufreq governor to be performance, and then compare your
> > benchmark when BIOS turbo mode is on/off. This should give you some
> > feeling whether turbo mode works on your platform.
> 
> Thanks for the response, Kevin.
> 
> It's good to know that I can check turbo by looking to see if it's in the
> 1 MHz-higher mode. It's a little strange to me that the true MHz level of the
> turbo wouldn't be known/shown, but I can live with that, as the actual
> performance is what counts. I'll run those benches.
> 

The true frequency is unknown because it's all dynamic. There may be several
steppings that can be reached in turbo mode, and the selection is made
dynamically according to the core/package conditions. There may be no boost at
all, or the freq may move among different levels. The best info we can get is the
average freq over a given window, by reading the aperf/mperf MSRs. This info is
already used by the ondemand governor, but I'm not sure whether those MSRs are
exposed to the user level (possibly yes). But yes, as you said, the actual
performance is what matters. :-)
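
For illustration, the average frequency over a window follows from two samples
of the counters. While the core is not halted, APERF advances at the actual
clock and MPERF at a fixed reference clock, so the average is
reference * delta(APERF) / delta(MPERF). A minimal Python sketch, assuming the
raw 64-bit counter values have already been read by some means (not shown
here) and that the reference frequency equals the nominal P0 frequency, which
would be 2267000 kHz on the machine discussed in this thread. The function
name is hypothetical.

```python
MASK64 = (1 << 64) - 1  # the counters are 64 bits wide and may wrap


def average_khz_from_counters(ref_khz, aperf0, mperf0, aperf1, mperf1):
    """Average frequency (kHz) between two APERF/MPERF samples."""
    d_aperf = (aperf1 - aperf0) & MASK64
    d_mperf = (mperf1 - mperf0) & MASK64
    if d_mperf == 0:
        raise ValueError("MPERF did not advance between the samples")
    return ref_khz * d_aperf / d_mperf
```

A ratio above 1.0 means the core ran faster than the reference clock on
average during the window, which is the visible effect of turbo mode.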

Thanks
Kevin

* Re: Performance difference between Xen versions
  2011-05-03  3:06                 ` Tian, Kevin
@ 2011-05-06 13:49                   ` Juergen Gross
  2011-05-06 14:27                     ` Jan Beulich
  2011-05-11  6:08                     ` Tian, Kevin
  0 siblings, 2 replies; 28+ messages in thread
From: Juergen Gross @ 2011-05-06 13:49 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Keir Fraser, Keir Fraser, xen-devel@lists.xensource.com,
	Jan Beulich

On 05/03/11 05:06, Tian, Kevin wrote:
>> From: Keir Fraser
>> Sent: Monday, May 02, 2011 4:49 PM
>>
>> On 02/05/2011 09:23, "Juergen Gross"<juergen.gross@ts.fujitsu.com>  wrote:
>>
>>> On 05/02/11 10:15, Jan Beulich wrote:
>>>>>>> On 02.05.11 at 10:00, Juergen Gross<juergen.gross@ts.fujitsu.com> wrote:
>>>>> On the long run I'd like to make the cpufreq governor a feature of
>>>>> the cpupool. This would enable an administrator of a large Xen
>>>>> machine with a heterogeneous load to specify which domains should
>>>>> run at full speed and which are allowed to save energy at the cost of
>>>>> latency.
>>>>> What do you think?
>>>> Certainly an interesting idea, with the question of how an
>>>> implementation of this would look like.
>>> Let me do some research work first :-) I hope to make a proposal soon.
>> I think it's a good idea, and it should be quite possible to implement cleanly.
>>
> yes, this is a good direction. Actually there have been several papers around this topic
> before. Basically it's a reasonable choice to inject higher level knowledge together
> with VMM heuristics, as in virtualization or cloud we usually have an intelligent stack
> which needs to understand many high level requirements/characteristics already. :-)
>
Okay, I think I understand the basic mechanisms of cpufreq stuff now :-)
I propose the following changes:

- Cpupools get a new parameter "cpufreq" which is similar to the hypervisor
   boot parameter. It is valid if the hypervisor is responsible for cpufreq
   handling (this excludes cases cpufreq=none and cpufreq=dom0-kernel)
- Cpupool0 is initialized with the boot parameter settings, new cpupools are
   created with the cpupool0 settings, they get their new cpufreq parameters
   via libxl later (this avoids changing the interface for cpupool creation, I
   only need a new interface to set the cpufreq parameters for a cpupool, which
   can be used for changing the settings, too. This interface could take the
   cpufreq parameters as text string resulting in support of exactly the same
   parameters as the hypervisor).
- cpufreq_policy is only spanning multiple cpus of one cpupool (if at all). This
   requires a check for the max frequency to be set in a frequency domain
   if the frequency of a processor is changing. This is similar to the ondemand
   governor, but might cross cpufreq_policy boundaries.
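
The max-frequency check from the last item can be sketched as follows. This is
a Python illustration rather than hypervisor code, and it assumes that all
cpus of a frequency domain share one clock, so the domain has to run at the
highest frequency any of its cpus currently requests. The function name and
the data layout are hypothetical.

```python
def apply_request(requested_khz, freq_domains, cpu, new_khz):
    """Record a new request for `cpu` and return the domain's frequency.

    requested_khz: dict mapping cpu number to requested frequency in kHz
    freq_domains:  list of sets of cpus that share one clock
    """
    requested_khz[cpu] = new_khz
    domain = next(d for d in freq_domains if cpu in d)
    # The shared clock must satisfy the most demanding cpu in the domain,
    # even if that cpu belongs to a different cpufreq_policy.
    return max(requested_khz[c] for c in domain)


requested = {0: 1600000, 1: 2267000, 2: 1600000, 3: 1600000}
domains = [{0, 1}, {2, 3}]
print(apply_request(requested, domains, 0, 1733000))  # cpu 1 keeps it high
```

Here cpu 0 asks for 1733000 kHz, but its domain stays at 2267000 kHz because
cpu 1 in the same domain still requests the higher frequency.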

Did I miss anything? Any other suggestions?


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

* Re: Performance difference between Xen versions
  2011-05-06 13:49                   ` Juergen Gross
@ 2011-05-06 14:27                     ` Jan Beulich
  2011-05-11  6:08                     ` Tian, Kevin
  1 sibling, 0 replies; 28+ messages in thread
From: Jan Beulich @ 2011-05-06 14:27 UTC (permalink / raw)
  To: Kevin Tian, Juergen Gross
  Cc: Keir Fraser, Keir Fraser, xen-devel@lists.xensource.com

>>> On 06.05.11 at 15:49, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:
> Okay, I think I understand the basic mechanisms of cpufreq stuff now :-)
> I propose the following changes:
> 
> - Cpupools get a new parameter "cpufreq" which is similar to the hypervisor
>    boot parameter. It is valid if the hypervisor is responsible for cpufreq
>    handling (this excludes cases cpufreq=none and cpufreq=dom0-kernel)
> - Cpupool0 is initialized with the boot parameter settings, new cpupools are
>    created with the cpupool0 settings, they get their new cpufreq parameters
>    via libxl later (this avoids changing the interface for cpupool creation,
>    I only need a new interface to set the cpufreq parameters for a cpupool,
>    which can be used for changing the settings, too. This interface could
>    take the cpufreq parameters as text string resulting in support of exactly
>    the same parameters as the hypervisor).
> - cpufreq_policy is only spanning multiple cpus of one cpupool (if at all).
>    This requires a check for the max frequency to be set in a frequency
>    domain if the frequency of a processor is changing. This is similar to the
>    ondemand governor, but might cross cpufreq_policy boundaries.
> 
> Did I miss anything? Any other suggestions?

There are cases (hyperthreads, and iirc also some AMD CPUs) where
altering the frequency of one CPU at once alters that of others, and
if those live in distinct pools things are going to become "interesting".

Jan

> 
> 
> Juergen

* RE: Performance difference between Xen versions
  2011-05-06 13:49                   ` Juergen Gross
  2011-05-06 14:27                     ` Jan Beulich
@ 2011-05-11  6:08                     ` Tian, Kevin
  2011-05-11  6:23                       ` Juergen Gross
  1 sibling, 1 reply; 28+ messages in thread
From: Tian, Kevin @ 2011-05-11  6:08 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Keir Fraser, Keir Fraser, xen-devel@lists.xensource.com,
	Jan Beulich

> From: Juergen Gross [mailto:juergen.gross@ts.fujitsu.com]
> Sent: Friday, May 06, 2011 9:49 PM
> 
> On 05/03/11 05:06, Tian, Kevin wrote:
> >> From: Keir Fraser
> >> Sent: Monday, May 02, 2011 4:49 PM
> >>
> >> On 02/05/2011 09:23, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote:
> >>
> >>> On 05/02/11 10:15, Jan Beulich wrote:
> >>>>>>> On 02.05.11 at 10:00, Juergen Gross<juergen.gross@ts.fujitsu.com> wrote:
> >>>>> On the long run I'd like to make the cpufreq governor a feature of
> >>>>> the cpupool. This would enable an administrator of a large Xen
> >>>>> machine with a heterogeneous load to specify which domains should
> >>>>> run at full speed and which are allowed to save energy at the cost
> >>>>> of latency.
> >>>>> What do you think?
> >>>> Certainly an interesting idea, with the question of how an
> >>>> implementation of this would look like.
> >>> Let me do some research work first :-) I hope to make a proposal soon.
> >> I think it's a good idea, and it should be quite possible to implement cleanly.
> >>
> > yes, this is a good direction. Actually there have been several papers
> > around this topic before. Basically it's a reasonable choice to inject
> > higher level knowledge together with VMM heuristics, as in
> > virtualization or cloud we usually have an intelligent stack which
> > needs to understand many high level requirements/characteristics
> > already. :-)
> >
> Okay, I think I understand the basic mechanisms of cpufreq stuff now :-) I
> propose the following changes:
> 
> - Cpupools get a new parameter "cpufreq" which is similar to the hypervisor
>    boot parameter. It is valid if the hypervisor is responsible for cpufreq
>    handling (this excludes cases cpufreq=none and cpufreq=dom0-kernel)
> - Cpupool0 is initialized with the boot parameter settings, new cpupools are
>    created with the cpupool0 settings, they get their new cpufreq parameters
>    via libxl later (this avoids changing the interface for cpupool creation, I only
>    need a new interface to set the cpufreq parameters for a cpupool, which
>    can be used for changing the settings, too. This interface could take the
>    cpufreq parameters as text string resulting in support of exactly the same
>    parameters as the hypervisor).
> - cpufreq_policy is only spanning multiple cpus of one cpupool (if at all). This
>    requires a check for the max frequency to be set in a frequency domain
>    if the frequency of a processor is changing. This is similar to the ondemand
>    governor, but might cross cpufreq_policy boundaries.
> 
> Did I miss anything? Any other suggestions?
> 

I'm a little concerned about whether a cpupool is a good logical entity to attach a
cpufreq policy to. Basically the question is how you define a cpupool: socket based,
core based, or thread based? Is it fully controllable by the admin?

The reason I ask is that frequency scaling is fundamentally a hardware
attribute, and there may be cross-dependencies among different cores/threads
within the same package. In some implementations, for example, a core may only
enter a lower frequency when all other cores within the same package request
the same or a lower frequency. Such a low-level dependency may be either managed
automatically by the firmware or fully coordinated by the cpufreq driver. But
whatever the model, the scaling dependency may not cover the same range of cpus
as the cpupool the user wants to define.

You may want to warn the user if the low-level cpufreq dependency is
broken by the user-defined pool.
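
Such a warning could be driven by a check like the following Python sketch,
which reports every frequency domain whose cpus end up in more than one
cpupool. The function name, the pool names, and the data layout are
hypothetical and only serve the illustration.

```python
def split_domains(freq_domains, cpupools):
    """Return (cpus, pool names) for domains spanning several cpupools.

    freq_domains: list of sets of cpus that must scale together
    cpupools:     dict mapping pool name to the cpus assigned to it
    """
    pool_of = {cpu: name for name, cpus in cpupools.items() for cpu in cpus}
    problems = []
    for domain in freq_domains:
        pools = sorted({pool_of[cpu] for cpu in domain if cpu in pool_of})
        if len(pools) > 1:
            problems.append((sorted(domain), pools))
    return problems


domains = [{0, 1}, {2, 3}]
pools = {"Pool-0": [0, 1, 2], "Pool-node1": [3]}
for cpus, names in split_domains(domains, pools):
    print(f"warning: cpus {cpus} scale together but are split across {names}")
```

In this example cpus 2 and 3 share a frequency domain but belong to different
pools, so they are the only pair reported.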

Thanks
Kevin

* Re: Performance difference between Xen versions
  2011-05-11  6:08                     ` Tian, Kevin
@ 2011-05-11  6:23                       ` Juergen Gross
  0 siblings, 0 replies; 28+ messages in thread
From: Juergen Gross @ 2011-05-11  6:23 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Keir Fraser, Keir Fraser, xen-devel@lists.xensource.com,
	Jan Beulich

On 05/11/11 08:08, Tian, Kevin wrote:
>> From: Juergen Gross [mailto:juergen.gross@ts.fujitsu.com]
>> Sent: Friday, May 06, 2011 9:49 PM
>>
>> On 05/03/11 05:06, Tian, Kevin wrote:
>>>> From: Keir Fraser
>>>> Sent: Monday, May 02, 2011 4:49 PM
>>>>
>>>> On 02/05/2011 09:23, "Juergen Gross"<juergen.gross@ts.fujitsu.com> wrote:
>>>>> On 05/02/11 10:15, Jan Beulich wrote:
>>>>>>>>> On 02.05.11 at 10:00, Juergen Gross<juergen.gross@ts.fujitsu.com> wrote:
>>>>>>> On the long run I'd like to make the cpufreq governor a feature of
>>>>>>> the cpupool. This would enable an administrator of a large Xen
>>>>>>> machine with a heterogeneous load to specify which domains should
>>>>>>> run at full speed and which are allowed to save energy at the cost
>>>>>>> of latency.
>>>>>>> What do you think?
>>>>>> Certainly an interesting idea, with the question of how an
>>>>>> implementation of this would look like.
>>>>> Let me do some research work first :-) I hope to make a proposal soon.
>>>> I think it's a good idea, and it should be quite possible to implement cleanly.
>>>>
>>> yes, this is a good direction. Actually there have been several papers
>>> around this topic before. Basically it's a reasonable choice to inject
>>> higher level knowledge together with VMM heuristics, as in
>>> virtualization or cloud we usually have an intelligent stack which
>>> needs to understand many high level requirements/characteristics
>>> already. :-)
>>>
>> Okay, I think I understand the basic mechanisms of cpufreq stuff now :-) I
>> propose the following changes:
>>
>> - Cpupools get a new parameter "cpufreq" which is similar to the hypervisor
>>     boot parameter. It is valid if the hypervisor is responsible for cpufreq
>>     handling (this excludes cases cpufreq=none and cpufreq=dom0-kernel)
>> - Cpupool0 is initialized with the boot parameter settings, new cpupools are
>>     created with the cpupool0 settings, they get their new cpufreq parameters
>>     via libxl later (this avoids changing the interface for cpupool creation, I only
>>     need a new interface to set the cpufreq parameters for a cpupool, which
>>     can be used for changing the settings, too. This interface could take the
>>     cpufreq parameters as text string resulting in support of exactly the same
>>     parameters as the hypervisor).
>> - cpufreq_policy is only spanning multiple cpus of one cpupool (if at all). This
>>     requires a check for the max frequency to be set in a frequency domain
>>     if the frequency of a processor is changing. This is similar to the ondemand
>>     governor, but might cross cpufreq_policy boundaries.
>>
>> Did I miss anything? Any other suggestions?
>>
> I'm a little bit concerned whether cpupool is a good logical entity to bundle a
> cpufreq policy. Basically the question is how do you define a cpupool, socket based,
> core based, or thread based? fully controllable by the admin?

Cpupools can be either configured by the admin or you can create
one cpupool per numa node (applicable to "big" machines only).

> the reason why I ask this question is because a frequency scaling is fundamentally
> a hardware attribute, and there may have some cross-dependencies among
> different cores/threads within same package. In some implementations, e.g. you
> may only have one core entering a lower frequency when all other cores within
> same packages request to enter same or lower frequency. Such low level
> dependency may be either managed by the firmware level automatically, or
> fully coordinated by the cpufreq driver. But whatever model, the scaling dependency
> may not be the same range as what user may want to define a cpupool.

It might be a good idea to add some information about frequency domains
to e.g. "xl info" output.

> Possibly you may want to warn the user if the low level cpufreq dependency is
> broken by the user-defined pool.

That's what I want to do.
Thanks for your thoughts.


Juergen

-- 
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@ts.fujitsu.com
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html

Thread overview: 28+ messages
2011-04-29 12:32 Performance difference between Xen versions Juergen Gross
2011-04-29 13:28 ` Keir Fraser
2011-04-29 13:35   ` Juergen Gross
2011-04-29 14:58     ` Keir Fraser
2011-04-29 16:10 ` Jan Beulich
2011-05-02  5:31   ` Juergen Gross
2011-05-02  6:41     ` Keir Fraser
2011-05-02  7:23       ` Jan Beulich
2011-05-02  8:00         ` Juergen Gross
2011-05-02  8:15           ` Jan Beulich
2011-05-02  8:23             ` Juergen Gross
2011-05-02  8:49               ` Keir Fraser
2011-05-03  3:06                 ` Tian, Kevin
2011-05-06 13:49                   ` Juergen Gross
2011-05-06 14:27                     ` Jan Beulich
2011-05-11  6:08                     ` Tian, Kevin
2011-05-11  6:23                       ` Juergen Gross
2011-05-02 17:52         ` John Weekes
2011-05-02 18:12           ` Konrad Rzeszutek Wilk
2011-05-02 18:43             ` John Weekes
2011-05-02 19:16               ` John Weekes
2011-05-02 19:36                 ` Konrad Rzeszutek Wilk
2011-05-02 19:54                   ` John Weekes
2011-05-03  2:16                   ` Tian, Kevin
2011-05-03  3:04                 ` Tian, Kevin
2011-05-03  3:39                   ` John Weekes
2011-05-03  7:23                     ` Tian, Kevin
     [not found]         ` <4DBF13BB.3000309@nuclearfallout.net>
2011-05-03  7:23           ` Jan Beulich
