All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Bader <stefan.bader@canonical.com>
To: Boris Ostrovsky <boris.ostrovsky@amd.com>
Cc: "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Jan Beulich <jbeulich@suse.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Subject: Re: Workings/effectiveness of the xen-acpi-processor driver
Date: Wed, 02 May 2012 10:36:59 +0200	[thread overview]
Message-ID: <4FA0F22B.1030505@canonical.com> (raw)
In-Reply-To: <4FA06541.7050607@amd.com>


[-- Attachment #1.1: Type: text/plain, Size: 9103 bytes --]

On 02.05.2012 00:35, Boris Ostrovsky wrote:
> On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
>> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>>> was interested to find out what or how much would be gained by that.
>>>>>
>>>>> The first system I tried was an AMD based one (8 core Opteron 6128@2GHz).
>>>>> Which
>>>>> was not very successful as the drivers bail out of the init function
>>>>> because the
>>>>> first call to acpi_processor_register_performance() returns -ENODEV. There is
>>>>> some frequency scaling when running without Xen, so I need to do some more
>>>>> debugging there.
> 
> I believe this is caused by the somewhat under-enlightened xen_apic_read():
> 
> static u32 xen_apic_read(u32 reg)
> {
>         return 0;
> }
> 
> This results in some data, most importantly boot_cpu_physical_apicid, not being
> set correctly and, in turn, causes x86_cpu_to_apicid to be broken.

Ah ok. I check what my box say and try the change below and gathering more data
as suggested in the follow-ups (including to turn on the acpi debugging and
debugging in the xen acpi processor driver). The latter I had done but that only
would print "max acpi id: 16" (or so) before the failure. No wonder missing the
acpi debugging.

> 
> On larger AMD systems boot processor is typically APICID=0x20 (I don't have
> Intel system handy to see how it looks there).
> 
> As a quick and dirty test you can try:
> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index edc2448..1f78998 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned long address)
>         }
>         if (boot_cpu_physical_apicid == -1U) {
>                 boot_cpu_physical_apicid  = read_apic_id();
> +               boot_cpu_physical_apicid = 32;
>                 apic_version[boot_cpu_physical_apicid] =
>                          GET_APIC_VERSION(apic_read(APIC_LVR));
>         }
> 
> 
> (Set it to whatever APICID on core0 is, I suspect it won't be zero).
> 
> -boris
> 
> 
>>>>
>>>> Did you back-port the other components - the ones that turn off the native
>>>> frequency scalling?
>>>>
>>>>        provide disable_cpufreq() function to disable the API.
>>>>     xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>>>        xen/cpufreq: Disable the cpu frequency scaling drivers from loading
>>>>>
>>>
>>> Yes, here is the full set for reference:
>>>
>>> * xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
>>> * xen/acpi: Remove the WARN's as they just create noise.
>>> * xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>> * xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>> * xen/acpi-processor: C and P-state driver that uploads said data to hyper
>>> * provide disable_cpufreq() function to disable the API.
>>
>> And (Linus just pulled it), you also need this one:
>>   df88b2d96e36d9a9e325bfcd12eb45671cbbc937 (xen/enlighten: Disable MWAIT_LEAF
>> so that acpi-pad won't be loaded.)
>>
>>>
>>>>> The second system was an Intel one (4 core i7 920@2.67GHz) which was
>>>>> successfully loading the driver. Via xenpm I can see the various
>>>>> frequencies and
>>>>> also see them being changed. However the cpuidle data out of xenpm looks a
>>>>> bit odd:
>>>>>
>>>>> #>  xenpm get-cpuidle-states 0
>>>>> Max C-state: C7
>>>>>
>>>>> cpu id               : 0
>>>>> total C-states       : 2
>>>>> idle time(ms)        : 10819311
>>>>> C0                   : transition [00000000000000000001]
>>>>>                         residency  [00000000000000005398 ms]
>>>>> C1                   : transition [00000000000000000001]
>>>>>                         residency  [00000000000010819311 ms]
>>>>> pc3                  : [00000000000000000000 ms]
>>>>> pc6                  : [00000000000000000000 ms]
>>>>> pc7                  : [00000000000000000000 ms]
>>>>> cc3                  : [00000000000000000000 ms]
>>>>> cc6                  : [00000000000000000000 ms]
>>>>>
>>>>> Also gathering samples over 30s does look like only C0 and C1 are used. This
>>>>
>>>> Yes.
>>>>> might be because C1E support is enabled in BIOS but when looking at the
>>>>> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
>>>>> for the cores. That could have been just a wrong output, so I plugged in a
>>>>> power
>>>>> meter and compared a kernel running natively and running as dom0 (with and
>>>>> without the acpi-processor driver).
>>>>>
>>>>> Native: 175W
>>>>> dom0:   183W (with only marginal difference between with or without the
>>>>>                processor driver)
>>>>> [yes, the system has a somewhat high base consumption which I attribute to a
>>>>> ridiculously dimensioned graphics subsystem to be running a text console]
>>>>>
>>>>> This I would take as C3 and C6 really not being used and the frequency scaling
>>
>> So the other thing I forgot to note is that C3->C6 have a detrimental
>> effect on some Intel boxes with Xen. We haven't figured out exactly which ones
>> and the bug is definitly in the hypervisor. The bug is that when the CPU goes in
>> those states the NIC ends up being unresponsive. Its like the interrupts stopped
>> being ACKed. If I run 'xenpm set-max-cstate 2' the issue disappears.
>>
>>>>
>>>> To go in deeper modes there is also a need to backport a Xen unstable
>>>> hypercall which will allow the kernel to detect the other states besides
>>>> C0-C2.
>>>>
>>>> "XEN_SET_PDC query was implemented in c/s 23783:
>>>>      "ACPI: add _PDC input override mechanism".
>>>>
>>>
>>> I see. There is a kernel patch about enabling MWAIT that refers to that...
>>
>> Were there any special things you ran when checking the output? Just plugging
>> and looking at the results?
>>>
>>>>
>>>>> having no impact on the idle system is not that much surprising. But if
>>>>> that was
>>>>> true it would also limit the usefulness of the turbo mode which I understand
>>>>> would also be limited by the c-state of the other cores.
>>>>
>>>> Hm, I should double-check that - but somehow I thought that Xen independetly
>>>> checks for TurboMode and if the P-states are in, then they are activated.
>>
>> I did a bit of checking around and it does seem that is the case. From what
>> I have gathered the TurboMode kicks in when the CPU is C0 mode (which should
>> be obvious), and when the other cores are in anything but C0 mode. And sure
>> enough that seems to be the case. But I can't get the concrete details whether
>> the "but C0 mode" means that TurboMode will work better if the C mode is legacy
>> C1, C2, C3 or the CPU C-states (so MWAIT enabled). Trying to find out from
>> Len Brown more details..
>>>>
>>> Turbo mode should be enabled. I had been only looking at a generic overview
>>> about it on Intel site which sounded like it  would make more of a difference on
>>> how much one core could get overclocked related to how many cores are active
>>> (and I translated active or not into deeper c-states or not).
>>> Looking at the verbose output of turbostat it seems not to make that much
>>> difference whether 2-4 cores are running. A single core alone could get one more
>>> increment in clock stepping. That does not immediately sound a lot. And of
>>> course how much or long the higher clock is used depends on other factors as
>>> well and is not under OS control.
>>>
>>> In the end it is probably quite dynamic and hard to come up with hard facts to
>>> prove its value. Though if I can lower the idle power usage by reaching a bit
>>> further, that would greatly help to justify the effort and potential risk of
>>> backporting...
>>
>> I understand. I wish I could give you the exact percentage points by which
>> the power usage will drop. But I think the more substantial reason benefit of
>> these patches is performance gains. The ones that Ian Campbell ran and were
>> posted on Phorenix site paint that they are beneficial.
>>
>>>
>>>>>
>>>>> Do I misread the data I see? Or maybe its a known limitation? In case it is
>>>>> worth doing more research I'll gladly try things and gather more data.
>>>>
>>>> Just missing some patches.
>>>>
>>>> Oh, and this one:
>>>>        xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>>>
>>>> Hmm.. I think a patch disappeared somewhere.
>>
>> That was the one I referenced at the beginning of this email.
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>>
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel



[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 900 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  parent reply	other threads:[~2012-05-02  8:36 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-25 13:00 Workings/effectiveness of the xen-acpi-processor driver Stefan Bader
2012-04-26 15:50 ` Konrad Rzeszutek Wilk
2012-04-26 16:25   ` Stefan Bader
2012-04-26 17:04     ` Konrad Rzeszutek Wilk
2012-05-06 15:23       ` Pasi Kärkkäinen
2012-05-07 17:33         ` Konrad Rzeszutek Wilk
2012-05-07 17:44           ` Pasi Kärkkäinen
2012-05-01 20:02     ` Konrad Rzeszutek Wilk
2012-05-01 22:35       ` Boris Ostrovsky
2012-05-01 22:54         ` Konrad Rzeszutek Wilk
2012-05-02  0:47           ` Konrad Rzeszutek Wilk
2012-05-02  1:11             ` Boris Ostrovsky
2012-05-02  9:19               ` Jan Beulich
2012-05-02 14:56           ` Stefan Bader
2012-05-02  8:36         ` Stefan Bader [this message]
2012-05-02 15:01         ` Stefan Bader
2012-05-02 16:08           ` Konrad Rzeszutek Wilk
2012-05-02 17:06             ` Boris Ostrovsky
2012-05-02 17:14               ` Konrad Rzeszutek Wilk
2012-05-02 21:31                 ` Boris Ostrovsky
2012-05-02 21:41                   ` Konrad Rzeszutek Wilk
2012-05-02 22:09                     ` Boris Ostrovsky
2012-05-03  6:55                       ` Stefan Bader
2012-05-03 10:00                         ` Stefan Bader
2012-05-03 12:58                       ` Stefan Bader
2012-05-03 14:47                         ` Stefan Bader
2012-05-03 15:46                           ` Konrad Rzeszutek Wilk
2012-05-03 17:02                             ` Boris Ostrovsky
2012-05-03 17:08                             ` Konrad Rzeszutek Wilk
2012-05-04  8:00                               ` Stefan Bader
2012-05-03 16:14                       ` Konrad Rzeszutek Wilk
2012-05-02 21:29             ` Stefan Bader
2012-05-02  8:22       ` Stefan Bader

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4FA0F22B.1030505@canonical.com \
    --to=stefan.bader@canonical.com \
    --cc=boris.ostrovsky@amd.com \
    --cc=jbeulich@suse.com \
    --cc=konrad.wilk@oracle.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.