From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Bader
Subject: Re: Workings/effectiveness of the xen-acpi-processor driver
Date: Wed, 02 May 2012 10:36:59 +0200
Message-ID: <4FA0F22B.1030505@canonical.com>
References: <4F97F58A.8090409@canonical.com> <20120426155033.GE26830@phenom.dumpdata.com> <4F9976F8.8040502@canonical.com> <20120501200207.GA15313@phenom.dumpdata.com> <4FA06541.7050607@amd.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============8138260408291714200=="
In-Reply-To: <4FA06541.7050607@amd.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Boris Ostrovsky
Cc: "xen-devel@lists.xensource.com" , Jan Beulich , Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--===============8138260408291714200==
Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="------------enig2D59051F8CC116A8C0E01A61"

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig2D59051F8CC116A8C0E01A61
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On 02.05.2012 00:35, Boris Ostrovsky wrote:
> On 05/01/2012 04:02 PM, Konrad Rzeszutek Wilk wrote:
>> On Thu, Apr 26, 2012 at 06:25:28PM +0200, Stefan Bader wrote:
>>> On 26.04.2012 17:50, Konrad Rzeszutek Wilk wrote:
>>>> On Wed, Apr 25, 2012 at 03:00:58PM +0200, Stefan Bader wrote:
>>>>> Since there have been requests about that driver to get backported into 3.2, I
>>>>> was interested to find out what or how much would be gained by that.
>>>>>
>>>>> The first system I tried was an AMD based one (8 core Opteron 6128@2GHz).
>>>>> Which was not very successful as the drivers bail out of the init function
>>>>> because the first call to acpi_processor_register_performance() returns
>>>>> -ENODEV. There is some frequency scaling when running without Xen, so I
>>>>> need to do some more debugging there.
>
> I believe this is caused by the somewhat under-enlightened xen_apic_read():
>
> static u32 xen_apic_read(u32 reg)
> {
>         return 0;
> }
>
> This results in some data, most importantly boot_cpu_physical_apicid, not being
> set correctly and, in turn, causes x86_cpu_to_apicid to be broken.

Ah, OK. I will check what my box says, try the change below, and gather more
data as suggested in the follow-ups (including turning on the ACPI debugging and
the debugging in the xen acpi processor driver). I had already done the latter,
but that only printed "max acpi id: 16" (or so) before the failure. No wonder,
since the ACPI debugging was missing.

>
> On larger AMD systems the boot processor is typically APICID=0x20 (I don't have
> an Intel system handy to see how it looks there).
>
> As a quick and dirty test you can try:
>
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index edc2448..1f78998 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1781,6 +1781,7 @@ void __init register_lapic_address(unsigned long address)
>         }
>         if (boot_cpu_physical_apicid == -1U) {
>                 boot_cpu_physical_apicid = read_apic_id();
> +               boot_cpu_physical_apicid = 32;
>                 apic_version[boot_cpu_physical_apicid] =
>                          GET_APIC_VERSION(apic_read(APIC_LVR));
>         }
>
>
> (Set it to whatever APICID on core0 is, I suspect it won't be zero).
>
> -boris
>
>
>>>>
>>>> Did you back-port the other components - the ones that turn off the native
>>>> frequency scaling?
>>>>
>>>>       provide disable_cpufreq() function to disable the API.
>>>>       xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>>>       xen/cpufreq: Disable the cpu frequency scaling drivers from loading
>>>>>
>>>
>>> Yes, here is the full set for reference:
>>>
>>> * xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
>>> * xen/acpi: Remove the WARN's as they just create noise.
>>> * xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>> * xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
>>> * xen/acpi-processor: C and P-state driver that uploads said data to hyper
>>> * provide disable_cpufreq() function to disable the API.
>>
>> And (Linus just pulled it), you also need this one:
>>   df88b2d96e36d9a9e325bfcd12eb45671cbbc937 (xen/enlighten: Disable MWAIT_LEAF
>> so that acpi-pad won't be loaded.)
>>
>>>
>>>>> The second system was an Intel one (4 core i7 920@2.67GHz) which was
>>>>> successfully loading the driver. Via xenpm I can see the various
>>>>> frequencies and also see them being changed. However the cpuidle data out
>>>>> of xenpm looks a bit odd:
>>>>>
>>>>> #> xenpm get-cpuidle-states 0
>>>>> Max C-state: C7
>>>>>
>>>>> cpu id               : 0
>>>>> total C-states       : 2
>>>>> idle time(ms)        : 10819311
>>>>> C0                   : transition [00000000000000000001]
>>>>>                        residency  [00000000000000005398 ms]
>>>>> C1                   : transition [00000000000000000001]
>>>>>                        residency  [00000000000010819311 ms]
>>>>> pc3                  : [00000000000000000000 ms]
>>>>> pc6                  : [00000000000000000000 ms]
>>>>> pc7                  : [00000000000000000000 ms]
>>>>> cc3                  : [00000000000000000000 ms]
>>>>> cc6                  : [00000000000000000000 ms]
>>>>>
>>>>> Also gathering samples over 30s does look like only C0 and C1 are used. This
>>>>
>>>> Yes.
>>>>> might be because C1E support is enabled in BIOS but when looking at the
>>>>> intel_idle data in sysfs when running without a hypervisor will show C3 and C6
>>>>> for the cores. That could have been just a wrong output, so I plugged in a
>>>>> power meter and compared a kernel running natively and running as dom0 (with
>>>>> and without the acpi-processor driver).
>>>>>
>>>>> Native: 175W
>>>>> dom0:   183W (with only marginal difference between with or without the
>>>>> processor driver)
>>>>> [yes, the system has a somewhat high base consumption which I attribute to a
>>>>> ridiculously dimensioned graphics subsystem to be running a text console]
>>>>>
>>>>> This I would take as C3 and C6 really not being used and the frequency scaling
>>
>> So the other thing I forgot to note is that C3->C6 have a detrimental
>> effect on some Intel boxes with Xen. We haven't figured out exactly which ones
>> and the bug is definitely in the hypervisor. The bug is that when the CPU goes in
>> those states the NIC ends up being unresponsive. It's like the interrupts stopped
>> being ACKed. If I run 'xenpm set-max-cstate 2' the issue disappears.
>>
>>>>
>>>> To go in deeper modes there is also a need to backport a Xen unstable
>>>> hypercall which will allow the kernel to detect the other states besides
>>>> C0-C2.
>>>>
>>>> "XEN_SET_PDC query was implemented in c/s 23783:
>>>> "ACPI: add _PDC input override mechanism".
>>>>
>>>
>>> I see. There is a kernel patch about enabling MWAIT that refers to that...
>>
>> Were there any special things you ran when checking the output? Just plugging
>> and looking at the results?
>>>
>>>>
>>>>> having no impact on the idle system is not that much surprising. But if
>>>>> that was true it would also limit the usefulness of the turbo mode which I
>>>>> understand would also be limited by the c-state of the other cores.
>>>>
>>>> Hm, I should double-check that - but somehow I thought that Xen independently
>>>> checks for TurboMode and if the P-states are in, then they are activated.
>>
>> I did a bit of checking around and it does seem that is the case. From what
>> I have gathered the TurboMode kicks in when the CPU is in C0 mode (which should
>> be obvious), and when the other cores are in anything but C0 mode. And sure
>> enough that seems to be the case.
>> But I can't get the concrete details whether
>> the "but C0 mode" means that TurboMode will work better if the C mode is legacy
>> C1, C2, C3 or the CPU C-states (so MWAIT enabled). Trying to find out more
>> details from Len Brown..
>>>>
>>> Turbo mode should be enabled. I had been only looking at a generic overview
>>> about it on the Intel site which sounded like it would make more of a
>>> difference on how much one core could get overclocked related to how many
>>> cores are active (and I translated active or not into deeper c-states or not).
>>> Looking at the verbose output of turbostat it seems not to make that much
>>> difference whether 2-4 cores are running. A single core alone could get one more
>>> increment in clock stepping. That does not immediately sound like a lot. And of
>>> course how much or how long the higher clock is used depends on other factors
>>> as well and is not under OS control.
>>>
>>> In the end it is probably quite dynamic and hard to come up with hard facts to
>>> prove its value. Though if I can lower the idle power usage by reaching a bit
>>> further, that would greatly help to justify the effort and potential risk of
>>> backporting...
>>
>> I understand. I wish I could give you the exact percentage points by which
>> the power usage will drop. But I think the more substantial benefit of
>> these patches is performance gains. The ones that Ian Campbell ran and that were
>> posted on the Phoronix site show that they are beneficial.
>>
>>>
>>>>>
>>>>> Do I misread the data I see? Or maybe it's a known limitation? In case it is
>>>>> worth doing more research I'll gladly try things and gather more data.
>>>>
>>>> Just missing some patches.
>>>>
>>>> Oh, and this one:
>>>> xen/acpi: Fix Kconfig dependency on CPU_FREQ
>>>>
>>>> Hmm.. I think a patch disappeared somewhere.
>>
>> That was the one I referenced at the beginning of this email.
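
For anyone repeating the cpuidle check quoted above, here is a minimal sketch
(Python, not part of xenpm) that turns the `xenpm get-cpuidle-states` text into
per-state residency shares. The sample is copied from the output in this thread
with the column spacing reconstructed; the function names parse_cpuidle() and
residency_share() are made up for illustration, and the regular expressions
assume the layout stays as shown here.

```python
import re

# Sample copied from the xenpm output quoted in this thread
# (whitespace reconstructed, values unchanged).
SAMPLE = """\
Max C-state: C7

cpu id               : 0
total C-states       : 2
idle time(ms)        : 10819311
C0                   : transition [00000000000000000001]
                       residency  [00000000000000005398 ms]
C1                   : transition [00000000000000000001]
                       residency  [00000000000010819311 ms]
pc3                  : [00000000000000000000 ms]
pc6                  : [00000000000000000000 ms]
pc7                  : [00000000000000000000 ms]
cc3                  : [00000000000000000000 ms]
cc6                  : [00000000000000000000 ms]
"""

# Assumed line shapes, derived only from the sample above.
STATE_RE = re.compile(r"^(C\d+)\s*:\s*transition\s*\[(\d+)\]")
RESIDENCY_RE = re.compile(r"^\s*residency\s*\[(\d+) ms\]")
DEEP_RE = re.compile(r"^((?:pc|cc)\d+)\s*:\s*\[(\d+) ms\]")


def parse_cpuidle(text):
    """Return (states, deep): per-C-state counters and pc*/cc* residencies in ms."""
    states = {}
    deep = {}
    current = None
    for line in text.splitlines():
        m = STATE_RE.match(line)
        if m:
            current = m.group(1)
            states[current] = {"transitions": int(m.group(2)), "residency_ms": 0}
            continue
        m = RESIDENCY_RE.match(line)
        if m and current is not None:
            # A residency line belongs to the C-state line just before it.
            states[current]["residency_ms"] = int(m.group(1))
            continue
        m = DEEP_RE.match(line)
        if m:
            deep[m.group(1)] = int(m.group(2))
    return states, deep


def residency_share(states):
    """Fraction of the summed residency spent in each listed C-state."""
    total = sum(s["residency_ms"] for s in states.values())
    if total == 0:
        return {}
    return {name: s["residency_ms"] / total for name, s in states.items()}


if __name__ == "__main__":
    states, deep = parse_cpuidle(SAMPLE)
    for name, share in residency_share(states).items():
        print(f"{name}: {share:.4%}")
    print("deeper states with non-zero residency:",
          [name for name, ms in deep.items() if ms] or "none")
```

On the sample, only C0 and C1 are listed and every pc*/cc* counter is zero,
which matches the reading in the thread that the deeper states are not entered.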

--------------enig2D59051F8CC116A8C0E01A61
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCgAGBQJPoPIrAAoJEOhnXe7L7s6jTaEQAIOVbD8Jg5dFXXuVemYHgRkf
+gGbuI+u0kWWaYjmAzyjhstqP1ut28ePcLU1t7VJVZjyVWLgi7IgakeHPddjIFMa
UkBHueFtcdqIT5HWvXhTGKuxYkxDfsnIt6Cr1TGCqEEdtYo2W5NoqvUsw31S8yeQ
z974yjim2pEkUuYQXhFi5ddmTF9JH5497gOdEw0TIOeY/AuHWPHGL5RWS7zVt6y5
mChWb2aiG3C9SLgih92fcA34iQ/Tj+F+5o82LeMG6tXOkJMNRH0NGDsw36On6m+x
ouXsgkfARbMSRtB3y7xdDWJIhKOSN8yQZNWfqSNnhGqfRzTHD99NjmT233OVi832
dMSricUWEHqcrxrD9xkHn0MJZSmDRiUGygw8MktOTrxI+wIvb/GzghXyPWI+zuUV
au0wFKKr/9hEz+7pSQRU1Mc9Csx3copskQmnwoJOY69LBAY9fW2zNA/SStBDnRfp
8ol3AlEkle+hu8M91SJl32TmABh5HNN+wMMGzkkPsy5z06Eac1YJ045ynhxaigR8
OBga8/0CI9LNa8fVCMKCNpqgIXImiz9BBKSdPF3MXjhq1sk4BjSj6v+KMfGV3RzQ
Tdqx7Wt+fE5FomKy8UwJDXa1FW0tRmN5esbE+F6Y+YlJZxlZOm5ZDaiU/RyUB33Y
vGHYvJSOmyA62lNRQXzH
=7eSy
-----END PGP SIGNATURE-----

--------------enig2D59051F8CC116A8C0E01A61--

--===============8138260408291714200==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============8138260408291714200==--