xen-devel.lists.xenproject.org archive mirror
From: Borislav Petkov <bp@alien8.de>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefan Bader <stefan.bader@canonical.com>,
	Andre Przywara <andre@andrep.de>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Matthew Garrett <mjg@redhat.com>
Subject: Re: kernel 3.7+ cpufreq regression on AMD system running as dom0
Date: Fri, 18 Jan 2013 21:03:31 +0100
Message-ID: <20130118200331.GF4062@pd.tnic>
In-Reply-To: <20130118190015.GC11351@phenom.dumpdata.com>

On Fri, Jan 18, 2013 at 02:00:15PM -0500, Konrad Rzeszutek Wilk wrote:
> I did not explain myself well. The fix is OK - it is just that the
> hypervisor causes the quirk to not work correctly. Hmm, I wonder if
> there are BIOSes that do the same thing (cause the MSR to return 0). Per
> your estimation of BIOS quality, it seems that this could happen.

Yeah, I don't think there's a limit to the amount of SNAFU a BIOS can
cause :-).

> Oh, I was not thinking DMI per se. I was thinking something similar to
> the DMI-quirk API. But for the ACPI subsystem, so it would be:
> 
> 	if (ARM)
> 		... these quirks necessary
> 	if (AMD)
> 		.. these quirks
> 
> and then the ACPI code can make the calls to this ACPI-quirk API to
> figure out whether it needs to modulate values. But this is all
> hand-waving at this point.

Yeah, those CPUs are just too small a set to even warrant a quirk API.
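
For reference, the table-driven lookup being hand-waved at above might look roughly like the sketch below. Every name in it (struct acpi_quirk, acpi_apply_quirks(), struct cpu_id, the vendor enum) is made up for illustration; nothing like this exists in the ACPI code, and the fixup is a stand-in that only sets a flag where real code would recompute values from the MSR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU description; not a real kernel or Xen structure. */
enum cpu_vendor { VENDOR_AMD, VENDOR_ARM, VENDOR_OTHER };

struct cpu_id {
    enum cpu_vendor vendor;
    unsigned int family;
    unsigned int model;
};

/* One table entry per quirk: a match predicate plus a fixup callback. */
struct acpi_quirk {
    const char *name;
    int (*match)(const struct cpu_id *id);
    void (*fixup)(void *ctx);
};

/* Same family/model test as the patch quoted in this mail. */
static int match_amd_pstate(const struct cpu_id *id)
{
    return id->vendor == VENDOR_AMD &&
           ((id->family == 0x10 && id->model < 10) || id->family == 0x11);
}

/* Stand-in fixup: only records that the frequencies need recomputing. */
static void flag_msr_fixup(void *ctx)
{
    *(int *)ctx = 1;
}

static const struct acpi_quirk quirks[] = {
    { "amd-pstate-freq", match_amd_pstate, flag_msr_fixup },
};

/* Run every matching quirk; returns how many were applied. */
static size_t acpi_apply_quirks(const struct acpi_quirk *tbl, size_t n,
                                const struct cpu_id *id, void *ctx)
{
    size_t i, applied = 0;

    assert(tbl != NULL && id != NULL);
    for (i = 0; i < n; i++) {
        if (tbl[i].match(id)) {
            tbl[i].fixup(ctx);
            applied++;
        }
    }
    return applied;
}
```

With a single entry in the table, this is more machinery than the one open-coded family check it replaces.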

[ … ]

> Right, that information is gathered from the MSRs. I think Xen would
> need to do this since it can read the MSRs correctly and modify the P-states.
> 
> So something like this in the hypervisor maybe (not even tested):

Yeah, something like that. Basically you can copy the quirk down to the
hypervisor.

But Andre was explaining to me the other day that those P-state
frequencies are not that important.

Let me explain: the ondemand governor, for example, computes idle time
and each time it needs to increase, it switches straight up to the
highest frequency. When it decreases the freq. though, it goes down in a
staircase manner, going over all P-states, AFAICT.
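
A toy model of that behaviour, as described above and not taken from the actual ondemand code, could be written as follows. The function name next_pstate() and the two load thresholds are assumptions made for the sketch; P-state 0 is taken to be the fastest.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model only, not the real ondemand governor.
 * P-state 0 is the fastest; a higher index means a lower frequency.
 * up_pct and down_pct are hypothetical load thresholds in percent.
 */
static size_t next_pstate(size_t nr_states, size_t cur, unsigned int load_pct,
                          unsigned int up_pct, unsigned int down_pct)
{
    assert(nr_states > 0 && cur < nr_states);

    if (load_pct > up_pct)
        return 0;               /* busy: jump straight to the top */
    if (load_pct < down_pct && cur + 1 < nr_states)
        return cur + 1;         /* idle: one step down the staircase */
    return cur;                 /* otherwise stay where we are */
}
```

Note that this simplified decision only looks at the P-state index and never at the frequency value stored for it.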

So we use them but not for all decisions. The question is, what do the
Xen governor(s) do?

If it only uses the frequencies for reporting, then it is not that big
of a deal. If it uses their values for switching decisions, then it
probably needs the correct ones.

> diff --git a/xen/arch/x86/acpi/cpufreq/powernow.c b/xen/arch/x86/acpi/cpufreq/powernow.c
> index a9b7792..54e7808 100644
> --- a/xen/arch/x86/acpi/cpufreq/powernow.c
> +++ b/xen/arch/x86/acpi/cpufreq/powernow.c
> @@ -146,7 +146,40 @@ static int powernow_cpufreq_target(struct cpufreq_policy *policy,
>  
>      return 0;
>  }
> +#define MSR_AMD_PSTATE_DEF_BASE     0xc0010064
> +static void amd_fixup_frequency(struct xen_processor_px *px, int i)
> +{
> +	u32 hi, lo, fid, did;
> +	int index = px->control & 0x00000007;
> +
> +	if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
> +		return;
> +
> +	if ((boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model < 10)
> +	    || boot_cpu_data.x86 == 0x11) {
> +		rdmsr(MSR_AMD_PSTATE_DEF_BASE + index, lo, hi);
> +        /* Bit 63 indicates whether contents are valid */
> +        if (!(hi & 0x80000000))
> +            return;

Something's funny with this indentation.

> +
> +		fid = lo & 0x3f;
> +		did = (lo >> 6) & 7;
> +		if (boot_cpu_data.x86 == 0x10)
> +			px->core_frequency = (100 * (fid + 0x10)) >> did;
> +		else
> +			px->core_frequency = (100 * (fid + 8)) >> did;
> +	}
> +}
> +
> +static void amd_fixup_freq(struct processor_performance *perf)
> +{
>  
> +    int i;
> +
> +    for (i = 0; i < perf->state_count; i++)
> +        amd_fixup_frequency(&perf->states[i], i);
> +
> +}
>  static int powernow_cpufreq_verify(struct cpufreq_policy *policy)
>  {
>      struct acpi_cpufreq_data *data;
> @@ -158,6 +191,8 @@ static int powernow_cpufreq_verify(struct cpufreq_policy *policy)
>  
>      perf = &processor_pminfo[policy->cpu]->perf;
>  
> +    amd_fixup_freq(perf);
> +
>      cpufreq_verify_within_limits(policy, 0, 
>          perf->states[perf->platform_limit].core_frequency * 1000);
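
To make the arithmetic in the patch concrete, here is a standalone sketch of the same FID/DID decoding that takes the 64-bit MSR value as a plain argument instead of calling rdmsr(). The helper name pstate_msr_to_mhz() and the convention of returning 0 for an invalid entry or an unhandled family are assumptions of this sketch, not part of the patch.

```c
#include <stdint.h>

/*
 * Decode a P-state definition MSR value (MSR 0xc0010064 + index) into a
 * core frequency in MHz, using the same formulas as the quoted patch.
 * Returns 0 if bit 63 (valid) is clear or the family is not 0x10/0x11.
 */
static unsigned int pstate_msr_to_mhz(uint64_t msr, unsigned int family)
{
    uint32_t lo = (uint32_t)msr;
    uint32_t hi = (uint32_t)(msr >> 32);
    uint32_t fid, did;

    /* Bit 63 of the MSR is bit 31 of the high half. */
    if (!(hi & 0x80000000u))
        return 0;

    fid = lo & 0x3f;        /* frequency ID, bits 5:0 */
    did = (lo >> 6) & 7;    /* divisor ID, bits 8:6 */

    if (family == 0x10)
        return (100 * (fid + 0x10)) >> did;
    if (family == 0x11)
        return (100 * (fid + 8)) >> did;
    return 0;
}
```

For instance, on family 0x10 a value of 0x8000000000000040 (FID 0, DID 1) works out to (100 * 16) >> 1 = 800 MHz, and 0x8000000000000010 (FID 0x10, DID 0) to 100 * 32 = 3200 MHz.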

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.