From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andre Przywara
Subject: Re: [PATCH] RFC: Linux: disable APERF/MPERF feature in PV kernels
Date: Wed, 23 May 2012 11:14:23 +0200
Message-ID: <4FBCAA6F.2060708@amd.com>
References: <4FBBB9AF.6020704@amd.com> <4FBCAF1902000078000855C5@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <4FBCAF1902000078000855C5@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: Jeremy Fitzhardinge, xen-devel, Konrad Rzeszutek Wilk
List-Id: xen-devel@lists.xenproject.org

On 05/23/2012 09:34 AM, Jan Beulich wrote:
>>>> On 22.05.12 at 18:07, Andre Przywara wrote:
>> while testing some APERF/MPERF semantics I discovered that this feature
>> is enabled in Xen Dom0, but is not reliable.
>> The Linux kernel's scheduler uses this feature if it sees the CPUID bit,
>> leading to costly RDMSR traps (a few 100,000s during a kernel compile)
>> and bogus values due to VCPU migration during the measurement.
>> The attached patch explicitly disables this CPU capability inside the
>> Linux kernel, I couldn't measure any APERF/MPERF reads anymore with the
>> patch applied.
>> I am not sure if the PVOPS code is the right place to fix this, we could
>> as well do it in the HV's xen/arch/x86/traps.c:pv_cpuid().
>> Also when the Dom0 VCPUs are pinned, we could allow this, but I am not
>> sure if it's worth to do so.
>>
>> Awaiting your comments.
>
> First of all I'm of the opinion that this indeed should not be
> masked in the hypervisor - there's no reason to disallow the
> guest to read these registers (but we should of course deny
> writes as long as Xen is controlling P-states, which we do).

Ok. Thanks for the acknowledgment.

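To illustrate the two points discussed above, here is a minimal userspace sketch in C. It is not the attached patch and not the code in pv_cpuid(); the helper names mask_aperfmperf() and freq_ratio_permille() and the struct are hypothetical, chosen only for this example. It assumes the usual convention that CPUID leaf 6, ECX bit 0 advertises the APERF/MPERF pair, and that a consumer derives the effective frequency from the ratio of the APERF and MPERF deltas between two samples.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 6, ECX bit 0: APERF/MPERF MSRs are available. */
#define CPUID6_ECX_APERFMPERF (1u << 0)

/* Hypothetical helper: clear the capability bit in an ECX value
 * before a consumer gets to see it. All other bits pass through. */
static uint32_t mask_aperfmperf(uint32_t ecx)
{
    return ecx & ~CPUID6_ECX_APERFMPERF;
}

/* One sample of the two counters (hypothetical type). */
struct aperfmperf {
    uint64_t aperf;
    uint64_t mperf;
};

/* Ratio of actual to reference frequency between two samples, in
 * per-mille (1000 means "running at the reference frequency").
 * Returns -1 if MPERF did not advance. The result is only
 * meaningful if both samples come from the same physical CPU. */
static int64_t freq_ratio_permille(struct aperfmperf prev,
                                   struct aperfmperf cur)
{
    uint64_t da = cur.aperf - prev.aperf;
    uint64_t dm = cur.mperf - prev.mperf;

    if (dm == 0)
        return -1;
    return (int64_t)(da * 1000 / dm);
}
```

With both samples taken on the same physical CPU, e.g. {1000, 2000} followed by {1800, 3000}, the function returns 800, i.e. 80% of the reference frequency. If the VCPU is migrated between the two reads and the second sample comes from another CPU's counters, e.g. {9000, 2100}, the same computation returns 80000, which is the kind of bogus value mentioned above.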
> Next I'd like to note that in our kernels we simply don't build
> arch/x86/kernel/cpu/sched.o. Together with CPU_FREQ being
> suppressed, there's no consumer of the feature flag in our
> kernels.

With "our kernels" you mean OpenSuSE/SLES kernels?
I quickly checked upstream as well as the repos on kernel.opensuse.org.
In all of them sched.o is unconditionally included in the Makefile. So
is there a build patch to exclude this file for builds of distro Xen
kernels?

Regards,
Andre.

> So I would think that your suggested change is appropriate,
> but I'm adding Konrad to Cc as these days he's the one to pick
> this up.
>
> Jan

-- 
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany