All of lore.kernel.org
 help / color / mirror / Atom feed
From: Don Zickus <dzickus@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	Vivek Goyal <vgoyal@redhat.com>, Yinghai Lu <yinghai@kernel.org>,
	Ingo Molnar <mingo@elte.hu>,
	Jason Wessel <jason.wessel@windriver.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Haren Myneni <hbabu@us.ibm.com>
Subject: Re: perf hw  in kexeced kernel broken in tip
Date: Wed, 8 Dec 2010 09:01:03 -0500	[thread overview]
Message-ID: <20101208140103.GM21786@redhat.com> (raw)
In-Reply-To: <1291764620.2032.1293.camel@laptop>

On Wed, Dec 08, 2010 at 12:30:20AM +0100, Peter Zijlstra wrote:
> On Thu, 2010-12-02 at 11:15 -0500, Don Zickus wrote:
> 
> > Vivek suggested to me this morning that I should just blantantly disable the
> > perf counter during init when running my test. 
> 
> Nah, we should actively scan for that during the bring-up and kill
> hw-perf when we find an enable bit set, some BIOSes actively use the
> PMU, this is something that should be discouraged.

Ok, the reboot notifier addresses the kexec problem but doesn't fix it
though (I have to test to confirm that, comments below).  The bios check
should catch those situations (ironically I stumbled upon a machine with
this problem, so I will test your patch with it, though it only uses perf
counter 0).  The kdump problem will still exist, not sure if we care and
perhaps we should document in the changelog that we know kdump is still
broken (unless we do care).

> 
> ---
>  arch/x86/kernel/cpu/perf_event.c |   30 +++++++++++++++++++++++++++---
>  1 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 817d2b1..7f92833 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -375,15 +375,40 @@ static void release_pmc_hardware(void) {}
>  static bool check_hw_exists(void)
>  {
>  	u64 val, val_new = 0;
> -	int ret = 0;
> +	int i, reg, ret = 0;
>  
>  	val = 0xabcdUL;
>  	ret |= checking_wrmsrl(x86_pmu.perfctr, val);
>  	ret |= rdmsrl_safe(x86_pmu.perfctr, &val_new);
> -	if (ret || val != val_new)
> +	if (ret || val != val_new) {
> +		printk(KERN_CONT "Broken PMU hardware detected, software events only.\n");
>  		return false;
> +	}
> +
> +	/*
> +	 * Check to see if the BIOS enabled any of the counters, if so
> +	 * complain and bail.
> +	 */
> +	for (i = 0; i < x86_pmu.num_counters; i++) {
> +		reg = x86_pmu.eventsel + i;
> +		rdmsrl(reg, val);
> +		if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
> +			goto bios_fail;
> +	}
> +
> +	for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
> +		reg = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
> +		rdmsrl(reg, val);
> +		if (val & (0x03 << i*4))
> +			goto bios_fail;
> +	}

I wonder if you should reverse these checks.  If the bios has the perf
counter enabled, there might be a high chance that it fails the first
check and never gets to the actually bios checks.

>  
>  	return true;
> +
> +bios_fail:
> +	printk(KERN_CONT "Broken BIOS detected, software events only.\n");
> +	printk(KERN_ERR FW_BUG "invalid MSR %x=%Lx\n", reg, val);
> +	return false;
>  }
>  
>  static void reserve_ds_buffers(void);
> @@ -1379,7 +1404,6 @@ int __init init_hw_perf_events(void)
>  
>  	/* sanity check that the hardware exists or is emulated */
>  	if (!check_hw_exists()) {
> -		pr_cont("Broken PMU hardware detected, software events only.\n");
>  		return 0;
>  	}

nitpick - you can probably remove the curly braces, no?

>  
> 
> 
> 
> >  Looking through the code I
> > don't think I can do this using disable_all because some routines look for
> > the active bit to be set and some arches have different disable registers
> > than others.  Thoughts?
> 
> Something like the below, preferably I'd key that off of SYS_KEXEC, but
> looking through the existing notifiers adding a state requires touching
> all of them :/
> 
> ---
>  kernel/perf_event.c |   21 +++++++++++++++++++++
>  1 files changed, 21 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 195393c..7abcd8d 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -21,6 +21,7 @@
>  #include <linux/dcache.h>
>  #include <linux/percpu.h>
>  #include <linux/ptrace.h>
> +#include <linux/reboot.h>
>  #include <linux/vmstat.h>
>  #include <linux/vmalloc.h>
>  #include <linux/hardirq.h>
> @@ -6383,6 +6384,25 @@ static void perf_event_exit_cpu(int cpu)
>  static inline void perf_event_exit_cpu(int cpu) { }
>  #endif
>  
> +static int 
> +perf_reboot(struct notifier_block *notifier, unsigned long val, void *v)
> +{
> +	/*
> +	 * XXX this relies on hotplug, does kexec do too?
> +	 */
> +	perf_event_exit_cpu(0);
> +	return NOTIFY_OK;

Ok, so this shuts down the perf counters on cpu0, but the other cpus are
still running and will fail your new bios check, no?

Privately, I used the above wrapped with for_each_online_cpu(cpu) and it
worked fine for me.

Cheers,
Don


  reply	other threads:[~2010-12-08 14:01 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-01  8:00 perf hw in kexeced kernel broken in tip Yinghai Lu
2010-12-01 11:27 ` Peter Zijlstra
2010-12-01 16:06   ` Vivek Goyal
2010-12-01 16:11     ` Peter Zijlstra
2010-12-01 16:23       ` Vivek Goyal
2010-12-01 19:38         ` Peter Zijlstra
2010-12-01 19:46           ` Vivek Goyal
2010-12-01 19:49             ` Peter Zijlstra
2010-12-01 19:58               ` Vivek Goyal
2010-12-01 20:07                 ` Peter Zijlstra
2010-12-01 21:48                   ` Eric W. Biederman
2010-12-02  5:23                     ` Don Zickus
2010-12-02  7:34                       ` Peter Zijlstra
2010-12-02 16:15                         ` Don Zickus
2010-12-07 23:30                           ` Peter Zijlstra
2010-12-08 14:01                             ` Don Zickus [this message]
2010-12-08 14:20                               ` Peter Zijlstra
2010-12-08 14:42                                 ` Vivek Goyal
2010-12-08 14:48                                   ` Peter Zijlstra
2010-12-08 15:02                                     ` Vivek Goyal
2010-12-08 15:15                                       ` Peter Zijlstra
2010-12-08 15:22                                         ` Vivek Goyal
2010-12-08 21:16                                         ` Eric W. Biederman
2010-12-08 14:59                                 ` Peter Zijlstra
2010-12-08 18:43                                   ` Yinghai Lu
2010-12-08 19:01                                     ` Don Zickus
2010-12-08 19:05                                       ` Yinghai Lu
2010-12-08 19:17                                         ` Peter Zijlstra
2010-12-08 19:20                                           ` Yinghai Lu
2010-12-08 19:06                                     ` Peter Zijlstra
2010-12-08 19:20                                       ` Yinghai Lu
2010-12-08 22:37                                   ` Don Zickus
2010-12-08 23:20                                     ` Eric W. Biederman
2010-12-09  4:34                                       ` Don Zickus
2010-12-09 20:20                                   ` Don Zickus
2010-12-09 20:44                                     ` Cyrill Gorcunov
2010-12-08 14:33                               ` Peter Zijlstra
2010-12-08 14:39                               ` Vivek Goyal
2010-12-07 21:16                         ` Don Zickus
2010-12-08  0:26                           ` Yinghai Lu
2010-12-08 10:39                             ` Peter Zijlstra
2010-12-01 20:41               ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101208140103.GM21786@redhat.com \
    --to=dzickus@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=hbabu@us.ibm.com \
    --cc=jason.wessel@windriver.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=vgoyal@redhat.com \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.