linux-kernel.vger.kernel.org archive mirror
* Re: Basic perf PMU support for Haswell v12
From: Ingo Molnar @ 2013-05-28  6:29 UTC
  To: Andi Kleen
  Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Stephane Eranian


* Andi Kleen <andi@firstfloor.org> wrote:

> All outstanding issues fixed I hope. And I added mem-loads/stores support.
> 
> Contains support for:
> - Basic Haswell PMU and PEBS support
> - Late unmasking of the PMI
> - mem-loads/stores support
> 
> v2: Addressed Stephane's feedback. See individual patches for details.
> v3: now even more bite-sized. Qualifier constraints merged earlier.
> v4: Rename some variables, add some comments and other minor changes.
> Add some Reviewed/Tested-bys.
> v5: Address some minor review feedback. Port to latest perf/core
> v6: Add just some variable names, add comments, edit descriptions, some
> more testing, rebased to latest perf/core
> v7: Expand comment
> v8: Rename structure field.
> v9: No wide counters, but add basic LBRs. Add some more 
> constraints. Rebase to 3.9rc1
> v10: Change some whitespace. Rebase to 3.9rc3
> v11: Rebase to perf/core. Fix extra regs. Rename INTX.
> v12: Rebase to 3.10-rc2
> Add mem-loads/stores support for parity with Sandy Bridge.
> Fix fixed counters (Thanks Ingo!)
> Make late ack optional
> Export new config bits in sysfs.
> Minor changes

I reported a pretty nasty regression with the previous version (v10) which 
made this series break default 'perf top' on non-Haswell systems - but 
it's unclear from this changelog to what extent you managed to reproduce 
the bug and fix it, and what the fix was?

If it's fixed then please don't hide fixes - you need to follow up in the 
original threads where they got reported, not just leave some obscure 
changelog entry in the next version post ...

I'd really like to make progress with this feature - 11 iterations is 
ridiculous really.

Thanks,

	Ingo

* Re: Basic perf PMU support for Haswell v12
From: Andi Kleen @ 2013-05-28 16:20 UTC
  To: Ingo Molnar
  Cc: Andi Kleen, linux-kernel, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Stephane Eranian

On Tue, May 28, 2013 at 08:29:15AM +0200, Ingo Molnar wrote:
> 
> * Andi Kleen <andi@firstfloor.org> wrote:
> 
> > All outstanding issues fixed I hope. And I added mem-loads/stores support.
> > 
> > Contains support for:
> > - Basic Haswell PMU and PEBS support
> > - Late unmasking of the PMI
> > - mem-loads/stores support
> > 
> > v2: Addressed Stephane's feedback. See individual patches for details.
> > v3: now even more bite-sized. Qualifier constraints merged earlier.
> > v4: Rename some variables, add some comments and other minor changes.
> > Add some Reviewed/Tested-bys.
> > v5: Address some minor review feedback. Port to latest perf/core
> > v6: Add just some variable names, add comments, edit descriptions, some
> > more testing, rebased to latest perf/core
> > v7: Expand comment
> > v8: Rename structure field.
> > v9: No wide counters, but add basic LBRs. Add some more 
> > constraints. Rebase to 3.9rc1
> > v10: Change some whitespace. Rebase to 3.9rc3
> > v11: Rebase to perf/core. Fix extra regs. Rename INTX.
> > v12: Rebase to 3.10-rc2
> > Add mem-loads/stores support for parity with Sandy Bridge.
> > Fix fixed counters (Thanks Ingo!)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > Make late ack optional
> > Export new config bits in sysfs.
> > Minor changes
> 
> I reported a pretty nasty regression with the previous version (v10) which 
> made this series break default 'perf top' on non-Haswell systems - but 
> it's unclear from this changelog to what extent you managed to reproduce 
> the bug and fix it, and what the fix was?

Thanks for checking.
I didn't reproduce it, but I found a problem by code review with the 
fixed counter constraints.

I think I fixed it by adding this hunk:

@@ -2227,7 +2313,7 @@ __init int intel_pmu_init(void)
                 * counter, so do not extend mask to generic counters
                 */
                for_each_event_constraint(c, x86_pmu.event_constraints) {
-                       if (c->cmask != X86_RAW_EVENT_MASK
+                       if (c->cmask != FIXED_EVENT_FLAGS
                            || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
                                continue;
                        }

It would be cleaner to detect the fixed counters in some other way,
but that was the simplest fix I could find.

Testing appreciated 
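
To make the effect of that one-line change concrete, here is a minimal
userspace sketch of the loop in the hunk above. It is not the kernel code:
`struct constraint`, `extend_fixed_constraints()`, and the constants
`RAW_MASK`, `FIXED_FLAGS`, and `FIXED_REF_CYCLES` are hypothetical stand-ins
with made-up values, and the mask-extension step after the `continue` is an
assumption based on the comment in the hunk, which does not show that part.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative stand-ins only: these values are NOT the kernel's real
 * definitions. FIXED_FLAGS models a fixed-counter cmask that carries
 * extra qualifier bits on top of the raw event mask.
 */
#define RAW_MASK          0x00ffULL
#define FIXED_FLAGS       (RAW_MASK | 0x0300ULL)
#define FIXED_REF_CYCLES  (1ULL << 34)   /* hypothetical ref-cycles counter bit */

struct constraint {
	uint64_t idxmsk64;  /* counters the event may be scheduled on */
	uint64_t cmask;     /* bits of the event code that must match */
	int      weight;    /* number of counters set in idxmsk64 */
};

/*
 * Let every fixed-counter constraint (recognised by its cmask) also use
 * the generic counters, except the reference-cycles event, which only
 * works on its own fixed counter. num_counters must be below 64.
 * Returns how many entries were extended.
 */
static int extend_fixed_constraints(struct constraint *c, size_t n,
				    uint64_t fixed_cmask, int num_counters)
{
	int extended = 0;

	for (size_t i = 0; i < n; i++) {
		if (c[i].cmask != fixed_cmask ||
		    c[i].idxmsk64 == FIXED_REF_CYCLES)
			continue;

		c[i].idxmsk64 |= (1ULL << num_counters) - 1;
		c[i].weight += num_counters;
		extended++;
	}
	return extended;
}
```

In this model, if the fixed-counter constraints carry the wider `FIXED_FLAGS`
mask but the loop still compares against `RAW_MASK`, no entry matches and
nothing is extended to the generic counters. Comparing against `FIXED_FLAGS`
extends every fixed constraint except the reference-cycles one. Whether that
is exactly how the reported regression arose is not stated in this thread.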

> I'd really like to make progress with this feature - 11 iterations is 
> ridiculous really.

Thanks.

There are actually more patches, unfortunately; this is just a subset.
I'll send the others once this set is in, probably split into less and
more important ones.

https://git.kernel.org/cgit/linux/kernel/git/ak/linux-misc.git/log/?h=hsw/pmu6

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

* Re: Basic perf PMU support for Haswell v12
From: Ingo Molnar @ 2013-05-30  6:35 UTC
  To: Andi Kleen
  Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Stephane Eranian


* Andi Kleen <andi@firstfloor.org> wrote:

> On Tue, May 28, 2013 at 08:29:15AM +0200, Ingo Molnar wrote:
> > 
> > * Andi Kleen <andi@firstfloor.org> wrote:
> > 
> > > All outstanding issues fixed I hope. And I added mem-loads/stores support.
> > > 
> > > Contains support for:
> > > - Basic Haswell PMU and PEBS support
> > > - Late unmasking of the PMI
> > > - mem-loads/stores support
> > > 
> > > v2: Addressed Stephane's feedback. See individual patches for details.
> > > v3: now even more bite-sized. Qualifier constraints merged earlier.
> > > v4: Rename some variables, add some comments and other minor changes.
> > > Add some Reviewed/Tested-bys.
> > > v5: Address some minor review feedback. Port to latest perf/core
> > > v6: Add just some variable names, add comments, edit descriptions, some
> > > more testing, rebased to latest perf/core
> > > v7: Expand comment
> > > v8: Rename structure field.
> > > v9: No wide counters, but add basic LBRs. Add some more 
> > > constraints. Rebase to 3.9rc1
> > > v10: Change some whitespace. Rebase to 3.9rc3
> > > v11: Rebase to perf/core. Fix extra regs. Rename INTX.
> > > v12: Rebase to 3.10-rc2
> > > Add mem-loads/stores support for parity with Sandy Bridge.
> > > Fix fixed counters (Thanks Ingo!)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > Make late ack optional
> > > Export new config bits in sysfs.
> > > Minor changes
> > 
> > I reported a pretty nasty regression with the previous version (v10) which 
> > made this series break default 'perf top' on non-Haswell systems - but 
> > it's unclear from this changelog to what extent you managed to reproduce 
> > the bug and fix it, and what the fix was?
> 
> Thanks for checking.
> I didn't reproduce it, but I found a problem by code review with the 
> fixed counter constraints.
> 
> I think I fixed it by adding this hunk:
> 
> @@ -2227,7 +2313,7 @@ __init int intel_pmu_init(void)
>                  * counter, so do not extend mask to generic counters
>                  */
>                 for_each_event_constraint(c, x86_pmu.event_constraints) {
> -                       if (c->cmask != X86_RAW_EVENT_MASK
> +                       if (c->cmask != FIXED_EVENT_FLAGS
>                             || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
>                                 continue;
>                         }
> 
> It would be cleaner to detect the fixed counters in some other way,
> but that was the simplest fix I could find.
> 
> Testing appreciated 

Fair enough - I'll give it a whirl - that hunk does indeed look like it 
could make a difference.

Thanks,

	Ingo

* Re: [PATCH 4/6] perf, x86: Move NMI clearing to end of PMI handler v2
From: Ingo Molnar @ 2013-05-30  6:43 UTC
  To: Andi Kleen
  Cc: linux-kernel, Andi Kleen, Peter Zijlstra,
	Arnaldo Carvalho de Melo


* Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> This avoids some problems with spurious PMIs on Haswell.
> Haswell seems to behave more like P4 in this regard. Do
> the same thing as the P4 perf handler by unmasking
> the NMI only at the end. Shouldn't make any difference
> for earlier family 6 cores.
> 
> Tested on Haswell, IvyBridge, Westmere, Saltwell (Atom)
> 
> v2: Enable only for Haswell
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  arch/x86/kernel/cpu/perf_event.h       |  1 +
>  arch/x86/kernel/cpu/perf_event_intel.c | 22 +++++++++++++---------
>  2 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
> index d2c3b42..a3887a3 100644
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -378,6 +378,7 @@ struct x86_pmu {
>  	struct event_constraint *event_constraints;
>  	struct x86_pmu_quirk *quirks;
>  	int		perfctr_second_write;
> +	bool		late_ack;
>  
>  	/*
>  	 * sysfs attrs
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 2164f39..b7442ff 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1184,16 +1184,12 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)
>  
>  	cpuc = &__get_cpu_var(cpu_hw_events);
>  
> -	/*
> -	 * Some chipsets need to unmask the LVTPC in a particular spot
> -	 * inside the nmi handler.  As a result, the unmasking was pushed
> -	 * into all the nmi handlers.
> -	 *
> -	 * This handler doesn't seem to have any issues with the unmasking
> -	 * so it was left at the top.
> +	/* 
> +	 * No known reason to not always do late ACK,
> +	 * but just in case do it opt-in.
>  	 */
> -	apic_write(APIC_LVTPC, APIC_DM_NMI);
> -
> +	if (!x86_pmu.late_ack)
> +		apic_write(APIC_LVTPC, APIC_DM_NMI);
>  	intel_pmu_disable_all();
>  	handled = intel_pmu_drain_bts_buffer();
>  	status = intel_pmu_get_status();
> @@ -1253,6 +1249,13 @@ again:
>  
>  done:
>  	intel_pmu_enable_all(0);
> +	/*
> +	 * Only unmask the NMI after the overflow counters
> +	 * have been reset. This avoids spurious NMIs on
> +	 * Haswell CPUs.
> +	 */
> +	if (x86_pmu.late_ack)
> +		apic_write(APIC_LVTPC, APIC_DM_NMI);
>  	return handled;
>  }
>  
> @@ -2257,6 +2260,7 @@ __init int intel_pmu_init(void)
>  	case 70:
>  	case 71:
>  	case 63:
> +		x86_pmu.late_ack = true;
>  		memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
>  		       sizeof(hw_cache_event_ids));
>  		memcpy(hw_cache_extra_regs, snb_hw_cache_extra_regs,

Ok - this is a much less intrusive solution.

Once the dust has settled we can try setting late_ack for all models, and 
if that works out without regressing, we can switch to the late ack method 
altogether.
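
The ordering that `late_ack` selects can be illustrated with a small
userspace model. This is only a sketch of the control flow in the patch
above: `enum step`, `struct trace`, `record()`, and `handle_pmi()` are
hypothetical names, and `record()` merely logs a step where the real handler
would touch the APIC or the counters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Steps of a simplified PMI handler, recorded in the order they run. */
enum step { STEP_UNMASK, STEP_DISABLE, STEP_HANDLE_OVERFLOWS, STEP_ENABLE };

#define MAX_STEPS 8

struct trace {
	enum step steps[MAX_STEPS];
	size_t    len;
};

/* Append one step to the trace, ignoring anything past MAX_STEPS. */
static void record(struct trace *t, enum step s)
{
	if (t->len < MAX_STEPS)
		t->steps[t->len++] = s;
}

/*
 * Model of the handler's ordering only. With late_ack == false the
 * interrupt is unmasked first; with late_ack == true it is unmasked
 * last, after the counters have been re-enabled.
 */
static void handle_pmi(bool late_ack, struct trace *t)
{
	t->len = 0;

	if (!late_ack)
		record(t, STEP_UNMASK);

	record(t, STEP_DISABLE);
	record(t, STEP_HANDLE_OVERFLOWS);
	record(t, STEP_ENABLE);

	if (late_ack)
		record(t, STEP_UNMASK);
}
```

Calling `handle_pmi(false, &t)` records the unmask as the first step, while
`handle_pmi(true, &t)` records it as the last one. Keeping the choice behind
a single flag is what would make it cheap to try the late variant on other
models later.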

Thanks,

	Ingo

* Re: Basic perf PMU support for Haswell v12
From: Ingo Molnar @ 2013-05-30  7:01 UTC
  To: Andi Kleen
  Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Stephane Eranian


* Ingo Molnar <mingo@kernel.org> wrote:

> * Andi Kleen <andi@firstfloor.org> wrote:
> 
> > On Tue, May 28, 2013 at 08:29:15AM +0200, Ingo Molnar wrote:
> > > 
> > > * Andi Kleen <andi@firstfloor.org> wrote:
> > > 
> > > > All outstanding issues fixed I hope. And I added mem-loads/stores support.
> > > > 
> > > > Contains support for:
> > > > - Basic Haswell PMU and PEBS support
> > > > - Late unmasking of the PMI
> > > > - mem-loads/stores support
> > > > 
> > > > v2: Addressed Stephane's feedback. See individual patches for details.
> > > > v3: now even more bite-sized. Qualifier constraints merged earlier.
> > > > v4: Rename some variables, add some comments and other minor changes.
> > > > Add some Reviewed/Tested-bys.
> > > > v5: Address some minor review feedback. Port to latest perf/core
> > > > v6: Add just some variable names, add comments, edit descriptions, some
> > > > more testing, rebased to latest perf/core
> > > > v7: Expand comment
> > > > v8: Rename structure field.
> > > > v9: No wide counters, but add basic LBRs. Add some more 
> > > > constraints. Rebase to 3.9rc1
> > > > v10: Change some whitespace. Rebase to 3.9rc3
> > > > v11: Rebase to perf/core. Fix extra regs. Rename INTX.
> > > > v12: Rebase to 3.10-rc2
> > > > Add mem-loads/stores support for parity with Sandy Bridge.
> > > > Fix fixed counters (Thanks Ingo!)
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > > Make late ack optional
> > > > Export new config bits in sysfs.
> > > > Minor changes
> > > 
> > > I reported a pretty nasty regression with the previous version (v10) which 
> > > made this series break default 'perf top' on non-Haswell systems - but 
> > > it's unclear from this changelog to what extent you managed to reproduce 
> > > the bug and fix it, and what the fix was?
> > 
> > Thanks for checking.
> > I didn't reproduce it, but I found a problem by code review with the 
> > fixed counter constraints.
> > 
> > I think I fixed it by adding this hunk:
> > 
> > @@ -2227,7 +2313,7 @@ __init int intel_pmu_init(void)
> >                  * counter, so do not extend mask to generic counters
> >                  */
> >                 for_each_event_constraint(c, x86_pmu.event_constraints) {
> > -                       if (c->cmask != X86_RAW_EVENT_MASK
> > +                       if (c->cmask != FIXED_EVENT_FLAGS
> >                             || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
> >                                 continue;
> >                         }
> > 
> > It would be cleaner to detect the fixed counters in some other way,
> > but that was the simplest fix I could find.
> > 
> > Testing appreciated 
> 
> Fair enough - I'll give it a whirl - that hunk does indeed look like it 
> could make a difference.

Ok, I can confirm that this fixed the perf top and perf record regression 
I saw on Intel systems.

Thanks,

	Ingo

* Re: Basic perf PMU support for Haswell v12
From: Ingo Molnar @ 2013-05-30  7:22 UTC
  To: Andi Kleen
  Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Thomas Gleixner


* Andi Kleen <andi@firstfloor.org> wrote:

> v12: Rebase to 3.10-rc2
> Add mem-loads/stores support for parity with Sandy Bridge.
> Fix fixed counters (Thanks Ingo!)
> Make late ack optional
> Export new config bits in sysfs.
> Minor changes

Sigh, what you have not fixed in your patches are the basic stylistic 
mistakes I pointed out in the past:

   https://lkml.org/lkml/2013/5/1/78

(my previous feedback is also quoted below.)

Here checkpatch.pl says this about your series:

  total: 6 errors, 10 warnings, 662 lines checked

and a handful of those checkpatch.pl complaints are for valid, real 
problems.

Furthermore, you have not replied to either of those two mails of mine, nor 
have you fixed the stylistic problems I pointed out, in these latest 
patches!

To fix it simply follow the advice I gave you twice before: run 
scripts/checkpatch.pl against your patches and address any valid 
complaints it gives _BEFORE YOU RESUBMIT THEM_!

Andi, what the heck is going on here? Your behavior makes no sense to me. 
You are pretty much the only contributor I know who makes a habit out of 
willfully ignoring maintainer feedback...

Thanks,

	Ingo

--------------------->
* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Ingo Molnar <mingo@kernel.org> wrote:
> 
> > 
> > * Ingo Molnar <mingo@kernel.org> wrote:
> > 
> > > You say it's barebones, yet it does not work :-( How well was this 
> > > patch-set tested on non-Haswell hardware, which makes up 99.99% of our 
> > > installed base?
> > > 
> > > In particular, after applying your patches, 'perf top' stopped working 
> > > on an Intel testbox of mine:
> > 
> > The other problem I noticed was stylistic: when I applied your patches for 
> > testing even Git complained about their cleanliness ...
> > 
> > To quote from Documentation/SubmittingPatches:
> > 
> >   4) Style check your changes.
> > 
> >   Check your patch for basic style violations, details of which can be
> >   found in Documentation/CodingStyle.  Failure to do so simply wastes
> >   the reviewers time and will get your patch rejected, probably
> >   without even being read.
> > 
> >   At a minimum you should check your patches with the patch style
> >   checker prior to submission (scripts/checkpatch.pl).  You should
> >   be able to justify all violations that remain in your patch.
> > 
> > Please make your patches less sloppy!
> 
> Andi, you have not replied to this mail of mine.
> 
> What new measures are you taking to keep such annoying stylistic problems 
> from creeping into your patches?
> 
> These problems are regular in your patches and that has been going on for 
> years - causing maintenance overhead for many maintainers, not just me.
> 
> Apparently you are not using proper tooling (checkpatch.pl for example) to 
> check your patches. If you refuse to take action I will have to stop 
> dealing with your patches directly altogether - the overhead just does not 
> justify the effort. You'll need to get your patches reviewed by and signed 
> off by a more experienced kernel hacker who knows how to submit patches.
> 
> Thanks,
