* [PATCH 0/2] perf: Add Tremont support
@ 2019-04-10  1:09 kan.liang
  2019-04-10  1:09 ` [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters kan.liang
  2019-04-10  1:10 ` [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support kan.liang
  0 siblings, 2 replies; 7+ messages in thread
From: kan.liang @ 2019-04-10  1:09 UTC
  To: peterz, mingo, linux-kernel
  Cc: tglx, acme, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

This patch series adds Tremont support to Linux perf.

The series applies on top of the Icelake V5 patch series (with Peter's cleanup patch):
https://lkml.org/lkml/2019/4/8/630

PATCH 1: Adaptive PEBS for fixed counters. The feature applies to both
         Icelake and Tremont; it was missed in the Icelake patch series.
PATCH 2: Tremont core PMU support.

Kan Liang (2):
  perf/x86/intel: Support adaptive PEBS for fixed counters
  perf/x86/intel: Add Tremont core PMU support

 arch/x86/events/intel/core.c      | 93 +++++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/perf_event.h |  1 +
 2 files changed, 94 insertions(+)

-- 
2.7.4



* [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters
  2019-04-10  1:09 [PATCH 0/2] perf: Add Tremont support kan.liang
@ 2019-04-10  1:09 ` kan.liang
  2019-04-10  7:41   ` Peter Zijlstra
  2019-04-10  1:10 ` [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support kan.liang
  1 sibling, 1 reply; 7+ messages in thread
From: kan.liang @ 2019-04-10  1:09 UTC
  To: peterz, mingo, linux-kernel
  Cc: tglx, acme, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Fixed counters can also generate adaptive PEBS records if the
corresponding bit in IA32_FIXED_CTR_CTRL is set.
Otherwise, only the basic record is generated.

Unconditionally set the bit when PEBS is enabled on fixed counters.
Let MSR_PEBS_CFG decide which format of PEBS record is generated.
There is no harm in leaving the bit set.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/core.c      | 5 +++++
 arch/x86/include/asm/perf_event.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 56df0f6..f34d92b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2174,6 +2174,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 	bits <<= (idx * 4);
 	mask = 0xfULL << (idx * 4);
 
+	if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
+		bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
+		mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
+	}
+
 	rdmsrl(hwc->config_base, ctrl_val);
 	ctrl_val &= ~mask;
 	ctrl_val |= bits;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index dcb8bac..ce0dc88 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -33,6 +33,7 @@
 #define HSW_IN_TX					(1ULL << 32)
 #define HSW_IN_TX_CHECKPOINTED				(1ULL << 33)
 #define ICL_EVENTSEL_ADAPTIVE				(1ULL << 34)
+#define ICL_FIXED_0_ADAPTIVE				(1ULL << 32)
 
 #define AMD64_EVENTSEL_INT_CORE_ENABLE			(1ULL << 36)
 #define AMD64_EVENTSEL_GUESTONLY			(1ULL << 40)
-- 
2.7.4
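
For readers following the hunk in intel_pmu_enable_fixed(), the read-modify-write
it performs on IA32_FIXED_CTR_CTRL can be sketched in plain userspace C. This is
only an illustration of the arithmetic in the patch, not kernel code: the helper
name fixed_ctrl_update(), its parameters, and the idx < 4 bound are assumptions
made for the example. Only ICL_FIXED_0_ADAPTIVE and the shift by (idx * 4) come
from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as the define added by the patch. */
#define ICL_FIXED_0_ADAPTIVE	(1ULL << 32)

/*
 * Hypothetical helper: return the new control value for fixed counter idx.
 * bits     - the 4-bit enable field for the counter, not yet shifted
 * adaptive - non-zero when pebs_baseline is available and precise_ip is set
 */
static uint64_t fixed_ctrl_update(uint64_t ctrl_val, unsigned int idx,
				  uint64_t bits, int adaptive)
{
	uint64_t mask;

	assert(idx < 4);	/* assumed bound for this sketch only */

	bits = (bits & 0xfULL) << (idx * 4);
	mask = 0xfULL << (idx * 4);

	/* As in the patch: the adaptive bit moves by 4 per fixed counter. */
	if (adaptive) {
		bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
		mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
	}

	/* Clear the counter's old field, then install the new one. */
	ctrl_val &= ~mask;
	ctrl_val |= bits;
	return ctrl_val;
}
```

With this helper, fixed counter 1 with an enable field of 0x3 and adaptive set
yields bit 36 plus 0x30 in the low word, which is what the shifted define in the
patch produces.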



* [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support
  2019-04-10  1:09 [PATCH 0/2] perf: Add Tremont support kan.liang
  2019-04-10  1:09 ` [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters kan.liang
@ 2019-04-10  1:10 ` kan.liang
  2019-04-10  7:51   ` Peter Zijlstra
  1 sibling, 1 reply; 7+ messages in thread
From: kan.liang @ 2019-04-10  1:10 UTC
  To: peterz, mingo, linux-kernel
  Cc: tglx, acme, jolsa, eranian, alexander.shishkin, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Add perf core PMU support for the Intel Tremont CPU.

The init code is based on Goldmont Plus.

General-purpose counter 0 and fixed counter 0 have less skid.
Force :ppp events onto general-purpose counter 0.
Always force instruction:ppp onto fixed counter 0.

Update the LLC cache event table and the OFFCORE_RESPONSE mask.

Adaptive PEBS, which is already enabled on ICL, is also supported
on Tremont. No extra code is required.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/core.c | 88 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index f34d92b..f478831 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -1856,6 +1856,45 @@ static __initconst const u64 glp_hw_cache_extra_regs
 	},
 };
 
+#define TNT_LOCAL_DRAM			BIT_ULL(26)
+#define TNT_DEMAND_READ			GLM_DEMAND_DATA_RD
+#define TNT_DEMAND_WRITE		GLM_DEMAND_RFO
+#define TNT_LLC_ACCESS			GLM_ANY_RESPONSE
+#define TNT_SNP_ANY			(SNB_SNP_NOT_NEEDED|SNB_SNP_MISS| \
+					 SNB_NO_FWD|SNB_SNP_FWD|SNB_HITM)
+#define TNT_LLC_MISS			(TNT_SNP_ANY|SNB_NON_DRAM|TNT_LOCAL_DRAM)
+
+static __initconst const u64 tnt_hw_cache_extra_regs
+				[PERF_COUNT_HW_CACHE_MAX]
+				[PERF_COUNT_HW_CACHE_OP_MAX]
+				[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= TNT_DEMAND_READ|
+						  TNT_LLC_ACCESS,
+			[C(RESULT_MISS)]	= TNT_DEMAND_READ|
+						  TNT_LLC_MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= TNT_DEMAND_WRITE|
+						  TNT_LLC_ACCESS,
+			[C(RESULT_MISS)]	= TNT_DEMAND_WRITE|
+						  TNT_LLC_MISS,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= 0x0,
+			[C(RESULT_MISS)]	= 0x0,
+		},
+	},
+};
+
+static struct extra_reg intel_tnt_extra_regs[] __read_mostly = {
+	/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
+	INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0xffffff9fffull, RSP_0),
+	INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0xffffff9fffull, RSP_1),
+	EVENT_EXTRA_END
+};
+
 #define KNL_OT_L2_HITE		BIT_ULL(19) /* Other Tile L2 Hit */
 #define KNL_OT_L2_HITF		BIT_ULL(20) /* Other Tile L2 Hit */
 #define KNL_MCDRAM_LOCAL	BIT_ULL(21)
@@ -3451,6 +3490,29 @@ glp_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
 	return c;
 }
 
+static struct event_constraint *
+tnt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
+			  struct perf_event *event)
+{
+	struct event_constraint *c;
+
+	/*
+	 * :ppp means to do reduced skid PEBS,
+	 * which is available at PMC0 and fixed counter 0.
+	 */
+	if (event->attr.precise_ip == 3) {
+		/* Force instruction:ppp in Fixed counter 0 */
+		if (event->hw.config == X86_CONFIG(.event=0xc0))
+			return &fixed_counter0_constraint;
+
+		return &counter0_constraint;
+	}
+
+	c = intel_get_event_constraints(cpuc, idx, event);
+
+	return c;
+}
+
 static bool allow_tsx_force_abort = true;
 
 static struct event_constraint *
@@ -4530,6 +4592,32 @@ __init int intel_pmu_init(void)
 		name = "goldmont_plus";
 		break;
 
+	case INTEL_FAM6_ATOM_TREMONT_X:
+		x86_pmu.late_ack = true;
+		memcpy(hw_cache_event_ids, glp_hw_cache_event_ids,
+		       sizeof(hw_cache_event_ids));
+		memcpy(hw_cache_extra_regs, tnt_hw_cache_extra_regs,
+		       sizeof(hw_cache_extra_regs));
+		hw_cache_event_ids[C(ITLB)][C(OP_READ)][C(RESULT_ACCESS)] = -1;
+
+		intel_pmu_lbr_init_skl();
+
+		x86_pmu.event_constraints = intel_slm_event_constraints;
+		x86_pmu.extra_regs = intel_tnt_extra_regs;
+		/*
+		 * It's recommended to use CPU_CLK_UNHALTED.CORE_P + NPEBS
+		 * for precise cycles.
+		 */
+		x86_pmu.pebs_aliases = NULL;
+		x86_pmu.pebs_prec_dist = true;
+		x86_pmu.lbr_pt_coexist = true;
+		x86_pmu.flags |= PMU_FL_HAS_RSP_1;
+		x86_pmu.get_event_constraints = tnt_get_event_constraints;
+		extra_attr = slm_format_attr;
+		pr_cont("Tremont events, ");
+		name = "Tremont";
+		break;
+
 	case INTEL_FAM6_WESTMERE:
 	case INTEL_FAM6_WESTMERE_EP:
 	case INTEL_FAM6_WESTMERE_EX:
-- 
2.7.4
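
A note on the 0xffffff9fffull valid mask in intel_tnt_extra_regs: the mask has
bits 13 and 14 clear and nothing set above bit 39, so a user-supplied
OFFCORE_RSP value using those bits is expected to be rejected. The check can be
sketched in userspace C as below. The macro and function names are hypothetical
and chosen for this example; only the mask value and the TNT_LOCAL_DRAM bit
position (26) are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Valid-bit mask used for both OFFCORE_RSP registers in the patch. */
#define TNT_OFFCORE_RSP_VALID_MASK	0xffffff9fffULL

/* Bits 13 and 14 are not part of the mask. */
static_assert((TNT_OFFCORE_RSP_VALID_MASK & (3ULL << 13)) == 0,
	      "bits 13-14 must be outside the valid mask");

/* Hypothetical helper: true if config1 only uses bits the mask allows. */
static bool tnt_offcore_rsp_valid(uint64_t config1)
{
	return (config1 & ~TNT_OFFCORE_RSP_VALID_MASK) == 0;
}
```

For example, a value with only bit 26 (TNT_LOCAL_DRAM) set passes this check,
while a value with bit 13 set does not.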



* Re: [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters
  2019-04-10  1:09 ` [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters kan.liang
@ 2019-04-10  7:41   ` Peter Zijlstra
  2019-04-10 13:57     ` Liang, Kan
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2019-04-10  7:41 UTC
  To: kan.liang
  Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
	alexander.shishkin, ak

On Tue, Apr 09, 2019 at 06:09:59PM -0700, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> Fixed counters can also generate adaptive PEBS records if the
> corresponding bit in IA32_FIXED_CTR_CTRL is set.
> Otherwise, only the basic record is generated.
> 
> Unconditionally set the bit when PEBS is enabled on fixed counters.
> Let MSR_PEBS_CFG decide which format of PEBS record is generated.
> There is no harm in leaving the bit set.

I'll merge this back into:

  Subject: perf/x86/intel: Support adaptive PEBSv4

such that this bug never existed, ok?

> 
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> ---
>  arch/x86/events/intel/core.c      | 5 +++++
>  arch/x86/include/asm/perf_event.h | 1 +
>  2 files changed, 6 insertions(+)
> 
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 56df0f6..f34d92b 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2174,6 +2174,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
>  	bits <<= (idx * 4);
>  	mask = 0xfULL << (idx * 4);
>  
> +	if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
> +		bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
> +		mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
> +	}
> +
>  	rdmsrl(hwc->config_base, ctrl_val);
>  	ctrl_val &= ~mask;
>  	ctrl_val |= bits;
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index dcb8bac..ce0dc88 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -33,6 +33,7 @@
>  #define HSW_IN_TX					(1ULL << 32)
>  #define HSW_IN_TX_CHECKPOINTED				(1ULL << 33)
>  #define ICL_EVENTSEL_ADAPTIVE				(1ULL << 34)
> +#define ICL_FIXED_0_ADAPTIVE				(1ULL << 32)
>  
>  #define AMD64_EVENTSEL_INT_CORE_ENABLE			(1ULL << 36)
>  #define AMD64_EVENTSEL_GUESTONLY			(1ULL << 40)
> -- 
> 2.7.4
> 


* Re: [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support
  2019-04-10  1:10 ` [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support kan.liang
@ 2019-04-10  7:51   ` Peter Zijlstra
  2019-04-10 13:58     ` Liang, Kan
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2019-04-10  7:51 UTC
  To: kan.liang
  Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
	alexander.shishkin, ak

On Tue, Apr 09, 2019 at 06:10:00PM -0700, kan.liang@linux.intel.com wrote:

> General-purpose counter 0 and fixed counter 0 have less skid.
> Force :ppp events onto general-purpose counter 0.
> Always force instruction:ppp onto fixed counter 0.

> +static struct event_constraint *
> +tnt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> +			  struct perf_event *event)
> +{
> +	struct event_constraint *c;
> +
> +	/*
> +	 * :ppp means to do reduced skid PEBS,
> +	 * which is available at PMC0 and fixed counter 0.
> +	 */
> +	if (event->attr.precise_ip == 3) {
> +		/* Force instruction:ppp in Fixed counter 0 */
> +		if (event->hw.config == X86_CONFIG(.event=0xc0))
> +			return &fixed_counter0_constraint;
> +
> +		return &counter0_constraint;

I'm confused, 0xc0 is the architectural 'instructions' event, surely we
can program that on pmc0 too?

Did we want a fixed0_counter0_constraint for that?

> +	}
> +
> +	c = intel_get_event_constraints(cpuc, idx, event);
> +
> +	return c;
> +}


* Re: [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters
  2019-04-10  7:41   ` Peter Zijlstra
@ 2019-04-10 13:57     ` Liang, Kan
  0 siblings, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2019-04-10 13:57 UTC
  To: Peter Zijlstra
  Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
	alexander.shishkin, ak



On 4/10/2019 3:41 AM, Peter Zijlstra wrote:
> On Tue, Apr 09, 2019 at 06:09:59PM -0700, kan.liang@linux.intel.com wrote:
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> Fixed counters can also generate adaptive PEBS records if the
>> corresponding bit in IA32_FIXED_CTR_CTRL is set.
>> Otherwise, only the basic record is generated.
>>
>> Unconditionally set the bit when PEBS is enabled on fixed counters.
>> Let MSR_PEBS_CFG decide which format of PEBS record is generated.
>> There is no harm in leaving the bit set.
> 
> I'll merge this back into:
> 
>    Subject: perf/x86/intel: Support adaptive PEBSv4
> 
> such that this bug never existed, ok?

Yes, please.

Thanks,
Kan

> 
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>   arch/x86/events/intel/core.c      | 5 +++++
>>   arch/x86/include/asm/perf_event.h | 1 +
>>   2 files changed, 6 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 56df0f6..f34d92b 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -2174,6 +2174,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
>>   	bits <<= (idx * 4);
>>   	mask = 0xfULL << (idx * 4);
>>   
>> +	if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
>> +		bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
>> +		mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
>> +	}
>> +
>>   	rdmsrl(hwc->config_base, ctrl_val);
>>   	ctrl_val &= ~mask;
>>   	ctrl_val |= bits;
>> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
>> index dcb8bac..ce0dc88 100644
>> --- a/arch/x86/include/asm/perf_event.h
>> +++ b/arch/x86/include/asm/perf_event.h
>> @@ -33,6 +33,7 @@
>>   #define HSW_IN_TX					(1ULL << 32)
>>   #define HSW_IN_TX_CHECKPOINTED				(1ULL << 33)
>>   #define ICL_EVENTSEL_ADAPTIVE				(1ULL << 34)
>> +#define ICL_FIXED_0_ADAPTIVE				(1ULL << 32)
>>   
>>   #define AMD64_EVENTSEL_INT_CORE_ENABLE			(1ULL << 36)
>>   #define AMD64_EVENTSEL_GUESTONLY			(1ULL << 40)
>> -- 
>> 2.7.4
>>


* Re: [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support
  2019-04-10  7:51   ` Peter Zijlstra
@ 2019-04-10 13:58     ` Liang, Kan
  0 siblings, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2019-04-10 13:58 UTC
  To: Peter Zijlstra
  Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
	alexander.shishkin, ak



On 4/10/2019 3:51 AM, Peter Zijlstra wrote:
> On Tue, Apr 09, 2019 at 06:10:00PM -0700, kan.liang@linux.intel.com wrote:
> 
>> General-purpose counter 0 and fixed counter 0 have less skid.
>> Force :ppp events onto general-purpose counter 0.
>> Always force instruction:ppp onto fixed counter 0.
> 
>> +static struct event_constraint *
>> +tnt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>> +			  struct perf_event *event)
>> +{
>> +	struct event_constraint *c;
>> +
>> +	/*
>> +	 * :ppp means to do reduced skid PEBS,
>> +	 * which is available at PMC0 and fixed counter 0.
>> +	 */
>> +	if (event->attr.precise_ip == 3) {
>> +		/* Force instruction:ppp in Fixed counter 0 */
>> +		if (event->hw.config == X86_CONFIG(.event=0xc0))
>> +			return &fixed_counter0_constraint;
>> +
>> +		return &counter0_constraint;
> 
> I'm confused, 0xc0 is the architectural 'instructions' event, surely we
> can program that on pmc0 too?
> 
> Did we want a fixed0_counter0_constraint for that?
> 

Yes, I will send out V2 shortly to fix it.

Thanks,
Kan
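
To make the review point concrete, here is a userspace sketch of the selection
logic with a combined "fixed counter 0 or PMC0" choice for instructions:ppp.
This is not the V2 patch, which is not part of this thread. It assumes that
fixed counter 0 is represented as bit 32 of the allowed-counter mask and PMC0 as
bit 0. All names below are hypothetical; only the 0xc0 event code and the
precise_ip == 3 test come from the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: fixed counter n is bit (32 + n) of the mask. */
#define SKETCH_FIXED_IDX_BASE	32
#define SKETCH_PMC0_MASK	(1ULL << 0)
#define SKETCH_FIXED0_MASK	(1ULL << SKETCH_FIXED_IDX_BASE)

/* Architectural 'instructions' event code, as used in the patch. */
#define SKETCH_EVENT_INSTRUCTIONS	0xc0ULL

/*
 * Hypothetical helper: return the set of counters an event may use.
 * default_mask stands in for whatever the generic constraint lookup returns.
 */
static uint64_t tnt_ppp_counter_mask(unsigned int precise_ip, uint64_t config,
				     uint64_t default_mask)
{
	assert(precise_ip <= 3);

	/* Only :ppp (reduced skid) events are restricted. */
	if (precise_ip != 3)
		return default_mask;

	/* instructions:ppp may run on fixed counter 0 or on PMC0. */
	if (config == SKETCH_EVENT_INSTRUCTIONS)
		return SKETCH_FIXED0_MASK | SKETCH_PMC0_MASK;

	/* Every other :ppp event is limited to PMC0. */
	return SKETCH_PMC0_MASK;
}
```

Under these assumptions, instructions:ppp gets the mask 0x100000001, so the
scheduler could still fall back to PMC0 when fixed counter 0 is busy.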

