* [PATCH 0/2] perf: Add Tremont support
@ 2019-04-10 1:09 kan.liang
2019-04-10 1:09 ` [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters kan.liang
2019-04-10 1:10 ` [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support kan.liang
0 siblings, 2 replies; 7+ messages in thread
From: kan.liang @ 2019-04-10 1:09 UTC (permalink / raw)
To: peterz, mingo, linux-kernel
Cc: tglx, acme, jolsa, eranian, alexander.shishkin, ak, Kan Liang
From: Kan Liang <kan.liang@linux.intel.com>
This patch series adds Tremont support to Linux perf.
It applies on top of the Icelake V5 patch series (with Peter's cleanup patch).
https://lkml.org/lkml/2019/4/8/630
PATCH 1: The feature applies to both Icelake and Tremont. It was missed in the
Icelake patch series.
PATCH 2: Tremont core PMU support.
Kan Liang (2):
perf/x86/intel: Support adaptive PEBS for fixed counters
perf/x86/intel: Add Tremont core PMU support
arch/x86/events/intel/core.c | 93 +++++++++++++++++++++++++++++++++++++++
arch/x86/include/asm/perf_event.h | 1 +
2 files changed, 94 insertions(+)
--
2.7.4
* [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters
2019-04-10 1:09 [PATCH 0/2] perf: Add Tremont support kan.liang
@ 2019-04-10 1:09 ` kan.liang
2019-04-10 7:41 ` Peter Zijlstra
2019-04-10 1:10 ` [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support kan.liang
1 sibling, 1 reply; 7+ messages in thread
From: kan.liang @ 2019-04-10 1:09 UTC (permalink / raw)
To: peterz, mingo, linux-kernel
Cc: tglx, acme, jolsa, eranian, alexander.shishkin, ak, Kan Liang
From: Kan Liang <kan.liang@linux.intel.com>
Fixed counters can also generate adaptive PEBS records if the
corresponding bit in IA32_FIXED_CTR_CTRL is set.
Otherwise, only the basic record is generated.
Unconditionally set the bit when PEBS is enabled on fixed counters.
Let MSR_PEBS_CFG decide which format of PEBS record should be generated.
There is no harm in leaving the bit set.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
arch/x86/events/intel/core.c | 5 +++++
arch/x86/include/asm/perf_event.h | 1 +
2 files changed, 6 insertions(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 56df0f6..f34d92b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2174,6 +2174,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
bits <<= (idx * 4);
mask = 0xfULL << (idx * 4);
+ if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
+ bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
+ mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
+ }
+
rdmsrl(hwc->config_base, ctrl_val);
ctrl_val &= ~mask;
ctrl_val |= bits;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index dcb8bac..ce0dc88 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -33,6 +33,7 @@
#define HSW_IN_TX (1ULL << 32)
#define HSW_IN_TX_CHECKPOINTED (1ULL << 33)
#define ICL_EVENTSEL_ADAPTIVE (1ULL << 34)
+#define ICL_FIXED_0_ADAPTIVE (1ULL << 32)
#define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36)
#define AMD64_EVENTSEL_GUESTONLY (1ULL << 40)
--
2.7.4
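
The read-modify-write in the hunk above can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code: fixed_ctrl_update() is a hypothetical helper name, the MSR is simulated by a plain 64-bit value, and the only thing taken from the patch is that the adaptive bit for fixed counter 0 is bit 32, shifted by 4 bits per counter.

```c
#include <assert.h>
#include <stdint.h>

/* From the patch: adaptive-PEBS enable bit for fixed counter 0. */
#define ICL_FIXED_0_ADAPTIVE (1ULL << 32)

/*
 * Hypothetical helper (not a kernel function): compute the new value of a
 * simulated IA32_FIXED_CTR_CTRL for fixed counter idx.
 * bits     - the 4-bit control field for this counter
 * adaptive - nonzero when PEBS is enabled on the counter
 */
static uint64_t fixed_ctrl_update(uint64_t ctrl_val, unsigned int idx,
                                  uint64_t bits, int adaptive)
{
    uint64_t mask;

    assert(idx < 4); /* assumption for this sketch: at most 4 fixed counters */

    bits = (bits & 0xfULL) << (idx * 4);
    mask = 0xfULL << (idx * 4);

    if (adaptive) {
        /* The adaptive bit is both set and covered by the clear mask. */
        bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
        mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
    }

    ctrl_val &= ~mask;
    ctrl_val |= bits;
    return ctrl_val;
}
```

Note that when adaptive is zero the mask does not cover the adaptive bit, so a previously set bit is left alone, which matches the commit message's point that leaving the bit set does no harm.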
* [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support
2019-04-10 1:09 [PATCH 0/2] perf: Add Tremont support kan.liang
2019-04-10 1:09 ` [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters kan.liang
@ 2019-04-10 1:10 ` kan.liang
2019-04-10 7:51 ` Peter Zijlstra
1 sibling, 1 reply; 7+ messages in thread
From: kan.liang @ 2019-04-10 1:10 UTC (permalink / raw)
To: peterz, mingo, linux-kernel
Cc: tglx, acme, jolsa, eranian, alexander.shishkin, ak, Kan Liang
From: Kan Liang <kan.liang@linux.intel.com>
Add perf core PMU support for the Intel Tremont CPU.
The init code is based on Goldmont Plus.
General-purpose counter 0 and fixed counter 0 have less skid.
Force :ppp events onto general-purpose counter 0.
Force instruction:ppp always onto fixed counter 0.
Update the LLC cache event table and the OFFCORE_RESPONSE mask.
Adaptive PEBS, which is already enabled on ICL, is also supported
on Tremont. No extra code is required.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
arch/x86/events/intel/core.c | 88 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index f34d92b..f478831 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -1856,6 +1856,45 @@ static __initconst const u64 glp_hw_cache_extra_regs
},
};
+#define TNT_LOCAL_DRAM BIT_ULL(26)
+#define TNT_DEMAND_READ GLM_DEMAND_DATA_RD
+#define TNT_DEMAND_WRITE GLM_DEMAND_RFO
+#define TNT_LLC_ACCESS GLM_ANY_RESPONSE
+#define TNT_SNP_ANY (SNB_SNP_NOT_NEEDED|SNB_SNP_MISS| \
+ SNB_NO_FWD|SNB_SNP_FWD|SNB_HITM)
+#define TNT_LLC_MISS (TNT_SNP_ANY|SNB_NON_DRAM|TNT_LOCAL_DRAM)
+
+static __initconst const u64 tnt_hw_cache_extra_regs
+ [PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+ [C(LL)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = TNT_DEMAND_READ|
+ TNT_LLC_ACCESS,
+ [C(RESULT_MISS)] = TNT_DEMAND_READ|
+ TNT_LLC_MISS,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = TNT_DEMAND_WRITE|
+ TNT_LLC_ACCESS,
+ [C(RESULT_MISS)] = TNT_DEMAND_WRITE|
+ TNT_LLC_MISS,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = 0x0,
+ [C(RESULT_MISS)] = 0x0,
+ },
+ },
+};
+
+static struct extra_reg intel_tnt_extra_regs[] __read_mostly = {
+ /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
+ INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0xffffff9fffull, RSP_0),
+ INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0xffffff9fffull, RSP_1),
+ EVENT_EXTRA_END
+};
+
#define KNL_OT_L2_HITE BIT_ULL(19) /* Other Tile L2 Hit */
#define KNL_OT_L2_HITF BIT_ULL(20) /* Other Tile L2 Hit */
#define KNL_MCDRAM_LOCAL BIT_ULL(21)
@@ -3451,6 +3490,29 @@ glp_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
return c;
}
+static struct event_constraint *
+tnt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
+ struct perf_event *event)
+{
+ struct event_constraint *c;
+
+ /*
+ * :ppp means to do reduced skid PEBS,
+ * which is available at PMC0 and fixed counter 0.
+ */
+ if (event->attr.precise_ip == 3) {
+ /* Force instruction:ppp in Fixed counter 0 */
+ if (event->hw.config == X86_CONFIG(.event=0xc0))
+ return &fixed_counter0_constraint;
+
+ return &counter0_constraint;
+ }
+
+ c = intel_get_event_constraints(cpuc, idx, event);
+
+ return c;
+}
+
static bool allow_tsx_force_abort = true;
static struct event_constraint *
@@ -4530,6 +4592,32 @@ __init int intel_pmu_init(void)
name = "goldmont_plus";
break;
+ case INTEL_FAM6_ATOM_TREMONT_X:
+ x86_pmu.late_ack = true;
+ memcpy(hw_cache_event_ids, glp_hw_cache_event_ids,
+ sizeof(hw_cache_event_ids));
+ memcpy(hw_cache_extra_regs, tnt_hw_cache_extra_regs,
+ sizeof(hw_cache_extra_regs));
+ hw_cache_event_ids[C(ITLB)][C(OP_READ)][C(RESULT_ACCESS)] = -1;
+
+ intel_pmu_lbr_init_skl();
+
+ x86_pmu.event_constraints = intel_slm_event_constraints;
+ x86_pmu.extra_regs = intel_tnt_extra_regs;
+ /*
+ * It's recommended to use CPU_CLK_UNHALTED.CORE_P + NPEBS
+ * for precise cycles.
+ */
+ x86_pmu.pebs_aliases = NULL;
+ x86_pmu.pebs_prec_dist = true;
+ x86_pmu.lbr_pt_coexist = true;
+ x86_pmu.flags |= PMU_FL_HAS_RSP_1;
+ x86_pmu.get_event_constraints = tnt_get_event_constraints;
+ extra_attr = slm_format_attr;
+ pr_cont("Tremont events, ");
+ name = "Tremont";
+ break;
+
case INTEL_FAM6_WESTMERE:
case INTEL_FAM6_WESTMERE_EP:
case INTEL_FAM6_WESTMERE_EX:
--
2.7.4
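
For readers unfamiliar with the valid-mask argument in intel_tnt_extra_regs, here is a small user-space sketch of what the value 0xffffff9fff permits. It assumes that a requested OFFCORE_RESPONSE value is acceptable only if it sets no bits outside the valid mask; the macro and function names are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Valid mask taken from the intel_tnt_extra_regs entries in the patch. */
#define TNT_OFFCORE_RSP_VALID_MASK 0xffffff9fffULL

/*
 * Hypothetical check (assumption: a value is acceptable only when every
 * set bit falls inside the valid mask).
 */
static bool tnt_offcore_rsp_config_ok(uint64_t config1)
{
    return (config1 & ~TNT_OFFCORE_RSP_VALID_MASK) == 0;
}
```

With this mask, bits 0 to 39 are accepted except bits 13 and 14. TNT_LOCAL_DRAM (bit 26) passes the check, while a value with bit 13 or bit 40 set does not.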
* Re: [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters
2019-04-10 1:09 ` [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters kan.liang
@ 2019-04-10 7:41 ` Peter Zijlstra
2019-04-10 13:57 ` Liang, Kan
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2019-04-10 7:41 UTC (permalink / raw)
To: kan.liang
Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
alexander.shishkin, ak
On Tue, Apr 09, 2019 at 06:09:59PM -0700, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
>
> Fixed counters can also generate adaptive PEBS records if the
> corresponding bit in IA32_FIXED_CTR_CTRL is set.
> Otherwise, only the basic record is generated.
>
> Unconditionally set the bit when PEBS is enabled on fixed counters.
> Let MSR_PEBS_CFG decide which format of PEBS record should be generated.
> There is no harm in leaving the bit set.
I'll merge this back into:
Subject: perf/x86/intel: Support adaptive PEBSv4
such that this bug never existed, ok?
>
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> ---
> arch/x86/events/intel/core.c | 5 +++++
> arch/x86/include/asm/perf_event.h | 1 +
> 2 files changed, 6 insertions(+)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 56df0f6..f34d92b 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2174,6 +2174,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
> bits <<= (idx * 4);
> mask = 0xfULL << (idx * 4);
>
> + if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
> + bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
> + mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
> + }
> +
> rdmsrl(hwc->config_base, ctrl_val);
> ctrl_val &= ~mask;
> ctrl_val |= bits;
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index dcb8bac..ce0dc88 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -33,6 +33,7 @@
> #define HSW_IN_TX (1ULL << 32)
> #define HSW_IN_TX_CHECKPOINTED (1ULL << 33)
> #define ICL_EVENTSEL_ADAPTIVE (1ULL << 34)
> +#define ICL_FIXED_0_ADAPTIVE (1ULL << 32)
>
> #define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36)
> #define AMD64_EVENTSEL_GUESTONLY (1ULL << 40)
> --
> 2.7.4
>
* Re: [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support
2019-04-10 1:10 ` [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support kan.liang
@ 2019-04-10 7:51 ` Peter Zijlstra
2019-04-10 13:58 ` Liang, Kan
0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2019-04-10 7:51 UTC (permalink / raw)
To: kan.liang
Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
alexander.shishkin, ak
On Tue, Apr 09, 2019 at 06:10:00PM -0700, kan.liang@linux.intel.com wrote:
> General-purpose counter 0 and fixed counter 0 have less skid.
> Force :ppp events onto general-purpose counter 0.
> Force instruction:ppp always onto fixed counter 0.
> +static struct event_constraint *
> +tnt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> + struct perf_event *event)
> +{
> + struct event_constraint *c;
> +
> + /*
> + * :ppp means to do reduced skid PEBS,
> + * which is available at PMC0 and fixed counter 0.
> + */
> + if (event->attr.precise_ip == 3) {
> + /* Force instruction:ppp in Fixed counter 0 */
> + if (event->hw.config == X86_CONFIG(.event=0xc0))
> + return &fixed_counter0_constraint;
> +
> + return &counter0_constraint;
I'm confused, 0xc0 is the architectural 'instructions' event, surely we
can program that on pmc0 too?
Did we want a fixed0_counter0_constraint for that?
> + }
> +
> + c = intel_get_event_constraints(cpuc, idx, event);
> +
> + return c;
> +}
* Re: [PATCH 1/2] perf/x86/intel: Support adaptive PEBS for fixed counters
2019-04-10 7:41 ` Peter Zijlstra
@ 2019-04-10 13:57 ` Liang, Kan
0 siblings, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2019-04-10 13:57 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
alexander.shishkin, ak
On 4/10/2019 3:41 AM, Peter Zijlstra wrote:
> On Tue, Apr 09, 2019 at 06:09:59PM -0700, kan.liang@linux.intel.com wrote:
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> Fixed counters can also generate adaptive PEBS records if the
>> corresponding bit in IA32_FIXED_CTR_CTRL is set.
>> Otherwise, only the basic record is generated.
>>
>> Unconditionally set the bit when PEBS is enabled on fixed counters.
>> Let MSR_PEBS_CFG decide which format of PEBS record should be generated.
>> There is no harm in leaving the bit set.
>
> I'll merge this back into:
>
> Subject: perf/x86/intel: Support adaptive PEBSv4
>
> such that this bug never existed, ok?
Yes, please.
Thanks,
Kan
>
>>
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>> arch/x86/events/intel/core.c | 5 +++++
>> arch/x86/include/asm/perf_event.h | 1 +
>> 2 files changed, 6 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 56df0f6..f34d92b 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -2174,6 +2174,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
>> bits <<= (idx * 4);
>> mask = 0xfULL << (idx * 4);
>>
>> + if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip) {
>> + bits |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
>> + mask |= ICL_FIXED_0_ADAPTIVE << (idx * 4);
>> + }
>> +
>> rdmsrl(hwc->config_base, ctrl_val);
>> ctrl_val &= ~mask;
>> ctrl_val |= bits;
>> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
>> index dcb8bac..ce0dc88 100644
>> --- a/arch/x86/include/asm/perf_event.h
>> +++ b/arch/x86/include/asm/perf_event.h
>> @@ -33,6 +33,7 @@
>> #define HSW_IN_TX (1ULL << 32)
>> #define HSW_IN_TX_CHECKPOINTED (1ULL << 33)
>> #define ICL_EVENTSEL_ADAPTIVE (1ULL << 34)
>> +#define ICL_FIXED_0_ADAPTIVE (1ULL << 32)
>>
>> #define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36)
>> #define AMD64_EVENTSEL_GUESTONLY (1ULL << 40)
>> --
>> 2.7.4
>>
* Re: [PATCH 2/2] perf/x86/intel: Add Tremont core PMU support
2019-04-10 7:51 ` Peter Zijlstra
@ 2019-04-10 13:58 ` Liang, Kan
0 siblings, 0 replies; 7+ messages in thread
From: Liang, Kan @ 2019-04-10 13:58 UTC (permalink / raw)
To: Peter Zijlstra
Cc: mingo, linux-kernel, tglx, acme, jolsa, eranian,
alexander.shishkin, ak
On 4/10/2019 3:51 AM, Peter Zijlstra wrote:
> On Tue, Apr 09, 2019 at 06:10:00PM -0700, kan.liang@linux.intel.com wrote:
>
>> General-purpose counter 0 and fixed counter 0 have less skid.
>> Force :ppp events onto general-purpose counter 0.
>> Force instruction:ppp always onto fixed counter 0.
>
>> +static struct event_constraint *
>> +tnt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
>> + struct perf_event *event)
>> +{
>> + struct event_constraint *c;
>> +
>> + /*
>> + * :ppp means to do reduced skid PEBS,
>> + * which is available at PMC0 and fixed counter 0.
>> + */
>> + if (event->attr.precise_ip == 3) {
>> + /* Force instruction:ppp in Fixed counter 0 */
>> + if (event->hw.config == X86_CONFIG(.event=0xc0))
>> + return &fixed_counter0_constraint;
>> +
>> + return &counter0_constraint;
>
> I'm confused, 0xc0 is the architectural 'instructions' event, surely we
> can program that on pmc0 too?
>
> Did we want a fixed0_counter0_constraint for that?
>
Yes, I will send out V2 shortly to fix it.
Thanks,
Kan
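
The point raised in this exchange can be illustrated with counter bitmasks. The sketch below is a user-space illustration only and is not the V2 patch: it assumes counters are identified by bit position, with general-purpose counter 0 at bit 0 and fixed counter 0 at bit 32, and all names in it are hypothetical. It shows the selection Peter suggests, where instruction:ppp may land on either fixed counter 0 or PMC0.

```c
#include <stdint.h>

/* Assumed counter indices for this sketch. */
#define IDX_PMC0           0   /* general-purpose counter 0 */
#define IDX_FIXED0         32  /* fixed counter 0 (assumed index) */

#define MASK_PMC0          (1ULL << IDX_PMC0)
#define MASK_FIXED0        (1ULL << IDX_FIXED0)
#define MASK_ANY           (~0ULL) /* no extra restriction from :ppp */

/* Event select 0xc0 with umask 0: the architectural 'instructions' event. */
#define EVENT_INSTRUCTIONS 0xc0ULL

/*
 * Hypothetical helper: return the set of counters a Tremont event may use.
 * Only :ppp (precise_ip == 3) restricts the choice to the reduced-skid
 * counters; everything else is left to the generic constraint logic.
 */
static uint64_t tnt_allowed_counters(unsigned int precise_ip, uint64_t config)
{
    if (precise_ip != 3)
        return MASK_ANY;

    /* instruction:ppp can run on fixed counter 0 or on PMC0. */
    if (config == EVENT_INSTRUCTIONS)
        return MASK_FIXED0 | MASK_PMC0;

    /* Any other :ppp event is limited to PMC0. */
    return MASK_PMC0;
}
```

In the posted patch the 0xc0 case returns only the fixed counter 0 constraint, which is what prompted the question; a combined mask such as the one above keeps PMC0 available for the same event.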