* [PATCH 0/3] perf/x86/intel/pt: Add new packet enables
@ 2015-07-17 13:34 Alexander Shishkin
2015-07-17 13:34 ` [PATCH 1/3] perf/x86/intel/pt: Add new timing " Alexander Shishkin
` (2 more replies)
0 siblings, 3 replies; 21+ messages in thread
From: Alexander Shishkin @ 2015-07-17 13:34 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme,
Alexander Shishkin
Hi Peter,
The new version of the Intel Architecture SDM describes new PT packets and the
controls to enable and configure them. This series deals with the timing
related packets: it adds bit definitions, CPUID enumeration and the relevant
userspace-visible parts.
The bulk of the work, as usual, is in the tooling, which I don't include in
this patchset, since it depends on Adrian's earlier work which is still
bouncing between him and the maintainers. For the sake of completeness I
pushed it to my github tree [1], and Adrian is also sending it out for
review, so anybody interested in trying the whole thing out can pull it
from my tree.
This time around, however, there is no compile time dependency between
mine and his series, since all the relevant bits are communicated through
sysfs.
[1] https://github.com/virtuoso/linux-perf/commits/intel_pt
Alexander Shishkin (3):
perf/x86/intel/pt: Add new timing packet enables
perf/x86/intel/pt: Add an option to not force PSB+ on every
schedule-in
perf/x86/intel/bts: Set itrace_started in pmu::start to match the new
logic
arch/x86/include/asm/msr-index.h | 8 +++
arch/x86/kernel/cpu/intel_pt.h | 6 +++
arch/x86/kernel/cpu/perf_event.h | 1 +
arch/x86/kernel/cpu/perf_event_intel_bts.c | 1 +
arch/x86/kernel/cpu/perf_event_intel_pt.c | 78 ++++++++++++++++++++++++++++--
kernel/events/core.c | 2 -
6 files changed, 91 insertions(+), 5 deletions(-)
--
2.1.4
^ permalink raw reply [flat|nested] 21+ messages in thread

* [PATCH 1/3] perf/x86/intel/pt: Add new timing packet enables
2015-07-17 13:34 [PATCH 0/3] perf/x86/intel/pt: Add new packet enables Alexander Shishkin
@ 2015-07-17 13:34 ` Alexander Shishkin
2015-07-30 10:40 ` Peter Zijlstra
2015-07-17 13:34 ` [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in Alexander Shishkin
2015-07-17 13:34 ` [PATCH 3/3] perf/x86/intel/bts: Set itrace_started in pmu::start to match the new logic Alexander Shishkin
2 siblings, 1 reply; 21+ messages in thread
From: Alexander Shishkin @ 2015-07-17 13:34 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme,
Alexander Shishkin

The Intel PT chapter in the new Intel Architecture SDM adds several packets,
corresponding enable bits and registers that control packet generation.
Additional bits in the Intel PT CPUID leaf were also added to enumerate the
presence and parameters of these new packets and features.

The packets and enables are:

* CYC: cycle-accurate mode, provides the number of cycles elapsed since the
previous CYC packet; its presence and available threshold values are
enumerated via CPUID;
* MTC: mini time counter packets, used for tracking TSC time between full
TSC packets; its presence and available resolution options are enumerated
via CPUID;
* PSB packet period is now configurable; available period values are
enumerated via CPUID.

This patch adds the corresponding bit and register definitions, pmu driver
capabilities based on CPUID enumeration, new attribute format bits for the
new features, and extends the event configuration validation function to
take these into account.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> --- arch/x86/include/asm/msr-index.h | 8 +++++ arch/x86/kernel/cpu/intel_pt.h | 6 ++++ arch/x86/kernel/cpu/perf_event_intel_pt.c | 56 ++++++++++++++++++++++++++++++- 3 files changed, 69 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 9ebc3d0093..c665d34f72 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -80,13 +80,21 @@ #define MSR_IA32_RTIT_CTL 0x00000570 #define RTIT_CTL_TRACEEN BIT(0) +#define RTIT_CTL_CYCLEACC BIT(1) #define RTIT_CTL_OS BIT(2) #define RTIT_CTL_USR BIT(3) #define RTIT_CTL_CR3EN BIT(7) #define RTIT_CTL_TOPA BIT(8) +#define RTIT_CTL_MTC_EN BIT(9) #define RTIT_CTL_TSC_EN BIT(10) #define RTIT_CTL_DISRETC BIT(11) #define RTIT_CTL_BRANCH_EN BIT(13) +#define RTIT_CTL_MTC_RANGE_OFFSET 14 +#define RTIT_CTL_MTC_RANGE (0x0full << RTIT_CTL_MTC_RANGE_OFFSET) +#define RTIT_CTL_CYC_THRESH_OFFSET 19 +#define RTIT_CTL_CYC_THRESH (0x0full << RTIT_CTL_CYC_THRESH_OFFSET) +#define RTIT_CTL_PSB_FREQ_OFFSET 24 +#define RTIT_CTL_PSB_FREQ (0x0full << RTIT_CTL_PSB_FREQ_OFFSET) #define MSR_IA32_RTIT_STATUS 0x00000571 #define RTIT_STATUS_CONTEXTEN BIT(1) #define RTIT_STATUS_TRIGGEREN BIT(2) diff --git a/arch/x86/kernel/cpu/intel_pt.h b/arch/x86/kernel/cpu/intel_pt.h index 1c338b0eba..feb293e965 100644 --- a/arch/x86/kernel/cpu/intel_pt.h +++ b/arch/x86/kernel/cpu/intel_pt.h @@ -72,9 +72,15 @@ struct topa_entry { enum pt_capabilities { PT_CAP_max_subleaf = 0, PT_CAP_cr3_filtering, + PT_CAP_psb_cyc, + PT_CAP_mtc, PT_CAP_topa_output, PT_CAP_topa_multiple_entries, + PT_CAP_single_range_output, PT_CAP_payloads_lip, + PT_CAP_mtc_periods, + PT_CAP_cycle_thresholds, + PT_CAP_psb_periods, }; struct pt_pmu { diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/kernel/cpu/perf_event_intel_pt.c index 183de71962..aacd3c9bed 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_pt.c +++ 
b/arch/x86/kernel/cpu/perf_event_intel_pt.c @@ -65,9 +65,15 @@ static struct pt_cap_desc { } pt_caps[] = { PT_CAP(max_subleaf, 0, CR_EAX, 0xffffffff), PT_CAP(cr3_filtering, 0, CR_EBX, BIT(0)), + PT_CAP(psb_cyc, 0, CR_EBX, BIT(1)), + PT_CAP(mtc, 0, CR_EBX, BIT(3)), PT_CAP(topa_output, 0, CR_ECX, BIT(0)), PT_CAP(topa_multiple_entries, 0, CR_ECX, BIT(1)), + PT_CAP(single_range_output, 0, CR_ECX, BIT(2)), PT_CAP(payloads_lip, 0, CR_ECX, BIT(31)), + PT_CAP(mtc_periods, 1, CR_EAX, 0xffff0000), + PT_CAP(cycle_thresholds, 1, CR_EBX, 0xffff), + PT_CAP(psb_periods, 1, CR_EBX, 0xffff0000), }; static u32 pt_cap_get(enum pt_capabilities cap) @@ -94,12 +100,22 @@ static struct attribute_group pt_cap_group = { .name = "caps", }; +PMU_FORMAT_ATTR(cyc, "config:1" ); +PMU_FORMAT_ATTR(mtc, "config:9" ); PMU_FORMAT_ATTR(tsc, "config:10" ); PMU_FORMAT_ATTR(noretcomp, "config:11" ); +PMU_FORMAT_ATTR(mtc_period, "config:14-17" ); +PMU_FORMAT_ATTR(cyc_thresh, "config:19-22" ); +PMU_FORMAT_ATTR(psb_period, "config:24-27" ); static struct attribute *pt_formats_attr[] = { + &format_attr_cyc.attr, + &format_attr_mtc.attr, &format_attr_tsc.attr, &format_attr_noretcomp.attr, + &format_attr_mtc_period.attr, + &format_attr_cyc_thresh.attr, + &format_attr_psb_period.attr, NULL, }; @@ -170,15 +186,53 @@ fail: return ret; } -#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | RTIT_CTL_DISRETC) +#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \ + RTIT_CTL_DISRETC | \ + RTIT_CTL_CYCLEACC | \ + RTIT_CTL_MTC_EN | \ + RTIT_CTL_MTC_RANGE | \ + RTIT_CTL_CYC_THRESH | \ + RTIT_CTL_PSB_FREQ) static bool pt_event_valid(struct perf_event *event) { u64 config = event->attr.config; + u64 allowed, requested; if ((config & PT_CONFIG_MASK) != config) return false; + if (config & + (RTIT_CTL_CYCLEACC | RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ)) { + if (!pt_cap_get(PT_CAP_psb_cyc)) + return false; + + allowed = pt_cap_get(PT_CAP_psb_periods); + requested = (config & RTIT_CTL_PSB_FREQ) >> + RTIT_CTL_PSB_FREQ_OFFSET; + if (requested && 
(!(allowed & BIT(requested)))) + return false; + + allowed = pt_cap_get(PT_CAP_cycle_thresholds); + requested = (config & RTIT_CTL_CYC_THRESH) >> + RTIT_CTL_CYC_THRESH_OFFSET; + if (requested && (!(allowed & BIT(requested)))) + return false; + } + + if (config & (RTIT_CTL_MTC_EN | RTIT_CTL_MTC_RANGE)) { + allowed = pt_cap_get(PT_CAP_mtc_periods); + + if (!allowed) + return false; + + requested = (config & RTIT_CTL_MTC_RANGE) >> + RTIT_CTL_MTC_RANGE_OFFSET; + + if (!(allowed & BIT(requested))) + return false; + } + return true; } -- 2.1.4 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel/pt: Add new timing packet enables 2015-07-17 13:34 ` [PATCH 1/3] perf/x86/intel/pt: Add new timing " Alexander Shishkin @ 2015-07-30 10:40 ` Peter Zijlstra 2015-07-30 11:57 ` Alexander Shishkin 0 siblings, 1 reply; 21+ messages in thread From: Peter Zijlstra @ 2015-07-30 10:40 UTC (permalink / raw) To: Alexander Shishkin Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme On Fri, Jul 17, 2015 at 04:34:08PM +0300, Alexander Shishkin wrote: > +#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \ > + RTIT_CTL_DISRETC | \ > + RTIT_CTL_CYCLEACC | \ > + RTIT_CTL_MTC_EN | \ > + RTIT_CTL_MTC_RANGE | \ > + RTIT_CTL_CYC_THRESH | \ > + RTIT_CTL_PSB_FREQ) > #define RTIT_CTL_CYC (RTIT_CTL_CYCLEACC | \ RTIT_CTL_CYC_THRESH | \ RTIT_CTL_PSB_FREQ) #define RTIT_CTL_MTC (RTIT_CTL_MTC_EN | \ RTIT_CTL_MTC_RANGE) #define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \ RTIT_CTL_DISRETC | \ RTIT_CTL_CYC | \ RTIT_CTL_MTC) > static bool pt_event_valid(struct perf_event *event) > { > u64 config = event->attr.config; > + u64 allowed, requested; > > if ((config & PT_CONFIG_MASK) != config) > return false; > > + if (config & > + (RTIT_CTL_CYCLEACC | RTIT_CTL_CYC_THRESH | RTIT_CTL_PSB_FREQ)) { if (config & RTIT_CTL_CYC) { > + if (!pt_cap_get(PT_CAP_psb_cyc)) > + return false; > + > + allowed = pt_cap_get(PT_CAP_psb_periods); > + requested = (config & RTIT_CTL_PSB_FREQ) >> > + RTIT_CTL_PSB_FREQ_OFFSET; > + if (requested && (!(allowed & BIT(requested)))) > + return false; > + > + allowed = pt_cap_get(PT_CAP_cycle_thresholds); > + requested = (config & RTIT_CTL_CYC_THRESH) >> > + RTIT_CTL_CYC_THRESH_OFFSET; > + if (requested && (!(allowed & BIT(requested)))) > + return false; > + } > + > + if (config & (RTIT_CTL_MTC_EN | RTIT_CTL_MTC_RANGE)) { if (config & RTIT_CTL_MTC) { > + allowed = pt_cap_get(PT_CAP_mtc_periods); > + > + if (!allowed) > + return false; > + > + requested = (config & RTIT_CTL_MTC_RANGE) >> > + RTIT_CTL_MTC_RANGE_OFFSET; > + > + if (!(allowed & 
BIT(requested))) > + return false; > + } > + > return true; > } Would that make sense? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel/pt: Add new timing packet enables 2015-07-30 10:40 ` Peter Zijlstra @ 2015-07-30 11:57 ` Alexander Shishkin 2015-07-30 12:14 ` Peter Zijlstra 0 siblings, 1 reply; 21+ messages in thread From: Alexander Shishkin @ 2015-07-30 11:57 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme Peter Zijlstra <peterz@infradead.org> writes: > On Fri, Jul 17, 2015 at 04:34:08PM +0300, Alexander Shishkin wrote: >> +#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \ >> + RTIT_CTL_DISRETC | \ >> + RTIT_CTL_CYCLEACC | \ >> + RTIT_CTL_MTC_EN | \ >> + RTIT_CTL_MTC_RANGE | \ >> + RTIT_CTL_CYC_THRESH | \ >> + RTIT_CTL_PSB_FREQ) >> > > #define RTIT_CTL_CYC (RTIT_CTL_CYCLEACC | \ > RTIT_CTL_CYC_THRESH | \ > RTIT_CTL_PSB_FREQ) PSB_FREQ is not, strictly speaking, related to cycle accurate mode. Both adjustable psb frequency and cycle accurate mode settings are enumerated with the same CPUID bit, but they really do different things unrelated to one another. >> + if (config & (RTIT_CTL_MTC_EN | RTIT_CTL_MTC_RANGE)) { > > if (config & RTIT_CTL_MTC) { > > Would that make sense? To me either way is fine. Want me to respin it? Regards, -- Alex ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 1/3] perf/x86/intel/pt: Add new timing packet enables 2015-07-30 11:57 ` Alexander Shishkin @ 2015-07-30 12:14 ` Peter Zijlstra 2015-07-30 13:15 ` [PATCH v1] " Alexander Shishkin 0 siblings, 1 reply; 21+ messages in thread From: Peter Zijlstra @ 2015-07-30 12:14 UTC (permalink / raw) To: Alexander Shishkin Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme On Thu, Jul 30, 2015 at 02:57:17PM +0300, Alexander Shishkin wrote: > Peter Zijlstra <peterz@infradead.org> writes: > > > On Fri, Jul 17, 2015 at 04:34:08PM +0300, Alexander Shishkin wrote: > >> +#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \ > >> + RTIT_CTL_DISRETC | \ > >> + RTIT_CTL_CYCLEACC | \ > >> + RTIT_CTL_MTC_EN | \ > >> + RTIT_CTL_MTC_RANGE | \ > >> + RTIT_CTL_CYC_THRESH | \ > >> + RTIT_CTL_PSB_FREQ) > >> > > > > #define RTIT_CTL_CYC (RTIT_CTL_CYCLEACC | \ > > RTIT_CTL_CYC_THRESH | \ > > RTIT_CTL_PSB_FREQ) > > PSB_FREQ is not, strictly speaking, related to cycle accurate mode. Both > adjustable psb frequency and cycle accurate mode settings are enumerated > with the same CPUID bit, but they really do different things unrelated > to one another. RTIT_CTL_CYC_PSB then, to match the CPUID bit name? > >> + if (config & (RTIT_CTL_MTC_EN | RTIT_CTL_MTC_RANGE)) { > > > > if (config & RTIT_CTL_MTC) { > > > > Would that make sense? > > To me either way is fine. Want me to respin it? Please. ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v1] perf/x86/intel/pt: Add new timing packet enables
2015-07-30 12:14 ` Peter Zijlstra
@ 2015-07-30 13:15 ` Alexander Shishkin
2015-08-04 8:55 ` [tip:perf/core] " tip-bot for Alexander Shishkin
0 siblings, 1 reply; 21+ messages in thread
From: Alexander Shishkin @ 2015-07-30 13:15 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme,
Alexander Shishkin

The Intel PT chapter in the new Intel Architecture SDM adds several packets,
corresponding enable bits and registers that control packet generation.
Additional bits in the Intel PT CPUID leaf were also added to enumerate the
presence and parameters of these new packets and features.

The packets and enables are:

* CYC: cycle-accurate mode, provides the number of cycles elapsed since the
previous CYC packet; its presence and available threshold values are
enumerated via CPUID;
* MTC: mini time counter packets, used for tracking TSC time between full
TSC packets; its presence and available resolution options are enumerated
via CPUID;
* PSB packet period is now configurable; available period values are
enumerated via CPUID.

This patch adds the corresponding bit and register definitions, pmu driver
capabilities based on CPUID enumeration, new attribute format bits for the
new features, and extends the event configuration validation function to
take these into account.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> --- arch/x86/include/asm/msr-index.h | 8 ++++ arch/x86/kernel/cpu/intel_pt.h | 6 +++ arch/x86/kernel/cpu/perf_event_intel_pt.c | 68 ++++++++++++++++++++++++++++++- 3 files changed, 81 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 9ebc3d0093..c665d34f72 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -80,13 +80,21 @@ #define MSR_IA32_RTIT_CTL 0x00000570 #define RTIT_CTL_TRACEEN BIT(0) +#define RTIT_CTL_CYCLEACC BIT(1) #define RTIT_CTL_OS BIT(2) #define RTIT_CTL_USR BIT(3) #define RTIT_CTL_CR3EN BIT(7) #define RTIT_CTL_TOPA BIT(8) +#define RTIT_CTL_MTC_EN BIT(9) #define RTIT_CTL_TSC_EN BIT(10) #define RTIT_CTL_DISRETC BIT(11) #define RTIT_CTL_BRANCH_EN BIT(13) +#define RTIT_CTL_MTC_RANGE_OFFSET 14 +#define RTIT_CTL_MTC_RANGE (0x0full << RTIT_CTL_MTC_RANGE_OFFSET) +#define RTIT_CTL_CYC_THRESH_OFFSET 19 +#define RTIT_CTL_CYC_THRESH (0x0full << RTIT_CTL_CYC_THRESH_OFFSET) +#define RTIT_CTL_PSB_FREQ_OFFSET 24 +#define RTIT_CTL_PSB_FREQ (0x0full << RTIT_CTL_PSB_FREQ_OFFSET) #define MSR_IA32_RTIT_STATUS 0x00000571 #define RTIT_STATUS_CONTEXTEN BIT(1) #define RTIT_STATUS_TRIGGEREN BIT(2) diff --git a/arch/x86/kernel/cpu/intel_pt.h b/arch/x86/kernel/cpu/intel_pt.h index 1c338b0eba..feb293e965 100644 --- a/arch/x86/kernel/cpu/intel_pt.h +++ b/arch/x86/kernel/cpu/intel_pt.h @@ -72,9 +72,15 @@ struct topa_entry { enum pt_capabilities { PT_CAP_max_subleaf = 0, PT_CAP_cr3_filtering, + PT_CAP_psb_cyc, + PT_CAP_mtc, PT_CAP_topa_output, PT_CAP_topa_multiple_entries, + PT_CAP_single_range_output, PT_CAP_payloads_lip, + PT_CAP_mtc_periods, + PT_CAP_cycle_thresholds, + PT_CAP_psb_periods, }; struct pt_pmu { diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/kernel/cpu/perf_event_intel_pt.c index 183de71962..695fd6c562 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_pt.c +++ 
b/arch/x86/kernel/cpu/perf_event_intel_pt.c @@ -65,9 +65,15 @@ static struct pt_cap_desc { } pt_caps[] = { PT_CAP(max_subleaf, 0, CR_EAX, 0xffffffff), PT_CAP(cr3_filtering, 0, CR_EBX, BIT(0)), + PT_CAP(psb_cyc, 0, CR_EBX, BIT(1)), + PT_CAP(mtc, 0, CR_EBX, BIT(3)), PT_CAP(topa_output, 0, CR_ECX, BIT(0)), PT_CAP(topa_multiple_entries, 0, CR_ECX, BIT(1)), + PT_CAP(single_range_output, 0, CR_ECX, BIT(2)), PT_CAP(payloads_lip, 0, CR_ECX, BIT(31)), + PT_CAP(mtc_periods, 1, CR_EAX, 0xffff0000), + PT_CAP(cycle_thresholds, 1, CR_EBX, 0xffff), + PT_CAP(psb_periods, 1, CR_EBX, 0xffff0000), }; static u32 pt_cap_get(enum pt_capabilities cap) @@ -94,12 +100,22 @@ static struct attribute_group pt_cap_group = { .name = "caps", }; +PMU_FORMAT_ATTR(cyc, "config:1" ); +PMU_FORMAT_ATTR(mtc, "config:9" ); PMU_FORMAT_ATTR(tsc, "config:10" ); PMU_FORMAT_ATTR(noretcomp, "config:11" ); +PMU_FORMAT_ATTR(mtc_period, "config:14-17" ); +PMU_FORMAT_ATTR(cyc_thresh, "config:19-22" ); +PMU_FORMAT_ATTR(psb_period, "config:24-27" ); static struct attribute *pt_formats_attr[] = { + &format_attr_cyc.attr, + &format_attr_mtc.attr, &format_attr_tsc.attr, &format_attr_noretcomp.attr, + &format_attr_mtc_period.attr, + &format_attr_cyc_thresh.attr, + &format_attr_psb_period.attr, NULL, }; @@ -170,15 +186,65 @@ fail: return ret; } -#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | RTIT_CTL_DISRETC) +#define RTIT_CTL_CYC_PSB (RTIT_CTL_CYCLEACC | \ + RTIT_CTL_CYC_THRESH | \ + RTIT_CTL_PSB_FREQ) + +#define RTIT_CTL_MTC (RTIT_CTL_MTC_EN | \ + RTIT_CTL_MTC_RANGE) + +#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \ + RTIT_CTL_DISRETC | \ + RTIT_CTL_CYC_PSB | \ + RTIT_CTL_MTC) static bool pt_event_valid(struct perf_event *event) { u64 config = event->attr.config; + u64 allowed, requested; if ((config & PT_CONFIG_MASK) != config) return false; + if (config & RTIT_CTL_CYC_PSB) { + if (!pt_cap_get(PT_CAP_psb_cyc)) + return false; + + allowed = pt_cap_get(PT_CAP_psb_periods); + requested = (config & RTIT_CTL_PSB_FREQ) >> + 
RTIT_CTL_PSB_FREQ_OFFSET; + if (requested && (!(allowed & BIT(requested)))) + return false; + + allowed = pt_cap_get(PT_CAP_cycle_thresholds); + requested = (config & RTIT_CTL_CYC_THRESH) >> + RTIT_CTL_CYC_THRESH_OFFSET; + if (requested && (!(allowed & BIT(requested)))) + return false; + } + + if (config & RTIT_CTL_MTC) { + /* + * In the unlikely case that CPUID lists valid mtc periods, + * but not the mtc capability, drop out here. + * + * Spec says that setting mtc period bits while mtc bit in + * CPUID is 0 will #GP, so better safe than sorry. + */ + if (!pt_cap_get(PT_CAP_mtc)) + return false; + + allowed = pt_cap_get(PT_CAP_mtc_periods); + if (!allowed) + return false; + + requested = (config & RTIT_CTL_MTC_RANGE) >> + RTIT_CTL_MTC_RANGE_OFFSET; + + if (!(allowed & BIT(requested))) + return false; + } + return true; } -- 2.4.6 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [tip:perf/core] perf/x86/intel/pt: Add new timing packet enables
2015-07-30 13:15 ` [PATCH v1] " Alexander Shishkin
@ 2015-08-04 8:55 ` tip-bot for Alexander Shishkin
0 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Alexander Shishkin @ 2015-08-04 8:55 UTC (permalink / raw)
To: linux-tip-commits
Cc: tglx, peterz, alexander.shishkin, torvalds, mingo, hpa, linux-kernel

Commit-ID: b1bf72d6691cc33fc7763fc8ec77df42ca1a8702
Gitweb: http://git.kernel.org/tip/b1bf72d6691cc33fc7763fc8ec77df42ca1a8702
Author: Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Thu, 30 Jul 2015 16:15:31 +0300
Committer: Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 4 Aug 2015 10:16:55 +0200

perf/x86/intel/pt: Add new timing packet enables

The Intel PT chapter in the new Intel Architecture SDM adds several packets,
corresponding enable bits and registers that control packet generation.
Additional bits in the Intel PT CPUID leaf were also added to enumerate the
presence and parameters of these new packets and features.

The packets and enables are:

* CYC: cycle-accurate mode, provides the number of cycles elapsed since the
previous CYC packet; its presence and available threshold values are
enumerated via CPUID;
* MTC: mini time counter packets, used for tracking TSC time between full
TSC packets; its presence and available resolution options are enumerated
via CPUID;
* PSB packet period is now configurable; available period values are
enumerated via CPUID.

This patch adds the corresponding bit and register definitions, pmu driver
capabilities based on CPUID enumeration, new attribute format bits for the
new features, and extends the event configuration validation function to
take these into account.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@infradead.org Cc: adrian.hunter@intel.com Cc: hpa@zytor.com Link: http://lkml.kernel.org/r/1438262131-12725-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- arch/x86/include/asm/msr-index.h | 8 ++++ arch/x86/kernel/cpu/intel_pt.h | 6 +++ arch/x86/kernel/cpu/perf_event_intel_pt.c | 68 ++++++++++++++++++++++++++++++- 3 files changed, 81 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 9ebc3d0..c665d34 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -80,13 +80,21 @@ #define MSR_IA32_RTIT_CTL 0x00000570 #define RTIT_CTL_TRACEEN BIT(0) +#define RTIT_CTL_CYCLEACC BIT(1) #define RTIT_CTL_OS BIT(2) #define RTIT_CTL_USR BIT(3) #define RTIT_CTL_CR3EN BIT(7) #define RTIT_CTL_TOPA BIT(8) +#define RTIT_CTL_MTC_EN BIT(9) #define RTIT_CTL_TSC_EN BIT(10) #define RTIT_CTL_DISRETC BIT(11) #define RTIT_CTL_BRANCH_EN BIT(13) +#define RTIT_CTL_MTC_RANGE_OFFSET 14 +#define RTIT_CTL_MTC_RANGE (0x0full << RTIT_CTL_MTC_RANGE_OFFSET) +#define RTIT_CTL_CYC_THRESH_OFFSET 19 +#define RTIT_CTL_CYC_THRESH (0x0full << RTIT_CTL_CYC_THRESH_OFFSET) +#define RTIT_CTL_PSB_FREQ_OFFSET 24 +#define RTIT_CTL_PSB_FREQ (0x0full << RTIT_CTL_PSB_FREQ_OFFSET) #define MSR_IA32_RTIT_STATUS 0x00000571 #define RTIT_STATUS_CONTEXTEN BIT(1) #define RTIT_STATUS_TRIGGEREN BIT(2) diff --git a/arch/x86/kernel/cpu/intel_pt.h b/arch/x86/kernel/cpu/intel_pt.h index 1c338b0..feb293e 100644 --- a/arch/x86/kernel/cpu/intel_pt.h +++ b/arch/x86/kernel/cpu/intel_pt.h @@ -72,9 +72,15 @@ struct topa_entry { enum pt_capabilities { PT_CAP_max_subleaf = 0, PT_CAP_cr3_filtering, + PT_CAP_psb_cyc, + 
PT_CAP_mtc, PT_CAP_topa_output, PT_CAP_topa_multiple_entries, + PT_CAP_single_range_output, PT_CAP_payloads_lip, + PT_CAP_mtc_periods, + PT_CAP_cycle_thresholds, + PT_CAP_psb_periods, }; struct pt_pmu { diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/kernel/cpu/perf_event_intel_pt.c index cc58ef8..e20cfac 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_pt.c +++ b/arch/x86/kernel/cpu/perf_event_intel_pt.c @@ -65,9 +65,15 @@ static struct pt_cap_desc { } pt_caps[] = { PT_CAP(max_subleaf, 0, CR_EAX, 0xffffffff), PT_CAP(cr3_filtering, 0, CR_EBX, BIT(0)), + PT_CAP(psb_cyc, 0, CR_EBX, BIT(1)), + PT_CAP(mtc, 0, CR_EBX, BIT(3)), PT_CAP(topa_output, 0, CR_ECX, BIT(0)), PT_CAP(topa_multiple_entries, 0, CR_ECX, BIT(1)), + PT_CAP(single_range_output, 0, CR_ECX, BIT(2)), PT_CAP(payloads_lip, 0, CR_ECX, BIT(31)), + PT_CAP(mtc_periods, 1, CR_EAX, 0xffff0000), + PT_CAP(cycle_thresholds, 1, CR_EBX, 0xffff), + PT_CAP(psb_periods, 1, CR_EBX, 0xffff0000), }; static u32 pt_cap_get(enum pt_capabilities cap) @@ -94,12 +100,22 @@ static struct attribute_group pt_cap_group = { .name = "caps", }; +PMU_FORMAT_ATTR(cyc, "config:1" ); +PMU_FORMAT_ATTR(mtc, "config:9" ); PMU_FORMAT_ATTR(tsc, "config:10" ); PMU_FORMAT_ATTR(noretcomp, "config:11" ); +PMU_FORMAT_ATTR(mtc_period, "config:14-17" ); +PMU_FORMAT_ATTR(cyc_thresh, "config:19-22" ); +PMU_FORMAT_ATTR(psb_period, "config:24-27" ); static struct attribute *pt_formats_attr[] = { + &format_attr_cyc.attr, + &format_attr_mtc.attr, &format_attr_tsc.attr, &format_attr_noretcomp.attr, + &format_attr_mtc_period.attr, + &format_attr_cyc_thresh.attr, + &format_attr_psb_period.attr, NULL, }; @@ -170,15 +186,65 @@ fail: return ret; } -#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | RTIT_CTL_DISRETC) +#define RTIT_CTL_CYC_PSB (RTIT_CTL_CYCLEACC | \ + RTIT_CTL_CYC_THRESH | \ + RTIT_CTL_PSB_FREQ) + +#define RTIT_CTL_MTC (RTIT_CTL_MTC_EN | \ + RTIT_CTL_MTC_RANGE) + +#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \ + RTIT_CTL_DISRETC | \ + 
RTIT_CTL_CYC_PSB | \ + RTIT_CTL_MTC) static bool pt_event_valid(struct perf_event *event) { u64 config = event->attr.config; + u64 allowed, requested; if ((config & PT_CONFIG_MASK) != config) return false; + if (config & RTIT_CTL_CYC_PSB) { + if (!pt_cap_get(PT_CAP_psb_cyc)) + return false; + + allowed = pt_cap_get(PT_CAP_psb_periods); + requested = (config & RTIT_CTL_PSB_FREQ) >> + RTIT_CTL_PSB_FREQ_OFFSET; + if (requested && (!(allowed & BIT(requested)))) + return false; + + allowed = pt_cap_get(PT_CAP_cycle_thresholds); + requested = (config & RTIT_CTL_CYC_THRESH) >> + RTIT_CTL_CYC_THRESH_OFFSET; + if (requested && (!(allowed & BIT(requested)))) + return false; + } + + if (config & RTIT_CTL_MTC) { + /* + * In the unlikely case that CPUID lists valid mtc periods, + * but not the mtc capability, drop out here. + * + * Spec says that setting mtc period bits while mtc bit in + * CPUID is 0 will #GP, so better safe than sorry. + */ + if (!pt_cap_get(PT_CAP_mtc)) + return false; + + allowed = pt_cap_get(PT_CAP_mtc_periods); + if (!allowed) + return false; + + requested = (config & RTIT_CTL_MTC_RANGE) >> + RTIT_CTL_MTC_RANGE_OFFSET; + + if (!(allowed & BIT(requested))) + return false; + } + return true; } ^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in
2015-07-17 13:34 [PATCH 0/3] perf/x86/intel/pt: Add new packet enables Alexander Shishkin
2015-07-17 13:34 ` [PATCH 1/3] perf/x86/intel/pt: Add new timing " Alexander Shishkin
@ 2015-07-17 13:34 ` Alexander Shishkin
2015-07-30 10:43 ` Peter Zijlstra
2015-07-17 13:34 ` [PATCH 3/3] perf/x86/intel/bts: Set itrace_started in pmu::start to match the new logic Alexander Shishkin
2 siblings, 1 reply; 21+ messages in thread
From: Alexander Shishkin @ 2015-07-17 13:34 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme,
Alexander Shishkin

Currently, the PT driver zeroes out the status register every time before
starting the event. However, all the writable bits are already taken care
of in the pt_handle_status() function, except the new PacketByteCnt field,
which in new versions of PT contains the number of packet bytes written
since the last sync (PSB) packet. Zeroing it out before enabling PT forces
a sync packet to be written. This means that, with the existing code, a
sync packet (PSB and PSBEND, 18 bytes in total) will be generated every
time a PT event is scheduled in.

To avoid these unnecessary syncs and save a WRMSR in the fast path, this
patch adds a new attribute config bit "no_force_psb", which will disable
this zeroing WRMSR.

One exception is that we do need to force PSB like this when tracing
starts, so that the decoder has a clear sync point in the trace. For this
purpose we already have the hw::itrace_started flag, which we are currently
using to output PERF_RECORD_ITRACE_START. This patch moves the setting of
itrace_started from the perf core to pmu::start, where it should still be 0
on the very first run.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> --- arch/x86/kernel/cpu/perf_event.h | 1 + arch/x86/kernel/cpu/perf_event_intel_pt.c | 22 ++++++++++++++++++++-- kernel/events/core.c | 2 -- 3 files changed, 21 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h index 3474cf26b6..d7aa6c5630 100644 --- a/arch/x86/kernel/cpu/perf_event.h +++ b/arch/x86/kernel/cpu/perf_event.h @@ -77,6 +77,7 @@ struct event_constraint { #define PERF_X86_EVENT_EXCL_ACCT 0x0200 /* accounted EXCL event */ #define PERF_X86_EVENT_AUTO_RELOAD 0x0400 /* use PEBS auto-reload */ #define PERF_X86_EVENT_FREERUNNING 0x0800 /* use freerunning PEBS */ +#define PERF_X86_EVENT_PT_NO_FORCE_PSB 0x1000 /* don't force PSB+ packets */ struct amd_nb { diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/kernel/cpu/perf_event_intel_pt.c index aacd3c9bed..6c40455cd1 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_pt.c +++ b/arch/x86/kernel/cpu/perf_event_intel_pt.c @@ -100,6 +100,7 @@ static struct attribute_group pt_cap_group = { .name = "caps", }; +PMU_FORMAT_ATTR(no_force_psb, "config:0" ); PMU_FORMAT_ATTR(cyc, "config:1" ); PMU_FORMAT_ATTR(mtc, "config:9" ); PMU_FORMAT_ATTR(tsc, "config:10" ); @@ -109,6 +110,7 @@ PMU_FORMAT_ATTR(cyc_thresh, "config:19-22" ); PMU_FORMAT_ATTR(psb_period, "config:24-27" ); static struct attribute *pt_formats_attr[] = { + &format_attr_no_force_psb.attr, &format_attr_cyc.attr, &format_attr_mtc.attr, &format_attr_tsc.attr, @@ -199,6 +201,18 @@ static bool pt_event_valid(struct perf_event *event) u64 config = event->attr.config; u64 allowed, requested; + /* + * Re-use one of the config bits to enable resetting packet byte + * count that will result in outputting PSB+ at every event enable. + * Since we don't allow setting bit 0 (TraceEn) directly from + * attr::config, we can re-use it for our purpose. 
+ */ + if (config & BIT(0)) { + event->hw.flags |= PERF_X86_EVENT_PT_NO_FORCE_PSB; + event->attr.config &= ~BIT(0); + config = event->attr.config; + } + if ((config & PT_CONFIG_MASK) != config) return false; @@ -245,6 +259,12 @@ static void pt_config(struct perf_event *event) { u64 reg; + if (!(event->hw.flags & PERF_X86_EVENT_PT_NO_FORCE_PSB) || + !(event->hw.itrace_started)) { + event->hw.itrace_started = 1; + wrmsrl(MSR_IA32_RTIT_STATUS, 0); + } + reg = RTIT_CTL_TOPA | RTIT_CTL_BRANCH_EN | RTIT_CTL_TRACEEN; if (!event->attr.exclude_kernel) @@ -964,7 +984,6 @@ void intel_pt_interrupt(void) pt_config_buffer(buf->cur->table, buf->cur_idx, buf->output_off); - wrmsrl(MSR_IA32_RTIT_STATUS, 0); pt_config(event); } } @@ -988,7 +1007,6 @@ static void pt_event_start(struct perf_event *event, int mode) pt_config_buffer(buf->cur->table, buf->cur_idx, buf->output_off); - wrmsrl(MSR_IA32_RTIT_STATUS, 0); pt_config(event); } diff --git a/kernel/events/core.c b/kernel/events/core.c index d3dae3419b..15da13237a 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6040,8 +6040,6 @@ static void perf_log_itrace_start(struct perf_event *event) event->hw.itrace_started) return; - event->hw.itrace_started = 1; - rec.header.type = PERF_RECORD_ITRACE_START; rec.header.misc = 0; rec.header.size = sizeof(rec); -- 2.1.4 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in 2015-07-17 13:34 ` [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in Alexander Shishkin @ 2015-07-30 10:43 ` Peter Zijlstra 2015-07-30 11:53 ` Alexander Shishkin 0 siblings, 1 reply; 21+ messages in thread From: Peter Zijlstra @ 2015-07-30 10:43 UTC (permalink / raw) To: Alexander Shishkin Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme On Fri, Jul 17, 2015 at 04:34:09PM +0300, Alexander Shishkin wrote: > Currently, the PT driver zeroes out the status register every time before > starting the event. However, all the writable bits are already taken care > of in pt_handle_status() function, except the new PacketByteCnt field, > which in new versions of PT contains the number of packet bytes written > since the last sync (PSB) packet. Zeroing it out before enabling PT forces > a sync packet to be written. This means that, with the existing code, a > sync packet (PSB and PSBEND, 18 bytes in total) will be generated every > time a PT event is scheduled in. > > To avoid these unnecessary syncs and save a WRMSR in the fast path, this > patch adds a new attribute config bit "no_force_psb", which will disable > this zeroing WRMSR. Why is this exposed? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in

From: Alexander Shishkin @ 2015-07-30 11:53 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme

Peter Zijlstra <peterz@infradead.org> writes:

> On Fri, Jul 17, 2015 at 04:34:09PM +0300, Alexander Shishkin wrote:
>> Currently, the PT driver zeroes out the status register every time before
>> starting the event. However, all the writable bits are already taken care
>> of in pt_handle_status() function, except the new PacketByteCnt field,
>> which in new versions of PT contains the number of packet bytes written
>> since the last sync (PSB) packet. Zeroing it out before enabling PT forces
>> a sync packet to be written. This means that, with the existing code, a
>> sync packet (PSB and PSBEND, 18 bytes in total) will be generated every
>> time a PT event is scheduled in.
>>
>> To avoid these unnecessary syncs and save a WRMSR in the fast path, this
>> patch adds a new attribute config bit "no_force_psb", which will disable
>> this zeroing WRMSR.
>
> Why is this exposed?

The default behavior, which we have in the kernel now, is to always
force the PSB, so if we change it, we need to make sure nobody is
relying on it being the default behavior. Granted, for the hardware
that's on the market right now it won't make a difference (it generates
PSB+ at least every 256 bytes anyway), but in the future, if we change
the default, newer hardware will produce different results with 4.2 than
with newer kernels, and this would be a tad sloppy.

The tools, however, can decide to set no_force_psb=1 when no_force_psb
is available (sysfs caps), and otherwise the default behavior will be
the same as no_force_psb==0. Makes sense?

Regards,
--
Alex
* Re: [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in

From: Peter Zijlstra @ 2015-07-30 12:13 UTC
To: Alexander Shishkin
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme

On Thu, Jul 30, 2015 at 02:53:51PM +0300, Alexander Shishkin wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
>
> > On Fri, Jul 17, 2015 at 04:34:09PM +0300, Alexander Shishkin wrote:
> >> Currently, the PT driver zeroes out the status register every time before
> >> starting the event. However, all the writable bits are already taken care
> >> of in pt_handle_status() function, except the new PacketByteCnt field,
> >> which in new versions of PT contains the number of packet bytes written
> >> since the last sync (PSB) packet. Zeroing it out before enabling PT forces
> >> a sync packet to be written. This means that, with the existing code, a
> >> sync packet (PSB and PSBEND, 18 bytes in total) will be generated every
> >> time a PT event is scheduled in.
> >>
> >> To avoid these unnecessary syncs and save a WRMSR in the fast path, this
> >> patch adds a new attribute config bit "no_force_psb", which will disable
> >> this zeroing WRMSR.
> >
> > Why is this exposed?
>
> The default behavior, which we have in the kernel now, is to always
> force the PSB, so if we change it, we need to make sure nobody is
> relying on it being the default behavior.

Seeing how there are hardly any tools out there for this -- and it's
only recently been introduced to the kernel, I think we can change this
behaviour.

> Granted, for the hardware
> that's on the market right now it won't make a difference (it generates
> PSB+ at least every 256 bytes anyway), but in the future, if we change
> the default, newer hardware will produce different results with 4.2 than
> with newer kernels, and this would be a tad sloppy.

Even more reason not to expose this. By the time there's hardware out
there that supports this, 4.2 will be an ancient kernel :-)

> The tools, however, can decide to set no_force_psb=1 when no_force_psb
> is available (sysfs caps), and otherwise the default behavior will be
> the same as no_force_psb==0. Makes sense?

I'd rather not expose this. It doesn't make a difference to the actual
trace data other than it being bigger. Any decoder must be able to
decode both versions. And you cannot really tell anything from there
being too many PSB frames.
* Re: [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in

From: Alexander Shishkin @ 2015-07-30 12:49 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme

Peter Zijlstra <peterz@infradead.org> writes:

> On Thu, Jul 30, 2015 at 02:53:51PM +0300, Alexander Shishkin wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>>
>> > On Fri, Jul 17, 2015 at 04:34:09PM +0300, Alexander Shishkin wrote:
>> >> Currently, the PT driver zeroes out the status register every time before
>> >> starting the event. However, all the writable bits are already taken care
>> >> of in pt_handle_status() function, except the new PacketByteCnt field,
>> >> which in new versions of PT contains the number of packet bytes written
>> >> since the last sync (PSB) packet. Zeroing it out before enabling PT forces
>> >> a sync packet to be written. This means that, with the existing code, a
>> >> sync packet (PSB and PSBEND, 18 bytes in total) will be generated every
>> >> time a PT event is scheduled in.
>> >>
>> >> To avoid these unnecessary syncs and save a WRMSR in the fast path, this
>> >> patch adds a new attribute config bit "no_force_psb", which will disable
>> >> this zeroing WRMSR.
>> >
>> > Why is this exposed?
>>
>> The default behavior, which we have in the kernel now, is to always
>> force the PSB, so if we change it, we need to make sure nobody is
>> relying on it being the default behavior.
>
> Seeing how there are hardly any tools out there for this -- and it's
> only recently been introduced to the kernel, I think we can change this
> behaviour.

Well, aside from the perf tool itself (I'm not sure what's been
happening to the PT support there), there's gdb support, which was
merged into the gdb mainline some time ago and makes use of our perf
interface. I don't know, however, whether it relies on this or not.

>> Granted, for the hardware
>> that's on the market right now it won't make a difference (it generates
>> PSB+ at least every 256 bytes anyway), but in the future, if we change
>> the default, newer hardware will produce different results with 4.2 than
>> with newer kernels, and this would be a tad sloppy.
>
> Even more reason not to expose this. By the time there's hardware out
> there that supports this, 4.2 will be an ancient kernel :-)
>
>> The tools, however, can decide to set no_force_psb=1 when no_force_psb
>> is available (sysfs caps), and otherwise the default behavior will be
>> the same as no_force_psb==0. Makes sense?
>
> I'd rather not expose this. It doesn't make a difference to the actual
> trace data other than it being bigger. Any decoder must be able to
> decode both versions. And you cannot really tell anything from there
> being too many PSB frames.

It kinda does make a difference: inside PSB+ you get a TSC packet, which
carries the TSC timestamp as a payload, so, with the current behavior,
since you'll get one of those at every event schedule-in, you'll have
many more time synchronization points with the perf stream; but when the
default changes, suddenly your timestamps are way further apart and the
decoder needs to put in additional guesswork to correlate those MMAP
events in the perf stream with the PT trace.

So normally, instead of forcing the PSB+, one would enable MTC (mini
time counter) packets, which would give better time reference points. If
one knows to do this, one doesn't need extra PSBs; if one doesn't, but
stuff sort of works for them because of the frequent PSBs, they will
break after the default changes.

The decoders, ideally, should eventually learn to deal with both cases,
so in the long run it shouldn't matter at all. I just typed all that in
to clarify exactly what the difference in trace data would mean.

Regards,
--
Alex
* Re: [PATCH 2/3] perf/x86/intel/pt: Add an option to not force PSB+ on every schedule-in

From: Alexander Shishkin @ 2015-07-30 13:36 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme

Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:

> It kinda does make a difference: inside PSB+ you get a TSC packet, which
> carries the TSC timestamp as a payload, so, with the current behavior,
> since you'll get one of those at every event schedule-in, you'll have
> many more time synchronization points with the perf stream; but when the
> default changes, suddenly your timestamps are way further apart and the
> decoder needs to put in additional guesswork to correlate those MMAP
> events in the perf stream with the PT trace.

I spoke hastily: TSC packets will be generated on every schedule-in (as
well as other packets that normally come in PSB+) regardless of the
actual PSB, so scratch the above. Let me replace the above with a
smaller patch.

Regards,
--
Alex
* [PATCH v1] perf/x86/intel/pt: Do not force sync packets on every schedule-in

From: Alexander Shishkin @ 2015-07-30 13:48 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme,
    Alexander Shishkin

Currently, the PT driver zeroes out the status register every time before
starting the event. However, all the writable bits are already taken care
of in the pt_handle_status() function, except the new PacketByteCnt
field, which in new versions of PT contains the number of packet bytes
written since the last sync (PSB) packet. Zeroing it out before enabling
PT forces a sync packet to be written. This means that, with the existing
code, a sync packet (PSB and PSBEND, 18 bytes in total) will be generated
every time a PT event is scheduled in.

To avoid these unnecessary syncs and save a WRMSR in the fast path, this
patch changes the default behavior to not clear the PacketByteCnt field,
so that the sync packets will be generated with the period specified by
the "psb_period" attribute config field. This has little impact on the
trace data, as the other packets that are normally sent within PSB+
(between PSB and PSBEND) have their own generation scenarios which do not
depend on the sync packets.

One exception is when tracing starts, where we do need to force PSB like
this so that the decoder has a clear sync point in the trace. For this
purpose we already have the hw::itrace_started flag, which we are
currently using to output PERF_RECORD_ITRACE_START. This patch moves the
setting of itrace_started from the perf core to pmu::start, where it
should still be 0 on the very first run.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

perf/x86/intel/pt: fixup with no force psb

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_pt.c | 7 +++++--
 kernel/events/core.c                      | 2 --
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/kernel/cpu/perf_event_intel_pt.c
index 695fd6c562..e20cfacb5a 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_pt.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_pt.c
@@ -257,6 +257,11 @@ static void pt_config(struct perf_event *event)
 {
 	u64 reg;
 
+	if (!event->hw.itrace_started) {
+		event->hw.itrace_started = 1;
+		wrmsrl(MSR_IA32_RTIT_STATUS, 0);
+	}
+
 	reg = RTIT_CTL_TOPA | RTIT_CTL_BRANCH_EN | RTIT_CTL_TRACEEN;
 
 	if (!event->attr.exclude_kernel)
@@ -976,7 +981,6 @@ void intel_pt_interrupt(void)
 
 		pt_config_buffer(buf->cur->table, buf->cur_idx,
 				 buf->output_off);
-		wrmsrl(MSR_IA32_RTIT_STATUS, 0);
 		pt_config(event);
 	}
 }
@@ -1000,7 +1004,6 @@ static void pt_event_start(struct perf_event *event, int mode)
 
 	pt_config_buffer(buf->cur->table, buf->cur_idx,
 			 buf->output_off);
-	wrmsrl(MSR_IA32_RTIT_STATUS, 0);
 	pt_config(event);
 }
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index d3dae3419b..15da13237a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6040,8 +6040,6 @@ static void perf_log_itrace_start(struct perf_event *event)
 	    event->hw.itrace_started)
 		return;
 
-	event->hw.itrace_started = 1;
-
 	rec.header.type = PERF_RECORD_ITRACE_START;
 	rec.header.misc = 0;
 	rec.header.size = sizeof(rec);
--
2.4.6
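[Editor's note] The "clear sync point" this commit message asks for is something a decoder finds by scanning: per the Intel PT packet definitions, PSB is a fixed 16-byte pattern (0x02 0x82 repeated eight times) and PSBEND is the two-byte 0x02 0x23, which together account for the 18 bytes quoted above. A minimal sketch of that sync scan (the sample trace bytes are made up for illustration):

```python
PSB = bytes([0x02, 0x82] * 8)   # 16-byte sync packet, per the PT packet spec
PSBEND = bytes([0x02, 0x23])    # 2-byte packet closing the PSB+ group

def find_sync(trace: bytes, start: int = 0) -> int:
    """Return the offset of the next PSB in the trace, or -1 if none.

    A decoder dropped into the middle of a trace buffer can only begin
    decoding at such an offset; bytes before it are undecodable, which
    is why tracing must start with a forced PSB.
    """
    return trace.find(PSB, start)

# Example: a trace fragment with undecodable junk before the sync point.
trace = b"\xde\xad\xbe\xef" + PSB + PSBEND
offset = find_sync(trace)  # -> 4, the start of the PSB
```

With the patched behavior, later PSBs arrive only at the hardware's configured psb_period rather than at every schedule-in, so a mid-trace decoder may have to scan further before it can resynchronize.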
* Re: [PATCH v1] perf/x86/intel/pt: Do not force sync packets on every schedule-in

From: Alexander Shishkin @ 2015-07-30 15:24 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme

Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:

> Currently, the PT driver zeroes out the status register every time before
> starting the event. However, all the writable bits are already taken care
> of in the pt_handle_status() function, except the new PacketByteCnt
> field, which in new versions of PT contains the number of packet bytes
> written since the last sync (PSB) packet. Zeroing it out before enabling
> PT forces a sync packet to be written. This means that, with the existing
> code, a sync packet (PSB and PSBEND, 18 bytes in total) will be generated
> every time a PT event is scheduled in.
>
> To avoid these unnecessary syncs and save a WRMSR in the fast path, this
> patch changes the default behavior to not clear the PacketByteCnt field,
> so that the sync packets will be generated with the period specified by
> the "psb_period" attribute config field. This has little impact on the
> trace data, as the other packets that are normally sent within PSB+
> (between PSB and PSBEND) have their own generation scenarios which do not
> depend on the sync packets.
>
> One exception is when tracing starts, where we do need to force PSB like
> this so that the decoder has a clear sync point in the trace. For this
> purpose we already have the hw::itrace_started flag, which we are
> currently using to output PERF_RECORD_ITRACE_START. This patch moves the
> setting of itrace_started from the perf core to pmu::start, where it
> should still be 0 on the very first run.
>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>
> perf/x86/intel/pt: fixup with no force psb
>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

Argh, it is not my day today.

Regards,
--
Alex
* [tip:perf/core] perf/x86/intel/pt: Do not force sync packets on every schedule-in

From: tip-bot for Alexander Shishkin @ 2015-08-04  8:55 UTC
To: linux-tip-commits
Cc: mingo, tglx, torvalds, peterz, alexander.shishkin, hpa, linux-kernel

Commit-ID:  9a6694cfa2390181dec936a17c0d9d21ef7b08d9
Gitweb:     http://git.kernel.org/tip/9a6694cfa2390181dec936a17c0d9d21ef7b08d9
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Thu, 30 Jul 2015 16:48:24 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 4 Aug 2015 10:16:55 +0200

perf/x86/intel/pt: Do not force sync packets on every schedule-in

Currently, the PT driver zeroes out the status register every time before
starting the event. However, all the writable bits are already taken care
of in the pt_handle_status() function, except the new PacketByteCnt
field, which in new versions of PT contains the number of packet bytes
written since the last sync (PSB) packet. Zeroing it out before enabling
PT forces a sync packet to be written. This means that, with the existing
code, a sync packet (PSB and PSBEND, 18 bytes in total) will be generated
every time a PT event is scheduled in.

To avoid these unnecessary syncs and save a WRMSR in the fast path, this
patch changes the default behavior to not clear the PacketByteCnt field,
so that the sync packets will be generated with the period specified by
the "psb_period" attribute config field. This has little impact on the
trace data, as the other packets that are normally sent within PSB+
(between PSB and PSBEND) have their own generation scenarios which do not
depend on the sync packets.

One exception is when tracing starts, where we do need to force PSB like
this so that the decoder has a clear sync point in the trace. For this
purpose we already have the hw::itrace_started flag, which we are
currently using to output PERF_RECORD_ITRACE_START. This patch moves the
setting of itrace_started from the perf core to pmu::start, where it
should still be 0 on the very first run.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1438264104-16189-1-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/perf_event_intel_pt.c | 7 +++++--
 kernel/events/core.c                      | 2 --
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/kernel/cpu/perf_event_intel_pt.c
index 183de71..cc58ef8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_pt.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_pt.c
@@ -191,6 +191,11 @@ static void pt_config(struct perf_event *event)
 {
 	u64 reg;
 
+	if (!event->hw.itrace_started) {
+		event->hw.itrace_started = 1;
+		wrmsrl(MSR_IA32_RTIT_STATUS, 0);
+	}
+
 	reg = RTIT_CTL_TOPA | RTIT_CTL_BRANCH_EN | RTIT_CTL_TRACEEN;
 
 	if (!event->attr.exclude_kernel)
@@ -910,7 +915,6 @@ void intel_pt_interrupt(void)
 
 		pt_config_buffer(buf->cur->table, buf->cur_idx,
 				 buf->output_off);
-		wrmsrl(MSR_IA32_RTIT_STATUS, 0);
 		pt_config(event);
 	}
 }
@@ -934,7 +938,6 @@ static void pt_event_start(struct perf_event *event, int mode)
 
 	pt_config_buffer(buf->cur->table, buf->cur_idx,
 			 buf->output_off);
-	wrmsrl(MSR_IA32_RTIT_STATUS, 0);
 	pt_config(event);
 }
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a9796c8..bdea129 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6139,8 +6139,6 @@ static void perf_log_itrace_start(struct perf_event *event)
 	    event->hw.itrace_started)
 		return;
 
-	event->hw.itrace_started = 1;
-
 	rec.header.type = PERF_RECORD_ITRACE_START;
 	rec.header.misc = 0;
 	rec.header.size = sizeof(rec);
* [PATCH v2] perf/x86/intel/pt: Do not force sync packets on every schedule-in

From: Alexander Shishkin @ 2015-07-31 11:47 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme,
    Alexander Shishkin

Currently, the PT driver zeroes out the status register every time before
starting the event. However, all the writable bits are already taken care
of in the pt_handle_status() function, except the new PacketByteCnt
field, which in new versions of PT contains the number of packet bytes
written since the last sync (PSB) packet. Zeroing it out before enabling
PT forces a sync packet to be written. This means that, with the existing
code, a sync packet (PSB and PSBEND, 18 bytes in total) will be generated
every time a PT event is scheduled in.

To avoid these unnecessary syncs and save a WRMSR in the fast path, this
patch changes the default behavior to not clear the PacketByteCnt field,
so that the sync packets will be generated with the period specified by
the "psb_period" attribute config field. This has little impact on the
trace data, as the other packets that are normally sent within PSB+
(between PSB and PSBEND) have their own generation scenarios which do not
depend on the sync packets.

One exception is when tracing starts, where we do need to force PSB like
this so that the decoder has a clear sync point in the trace. For this
purpose we already have the hw::itrace_started flag, which we are
currently using to output PERF_RECORD_ITRACE_START. This patch moves the
setting of itrace_started from the perf core to pmu::start, where it
should still be 0 on the very first run.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_pt.c | 7 +++++--
 kernel/events/core.c                      | 2 --
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_pt.c b/arch/x86/kernel/cpu/perf_event_intel_pt.c
index 695fd6c562..e20cfacb5a 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_pt.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_pt.c
@@ -257,6 +257,11 @@ static void pt_config(struct perf_event *event)
 {
 	u64 reg;
 
+	if (!event->hw.itrace_started) {
+		event->hw.itrace_started = 1;
+		wrmsrl(MSR_IA32_RTIT_STATUS, 0);
+	}
+
 	reg = RTIT_CTL_TOPA | RTIT_CTL_BRANCH_EN | RTIT_CTL_TRACEEN;
 
 	if (!event->attr.exclude_kernel)
@@ -976,7 +981,6 @@ void intel_pt_interrupt(void)
 
 		pt_config_buffer(buf->cur->table, buf->cur_idx,
 				 buf->output_off);
-		wrmsrl(MSR_IA32_RTIT_STATUS, 0);
 		pt_config(event);
 	}
 }
@@ -1000,7 +1004,6 @@ static void pt_event_start(struct perf_event *event, int mode)
 
 	pt_config_buffer(buf->cur->table, buf->cur_idx,
 			 buf->output_off);
-	wrmsrl(MSR_IA32_RTIT_STATUS, 0);
 	pt_config(event);
 }
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index d3dae3419b..15da13237a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6040,8 +6040,6 @@ static void perf_log_itrace_start(struct perf_event *event)
 	    event->hw.itrace_started)
 		return;
 
-	event->hw.itrace_started = 1;
-
 	rec.header.type = PERF_RECORD_ITRACE_START;
 	rec.header.misc = 0;
 	rec.header.size = sizeof(rec);
--
2.4.6
* [PATCH 3/3] perf/x86/intel/bts: Set itrace_started in pmu::start to match the new logic

From: Alexander Shishkin @ 2015-07-17 13:34 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme,
    Alexander Shishkin

Since hw::itrace_started is now set in pmu::start to signal the beginning
of the trace, do so also in the intel_bts driver.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
---
 arch/x86/kernel/cpu/perf_event_intel_bts.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_bts.c b/arch/x86/kernel/cpu/perf_event_intel_bts.c
index 54690e8857..d1c0f254af 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_bts.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_bts.c
@@ -222,6 +222,7 @@ static void __bts_event_start(struct perf_event *event)
 	if (!buf || bts_buffer_is_full(buf, bts))
 		return;
 
+	event->hw.itrace_started = 1;
 	event->hw.state = 0;
 
 	if (!buf->snapshot)
--
2.1.4
* Re: [PATCH 3/3] perf/x86/intel/bts: Set itrace_started in pmu::start to match the new logic

From: Alexander Shishkin @ 2015-09-08 12:02 UTC
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme

Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:

> Since hw::itrace_started is now set in pmu::start to signal the beginning
> of the trace, do so also in the intel_bts driver.
>
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>

Peter, seems like you overlooked this one. Its counterpart is in Linus'
master already:

commit 9a6694cfa2390181dec936a17c0d9d21ef7b08d9
Author: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Date:   Thu Jul 30 16:48:24 2015 +0300

    perf/x86/intel/pt: Do not force sync packets on every schedule-in

Cheers,
--
Alex
* Re: [PATCH 3/3] perf/x86/intel/bts: Set itrace_started in pmu::start to match the new logic

From: Peter Zijlstra @ 2015-09-08 12:28 UTC
To: Alexander Shishkin
Cc: Ingo Molnar, linux-kernel, adrian.hunter, x86, hpa, acme

On Tue, Sep 08, 2015 at 03:02:08PM +0300, Alexander Shishkin wrote:
> Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:
>
> > Since hw::itrace_started is now set in pmu::start to signal the beginning
> > of the trace, do so also in the intel_bts driver.
> >
> > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>
> Peter, seems like you overlooked this one. Its counterpart is in Linus'
> master already:
>
> commit 9a6694cfa2390181dec936a17c0d9d21ef7b08d9
> Author: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Date:   Thu Jul 30 16:48:24 2015 +0300
>
>     perf/x86/intel/pt: Do not force sync packets on every schedule-in

Urgh, sorry about that.
* [tip:perf/core] perf/x86/intel/bts: Set event->hw.itrace_started in pmu::start to match the new logic

From: tip-bot for Alexander Shishkin @ 2015-09-13 10:56 UTC
To: linux-tip-commits
Cc: torvalds, alexander.shishkin, tglx, hpa, linux-kernel, peterz, mingo

Commit-ID:  d249872939bfa86c9cce44a56a8cbdbc7086519b
Gitweb:     http://git.kernel.org/tip/d249872939bfa86c9cce44a56a8cbdbc7086519b
Author:     Alexander Shishkin <alexander.shishkin@linux.intel.com>
AuthorDate: Fri, 17 Jul 2015 16:34:10 +0300
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 11 Sep 2015 10:06:03 +0200

perf/x86/intel/bts: Set event->hw.itrace_started in pmu::start to match the new logic

Since event->hw.itrace_started is now set in pmu::start() to signal the
beginning of the trace, do so also in the intel_bts driver.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Cc: hpa@zytor.com
Link: http://lkml.kernel.org/r/1437140050-23363-4-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/perf_event_intel_bts.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_bts.c b/arch/x86/kernel/cpu/perf_event_intel_bts.c
index 54690e8..d1c0f25 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_bts.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_bts.c
@@ -222,6 +222,7 @@ static void __bts_event_start(struct perf_event *event)
 	if (!buf || bts_buffer_is_full(buf, bts))
 		return;
 
+	event->hw.itrace_started = 1;
 	event->hw.state = 0;
 
 	if (!buf->snapshot)