linux-kernel.vger.kernel.org archive mirror
* (no subject)
@ 2015-05-28  4:13 Andi Kleen
  2015-05-28  4:13 ` [PATCH 1/5] x86, perf: Allow time stamp for free running PEBSv3 Andi Kleen
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Andi Kleen @ 2015-05-28  4:13 UTC
  To: peterz; +Cc: acme, linux-kernel, jolsa, eranian

[Repost; I forgot to copy linux-kernel earlier. Apologies if you
see it twice.]

Skylake moved to 32 Last Branch Records (LBRs), up from 16 previously.
The current call stack LBR implementation reads all LBRs and
also saves/restores them on context switch. This patchkit
adds some optimizations that avoid, in most cases, the extra cost
of the larger number of LBRs in call-stack mode, unless a nesting
depth greater than 16 is actually needed. It applies on top of the
earlier Skylake code. Some of the optimizations will also benefit
earlier CPUs, such as Haswell.

Note: one patch is for perf user space; the rest are kernel patches.

-Andi


* [PATCH 1/5] x86, perf: Fix LBR call stack save/restore
@ 2015-10-20 18:46 Andi Kleen
  2015-10-20 18:46 ` [PATCH 2/5] x86, perf: Add option to disable reading branch flags/cycles Andi Kleen
  0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2015-10-20 18:46 UTC
  To: peterz; +Cc: acme, jolsa, linux-kernel, Andi Kleen, stable

From: Andi Kleen <ak@linux.intel.com>

This fixes a bug introduced by the earlier commit 90405aa02. The bug
could lead to lost LBR call stacks. When restoring the LBR
state we need to use the TOS (top of stack) of the previous context,
not that of the current context. To do that we need to save and
restore the TOS.

Cc: <stable@vger.kernel.org> # 4.2+
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 arch/x86/kernel/cpu/perf_event.h           | 1 +
 arch/x86/kernel/cpu/perf_event_intel_lbr.c | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index d871c94..1b47164 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -631,6 +631,7 @@ struct x86_perf_task_context {
 	u64 lbr_from[MAX_LBR_ENTRIES];
 	u64 lbr_to[MAX_LBR_ENTRIES];
 	u64 lbr_info[MAX_LBR_ENTRIES];
+	int tos;
 	int lbr_callstack_users;
 	int lbr_stack_state;
 };
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index ad0b8b0..0e4ea00 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -239,7 +239,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 	}
 
 	mask = x86_pmu.lbr_nr - 1;
-	tos = intel_pmu_lbr_tos();
+	tos = task_ctx->tos;
 	for (i = 0; i < tos; i++) {
 		lbr_idx = (tos - i) & mask;
 		wrmsrl(x86_pmu.lbr_from + lbr_idx, task_ctx->lbr_from[i]);
@@ -247,6 +247,7 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx)
 		if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)
 			wrmsrl(MSR_LBR_INFO_0 + lbr_idx, task_ctx->lbr_info[i]);
 	}
+	wrmsrl(x86_pmu.lbr_tos, tos);
 	task_ctx->lbr_stack_state = LBR_NONE;
 }
 
@@ -270,6 +271,7 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
 		if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_INFO)
 			rdmsrl(MSR_LBR_INFO_0 + lbr_idx, task_ctx->lbr_info[i]);
 	}
+	task_ctx->tos = tos;
 	task_ctx->lbr_stack_state = LBR_VALID;
 }
 
-- 
2.4.3



Thread overview: 11+ messages
2015-05-28  4:13 Andi Kleen
2015-05-28  4:13 ` [PATCH 1/5] x86, perf: Allow time stamp for free running PEBSv3 Andi Kleen
2015-08-04  8:56   ` [tip:perf/core] perf/x86/intel/lbr: " tip-bot for Andi Kleen
2015-05-28  4:13 ` [PATCH 2/5] x86, perf: Add option to disable reading branch flags/cycles Andi Kleen
2015-06-15 10:48   ` Peter Zijlstra
2015-05-28  4:13 ` [PATCH 3/5] perf, tools: Disable branch flags/cycles for lbr call graph Andi Kleen
2015-05-28  4:13 ` [PATCH 4/5] x86, perf: Use correct index to save/restore LBR_INFO with callstack Andi Kleen
2015-08-04  8:59   ` [tip:perf/core] perf/x86/intel/lbr: Use correct index to save/ restore LBR_INFO with call stack tip-bot for Andi Kleen
2015-05-28  4:13 ` [PATCH 5/5] x86, perf: Limit LBR accesses to TOS in callstack mode Andi Kleen
2015-08-04  8:59   ` [tip:perf/core] perf/x86/intel/lbr: " tip-bot for Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2015-10-20 18:46 [PATCH 1/5] x86, perf: Fix LBR call stack save/restore Andi Kleen
2015-10-20 18:46 ` [PATCH 2/5] x86, perf: Add option to disable reading branch flags/cycles Andi Kleen
