From: kan.liang@intel.com
To: a.p.zijlstra@chello.nl, eranian@google.com
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com, paulus@samba.org,
acme@kernel.org, ak@linux.intel.com, kan.liang@intel.com, "Yan,
Zheng" <zheng.z.yan@intel.com>
Subject: [PATCH V5 12/16] perf, x86: use LBR call stack to get user callchain
Date: Sun, 7 Jan 2001 21:31:59 -0500 [thread overview]
Message-ID: <978921123-30240-3-git-send-email-kan.liang@intel.com> (raw)
In-Reply-To: <978921123-30240-1-git-send-email-kan.liang@intel.com>
From: Kan Liang <kan.liang@intel.com>
Haswell has a new feature that utilizes the existing Last Branch Record
facility to record call chains. When the feature is enabled, function
call will be collected as normal, but as return instructions are
executed
the last captured branch record is popped from the on-chip LBR
registers.
The LBR call stack facility can help perf to get call chains of progam
without frame pointer.
This patch makes x86's perf_callchain_user() failback to use LBR call
stack data when there is no frame pointer in the user program. The
'from'
address of branch entry is used as 'return' address of function call.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
---
arch/x86/kernel/cpu/perf_event.c | 34 ++++++++++++++++++++++++++----
arch/x86/kernel/cpu/perf_event_intel.c | 2 +-
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 2 ++
include/linux/perf_event.h | 1 +
4 files changed, 34 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 71e293a..0a71f04 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -2005,12 +2005,29 @@ static unsigned long get_segment_base(unsigned int segment)
return get_desc_base(desc + idx);
}
+static inline void
+perf_callchain_lbr_callstack(struct perf_callchain_entry *entry,
+ struct perf_sample_data *data)
+{
+ struct perf_branch_stack *br_stack = data->br_stack;
+
+ if (br_stack && br_stack->user_callstack) {
+ int i = 0;
+
+ while (i < br_stack->nr && entry->nr < PERF_MAX_STACK_DEPTH) {
+ perf_callchain_store(entry, br_stack->entries[i].from);
+ i++;
+ }
+ }
+}
+
#ifdef CONFIG_COMPAT
#include <asm/compat.h>
static inline int
-perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
+perf_callchain_user32(struct perf_callchain_entry *entry,
+ struct pt_regs *regs, struct perf_sample_data *data)
{
/* 32-bit process in 64-bit kernel. */
unsigned long ss_base, cs_base;
@@ -2039,11 +2056,16 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
perf_callchain_store(entry, cs_base + frame.return_address);
fp = compat_ptr(ss_base + frame.next_frame);
}
+
+ if (fp == compat_ptr(regs->bp))
+ perf_callchain_lbr_callstack(entry, data);
+
return 1;
}
#else
static inline int
-perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
+perf_callchain_user32(struct perf_callchain_entry *entry,
+ struct pt_regs *regs, struct perf_sample_data *data)
{
return 0;
}
@@ -2073,12 +2095,12 @@ void perf_callchain_user(struct perf_callchain_entry *entry,
if (!current->mm)
return;
- if (perf_callchain_user32(regs, entry))
+ if (perf_callchain_user32(entry, regs, data))
return;
while (entry->nr < PERF_MAX_STACK_DEPTH) {
unsigned long bytes;
- frame.next_frame = NULL;
+ frame.next_frame = NULL;
frame.return_address = 0;
bytes = copy_from_user_nmi(&frame, fp, sizeof(frame));
@@ -2091,6 +2113,10 @@ void perf_callchain_user(struct perf_callchain_entry *entry,
perf_callchain_store(entry, frame.return_address);
fp = frame.next_frame;
}
+
+ /* try LBR callstack if there is no frame pointer */
+ if (fp == (void __user *)regs->bp)
+ perf_callchain_lbr_callstack(entry, data);
}
/*
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 49e7d14..93e8038 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1404,7 +1404,7 @@ again:
perf_sample_data_init(&data, 0, event->hw.last_period);
- if (has_branch_stack(event))
+ if (needs_branch_stack(event))
data.br_stack = &cpuc->lbr_stack;
if (perf_event_overflow(event, &data, regs))
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index 6aabbb4..5afb21b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -743,6 +743,8 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
int i, j, type;
bool compress = false;
+ cpuc->lbr_stack.user_callstack = branch_user_callstack(br_sel);
+
/* if sampling all branches, then nothing to filter */
if ((br_sel & X86_BR_ALL) == X86_BR_ALL)
return;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 8db3520..4d38d5e 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -75,6 +75,7 @@ struct perf_raw_record {
* recent branch.
*/
struct perf_branch_stack {
+ bool user_callstack;
__u64 nr;
struct perf_branch_entry entries[0];
};
--
1.8.3.2
next prev parent reply other threads:[~2014-09-10 2:58 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2001-01-08 2:31 [PATCH V5 10/16] perf, core: simplify need branch stack check kan.liang
2001-01-08 2:31 ` [PATCH V5 11/16] perf, core: Pass perf_sample_data to perf_callchain() kan.liang
2001-01-08 2:31 ` kan.liang [this message]
2001-01-08 2:32 ` [PATCH V5 13/16] perf, x86: re-organize code that implicitly enables LBR/PEBS kan.liang
2001-01-08 2:32 ` [PATCH V5 14/16] perf, x86: enable LBR callstack when recording callchain kan.liang
2001-01-08 2:32 ` [PATCH V5 15/16] perf, x86: disable FREEZE_LBRS_ON_PMI when LBR operates in callstack mode kan.liang
2001-01-08 2:32 ` [PATCH V5 16/16] perf, x86: Discard zero length call entries in LBR call stack kan.liang
-- strict thread matches above, loose matches on Subject: below --
2014-07-07 6:28 [PATCH v5 00/16] perf, x86: Haswell LBR call stack support Yan, Zheng
2014-07-07 6:28 ` [PATCH v5 12/16] perf, x86: use LBR call stack to get user callchain Yan, Zheng
2014-09-10 14:08 [PATCH V5 00/16] perf, x86: Haswell LBR call stack support kan.liang
2014-09-10 14:09 ` [PATCH V5 12/16] perf, x86: use LBR call stack to get user callchain kan.liang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=978921123-30240-3-git-send-email-kan.liang@intel.com \
--to=kan.liang@intel.com \
--cc=a.p.zijlstra@chello.nl \
--cc=acme@kernel.org \
--cc=ak@linux.intel.com \
--cc=eranian@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=paulus@samba.org \
--cc=zheng.z.yan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox