* [GIT PULL 00/12] perf/core improvements and fixes
@ 2016-05-17 2:45 Arnaldo Carvalho de Melo
2016-05-17 2:45 ` [PATCH 01/12] perf stat: Avoid fractional digits for integer scales Arnaldo Carvalho de Melo
` (12 more replies)
0 siblings, 13 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Alexei Starovoitov,
Ananth N Mavinakayanahalli, Andi Kleen, Brendan Gregg,
David Ahern, Ekaterina Tumanova, Frederic Weisbecker, He Kuang,
Hemant Kumar, Jiri Olsa, Josh Poimboeuf, Kan Liang,
Linus Torvalds, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
Pekka Enberg, Peter Zijlstra, Stephane Eranian,
Sukadev Bhattiprolu, Thomas Gleixner, Vince Weaver, Wang Nan,
Zefan Li
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Hi Ingo,
Please consider pulling,
- Arnaldo
The following changes since commit 3f56e687a138481894a1088d5aa7d41951bdb020:
perf/core: Disable the event on a truncated AUX record (2016-05-12 10:14:55 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160516
for you to fetch changes up to a29d5c9b8167dbc21a7ca8c0302e3799f9063b4e:
perf tools: Separate accounting of contexts and real addresses in a stack trace (2016-05-16 23:11:54 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
User visible:
- Honour the kernel.perf_event_max_stack knob more precisely by not counting
PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)
- Fix indentation of 'stalled-cycles-backend' in 'perf stat' (Namhyung Kim)
- Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)
- Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)
- Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)
- Store vdso buildid unconditionally, as it appears in callchains and
we're not checking those when creating the build-id table, so we
end up not being able to resolve VDSO symbols when doing analysis
on a different machine than the one where recording was done, possibly
of a different arch even (arm -> x86_64) (He Kuang)
Infrastructure:
- Generalize max_stack sysctl handler, so that it can be used for configuring
multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)
Cleanups:
- Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
open coded strings (Masami Hiramatsu)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Andi Kleen (1):
perf stat: Avoid fractional digits for integer scales
Arnaldo Carvalho de Melo (6):
perf core: Generalize max_stack sysctl handler
perf core: Pass max stack as a perf_callchain_entry context
perf core: Add a 'nr' field to perf_event_callchain_context
perf core: Add perf_callchain_store_context() helper
perf core: Separate accounting of contexts and real addresses in a stack trace
perf tools: Separate accounting of contexts and real addresses in a stack trace
He Kuang (1):
perf symbols: Store vdso buildid unconditionally
Masami Hiramatsu (1):
perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE
Namhyung Kim (3):
perf stat: Fix indentation of stalled backend cycle
perf stat: Update runtime using cpu-clock event
perf stat: Use cpu-clock event for cpu targets
Documentation/sysctl/kernel.txt | 14 ++++++++++++++
arch/arc/kernel/perf_event.c | 6 +++---
arch/arm/kernel/perf_callchain.c | 10 +++++-----
arch/arm64/kernel/perf_callchain.c | 14 +++++++-------
arch/metag/kernel/perf_callchain.c | 10 +++++-----
arch/mips/kernel/perf_event.c | 12 ++++++------
arch/powerpc/perf/callchain.c | 20 ++++++++++----------
arch/s390/kernel/perf_event.c | 4 ++--
arch/sh/kernel/perf_callchain.c | 4 ++--
arch/sparc/kernel/perf_event.c | 14 +++++++-------
arch/tile/kernel/perf_event.c | 6 +++---
arch/x86/events/core.c | 14 +++++++-------
arch/xtensa/kernel/perf_event.c | 10 +++++-----
include/linux/perf_event.h | 34 +++++++++++++++++++++++++++++-----
include/uapi/linux/perf_event.h | 1 +
kernel/bpf/stackmap.c | 3 ++-
kernel/events/callchain.c | 36 ++++++++++++++++++++++++------------
kernel/sysctl.c | 11 ++++++++++-
tools/perf/builtin-buildid-cache.c | 8 ++++----
tools/perf/builtin-stat.c | 22 +++++++++++++---------
tools/perf/perf.c | 3 +++
tools/perf/util/annotate.c | 2 +-
tools/perf/util/build-id.c | 2 +-
tools/perf/util/dso.c | 3 ++-
tools/perf/util/machine.c | 28 ++++++++++++++++++----------
tools/perf/util/stat-shadow.c | 8 +++++---
tools/perf/util/symbol.c | 10 +++++-----
tools/perf/util/symbol.h | 3 +++
tools/perf/util/util.c | 3 ++-
tools/perf/util/util.h | 3 ++-
30 files changed, 201 insertions(+), 117 deletions(-)
* [PATCH 01/12] perf stat: Avoid fractional digits for integer scales
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andi Kleen, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Andi Kleen <ak@linux.intel.com>
When the scaling factor is a full integer don't display fractional
digits. This avoids unnecessary .00 output for topdown metrics with
scale factors.
v2: Remove redundant check.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462489447-31832-7-git-send-email-andi@firstfloor.org
[ Rename 'round' to 'stat_round' as 'round' is defined in math.h,
included by this patch, and this breaks the build on ubuntu 12.04 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-stat.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 5645a8361de6..16a923c1633b 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -66,6 +66,7 @@
#include <stdlib.h>
#include <sys/prctl.h>
#include <locale.h>
+#include <math.h>
#define DEFAULT_SEPARATOR " "
#define CNTR_NOT_SUPPORTED "<not supported>"
@@ -986,12 +987,12 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
const char *fmt;
if (csv_output) {
- fmt = sc != 1.0 ? "%.2f%s" : "%.0f%s";
+ fmt = floor(sc) != sc ? "%.2f%s" : "%.0f%s";
} else {
if (big_num)
- fmt = sc != 1.0 ? "%'18.2f%s" : "%'18.0f%s";
+ fmt = floor(sc) != sc ? "%'18.2f%s" : "%'18.0f%s";
else
- fmt = sc != 1.0 ? "%18.2f%s" : "%18.0f%s";
+ fmt = floor(sc) != sc ? "%18.2f%s" : "%18.0f%s";
}
aggr_printout(evsel, id, nr);
@@ -1995,7 +1996,7 @@ static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
union perf_event *event,
struct perf_session *session)
{
- struct stat_round_event *round = &event->stat_round;
+ struct stat_round_event *stat_round = &event->stat_round;
struct perf_evsel *counter;
struct timespec tsh, *ts = NULL;
const char **argv = session->header.env.cmdline_argv;
@@ -2004,12 +2005,12 @@ static int process_stat_round_event(struct perf_tool *tool __maybe_unused,
evlist__for_each(evsel_list, counter)
perf_stat_process_counter(&stat_config, counter);
- if (round->type == PERF_STAT_ROUND_TYPE__FINAL)
- update_stats(&walltime_nsecs_stats, round->time);
+ if (stat_round->type == PERF_STAT_ROUND_TYPE__FINAL)
+ update_stats(&walltime_nsecs_stats, stat_round->time);
- if (stat_config.interval && round->time) {
- tsh.tv_sec = round->time / NSECS_PER_SEC;
- tsh.tv_nsec = round->time % NSECS_PER_SEC;
+ if (stat_config.interval && stat_round->time) {
+ tsh.tv_sec = stat_round->time / NSECS_PER_SEC;
+ tsh.tv_nsec = stat_round->time % NSECS_PER_SEC;
ts = &tsh;
}
--
2.5.5
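
As a rough illustration of the format selection in the patch above, here is a minimal userspace sketch. The names toy_scale_is_integer() and toy_print_count() are hypothetical and not part of perf. The patch itself tests floor(sc) != sc; the sketch swaps in an integer cast so that it builds without libm, which is an equivalent integer test only while the scale fits in a long long.

```c
#include <stdio.h>

/* Hypothetical helper: nonzero when the scale has no fractional part.
 * Stand-in for the patch's floor(sc) == sc test; the cast avoids libm
 * and assumes sc fits in a long long. */
static int toy_scale_is_integer(double sc)
{
	return (double)(long long)sc == sc;
}

/* Hypothetical helper: format a counter value, dropping the fractional
 * digits when the scale factor is a whole number. */
static int toy_print_count(char *buf, size_t len, double avg, double sc)
{
	if (toy_scale_is_integer(sc))
		return snprintf(buf, len, "%.0f", avg);
	return snprintf(buf, len, "%.2f", avg);
}
```

With the old test (sc != 1.0), a scale of 100 would still have selected the two-digit format; with the integer test only genuinely fractional scales do.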
* [PATCH 02/12] perf symbols: Store vdso buildid unconditionally
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, He Kuang, Adrian Hunter, Alexander Shishkin,
Andi Kleen, David Ahern, Ekaterina Tumanova, Jiri Olsa,
Josh Poimboeuf, Kan Liang, Masami Hiramatsu, Namhyung Kim,
Pekka Enberg, Peter Zijlstra, Stephane Eranian,
Sukadev Bhattiprolu, Wang Nan, Arnaldo Carvalho de Melo
From: He Kuang <hekuang@huawei.com>
When unwinding callchains on a different machine, vdso info should be
available so that the unwind process is not interrupted if an address
falls into the vdso region. But in most cases the addresses of sample
events are not in the vdso range, so the buildid of a zero-hit vdso is
not stored in perf.data.
This patch stores the vdso buildid regardless of whether the vdso is hit
or not.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463042596-61703-3-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/build-id.c | 2 +-
tools/perf/util/dso.c | 3 ++-
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index bff425e1232c..67e5966503b2 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -256,7 +256,7 @@ static int machine__write_buildid_table(struct machine *machine, int fd)
size_t name_len;
bool in_kernel = false;
- if (!pos->hit)
+ if (!pos->hit && !dso__is_vdso(pos))
continue;
if (dso__is_vdso(pos)) {
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 3357479082ca..75b75615e2f8 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -7,6 +7,7 @@
#include "auxtrace.h"
#include "util.h"
#include "debug.h"
+#include "vdso.h"
char dso__symtab_origin(const struct dso *dso)
{
@@ -1169,7 +1170,7 @@ bool __dsos__read_build_ids(struct list_head *head, bool with_hits)
struct dso *pos;
list_for_each_entry(pos, head, node) {
- if (with_hits && !pos->hit)
+ if (with_hits && !pos->hit && !dso__is_vdso(pos))
continue;
if (pos->has_build_id) {
have_build_id = true;
--
2.5.5
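
The filter change in the patch above can be sketched as a standalone predicate. This is a simplified model, not perf code: struct toy_dso and toy_should_store_buildid() are hypothetical, and the is_vdso flag stands in for the result of dso__is_vdso().

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for perf's struct dso. */
struct toy_dso {
	bool hit;     /* at least one sample resolved into this DSO */
	bool is_vdso; /* stands in for dso__is_vdso(dso) */
};

/* Returns true when the DSO's build-id should be written out.
 * Mirrors the patched condition: an entry is skipped only when it
 * was never hit AND it is not the vDSO. */
static bool toy_should_store_buildid(const struct toy_dso *dso)
{
	return dso->hit || dso->is_vdso;
}
```

A vDSO with zero hits now passes the filter, which is the case the commit message describes.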
* [PATCH 03/12] perf stat: Fix indentation of stalled backend cycle
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Namhyung Kim, Andi Kleen, Jiri Olsa, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Namhyung Kim <namhyung@kernel.org>
The commit 140aeadc1fb5 ("perf stat: Abstract stat metrics printing")
changed how shadow metrics are printed, but it did not update the
width of the stalled backend cycles metric to 7.2 ("%7.2f%%") like the
others. This resulted in misaligned output like the one below:
Performance counter stats for 'pwd':
0.638313 task-clock (msec) # 0.567 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.085 M/sec
885,600 cycles # 1.387 GHz
558,438 stalled-cycles-frontend # 63.06% frontend cycles idle
431,355 stalled-cycles-backend # 48.71% backend cycles idle
674,956 instructions # 0.76 insn per cycle
# 0.83 stalled cycles per insn
130,380 branches # 204.257 M/sec
<not counted> branch-misses
0.001125426 seconds time elapsed
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 140aeadc1fb5 ("perf stat: Abstract stat metrics printing")
Link: http://lkml.kernel.org/r/1463119263-5569-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/stat-shadow.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index fdb71961143e..61200fcac5ef 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -188,7 +188,7 @@ static void print_stalled_cycles_backend(int cpu,
color = get_ratio_color(GRC_STALLED_CYCLES_BE, ratio);
- out->print_metric(out->ctx, color, "%6.2f%%", "backend cycles idle", ratio);
+ out->print_metric(out->ctx, color, "%7.2f%%", "backend cycles idle", ratio);
}
static void print_branch_misses(int cpu,
--
2.5.5
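
To see why the one-character width change above matters, here is a small sketch that formats the same ratio with both field widths. toy_format_ratio() is a hypothetical helper; only the "%6.2f%%" and "%7.2f%%" formats come from the patch.

```c
#include <stdio.h>

/* Hypothetical helper: format a percentage with a given minimum field
 * width, equivalent to "%6.2f%%" for width 6 and "%7.2f%%" for width 7.
 * Returns the number of characters written (excluding the NUL). */
static int toy_format_ratio(char *buf, size_t len, int width, double ratio)
{
	return snprintf(buf, len, "%*.2f%%", width, ratio);
}
```

For the 48.71 value from the sample output, width 6 produces a 7-character string and width 7 an 8-character one, so the backend line ended up one column short of the frontend line until the widths matched.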
* [PATCH 04/12] perf stat: Update runtime using cpu-clock event
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Namhyung Kim, Andi Kleen, Jiri Olsa, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Namhyung Kim <namhyung@kernel.org>
Currently only the task-clock event updates runtime_nsecs_stats, so the
shadow metrics cannot be shown when using cpu-clock events. However,
cpu-clock works basically the same as task-clock, so there is no reason
not to update the runtime for it as well, IMHO.
Before:
# perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1
Performance counter stats for 'system wide':
1217.759506 cpu-clock (msec)
93 context-switches
61 page-faults
18,958,022 cycles
0.101393794 seconds time elapsed
After:
Performance counter stats for 'system wide':
1220.471884 cpu-clock (msec) # 12.013 CPUs utilized
118 context-switches # 0.097 K/sec
59 page-faults # 0.048 K/sec
17,941,247 cycles # 0.015 GHz
0.101594777 seconds time elapsed
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/stat-shadow.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 61200fcac5ef..aa9efe08762b 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -94,7 +94,8 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 *count,
{
int ctx = evsel_context(counter);
- if (perf_evsel__match(counter, SOFTWARE, SW_TASK_CLOCK))
+ if (perf_evsel__match(counter, SOFTWARE, SW_TASK_CLOCK) ||
+ perf_evsel__match(counter, SOFTWARE, SW_CPU_CLOCK))
update_stats(&runtime_nsecs_stats[cpu], count[0]);
else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
update_stats(&runtime_cycles_stats[ctx][cpu], count[0]);
@@ -444,7 +445,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
ratio = total / avg;
print_metric(ctxp, NULL, "%8.0f", "cycles / elision", ratio);
- } else if (perf_evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK)) {
+ } else if (perf_evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK) ||
+ perf_evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK)) {
if ((ratio = avg_stats(&walltime_nsecs_stats)) != 0)
print_metric(ctxp, NULL, "%8.3f", "CPUs utilized",
avg / ratio);
--
2.5.5
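
The "CPUs utilized" figure in the "After" output above is the clock event's runtime divided by the wall-clock time, as in the avg / ratio expression in the diff. A minimal sketch of that arithmetic follows; toy_cpus_utilized() is a hypothetical name, and returning 0.0 for a zero wall time is an assumption of this sketch (perf simply skips printing the metric in that case).

```c
/* Hypothetical helper: ratio of counted clock time to elapsed wall time,
 * both in nanoseconds. Returning 0.0 when no wall time elapsed is an
 * assumption made for this sketch only. */
static double toy_cpus_utilized(double runtime_nsecs, double walltime_nsecs)
{
	if (walltime_nsecs == 0.0)
		return 0.0;
	return runtime_nsecs / walltime_nsecs;
}
```

Plugging in the numbers from the example, 1220.471884 msec of cpu-clock over 0.101594777 seconds elapsed gives roughly 12.013, which matches the printed value.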
* [PATCH 05/12] perf stat: Use cpu-clock event for cpu targets
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Namhyung Kim, Andi Kleen, Jiri Olsa, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Namhyung Kim <namhyung@kernel.org>
Currently 'perf stat' always counts the task-clock event by default. But
this is somewhat confusing for system-wide targets (especially with
'sleep N', as the 'sleep' task just sleeps and doesn't use cputime).
Changing to the cpu-clock event for that case makes more sense IMHO.
Before:
# perf stat -a sleep 0.1
Performance counter stats for 'system wide':
403.038603 task-clock (msec) # 4.001 CPUs utilized
150 context-switches # 0.372 K/sec
7 cpu-migrations # 0.017 K/sec
71 page-faults # 0.176 K/sec
23,705,169 cycles # 0.059 GHz
15,888,166 instructions # 0.67 insn per cycle
3,326,078 branches # 8.253 M/sec
87,643 branch-misses # 2.64% of all branches
0.100737009 seconds time elapsed
#
After:
# perf stat -a sleep 0.1
Performance counter stats for 'system wide':
404.271182 cpu-clock (msec) # 4.000 CPUs utilized
143 context-switches # 0.354 K/sec
13 cpu-migrations # 0.032 K/sec
73 page-faults # 0.181 K/sec
22,119,220 cycles # 0.055 GHz
13,622,065 instructions # 0.62 insn per cycle
2,918,769 branches # 7.220 M/sec
85,033 branch-misses # 2.91% of all branches
0.101073089 seconds time elapsed
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-stat.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 16a923c1633b..efdd23221c16 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1905,6 +1905,9 @@ static int add_default_attributes(void)
}
if (!evsel_list->nr_entries) {
+ if (target__has_cpu(&target))
+ default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
+
if (perf_evlist__add_default_attrs(evsel_list, default_attrs0) < 0)
return -1;
if (pmu_have_event("cpu", "stalled-cycles-frontend")) {
--
2.5.5
* [PATCH 06/12] perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Masami Hiramatsu, Ananth N Mavinakayanahalli,
Brendan Gregg, Hemant Kumar, Namhyung Kim, Peter Zijlstra,
Arnaldo Carvalho de Melo
From: Masami Hiramatsu <mhiramat@kernel.org>
Instead of using a raw string, use DSO__NAME_KALLSYMS and
DSO__NAME_KCORE macros for kallsyms and kcore.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-buildid-cache.c | 8 ++++----
tools/perf/util/annotate.c | 2 +-
tools/perf/util/machine.c | 2 +-
tools/perf/util/symbol.c | 10 +++++-----
tools/perf/util/symbol.h | 3 +++
5 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index 632efc6b79a0..d75bded21fe0 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -119,8 +119,8 @@ static int build_id_cache__add_kcore(const char *filename, bool force)
if (build_id_cache__kcore_buildid(from_dir, sbuildid) < 0)
return -1;
- scnprintf(to_dir, sizeof(to_dir), "%s/[kernel.kcore]/%s",
- buildid_dir, sbuildid);
+ scnprintf(to_dir, sizeof(to_dir), "%s/%s/%s",
+ buildid_dir, DSO__NAME_KCORE, sbuildid);
if (!force &&
!build_id_cache__kcore_existing(from_dir, to_dir, sizeof(to_dir))) {
@@ -131,8 +131,8 @@ static int build_id_cache__add_kcore(const char *filename, bool force)
if (build_id_cache__kcore_dir(dir, sizeof(dir)))
return -1;
- scnprintf(to_dir, sizeof(to_dir), "%s/[kernel.kcore]/%s/%s",
- buildid_dir, sbuildid, dir);
+ scnprintf(to_dir, sizeof(to_dir), "%s/%s/%s/%s",
+ buildid_dir, DSO__NAME_KCORE, sbuildid, dir);
if (mkdir_p(to_dir, 0755))
return -1;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4db73d5a0dbc..b811924e5e1b 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1122,7 +1122,7 @@ int symbol__annotate(struct symbol *sym, struct map *map, size_t privsize)
} else if (dso__is_kcore(dso)) {
goto fallback;
} else if (readlink(symfs_filename, command, sizeof(command)) < 0 ||
- strstr(command, "[kernel.kallsyms]") ||
+ strstr(command, DSO__NAME_KALLSYMS) ||
access(symfs_filename, R_OK)) {
free(filename);
fallback:
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 639a2903065e..18dd96bdde05 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -709,7 +709,7 @@ static struct dso *machine__get_kernel(struct machine *machine)
if (machine__is_host(machine)) {
vmlinux_name = symbol_conf.vmlinux_name;
if (!vmlinux_name)
- vmlinux_name = "[kernel.kallsyms]";
+ vmlinux_name = DSO__NAME_KALLSYMS;
kernel = machine__findnew_kernel(machine, vmlinux_name,
"[kernel]", DSO_TYPE_KERNEL);
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 7fb33304fb4e..2252b545ff43 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1662,8 +1662,8 @@ static char *dso__find_kallsyms(struct dso *dso, struct map *map)
build_id__sprintf(dso->build_id, sizeof(dso->build_id), sbuild_id);
- scnprintf(path, sizeof(path), "%s/[kernel.kcore]/%s", buildid_dir,
- sbuild_id);
+ scnprintf(path, sizeof(path), "%s/%s/%s", buildid_dir,
+ DSO__NAME_KCORE, sbuild_id);
/* Use /proc/kallsyms if possible */
if (is_host) {
@@ -1699,8 +1699,8 @@ static char *dso__find_kallsyms(struct dso *dso, struct map *map)
if (!find_matching_kcore(map, path, sizeof(path)))
return strdup(path);
- scnprintf(path, sizeof(path), "%s/[kernel.kallsyms]/%s",
- buildid_dir, sbuild_id);
+ scnprintf(path, sizeof(path), "%s/%s/%s",
+ buildid_dir, DSO__NAME_KALLSYMS, sbuild_id);
if (access(path, F_OK)) {
pr_err("No kallsyms or vmlinux with build-id %s was found\n",
@@ -1769,7 +1769,7 @@ do_kallsyms:
if (err > 0 && !dso__is_kcore(dso)) {
dso->binary_type = DSO_BINARY_TYPE__KALLSYMS;
- dso__set_long_name(dso, "[kernel.kallsyms]", false);
+ dso__set_long_name(dso, DSO__NAME_KALLSYMS, false);
map__fixup_start(map);
map__fixup_end(map);
}
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 2b5e4ed76fcb..25f2fd672c2e 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -44,6 +44,9 @@ Elf_Scn *elf_section_by_name(Elf *elf, GElf_Ehdr *ep,
#define DMGL_ANSI (1 << 1) /* Include const, volatile, etc */
#endif
+#define DSO__NAME_KALLSYMS "[kernel.kallsyms]"
+#define DSO__NAME_KCORE "[kernel.kcore]"
+
/** struct symbol - symtab entry
*
* @ignore - resolvable but tools ignore it (e.g. idle routines)
--
2.5.5
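
The path construction touched by the patch above can be sketched in userspace as follows. The two macros are copied from the symbol.h hunk; toy_kcore_cache_dir() is a hypothetical helper, and plain snprintf() stands in for perf's scnprintf().

```c
#include <stdio.h>

#define DSO__NAME_KALLSYMS	"[kernel.kallsyms]"
#define DSO__NAME_KCORE		"[kernel.kcore]"

/* Hypothetical helper: build "<buildid_dir>/[kernel.kcore]/<build-id>".
 * snprintf() is used here in place of perf's scnprintf(). */
static int toy_kcore_cache_dir(char *buf, size_t len,
			       const char *buildid_dir, const char *sbuild_id)
{
	return snprintf(buf, len, "%s/%s/%s",
			buildid_dir, DSO__NAME_KCORE, sbuild_id);
}
```

Passing the name as a "%s" argument rather than embedding it in the format string keeps every user of the directory name tied to a single definition.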
* [PATCH 07/12] perf core: Generalize max_stack sysctl handler
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Peter Zijlstra
From: Arnaldo Carvalho de Melo <acme@redhat.com>
So that it can be used for other stack related knobs, such as the
upcoming one to tweak the max number of contexts per stack sample.
In all those cases we can only change the value if there are no perf
sessions collecting stacks, so they need to grab that mutex, etc.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-8t3fk94wuzp8m2z1n4gc0s17@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
kernel/events/callchain.c | 5 +++--
kernel/sysctl.c | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index b9325e7dcba1..7fc89939ede9 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -228,7 +228,8 @@ exit_put:
int perf_event_max_stack_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
- int new_value = sysctl_perf_event_max_stack, ret;
+ int *value = table->data;
+ int new_value = *value, ret;
struct ctl_table new_table = *table;
new_table.data = &new_value;
@@ -240,7 +241,7 @@ int perf_event_max_stack_handler(struct ctl_table *table, int write,
if (atomic_read(&nr_callchain_events))
ret = -EBUSY;
else
- sysctl_perf_event_max_stack = new_value;
+ *value = new_value;
mutex_unlock(&callchain_mutex);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c8b318663525..0ec6907a16b3 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1149,7 +1149,7 @@ static struct ctl_table kern_table[] = {
},
{
.procname = "perf_event_max_stack",
- .data = NULL, /* filled in by handler */
+ .data = &sysctl_perf_event_max_stack,
.maxlen = sizeof(sysctl_perf_event_max_stack),
.mode = 0644,
.proc_handler = perf_event_max_stack_handler,
--
2.5.5
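
The idea behind the generalization above is that the handler writes through the table's data pointer instead of a hard-coded global, so one handler can serve several knobs. The following is a userspace model only: struct toy_ctl_table, toy_nr_callchain_events and toy_max_stack_handler() are hypothetical, and the mutex and the proc_dointvec-style parsing of the real handler are left out.

```c
#include <errno.h>

/* Hypothetical, reduced stand-in for the kernel's struct ctl_table. */
struct toy_ctl_table {
	void *data; /* points at the knob this entry controls */
};

/* Stand-in for the count of events currently collecting callchains. */
static int toy_nr_callchain_events;

/* Update whichever knob table->data points at, refusing the change
 * while callchain users exist. Locking and input parsing are omitted. */
static int toy_max_stack_handler(struct toy_ctl_table *table, int new_value)
{
	int *value = table->data;

	if (toy_nr_callchain_events)
		return -EBUSY;

	*value = new_value;
	return 0;
}
```

Two table entries pointing at two different integers can share this handler, which is what lets a later knob reuse it without another copy of the busy check.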
* [PATCH 08/12] perf core: Pass max stack as a perf_callchain_entry context
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Alexei Starovoitov, Brendan Gregg,
David Ahern, Frederic Weisbecker, He Kuang, Jiri Olsa,
Linus Torvalds, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
Peter Zijlstra, Stephane Eranian, Thomas Gleixner, Vince Weaver,
Wang Nan, Zefan Li
From: Arnaldo Carvalho de Melo <acme@redhat.com>
This makes perf_callchain_{user,kernel}() receive the max stack
as context for the perf_callchain_entry, instead of accessing
the global sysctl_perf_event_max_stack.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
arch/arc/kernel/perf_event.c | 6 +++---
arch/arm/kernel/perf_callchain.c | 10 +++++-----
arch/arm64/kernel/perf_callchain.c | 14 +++++++-------
arch/metag/kernel/perf_callchain.c | 10 +++++-----
arch/mips/kernel/perf_event.c | 12 ++++++------
arch/powerpc/perf/callchain.c | 14 +++++++-------
arch/s390/kernel/perf_event.c | 4 ++--
arch/sh/kernel/perf_callchain.c | 4 ++--
arch/sparc/kernel/perf_event.c | 14 +++++++-------
arch/tile/kernel/perf_event.c | 6 +++---
arch/x86/events/core.c | 14 +++++++-------
arch/xtensa/kernel/perf_event.c | 10 +++++-----
include/linux/perf_event.h | 16 +++++++++++-----
kernel/bpf/stackmap.c | 3 ++-
kernel/events/callchain.c | 20 ++++++++++++--------
15 files changed, 84 insertions(+), 73 deletions(-)
diff --git a/arch/arc/kernel/perf_event.c b/arch/arc/kernel/perf_event.c
index 8b134cfe5e1f..6fd48021324b 100644
--- a/arch/arc/kernel/perf_event.c
+++ b/arch/arc/kernel/perf_event.c
@@ -48,7 +48,7 @@ struct arc_callchain_trace {
static int callchain_trace(unsigned int addr, void *data)
{
struct arc_callchain_trace *ctrl = data;
- struct perf_callchain_entry *entry = ctrl->perf_stuff;
+ struct perf_callchain_entry_ctx *entry = ctrl->perf_stuff;
perf_callchain_store(entry, addr);
if (ctrl->depth++ < 3)
@@ -58,7 +58,7 @@ static int callchain_trace(unsigned int addr, void *data)
}
void
-perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
struct arc_callchain_trace ctrl = {
.depth = 0,
@@ -69,7 +69,7 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
}
void
-perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
/*
* User stack can't be unwound trivially with kernel dwarf unwinder
diff --git a/arch/arm/kernel/perf_callchain.c b/arch/arm/kernel/perf_callchain.c
index 27563befa8a2..bc552e813e7b 100644
--- a/arch/arm/kernel/perf_callchain.c
+++ b/arch/arm/kernel/perf_callchain.c
@@ -31,7 +31,7 @@ struct frame_tail {
*/
static struct frame_tail __user *
user_backtrace(struct frame_tail __user *tail,
- struct perf_callchain_entry *entry)
+ struct perf_callchain_entry_ctx *entry)
{
struct frame_tail buftail;
unsigned long err;
@@ -59,7 +59,7 @@ user_backtrace(struct frame_tail __user *tail,
}
void
-perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
struct frame_tail __user *tail;
@@ -75,7 +75,7 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
tail = (struct frame_tail __user *)regs->ARM_fp - 1;
- while ((entry->nr < sysctl_perf_event_max_stack) &&
+ while ((entry->entry->nr < entry->max_stack) &&
tail && !((unsigned long)tail & 0x3))
tail = user_backtrace(tail, entry);
}
@@ -89,13 +89,13 @@ static int
callchain_trace(struct stackframe *fr,
void *data)
{
- struct perf_callchain_entry *entry = data;
+ struct perf_callchain_entry_ctx *entry = data;
perf_callchain_store(entry, fr->pc);
return 0;
}
void
-perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
struct stackframe fr;
diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index 32c3c6e70119..0d60150057cf 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -31,7 +31,7 @@ struct frame_tail {
*/
static struct frame_tail __user *
user_backtrace(struct frame_tail __user *tail,
- struct perf_callchain_entry *entry)
+ struct perf_callchain_entry_ctx *entry)
{
struct frame_tail buftail;
unsigned long err;
@@ -76,7 +76,7 @@ struct compat_frame_tail {
static struct compat_frame_tail __user *
compat_user_backtrace(struct compat_frame_tail __user *tail,
- struct perf_callchain_entry *entry)
+ struct perf_callchain_entry_ctx *entry)
{
struct compat_frame_tail buftail;
unsigned long err;
@@ -106,7 +106,7 @@ compat_user_backtrace(struct compat_frame_tail __user *tail,
}
#endif /* CONFIG_COMPAT */
-void perf_callchain_user(struct perf_callchain_entry *entry,
+void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
@@ -122,7 +122,7 @@ void perf_callchain_user(struct perf_callchain_entry *entry,
tail = (struct frame_tail __user *)regs->regs[29];
- while (entry->nr < sysctl_perf_event_max_stack &&
+ while (entry->entry->nr < entry->max_stack &&
tail && !((unsigned long)tail & 0xf))
tail = user_backtrace(tail, entry);
} else {
@@ -132,7 +132,7 @@ void perf_callchain_user(struct perf_callchain_entry *entry,
tail = (struct compat_frame_tail __user *)regs->compat_fp - 1;
- while ((entry->nr < sysctl_perf_event_max_stack) &&
+ while ((entry->entry->nr < entry->max_stack) &&
tail && !((unsigned long)tail & 0x3))
tail = compat_user_backtrace(tail, entry);
#endif
@@ -146,12 +146,12 @@ void perf_callchain_user(struct perf_callchain_entry *entry,
*/
static int callchain_trace(struct stackframe *frame, void *data)
{
- struct perf_callchain_entry *entry = data;
+ struct perf_callchain_entry_ctx *entry = data;
perf_callchain_store(entry, frame->pc);
return 0;
}
-void perf_callchain_kernel(struct perf_callchain_entry *entry,
+void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
struct stackframe frame;
diff --git a/arch/metag/kernel/perf_callchain.c b/arch/metag/kernel/perf_callchain.c
index 252abc12a5a3..b3261a98b15b 100644
--- a/arch/metag/kernel/perf_callchain.c
+++ b/arch/metag/kernel/perf_callchain.c
@@ -29,7 +29,7 @@ static bool is_valid_call(unsigned long calladdr)
static struct metag_frame __user *
user_backtrace(struct metag_frame __user *user_frame,
- struct perf_callchain_entry *entry)
+ struct perf_callchain_entry_ctx *entry)
{
struct metag_frame frame;
unsigned long calladdr;
@@ -56,7 +56,7 @@ user_backtrace(struct metag_frame __user *user_frame,
}
void
-perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
unsigned long sp = regs->ctx.AX[0].U0;
struct metag_frame __user *frame;
@@ -65,7 +65,7 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
--frame;
- while ((entry->nr < sysctl_perf_event_max_stack) && frame)
+ while ((entry->entry->nr < entry->max_stack) && frame)
frame = user_backtrace(frame, entry);
}
@@ -78,13 +78,13 @@ static int
callchain_trace(struct stackframe *fr,
void *data)
{
- struct perf_callchain_entry *entry = data;
+ struct perf_callchain_entry_ctx *entry = data;
perf_callchain_store(entry, fr->pc);
return 0;
}
void
-perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
struct stackframe fr;
diff --git a/arch/mips/kernel/perf_event.c b/arch/mips/kernel/perf_event.c
index 5021c546ad07..22395c7d7030 100644
--- a/arch/mips/kernel/perf_event.c
+++ b/arch/mips/kernel/perf_event.c
@@ -25,8 +25,8 @@
* the user stack callchains, we will add it here.
*/
-static void save_raw_perf_callchain(struct perf_callchain_entry *entry,
- unsigned long reg29)
+static void save_raw_perf_callchain(struct perf_callchain_entry_ctx *entry,
+ unsigned long reg29)
{
unsigned long *sp = (unsigned long *)reg29;
unsigned long addr;
@@ -35,14 +35,14 @@ static void save_raw_perf_callchain(struct perf_callchain_entry *entry,
addr = *sp++;
if (__kernel_text_address(addr)) {
perf_callchain_store(entry, addr);
- if (entry->nr >= sysctl_perf_event_max_stack)
+ if (entry->entry->nr >= entry->max_stack)
break;
}
}
}
-void perf_callchain_kernel(struct perf_callchain_entry *entry,
- struct pt_regs *regs)
+void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
+ struct pt_regs *regs)
{
unsigned long sp = regs->regs[29];
#ifdef CONFIG_KALLSYMS
@@ -59,7 +59,7 @@ void perf_callchain_kernel(struct perf_callchain_entry *entry,
}
do {
perf_callchain_store(entry, pc);
- if (entry->nr >= sysctl_perf_event_max_stack)
+ if (entry->entry->nr >= entry->max_stack)
break;
pc = unwind_stack(current, &sp, pc, &ra);
} while (pc);
diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index 22d9015c1acc..c9260c1dfdbc 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -47,7 +47,7 @@ static int valid_next_sp(unsigned long sp, unsigned long prev_sp)
}
void
-perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
unsigned long sp, next_sp;
unsigned long next_ip;
@@ -232,7 +232,7 @@ static int sane_signal_64_frame(unsigned long sp)
puc == (unsigned long) &sf->uc;
}
-static void perf_callchain_user_64(struct perf_callchain_entry *entry,
+static void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
unsigned long sp, next_sp;
@@ -247,7 +247,7 @@ static void perf_callchain_user_64(struct perf_callchain_entry *entry,
sp = regs->gpr[1];
perf_callchain_store(entry, next_ip);
- while (entry->nr < sysctl_perf_event_max_stack) {
+ while (entry->entry->nr < entry->max_stack) {
fp = (unsigned long __user *) sp;
if (!valid_user_sp(sp, 1) || read_user_stack_64(fp, &next_sp))
return;
@@ -319,7 +319,7 @@ static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret)
return rc;
}
-static inline void perf_callchain_user_64(struct perf_callchain_entry *entry,
+static inline void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
}
@@ -439,7 +439,7 @@ static unsigned int __user *signal_frame_32_regs(unsigned int sp,
return mctx->mc_gregs;
}
-static void perf_callchain_user_32(struct perf_callchain_entry *entry,
+static void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
unsigned int sp, next_sp;
@@ -453,7 +453,7 @@ static void perf_callchain_user_32(struct perf_callchain_entry *entry,
sp = regs->gpr[1];
perf_callchain_store(entry, next_ip);
- while (entry->nr < sysctl_perf_event_max_stack) {
+ while (entry->entry->nr < entry->max_stack) {
fp = (unsigned int __user *) (unsigned long) sp;
if (!valid_user_sp(sp, 0) || read_user_stack_32(fp, &next_sp))
return;
@@ -487,7 +487,7 @@ static void perf_callchain_user_32(struct perf_callchain_entry *entry,
}
void
-perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
if (current_is_64bit())
perf_callchain_user_64(entry, regs);
diff --git a/arch/s390/kernel/perf_event.c b/arch/s390/kernel/perf_event.c
index c3e4099b60a5..87035fa58bbe 100644
--- a/arch/s390/kernel/perf_event.c
+++ b/arch/s390/kernel/perf_event.c
@@ -224,13 +224,13 @@ arch_initcall(service_level_perf_register);
static int __perf_callchain_kernel(void *data, unsigned long address)
{
- struct perf_callchain_entry *entry = data;
+ struct perf_callchain_entry_ctx *entry = data;
perf_callchain_store(entry, address);
return 0;
}
-void perf_callchain_kernel(struct perf_callchain_entry *entry,
+void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
if (user_mode(regs))
diff --git a/arch/sh/kernel/perf_callchain.c b/arch/sh/kernel/perf_callchain.c
index cc80b614b5fa..fa2c0cd23eaa 100644
--- a/arch/sh/kernel/perf_callchain.c
+++ b/arch/sh/kernel/perf_callchain.c
@@ -21,7 +21,7 @@ static int callchain_stack(void *data, char *name)
static void callchain_address(void *data, unsigned long addr, int reliable)
{
- struct perf_callchain_entry *entry = data;
+ struct perf_callchain_entry_ctx *entry = data;
if (reliable)
perf_callchain_store(entry, addr);
@@ -33,7 +33,7 @@ static const struct stacktrace_ops callchain_ops = {
};
void
-perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
perf_callchain_store(entry, regs->pc);
diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index a4b8b5aed21c..bcc5376db74b 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1711,7 +1711,7 @@ static int __init init_hw_perf_events(void)
}
pure_initcall(init_hw_perf_events);
-void perf_callchain_kernel(struct perf_callchain_entry *entry,
+void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
unsigned long ksp, fp;
@@ -1756,7 +1756,7 @@ void perf_callchain_kernel(struct perf_callchain_entry *entry,
}
}
#endif
- } while (entry->nr < sysctl_perf_event_max_stack);
+ } while (entry->entry->nr < entry->max_stack);
}
static inline int
@@ -1769,7 +1769,7 @@ valid_user_frame(const void __user *fp, unsigned long size)
return (__range_not_ok(fp, size, TASK_SIZE) == 0);
}
-static void perf_callchain_user_64(struct perf_callchain_entry *entry,
+static void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
unsigned long ufp;
@@ -1790,10 +1790,10 @@ static void perf_callchain_user_64(struct perf_callchain_entry *entry,
pc = sf.callers_pc;
ufp = (unsigned long)sf.fp + STACK_BIAS;
perf_callchain_store(entry, pc);
- } while (entry->nr < sysctl_perf_event_max_stack);
+ } while (entry->entry->nr < entry->max_stack);
}
-static void perf_callchain_user_32(struct perf_callchain_entry *entry,
+static void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
unsigned long ufp;
@@ -1822,11 +1822,11 @@ static void perf_callchain_user_32(struct perf_callchain_entry *entry,
ufp = (unsigned long)sf.fp;
}
perf_callchain_store(entry, pc);
- } while (entry->nr < sysctl_perf_event_max_stack);
+ } while (entry->entry->nr < entry->max_stack);
}
void
-perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
u64 saved_fault_address = current_thread_info()->fault_address;
u8 saved_fault_code = get_thread_fault_code();
diff --git a/arch/tile/kernel/perf_event.c b/arch/tile/kernel/perf_event.c
index 8767060d70fb..6394c1ccb68e 100644
--- a/arch/tile/kernel/perf_event.c
+++ b/arch/tile/kernel/perf_event.c
@@ -941,7 +941,7 @@ arch_initcall(init_hw_perf_events);
/*
* Tile specific backtracing code for perf_events.
*/
-static inline void perf_callchain(struct perf_callchain_entry *entry,
+static inline void perf_callchain(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
struct KBacktraceIterator kbt;
@@ -992,13 +992,13 @@ static inline void perf_callchain(struct perf_callchain_entry *entry,
}
}
-void perf_callchain_user(struct perf_callchain_entry *entry,
+void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
perf_callchain(entry, regs);
}
-void perf_callchain_kernel(struct perf_callchain_entry *entry,
+void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
perf_callchain(entry, regs);
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 5e5e76a52f58..07f2b01cfb72 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2202,7 +2202,7 @@ static int backtrace_stack(void *data, char *name)
static int backtrace_address(void *data, unsigned long addr, int reliable)
{
- struct perf_callchain_entry *entry = data;
+ struct perf_callchain_entry_ctx *entry = data;
return perf_callchain_store(entry, addr);
}
@@ -2214,7 +2214,7 @@ static const struct stacktrace_ops backtrace_ops = {
};
void
-perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
/* TODO: We don't support guest os callchain now */
@@ -2268,7 +2268,7 @@ static unsigned long get_segment_base(unsigned int segment)
#include <asm/compat.h>
static inline int
-perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
+perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry_ctx *entry)
{
/* 32-bit process in 64-bit kernel. */
unsigned long ss_base, cs_base;
@@ -2283,7 +2283,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
fp = compat_ptr(ss_base + regs->bp);
pagefault_disable();
- while (entry->nr < sysctl_perf_event_max_stack) {
+ while (entry->entry->nr < entry->max_stack) {
unsigned long bytes;
frame.next_frame = 0;
frame.return_address = 0;
@@ -2309,14 +2309,14 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
}
#else
static inline int
-perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry *entry)
+perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry_ctx *entry)
{
return 0;
}
#endif
void
-perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
+perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
{
struct stack_frame frame;
const void __user *fp;
@@ -2343,7 +2343,7 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
return;
pagefault_disable();
- while (entry->nr < sysctl_perf_event_max_stack) {
+ while (entry->entry->nr < entry->max_stack) {
unsigned long bytes;
frame.next_frame = NULL;
frame.return_address = 0;
diff --git a/arch/xtensa/kernel/perf_event.c b/arch/xtensa/kernel/perf_event.c
index a6b00b3af429..ef90479e0397 100644
--- a/arch/xtensa/kernel/perf_event.c
+++ b/arch/xtensa/kernel/perf_event.c
@@ -323,23 +323,23 @@ static void xtensa_pmu_read(struct perf_event *event)
static int callchain_trace(struct stackframe *frame, void *data)
{
- struct perf_callchain_entry *entry = data;
+ struct perf_callchain_entry_ctx *entry = data;
perf_callchain_store(entry, frame->pc);
return 0;
}
-void perf_callchain_kernel(struct perf_callchain_entry *entry,
+void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
- xtensa_backtrace_kernel(regs, sysctl_perf_event_max_stack,
+ xtensa_backtrace_kernel(regs, entry->max_stack,
callchain_trace, NULL, entry);
}
-void perf_callchain_user(struct perf_callchain_entry *entry,
+void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
- xtensa_backtrace_user(regs, sysctl_perf_event_max_stack,
+ xtensa_backtrace_user(regs, entry->max_stack,
callchain_trace, entry);
}
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 9e1c3ada91c4..dbd18246b36e 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -61,6 +61,11 @@ struct perf_callchain_entry {
__u64 ip[0]; /* /proc/sys/kernel/perf_event_max_stack */
};
+struct perf_callchain_entry_ctx {
+ struct perf_callchain_entry *entry;
+ u32 max_stack;
+};
+
struct perf_raw_record {
u32 size;
void *data;
@@ -1063,19 +1068,20 @@ extern void perf_event_fork(struct task_struct *tsk);
/* Callchains */
DECLARE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
-extern void perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs);
-extern void perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs);
+extern void perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs);
+extern void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs);
extern struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
- bool crosstask, bool add_mark);
+ u32 max_stack, bool crosstask, bool add_mark);
extern int get_callchain_buffers(void);
extern void put_callchain_buffers(void);
extern int sysctl_perf_event_max_stack;
-static inline int perf_callchain_store(struct perf_callchain_entry *entry, u64 ip)
+static inline int perf_callchain_store(struct perf_callchain_entry_ctx *ctx, u64 ip)
{
- if (entry->nr < sysctl_perf_event_max_stack) {
+ struct perf_callchain_entry *entry = ctx->entry;
+ if (entry->nr < ctx->max_stack) {
entry->ip[entry->nr++] = ip;
return 0;
} else {
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index f5a19548be12..a82d7605db3f 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -136,7 +136,8 @@ static u64 bpf_get_stackid(u64 r1, u64 r2, u64 flags, u64 r4, u64 r5)
BPF_F_FAST_STACK_CMP | BPF_F_REUSE_STACKID)))
return -EINVAL;
- trace = get_perf_callchain(regs, init_nr, kernel, user, false, false);
+ trace = get_perf_callchain(regs, init_nr, kernel, user,
+ sysctl_perf_event_max_stack, false, false);
if (unlikely(!trace))
/* couldn't fetch the stack trace */
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 7fc89939ede9..af95ad92893a 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -32,12 +32,12 @@ static DEFINE_MUTEX(callchain_mutex);
static struct callchain_cpus_entries *callchain_cpus_entries;
-__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
+__weak void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
}
-__weak void perf_callchain_user(struct perf_callchain_entry *entry,
+__weak void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
struct pt_regs *regs)
{
}
@@ -176,14 +176,15 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
if (!kernel && !user)
return NULL;
- return get_perf_callchain(regs, 0, kernel, user, crosstask, true);
+ return get_perf_callchain(regs, 0, kernel, user, sysctl_perf_event_max_stack, crosstask, true);
}
struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
- bool crosstask, bool add_mark)
+ u32 max_stack, bool crosstask, bool add_mark)
{
struct perf_callchain_entry *entry;
+ struct perf_callchain_entry_ctx ctx;
int rctx;
entry = get_callchain_entry(&rctx);
@@ -193,12 +194,15 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
if (!entry)
goto exit_put;
+ ctx.entry = entry;
+ ctx.max_stack = max_stack;
+
entry->nr = init_nr;
if (kernel && !user_mode(regs)) {
if (add_mark)
- perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
- perf_callchain_kernel(entry, regs);
+ perf_callchain_store(&ctx, PERF_CONTEXT_KERNEL);
+ perf_callchain_kernel(&ctx, regs);
}
if (user) {
@@ -214,8 +218,8 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
goto exit_put;
if (add_mark)
- perf_callchain_store(entry, PERF_CONTEXT_USER);
- perf_callchain_user(entry, regs);
+ perf_callchain_store(&ctx, PERF_CONTEXT_USER);
+ perf_callchain_user(&ctx, regs);
}
}
--
2.5.5
* [PATCH 09/12] perf core: Add a 'nr' field to perf_event_callchain_context
2016-05-17 2:45 [GIT PULL 00/12] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (7 preceding siblings ...)
2016-05-17 2:45 ` [PATCH 08/12] perf core: Pass max stack as a perf_callchain_entry context Arnaldo Carvalho de Melo
@ 2016-05-17 2:45 ` Arnaldo Carvalho de Melo
2016-05-17 2:45 ` [PATCH 10/12] perf core: Add perf_callchain_store_context() helper Arnaldo Carvalho de Melo
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Peter Zijlstra
From: Arnaldo Carvalho de Melo <acme@redhat.com>
We will use it to count how many addresses are in the entry->ip[] array,
excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
return the number of entries specified by the user via the relevant
sysctl, kernel.perf_event_max_stack, or via the per-event
perf_event_attr.sample_max_stack knob.
This way we keep the meaning of perf_sample->ip_callchain->nr, that is,
the number of entries, be they real addresses or PERF_CONTEXT_ entries,
while honouring the max_stack knobs, i.e. the end result will be
max_stack addresses if a given stack trace has at least that many.
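A minimal userspace sketch of the bookkeeping described above, not the
kernel code itself. It models a context that carries its own 'nr' next
to the entry's 'nr', with the store helper checking the limit against
the context counter, as the diff to perf_callchain_store() in this
patch does. The names callchain_entry, callchain_entry_ctx, fill_chain()
and MAX_ENTRIES are assumptions made for illustration, and the extra
bounds check against MAX_ENTRIES exists only because the sketch uses a
fixed-size array.

```c
#include <stdint.h>

#define MAX_ENTRIES 16	/* illustrative buffer size, not a kernel value */

/* Simplified stand-in for struct perf_callchain_entry. */
struct callchain_entry {
	uint64_t nr;			/* entries written to ip[] */
	uint64_t ip[MAX_ENTRIES];
};

/* Same three fields as perf_callchain_entry_ctx after this patch. */
struct callchain_entry_ctx {
	struct callchain_entry *entry;
	uint32_t max_stack;
	uint32_t nr;			/* counter compared against max_stack */
};

/* Store one address; returns -1 once the limit has been reached. */
static int callchain_store(struct callchain_entry_ctx *ctx, uint64_t ip)
{
	struct callchain_entry *entry = ctx->entry;

	if (ctx->nr < ctx->max_stack && entry->nr < MAX_ENTRIES) {
		entry->ip[entry->nr++] = ip;
		++ctx->nr;
		return 0;
	}
	return -1;	/* no more room, stop walking the stack */
}

/* Hypothetical unwinder: tries to store n_addrs made-up addresses. */
static uint32_t fill_chain(struct callchain_entry *entry, uint32_t max_stack,
			   uint32_t n_addrs)
{
	struct callchain_entry_ctx ctx = { .entry = entry, .max_stack = max_stack };
	uint32_t i;

	ctx.nr = entry->nr = 0;
	for (i = 0; i < n_addrs; i++)
		if (callchain_store(&ctx, 0x1000 + i))
			break;
	return ctx.nr;
}
```

In this sketch both counters still advance together, which matches this
patch: the limit check merely moves from entry->nr to ctx->nr so that a
later change can stop counting PERF_CONTEXT_ markers against it.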
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
arch/arm/kernel/perf_callchain.c | 2 +-
arch/arm64/kernel/perf_callchain.c | 4 ++--
arch/metag/kernel/perf_callchain.c | 2 +-
arch/mips/kernel/perf_event.c | 4 ++--
arch/powerpc/perf/callchain.c | 4 ++--
arch/sparc/kernel/perf_event.c | 6 +++---
arch/x86/events/core.c | 4 ++--
include/linux/perf_event.h | 6 ++++--
kernel/events/callchain.c | 3 +--
9 files changed, 18 insertions(+), 17 deletions(-)
diff --git a/arch/arm/kernel/perf_callchain.c b/arch/arm/kernel/perf_callchain.c
index bc552e813e7b..22bf1f64d99a 100644
--- a/arch/arm/kernel/perf_callchain.c
+++ b/arch/arm/kernel/perf_callchain.c
@@ -75,7 +75,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
tail = (struct frame_tail __user *)regs->ARM_fp - 1;
- while ((entry->entry->nr < entry->max_stack) &&
+ while ((entry->nr < entry->max_stack) &&
tail && !((unsigned long)tail & 0x3))
tail = user_backtrace(tail, entry);
}
diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index 0d60150057cf..713ca824f266 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -122,7 +122,7 @@ void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
tail = (struct frame_tail __user *)regs->regs[29];
- while (entry->entry->nr < entry->max_stack &&
+ while (entry->nr < entry->max_stack &&
tail && !((unsigned long)tail & 0xf))
tail = user_backtrace(tail, entry);
} else {
@@ -132,7 +132,7 @@ void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
tail = (struct compat_frame_tail __user *)regs->compat_fp - 1;
- while ((entry->entry->nr < entry->max_stack) &&
+ while ((entry->nr < entry->max_stack) &&
tail && !((unsigned long)tail & 0x3))
tail = compat_user_backtrace(tail, entry);
#endif
diff --git a/arch/metag/kernel/perf_callchain.c b/arch/metag/kernel/perf_callchain.c
index b3261a98b15b..3e8e048040df 100644
--- a/arch/metag/kernel/perf_callchain.c
+++ b/arch/metag/kernel/perf_callchain.c
@@ -65,7 +65,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
--frame;
- while ((entry->entry->nr < entry->max_stack) && frame)
+ while ((entry->nr < entry->max_stack) && frame)
frame = user_backtrace(frame, entry);
}
diff --git a/arch/mips/kernel/perf_event.c b/arch/mips/kernel/perf_event.c
index 22395c7d7030..d64056e0bb56 100644
--- a/arch/mips/kernel/perf_event.c
+++ b/arch/mips/kernel/perf_event.c
@@ -35,7 +35,7 @@ static void save_raw_perf_callchain(struct perf_callchain_entry_ctx *entry,
addr = *sp++;
if (__kernel_text_address(addr)) {
perf_callchain_store(entry, addr);
- if (entry->entry->nr >= entry->max_stack)
+ if (entry->nr >= entry->max_stack)
break;
}
}
@@ -59,7 +59,7 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
}
do {
perf_callchain_store(entry, pc);
- if (entry->entry->nr >= entry->max_stack)
+ if (entry->nr >= entry->max_stack)
break;
pc = unwind_stack(current, &sp, pc, &ra);
} while (pc);
diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index c9260c1dfdbc..f68f213dc36c 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -247,7 +247,7 @@ static void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry,
sp = regs->gpr[1];
perf_callchain_store(entry, next_ip);
- while (entry->entry->nr < entry->max_stack) {
+ while (entry->nr < entry->max_stack) {
fp = (unsigned long __user *) sp;
if (!valid_user_sp(sp, 1) || read_user_stack_64(fp, &next_sp))
return;
@@ -453,7 +453,7 @@ static void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry,
sp = regs->gpr[1];
perf_callchain_store(entry, next_ip);
- while (entry->entry->nr < entry->max_stack) {
+ while (entry->nr < entry->max_stack) {
fp = (unsigned int __user *) (unsigned long) sp;
if (!valid_user_sp(sp, 0) || read_user_stack_32(fp, &next_sp))
return;
diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index bcc5376db74b..710f3278d448 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1756,7 +1756,7 @@ void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
}
}
#endif
- } while (entry->entry->nr < entry->max_stack);
+ } while (entry->nr < entry->max_stack);
}
static inline int
@@ -1790,7 +1790,7 @@ static void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry,
pc = sf.callers_pc;
ufp = (unsigned long)sf.fp + STACK_BIAS;
perf_callchain_store(entry, pc);
- } while (entry->entry->nr < entry->max_stack);
+ } while (entry->nr < entry->max_stack);
}
static void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry,
@@ -1822,7 +1822,7 @@ static void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry,
ufp = (unsigned long)sf.fp;
}
perf_callchain_store(entry, pc);
- } while (entry->entry->nr < entry->max_stack);
+ } while (entry->nr < entry->max_stack);
}
void
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 07f2b01cfb72..5de96a18cd9c 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2283,7 +2283,7 @@ perf_callchain_user32(struct pt_regs *regs, struct perf_callchain_entry_ctx *ent
fp = compat_ptr(ss_base + regs->bp);
pagefault_disable();
- while (entry->entry->nr < entry->max_stack) {
+ while (entry->nr < entry->max_stack) {
unsigned long bytes;
frame.next_frame = 0;
frame.return_address = 0;
@@ -2343,7 +2343,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
return;
pagefault_disable();
- while (entry->entry->nr < entry->max_stack) {
+ while (entry->nr < entry->max_stack) {
unsigned long bytes;
frame.next_frame = NULL;
frame.return_address = 0;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index dbd18246b36e..3803bb1a862b 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -64,6 +64,7 @@ struct perf_callchain_entry {
struct perf_callchain_entry_ctx {
struct perf_callchain_entry *entry;
u32 max_stack;
+ u32 nr;
};
struct perf_raw_record {
@@ -1080,9 +1081,10 @@ extern int sysctl_perf_event_max_stack;
static inline int perf_callchain_store(struct perf_callchain_entry_ctx *ctx, u64 ip)
{
- struct perf_callchain_entry *entry = ctx->entry;
- if (entry->nr < ctx->max_stack) {
+ if (ctx->nr < ctx->max_stack) {
+ struct perf_callchain_entry *entry = ctx->entry;
entry->ip[entry->nr++] = ip;
+ ++ctx->nr;
return 0;
} else {
return -1; /* no more room, stop walking the stack */
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index af95ad92893a..8774ff86debb 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -196,8 +196,7 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
ctx.entry = entry;
ctx.max_stack = max_stack;
-
- entry->nr = init_nr;
+ ctx.nr = entry->nr = init_nr;
if (kernel && !user_mode(regs)) {
if (add_mark)
--
2.5.5
* [PATCH 10/12] perf core: Add perf_callchain_store_context() helper
2016-05-17 2:45 [GIT PULL 00/12] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (8 preceding siblings ...)
2016-05-17 2:45 ` [PATCH 09/12] perf core: Add a 'nr' field to perf_event_callchain_context Arnaldo Carvalho de Melo
@ 2016-05-17 2:45 ` Arnaldo Carvalho de Melo
2016-05-17 2:45 ` [PATCH 11/12] perf core: Separate accounting of contexts and real addresses in a stack trace Arnaldo Carvalho de Melo
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, David Ahern,
Frederic Weisbecker, Jiri Olsa, Namhyung Kim, Peter Zijlstra
From: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to have different helpers to account for how many contexts we
have in the sample and for how many real addresses, so introduce the
context helper now as a prep patch, to ease review.
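A rough userspace sketch of where two separate helpers could lead,
under stated assumptions. In this patch perf_callchain_store_context()
is only an alias for perf_callchain_store(); the 'contexts' counter,
the 'max_contexts' limit, the MARKER_KERNEL value and the type names
below are hypothetical and are not taken from this patch. The sketch
shows how a context marker can land in ip[] without consuming one of
the max_stack slots reserved for real addresses.

```c
#include <stdint.h>

#define MAX_ENTRIES 16			/* illustrative buffer size */
#define MARKER_KERNEL ((uint64_t)-128)	/* placeholder context marker */

/* Simplified stand-in for struct perf_callchain_entry. */
struct callchain_entry {
	uint64_t nr;			/* markers + addresses in ip[] */
	uint64_t ip[MAX_ENTRIES];
};

struct callchain_entry_ctx {
	struct callchain_entry *entry;
	uint32_t max_stack;		/* limit on real addresses */
	uint32_t max_contexts;		/* hypothetical limit on markers */
	uint32_t nr;			/* real addresses stored so far */
	uint32_t contexts;		/* hypothetical: markers stored so far */
};

/* Store a context marker; accounted separately from real addresses. */
static int callchain_store_context(struct callchain_entry_ctx *ctx, uint64_t marker)
{
	struct callchain_entry *entry = ctx->entry;

	if (ctx->contexts < ctx->max_contexts && entry->nr < MAX_ENTRIES) {
		entry->ip[entry->nr++] = marker;
		++ctx->contexts;
		return 0;
	}
	return -1;
}

/* Store a real address; only these count against max_stack. */
static int callchain_store(struct callchain_entry_ctx *ctx, uint64_t ip)
{
	struct callchain_entry *entry = ctx->entry;

	if (ctx->nr < ctx->max_stack && entry->nr < MAX_ENTRIES) {
		entry->ip[entry->nr++] = ip;
		++ctx->nr;
		return 0;
	}
	return -1;	/* no more room, stop walking the stack */
}
```

With max_stack set to 2, one marker followed by three addresses would
leave entry->nr at 3 (one marker plus two addresses), ctx->nr at 2 and
ctx->contexts at 1, and the third address would be rejected.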
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
arch/powerpc/perf/callchain.c | 6 +++---
include/linux/perf_event.h | 2 ++
kernel/events/callchain.c | 4 ++--
3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/perf/callchain.c b/arch/powerpc/perf/callchain.c
index f68f213dc36c..f62597dbd757 100644
--- a/arch/powerpc/perf/callchain.c
+++ b/arch/powerpc/perf/callchain.c
@@ -76,7 +76,7 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
next_ip = regs->nip;
lr = regs->link;
level = 0;
- perf_callchain_store(entry, PERF_CONTEXT_KERNEL);
+ perf_callchain_store_context(entry, PERF_CONTEXT_KERNEL);
} else {
if (level == 0)
@@ -274,7 +274,7 @@ static void perf_callchain_user_64(struct perf_callchain_entry_ctx *entry,
read_user_stack_64(&uregs[PT_R1], &sp))
return;
level = 0;
- perf_callchain_store(entry, PERF_CONTEXT_USER);
+ perf_callchain_store_context(entry, PERF_CONTEXT_USER);
perf_callchain_store(entry, next_ip);
continue;
}
@@ -473,7 +473,7 @@ static void perf_callchain_user_32(struct perf_callchain_entry_ctx *entry,
read_user_stack_32(&uregs[PT_R1], &sp))
return;
level = 0;
- perf_callchain_store(entry, PERF_CONTEXT_USER);
+ perf_callchain_store_context(entry, PERF_CONTEXT_USER);
perf_callchain_store(entry, next_ip);
continue;
}
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3803bb1a862b..2024b14cc2b1 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1079,6 +1079,8 @@ extern void put_callchain_buffers(void);
extern int sysctl_perf_event_max_stack;
+#define perf_callchain_store_context(ctx, context) perf_callchain_store(ctx, context)
+
static inline int perf_callchain_store(struct perf_callchain_entry_ctx *ctx, u64 ip)
{
if (ctx->nr < ctx->max_stack) {
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 8774ff86debb..ca645736a983 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -200,7 +200,7 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
if (kernel && !user_mode(regs)) {
if (add_mark)
- perf_callchain_store(&ctx, PERF_CONTEXT_KERNEL);
+ perf_callchain_store_context(&ctx, PERF_CONTEXT_KERNEL);
perf_callchain_kernel(&ctx, regs);
}
@@ -217,7 +217,7 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
goto exit_put;
if (add_mark)
- perf_callchain_store(&ctx, PERF_CONTEXT_USER);
+ perf_callchain_store_context(&ctx, PERF_CONTEXT_USER);
perf_callchain_user(&ctx, regs);
}
}
--
2.5.5
* [PATCH 11/12] perf core: Separate accounting of contexts and real addresses in a stack trace
2016-05-17 2:45 [GIT PULL 00/12] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (9 preceding siblings ...)
2016-05-17 2:45 ` [PATCH 10/12] perf core: Add perf_callchain_store_context() helper Arnaldo Carvalho de Melo
@ 2016-05-17 2:45 ` Arnaldo Carvalho de Melo
2016-05-17 2:45 ` [PATCH 12/12] perf tools: " Arnaldo Carvalho de Melo
2016-05-20 6:23 ` [GIT PULL 00/12] perf/core improvements and fixes Ingo Molnar
12 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Alexei Starovoitov, Brendan Gregg,
David Ahern, Frederic Weisbecker, He Kuang, Jiri Olsa,
Masami Hiramatsu, Milian Wolff, Namhyung Kim, Peter Zijlstra,
Stephane Eranian, Thomas Gleixner, Vince Weaver, Wang Nan,
Zefan Li
From: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_sample->ip_callchain->nr value counts all the entries in the
ip_callchain->ip[] array, both real addresses and
PERF_CONTEXT_{KERNEL,USER,etc} markers, while what the user expects is that
the limit set in the kernel.perf_event_max_stack sysctl, or in the upcoming
per-event perf_event_attr.sample_max_stack knob, be honoured in terms of IP
addresses in the stack trace.
So allocate a bunch of extra entries for contexts, and do the accounting
via perf_callchain_entry_ctx struct members.
A new sysctl, kernel.perf_event_max_contexts_per_stack, is also
introduced for investigating possible bugs in the callchain
implementation of some architecture.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
Documentation/sysctl/kernel.txt | 14 ++++++++++++++
include/linux/perf_event.h | 18 ++++++++++++++++--
include/uapi/linux/perf_event.h | 1 +
kernel/events/callchain.c | 10 +++++++++-
kernel/sysctl.c | 9 +++++++++
5 files changed, 49 insertions(+), 3 deletions(-)
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index daabdd7ee543..a3683ce2a2f3 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -61,6 +61,7 @@ show up in /proc/sys/kernel:
- perf_cpu_time_max_percent
- perf_event_paranoid
- perf_event_max_stack
+- perf_event_max_contexts_per_stack
- pid_max
- powersave-nap [ PPC only ]
- printk
@@ -668,6 +669,19 @@ The default value is 127.
==============================================================
+perf_event_max_contexts_per_stack:
+
+Controls maximum number of stack frame context entries for
+(attr.sample_type & PERF_SAMPLE_CALLCHAIN) configured events, for
+instance, when using 'perf record -g' or 'perf trace --call-graph fp'.
+
+This can only be done when no events are in use that have callchains
+enabled, otherwise writing to this file will return -EBUSY.
+
+The default value is 8.
+
+==============================================================
+
pid_max:
PID allocation wrap value. When the kernel's next PID value
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2024b14cc2b1..6b87be908790 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -65,6 +65,8 @@ struct perf_callchain_entry_ctx {
struct perf_callchain_entry *entry;
u32 max_stack;
u32 nr;
+ short contexts;
+ bool contexts_maxed;
};
struct perf_raw_record {
@@ -1078,12 +1080,24 @@ extern int get_callchain_buffers(void);
extern void put_callchain_buffers(void);
extern int sysctl_perf_event_max_stack;
+extern int sysctl_perf_event_max_contexts_per_stack;
-#define perf_callchain_store_context(ctx, context) perf_callchain_store(ctx, context)
+static inline int perf_callchain_store_context(struct perf_callchain_entry_ctx *ctx, u64 ip)
+{
+ if (ctx->contexts < sysctl_perf_event_max_contexts_per_stack) {
+ struct perf_callchain_entry *entry = ctx->entry;
+ entry->ip[entry->nr++] = ip;
+ ++ctx->contexts;
+ return 0;
+ } else {
+ ctx->contexts_maxed = true;
+ return -1; /* no more room, stop walking the stack */
+ }
+}
static inline int perf_callchain_store(struct perf_callchain_entry_ctx *ctx, u64 ip)
{
- if (ctx->nr < ctx->max_stack) {
+ if (ctx->nr < ctx->max_stack && !ctx->contexts_maxed) {
struct perf_callchain_entry *entry = ctx->entry;
entry->ip[entry->nr++] = ip;
++ctx->nr;
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 43fc8d213472..36ce552cf6a9 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -862,6 +862,7 @@ enum perf_event_type {
};
#define PERF_MAX_STACK_DEPTH 127
+#define PERF_MAX_CONTEXTS_PER_STACK 8
enum perf_callchain_context {
PERF_CONTEXT_HV = (__u64)-32,
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index ca645736a983..179ef4640964 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -19,11 +19,13 @@ struct callchain_cpus_entries {
};
int sysctl_perf_event_max_stack __read_mostly = PERF_MAX_STACK_DEPTH;
+int sysctl_perf_event_max_contexts_per_stack __read_mostly = PERF_MAX_CONTEXTS_PER_STACK;
static inline size_t perf_callchain_entry__sizeof(void)
{
return (sizeof(struct perf_callchain_entry) +
- sizeof(__u64) * sysctl_perf_event_max_stack);
+ sizeof(__u64) * (sysctl_perf_event_max_stack +
+ sysctl_perf_event_max_contexts_per_stack));
}
static DEFINE_PER_CPU(int, callchain_recursion[PERF_NR_CONTEXTS]);
@@ -197,6 +199,8 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
ctx.entry = entry;
ctx.max_stack = max_stack;
ctx.nr = entry->nr = init_nr;
+ ctx.contexts = 0;
+ ctx.contexts_maxed = false;
if (kernel && !user_mode(regs)) {
if (add_mark)
@@ -228,6 +232,10 @@ exit_put:
return entry;
}
+/*
+ * Used for sysctl_perf_event_max_stack and
+ * sysctl_perf_event_max_contexts_per_stack.
+ */
int perf_event_max_stack_handler(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 0ec6907a16b3..bec4c11c47d6 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1156,6 +1156,15 @@ static struct ctl_table kern_table[] = {
.extra1 = &zero,
.extra2 = &six_hundred_forty_kb,
},
+ {
+ .procname = "perf_event_max_contexts_per_stack",
+ .data = &sysctl_perf_event_max_contexts_per_stack,
+ .maxlen = sizeof(sysctl_perf_event_max_contexts_per_stack),
+ .mode = 0644,
+ .proc_handler = perf_event_max_stack_handler,
+ .extra1 = &zero,
+ .extra2 = &one_thousand,
+ },
#endif
#ifdef CONFIG_KMEMCHECK
{
--
2.5.5
* [PATCH 12/12] perf tools: Separate accounting of contexts and real addresses in a stack trace
2016-05-17 2:45 [GIT PULL 00/12] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (10 preceding siblings ...)
2016-05-17 2:45 ` [PATCH 11/12] perf core: Separate accounting of contexts and real addresses in a stack trace Arnaldo Carvalho de Melo
@ 2016-05-17 2:45 ` Arnaldo Carvalho de Melo
2016-05-20 6:23 ` [GIT PULL 00/12] perf/core improvements and fixes Ingo Molnar
12 siblings, 0 replies; 14+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-05-17 2:45 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Alexei Starovoitov, Brendan Gregg,
David Ahern, Frederic Weisbecker, He Kuang, Jiri Olsa,
Masami Hiramatsu, Milian Wolff, Namhyung Kim, Peter Zijlstra,
Stephane Eranian, Thomas Gleixner, Vince Weaver, Wang Nan,
Zefan Li
From: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_sample->ip_callchain->nr value counts all the entries in the
ip_callchain->ip[] array, both real addresses and
PERF_CONTEXT_{KERNEL,USER,etc} markers, while what the user expects is that
the limit set in the kernel.perf_event_max_stack sysctl, or in the upcoming
per-event perf_event_attr.sample_max_stack knob, be honoured in terms of IP
addresses in the stack trace.
So match the kernel support and validate chain->nr taking into account
both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/perf.c | 3 +++
tools/perf/util/machine.c | 26 +++++++++++++++++---------
tools/perf/util/util.c | 3 ++-
tools/perf/util/util.h | 3 ++-
4 files changed, 24 insertions(+), 11 deletions(-)
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 797000842d40..15982cee5ef3 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -549,6 +549,9 @@ int main(int argc, const char **argv)
if (sysctl__read_int("kernel/perf_event_max_stack", &value) == 0)
sysctl_perf_event_max_stack = value;
+ if (sysctl__read_int("kernel/perf_event_max_contexts_per_stack", &value) == 0)
+ sysctl_perf_event_max_contexts_per_stack = value;
+
cmd = extract_argv0_path(argv[0]);
if (!cmd)
cmd = "perf-help";
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 18dd96bdde05..7ba9fadb68af 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1811,9 +1811,9 @@ static int thread__resolve_callchain_sample(struct thread *thread,
{
struct branch_stack *branch = sample->branch_stack;
struct ip_callchain *chain = sample->callchain;
- int chain_nr = min(max_stack, (int)chain->nr);
+ int chain_nr = chain->nr;
u8 cpumode = PERF_RECORD_MISC_USER;
- int i, j, err;
+ int i, j, err, nr_entries, nr_contexts;
int skip_idx = -1;
int first_call = 0;
@@ -1828,7 +1828,7 @@ static int thread__resolve_callchain_sample(struct thread *thread,
* Based on DWARF debug information, some architectures skip
* a callchain entry saved by the kernel.
*/
- if (chain->nr < sysctl_perf_event_max_stack)
+ if (chain_nr < sysctl_perf_event_max_stack)
skip_idx = arch_skip_callchain_idx(thread, chain);
/*
@@ -1889,12 +1889,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
}
check_calls:
- if (chain->nr > sysctl_perf_event_max_stack && (int)chain->nr > max_stack) {
- pr_warning("corrupted callchain. skipping...\n");
- return 0;
- }
-
- for (i = first_call; i < chain_nr; i++) {
+ for (i = first_call, nr_entries = 0, nr_contexts = 0;
+ i < chain_nr && nr_entries < max_stack; i++) {
u64 ip;
if (callchain_param.order == ORDER_CALLEE)
@@ -1908,6 +1904,14 @@ check_calls:
#endif
ip = chain->ips[j];
+ if (ip >= PERF_CONTEXT_MAX) {
+ if (++nr_contexts > sysctl_perf_event_max_contexts_per_stack)
+ goto out_corrupted_callchain;
+ } else {
+ if (++nr_entries > sysctl_perf_event_max_stack)
+ goto out_corrupted_callchain;
+ }
+
err = add_callchain_ip(thread, cursor, parent, root_al, &cpumode, ip);
if (err)
@@ -1915,6 +1919,10 @@ check_calls:
}
return 0;
+
+out_corrupted_callchain:
+ pr_warning("corrupted callchain. skipping...\n");
+ return 0;
}
static int unwind_entry(struct unwind_entry *entry, void *arg)
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index eab077ad6ca9..23504ad5d6dd 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -33,7 +33,8 @@ struct callchain_param callchain_param = {
unsigned int page_size;
int cacheline_size;
-unsigned int sysctl_perf_event_max_stack = PERF_MAX_STACK_DEPTH;
+int sysctl_perf_event_max_stack = PERF_MAX_STACK_DEPTH;
+int sysctl_perf_event_max_contexts_per_stack = PERF_MAX_CONTEXTS_PER_STACK;
bool test_attr__enabled;
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 7651633a8dc7..1e8c3167b9fb 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -261,7 +261,8 @@ void sighandler_dump_stack(int sig);
extern unsigned int page_size;
extern int cacheline_size;
-extern unsigned int sysctl_perf_event_max_stack;
+extern int sysctl_perf_event_max_stack;
+extern int sysctl_perf_event_max_contexts_per_stack;
struct parse_tag {
char tag;
--
2.5.5
* Re: [GIT PULL 00/12] perf/core improvements and fixes
2016-05-17 2:45 [GIT PULL 00/12] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (11 preceding siblings ...)
2016-05-17 2:45 ` [PATCH 12/12] perf tools: " Arnaldo Carvalho de Melo
@ 2016-05-20 6:23 ` Ingo Molnar
12 siblings, 0 replies; 14+ messages in thread
From: Ingo Molnar @ 2016-05-20 6:23 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Alexei Starovoitov,
Ananth N Mavinakayanahalli, Andi Kleen, Brendan Gregg,
David Ahern, Ekaterina Tumanova, Frederic Weisbecker, He Kuang,
Hemant Kumar, Jiri Olsa, Josh Poimboeuf, Kan Liang,
Linus Torvalds, Masami Hiramatsu, Milian Wolff, Namhyung Kim,
Pekka Enberg, Peter Zijlstra, Stephane Eranian,
Sukadev Bhattiprolu, Thomas Gleixner, Vince Weaver, Wang Nan,
Zefan Li
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> From: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit 3f56e687a138481894a1088d5aa7d41951bdb020:
>
> perf/core: Disable the event on a truncated AUX record (2016-05-12 10:14:55 +0200)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160516
>
> for you to fetch changes up to a29d5c9b8167dbc21a7ca8c0302e3799f9063b4e:
>
> perf tools: Separate accounting of contexts and real addresses in a stack trace (2016-05-16 23:11:54 -0300)
>
> ----------------------------------------------------------------
> perf/core improvements and fixes:
>
> User visible:
>
> - Honour the kernel.perf_event_max_stack knob more precisely by not counting
> PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
> the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)
>
> - Fix indentation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)
>
> - Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)
>
> - Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)
>
> - Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)
>
> - Store vdso buildid unconditionally, as it appears in callchains and
> we're not checking those when creating the build-id table, so we
> end up not being able to resolve VDSO symbols when doing analysis
> on a different machine than the one where recording was done, possibly
> of a different arch even (arm -> x86_64) (He Kuang)
>
> Infrastructure:
>
> - Generalize max_stack sysctl handler, will be used for configuring
> multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)
>
> Cleanups:
>
> - Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
> open coded strings (Masami Hiramatsu)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Andi Kleen (1):
> perf stat: Avoid fractional digits for integer scales
>
> Arnaldo Carvalho de Melo (6):
> perf core: Generalize max_stack sysctl handler
> perf core: Pass max stack as a perf_callchain_entry context
> perf core: Add a 'nr' field to perf_event_callchain_context
> perf core: Add perf_callchain_store_context() helper
> perf core: Separate accounting of contexts and real addresses in a stack trace
> perf tools: Separate accounting of contexts and real addresses in a stack trace
>
> He Kuang (1):
> perf symbols: Store vdso buildid unconditionally
>
> Masami Hiramatsu (1):
> perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE
>
> Namhyung Kim (3):
> perf stat: Fix indentation of stalled backend cycle
> perf stat: Update runtime using cpu-clock event
> perf stat: Use cpu-clock event for cpu targets
>
> Documentation/sysctl/kernel.txt | 14 ++++++++++++++
> arch/arc/kernel/perf_event.c | 6 +++---
> arch/arm/kernel/perf_callchain.c | 10 +++++-----
> arch/arm64/kernel/perf_callchain.c | 14 +++++++-------
> arch/metag/kernel/perf_callchain.c | 10 +++++-----
> arch/mips/kernel/perf_event.c | 12 ++++++------
> arch/powerpc/perf/callchain.c | 20 ++++++++++----------
> arch/s390/kernel/perf_event.c | 4 ++--
> arch/sh/kernel/perf_callchain.c | 4 ++--
> arch/sparc/kernel/perf_event.c | 14 +++++++-------
> arch/tile/kernel/perf_event.c | 6 +++---
> arch/x86/events/core.c | 14 +++++++-------
> arch/xtensa/kernel/perf_event.c | 10 +++++-----
> include/linux/perf_event.h | 34 +++++++++++++++++++++++++++++-----
> include/uapi/linux/perf_event.h | 1 +
> kernel/bpf/stackmap.c | 3 ++-
> kernel/events/callchain.c | 36 ++++++++++++++++++++++++------------
> kernel/sysctl.c | 11 ++++++++++-
> tools/perf/builtin-buildid-cache.c | 8 ++++----
> tools/perf/builtin-stat.c | 22 +++++++++++++---------
> tools/perf/perf.c | 3 +++
> tools/perf/util/annotate.c | 2 +-
> tools/perf/util/build-id.c | 2 +-
> tools/perf/util/dso.c | 3 ++-
> tools/perf/util/machine.c | 28 ++++++++++++++++++----------
> tools/perf/util/stat-shadow.c | 8 +++++---
> tools/perf/util/symbol.c | 10 +++++-----
> tools/perf/util/symbol.h | 3 +++
> tools/perf/util/util.c | 3 ++-
> tools/perf/util/util.h | 3 ++-
> 30 files changed, 201 insertions(+), 117 deletions(-)
Pulled, thanks a lot Arnaldo!
Ingo