* Re: [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 19:34 ` [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data Don Zickus
@ 2014-03-24 19:54 ` Andi Kleen
2014-03-24 20:17 ` Don Zickus
2014-03-24 20:54 ` [PATCH 01/15 V3] perf: Fix stddev calculation Don Zickus
` (2 subsequent siblings)
3 siblings, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2014-03-24 19:54 UTC (permalink / raw)
To: Don Zickus; +Cc: acme, peterz, LKML, jolsa, jmario, fowles, eranian, andi.kleen
Don Zickus <dzickus@redhat.com> writes:
> In order for the c2c tool to work correctly, it needs to properly
> sort all the records on uniquely identifiable data addresses. These
> unique addresses are converted from virtual addresses provided by the
> hardware into a kernel address using an mmap2 record as the decoder.
No documentation for the new option?
Probably the new mode should be also supported by --sort
-Andi
--
ak@linux.intel.com -- Speaking for myself only
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 19:54 ` Andi Kleen
@ 2014-03-24 20:17 ` Don Zickus
2014-03-24 20:20 ` Andi Kleen
0 siblings, 1 reply; 25+ messages in thread
From: Don Zickus @ 2014-03-24 20:17 UTC (permalink / raw)
To: Andi Kleen; +Cc: acme, peterz, LKML, jolsa, jmario, fowles, eranian, andi.kleen
On Mon, Mar 24, 2014 at 12:54:31PM -0700, Andi Kleen wrote:
> Don Zickus <dzickus@redhat.com> writes:
>
> > In order for the c2c tool to work correctly, it needs to properly
> > sort all the records on uniquely identifiable data addresses. These
> > unique addresses are converted from virtual addresses provided by the
> > hardware into a kernel address using an mmap2 record as the decoder.
>
> No documentation for the new option?
>
> Probably the new mode should be also supported by --sort
I hid the new option further in the changelog, so it isn't obvious. Sorry
about that.
Sample output: (perf report --stdio --physid-mode)
So the option was '--physid-mode' and if you don't pass in a '--sort' then
it takes the default sort of
'daddr,iaddr,pid,tid,major,minor,inode,inode_gen'
Otherwise you could pass in a combination of the other fields.
This output is not the best use of the mmap2 data, as it just gives you the
hottest data addresses. Our c2c tool really takes the data addresses,
combines them into cachelines, and then processes each cacheline for
interesting bottlenecks (HITMs in our case).
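As a rough illustration of the cacheline grouping described above, a data
address can be reduced to its cacheline base by masking off the low bits.
This is only a sketch: the helper names cacheline_base and same_cacheline,
and the caller-supplied line size, are assumptions made for the example and
are not taken from the c2c tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: round a data address down to its cacheline base.
 * line_size is an assumption supplied by the caller (64 is common on x86).
 */
static inline uint64_t cacheline_base(uint64_t daddr, uint64_t line_size)
{
	/* The mask trick below only works for power-of-two line sizes. */
	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	return daddr & ~(line_size - 1);
}

/* Hypothetical helper: do two data addresses share a cacheline? */
static inline bool same_cacheline(uint64_t a, uint64_t b, uint64_t line_size)
{
	return cacheline_base(a, line_size) == cacheline_base(b, line_size);
}
```

For example, with a 64-byte line, address 0xffffc900139c40b0 falls in the
line starting at 0xffffc900139c4080.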
I don't know a good way to present the data and yet still have the sort
useful for our c2c tool. So I threw this interface together. I am open
to suggestions.
Cheers,
Don
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 20:17 ` Don Zickus
@ 2014-03-24 20:20 ` Andi Kleen
2014-03-24 20:26 ` Don Zickus
0 siblings, 1 reply; 25+ messages in thread
From: Andi Kleen @ 2014-03-24 20:20 UTC (permalink / raw)
To: Don Zickus
Cc: Andi Kleen, acme, peterz, LKML, jolsa, jmario, fowles, eranian,
andi.kleen
On Mon, Mar 24, 2014 at 04:17:57PM -0400, Don Zickus wrote:
> On Mon, Mar 24, 2014 at 12:54:31PM -0700, Andi Kleen wrote:
> > Don Zickus <dzickus@redhat.com> writes:
> >
> > > In order for the c2c tool to work correctly, it needs to properly
> > > sort all the records on uniquely identifiable data addresses. These
> > > unique addresses are converted from virtual addresses provided by the
> > > hardware into a kernel address using an mmap2 record as the decoder.
> >
> > No documentation for the new option?
> >
> > Probably the new mode should be also supported by --sort
>
> I hid the new option further in the changelog, so it isn't obvious. Sorry
> about that.
I meant there's no manpage change
-Andi
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 20:20 ` Andi Kleen
@ 2014-03-24 20:26 ` Don Zickus
0 siblings, 0 replies; 25+ messages in thread
From: Don Zickus @ 2014-03-24 20:26 UTC (permalink / raw)
To: Andi Kleen; +Cc: acme, peterz, LKML, jolsa, jmario, fowles, eranian, andi.kleen
On Mon, Mar 24, 2014 at 09:20:45PM +0100, Andi Kleen wrote:
> On Mon, Mar 24, 2014 at 04:17:57PM -0400, Don Zickus wrote:
> > On Mon, Mar 24, 2014 at 12:54:31PM -0700, Andi Kleen wrote:
> > > Don Zickus <dzickus@redhat.com> writes:
> > >
> > > > In order for the c2c tool to work correctly, it needs to properly
> > > > sort all the records on uniquely identifiable data addresses. These
> > > > unique addresses are converted from virtual addresses provided by the
> > > > hardware into a kernel address using an mmap2 record as the decoder.
> > >
> > > No documentation for the new option?
> > >
> > > Probably the new mode should be also supported by --sort
> >
> > I hid the new option further in the changelog, so it isn't obvious. Sorry
> > about that.
>
> I meant there's no manpage change
Ah, right. I missed that. Thanks for noticing. Let me respin that.
Cheers,
Don
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 01/15 V3] perf: Fix stddev calculation
2014-03-24 19:34 ` [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data Don Zickus
2014-03-24 19:54 ` Andi Kleen
@ 2014-03-24 20:54 ` Don Zickus
2014-03-24 20:57 ` Don Zickus
2014-03-24 20:57 ` [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data Don Zickus
2014-04-09 3:06 ` [PATCH 4/6] " Don Zickus
3 siblings, 1 reply; 25+ messages in thread
From: Don Zickus @ 2014-03-24 20:54 UTC (permalink / raw)
To: acme; +Cc: LKML, jolsa, jmario, fowles, peterz, eranian, andi.kleen,
Don Zickus
The stddev calculation as written actually computed the standard error. As
a result, using it to find the relative stddev between runs gave
inaccurate results.
Update the formula to match traditional stddev. Then rename the old
stddev calculation to stderr_stats in case someone wants to use it.
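For reference, the two quantities differ only by a factor of sqrt(n).
Writing M_2 for the running sum of squared deviations from the mean and n
for the number of samples (the M2 and n fields used in the diff below), the
textbook definitions are:

```latex
% s: sample standard deviation, s_mean: standard error of the mean
s = \sqrt{\frac{M_2}{n - 1}}
\qquad
s_{\text{mean}} = \frac{s}{\sqrt{n}} = \sqrt{\frac{M_2}{n\,(n - 1)}}
```

As a worked example, for four runs measuring 2, 4, 4 and 6 the mean is 4 and
M_2 = 8, so s is about 1.63 while s_mean is about 0.82, half the size of the
actual spread between runs.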
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
tools/perf/util/stat.c | 13 +++++++++++++
tools/perf/util/stat.h | 1 +
2 files changed, 14 insertions(+)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 6506b3d..0cb4dbc 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -33,6 +33,7 @@ double avg_stats(struct stats *stats)
* http://en.wikipedia.org/wiki/Stddev
*
* The std dev of the mean is related to the std dev by:
+ * (also known as standard error)
*
* s
* s_mean = -------
@@ -41,6 +42,18 @@ double avg_stats(struct stats *stats)
*/
double stddev_stats(struct stats *stats)
{
+ double variance;
+
+ if (stats->n < 2)
+ return 0.0;
+
+ variance = stats->M2 / (stats->n - 1);
+
+ return sqrt(variance);
+}
+
+double stderr_stats(struct stats *stats)
+{
double variance, variance_mean;
if (stats->n < 2)
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index ae8ccd7..6f61615 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -12,6 +12,7 @@ struct stats
void update_stats(struct stats *stats, u64 val);
double avg_stats(struct stats *stats);
double stddev_stats(struct stats *stats);
+double stderr_stats(struct stats *stats);
double rel_stddev_stats(double stddev, double avg);
static inline void init_stats(struct stats *stats)
--
1.7.11.7
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 01/15 V3] perf: Fix stddev calculation
2014-03-24 20:54 ` [PATCH 01/15 V3] perf: Fix stddev calculation Don Zickus
@ 2014-03-24 20:57 ` Don Zickus
0 siblings, 0 replies; 25+ messages in thread
From: Don Zickus @ 2014-03-24 20:57 UTC (permalink / raw)
To: acme; +Cc: LKML, jolsa, jmario, fowles, peterz, eranian, andi.kleen
On Mon, Mar 24, 2014 at 04:54:38PM -0400, Don Zickus wrote:
> The stddev calculation written matched standard error. As a result when
> using this result to find the relative stddev between runs, it was not
> accurate.
>
This isn't the patch that had my updates... Sorry for the noise.
Cheers,
Don
^ permalink raw reply [flat|nested] 25+ messages in thread
* [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 19:34 ` [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data Don Zickus
2014-03-24 19:54 ` Andi Kleen
2014-03-24 20:54 ` [PATCH 01/15 V3] perf: Fix stddev calculation Don Zickus
@ 2014-03-24 20:57 ` Don Zickus
2014-03-29 17:11 ` Jiri Olsa
2014-04-09 5:21 ` Namhyung Kim
2014-04-09 3:06 ` [PATCH 4/6] " Don Zickus
3 siblings, 2 replies; 25+ messages in thread
From: Don Zickus @ 2014-03-24 20:57 UTC (permalink / raw)
To: acme; +Cc: LKML, jolsa, jmario, fowles, peterz, eranian, andi.kleen,
Don Zickus
In order for the c2c tool to work correctly, it needs to properly
sort all the records on uniquely identifiable data addresses. These
unique addresses are converted from virtual addresses provided by the
hardware into a kernel address using an mmap2 record as the decoder.
Once the addresses are converted to unique ones, we can sort on them based
on various rules. Then it becomes clear which addresses overlap
with each other across mmap regions or pid spaces.
This patch just creates the rules and inserts the records into a
sort entry for safe keeping until later patches process them.
The general sorting rule is:
o group cpumodes together
o if (nonzero major/minor number - ie mmap'd areas)
o sort on major, minor, inode, inode generation numbers
o else if cpumode is not kernel
o sort on pid
o sort on data addresses
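The rule above can be sketched as a single three-way comparator. This is an
illustration on a simplified, hypothetical record (ex_rec, ex_rec_cmp and
EX_CPUMODE_KERNEL are made-up names), not the code in this patch. The sketch
sorts in ascending order and takes the file-identity branch when either side
is mapped; both are choices of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified record; not perf's struct hist_entry. */
struct ex_rec {
	uint8_t  cpumode;       /* kernel, user, ... */
	uint32_t maj, min;      /* device numbers of the backing file, 0 if none */
	uint64_t ino, ino_gen;  /* inode and inode generation, 0 if none */
	int32_t  pid;
	uint64_t daddr;         /* data address, already adjusted by its map */
};

/* Assumed value for the example; stands in for the kernel cpumode. */
#define EX_CPUMODE_KERNEL 1

/* Three-way compare that avoids subtraction, so unsigned operands are safe. */
#define EX_CMP(a, b) (((a) > (b)) - ((a) < (b)))

static int ex_is_mapped(const struct ex_rec *r)
{
	return r->maj || r->min || r->ino || r->ino_gen;
}

/* Returns <0, 0 or >0; ascending order is an arbitrary choice here. */
static int ex_rec_cmp(const struct ex_rec *l, const struct ex_rec *r)
{
	int c;

	assert(l && r);

	/* 1. group cpumodes together */
	if ((c = EX_CMP(l->cpumode, r->cpumode)))
		return c;

	if (ex_is_mapped(l) || ex_is_mapped(r)) {
		/* 2. file-backed mappings: order by file identity */
		if ((c = EX_CMP(l->maj, r->maj))) return c;
		if ((c = EX_CMP(l->min, r->min))) return c;
		if ((c = EX_CMP(l->ino, r->ino))) return c;
		if ((c = EX_CMP(l->ino_gen, r->ino_gen))) return c;
	} else if (l->cpumode != EX_CPUMODE_KERNEL) {
		/* 3. anonymous userspace memory: addresses are per process */
		if ((c = EX_CMP(l->pid, r->pid))) return c;
	}

	/* 4. finally, order by data address */
	return EX_CMP(l->daddr, r->daddr);
}
```

With this ordering, two processes touching the same offset of the same file
compare equal, while identical anonymous addresses in different pids do not.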
I also hacked in the concept of 'color'. The purpose of that bit is to
provide hints, when these records are processed later, that a new unique
address has been encountered. Because later processing only checks the data
addresses, there is a theoretical scenario in which similar sequential data
addresses (when walking the rbtree) could be misinterpreted as overlapping
when in fact they are not.
Sample output: (perf report --stdio --physid-mode)
Overhead Data Address Source Address Command: Pid Tid Major Minor Inode Inode Gen
........ ...................... ........................ ................. ..... ..... ..... ....... .........
18.93% [k] 0xffffc900139c40b0 [k] igb_update_stats kworker/0:1: 257 257 0 0 0 0
7.63% [k] 0xffff88082e6cf0a8 [k] watchdog_timer_fn swapper: 0 0 0 0 0 0
1.86% [k] 0xffff88042ef94700 [k] _raw_spin_lock swapper: 0 0 0 0 0 0
1.77% [k] 0xffff8804278afa50 [k] __switch_to swapper: 0 0 0 0 0 0
V4: add manpage entry in perf-report
V3: split out the sorting into unique entries. This makes it look
far less ugly
create a new 'physid mode' to group all the sorting rules together
(mimics the mem-mode)
Signed-off-by: Don Zickus <dzickus@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 23 +++
tools/perf/builtin-report.c | 20 ++-
tools/perf/util/hist.c | 27 ++-
tools/perf/util/hist.h | 8 +
tools/perf/util/sort.c | 294 +++++++++++++++++++++++++++++++
tools/perf/util/sort.h | 13 ++
6 files changed, 381 insertions(+), 4 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 8eab8a4..01391b0 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -95,6 +95,23 @@ OPTIONS
And default sort keys are changed to comm, dso_from, symbol_from, dso_to
and symbol_to, see '--branch-stack'.
+ If --physid-mode option is used, following sort keys are also
+ available:
+ daddr, iaddr, pid, tid, major, minor, inode, inode_gen.
+
+ - daddr: data address (sorted based on major, minor, inode and inode
+ generation numbers if shared, otherwise pid)
+ - iaddr: instruction address
+ - pid: command and pid of the task
+ - tid: tid of the task
+ - major: major number of mapped location (0 if not mapped)
+ - minor: minor number of mapped location (0 if not mapped)
+ - inode: inode number of mapped location (0 if not mapped)
+ - inode_gen: inode generation number of mapped location (0 if not mapped)
+
+ And default sort keys are changed to daddr, iaddr, pid, tid, major,
+ minor, inode and inode_gen, see '--physid-mode'.
+
-p::
--parent=<regex>::
A regex filter to identify parent. The parent is a caller of this
@@ -223,6 +240,12 @@ OPTIONS
branch stacks and it will automatically switch to the branch view mode,
unless --no-branch-stack is used.
+--physid-mode::
+ Use the data addresses sampled using perf record -d and combine them
+ with the mmap'd area region where they are located. This helps identify
+ which data addresses collide with similar addresses in another process
+ space. See --sort for output choices.
+
--objdump=<path>::
Path to objdump binary.
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c87412b..093f5ad 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -49,6 +49,7 @@ struct report {
bool show_threads;
bool inverted_callchain;
bool mem_mode;
+ bool physid_mode;
bool header;
bool header_only;
int max_stack;
@@ -241,7 +242,7 @@ static int process_sample_event(struct perf_tool *tool,
ret = report__add_branch_hist_entry(rep, &al, sample, evsel);
if (ret < 0)
pr_debug("problem adding lbr entry, skipping event\n");
- } else if (rep->mem_mode == 1) {
+ } else if ((rep->mem_mode == 1) || (rep->physid_mode)) {
ret = report__add_mem_hist_entry(rep, &al, sample, evsel);
if (ret < 0)
pr_debug("problem adding mem entry, skipping event\n");
@@ -746,6 +747,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
"Disable symbol demangling"),
OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
+ OPT_BOOLEAN(0, "physid-mode", &report.physid_mode, "physid access profile"),
OPT_CALLBACK(0, "percent-limit", &report, "percent",
"Don't show entries under that percent", parse_percent_limit),
OPT_END()
@@ -817,6 +819,22 @@ repeat:
sort_order = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked";
}
+ if (report.physid_mode) {
+ if ((sort__mode == SORT_MODE__BRANCH) ||
+ (sort__mode == SORT_MODE__MEMORY)) {
+ pr_err("branch or memory and physid mode incompatible\n");
+ goto error;
+ }
+ sort__mode = SORT_MODE__PHYSID;
+
+ /*
+ * if no sort_order is provided, then specify
+ * physid-mode specific order
+ */
+ if (sort_order == default_sort_order)
+ sort_order = "daddr,iaddr,pid,tid,major,minor,inode,inode_gen";
+ }
+
if (setup_sorting() < 0) {
parse_options_usage(report_usage, options, "s", 1);
goto error;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index f38590d..81f47ee 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -136,14 +136,34 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
symlen = dso__name_len(h->mem_info->daddr.map->dso);
hists__new_col_len(hists, HISTC_MEM_DADDR_DSO,
symlen);
+ hists__new_col_len(hists, HISTC_PHYSID_DADDR, symlen);
} else {
symlen = unresolved_col_width + 4 + 2;
hists__set_unres_dso_col_len(hists, HISTC_MEM_DADDR_DSO);
+ hists__set_unres_dso_col_len(hists, HISTC_PHYSID_DADDR);
+ }
+
+ if (h->mem_info->iaddr.sym) {
+ symlen = (int)h->mem_info->iaddr.sym->namelen + 4
+ + unresolved_col_width + 2;
+ hists__new_col_len(hists, HISTC_PHYSID_IADDR, symlen);
+ } else {
+ symlen = unresolved_col_width + 4 + 2;
+ hists__new_col_len(hists, HISTC_PHYSID_IADDR, symlen);
+ }
+ if (h->mem_info->iaddr.map) {
+ symlen = dso__name_len(h->mem_info->iaddr.map->dso);
+ hists__new_col_len(hists, HISTC_PHYSID_IADDR, symlen);
+ } else {
+ symlen = unresolved_col_width + 4 + 2;
+ hists__set_unres_dso_col_len(hists, HISTC_PHYSID_IADDR);
}
} else {
symlen = unresolved_col_width + 4 + 2;
hists__new_col_len(hists, HISTC_MEM_DADDR_SYMBOL, symlen);
hists__set_unres_dso_col_len(hists, HISTC_MEM_DADDR_DSO);
+ hists__set_unres_dso_col_len(hists, HISTC_PHYSID_DADDR);
+ hists__set_unres_dso_col_len(hists, HISTC_PHYSID_IADDR);
}
hists__new_col_len(hists, HISTC_MEM_LOCKED, 6);
@@ -413,9 +433,10 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
.map = al->map,
.sym = al->sym,
},
- .cpu = al->cpu,
- .ip = al->addr,
- .level = al->level,
+ .cpu = al->cpu,
+ .cpumode = al->cpumode,
+ .ip = al->addr,
+ .level = al->level,
.stat = {
.nr_events = 1,
.period = period,
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 1f1f513..664d83f 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -71,6 +71,14 @@ enum hist_column {
HISTC_MEM_LVL,
HISTC_MEM_SNOOP,
HISTC_TRANSACTION,
+ HISTC_PHYSID_DADDR,
+ HISTC_PHYSID_IADDR,
+ HISTC_PHYSID_PID,
+ HISTC_PHYSID_TID,
+ HISTC_PHYSID_MAJOR,
+ HISTC_PHYSID_MINOR,
+ HISTC_PHYSID_INODE,
+ HISTC_PHYSID_INODE_GEN,
HISTC_NR_COLS, /* Last entry */
};
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 635cd8f..e016fc1 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -977,6 +977,269 @@ struct sort_entry sort_transaction = {
.se_width_idx = HISTC_TRANSACTION,
};
+static int64_t
+sort__physid_daddr_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ u64 l, r;
+ struct map *l_map = left->mem_info->daddr.map;
+ struct map *r_map = right->mem_info->daddr.map;
+
+ /* store all NULL mem maps at the bottom */
+ /* shouldn't even need this check, should have stubs */
+ if (!left->mem_info->daddr.map || !right->mem_info->daddr.map)
+ return 1;
+
+ /* group event types together */
+ if (left->cpumode > right->cpumode) return -1;
+ if (left->cpumode < right->cpumode) return 1;
+
+ /*
+ * Addresses with no major/minor numbers are assumed to be
+ * anonymous in userspace. Sort those on pid then address.
+ *
+ * The kernel and non-zero major/minor mapped areas are
+ * assumed to be unity mapped. Sort those on address then pid.
+ */
+
+ if (l_map->maj || l_map->min || l_map->ino || l_map->ino_generation) {
+ /* mmapped areas */
+
+ if (l_map->maj > r_map->maj) return -1;
+ if (l_map->maj < r_map->maj) return 1;
+
+ if (l_map->min > r_map->min) return -1;
+ if (l_map->min < r_map->min) return 1;
+
+ if (l_map->ino > r_map->ino) return -1;
+ if (l_map->ino < r_map->ino) return 1;
+
+ if (l_map->ino_generation > r_map->ino_generation) return -1;
+ if (l_map->ino_generation < r_map->ino_generation) return 1;
+
+ } else if (left->cpumode != PERF_RECORD_MISC_KERNEL) {
+ /* userspace anonymous */
+ if (left->thread->pid_ > right->thread->pid_) return -1;
+ if (left->thread->pid_ < right->thread->pid_) return 1;
+ }
+
+ /* hack to mark similar regions, 'right' is new entry */
+ right->color = TRUE;
+
+ /* al_addr does all the right addr - start + offset calculations */
+ l = left->mem_info->daddr.al_addr;
+ r = right->mem_info->daddr.al_addr;
+
+ if (l > r) return -1;
+ if (l < r) return 1;
+
+ /* sanity check the maps; only mmaped areas should have different maps */
+ if ((left->mem_info->daddr.map != right->mem_info->daddr.map) &&
+ !right->mem_info->daddr.map->maj && !right->mem_info->daddr.map->min)
+ pr_debug("physid_cmp: Similar entries have different maps\n");
+
+ return 0;
+}
+
+static int hist_entry__physid_daddr_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ uint64_t addr = 0;
+ struct map *map = NULL;
+ struct symbol *sym = NULL;
+ char level = he->level;
+
+ if (he->mem_info) {
+ addr = he->mem_info->daddr.addr;
+ map = he->mem_info->daddr.map;
+ sym = he->mem_info->daddr.sym;
+
+ /* print [s] for data mmaps */
+ if ((he->cpumode != PERF_RECORD_MISC_KERNEL) &&
+ map && (map->type == MAP__VARIABLE) &&
+ (map->maj || map->min || map->ino ||
+ map->ino_generation))
+ level = 's';
+ }
+
+ return _hist_entry__sym_snprintf(map, sym, addr, level, bf, size,
+ width);
+}
+
+struct sort_entry sort_physid_daddr = {
+ .se_header = "Data Address",
+ .se_cmp = sort__physid_daddr_cmp,
+ .se_snprintf = hist_entry__physid_daddr_snprintf,
+ .se_width_idx = HISTC_PHYSID_DADDR,
+};
+
+static int64_t
+sort__physid_iaddr_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ u64 l = left->mem_info->iaddr.al_addr;
+ u64 r = right->mem_info->iaddr.al_addr;
+
+ return r - l;
+}
+
+static int hist_entry__physid_iaddr_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ uint64_t addr = 0;
+ struct map *map = NULL;
+ struct symbol *sym = NULL;
+ char level = he->level;
+
+ if (he->mem_info) {
+ addr = he->mem_info->iaddr.addr;
+ map = he->mem_info->iaddr.map;
+ sym = he->mem_info->iaddr.sym;
+ }
+
+ return _hist_entry__sym_snprintf(map, sym, addr, level, bf, size,
+ width);
+}
+
+struct sort_entry sort_physid_iaddr = {
+ .se_header = "Source Address",
+ .se_cmp = sort__physid_iaddr_cmp,
+ .se_snprintf = hist_entry__physid_iaddr_snprintf,
+ .se_width_idx = HISTC_PHYSID_IADDR,
+};
+
+static int64_t
+sort__physid_pid_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ pid_t l = left->thread->pid_;
+ pid_t r = right->thread->pid_;
+
+ return r - l;
+}
+
+static int hist_entry__physid_pid_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ const char *comm = thread__comm_str(he->thread);
+ return repsep_snprintf(bf, size, "%*s:%5d", width - 6,
+ comm ?: "", he->thread->pid_);
+}
+
+struct sort_entry sort_physid_pid = {
+ .se_header = "Command: Pid",
+ .se_cmp = sort__physid_pid_cmp,
+ .se_snprintf = hist_entry__physid_pid_snprintf,
+ .se_width_idx = HISTC_PHYSID_PID,
+};
+
+static int64_t
+sort__physid_tid_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ pid_t l = left->thread->tid;
+ pid_t r = right->thread->tid;
+
+ return r - l;
+}
+
+static int hist_entry__physid_tid_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ return repsep_snprintf(bf, size, "%*d", width, he->thread->tid);
+}
+
+struct sort_entry sort_physid_tid = {
+ .se_header = "Tid ",
+ .se_cmp = sort__physid_tid_cmp,
+ .se_snprintf = hist_entry__physid_tid_snprintf,
+ .se_width_idx = HISTC_PHYSID_TID,
+};
+
+static int64_t
+sort__physid_major_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ struct map *l = left->mem_info->daddr.map;
+ struct map *r = right->mem_info->daddr.map;
+
+ return r->maj - l->maj;
+}
+
+static int hist_entry__physid_major_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ return repsep_snprintf(bf, size, "%*x", width, he->mem_info->daddr.map->maj);
+}
+
+struct sort_entry sort_physid_major = {
+ .se_header = "Major",
+ .se_cmp = sort__physid_major_cmp,
+ .se_snprintf = hist_entry__physid_major_snprintf,
+ .se_width_idx = HISTC_PHYSID_MAJOR,
+};
+
+static int64_t
+sort__physid_minor_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ struct map *l = left->mem_info->daddr.map;
+ struct map *r = right->mem_info->daddr.map;
+
+ return r->min - l->min;
+}
+
+static int hist_entry__physid_minor_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ return repsep_snprintf(bf, size, "%*x", width, he->mem_info->daddr.map->min);
+}
+
+struct sort_entry sort_physid_minor = {
+ .se_header = "Minor",
+ .se_cmp = sort__physid_minor_cmp,
+ .se_snprintf = hist_entry__physid_minor_snprintf,
+ .se_width_idx = HISTC_PHYSID_MINOR,
+};
+
+static int64_t
+sort__physid_inode_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ struct map *l = left->mem_info->daddr.map;
+ struct map *r = right->mem_info->daddr.map;
+
+ return r->ino - l->ino;
+}
+
+static int hist_entry__physid_inode_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ return repsep_snprintf(bf, size, "%*x", width, he->mem_info->daddr.map->ino);
+}
+
+struct sort_entry sort_physid_inode = {
+ .se_header = "Inode ",
+ .se_cmp = sort__physid_inode_cmp,
+ .se_snprintf = hist_entry__physid_inode_snprintf,
+ .se_width_idx = HISTC_PHYSID_INODE,
+};
+
+static int64_t
+sort__physid_inode_gen_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ struct map *l = left->mem_info->daddr.map;
+ struct map *r = right->mem_info->daddr.map;
+
+ return r->ino_generation - l->ino_generation;
+}
+
+static int hist_entry__physid_inode_gen_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ return repsep_snprintf(bf, size, "%-*x", width, he->mem_info->daddr.map->ino_generation);
+}
+
+struct sort_entry sort_physid_inode_gen = {
+ .se_header = "Inode Gen",
+ .se_cmp = sort__physid_inode_gen_cmp,
+ .se_snprintf = hist_entry__physid_inode_gen_snprintf,
+ .se_width_idx = HISTC_PHYSID_INODE_GEN,
+};
+
struct sort_dimension {
const char *name;
struct sort_entry *entry;
@@ -1027,6 +1290,21 @@ static struct sort_dimension memory_sort_dimensions[] = {
#undef DIM
+#define DIM(d, n, func) [d - __SORT_PHYSID_MODE] = { .name = n, .entry = &(func) }
+
+static struct sort_dimension physid_sort_dimensions[] = {
+ DIM(SORT_PHYSID_DADDR, "daddr", sort_physid_daddr),
+ DIM(SORT_PHYSID_IADDR, "iaddr", sort_physid_iaddr),
+ DIM(SORT_PHYSID_PID, "pid", sort_physid_pid),
+ DIM(SORT_PHYSID_TID, "tid", sort_physid_tid),
+ DIM(SORT_PHYSID_MAJOR, "major", sort_physid_major),
+ DIM(SORT_PHYSID_MINOR, "minor", sort_physid_minor),
+ DIM(SORT_PHYSID_INODE, "inode", sort_physid_inode),
+ DIM(SORT_PHYSID_INODE_GEN, "inode_gen", sort_physid_inode_gen),
+};
+
+#undef DIM
+
static void __sort_dimension__add(struct sort_dimension *sd, enum sort_type idx)
{
if (sd->taken)
@@ -1104,6 +1382,22 @@ int sort_dimension__add(const char *tok)
return 0;
}
+ for (i = 0; i < ARRAY_SIZE(physid_sort_dimensions); i++) {
+ struct sort_dimension *sd = &physid_sort_dimensions[i];
+
+ if (strncasecmp(tok, sd->name, strlen(tok)))
+ continue;
+
+ if (sort__mode != SORT_MODE__PHYSID)
+ return -EINVAL;
+
+ if (sd->entry == &sort_physid_daddr)
+ sort__has_sym = 1;
+
+ __sort_dimension__add(sd, i + __SORT_PHYSID_MODE);
+ return 0;
+ }
+
return -ESRCH;
}
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 43e5ff4..b1f52a8 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -87,11 +87,13 @@ struct hist_entry {
u64 ip;
u64 transaction;
s32 cpu;
+ u8 cpumode;
struct hist_entry_diff diff;
/* We are added by hists__add_dummy_entry. */
bool dummy;
+ bool color;
/* XXX These two should move to some tree widget lib */
u16 row_offset;
@@ -133,6 +135,7 @@ enum sort_mode {
SORT_MODE__NORMAL,
SORT_MODE__BRANCH,
SORT_MODE__MEMORY,
+ SORT_MODE__PHYSID,
};
enum sort_type {
@@ -166,6 +169,16 @@ enum sort_type {
SORT_MEM_TLB,
SORT_MEM_LVL,
SORT_MEM_SNOOP,
+
+ __SORT_PHYSID_MODE,
+ SORT_PHYSID_DADDR = __SORT_PHYSID_MODE,
+ SORT_PHYSID_IADDR,
+ SORT_PHYSID_PID,
+ SORT_PHYSID_TID,
+ SORT_PHYSID_MAJOR,
+ SORT_PHYSID_MINOR,
+ SORT_PHYSID_INODE,
+ SORT_PHYSID_INODE_GEN,
};
/*
--
1.7.11.7
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 20:57 ` [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data Don Zickus
@ 2014-03-29 17:11 ` Jiri Olsa
2014-04-01 2:58 ` Don Zickus
2014-04-09 5:21 ` Namhyung Kim
1 sibling, 1 reply; 25+ messages in thread
From: Jiri Olsa @ 2014-03-29 17:11 UTC (permalink / raw)
To: Don Zickus; +Cc: acme, LKML, jmario, fowles, peterz, eranian, andi.kleen
On Mon, Mar 24, 2014 at 04:57:18PM -0400, Don Zickus wrote:
> In order for the c2c tool to work correctly, it needs to properly
> sort all the records on uniquely identifiable data addresses. These
> unique addresses are converted from virtual addresses provided by the
> hardware into a kernel address using an mmap2 record as the decoder.
>
> Once a unique address is converted, we can sort on them based on
> various rules. Then it becomes clear which address are overlapping
> with each other across mmap regions or pid spaces.
>
> This patch just creates the rules and inserts the records into a
> sort entry for safe keeping until later patches process them.
>
> The general sorting rule is:
SNIP
> +
> +static int64_t
> +sort__physid_major_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + struct map *l = left->mem_info->daddr.map;
> + struct map *r = right->mem_info->daddr.map;
> +
> + return r->maj - l->maj;
I got segfault here, and consequently in all other sorting
functions, because it failed to resolve map earlier in
ip__resolve_data
we need to check it here, or before adding to the tree
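One way such a check could look, sketched on a stand-in struct rather than
perf's struct map. The names ex_map and ex_major_cmp are made up for
illustration, and sorting unresolved entries last is an assumption of the
sketch. It also avoids subtracting the fields directly, which would wrap
around if the field is unsigned (assumed here to be a 32-bit unsigned value).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the one map field this comparator needs. */
struct ex_map {
	uint32_t maj;
};

/*
 * Compare on major number, tolerating unresolved (NULL) maps.
 * Unresolved entries sort after resolved ones and equal to each other;
 * that placement is an assumption of this sketch.
 */
static int64_t ex_major_cmp(const struct ex_map *l, const struct ex_map *r)
{
	if (!l || !r)
		return (int64_t)(l == NULL) - (int64_t)(r == NULL);

	/* Same direction as "r - l", but without unsigned wraparound. */
	return (int64_t)(r->maj > l->maj) - (int64_t)(r->maj < l->maj);
}
```

The same guard would apply to the minor, inode and inode generation
comparators.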
thanks,
jirka
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
2014-03-29 17:11 ` Jiri Olsa
@ 2014-04-01 2:58 ` Don Zickus
0 siblings, 0 replies; 25+ messages in thread
From: Don Zickus @ 2014-04-01 2:58 UTC (permalink / raw)
To: Jiri Olsa; +Cc: acme, LKML, jmario, fowles, peterz, eranian, andi.kleen
On Sat, Mar 29, 2014 at 06:11:52PM +0100, Jiri Olsa wrote:
> On Mon, Mar 24, 2014 at 04:57:18PM -0400, Don Zickus wrote:
> > In order for the c2c tool to work correctly, it needs to properly
> > sort all the records on uniquely identifiable data addresses. These
> > unique addresses are converted from virtual addresses provided by the
> > hardware into a kernel address using an mmap2 record as the decoder.
> >
> > Once a unique address is converted, we can sort on them based on
> > various rules. Then it becomes clear which address are overlapping
> > with each other across mmap regions or pid spaces.
> >
> > This patch just creates the rules and inserts the records into a
> > sort entry for safe keeping until later patches process them.
> >
> > The general sorting rule is:
>
> SNIP
>
> > +
> > +static int64_t
> > +sort__physid_major_cmp(struct hist_entry *left, struct hist_entry *right)
> > +{
> > + struct map *l = left->mem_info->daddr.map;
> > + struct map *r = right->mem_info->daddr.map;
> > +
> > + return r->maj - l->maj;
>
> I got segfault here, and consequently in all other sorting
> functions, because it failed to resolve map earlier in
> ip__resolve_data
>
> we need to check it here, or before adding to the tree
Crap. I checked it before, when I had one big function. I forgot to
carry that through. Honestly, I would love to block these before they make
it to the sort routine, but I don't know a good way without adding checks to
all the builtins.
Cheers,
Don
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 20:57 ` [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data Don Zickus
2014-03-29 17:11 ` Jiri Olsa
@ 2014-04-09 5:21 ` Namhyung Kim
2014-04-09 5:45 ` Peter Zijlstra
1 sibling, 1 reply; 25+ messages in thread
From: Namhyung Kim @ 2014-04-09 5:21 UTC (permalink / raw)
To: Don Zickus; +Cc: acme, LKML, jolsa, jmario, fowles, peterz, eranian, andi.kleen
On Mon, 24 Mar 2014 16:57:18 -0400, Don Zickus wrote:
> In order for the c2c tool to work correctly, it needs to properly
> sort all the records on uniquely identifiable data addresses. These
> unique addresses are converted from virtual addresses provided by the
> hardware into a kernel address using an mmap2 record as the decoder.
>
> Once a unique address is converted, we can sort on them based on
> various rules. Then it becomes clear which address are overlapping
> with each other across mmap regions or pid spaces.
>
> This patch just creates the rules and inserts the records into a
> sort entry for safe keeping until later patches process them.
>
> The general sorting rule is:
>
> o group cpumodes together
> o if (nonzero major/minor number - ie mmap'd areas)
> o sort on major, minor, inode, inode generation numbers
> o else if cpumode is not kernel
> o sort on pid
> o sort on data addresses
>
> I also hacked in the concept of 'color'. The purpose of that bit is to
> provides hints later when processing these records that indicate a new unique
> address has been encountered. Because later processing only checks the data
> addresses, there can be a theoretical scenario that similar sequential data
> addresses (when walking the rbtree) could be misinterpreted as overlapping
> when in fact they are not.
>
> Sample output: (perf report --stdio --physid-mode)
>
> Overhead Data Address Source Address Command: Pid Tid Major Minor Inode Inode Gen
> ........ ...................... ........................ ................. ..... ..... ..... ....... .........
> 18.93% [k] 0xffffc900139c40b0 [k] igb_update_stats kworker/0:1: 257 257 0 0 0 0
> 7.63% [k] 0xffff88082e6cf0a8 [k] watchdog_timer_fn swapper: 0 0 0 0 0 0
> 1.86% [k] 0xffff88042ef94700 [k] _raw_spin_lock swapper: 0 0 0 0 0 0
> 1.77% [k] 0xffff8804278afa50 [k] __switch_to swapper: 0 0 0 0 0 0
>
> V4: add manpage entry in perf-report
>
> V3: split out the sorting into unique entries. This makes it look
> far less ugly
> create a new 'physid mode' to group all the sorting rules together
> (mimics the mem-mode)
What is 'physid' then? I guess you meant physical id, but 'unique id' or
'unique map id' looks like a better fit IMHO.
>
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
> tools/perf/Documentation/perf-report.txt | 23 +++
> tools/perf/builtin-report.c | 20 ++-
> tools/perf/util/hist.c | 27 ++-
> tools/perf/util/hist.h | 8 +
> tools/perf/util/sort.c | 294 +++++++++++++++++++++++++++++++
> tools/perf/util/sort.h | 13 ++
> 6 files changed, 381 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 8eab8a4..01391b0 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -95,6 +95,23 @@ OPTIONS
> And default sort keys are changed to comm, dso_from, symbol_from, dso_to
> and symbol_to, see '--branch-stack'.
>
> + If --physid-mode option is used, following sort keys are also
> + available:
> + daddr, iaddr, pid, tid, major, minor, inode, inode_gen.
> +
> + - daddr: data address (sorted based on major, minor, inode and inode
> + generation numbers if shared, otherwise pid)
By "if shared", did you mean "for shared file mapping"?
> + - iaddr: instruction address
> + - pid: command and pid of the task
> + - tid: tid of the task
> + - major: major number of mapped location (0 if not mapped)
> + - minor: minor number of mapped location (0 if not mapped)
> + - inode: inode number of mapped location (0 if not mapped)
> + - inode_gen: inode generation number of mapped location (0 if not mapped)
s/if not mapped/if not file-mapped/ ?
> +
> + And default sort keys are changed to daddr, iaddr, pid, tid, major,
> + minor, inode and inode_gen, see '--physid-mode'.
> +
> -p::
> --parent=<regex>::
> A regex filter to identify parent. The parent is a caller of this
> @@ -223,6 +240,12 @@ OPTIONS
> branch stacks and it will automatically switch to the branch view mode,
> unless --no-branch-stack is used.
>
> +--physid-mode::
> + Use the data addresses sampled using perf record -d and combine them
> + with the mmap'd area region where they are located. This helps identify
> + which data addresses collide with similar addresses in another process
> + space. See --sort for output choices.
> +
> --objdump=<path>::
> Path to objdump binary.
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index c87412b..093f5ad 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -49,6 +49,7 @@ struct report {
> bool show_threads;
> bool inverted_callchain;
> bool mem_mode;
> + bool physid_mode;
> bool header;
> bool header_only;
> int max_stack;
> @@ -241,7 +242,7 @@ static int process_sample_event(struct perf_tool *tool,
> ret = report__add_branch_hist_entry(rep, &al, sample, evsel);
> if (ret < 0)
> pr_debug("problem adding lbr entry, skipping event\n");
> - } else if (rep->mem_mode == 1) {
> + } else if ((rep->mem_mode == 1) || (rep->physid_mode)) {
As you can see rep->mem_mode is also a boolean field. Please change it
like:
} else if (rep->mem_mode || rep->physid_mode) {
> ret = report__add_mem_hist_entry(rep, &al, sample, evsel);
> if (ret < 0)
> pr_debug("problem adding mem entry, skipping event\n");
> @@ -746,6 +747,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
> OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
> "Disable symbol demangling"),
> OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
> + OPT_BOOLEAN(0, "physid-mode", &report.physid_mode, "physid access profile"),
> OPT_CALLBACK(0, "percent-limit", &report, "percent",
> "Don't show entries under that percent", parse_percent_limit),
> OPT_END()
> @@ -817,6 +819,22 @@ repeat:
> sort_order = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked";
> }
>
> + if (report.physid_mode) {
> + if ((sort__mode == SORT_MODE__BRANCH) ||
> + (sort__mode == SORT_MODE__MEMORY)) {
> + pr_err("branch or memory and physid mode incompatible\n");
> + goto error;
> + }
> + sort__mode = SORT_MODE__PHYSID;
> +
> + /*
> + * if no sort_order is provided, then specify
> + * branch-mode specific order
s/branch-mode/physid-mode/
It looks like mem-mode has the same copy-and-paste problem.
> + */
> + if (sort_order == default_sort_order)
> + sort_order = "daddr,iaddr,pid,tid,major,minor,inode,inode_gen";
So if the 'daddr' key already checks major, minor, inode and inode_gen
by itself why do we need to add those sort keys again?
> + }
> +
> if (setup_sorting() < 0) {
> parse_options_usage(report_usage, options, "s", 1);
> goto error;
[SNIP]
> +static int64_t
> +sort__physid_daddr_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + u64 l, r;
> + struct map *l_map = left->mem_info->daddr.map;
> + struct map *r_map = right->mem_info->daddr.map;
> +
> + /* store all NULL mem maps at the bottom */
> + /* shouldn't even need this check, should have stubs */
> + if (!left->mem_info->daddr.map || !right->mem_info->daddr.map)
> + return 1;
You might want to use 'return cmp_null(l_map, r_map);' here.
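Something like the following, as a sketch. cmp_null_sketch() is a
stand-in modelled on the existing cmp_null() helper in util/sort.c; its
exact body here is an assumption, so check the tree before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * NULL-aware pointer comparison, meant to be called only when at least
 * one side is NULL: two NULLs compare equal, a lone NULL sorts first.
 */
static int64_t cmp_null_sketch(const void *l, const void *r)
{
	if (!l && !r)
		return 0;

	return l ? 1 : -1;
}
```

With the unconditional 'return 1', cmp(a, b) and cmp(b, a) both return 1
when a map is missing, so the ordering is inconsistent. A NULL-aware
helper makes two NULL-map entries compare equal and keeps the result
antisymmetric when only one side is NULL.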
> +
> + /* group event types together */
> + if (left->cpumode > right->cpumode) return -1;
> + if (left->cpumode < right->cpumode) return 1;
> +
> + /*
> + * Addresses with no major/minor numbers are assumed to be
> + * anonymous in userspace. Sort those on pid then address.
> + *
> + * The kernel and non-zero major/minor mapped areas are
> + * assumed to be unity mapped. Sort those on address then pid.
> + */
> +
> + if (l_map->maj || l_map->min || l_map->ino || l_map->ino_generation) {
> + /* mmapped areas */
> +
> + if (l_map->maj > r_map->maj) return -1;
> + if (l_map->maj < r_map->maj) return 1;
> +
> + if (l_map->min > r_map->min) return -1;
> + if (l_map->min < r_map->min) return 1;
> +
> + if (l_map->ino > r_map->ino) return -1;
> + if (l_map->ino < r_map->ino) return 1;
> +
> + if (l_map->ino_generation > r_map->ino_generation) return -1;
> + if (l_map->ino_generation < r_map->ino_generation) return 1;
> +
> + } else if (left->cpumode != PERF_RECORD_MISC_KERNEL) {
> + /* userspace anonymous */
> + if (left->thread->pid_ > right->thread->pid_) return -1;
> + if (left->thread->pid_ < right->thread->pid_) return 1;
> + }
> +
> + /* hack to mark similar regions, 'right' is new entry */
> + right->color = TRUE;
I don't understand the logic behind the 'color'. It seems to just mark
every sample except the first one on the same file (or the same pid for an
anon map), indicating that those accesses are for distinct maps, right?
I don't see how that helps to distinguish whether an access is to the
same map or a different map. For the userspace anon map case, why doesn't
it check the start addresses of l_map and r_map?
I'm feeling ignorant.. :-(
> +
> + /* al_addr does all the right addr - start + offset calculations */
> + l = left->mem_info->daddr.al_addr;
> + r = right->mem_info->daddr.al_addr;
> +
> + if (l > r) return -1;
> + if (l < r) return 1;
> +
> + /* sanity check the maps; only mmaped areas should have different maps */
> + if ((left->mem_info->daddr.map != right->mem_info->daddr.map) &&
> + !right->mem_info->daddr.map->maj && !right->mem_info->daddr.map->min)
> + pr_debug("physid_cmp: Similar entries have different maps\n");
> +
> + return 0;
> +}
[SNIP]
> @@ -1104,6 +1382,22 @@ int sort_dimension__add(const char *tok)
> return 0;
> }
>
> + for (i = 0; i < ARRAY_SIZE(physid_sort_dimensions); i++) {
> + struct sort_dimension *sd = &physid_sort_dimensions[i];
> +
> + if (strncasecmp(tok, sd->name, strlen(tok)))
> + continue;
> +
> + if (sort__mode != SORT_MODE__PHYSID)
> + return -EINVAL;
> +
> + if (sd->entry == &sort_physid_daddr)
> + sort__has_sym = 1;
I don't think this is needed. sort__has_sym is for doing annotation during
a report/top session, and it only works on a per-symbol (i.e. function) basis.
Thanks,
Namhyung
> +
> + __sort_dimension__add(sd, i + __SORT_PHYSID_MODE);
> + return 0;
> + }
> +
> return -ESRCH;
> }
* Re: [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data
2014-04-09 5:21 ` Namhyung Kim
@ 2014-04-09 5:45 ` Peter Zijlstra
0 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2014-04-09 5:45 UTC (permalink / raw)
To: Namhyung Kim
Cc: Don Zickus, acme, LKML, jolsa, jmario, fowles, eranian,
andi.kleen
On Wed, Apr 09, 2014 at 02:21:49PM +0900, Namhyung Kim wrote:
> > create a new 'physid mode' to group all the sorting rules together
> > (mimics the mem-mode)
>
> What is 'physid' then? I guess you meant physical id, but 'unique id' or
> 'unique map id' looks like a better fit IMHO.
I suspect this is legacy naming; they used to do this using physical
addresses.
* Re: [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data
2014-03-24 19:34 ` [PATCH 4/6] perf, sort: Add physid sorting based on mmap2 data Don Zickus
` (2 preceding siblings ...)
2014-03-24 20:57 ` [PATCH 4/6 V2] perf, sort: Add physid sorting based on mmap2 data Don Zickus
@ 2014-04-09 3:06 ` Don Zickus
3 siblings, 0 replies; 25+ messages in thread
From: Don Zickus @ 2014-04-09 3:06 UTC (permalink / raw)
To: acme, peterz; +Cc: LKML, jolsa, jmario, fowles, eranian, andi.kleen
On Mon, Mar 24, 2014 at 03:34:34PM -0400, Don Zickus wrote:
> In order for the c2c tool to work correctly, it needs to properly
> sort all the records on uniquely identifiable data addresses. These
> unique addresses are converted from virtual addresses provided by the
> hardware into a kernel address using an mmap2 record as the decoder.
>
> Once a unique address is converted, we can sort on them based on
> various rules. Then it becomes clear which addresses are overlapping
> with each other across mmap regions or pid spaces.
I am finishing up another way to sort this data that might make more sense
than the approach in this patch. Hopefully I can do that tomorrow.
Cheers,
Don
>
> This patch just creates the rules and inserts the records into a
> sort entry for safe keeping until later patches process them.
>
> The general sorting rule is:
>
> o group cpumodes together
> o if (nonzero major/minor number - ie mmap'd areas)
> o sort on major, minor, inode, inode generation numbers
> o else if cpumode is not kernel
> o sort on pid
> o sort on data addresses
>
> I also hacked in the concept of 'color'. The purpose of that bit is to
> provide hints later when processing these records that indicate a new unique
> address has been encountered. Because later processing only checks the data
> addresses, there can be a theoretical scenario that similar sequential data
> addresses (when walking the rbtree) could be misinterpreted as overlapping
> when in fact they are not.
>
> Sample output: (perf report --stdio --physid-mode)
>
> 18.93% [k] 0xffffc900139c40b0 [k] igb_update_stats kworker/0:1: 257 257 0 0 0 0
> 7.63% [k] 0xffff88082e6cf0a8 [k] watchdog_timer_fn swapper: 0 0 0 0 0 0
> 1.86% [k] 0xffff88042ef94700 [k] _raw_spin_lock swapper: 0 0 0 0 0 0
> 1.77% [k] 0xffff8804278afa50 [k] __switch_to swapper: 0 0 0 0 0 0
>
> V3: split out the sorting into unique entries. This makes it look
> far less ugly
> create a new 'physid mode' to group all the sorting rules together
> (mimics the mem-mode)
>
> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
> tools/perf/builtin-report.c | 20 ++-
> tools/perf/util/hist.c | 27 +++-
> tools/perf/util/hist.h | 8 ++
> tools/perf/util/sort.c | 294 ++++++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/sort.h | 13 ++
> 5 files changed, 358 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index c87412b..093f5ad 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -49,6 +49,7 @@ struct report {
> bool show_threads;
> bool inverted_callchain;
> bool mem_mode;
> + bool physid_mode;
> bool header;
> bool header_only;
> int max_stack;
> @@ -241,7 +242,7 @@ static int process_sample_event(struct perf_tool *tool,
> ret = report__add_branch_hist_entry(rep, &al, sample, evsel);
> if (ret < 0)
> pr_debug("problem adding lbr entry, skipping event\n");
> - } else if (rep->mem_mode == 1) {
> + } else if ((rep->mem_mode == 1) || (rep->physid_mode)) {
> ret = report__add_mem_hist_entry(rep, &al, sample, evsel);
> if (ret < 0)
> pr_debug("problem adding mem entry, skipping event\n");
> @@ -746,6 +747,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
> OPT_BOOLEAN(0, "demangle", &symbol_conf.demangle,
> "Disable symbol demangling"),
> OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
> + OPT_BOOLEAN(0, "physid-mode", &report.physid_mode, "physid access profile"),
> OPT_CALLBACK(0, "percent-limit", &report, "percent",
> "Don't show entries under that percent", parse_percent_limit),
> OPT_END()
> @@ -817,6 +819,22 @@ repeat:
> sort_order = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked";
> }
>
> + if (report.physid_mode) {
> + if ((sort__mode == SORT_MODE__BRANCH) ||
> + (sort__mode == SORT_MODE__MEMORY)) {
> + pr_err("branch or memory and physid mode incompatible\n");
> + goto error;
> + }
> + sort__mode = SORT_MODE__PHYSID;
> +
> + /*
> + * if no sort_order is provided, then specify
> + * branch-mode specific order
> + */
> + if (sort_order == default_sort_order)
> + sort_order = "daddr,iaddr,pid,tid,major,minor,inode,inode_gen";
> + }
> +
> if (setup_sorting() < 0) {
> parse_options_usage(report_usage, options, "s", 1);
> goto error;
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index f38590d..81f47ee 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -136,14 +136,34 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
> symlen = dso__name_len(h->mem_info->daddr.map->dso);
> hists__new_col_len(hists, HISTC_MEM_DADDR_DSO,
> symlen);
> + hists__new_col_len(hists, HISTC_PHYSID_DADDR, symlen);
> } else {
> symlen = unresolved_col_width + 4 + 2;
> hists__set_unres_dso_col_len(hists, HISTC_MEM_DADDR_DSO);
> + hists__set_unres_dso_col_len(hists, HISTC_PHYSID_DADDR);
> + }
> +
> + if (h->mem_info->iaddr.sym) {
> + symlen = (int)h->mem_info->iaddr.sym->namelen + 4
> + + unresolved_col_width + 2;
> + hists__new_col_len(hists, HISTC_PHYSID_IADDR, symlen);
> + } else {
> + symlen = unresolved_col_width + 4 + 2;
> + hists__new_col_len(hists, HISTC_PHYSID_IADDR, symlen);
> + }
> + if (h->mem_info->iaddr.map) {
> + symlen = dso__name_len(h->mem_info->iaddr.map->dso);
> + hists__new_col_len(hists, HISTC_PHYSID_IADDR, symlen);
> + } else {
> + symlen = unresolved_col_width + 4 + 2;
> + hists__set_unres_dso_col_len(hists, HISTC_PHYSID_IADDR);
> }
> } else {
> symlen = unresolved_col_width + 4 + 2;
> hists__new_col_len(hists, HISTC_MEM_DADDR_SYMBOL, symlen);
> hists__set_unres_dso_col_len(hists, HISTC_MEM_DADDR_DSO);
> + hists__set_unres_dso_col_len(hists, HISTC_PHYSID_DADDR);
> + hists__set_unres_dso_col_len(hists, HISTC_PHYSID_IADDR);
> }
>
> hists__new_col_len(hists, HISTC_MEM_LOCKED, 6);
> @@ -413,9 +433,10 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
> .map = al->map,
> .sym = al->sym,
> },
> - .cpu = al->cpu,
> - .ip = al->addr,
> - .level = al->level,
> + .cpu = al->cpu,
> + .cpumode = al->cpumode,
> + .ip = al->addr,
> + .level = al->level,
> .stat = {
> .nr_events = 1,
> .period = period,
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index 1f1f513..664d83f 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -71,6 +71,14 @@ enum hist_column {
> HISTC_MEM_LVL,
> HISTC_MEM_SNOOP,
> HISTC_TRANSACTION,
> + HISTC_PHYSID_DADDR,
> + HISTC_PHYSID_IADDR,
> + HISTC_PHYSID_PID,
> + HISTC_PHYSID_TID,
> + HISTC_PHYSID_MAJOR,
> + HISTC_PHYSID_MINOR,
> + HISTC_PHYSID_INODE,
> + HISTC_PHYSID_INODE_GEN,
> HISTC_NR_COLS, /* Last entry */
> };
>
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 635cd8f..e016fc1 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -977,6 +977,269 @@ struct sort_entry sort_transaction = {
> .se_width_idx = HISTC_TRANSACTION,
> };
>
> +static int64_t
> +sort__physid_daddr_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + u64 l, r;
> + struct map *l_map = left->mem_info->daddr.map;
> + struct map *r_map = right->mem_info->daddr.map;
> +
> + /* store all NULL mem maps at the bottom */
> + /* shouldn't even need this check, should have stubs */
> + if (!left->mem_info->daddr.map || !right->mem_info->daddr.map)
> + return 1;
> +
> + /* group event types together */
> + if (left->cpumode > right->cpumode) return -1;
> + if (left->cpumode < right->cpumode) return 1;
> +
> + /*
> + * Addresses with no major/minor numbers are assumed to be
> + * anonymous in userspace. Sort those on pid then address.
> + *
> + * The kernel and non-zero major/minor mapped areas are
> + * assumed to be unity mapped. Sort those on address then pid.
> + */
> +
> + if (l_map->maj || l_map->min || l_map->ino || l_map->ino_generation) {
> + /* mmapped areas */
> +
> + if (l_map->maj > r_map->maj) return -1;
> + if (l_map->maj < r_map->maj) return 1;
> +
> + if (l_map->min > r_map->min) return -1;
> + if (l_map->min < r_map->min) return 1;
> +
> + if (l_map->ino > r_map->ino) return -1;
> + if (l_map->ino < r_map->ino) return 1;
> +
> + if (l_map->ino_generation > r_map->ino_generation) return -1;
> + if (l_map->ino_generation < r_map->ino_generation) return 1;
> +
> + } else if (left->cpumode != PERF_RECORD_MISC_KERNEL) {
> + /* userspace anonymous */
> + if (left->thread->pid_ > right->thread->pid_) return -1;
> + if (left->thread->pid_ < right->thread->pid_) return 1;
> + }
> +
> + /* hack to mark similar regions, 'right' is new entry */
> + right->color = TRUE;
> +
> + /* al_addr does all the right addr - start + offset calculations */
> + l = left->mem_info->daddr.al_addr;
> + r = right->mem_info->daddr.al_addr;
> +
> + if (l > r) return -1;
> + if (l < r) return 1;
> +
> + /* sanity check the maps; only mmaped areas should have different maps */
> + if ((left->mem_info->daddr.map != right->mem_info->daddr.map) &&
> + !right->mem_info->daddr.map->maj && !right->mem_info->daddr.map->min)
> + pr_debug("physid_cmp: Similar entries have different maps\n");
> +
> + return 0;
> +}
> +
> +static int hist_entry__physid_daddr_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + uint64_t addr = 0;
> + struct map *map = NULL;
> + struct symbol *sym = NULL;
> + char level = he->level;
> +
> + if (he->mem_info) {
> + addr = he->mem_info->daddr.addr;
> + map = he->mem_info->daddr.map;
> + sym = he->mem_info->daddr.sym;
> +
> + /* print [s] for data mmaps */
> + if ((he->cpumode != PERF_RECORD_MISC_KERNEL) &&
> + map && (map->type == MAP__VARIABLE) &&
> + (map->maj || map->min || map->ino ||
> + map->ino_generation))
> + level = 's';
> + }
> +
> + return _hist_entry__sym_snprintf(map, sym, addr, level, bf, size,
> + width);
> +}
> +
> +struct sort_entry sort_physid_daddr = {
> + .se_header = "Data Address",
> + .se_cmp = sort__physid_daddr_cmp,
> + .se_snprintf = hist_entry__physid_daddr_snprintf,
> + .se_width_idx = HISTC_PHYSID_DADDR,
> +};
> +
> +static int64_t
> +sort__physid_iaddr_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + u64 l = left->mem_info->iaddr.al_addr;
> + u64 r = right->mem_info->iaddr.al_addr;
> +
> + return r - l;
> +}
> +
> +static int hist_entry__physid_iaddr_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + uint64_t addr = 0;
> + struct map *map = NULL;
> + struct symbol *sym = NULL;
> + char level = he->level;
> +
> + if (he->mem_info) {
> + addr = he->mem_info->iaddr.addr;
> + map = he->mem_info->iaddr.map;
> + sym = he->mem_info->iaddr.sym;
> + }
> +
> + return _hist_entry__sym_snprintf(map, sym, addr, level, bf, size,
> + width);
> +}
> +
> +struct sort_entry sort_physid_iaddr = {
> + .se_header = "Source Address",
> + .se_cmp = sort__physid_iaddr_cmp,
> + .se_snprintf = hist_entry__physid_iaddr_snprintf,
> + .se_width_idx = HISTC_PHYSID_IADDR,
> +};
> +
> +static int64_t
> +sort__physid_pid_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + pid_t l = left->thread->pid_;
> + pid_t r = right->thread->pid_;
> +
> + return r - l;
> +}
> +
> +static int hist_entry__physid_pid_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + const char *comm = thread__comm_str(he->thread);
> + return repsep_snprintf(bf, size, "%*s:%5d", width - 6,
> + comm ?: "", he->thread->pid_);
> +}
> +
> +struct sort_entry sort_physid_pid = {
> + .se_header = "Command: Pid",
> + .se_cmp = sort__physid_pid_cmp,
> + .se_snprintf = hist_entry__physid_pid_snprintf,
> + .se_width_idx = HISTC_PHYSID_PID,
> +};
> +
> +static int64_t
> +sort__physid_tid_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + pid_t l = left->thread->tid;
> + pid_t r = right->thread->tid;
> +
> + return r - l;
> +}
> +
> +static int hist_entry__physid_tid_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + return repsep_snprintf(bf, size, "%*d", width, he->thread->tid);
> +}
> +
> +struct sort_entry sort_physid_tid = {
> + .se_header = "Tid ",
> + .se_cmp = sort__physid_tid_cmp,
> + .se_snprintf = hist_entry__physid_tid_snprintf,
> + .se_width_idx = HISTC_PHYSID_TID,
> +};
> +
> +static int64_t
> +sort__physid_major_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + struct map *l = left->mem_info->daddr.map;
> + struct map *r = right->mem_info->daddr.map;
> +
> + return r->maj - l->maj;
> +}
> +
> +static int hist_entry__physid_major_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + return repsep_snprintf(bf, size, "%*x", width, he->mem_info->daddr.map->maj);
> +}
> +
> +struct sort_entry sort_physid_major = {
> + .se_header = "Major",
> + .se_cmp = sort__physid_major_cmp,
> + .se_snprintf = hist_entry__physid_major_snprintf,
> + .se_width_idx = HISTC_PHYSID_MAJOR,
> +};
> +
> +static int64_t
> +sort__physid_minor_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + struct map *l = left->mem_info->daddr.map;
> + struct map *r = right->mem_info->daddr.map;
> +
> + return r->min - l->min;
> +}
> +
> +static int hist_entry__physid_minor_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + return repsep_snprintf(bf, size, "%*x", width, he->mem_info->daddr.map->min);
> +}
> +
> +struct sort_entry sort_physid_minor = {
> + .se_header = "Minor",
> + .se_cmp = sort__physid_minor_cmp,
> + .se_snprintf = hist_entry__physid_minor_snprintf,
> + .se_width_idx = HISTC_PHYSID_MINOR,
> +};
> +
> +static int64_t
> +sort__physid_inode_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + struct map *l = left->mem_info->daddr.map;
> + struct map *r = right->mem_info->daddr.map;
> +
> + return r->ino - l->ino;
> +}
> +
> +static int hist_entry__physid_inode_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + return repsep_snprintf(bf, size, "%*x", width, he->mem_info->daddr.map->ino);
> +}
> +
> +struct sort_entry sort_physid_inode = {
> + .se_header = "Inode ",
> + .se_cmp = sort__physid_inode_cmp,
> + .se_snprintf = hist_entry__physid_inode_snprintf,
> + .se_width_idx = HISTC_PHYSID_INODE,
> +};
> +
> +static int64_t
> +sort__physid_inode_gen_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> + struct map *l = left->mem_info->daddr.map;
> + struct map *r = right->mem_info->daddr.map;
> +
> + return r->ino_generation - l->ino_generation;
> +}
> +
> +static int hist_entry__physid_inode_gen_snprintf(struct hist_entry *he, char *bf,
> + size_t size, unsigned int width)
> +{
> + return repsep_snprintf(bf, size, "%-*x", width, he->mem_info->daddr.map->ino_generation);
> +}
> +
> +struct sort_entry sort_physid_inode_gen = {
> + .se_header = "Inode Gen",
> + .se_cmp = sort__physid_inode_gen_cmp,
> + .se_snprintf = hist_entry__physid_inode_gen_snprintf,
> + .se_width_idx = HISTC_PHYSID_INODE_GEN,
> +};
> +
> struct sort_dimension {
> const char *name;
> struct sort_entry *entry;
> @@ -1027,6 +1290,21 @@ static struct sort_dimension memory_sort_dimensions[] = {
>
> #undef DIM
>
> +#define DIM(d, n, func) [d - __SORT_PHYSID_MODE] = { .name = n, .entry = &(func) }
> +
> +static struct sort_dimension physid_sort_dimensions[] = {
> + DIM(SORT_PHYSID_DADDR, "daddr", sort_physid_daddr),
> + DIM(SORT_PHYSID_IADDR, "iaddr", sort_physid_iaddr),
> + DIM(SORT_PHYSID_PID, "pid", sort_physid_pid),
> + DIM(SORT_PHYSID_TID, "tid", sort_physid_tid),
> + DIM(SORT_PHYSID_MAJOR, "major", sort_physid_major),
> + DIM(SORT_PHYSID_MINOR, "minor", sort_physid_minor),
> + DIM(SORT_PHYSID_INODE, "inode", sort_physid_inode),
> + DIM(SORT_PHYSID_INODE_GEN, "inode_gen", sort_physid_inode_gen),
> +};
> +
> +#undef DIM
> +
> static void __sort_dimension__add(struct sort_dimension *sd, enum sort_type idx)
> {
> if (sd->taken)
> @@ -1104,6 +1382,22 @@ int sort_dimension__add(const char *tok)
> return 0;
> }
>
> + for (i = 0; i < ARRAY_SIZE(physid_sort_dimensions); i++) {
> + struct sort_dimension *sd = &physid_sort_dimensions[i];
> +
> + if (strncasecmp(tok, sd->name, strlen(tok)))
> + continue;
> +
> + if (sort__mode != SORT_MODE__PHYSID)
> + return -EINVAL;
> +
> + if (sd->entry == &sort_physid_daddr)
> + sort__has_sym = 1;
> +
> + __sort_dimension__add(sd, i + __SORT_PHYSID_MODE);
> + return 0;
> + }
> +
> return -ESRCH;
> }
>
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index 43e5ff4..b1f52a8 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -87,11 +87,13 @@ struct hist_entry {
> u64 ip;
> u64 transaction;
> s32 cpu;
> + u8 cpumode;
>
> struct hist_entry_diff diff;
>
> /* We are added by hists__add_dummy_entry. */
> bool dummy;
> + bool color;
>
> /* XXX These two should move to some tree widget lib */
> u16 row_offset;
> @@ -133,6 +135,7 @@ enum sort_mode {
> SORT_MODE__NORMAL,
> SORT_MODE__BRANCH,
> SORT_MODE__MEMORY,
> + SORT_MODE__PHYSID,
> };
>
> enum sort_type {
> @@ -166,6 +169,16 @@ enum sort_type {
> SORT_MEM_TLB,
> SORT_MEM_LVL,
> SORT_MEM_SNOOP,
> +
> + __SORT_PHYSID_MODE,
> + SORT_PHYSID_DADDR = __SORT_PHYSID_MODE,
> + SORT_PHYSID_IADDR,
> + SORT_PHYSID_PID,
> + SORT_PHYSID_TID,
> + SORT_PHYSID_MAJOR,
> + SORT_PHYSID_MINOR,
> + SORT_PHYSID_INODE,
> + SORT_PHYSID_INODE_GEN,
> };
>
> /*
> --
> 1.7.11.7
>