From: Namhyung Kim <namhyung@gmail.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: jolsa@redhat.com, linux-kernel@vger.kernel.org,
acme@infradead.org, Andi Kleen <ak@linux.intel.com>
Subject: Re: [PATCH 1/9] perf, tools: Support handling complete branch stacks as histograms v6
Date: Mon, 19 May 2014 17:21:15 +0900 [thread overview]
Message-ID: <87fvk6ji10.fsf@sejong.aot.lge.com> (raw)
In-Reply-To: <1400259938-3436-2-git-send-email-andi@firstfloor.org> (Andi Kleen's message of "Fri, 16 May 2014 10:05:30 -0700")
Hi Andi,
On Fri, 16 May 2014 10:05:30 -0700, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> Currently branch stacks can be only shown as edge histograms for
> individual branches. I never found this display particularly useful.
>
> This implements an alternative mode that creates histograms over complete
> branch traces, instead of individual branches, similar to how normal
> callgraphs are handled. This is done by putting it in
> front of the normal callgraph and then using the normal callgraph
> histogram infrastructure to unify them.
>
> This way in complex functions we can understand the control flow
> that lead to a particular sample, and may even see some control
> flow in the caller for short functions.
[SNIP]
> +static int add_callchain_ip(struct machine *machine,
> + struct thread *thread,
> + struct symbol **parent,
> + struct addr_location *root_al,
> + int cpumode,
> + u64 ip)
> +{
> + struct addr_location al;
> +
> + al.filtered = 0;
> + al.sym = NULL;
> + thread__find_addr_location(thread, machine, cpumode, MAP__FUNCTION,
> + ip, &al);
> + if (al.sym != NULL) {
> + if (sort__has_parent && !*parent &&
> + symbol__match_regex(al.sym, &parent_regex))
> + *parent = al.sym;
> + else if (have_ignore_callees && root_al &&
> + symbol__match_regex(al.sym, &ignore_callees_regex)) {
> + /* Treat this symbol as the root,
> + forgetting its callees. */
> + *root_al = al;
> + callchain_cursor_reset(&callchain_cursor);
> + }
> + if (!symbol_conf.use_callchain)
> + return -EINVAL;
This is gone with 540476de74c9 ("perf tools: Remove
symbol_conf.use_callchain check").
> + }
> +
> + return callchain_cursor_append(&callchain_cursor, ip, al.map, al.sym);
> +}
> +
> +#define CHASHSZ 127
> +#define CHASHBITS 7
> +#define NO_ENTRY 0xff
> +
> +#define PERF_MAX_BRANCH_DEPTH 127
> +
> +/* Remove loops. */
> +static int remove_loops(struct branch_entry *l, int nr)
> +{
> + int i, j, off;
> + unsigned char chash[CHASHSZ];
> + memset(chash, -1, sizeof(chash));
s/-1/NO_ENTRY/ ?
> +
> + BUG_ON(nr >= 256);
> + for (i = 0; i < nr; i++) {
> + int h = hash_64(l[i].from, CHASHBITS) % CHASHSZ;
> +
> + /* no collision handling for now */
> + if (chash[h] == NO_ENTRY) {
> + chash[h] = i;
> + } else if (l[chash[h]].from == l[i].from) {
> + bool is_loop = true;
> + /* check if it is a real loop */
> + off = 0;
> + for (j = chash[h]; j < i && i + off < nr; j++, off++)
> + if (l[j].from != l[i + off].from) {
> + is_loop = false;
> + break;
> + }
> + if (is_loop) {
> + memmove(l + i, l + i + off,
> + (nr - (i + off))
> + * sizeof(struct branch_entry));
> + nr -= off;
> + }
> + }
> + }
> + return nr;
> +}
> +
> static int machine__resolve_callchain_sample(struct machine *machine,
> struct thread *thread,
> struct ip_callchain *chain,
> + struct branch_stack *branch,
> struct symbol **parent,
> struct addr_location *root_al,
> int max_stack)
> @@ -1290,17 +1363,73 @@ static int machine__resolve_callchain_sample(struct machine *machine,
> int chain_nr = min(max_stack, (int)chain->nr);
> int i;
> int err;
> + int first_call = 0;
>
> callchain_cursor_reset(&callchain_cursor);
>
> + /*
> + * Add branches to call stack for easier browsing. This gives
> + * more context for a sample than just the callers.
> + *
> + * This uses individual histograms of paths compared to the
> + * aggregated histograms the normal LBR mode uses.
> + *
> + * Limitations for now:
> + * - No extra filters
> + * - No annotations (should annotate somehow)
> + */
> +
> + if (branch->nr > PERF_MAX_BRANCH_DEPTH) {
> + pr_warning("corrupted branch chain. skipping...\n");
> + return 0;
> + }
> +
> + if (callchain_param.branch_callstack) {
> + int nr = min(max_stack, (int)branch->nr);
> + struct branch_entry be[nr];
> +
> + for (i = 0; i < nr; i++) {
> + if (callchain_param.order == ORDER_CALLEE) {
> + be[i] = branch->entries[i];
> + /*
> + * Check for overlap into the callchain.
> + * The return address is one off compared to
> + * the branch entry. To adjust for this
> + * assume the calling instruction is not longer
> + * than 8 bytes.
> + */
> + if (be[i].from < chain->ips[first_call] &&
> + be[i].from >= chain->ips[first_call] - 8)
> + first_call++;
It seems that you need to check chain->ips[first_call] is greater than
PERF_CONTEXT_MAX and use such value as the cpumode...
> + } else
> + be[i] = branch->entries[branch->nr - i - 1];
> + }
> +
> + nr = remove_loops(be, nr);
> +
> + for (i = 0; i < nr; i++) {
> + err = add_callchain_ip(machine, thread, parent,
> + root_al,
> + -1, be[i].to);
> + if (!err)
> + err = add_callchain_ip(machine, thread,
> + parent, root_al,
> + -1, be[i].from);
... for here.
> + if (err == -EINVAL)
> + break;
> + if (err)
> + return err;
> + }
> + chain_nr -= nr;
It seems it could make some callchain nodes being ignored. What if a
case like small callchains with matches to only 2 nodes in the LBR?
nr = 16, chain_nr = 10 and first_call = 2
Thanks,
Namhyung
next prev parent reply other threads:[~2014-05-19 8:21 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-16 17:05 Implement lbr-as-callgraph v7 Andi Kleen
2014-05-16 17:05 ` [PATCH 1/9] perf, tools: Support handling complete branch stacks as histograms v6 Andi Kleen
2014-05-19 8:21 ` Namhyung Kim [this message]
2014-05-23 21:35 ` Andi Kleen
2014-05-26 2:45 ` Namhyung Kim
2014-05-16 17:05 ` [PATCH 2/9] perf, tools: Add --branch-history option to report v3 Andi Kleen
2014-05-19 8:26 ` Namhyung Kim
2014-05-23 18:11 ` Andi Kleen
2014-05-26 2:55 ` Namhyung Kim
2014-05-16 17:05 ` [PATCH 3/9] perf, tools: Enable printing the srcline in the history v4 Andi Kleen
2014-05-16 17:05 ` [PATCH 4/9] perf, tools: Only print base source file for srcline Andi Kleen
2014-05-16 17:05 ` [PATCH 5/9] perf, tools: Support source line numbers in annotate Andi Kleen
2014-05-16 17:05 ` [PATCH 6/9] perf, tools: Fix srcline sort key output to use width Andi Kleen
2014-05-16 17:05 ` [PATCH 7/9] tools, perf: Make get_srcline fall back to sym+offset Andi Kleen
2014-05-16 17:05 ` [PATCH 8/9] tools, perf: Make srcline output address with -v Andi Kleen
2014-05-16 17:05 ` [PATCH 9/9] tools, perf: Add asprintf replacement Andi Kleen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87fvk6ji10.fsf@sejong.aot.lge.com \
--to=namhyung@gmail.com \
--cc=acme@infradead.org \
--cc=ak@linux.intel.com \
--cc=andi@firstfloor.org \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.