linux-perf-users.vger.kernel.org archive mirror
* RFC: perf python script to create callgrind file for kcachegrind visualization
@ 2015-10-01 21:45 Milian Wolff
From: Milian Wolff @ 2015-10-01 21:45 UTC
  To: perf group


[-- Attachment #1.1: Type: text/plain, Size: 1478 bytes --]

Hey all,

while I got used to using perf report from the CLI with its quirky interface,
I always wanted to open perf results in kcachegrind. There are some converters
out there, but all of them seem to be quite complicated, and none are shipped
with perf itself.

Since I wanted to play with the python bindings anyway, I decided to write a
converter there. It is currently less than 100 lines of code and seems to
work reasonably well.

Usage:

perf record ...
perf script -s ./perf-callgrind.py > callgrind.out
kcachegrind callgrind.out

Screenshot: http://imgur.com/QHttohs

Caveats:
- I did not do extensive tests
- addr2line using event/sym/begin isn't implemented yet. I have something
locally already, but it breaks the callgraph, so I must be doing something
wrong. I'll fix this and then send this file in as a proper patch to be
included in perf itself.
- missing support for multiple events in a single
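
One possible direction for the multiple-events caveat, as a rough sketch only: the callgrind format allows several event types to be named on the "events:" header line, with one cost column per event on each cost line. The helper names below (make_cost_table, header_line, cost_line) and the event names "Cycles" and "CacheMisses" are hypothetical and not part of the attached script; how the per-event counts would be filled in from the perf samples is left open here.

```python
from collections import defaultdict

def make_cost_table():
    # hypothetical layout: function name -> event name -> number of samples
    return defaultdict(lambda: defaultdict(int))

def header_line(event_names):
    # callgrind header naming one cost column per event type
    return "events: " + " ".join(event_names)

def cost_line(costs, event_names, position=0):
    # one cost line: the position followed by one column per event,
    # using 0 for events that were never sampled in this function
    columns = [str(costs.get(name, 0)) for name in event_names]
    return " ".join([str(position)] + columns)

if __name__ == "__main__":
    events = ["Cycles", "CacheMisses"]  # assumed event names
    table = make_cost_table()
    table["main"]["Cycles"] += 3
    table["main"]["CacheMisses"] += 1
    print(header_line(events))
    print("fn=main")
    print(cost_line(table["main"], events))
```

The column order has to match the order on the "events:" line, which is why the event list is passed explicitly instead of iterating over the dictionary.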

Future work:
- extend the python binding to get access to the header data, esp. the command 
line
- estimate the call count. VTune does this as well; if anyone has some papers
or input on that topic I'd be interested. Essentially, my current idea is to
look at how the callgraph changes between samples. If the leaf is exited and
a known function is then reentered, we can be sure it was called again.
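
A minimal sketch of the call count idea, under stated assumptions: callchains are given root first as lists of function names, all samples come from a single thread, and two consecutive samples with identical stacks are treated as the same invocation. The function names common_prefix_len and estimate_calls are made up for this illustration. The result is only a lower bound, since calls that start and finish between two samples are never seen.

```python
from collections import defaultdict

def common_prefix_len(a, b):
    # number of leading frames shared by two root-first callchains
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def estimate_calls(stacks):
    # stacks: iterable of callchains, each a root-first list of function names
    # returns a map of (caller, callee) -> estimated minimum number of calls
    calls = defaultdict(int)
    previous = []
    for stack in stacks:
        shared = common_prefix_len(previous, stack)
        # every frame past the shared prefix was entered since the last sample,
        # so the edge from its parent frame must have been taken at least once
        for depth in range(max(shared, 1), len(stack)):
            calls[(stack[depth - 1], stack[depth])] += 1
        previous = stack
    return calls

if __name__ == "__main__":
    samples = [
        ["main", "a", "b"],
        ["main", "a", "b"],
        ["main", "a"],       # b was exited
        ["main", "a", "b"],  # b was entered again
        ["main", "c"],
    ]
    for edge, count in sorted(estimate_calls(samples).items()):
        print("%s -> %s: %d" % (edge[0], edge[1], count))
```

In the converter, such an estimate could replace the hard-coded 1 in the "calls=" line, but that wiring is not shown here.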

Feedback welcome!
-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

[-- Attachment #1.2: perf-callgrind.py --]
[-- Type: text/x-python, Size: 3178 bytes --]

# perf script event handlers, generated by perf script -g python
# Licensed under the terms of the GNU GPL License version 2

# The common_* event handler fields are the most useful fields common to
# all events.  They don't necessarily correspond to the 'common_*' fields
# in the format files.  Those fields not available as handler params can
# be retrieved using Python functions of the form common_*(context).
# See the perf-trace-python Documentation for the list of available functions.

import os
import sys
from collections import defaultdict

sys.path.append(os.environ["PERF_EXEC_PATH"] + "/scripts/python/Perf-Trace-Util/lib/Perf/Trace")

from perf_trace_context import *

class Function:
    def __init__(self, dso, name, sym):
        self.cost = 0
        self.dso = dso
        self.name = name
        self.sym = sym
        # maps callee Function -> number of samples seen below this call edge
        self.callees = defaultdict(int)

class DSO:
    def __init__(self):
        self.functions = dict()

# a map of all encountered dso's and the functions therein
# this is done to prevent name clashes
dsos = defaultdict(lambda: DSO())

def addFunction(dsoName, name, sym):
    global dsos
    dso = dsos[dsoName]
    function = dso.functions.get(name, None)
    # create function if it's not yet known
    if not function:
        function = Function(dsoName, name, sym)
        dso.functions[name] = function
    return function

# write the callgrind file format to stdout
def trace_end():
    global dsos

    print("version: 1")
    print("creator: perf-callgrind 0.1")
    print("part: 1")
    # TODO: get access to command line, it's in the perf data header
    #       but not accessible to the scripting backend, is it?
    print("events: Samples")

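    # items() works with both Python 2 and Python 3 builds of perf,
    # iteritems() only exists in Python 2
    for dsoName, dso in dsos.items():
        print("ob=%s" % dsoName)
        # dso.functions is keyed by function name, see addFunction()
        for fnName, function in dso.functions.items():
            print("fn=%s" % fnName)
            # self cost of this function
            print("0 %d" % function.cost)
            for callee, cost in function.callees.items():
                print("cob=%s" % callee.dso)
                print("cfn=%s" % callee.name)
                # the real call count is unknown, so 1 is used as a placeholder
                print("calls=1 0")
                # inclusive cost of the call
                print("0 %d" % cost)
            print("")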

def process_event(event):
    caller = None
    if not event["callchain"]:
        # only add the single symbol where we got the sample, without a backtrace
        dsoName = event.get("dso", "???")
        name = event.get("symbol", "???")
        caller = addFunction(dsoName, name, None)
    else:
        # add a function for every frame in the callchain
        for item in reversed(event["callchain"]):
            dsoName = item.get("dso", "???")
            name = "???"
            if "sym" in item:
                name = item["sym"]["name"]
            function = addFunction(dsoName, name, item.get("sym", None))
            # add current frame to parent's callee list
            if caller is not None:
                caller.callees[function] += 1
            caller = function

    # increase the self cost of the last frame
    # all other frames include it now and kcachegrind will automatically
    # take care of adapting their inclusive cost
    if caller is not None:
        caller.cost += 1
