* RFC: perf python script to create callgrind file for kcachegrind visualization
@ 2015-10-01 21:45 Milian Wolff
From: Milian Wolff @ 2015-10-01 21:45 UTC (permalink / raw)
To: perf group
[-- Attachment #1.1: Type: text/plain, Size: 1478 bytes --]
Hey all,
while I got used to using perf report from the CLI with its quirky interface,
I always wanted to open perf results in kcachegrind. There are some converters
out there, but all of them seem to be quite complicated, and none are shipped
with perf itself.
Since I wanted to play with the python bindings anyways, I decided to write a
converter there. It's currently at less than 100 lines of code and seems to
work reasonably well.
Usage:
perf record ...
perf script -s ./perf-callgrind.py > callgrind.out
kcachegrind callgrind.out
Screenshot: http://imgur.com/QHttohs
Caveats:
- I did not do extensive tests
- addr2line using event/sym/begin isn't implemented yet. I have something
locally already, but it breaks the callgraph, so I must be doing something
wrong. I'll fix this and then send this file in as a proper patch to be
included in perf itself.
- missing support for multiple events in a single
Future work:
- extend the python binding to get access to the header data, esp. the command
line
- estimate the call count. VTune does this as well; if anyone has some papers
or input on that topic, I'd be interested. Essentially, my current idea is to
look at how the callgraph changes between samples: if the leaf is exited and a
known function is then re-entered, we can be sure it was called again.
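A minimal sketch of that call count idea, assuming each sample's callchain is already available as a root-first list of function names. The helper name `estimate_calls` and the sample data are hypothetical and not part of perf. It only yields a lower bound, since two identical consecutive stacks may still belong to different calls.

```python
from collections import defaultdict


def estimate_calls(samples):
    """Estimate lower-bound call counts from consecutive call stacks.

    `samples` is a sequence of root-first stacks (lists of function names).
    A frame is counted as a new call whenever it lies beyond the prefix
    that the current stack shares with the previous one.
    """
    calls = defaultdict(int)
    previous = []
    for stack in samples:
        # Length of the prefix shared with the previous sample's stack.
        common = 0
        for old, new in zip(previous, stack):
            if old != new:
                break
            common += 1
        # Frames beyond the shared prefix were entered since the last sample.
        for depth in range(common, len(stack)):
            caller = stack[depth - 1] if depth > 0 else None
            calls[(caller, stack[depth])] += 1
        previous = stack
    return dict(calls)


samples = [
    ["main", "a", "b"],
    ["main", "a"],       # b was exited
    ["main", "a", "b"],  # b re-entered: counts as a second call
    ["main", "c"],
]
estimated = estimate_calls(samples)
```

Here `estimated[("a", "b")]` ends up as 2, because `b` disappears from the stack and then shows up again under the same caller.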
Feedback welcome!
--
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
[-- Attachment #1.2: perf-callgrind.py --]
[-- Type: text/x-python, Size: 3178 bytes --]
# perf script event handlers, generated by perf script -g python
# Licensed under the terms of the GNU GPL License version 2
# The common_* event handler fields are the most useful fields common to
# all events. They don't necessarily correspond to the 'common_*' fields
# in the format files. Those fields not available as handler params can
# be retrieved using Python functions of the form common_*(context).
# See the perf-trace-python Documentation for the list of available functions.
import os
import sys
import json
from collections import defaultdict
sys.path.append(os.environ["PERF_EXEC_PATH"] + "/scripts/python/Perf-Trace-Util/lib/Perf/Trace")
from perf_trace_context import *

class Function:
    def __init__(self, dso, name, sym):
        self.cost = 0
        self.dso = dso
        self.name = name
        self.sym = sym
        self.callees = defaultdict(lambda: 0)

class DSO:
    def __init__(self):
        self.functions = dict()

# a map of all encountered dso's and the functions therein
# this is done to prevent name clashes
dsos = defaultdict(lambda: DSO())

def addFunction(dsoName, name, sym):
    global dsos
    dso = dsos[dsoName]
    function = dso.functions.get(name, None)
    # create function if it's not yet known
    if not function:
        function = Function(dsoName, name, sym)
        dso.functions[name] = function
    return function

# write the callgrind file format to stdout
def trace_end():
    global dsos
    print("version: 1")
    print("creator: perf-callgrind 0.1")
    print("part: 1")
    # TODO: get access to command line, it's in the perf data header
    #       but not accessible to the scripting backend, is it?
    print("events: Samples")
    # .items() is used instead of .iteritems() so this runs on Python 2 and 3
    for name, dso in dsos.items():
        print("ob=%s" % name)
        for sym, function in dso.functions.items():
            print("fn=%s" % sym)
            # self cost of this function
            print("0 %d" % function.cost)
            for callee, cost in function.callees.items():
                print("cob=%s" % callee.dso)
                print("cfn=%s" % callee.name)
                # the call count is unknown, so a placeholder of 1 is written
                print("calls=1 0")
                print("0 %d" % cost)
            print("")

def process_event(event):
    caller = None
    if not event["callchain"]:
        # only add the single symbol where we got the sample, without a backtrace
        dsoName = event.get("dso", "???")
        name = event.get("symbol", "???")
        caller = addFunction(dsoName, name, None)
    else:
        # add a function for every frame in the callchain
        for item in reversed(event["callchain"]):
            dsoName = item.get("dso", "???")
            name = "???"
            if "sym" in item:
                name = item["sym"]["name"]
            function = addFunction(dsoName, name, item.get("sym", None))
            # add current frame to parent's callee list
            if caller is not None:
                caller.callees[function] += 1
            caller = function

    # increase the self cost of the last frame
    # all other frames include it now and kcachegrind will automatically
    # take care of adapting their inclusive cost
    if caller is not None:
        caller.cost += 1
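
The aggregation done by the script can be tried without perf. The following is a minimal sketch under stated assumptions: callchains are given leaf-first (which is what the script assumes when it reverses them), frames are reduced to `(dso, symbol)` tuples, and the helpers `aggregate` and `to_callgrind` as well as the sample data are hypothetical and not part of the script or of perf.

```python
from collections import defaultdict


def aggregate(callchains):
    """Collect self cost per function and sample counts per caller/callee edge.

    Each callchain is a leaf-first list of (dso, symbol) frames.
    """
    self_cost = defaultdict(int)
    edges = defaultdict(int)
    for chain in callchains:
        caller = None
        # Walk from the outermost caller down to the sampled leaf.
        for frame in reversed(chain):
            if caller is not None:
                edges[(caller, frame)] += 1
            caller = frame
        # Only the leaf frame receives self cost.
        if caller is not None:
            self_cost[caller] += 1
    return dict(self_cost), dict(edges)


def to_callgrind(self_cost, edges):
    """Render the aggregated data as callgrind-format text."""
    lines = ["version: 1", "creator: sketch", "events: Samples"]
    functions = set(self_cost)
    for caller, callee in edges:
        functions.update((caller, callee))
    for dso, name in sorted(functions):
        lines += ["ob=%s" % dso, "fn=%s" % name,
                  "0 %d" % self_cost.get((dso, name), 0)]
        for (caller, callee), cost in sorted(edges.items()):
            if caller == (dso, name):
                # The real call count is unknown, so 1 is a placeholder.
                lines += ["cob=%s" % callee[0], "cfn=%s" % callee[1],
                          "calls=1 0", "0 %d" % cost]
        lines.append("")
    return "\n".join(lines)


chains = [
    [("libfoo.so", "leaf"), ("app", "main")],
    [("libfoo.so", "leaf"), ("app", "main")],
    [("app", "main")],
]
self_cost, edges = aggregate(chains)
output = to_callgrind(self_cost, edges)
```

With these three samples, `main` gets a self cost of 1, `leaf` a self cost of 2, and the `main` to `leaf` edge an inclusive cost of 2.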