From: Yordan Karadzhov <y.karadz@gmail.com>
To: "Tzvetomir Stoyanov (VMware)" <tz.stoyanov@gmail.com>,
acme@kernel.org, olsajiri@gmail.com, irogers@google.com
Cc: rostedt@goodmis.org, linux-trace-devel@vger.kernel.org,
linux-perf-users@vger.kernel.org
Subject: Re: [RFC PATCH 3/3] trace-cruncher: perf example
Date: Fri, 18 Mar 2022 11:52:16 +0200 [thread overview]
Message-ID: <eae0e1d8-ef40-b2eb-b08a-e233a4183373@gmail.com> (raw)
In-Reply-To: <20220224163711.185308-4-tz.stoyanov@gmail.com>
Hi Ceco,
Thanks a lot!
I really like the idea of have a perf sub-module in trace-cruncher and I think your RFC patch-set is a great starting point.
Let's start with some discussion of the structure of the APIs that this module has to contain. See my comment bellow.
On 24.02.22 г. 18:37 ч., Tzvetomir Stoyanov (VMware) wrote:
> Example python program for using trace-cruncher to collect performance
> statistics of a given process.
>
> Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
> ---
> examples/perf_sampling.py | 51 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 51 insertions(+)
> create mode 100755 examples/perf_sampling.py
>
> diff --git a/examples/perf_sampling.py b/examples/perf_sampling.py
> new file mode 100755
> index 0000000..1b57f39
> --- /dev/null
> +++ b/examples/perf_sampling.py
> @@ -0,0 +1,51 @@
> +#!/usr/bin/env python3
> +
> +"""
> +SPDX-License-Identifier: CC-BY-4.0
> +
> +Copyright 2022 VMware Inc, Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
> +"""
> +
> +import sys
> +import time
> +import signal
> +
> +import tracecruncher.perfpy as perf
> +
> +def SortKey(sample):
> + return sample.time()
> +
> +def perf_stop(sig, frame):
> + # Stop collection of performance traces
> + p.stop()
> +
> +if __name__ == "__main__":
> + if len(sys.argv) < 2:
> + print('Usage: ', sys.argv[0], ' [PROCESS]')
> + sys.exit(1)
> +
> + # Create perf sample object for the given process
> + p = perf.sample(pid=int(sys.argv[1]), freq=99)
If you initiate the sampling using the PID of the process, you are limited to tracing only processes that are already
running. Hence, there will be no way to trace the very beginning of the process you are interested in. Let's keep the
current way of initializing (via PID), but make it optional and have a second option that will be to provide a process
name and arguments to be started internally (using fork–>exec).
Also we need a better name for this API. Something that is more coherent with the naming of the equivalent ftracepy APIs.
> + signal.signal(signal.SIGINT, perf_stop)
I would prefer to have the signal handling done internally inside the C code and do not bother the Python user with this.
> + print('Start collecting performance data, press ctrl+c to stop')
> + # Start collecting performance traces
> + p.start()
I wonder what is the reason for having the constructor of the perf instance above and 'start()' as separate APIs? Do you
have in mind some use case in which we have to create the instance, do something important and only then start()?
Also in the current implementation, the only way to stop the sampling is 'ctrl+c'. You have the 'stop()' API but the
user has no way of really calling it, since the execution is blocked inside 'start()' which will never return if the
sampling is running.
But if the sampling runs on its own (started using fork->exec) then the stop() API will be indeed useful. Note that in
this case you will have to also provide 'destroy' method for the 'perf' object, because we have to guaranty that the
sampling will stop when the execution of the user script exits.
cheers,
Yordan
> + # wait for ctrl+c
> + signal.pause()
> + # Get collected samples
> + samples = p.get_samples()
> + # Sort the list based on the timestamp
> + samples.sort(key=SortKey)
> + time = 0
> + ip_count = 0
> + for s in samples:
> + # Print PID, TID, time and trace depth of each sample
> + if time == 0:
> + time = s.time()
> + print("{0} {1} ({2}), +{3}:".format(s.ip(), s.tid(), s.tid_comm(), s.time() - time))
> + ips = s.stack()
> + ip_count += len(ips)
> + for ip in reversed(ips):
> + # Print stack trace of the sample
> + print("\t{0}".format(ip))
> + print("\nCollected {0} samples, {1} ip traces".format(len(samples), ip_count))
> \ No newline at end of file
next prev parent reply other threads:[~2022-03-18 9:52 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-24 16:37 [RFC PATCH 0/3] trace-cruncher: Initial support for perf Tzvetomir Stoyanov (VMware)
2022-02-24 16:37 ` [RFC PATCH 1/3] trace-cruncher: Logic for resolving address to function name Tzvetomir Stoyanov (VMware)
2022-02-24 16:37 ` [RFC PATCH 2/3] trace-cruncher: Support for perf Tzvetomir Stoyanov (VMware)
2022-02-25 14:51 ` Arnaldo Carvalho de Melo
2022-02-25 15:38 ` Tzvetomir Stoyanov
2022-02-24 16:37 ` [RFC PATCH 3/3] trace-cruncher: perf example Tzvetomir Stoyanov (VMware)
2022-03-18 9:52 ` Yordan Karadzhov [this message]
2022-02-24 16:52 ` [RFC PATCH 0/3] trace-cruncher: Initial support for perf Ian Rogers
2022-02-25 10:21 ` Tzvetomir Stoyanov
2022-02-25 14:06 ` Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=eae0e1d8-ef40-b2eb-b08a-e233a4183373@gmail.com \
--to=y.karadz@gmail.com \
--cc=acme@kernel.org \
--cc=irogers@google.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux-trace-devel@vger.kernel.org \
--cc=olsajiri@gmail.com \
--cc=rostedt@goodmis.org \
--cc=tz.stoyanov@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.