From: Adrian Hunter <adrian.hunter@intel.com>
To: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>,
Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
Ian Rogers <irogers@google.com>,
linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: Re: [PATCH] perf scripts python: Add a script to run instances of perf script in parallel
Date: Mon, 11 Mar 2024 19:52:01 +0200
Message-ID: <3a92ebdb-8923-46af-a020-0e12233262a9@intel.com>
In-Reply-To: <Ze8ttn4bxBrYi63h@tassilo>
On 11/03/24 18:13, Andi Kleen wrote:
> On Sun, Mar 10, 2024 at 09:35:02PM +0200, Adrian Hunter wrote:
>> Add a Python script to run a perf script command multiple times in
>> parallel, using perf script options --cpu and --time so that each job
>> processes a different chunk of the data.
>>
>> Refer to the script's own help text at the end of the patch for more
>> details.
>>
>> The script is useful for Intel PT traces, which can be efficiently
>> decoded by perf script when split by CPU and/or time ranges. Running
>> jobs in parallel can decrease the overall decoding time.
>
> This only optimizes for the run time of the decoder. Often when you do
> analysis you have a non-trivial part of it in some analysis script too,
> but you currently have no direct / easy way to parallelize that. It would
> be better to support parallel pipelines.
It will parallelize any scripts and/or dlfilters that perf script
itself executes.
>
> TBH I'm not sure the script is worth it. If you need to do parallel
> pipelines (which imho is the common case) it's probably better to just
> write a custom shell script, which is not that difficult.
It can be a pain to figure out how best to split the data if it is not
evenly distributed.
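As an illustration of why the split matters (a minimal sketch, not code
from the patch): equal-width time slices give uneven jobs when events are
clustered, whereas slicing by event count keeps the jobs balanced. The
function name and the sorted timestamp list are hypothetical; a real tool
would have to obtain the event times from the perf.data file, and the
sketch assumes the timestamps are distinct.

```python
def split_by_event_count(timestamps, n_jobs):
    """Return (start, end) time ranges holding roughly equal numbers of events.

    timestamps: sorted list of event times (hypothetical input).
    n_jobs: desired number of parallel jobs.
    """
    if not timestamps or n_jobs < 1:
        return []
    # Never create more jobs than there are events.
    n_jobs = min(n_jobs, len(timestamps))
    ranges = []
    for i in range(n_jobs):
        lo = i * len(timestamps) // n_jobs            # first event of chunk i
        hi = (i + 1) * len(timestamps) // n_jobs - 1  # last event of chunk i
        ranges.append((timestamps[lo], timestamps[hi]))
    return ranges

# Events clustered early and then spread out: each job still gets 4 events.
print(split_by_event_count([0, 1, 2, 3, 100, 200, 300, 400], 2))
```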
The script also has value as a reference or starting point for
users.
> It might be
> better to have a helper that makes writing such scripts easier,
> e.g. figuring out reasonable options for manual parallelization
> based on the input file. I think parts of your script do that, maybe
> it is usable for that.
The --dry-run option shows the perf script commands, but an option
to pipe through another command could be added.
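A rough sketch of what such an option might produce, assuming one command
line per job. build_command() and its pipe_to parameter are hypothetical
and not part of the patch; only the perf script options --time and --cpu
are real.

```python
import shlex

def build_command(perf_args, time_range=None, cpu=None, pipe_to=None):
    """Build one shell command line for a single job.

    perf_args: extra perf script arguments, e.g. ["-i", "perf.data"].
    time_range: optional (start, stop) pair passed to --time.
    cpu: optional CPU number passed to --cpu.
    pipe_to: optional shell command that consumes the job's output
             (hypothetical; inserted verbatim after a pipe).
    """
    cmd = ["perf", "script"] + list(perf_args)
    if time_range is not None:
        cmd.append("--time=%s,%s" % time_range)
    if cpu is not None:
        cmd.append("--cpu=%d" % cpu)
    line = shlex.join(cmd)  # quotes arguments where the shell needs it
    if pipe_to:
        line += " | " + pipe_to
    return line

print(build_command(["-i", "perf.data"], ("1.0", "2.0"), 3, "wc -l"))
```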
>
> Also as a default output it would be better to just merge the
> original output in order and output it on stdout.
That assumes that the output is what perf script itself prints,
and not the output of a script run by perf script.
If the data is split by CPU, it will not be in time order
if it is simply concatenated back together.
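To illustrate the ordering problem (a minimal sketch, not what the patch
does): restoring time order needs a k-way merge keyed on the timestamp
rather than concatenation. The sketch assumes each per-CPU output has
already been parsed into (timestamp, line) tuples sorted by timestamp;
that parsing step and the merge_by_time() name are hypothetical.

```python
import heapq

def merge_by_time(streams):
    """Merge per-CPU outputs into a single time-ordered list of lines.

    streams: list of iterables of (timestamp, line) tuples, each
             already sorted by timestamp (assumed, not checked).
    """
    merged = heapq.merge(*streams, key=lambda rec: rec[0])
    return [line for _, line in merged]

cpu0 = [(1.0, "cpu0 event a"), (3.0, "cpu0 event c")]
cpu1 = [(2.0, "cpu1 event b"), (4.0, "cpu1 event d")]
for line in merge_by_time([cpu0, cpu1]):
    print(line)
```

heapq.merge() is lazy, so the same approach works on large outputs without
loading them fully into memory if the inputs are generators.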
>
> You should probably limit the number of jobs so that each job has
> some minimum length, otherwise on systems with many CPUs there might
> be inefficiently short jobs.
That happens for Intel PT (64 PSB minimum), but could be added
for the normal case also.
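For the normal case, one possible shape for such a limit is sketched
below. This is an assumption-laden illustration, not code from the patch:
limit_jobs() and its parameters are hypothetical, and the minimum job
duration would need to be chosen empirically.

```python
def limit_jobs(total_duration, max_jobs, min_job_duration):
    """Cap the number of jobs so that no job is shorter than a minimum.

    total_duration: length of the trace, in seconds.
    max_jobs: upper bound, e.g. the number of CPUs available.
    min_job_duration: shortest job worth starting, in seconds.
    """
    if min_job_duration <= 0:
        return max(1, max_jobs)
    # Number of jobs of at least the minimum length that fit in the trace.
    by_length = int(total_duration // min_job_duration)
    # Always run at least one job, never more than max_jobs.
    return max(1, min(max_jobs, by_length))

print(limit_jobs(10.0, 64, 1.0))  # a short trace on a machine with many CPUs
```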