From: Anup Sharma <anupnewsmail@gmail.com>
To: Namhyung Kim <namhyung@kernel.org>
Cc: Anup Sharma <anupnewsmail@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/9] scripts: python: Implement parsing of input data in convertPerfScriptProfile
Date: Thu, 6 Jul 2023 01:26:50 +0530
Message-ID: <ZKXLAqWF29sxCS1B@yoga>
In-Reply-To: <CAM9d7cj1bWWM7j5LCTpDQqLXmn5UH1mkCvZ-k3VEXJb7S2+wxg@mail.gmail.com>

On Fri, Jun 23, 2023 at 05:03:12PM -0700, Namhyung Kim wrote:
> Hi Anup,
> 
> On Wed, Jun 21, 2023 at 12:41 PM Anup Sharma <anupnewsmail@gmail.com> wrote:
> >
> > The lines variable is created by splitting the profile string into individual
> > lines. It allows for iterating over each line for processing.
> >
> > The line is considered the start of a sample. It is matched against a regular
> > expression pattern to extract relevant information such as before_time_stamp,
> > time_stamp, threadNamePidAndTidMatch, threadName, pid, and tid.
> >
> > The stack frames of the current sample are then parsed in a nested loop.
> > Each stackFrameLine is matched against a regular expression pattern to
> > extract rawFunc and mod information.
> >
> > Also fixed a few checkpatch warnings.
> >
> > Signed-off-by: Anup Sharma <anupnewsmail@gmail.com>
> > ---
> >  .../scripts/python/firefox-gecko-converter.py | 62 ++++++++++++++++++-
> >  1 file changed, 60 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> > index 0ff70c0349c8..e5bc7a11c3e6 100644
> > --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> > +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> > @@ -1,4 +1,5 @@
> >  #!/usr/bin/env python3
> > +# SPDX-License-Identifier: GPL-2.0
> 
> Please put this line in the first commit.

Sure, this is done in the latest version.

> >  import re
> >  import sys
> >  import json
> > @@ -14,13 +15,13 @@ def isPerfScriptFormat(profile):
> >      firstLine = profile[:profile.index('\n')]
> >      return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
> >
> > -def convertPerfScriptProfile(profile):
> > +def convertPerfScriptProfile(profile):
> 
> You'd better configure your editor to warn or even fix
> the trailing whitespace automatically.

Thanks, I followed your advice and configured nvim to highlight trailing
whitespace. It has significantly improved my workflow.
Here's the snippet I added to my vimrc file:

highlight ExtraWhitespace ctermbg=white guibg=white
match ExtraWhitespace /\s\+$/
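
Outside the editor, the same check can be scripted. Below is a minimal
Python sketch that reports which lines end in spaces or tabs. The helper
name find_trailing_whitespace is hypothetical and only for illustration;
it is not part of the patch or of checkpatch.

```python
import re

# Spaces or tabs immediately before the end of a line.
TRAILING_WS_RE = re.compile(r'[ \t]+$')


def find_trailing_whitespace(text):
    """Return the 1-based numbers of lines that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.split('\n'), start=1)
        if TRAILING_WS_RE.search(line)
    ]
```

Running it over a file's contents before sending a patch gives a quick
list of lines to clean up.
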

> Thanks,
> Namhyung
> 
> 
> >
> >          def addSample(threadName, stackArray, time):
> >              nonlocal name
> >              if name != threadName:
> >                  name = threadName
> > -            # TODO:
> > +            # TODO:
> >              # get_or_create_stack will create a new stack if it doesn't exist, or return the existing stack if it does.
> >              # get_or_create_frame will create a new frame if it doesn't exist, or return the existing frame if it does.
> >              stack = reduce(lambda prefix, stackFrame: get_or_create_stack(get_or_create_frame(stackFrame), prefix), stackArray, None)
> > @@ -54,3 +55,60 @@ def convertPerfScriptProfile(profile):
> >              thread = _createtread(threadName, pid, tid)
> >              threadMap[tid] = thread
> >          thread['addSample'](threadName, stack, time_stamp)
> > +
> > +    lines = profile.split('\n')
> > +
> > +    line_index = 0
> > +    startTime = 0
> > +    while line_index < len(lines):
> > +        line = lines[line_index]
> > +        line_index += 1
> > +        # perf script --header outputs header lines beginning with #
> > +        if line == '' or line.startswith('#'):
> > +            continue
> > +
> > +        sample_start_line = line
> > +
> > +        sample_start_match = re.match(r'^(.*)\s+([\d.]+):', sample_start_line)
> > +        if not sample_start_match:
> > +            print(f'Could not parse line as the start of a sample in the "perf script" profile format: "{sample_start_line}"')
> > +            continue
> > +
> > +        before_time_stamp = sample_start_match[1]
> > +        time_stamp = float(sample_start_match[2]) * 1000
> > +        threadNamePidAndTidMatch = re.match(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b', before_time_stamp)
> > +
> > +        if not threadNamePidAndTidMatch:
> > +            print('Could not parse line as the start of a sample in the "perf script" profile format: "%s"' % sample_start_line)
> > +            continue
> > +        threadName = threadNamePidAndTidMatch[1].strip()
> > +        pid = int(threadNamePidAndTidMatch[2] or 0)
> > +        tid = int(threadNamePidAndTidMatch[3] or 0)
> > +        if startTime == 0:
> > +            startTime = time_stamp
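
To make the two-step match above easier to try in isolation, here is a
minimal standalone sketch. It reuses the two regular expressions from the
hunk unchanged. The function name parse_sample_header and its tuple return
value are assumptions made for illustration and are not part of the patch.

```python
import re

# Same two patterns as in the quoted hunk.
SAMPLE_START_RE = re.compile(r'^(.*)\s+([\d.]+):')
THREAD_RE = re.compile(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b')


def parse_sample_header(line):
    """Return (thread_name, pid, tid, time_ms), or None if the line does not match."""
    start = SAMPLE_START_RE.match(line)
    if not start:
        return None
    # Everything before the timestamp holds the thread name and pid/tid.
    thread = THREAD_RE.match(start[1])
    if not thread:
        return None
    time_ms = float(start[2]) * 1000  # seconds to milliseconds
    pid = int(thread[2] or 0)         # pid is optional in the input
    tid = int(thread[3] or 0)
    return thread[1].strip(), pid, tid, time_ms
```

For example, parse_sample_header('firefox 1234/5678 12345.678901: 1000 cycles:')
gives the thread name 'firefox', pid 1234, tid 5678 and the timestamp in
milliseconds, while a header line starting with '#' gives None.
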
> > +        # Parse the stack frames of the current sample in a nested loop.
> > +        stack = []
> > +        while line_index < len(lines):
> > +            stackFrameLine = lines[line_index]
> > +            line_index += 1
> > +            if stackFrameLine.strip() == '':
> > +                # Sample ends.
> > +                break
> > +            stackFrameMatch = re.match(r'^\s*(\w+)\s*(.+) \(([^)]*)\)', stackFrameLine)
> > +            if stackFrameMatch:
> > +                rawFunc = stackFrameMatch[2]
> > +                mod = stackFrameMatch[3]
> > +                rawFunc = re.sub(r'\+0x[\da-f]+$', '', rawFunc)
> > +
> > +            if rawFunc.startswith('('):
> > +                continue # skip process names
> > +
> > +            if mod:
> > +                # If we have a module name, provide it.
> > +                # The code processing the profile will search for
> > +                # "functionName (in libraryName)" using a regexp,
> > +                # and automatically create the library information.
> > +                rawFunc += f' (in {mod})'
> > +
> > +            stack.append(rawFunc)
> > +
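
The per-frame step of the nested loop can also be sketched on its own.
The two patterns are taken from the hunk above. The function name
parse_stack_frame is hypothetical, and returning None for a line that does
not match the frame pattern is an assumption of this sketch, not
something the patch specifies.

```python
import re

# Same patterns as in the quoted hunk.
FRAME_RE = re.compile(r'^\s*(\w+)\s*(.+) \(([^)]*)\)')
OFFSET_RE = re.compile(r'\+0x[\da-f]+$')


def parse_stack_frame(line):
    """Return 'func' or 'func (in module)', or None if the line is skipped."""
    match = FRAME_RE.match(line)
    if not match:
        return None  # assumption: frame lines that do not parse are skipped
    raw_func = OFFSET_RE.sub('', match[2])  # drop the trailing +0x... offset
    mod = match[3]
    if raw_func.startswith('('):
        return None  # skip process names, as in the quoted hunk
    if mod:
        # "functionName (in libraryName)" is the form the consumer searches for.
        raw_func += f' (in {mod})'
    return raw_func
```

A frame line such as '    7f1e2a main+0x1a (/usr/bin/app)' becomes
'main (in /usr/bin/app)'.
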
> > --
> > 2.34.1
> >

