linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Yicong Yang <yangyicong@hisilicon.com>,
	James Clark <james.clark@linaro.org>,
	"Dr. David Alan Gilbert" <linux@treblig.org>,
	Levi Yun <yeoreum.yun@arm.com>, Ze Gao <zegao2021@gmail.com>,
	Weilin Wang <weilin.wang@intel.com>, Xu Yang <xu.yang_2@nxp.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Howard Chu <howardchu95@gmail.com>
Subject: Re: [PATCH v2 11/11] perf python tracepoint: Switch to using parse_events
Date: Tue, 11 Mar 2025 15:49:37 -0300	[thread overview]
Message-ID: <Z9CFwbXSGnGLn0lA@x1> (raw)
In-Reply-To: <Z9BtpXnqSUYKr1TK@x1>

On Tue, Mar 11, 2025 at 02:06:45PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Mar 11, 2025 at 12:32:05PM -0300, Arnaldo Carvalho de Melo wrote:
> >                 if (field->flags & TEP_FIELD_IS_STRING &&
> >                     is_printable_array(data + offset, len)) {
> >                         ret = PyUnicode_FromString((char *)data + offset);
> >                 } else {
> >                         ret = PyByteArray_FromStringAndSize((const char *) data + offset, len);
> >                         field->flags &= ~TEP_FIELD_IS_STRING;
> >                 }
> > 
> > 
> > For some reason in sessions where bytearrays are returned, and all comes
> > as bytearrays, the problem is not noticed.
> > 
> > But when it comes as a string it breaks after a short time, /me
> > scratches head...
> 
> I think I'm getting closer, with some debugging sprinkled in the python
> binding:
> 
> ip 0 pid=74131 tid=74131 cpu=3
>  ( field 'prev_comm' ret=0x7f0a66a1b970 )  ( field 'prev_pid' ret=0x7f0a66b1bed0 )  ( field 'prev_prio' ret=0x7f0a74f60d08 )  ( field 'prev_state' ret=0x7f0a74f60e08 ) time 4503271651784 prev_comm=kworker/u112:14 prev_pid=74131 prev_prio=120 prev_state=0x80 ==>
>  ( field 'next_comm' ret=0x7f0a66a1b970 )  ( field 'next_pid' ret=0x7f0a74f5fe08 )  ( field 'next_prio' ret=0x7f0a74f60d08 ) next_comm=swapper/3 next_pid=0 next_prio=120
> ip 0 pid=80209 tid=80209 cpu=4
>  ( field 'prev_comm' ret=0x7f0a66a1b970 )  ( field 'prev_pid' ret=0x7f0a66b1bed0 )  ( field 'prev_prio' ret=0x7f0a74f60d08 )  ( field 'prev_state' ret=0x7f0a74f60e08 ) time 4503280531143 prev_comm=kworker/u112:3 prev_pid=80209 prev_prio=120 prev_state=0x80 ==>
>  ( field 'next_comm' ret=0x7f0a66a1b970 )  ( field 'next_pid' ret=0x7f0a74f5fe08 )  ( field 'next_prio' ret=0x7f0a74f60d08 ) next_comm=swapper/4 next_pid=0 next_prio=120
> ip 0 pid=74131 tid=74131 cpu=5
>  ( XXX '�!' len=16)  ( field 'prev_comm' ret=(nil) ) Traceback (most recent call last):
>   File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module>
>     main()
>   File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 40, in main
>     event.prev_comm,
>     ^^^^^^^^^^^^^^^
> AttributeError: 'perf.sample_event' object has no attribute 'prev_comm'
> 
> 
> So the size of the field, in the tracepoint format file is 16, its a
> comm:
> 
> root@number:/home/acme/git/perf-tools-next# grep '_comm\[' /sys/kernel/tracing/events/sched/sched_switch/format 
> 	field:char prev_comm[16];	offset:8;	size:16;	signed:0;
> 	field:char next_comm[16];	offset:40;	size:16;	signed:0;
> root@number:/home/acme/git/perf-tools-next#
> 
> But:
> 
> root@number:/home/acme/git/perf-tools-next# cat /proc/74131/comm
> kworker/u112:14-events_unbound
> root@number:/home/acme/git/perf-tools-next#
> 
> root@number:/home/acme/git/perf-tools-next# echo -n "kworker/u112:14-events_unbound" | wc -c
> 30
> root@number:/home/acme/git/perf-tools-next#
> 
> Which should explain that:
> 
> ( XXX '�!' len=16)  ( field 'prev_comm' ret=(nil) )
> 
> That is printed by:
> 
> +++ b/tools/perf/util/python.c
> @@ -318,16 +318,17 @@ tracepoint_field(const struct pyrf_event *pe, struct tep_format_field *field)
>                         if (tep_field_is_relative(field->flags))
>                                 offset += field->offset + field->size;
>                 }
> -               if (field->flags & TEP_FIELD_IS_STRING &&
> -                   is_printable_array(data + offset, len)) {
> +               if (field->flags & TEP_FIELD_IS_STRING) {
>                         ret = PyUnicode_FromString((char *)data + offset);
> +                       if (ret == NULL) {
> +                               printf(" ( XXX '%.*s' len=%d) ", len, (char *)data + offset, len); fflush(stdout);
> +                       }
>                 } else {
>                         ret = PyByteArray_FromStringAndSize((const char *) data + offset, len);
> -                       field->flags &= ~TEP_FIELD_IS_STRING;
>                 }
> 
> So now trying to figure out why when the comm lenght is more than 16 the
> raw_data appears to contain trash...

So it seems to be something just in the python binding, as perf trace
seems to handle it well:

 ( field 'prev_comm' ret=0x7f7c31f65110, raw_size=68 )  ( field 'prev_pid' ret=0x7f7c23b1bed0, raw_size=68 )  ( field 'prev_prio' ret=0x7f7c239c0030, raw_size=68 )  ( field 'prev_state' ret=0x7f7c239c0250, raw_size=68 ) time 14771421785867 prev_comm= prev_pid=1919907691 prev_prio=796026219 prev_state=0x303a32313175 ==>
 ( XXX '��' len=16, raw_size=68)  ( field 'next_comm' ret=(nil), raw_size=68 ) Traceback (most recent call last):
  File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module>
    main()
  File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 46, in main
    event.next_comm,
    ^^^^^^^^^^^^^^^
AttributeError: 'perf.sample_event' object has no attribute 'next_comm'
root@number:/home/acme/git/perf-tools-next# cat /proc/125355/comm
kworker/u112:0-i915
root@number:/home/acme/git/perf-tools-next# 
root@number:/home/acme/git/perf-tools-next# 
root@number:/home/acme/git/perf-tools-next# perf trace -e sched:sched_switch -p 125355
     0.000 sched:sched_switch(prev_comm: "kworker/u112:0", prev_pid: 125355 (kworker/u112:0-), prev_prio: 120, prev_state: 128, next_comm: "swapper/6", next_prio: 120)
^Croot@number:/home/acme/git/perf-tools-next#

I.e. it chops up the prev_comm size to what is specified in the
tracepoint format.

And that sample->raw_size is the same accross the sched:sched_switch
raw_datas (seemingly suboptimal, most are less than 16 bytes, but
probably its not guaranteed that the \0 will be there, so copy all the
16 bytes).

Now to try to figure out why simply using PyUnicode_FromStringAndSize
doesn't work...

- Arnaldo

  reply	other threads:[~2025-03-11 18:49 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-28 22:22 [PATCH v2 00/11] Python improvements for a real use of parse_events Ian Rogers
2025-02-28 22:22 ` [PATCH v2 01/11] perf debug: Avoid stack overflow in recursive error message Ian Rogers
2025-03-10 20:22   ` Arnaldo Carvalho de Melo
2025-03-10 20:22     ` Arnaldo Carvalho de Melo
2025-02-28 22:22 ` [PATCH v2 02/11] perf evlist: Add success path to evlist__create_syswide_maps Ian Rogers
2025-03-10 20:31   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 03/11] perf evsel: tp_format accessing improvements Ian Rogers
2025-03-10 20:43   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 04/11] perf python: Add evlist enable and disable methods Ian Rogers
2025-03-10 20:45   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 05/11] perf python: Add member access to a number of evsel variables Ian Rogers
2025-03-10 20:45   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 06/11] perf python: Add optional cpus and threads arguments to parse_events Ian Rogers
2025-03-10 20:46   ` Arnaldo Carvalho de Melo
2025-03-10 22:12   ` Namhyung Kim
2025-03-11  0:28     ` Ian Rogers
2025-03-12  1:51       ` Namhyung Kim
2025-02-28 22:23 ` [PATCH v2 07/11] perf python: Update ungrouped evsel leader in clone Ian Rogers
2025-03-10 20:53   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 08/11] perf python: Avoid duplicated code in get_tracepoint_field Ian Rogers
2025-03-10 20:55   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 09/11] perf python: Add evlist all_cpus accessor Ian Rogers
2025-03-10 20:56   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 10/11] perf python: Add evlist.config to set up record options Ian Rogers
2025-03-10 20:59   ` Arnaldo Carvalho de Melo
2025-02-28 22:23 ` [PATCH v2 11/11] perf python tracepoint: Switch to using parse_events Ian Rogers
2025-03-10 21:15   ` Arnaldo Carvalho de Melo
2025-03-10 21:16     ` Arnaldo Carvalho de Melo
2025-03-10 21:17       ` Ian Rogers
2025-03-10 21:28         ` Arnaldo Carvalho de Melo
2025-03-11 15:32           ` Arnaldo Carvalho de Melo
2025-03-11 17:06             ` Arnaldo Carvalho de Melo
2025-03-11 18:49               ` Arnaldo Carvalho de Melo [this message]
2025-03-11 21:52                 ` Arnaldo Carvalho de Melo
2025-03-12  1:54                   ` Namhyung Kim
2025-03-10 22:17   ` Namhyung Kim
2025-03-10 23:40     ` Ian Rogers
2025-03-12  1:53       ` Namhyung Kim
2025-03-07  2:07 ` [PATCH v2 00/11] Python improvements for a real use of parse_events Howard Chu
2025-03-12 19:53 ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z9CFwbXSGnGLn0lA@x1 \
    --to=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=howardchu95@gmail.com \
    --cc=irogers@google.com \
    --cc=james.clark@linaro.org \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=linux@treblig.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=weilin.wang@intel.com \
    --cc=xu.yang_2@nxp.com \
    --cc=yangyicong@hisilicon.com \
    --cc=yeoreum.yun@arm.com \
    --cc=zegao2021@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).