From: Jiri Olsa <jolsa@redhat.com>
To: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: linux-kernel@vger.kernel.org, Jiri Olsa <jolsa@kernel.org>,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: Re: perf script, libperf: python binding bug (bytearrays vs. strings)
Date: Mon, 28 Sep 2020 12:08:08 +0200
Message-ID: <20200928100808.GA3517742@krava>
In-Reply-To: <20200927074312.GA3664097@laniakea>
On Sun, Sep 27, 2020 at 09:43:12AM +0200, Hagen Paul Pfeifer wrote:
> Hello Jiri, Arnaldo,
>
> after updating Debian (probably with the advent of Python 3.8.5, I am guessing)
> I get weird behavior with Python scripting. The Python type of prev_comm and
> next_comm is no longer a string but a bytearray. These are incompatible
> types, so existing scripts no longer work. NOTE: common_comm is still fine
> (see swapper & mutex-thread-co), so it must be treated differently internally
> compared to prev_comm and next_comm, which possibly shows a way to solve
> this problem!
>
> After bisecting the kernel (perf) all the way back to v5.6 the problem still
> exists. Compiling perf with PYTHON=python2 does not show any problems - no
> problems in the Python2 world. So I assume something changed internally with
> Python 3.8.5 (or some other helper library). I assume the cause has existed
> in perf forever but is now triggered by the new Python3 version.
>
> How to reproduce:
>
> make PYTHON=python3
> ./perf record -e sched:sched_switch -a -- sleep 5
> ./perf script --gen-script py
> ./perf script -s ./perf-script.py
>
> [..]
> sched__sched_switch 7 563231.759525792 0 swapper prev_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), prev_pid=0, prev_prio=120, prev_state=, next_comm=bytearray(b'mutex-thread-co\x00'), next_pid=3447985, next_prio=120
> Sample: {addr=0, cpu=7, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744072189289569, period=1, phys_addr=0, pid=0, tid=0, time=563231759525792, transaction=0, values=[(0, 0)], weight=0}
>
> sched__sched_switch 7 563231.759582596 3447985 mutex-thread-co prev_comm=bytearray(b'mutex-thread-co\x00'), prev_pid=3447985, prev_prio=120, prev_state=, next_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), next_pid=0, next_prio=120
> Sample: {addr=0, cpu=7, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744072189289569, period=1, phys_addr=0, pid=3447983, tid=3447985, time=563231759582596, transaction=0, values=[(0, 0)], weight=0}
>
>
> See =bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00') - should be swapper/7
>
>
> Note: the byte array has the length of 16 - exactly like the kernel
> (TASK_COMM_LEN). I assume this is somehow copied directly into the variables
> and not stringified anymore.
>
>
> Even worse: I discovered bytearrays which are not correctly "memseted":
> bytearray(b'chrome\x00sandbox\x00\x00')
>
> chrome should be the comm name, but the buffer somehow holds 'chrome'
> followed by 'sandbox'. See the null bytes in between.
>
> Jiri, Arnaldo - I tried to fix this, but the Python binding magic for the
> automatically generated events is hard to get comfortable with.
>
> Hagen
>
> PS: I assume this fix is also relevant for the stable kernels.
>
The patch below fixes it for me, but it seems strange that this was
working until now. Maybe you're the only one using this
with python3 ;-)

jirka
---
diff --git a/tools/perf/util/print_binary.c b/tools/perf/util/print_binary.c
index 599a1543871d..13fdc51c61d9 100644
--- a/tools/perf/util/print_binary.c
+++ b/tools/perf/util/print_binary.c
@@ -50,7 +50,7 @@ int is_printable_array(char *p, unsigned int len)
 	len--;
-	for (i = 0; i < len; i++) {
+	for (i = 0; i < len && p[i]; i++) {
 		if (!isprint(p[i]) && !isspace(p[i]))
 			return 0;
 	}