From: Hagen Paul Pfeifer <hagen@jauu.net>
To: linux-kernel@vger.kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>, Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: perf script, libperf: python binding bug (bytearrays vs. strings)
Date: Sun, 27 Sep 2020 09:43:12 +0200
Message-ID: <20200927074312.GA3664097@laniakea>
Hello Jiri, Arnaldo,
after updating Debian (probably with the arrival of Python 3.8.5, I am guessing)
I get weird behavior with Python scripting. The error is that the Python type
of prev_comm and next_comm is no longer a string but a bytearray. These
are incompatible types, so scripts no longer work. NOTE: common_comm is
still fine (see swapper & mutex-thread-co), so it must be treated differently
internally than prev_comm and next_comm, which possibly shows a way to solve
this problem!
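To illustrate why the type change breaks scripts, here is a minimal sketch
in plain Python 3. It does not use perf; the values are copied from the
output in this report, and the variable names equal_to_str, hash_error and
concat_error are only illustrative.

```python
# Values as a handler would receive them (copied from the output in this report).
common_comm = "swapper"
prev_comm = bytearray(b"swapper/7\x00\x00\x00\x00\x00\x00\x00")

# Comparing a bytearray with a str does not raise; it is simply never equal.
equal_to_str = (prev_comm == "swapper/7")

# Using the value as a dict key raises, because bytearray is unhashable.
try:
    {prev_comm: 1}
    hash_error = None
except TypeError as exc:
    hash_error = str(exc)

# Concatenating it with a str raises as well.
try:
    "comm: " + prev_comm
    concat_error = None
except TypeError as exc:
    concat_error = str(exc)

print(type(common_comm).__name__, type(prev_comm).__name__)
print(equal_to_str, hash_error, concat_error)
```

So a script that filters on a comm name, or counts events per comm in a
dict, either silently matches nothing or dies with a TypeError.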
Even after bisecting the kernel (perf) back to v5.6, the problem still exists.
Compiling perf with PYTHON=python2 does not show any problems - no problems in
the Python2 world. So I assume something changed internally with Python 3.8.5
(or another helper library). I assume the cause has existed in perf forever but
is only now triggered by the new Python3 version.
How to reproduce:
make PYTHON=python3
./perf record -e sched:sched_switch -a -- sleep 5
./perf script --gen-script py
./perf script -s ./perf-script.py
[..]
sched__sched_switch 7 563231.759525792 0 swapper prev_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), prev_pid=0, prev_prio=120, prev_state=, next_comm=bytearray(b'mutex-thread-co\x00'), next_pid=3447985, next_prio=120
Sample: {addr=0, cpu=7, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744072189289569, period=1, phys_addr=0, pid=0, tid=0, time=563231759525792, transaction=0, values=[(0, 0)], weight=0}
sched__sched_switch 7 563231.759582596 3447985 mutex-thread-co prev_comm=bytearray(b'mutex-thread-co\x00'), prev_pid=3447985, prev_prio=120, prev_state=, next_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00'), next_pid=0, next_prio=120
Sample: {addr=0, cpu=7, datasrc=84410401, datasrc_decode=N/A|SNP N/A|TLB N/A|LCK N/A, ip=18446744072189289569, period=1, phys_addr=0, pid=3447983, tid=3447985, time=563231759582596, transaction=0, values=[(0, 0)], weight=0}
See prev_comm=bytearray(b'swapper/7\x00\x00\x00\x00\x00\x00\x00') - it should be swapper/7
Note: the byte array has a length of 16 - exactly like TASK_COMM_LEN in the
kernel. I assume the buffer is somehow copied directly into the variables
and no longer stringified.
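Until this is fixed in perf itself, a script-side workaround might look like
the sketch below. comm_to_str is a hypothetical helper name, not part of
perf's Python API, and decoding as UTF-8 with replacement is an assumption.

```python
def comm_to_str(value):
    """Return a comm field as str, whether it arrives as str or as a raw buffer."""
    if isinstance(value, (bytes, bytearray)):
        # Keep only what precedes the first NUL, as a C string would.
        head, _, _ = bytes(value).partition(b"\x00")
        # Assumption: comm names are UTF-8; undecodable bytes are replaced.
        return head.decode("utf-8", errors="replace")
    return value


raw = bytearray(b"swapper/7\x00\x00\x00\x00\x00\x00\x00")
print(len(raw), repr(comm_to_str(raw)), repr(comm_to_str("swapper")))
```

A handler could call this on prev_comm and next_comm before using them, and
it leaves values that are already str (such as common_comm) untouched.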
Even worse: I discovered bytearrays which were not correctly cleared with memset:
bytearray(b'chrome\x00sandbox\x00\x00')
chrome should be the comm name, but the buffer somehow holds 'chrome' followed
by 'sandbox'. Note the null byte in between.
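As a rough illustration of how C code would read such a buffer, the sketch
below uses ctypes from the Python standard library. It only demonstrates
NUL-terminated string semantics on the bytes quoted in this report; it is not
what perf does internally, and the names c_view and stale are illustrative.

```python
import ctypes

TASK_COMM_LEN = 16
raw = b"chrome\x00sandbox\x00\x00"

# Copy the 16 bytes into a fixed-size C char array.
buf = (ctypes.c_char * TASK_COMM_LEN).from_buffer_copy(raw)

c_view = buf.value  # what a C string function sees: bytes up to the first NUL
full = buf.raw      # the complete 16-byte buffer, stale bytes included

# Whatever follows the first NUL is leftover data, not part of the name.
stale = full[len(c_view) + 1:].rstrip(b"\x00")

print(c_view, stale)
```

Read as a C string the name is just chrome; the trailing sandbox bytes only
become visible because the whole buffer is exposed to Python.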
Jiri, Arnaldo - I tried to fix this, but the Python binding magic for the
automatically generated events is hard to get comfortable with.
Hagen
PS: I assume a fix for this is also relevant for the stable kernels.