From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
Jiri Olsa <jolsa@kernel.org>, lkml <linux-kernel@vger.kernel.org>,
Ingo Molnar <mingo@kernel.org>,
Namhyung Kim <namhyung@kernel.org>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
"Luis Claudio R. Goncalves" <lclaudio@uudg.org>,
ldv@altlinux.org, esyr@redhat.com,
Frederic Weisbecker <fweisbec@gmail.com>
Subject: Re: [PATCH 1/8] perf: Allow to block process in syscall tracepoints
Date: Fri, 7 Dec 2018 12:49:21 -0300 [thread overview]
Message-ID: <20181207154921.GB28174@kernel.org> (raw)
In-Reply-To: <20181207151105.GB5289@hirez.programming.kicks-ass.net>
Em Fri, Dec 07, 2018 at 04:11:05PM +0100, Peter Zijlstra escreveu:
> On Fri, Dec 07, 2018 at 08:41:18AM -0500, Steven Rostedt wrote:
> > On Fri, 7 Dec 2018 09:58:39 +0100 Peter Zijlstra <peterz@infradead.org> wrote:
> > > These patches give no justification *what*so*ever* for why we're doing
> > > ugly arse things like this. And why does this, whatever this is, need to
> > > be done in perf?
> > > IOW, what problem are we solving ?
> > I guess the cover letter should have had a link (or copy) of this:
> > http://lkml.kernel.org/r/20181128134700.212ed035@gandalf.local.home
> That doesn't even begin to explain. Who cares about strace and why? And
> why is it such a bad thing to loose the occasional record etc..
The following almost addresses all the details for doing such a perf
based strace tool (modulo signals), was started by Thomas a long time
ago [1] and is in the tree for a long time, with BPF attached to the
raw_syscalls:sys_{enter,exit} (that -e augmented_raw_syscalls.c thing
that lives in tools/perf/examples/bpf now but would be put in place
transparently and pre-compiled as augmented_raw_syscalls.o) to collect
pointer contents without using ptrace:
[root@seventh bpf]# perf trace --no-inherit -e augmented_raw_syscalls.c sleep 1
0.019 ( 0.001 ms): brk( ) = 0x559ca9dd7000
0.031 ( 0.004 ms): access(filename: , mode: R ) = -1 ENOENT No such file or directory
0.039 ( 0.003 ms): openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3
0.043 ( 0.002 ms): fstat(fd: 3, statbuf: 0x7ffdfa50e940 ) = 0
0.045 ( 0.003 ms): mmap(len: 103484, prot: READ, flags: PRIVATE, fd: 3 ) = 0x7fa76b2ce000
0.049 ( 0.001 ms): close(fd: 3 ) = 0
0.056 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC ) = 3
0.060 ( 0.002 ms): read(fd: 3, buf: 0x7ffdfa50eb08, count: 832 ) = 832
0.063 ( 0.001 ms): fstat(fd: 3, statbuf: 0x7ffdfa50e9a0 ) = 0
0.065 ( 0.002 ms): mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS ) = 0x7fa76b2cc000
0.070 ( 0.003 ms): mmap(len: 3889792, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3 ) = 0x7fa76ad0e000
0.075 ( 0.005 ms): mprotect(start: 0x7fa76aebb000, len: 2093056 ) = 0
0.081 ( 0.005 ms): mmap(addr: 0x7fa76b0ba000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 1753088) = 0x7fa76b0ba000
0.089 ( 0.002 ms): mmap(addr: 0x7fa76b0c0000, len: 14976, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7fa76b0c0000
0.095 ( 0.001 ms): close(fd: 3 ) = 0
0.105 ( 0.001 ms): arch_prctl(option: 4098, arg2: 140357034366208 ) = 0
0.144 ( 0.004 ms): mprotect(start: 0x7fa76b0ba000, len: 16384, prot: READ ) = 0
0.159 ( 0.002 ms): mprotect(start: 0x559ca81aa000, len: 4096, prot: READ ) = 0
0.163 ( 0.002 ms): mprotect(start: 0x7fa76b2e8000, len: 4096, prot: READ ) = 0
0.167 ( 0.008 ms): munmap(addr: 0x7fa76b2ce000, len: 103484 ) = 0
0.218 ( 0.001 ms): brk( ) = 0x559ca9dd7000
0.220 ( 0.002 ms): brk(brk: 0x559ca9df8000 ) = 0x559ca9df8000
0.224 ( 0.001 ms): brk( ) = 0x559ca9df8000
0.228 ( 0.004 ms): open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC ) = 3
0.233 ( 0.001 ms): fstat(fd: 3, statbuf: 0x7fa76b0bfaa0 ) = 0
0.235 ( 0.003 ms): mmap(len: 113045344, prot: READ, flags: PRIVATE, fd: 3 ) = 0x7fa76413f000
0.241 ( 0.001 ms): close(fd: 3 ) = 0
0.264 (1000.163 ms): nanosleep(rqtp: 0x7ffdfa50f680 ) = 0
1000.440 ( 0.002 ms): close(fd: 1 ) = 0
1000.443 ( 0.001 ms): close(fd: 2 ) = 0
1000.449 ( ): exit_group( )
[root@seventh bpf]#
doing just:
# perf trace --filter-pids pid-of-your-local-xterm,gnome-shell-pid-for-your-term
For system wide stracing elliminating the feedback loop of your GUI
components will most of the time work with the default ring buffer size
and when it doesn't, you'll get things like that "LOST events!" message,
and for such extreme cases, just use '-m' to bump the rb size some more.
# trace --filter-pids 2279,1643
<SNIP>
8793.387 ( 0.003 ms): cc1/19097 openat(dfd: CWD, filename: 0x3df9940, flags: NOCTTY ) = 21
8793.390 ( 0.001 ms): cc1/19097 fstat(fd: 21, statbuf: 0x3df9648 ) = 0
8793.394 ( 0.003 ms): cc1/19097 read(fd: 21, buf: 0x3d6e0f0, count: 5804 ) = 5804
8793.398 ( 0.001 ms): cc1/19097 close(fd: 21 ) = 0
8793.403 ( 0.002 ms): cc1/19097 openat(dfd: CWD, filename: 0x3dfca70, flags: NOCTTY ) = -1 ENOENT No such file or directory
8793.406 ( 0.001 ms): cc1/19097 openat(dfd: CWD, filename: 0x3dfca70, flags: NOCTTY ) = -1 ENOENT No such file or directory
8793.409 ( 0.002 ms): cc1/19097 openat(dfd: CWD, filename: 0x3dfca70, flags: NOCTTY ) = 21
8793.412 ( 0.001 ms): cc1/19097 fstat(fd: 21, statbuf: 0x3df9738 ) = 0
8793.414 ( 0.022 ms): cc1/19097 read(fd: 21, buf: 0x3eb6d90, count: 25315 ) = 25315
8793.437 ( 0.001 ms): cc1/19097 close(fd: 21 ) = 0
8793.605 ( 0.005 ms): cc1/19097 openat(dfd: CWD, filename: 0x3dfca70 ) = 21
8793.614 ( 0.001 ms): cc1/19097 close(fd: 5 ) = 0
8793.618 ( 0.001 ms): cc1/19097 fstat(fd: 21, statbuf: 0x7ffd653ed380 ) = 0
8793.621 ( 0.002 ms): cc1/19097 read(fd: 21, buf: 0x3d41e30, count: 8192 ) = 8192
8793.626 ( 0.002 ms): cc1/19097 read(fd: 21, buf: 0x3ebf090, count: 8192 ) = 8192
8793.635 ( 0.003 ms): cc1/19097 read(fd: 21, buf: 0x3ed4730, count: 16384 ) = 8931
8793.640 ( 0.001 ms): cc1/19097 read(fd: 21, buf: 0x3ed6a13, count: 4096 ) = 0
LOST 34956 events!
9497.921 ( 0.006 ms): gcc/19180 openat(dfd: CWD, filename: 0x3115e20, flags: NOCTTY ) = -1 ENOENT No such file or directory
9497.935 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x3115e20, flags: NOCTTY ) = -1 ENOENT No such file or directory
9497.939 ( 0.003 ms): gcc/19180 openat(dfd: CWD, filename: 0x3115e20, flags: NOCTTY ) = 9
9497.943 ( 0.001 ms): gcc/19180 fstat(fd: 9</home/acme/git/linux/include/linux/page_ref.h>, statbuf: 0x3153568) = 0
9497.945 ( 0.006 ms): gcc/19180 read(fd: 9</home/acme/git/linux/include/linux/page_ref.h>, buf: 0x31c00a0, count: 40574) = 40574
9497.952 ( 0.001 ms): gcc/19180 close(fd: 9</home/acme/git/linux/include/linux/page_ref.h> ) = 0
9497.959 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x31540a0, flags: NOCTTY ) = -1 ENOENT No such file or directory
9497.962 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x31540a0, flags: NOCTTY ) = -1 ENOENT No such file or directory
9497.965 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x31540a0, flags: NOCTTY ) = 9
9497.969 ( 0.001 ms): gcc/19180 fstat(fd: 9</home/acme/git/linux/include/linux/page_ref.h>, statbuf: 0x30dcf38) = 0
9497.970 ( 0.003 ms): gcc/19180 read(fd: 9</home/acme/git/linux/include/linux/page_ref.h>, buf: 0x31b8320, count: 23058) = 23058
9497.975 ( 0.001 ms): gcc/19180 close(fd: 9</home/acme/git/linux/include/linux/page_ref.h> ) = 0
9497.979 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x3154020, flags: NOCTTY ) = -1 ENOENT No such file or directory
9497.982 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x3154020, flags: NOCTTY ) = -1 ENOENT No such file or directory
9497.986 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x3154020, flags: NOCTTY ) = 9
9497.989 ( 0.001 ms): gcc/19180 fstat(fd: 9</home/acme/git/linux/include/linux/page_ref.h>, statbuf: 0x3116178) = 0
9497.990 ( 0.001 ms): gcc/19180 read(fd: 9</home/acme/git/linux/include/linux/page_ref.h>, buf: 0x31509c0, count: 1948) = 1948
9497.993 ( 0.001 ms): gcc/19180 close(fd: 9</home/acme/git/linux/include/linux/page_ref.h> ) = 0
9498.039 ( 0.003 ms): gcc/19180 openat(dfd: CWD, filename: 0x3154060, flags: NOCTTY ) = -1 ENOENT No such file or directory
9498.043 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x3154060, flags: NOCTTY ) = -1 ENOENT No such file or directory
9498.046 ( 0.002 ms): gcc/19180 openat(dfd: CWD, filename: 0x3154060, flags: NOCTTY ) = 9
9498.049 ( 0.001 ms): gcc/19180 fstat(fd: 9</home/acme/git/linux/include/linux/page_ref.h>, statbuf: 0x3153a08) = 0
'trace' is just 'perf trace', perf has logic to see if argv[0] is
'trace', then it goes and does a 'perf trace'. The above case does't use
bpf at all, gets details from /proc.
With bpf we get the 'filename' contents:
[root@seventh bpf]# trace -e augmented_raw_syscalls.c --filter-pids 2279,1643
<SNIP>
19766.027 ( 0.003 ms): gcc/27524 openat(dfd: CWD, filename: /lib64/libz.so.1, flags: CLOEXEC ) = 5
19766.035 ( 0.001 ms): gcc/27524 fstat(fd: 5, statbuf: 0x7ffe9323e2a0 ) = 0
19766.037 ( 0.003 ms): gcc/27524 mmap(len: 2187272, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 5 ) = 0x7fa2df435000
19766.042 ( 0.003 ms): gcc/27524 mprotect(start: 0x7fa2df44b000, len: 2093056 ) = 0
19766.046 ( 0.004 ms): gcc/27524 mmap(addr: 0x7fa2df64a000, len: 4096, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 5, off: 86016) = 0x7fa2df64a000
19766.051 ( 0.002 ms): gcc/27524 mmap(addr: 0x7fa2df64b000, len: 8, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7fa2df64b000
19766.057 ( 0.001 ms): gcc/27524 close(fd: 5 ) = 0
19766.062 ( 0.003 ms): gcc/27524 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC ) = 5
<SNIP>
- Arnaldo
[1] https://lwn.net/Articles/415728/
next prev parent reply other threads:[~2018-12-07 15:49 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-12-05 16:05 [RFC 1/8] perf: Block perf calls for system call tracepoints Jiri Olsa
2018-12-05 16:05 ` [PATCH 1/8] perf: Allow to block process in syscall tracepoints Jiri Olsa
2018-12-05 17:35 ` Steven Rostedt
2018-12-05 17:56 ` Jiri Olsa
2018-12-06 8:09 ` Peter Zijlstra
2018-12-06 10:30 ` Jiri Olsa
2018-12-06 8:10 ` Peter Zijlstra
2018-12-06 8:24 ` Jiri Olsa
2018-12-06 10:31 ` Peter Zijlstra
2018-12-06 8:34 ` Peter Zijlstra
2018-12-06 10:31 ` Jiri Olsa
2018-12-06 18:19 ` Steven Rostedt
2018-12-07 8:44 ` Jiri Olsa
2018-12-07 8:58 ` Peter Zijlstra
2018-12-07 13:41 ` Steven Rostedt
2018-12-07 15:11 ` Peter Zijlstra
2018-12-07 15:49 ` Arnaldo Carvalho de Melo [this message]
2018-12-08 10:41 ` Peter Zijlstra
2018-12-08 17:34 ` Steven Rostedt
2018-12-07 20:14 ` Steven Rostedt
2018-12-08 10:44 ` Peter Zijlstra
2018-12-08 17:38 ` Steven Rostedt
2018-12-10 10:18 ` Peter Zijlstra
2018-12-13 0:39 ` Dmitry V. Levin
2018-12-13 1:26 ` Steven Rostedt
2018-12-13 1:49 ` Dmitry V. Levin
2018-12-13 10:01 ` Peter Zijlstra
2018-12-13 10:05 ` Peter Zijlstra
2018-12-13 10:08 ` Peter Zijlstra
2018-12-13 11:29 ` Jiri Olsa
2018-12-06 8:17 ` Peter Zijlstra
2018-12-06 10:27 ` Jiri Olsa
2018-12-05 16:05 ` [PATCH 2/8] perf tools: Sync uapi perf_event.h Jiri Olsa
2018-12-05 16:05 ` [PATCH 3/8] perf record: Add --block option Jiri Olsa
2018-12-05 16:05 ` [PATCH 4/8] perf trace: " Jiri Olsa
2018-12-05 16:05 ` [PATCH 5/8] perf tools: Add block term support for tracepoints Jiri Olsa
2018-12-05 16:05 ` [PATCH 6/8] perf tools: Add ordered_events__flush_time interface Jiri Olsa
2018-12-14 21:00 ` [tip:perf/core] perf ordered_events: " tip-bot for Jiri Olsa
2018-12-18 14:27 ` tip-bot for Jiri Olsa
2018-12-05 16:05 ` [PATCH 7/8] perf trace: Move event delivery to deliver_event function Jiri Olsa
2018-12-14 21:01 ` [tip:perf/core] perf trace: Move event delivery to a new deliver_event() function tip-bot for Jiri Olsa
2018-12-18 14:28 ` tip-bot for Jiri Olsa
2018-12-05 16:05 ` [PATCH 8/8] perf trace: Add ordered processing for --block option Jiri Olsa
2018-12-14 21:02 ` [tip:perf/core] perf trace: Add ordered processing tip-bot for Jiri Olsa
2018-12-18 14:29 ` tip-bot for Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181207154921.GB28174@kernel.org \
--to=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=esyr@redhat.com \
--cc=fweisbec@gmail.com \
--cc=jolsa@kernel.org \
--cc=lclaudio@uudg.org \
--cc=ldv@altlinux.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox