* [GIT PULL 00/22] perf/core improvements and fixes
@ 2018-11-30 18:26 Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin,
Alexei Starovoitov, Alexey Budankov, Andi Kleen, Anton Blanchard,
Daniel Borkmann, David Ahern, David Howells, David S . Miller,
Eric Saint-Etienne, Ivan Babrou, Jin Yao, Jiri Olsa, Julia Lawall,
Leo Yan, Mathieu Poirier, Namhyung Kim, Peter Zijlstra,
Ravi Bangoria, Slavomir Kaslev, Stephane Eranian, Steven Rostedt,
Thomas Richter, Tzvetomir Stoyanov, Wang Nan, Wen Yang,
yuzhoujian, zhong.weidong, Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling, more to come,
Regards,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit b1a9d7b0190119dad5b9b7841751b5a7586bbc8b:
Merge tag 'perf-urgent-for-mingo-4.20-20181121' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent (2018-11-21 15:57:21 +0100)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.21-20181130
for you to fetch changes up to 09d3f015d1e1b4fee7e9bbdcf54201d239393391:
uprobes: Fix handle_swbp() vs. unregister() + register() race once more (2018-11-23 08:31:19 +0100)
----------------------------------------------------------------
perf/core improvements and fixes:
- Introduce 'perf record --aio' to write trace data using asynchronous IO.
It is disabled by default, i.e. one needs to explicitly pass '--aio' to
use it, in which case one AIO aiocb struct is used; specify
'perf record --aio=N' to ask for more, according to your needs and the
number of processors in your machine. Reports about the effectiveness
of this option are welcome so that we can decide on making it the
default mode of operation. Read the respective patches' commit logs for
further information (Alexey Budankov)
- Add fallback routines to be used in places where we don't have the cpu mode
(kernel/user space/hypervisor) and thus must fall back to looking at all
map trees when trying to resolve symbols (Adrian Hunter)
- Introduce 'perf top --kallsyms file' to match 'perf report --kallsyms', useful
when dealing with BPF, where symbol resolution happens via kallsyms, not via
the default vmlinux ELF symtabs (Arnaldo Carvalho de Melo)
- Fix CSV mode column output for non-cgroup events in 'perf stat' (Stephane Eranian)
- Fix 'perf stat' shadow stats for clock events (Ravi Bangoria)
- Fix error with config term "pt=0", where we should just force "pt=1" and
warn the user about the former being non-sensical (Adrian Hunter)
- Fix 'perf test' entry where we expect 'sleep' to come in a PERF_RECORD_COMM
but instead we get 'coreutils' when sleep is provided by some versions of
the 'coreutils' package (Adrian Hunter)
- Remove needless rb_tree extra indirection from map__find() (Eric Saint-Etienne)
- Add sanity check to libtraceevent's is_timestamp_in_us() (Tzvetomir Stoyanov)
- Use ERR_CAST instead of ERR_PTR(PTR_ERR()) (Wen Yang)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Andrea Parri (1):
uprobes: Fix handle_swbp() vs. unregister() + register() race once more
Jiri Olsa (3):
perf/x86/intel: Move branch tracing setup to the Intel-specific source file
perf/x86/intel: Add generic branch tracing check to intel_pmu_has_bts()
perf/x86/intel: Disallow precise_ip on BTS events
arch/x86/events/core.c | 20 ----------------
arch/x86/events/intel/core.c | 56 ++++++++++++++++++++++++++++++++++----------
arch/x86/events/perf_event.h | 13 ++++++----
kernel/events/uprobes.c | 12 ++++++++--
4 files changed, 63 insertions(+), 38 deletions(-)
Test results:
XXX: Investigation of the watchpoint and breakpoint 'perf test' failures is
underway; they don't look related to patches in this batch.
The first ones are container (docker) based builds of tools/perf with
and without libelf support. Where clang is available, it is also used
to build perf with/without libelf, and building with LIBCLANGLLVM=1
(built-in clang) with gcc and clang when clang and its devel libraries
are installed.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from an http server to
build them inside the container, to make it easier to build in a container
cluster. Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to the lack of multi-arch devel
packages, which are available and used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event_open syscall to check that the perf_event_attr fields are set
up as expected, among a variety of other unit tests.
Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping to have this in place.
# dm
1 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0
2 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822
3 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0
4 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0
5 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0
6 alpine:edge : Ok gcc (Alpine 6.4.0) 6.4.0
7 amazonlinux:1 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
8 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
9 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
10 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
11 centos:5 : Ok gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55)
12 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
13 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28)
14 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502
15 debian:7 : Ok gcc (Debian 4.7.2-5) 4.7.2
16 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u1) 4.9.2
17 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
18 debian:experimental : Ok gcc (Debian 8.2.0-10) 8.2.0
19 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 8.2.0-10) 8.2.0
20 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
21 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 8.2.0-10) 8.2.0
22 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 8.2.0-7) 8.2.0
23 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
24 fedora:21 : Ok gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
25 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
26 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6)
27 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
28 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
29 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1)
30 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2)
31 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
32 fedora:28 : Ok gcc (GCC) 8.2.1 20181105 (Red Hat 8.2.1-5)
33 fedora:29 : Ok gcc (GCC) 8.2.1 20181011 (Red Hat 8.2.1-4)
34 fedora:rawhide : Ok gcc (GCC) 8.2.1 20181011 (Red Hat 8.2.1-4)
35 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 7.3.0-r3 p1.4) 7.3.0
36 mageia:5 : Ok gcc (GCC) 4.9.2
37 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0
38 opensuse:13.2 : Ok gcc (SUSE Linux) 4.8.3 20140627 [gcc-4_8-branch revision 212064]
39 opensuse:42.1 : Ok gcc (SUSE Linux) 4.8.5
40 opensuse:42.2 : Ok gcc (SUSE Linux) 4.8.5
41 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5
42 opensuse:tumbleweed : Ok gcc (SUSE Linux) 7.3.1 20180323 [gcc-7-branch revision 258812]
43 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
44 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-36.0.1)
45 ubuntu:12.04.5 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
46 ubuntu:14.04.4 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4
47 ubuntu:14.04.4-x-linaro-arm64 : Ok aarch64-linux-gnu-gcc (Linaro GCC 5.5-2017.10) 5.5.0
48 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
49 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
50 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
51 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
52 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
53 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
54 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
55 ubuntu:16.10 : Ok gcc (Ubuntu 6.2.0-5ubuntu12) 6.2.0 20161005
56 ubuntu:17.10 : Ok gcc (Ubuntu 7.2.0-8ubuntu3.2) 7.2.0
57 ubuntu:18.04 : Ok gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
58 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
59 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.3.0-27ubuntu1~18.04) 7.3.0
60 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
61 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
62 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
63 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
64 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
65 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
66 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
67 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
68 ubuntu:18.10 : Ok gcc (Ubuntu 8.2.0-7ubuntu1) 8.2.0
#
# uname -a
Linux seventh 4.18.19-100.fc27.x86_64 #1 SMP Wed Nov 14 22:04:34 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
# git log --oneline -1
ceed7c10bbb4 tools lib traceevent: Add sanity check to is_timestamp_in_us()
# perf version --build-options
perf version 4.20.rc3.gceed7c
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
gtk2: [ on ] # HAVE_GTK2_SUPPORT
syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: DSO data read : Ok
11: DSO data cache : Ok
12: DSO data reopen : Ok
13: Roundtrip evsel->name : Ok
14: Parse sched tracepoints fields : Ok
15: syscalls:sys_enter_openat event fields : Ok
16: Setup struct perf_event_attr : Ok
17: Match and link multiple hists : Ok
18: 'import perf' in python : Ok
19: Breakpoint overflow signal handler : Ok
20: Breakpoint overflow sampling : Ok
21: Breakpoint accounting : Ok
22: Watchpoint :
22.1: Read Only Watchpoint : Skip
22.2: Write Only Watchpoint : Ok
22.3: Read / Write Watchpoint : Ok
22.4: Modify Watchpoint : FAILED!
23: Number of exit events of a simple workload : Ok
24: Software clock events period values : Ok
25: Object code reading : Ok
26: Sample parsing : Ok
27: Use a dummy software event to keep tracking : Ok
28: Parse with no sample_id_all bit set : Ok
29: Filter hist entries : Ok
30: Lookup mmap thread : Ok
31: Share thread mg : Ok
32: Sort output of hist entries : Ok
33: Cumulate child hist entries : Ok
34: Track with sched_switch : Ok
35: Filter fds with revents mask in a fdarray : Ok
36: Add fd to a fdarray, making it autogrow : Ok
37: kmod_path__parse : Ok
38: Thread map : Ok
39: LLVM search and compile :
39.1: Basic BPF llvm compile : Ok
39.2: kbuild searching : Ok
39.3: Compile source for BPF prologue generation : Ok
39.4: Compile source for BPF relocation : Ok
40: Session topology : Ok
41: BPF filter :
41.1: Basic BPF filtering : Ok
41.2: BPF pinning : Ok
41.3: BPF prologue generation : Ok
41.4: BPF relocation checker : Ok
42: Synthesize thread map : Ok
43: Remove thread map : Ok
44: Synthesize cpu map : Ok
45: Synthesize stat config : Ok
46: Synthesize stat : Ok
47: Synthesize stat round : Ok
48: Synthesize attr update : Ok
49: Event times : Ok
50: Read backward ring buffer : Ok
51: Print cpu map : Ok
52: Probe SDT events : Ok
53: is_printable_array : Ok
54: Print bitmap : Ok
55: perf hooks : Ok
56: builtin clang support : Skip (not compiled in)
57: unit_number__scnprintf : Ok
58: mem2node : Ok
59: x86 rdpmc : Ok
60: Convert perf time to TSC : Ok
61: DWARF unwind : Ok
62: x86 instruction decoder - new instructions : Ok
63: x86 bp modify : FAILED!
64: probe libc's inet_pton & backtrace it with ping : Ok
65: Check open filename arg using perf trace + vfs_getname: Ok
66: Use vfs_getname probe to get syscall args filenames : Ok
67: Add vfs_getname probe to get syscall args filenames : Ok
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_util_map_o_O: make util/map.o
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_help_O: make help
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_newt_O: make NO_NEWT=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_install_O: make install
make_no_auxtrace_O: make NO_AUXTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_debug_O: make DEBUG=1
make_doc_O: make doc
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libelf_O: make NO_LIBELF=1
make_no_gtk2_O: make NO_GTK2=1
make_no_slang_O: make NO_SLANG=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_perf_o_O: make perf.o
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_install_bin_O: make install-bin
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_pure_O: make
make_no_libperl_O: make NO_LIBPERL=1
make_clean_all_O: make clean all
make_static_O: make LDFLAGS=-static
make_tags_O: make tags
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
* [PATCH 01/22] perf build: Give better hint about devel package for libssl
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
Namhyung Kim, Stephane Eranian, Wang Nan

From: Arnaldo Carvalho de Melo <acme@redhat.com>

In debian/ubuntu its libssl-dev, but for fedora/RHEL/Centos/etc its
openssl-devel, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8ee4646038e4 ("perf build: Add libcrypto feature detection")
Link: https://lkml.kernel.org/n/tip-lnxqszts6aq2c9jy4b7mlnym@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Makefile.config | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index e110010e7faa..c643d5e0c26b 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -588,7 +588,7 @@ endif

 ifndef NO_LIBCRYPTO
   ifneq ($(feature-libcrypto), 1)
-    msg := $(warning No libcrypto.h found, disables jitted code injection, please install libssl-devel or libssl-dev);
+    msg := $(warning No libcrypto.h found, disables jitted code injection, please install openssl-devel or libssl-dev);
     NO_LIBCRYPTO := 1
   else
     CFLAGS += -DHAVE_LIBCRYPTO_SUPPORT
--
2.19.1
* [PATCH 02/22] perf stat: Fix shadow stats for clock events
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Ravi Bangoria,
Alexander Shishkin, Jin Yao, Namhyung Kim, Thomas Richter,
yuzhoujian, Arnaldo Carvalho de Melo

From: Ravi Bangoria <ravi.bangoria@linux.ibm.com>

Commit 0aa802a79469 ("perf stat: Get rid of extra clock display
function") introduced scale and unit for clock events. Thus,
perf_stat__update_shadow_stats() now saves scaled values of clock events
in msecs, instead of original nsecs. But while calculating values of
shadow stats we still consider clock event values in nsecs. This results
in a wrong shadow stat values. Ex,

  # ./perf stat -e task-clock,cycles ls
    <SNIP>
              2.60 msec task-clock:u    #    0.877 CPUs utilized
         2,430,564      cycles:u        # 1215282.000 GHz

Fix this by saving original nsec values for clock events in
perf_stat__update_shadow_stats(). After patch:

  # ./perf stat -e task-clock,cycles ls
    <SNIP>
              3.14 msec task-clock:u    #    0.839 CPUs utilized
         3,094,528      cycles:u        #    0.985 GHz

Suggested-by: Jiri Olsa <jolsa@redhat.com>
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: yuzhoujian@didichuxing.com
Fixes: 0aa802a79469 ("perf stat: Get rid of extra clock display function")
Link: http://lkml.kernel.org/r/20181116042843.24067-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-shadow.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index f0a8cec55c47..3c22c58b3e90 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -209,11 +209,12 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
 				    int cpu, struct runtime_stat *st)
 {
 	int ctx = evsel_context(counter);
+	u64 count_ns = count;

 	count *= counter->scale;

 	if (perf_evsel__is_clock(counter))
-		update_runtime_stat(st, STAT_NSECS, 0, cpu, count);
+		update_runtime_stat(st, STAT_NSECS, 0, cpu, count_ns);
 	else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
 		update_runtime_stat(st, STAT_CYCLES, ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
--
2.19.1
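A minimal sketch, not part of the patch, of why the 'before' output shows
1215282.000 GHz. It assumes the clock event scale is 1e-6 (nsecs to msecs)
and that the raw task-clock count was about 2,600,000 ns, which is consistent
with the displayed 2.60 msec; scale_count() and ghz_from() are hypothetical
helper names. Because the count is an integer, scaling 2.6 msec truncates it
to 2, and dividing cycles by that value instead of by nanoseconds gives the
inflated figure.

```c
#include <stdint.h>

/*
 * Mimics 'count *= counter->scale' on an integer count: the value is
 * multiplied by a floating point scale and truncated when stored back.
 * The 1e-6 scale used by callers below is an assumption of this sketch.
 */
static uint64_t scale_count(uint64_t count, double scale)
{
	return (uint64_t)((double)count * scale);
}

/*
 * Hypothetical helper: cycles divided by elapsed nanoseconds is GHz.
 * Passing a value that is no longer in nanoseconds yields nonsense.
 */
static double ghz_from(uint64_t cycles, uint64_t clock_value)
{
	return clock_value ? (double)cycles / (double)clock_value : 0.0;
}
```

With these helpers, ghz_from(2430564, scale_count(2600000, 1e-6)) evaluates
to 1215282.0, the broken figure, while ghz_from(3094528, 3140000) is about
0.9855, in line with the 0.985 GHz shown after the fix.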
* [PATCH 03/22] perf stat: Fix CSV mode column output for non-cgroup events
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Stephane Eranian,
Peter Zijlstra, Arnaldo Carvalho de Melo

From: Stephane Eranian <eranian@google.com>

When using the -x option, perf stat prints CSV-style output with one
event per line. For each event, it prints the count, the unit, the
event name, the cgroup, and a bunch of other event specific fields (such
as insn per cycles).

When you use CSV-style mode, you expect a normalized output where each
event is printed with the same number of fields regardless of what it is
so it can easily be imported into a spreadsheet or parsed.

For instance, if an event does not have a unit, then print an empty
field for it.

Although this approach was implemented for the unit, it was not for the
cgroup.

When mixing cgroup and non-cgroup events, then non-cgroup events would
not show an empty field, instead the next field was printed, make
columns not line up correctly.

This patch fixes the cgroup output issues by forcing an empty field for
non-cgroup events as soon as one event has cgroup.

Before:

  <not counted> @ @cycles @foo    @ 0    @100.00@@
  2531614       @ @cycles @6420922@100.00@ @

  foo cgroup lines up with time_running!

After:

  <not counted> @ @cycles @foo @0       @100.00@@
  2594834       @ @cycles @    @5287372 @100.00@@

  Fields line up.

Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1541587845-9150-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-display.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index e7b4c44ebb62..665ee374fc01 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -59,6 +59,15 @@ static void print_noise(struct perf_stat_config *config,
 	print_noise_pct(config, stddev_stats(&ps->res_stats[0]), avg);
 }

+static void print_cgroup(struct perf_stat_config *config, struct perf_evsel *evsel)
+{
+	if (nr_cgroups) {
+		const char *cgrp_name = evsel->cgrp ? evsel->cgrp->name : "";
+		fprintf(config->output, "%s%s", config->csv_sep, cgrp_name);
+	}
+}
+
+
 static void aggr_printout(struct perf_stat_config *config,
 			  struct perf_evsel *evsel, int id, int nr)
 {
@@ -336,8 +345,7 @@ static void abs_printout(struct perf_stat_config *config,

 	fprintf(output, "%-*s", config->csv_output ? 0 : 25, perf_evsel__name(evsel));

-	if (evsel->cgrp)
-		fprintf(output, "%s%s", config->csv_sep, evsel->cgrp->name);
+	print_cgroup(config, evsel);
 }

 static bool is_mixed_hw_group(struct perf_evsel *counter)
@@ -431,9 +439,7 @@ static void printout(struct perf_stat_config *config, int id, int nr,
 			config->csv_output ? 0 : -25,
 			perf_evsel__name(counter));

-		if (counter->cgrp)
-			fprintf(config->output, "%s%s",
-				config->csv_sep, counter->cgrp->name);
+		print_cgroup(config, counter);

 		if (!config->csv_output)
 			pm(config, &os, NULL, NULL, "", 0);
--
2.19.1
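The normalization rule described in this patch (always emit the cgroup
column once any event has a cgroup) can be sketched in isolation. This is a
simplified illustration, not the perf code: struct event_row and
format_row() are hypothetical, and only the first four CSV fields are
modelled.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for what perf stat prints per event. */
struct event_row {
	const char *count;
	const char *unit;	/* empty string when the event has no unit */
	const char *name;
	const char *cgroup;	/* NULL for non-cgroup events */
};

/*
 * Format "count<sep>unit<sep>name[<sep>cgroup]" into buf.
 * When nr_cgroups is non-zero the cgroup field is always emitted, empty
 * for non-cgroup events, so every row has the same number of columns.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_row(char *buf, size_t len, const struct event_row *row,
		      const char *sep, int nr_cgroups)
{
	int n = snprintf(buf, len, "%s%s%s%s%s",
			 row->count, sep, row->unit, sep, row->name);

	if (n < 0 || (size_t)n >= len)
		return -1;

	if (nr_cgroups) {
		int m = snprintf(buf + n, len - n, "%s%s",
				 sep, row->cgroup ? row->cgroup : "");

		if (m < 0 || (size_t)m >= len - n)
			return -1;
		n += m;
	}

	return n;
}
```

With "@" as separator and nr_cgroups set, a row without a cgroup ends in an
empty field ("20@@cycles@") and so has as many columns as a row with one
("10@@cycles@foo").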
* [PATCH 04/22] perf map: Remove extra indirection from map__find()
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Eric Saint-Etienne, Alexander Shishkin, Eric Saint-Etienne,
Namhyung Kim, Peter Zijlstra, Arnaldo Carvalho de Melo

From: Eric Saint-Etienne <eric.saint.etienne@oracle.com>

A double pointer is used in map__find() where a single pointer is enough
because the function doesn't affect the rbtree and the rbtree is locked.

Signed-off-by: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Eric Saint-Etienne <eric.saintetienne@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1542969759-24346-1-git-send-email-eric.saint.etienne@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/map.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 781eed8e3265..a0d58b4d9c32 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -873,19 +873,18 @@ void maps__remove(struct maps *maps, struct map *map)

 struct map *maps__find(struct maps *maps, u64 ip)
 {
-	struct rb_node **p, *parent = NULL;
+	struct rb_node *p;
 	struct map *m;

 	down_read(&maps->lock);

-	p = &maps->entries.rb_node;
-	while (*p != NULL) {
-		parent = *p;
-		m = rb_entry(parent, struct map, rb_node);
+	p = maps->entries.rb_node;
+	while (p != NULL) {
+		m = rb_entry(p, struct map, rb_node);
 		if (ip < m->start)
-			p = &(*p)->rb_left;
+			p = p->rb_left;
 		else if (ip >= m->end)
-			p = &(*p)->rb_right;
+			p = p->rb_right;
 		else
 			goto out;
 	}
--
2.19.1
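To illustrate the point of this patch outside of perf: a read-only lookup
in a binary search tree keyed by address ranges only needs to follow child
pointers, so a single cursor is enough; the pointer-to-pointer form is only
needed when a link has to be rewritten, as on insertion. The sketch below
uses a plain hypothetical struct range_node rather than the kernel's rb_node,
and leaves out locking.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node covering the half-open address range [start, end). */
struct range_node {
	uint64_t start;
	uint64_t end;
	struct range_node *left;
	struct range_node *right;
};

/*
 * Find the node whose range contains ip, or NULL if there is none.
 * The tree is not modified, so one plain pointer walks it.
 */
static struct range_node *range_find(struct range_node *root, uint64_t ip)
{
	struct range_node *p = root;

	while (p != NULL) {
		if (ip < p->start)
			p = p->left;
		else if (ip >= p->end)
			p = p->right;
		else
			return p;
	}

	return NULL;
}
```

Note that the end of a range is exclusive, matching the 'ip >= m->end' test
in the patch: an address equal to a node's end is looked up in the right
subtree.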
* [PATCH 05/22] perf env: Also consider env->arch == NULL as local operation
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, Andi Kleen, David Ahern,
David S . Miller, Jiri Olsa, Leo Yan, Mathieu Poirier, Namhyung Kim,
Wang Nan, stable

From: Arnaldo Carvalho de Melo <acme@redhat.com>

We'll set a new machine field based on env->arch, which for live mode,
like with 'perf top' means we need to use uname() to figure the name of
the arch, fix perf_env__arch() to consider both (env == NULL) and
(env->arch == NULL) as local operation.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/n/tip-vcz4ufzdon7cwy8dm2ua53xk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/env.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 59f38c7693f8..4c23779e271a 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -166,7 +166,7 @@ const char *perf_env__arch(struct perf_env *env)
 	struct utsname uts;
 	char *arch_name;

-	if (!env) { /* Assume local operation */
+	if (!env || !env->arch) { /* Assume local operation */
 		if (uname(&uts) < 0)
 			return NULL;
 		arch_name = uts.machine;
--
2.19.1
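A standalone sketch of the fallback this patch describes, for readers who
want to try it outside the perf tree. It is not the perf implementation:
struct env_sketch and arch_or_local() are hypothetical names, the result is
copied into a caller-provided buffer, and the arch name normalization that
perf applies afterwards is left out.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical stand-in for the one field of struct perf_env used here. */
struct env_sketch {
	const char *arch;	/* NULL when no arch was recorded */
};

/*
 * Return the recorded arch if there is one; otherwise assume local
 * operation and ask the running kernel via uname(). Both a NULL env and
 * a NULL env->arch take the local path, as in the patch.
 * Returns NULL if uname() fails or the buffer has no room.
 */
static const char *arch_or_local(const struct env_sketch *env,
				 char *buf, size_t len)
{
	struct utsname uts;

	if (env && env->arch)
		return env->arch;

	if (len == 0 || uname(&uts) < 0)
		return NULL;

	snprintf(buf, len, "%s", uts.machine);
	return buf;
}
```

Before the fix, the equivalent of an env with arch == NULL would have fallen
through to the non-local path with a NULL name, which is the case live tools
such as 'perf top' hit.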
* [PATCH 06/22] perf machine: Record if a arch has a single user/kernel address space
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Andi Kleen, David S . Miller, Jiri Olsa, Leo Yan, Mathieu Poirier,
stable, Arnaldo Carvalho de Melo

From: Adrian Hunter <adrian.hunter@intel.com>

Some architectures have a single address space for kernel and user
addresses, which makes it possible to determine if an address is in
kernel space or user space. Some don't, e.g.: sparc.

Cache that info in perf_env so that, for instance, code needing to
fallback failed symbol lookups at the kernel space in single address
space arches can lookup at userspace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/arch/common.c  | 10 ++++++++++
 tools/perf/arch/common.h  |  1 +
 tools/perf/util/machine.h |  1 +
 tools/perf/util/session.c |  4 ++++
 4 files changed, 16 insertions(+)

diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index 82657c01a3b8..5f69fd0b745a 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -200,3 +200,13 @@ int perf_env__lookup_objdump(struct perf_env *env, const char **path)

 	return perf_env__lookup_binutils_path(env, "objdump", path);
 }
+
+/*
+ * Some architectures have a single address space for kernel and user addresses,
+ * which makes it possible to determine if an address is in kernel space or user
+ * space.
+ */
+bool perf_env__single_address_space(struct perf_env *env)
+{
+	return strcmp(perf_env__arch(env), "sparc");
+}
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index 2167001b18c5..c298a446d1f6 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -5,5 +5,6 @@
 #include "../util/env.h"

 int perf_env__lookup_objdump(struct perf_env *env, const char **path);
+bool perf_env__single_address_space(struct perf_env *env);

 #endif /* ARCH_PERF_COMMON_H */
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index d856b85862e2..ca897a73014c 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -42,6 +42,7 @@ struct machine {
 	u16		  id_hdr_size;
 	bool		  comm_exec;
 	bool		  kptr_restrict_warned;
+	bool		  single_address_space;
 	char		  *root_dir;
 	char		  *mmap_name;
 	struct threads    threads[THREADS__TABLE_SIZE];
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7d2c8ce6cfad..f8eab197f35c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -24,6 +24,7 @@
 #include "thread.h"
 #include "thread-stack.h"
 #include "stat.h"
+#include "arch/common.h"

 static int perf_session__deliver_event(struct perf_session *session,
 				       union perf_event *event,
@@ -150,6 +151,9 @@ struct perf_session *perf_session__new(struct perf_data *data,
 		session->machines.host.env = &perf_env;
 	}

+	session->machines.host.single_address_space =
+		perf_env__single_address_space(session->machines.host.env);
+
 	if (!data || perf_data__is_write(data)) {
 		/*
 		 * In O_RDONLY mode this will be performed when reading the
--
2.19.1
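The helper added by this patch returns the result of strcmp() as a bool,
which reads backwards at first glance: strcmp() is zero on a match, so the
function is true for every arch except "sparc". A small sketch of the same
test with the comparison spelled out; single_address_space() here is a
hypothetical standalone function that takes the arch name directly and
assumes it is non-NULL and already normalized.

```c
#include <stdbool.h>
#include <string.h>

/*
 * True when kernel and user addresses share one address space.
 * In this sketch, as in the patch, only "sparc" is treated as having
 * separate address spaces. arch must not be NULL.
 */
static bool single_address_space(const char *arch)
{
	return strcmp(arch, "sparc") != 0;
}
```

The explicit '!= 0' does not change behaviour relative to the implicit
int-to-bool conversion in the patch; it only makes the intent visible.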
* [PATCH 07/22] perf thread: Add fallback functions for cases where cpumode is insufficient
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (5 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 06/22] perf machine: Record if a arch has a single user/kernel address space Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 08/22] perf tools: Use fallback for sample_addr_correlates_sym() cases Arnaldo Carvalho de Melo
` (14 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Andi Kleen, David S . Miller, Jiri Olsa, Leo Yan, Mathieu Poirier,
stable, Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
For branch stacks or branch samples, the sample cpumode might not be
correct because it applies only to the sample 'ip' and not necessarily to
'addr' or branch stack addresses. Add fallback functions that can be
used to deal with those cases.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S.
Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20181106210712.12098-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/util/event.c | 27 +++++++++++++++++++++++++++ tools/perf/util/machine.c | 27 +++++++++++++++++++++++++++ tools/perf/util/machine.h | 2 ++ tools/perf/util/thread.h | 4 ++++ 4 files changed, 60 insertions(+) diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c index e9c108a6b1c3..9431b20c1337 100644 --- a/tools/perf/util/event.c +++ b/tools/perf/util/event.c @@ -1577,6 +1577,24 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr, return al->map; } +/* + * For branch stacks or branch samples, the sample cpumode might not be correct + * because it applies only to the sample 'ip' and not necessary to 'addr' or + * branch stack addresses. If possible, use a fallback to deal with those cases. 
+ */ +struct map *thread__find_map_fb(struct thread *thread, u8 cpumode, u64 addr, + struct addr_location *al) +{ + struct map *map = thread__find_map(thread, cpumode, addr, al); + struct machine *machine = thread->mg->machine; + u8 addr_cpumode = machine__addr_cpumode(machine, cpumode, addr); + + if (map || addr_cpumode == cpumode) + return map; + + return thread__find_map(thread, addr_cpumode, addr, al); +} + struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode, u64 addr, struct addr_location *al) { @@ -1586,6 +1604,15 @@ struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode, return al->sym; } +struct symbol *thread__find_symbol_fb(struct thread *thread, u8 cpumode, + u64 addr, struct addr_location *al) +{ + al->sym = NULL; + if (thread__find_map_fb(thread, cpumode, addr, al)) + al->sym = map__find_symbol(al->map, al->addr); + return al->sym; +} + /* * Callers need to drop the reference to al->thread, obtained in * machine__findnew_thread() diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index 8f36ce813bc5..9397e3f2444d 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -2592,6 +2592,33 @@ int machine__get_kernel_start(struct machine *machine) return err; } +u8 machine__addr_cpumode(struct machine *machine, u8 cpumode, u64 addr) +{ + u8 addr_cpumode = cpumode; + bool kernel_ip; + + if (!machine->single_address_space) + goto out; + + kernel_ip = machine__kernel_ip(machine, addr); + switch (cpumode) { + case PERF_RECORD_MISC_KERNEL: + case PERF_RECORD_MISC_USER: + addr_cpumode = kernel_ip ? PERF_RECORD_MISC_KERNEL : + PERF_RECORD_MISC_USER; + break; + case PERF_RECORD_MISC_GUEST_KERNEL: + case PERF_RECORD_MISC_GUEST_USER: + addr_cpumode = kernel_ip ? 
PERF_RECORD_MISC_GUEST_KERNEL : + PERF_RECORD_MISC_GUEST_USER; + break; + default: + break; + } +out: + return addr_cpumode; +} + struct dso *machine__findnew_dso(struct machine *machine, const char *filename) { return dsos__findnew(&machine->dsos, filename); diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h index ca897a73014c..ebde3ea70225 100644 --- a/tools/perf/util/machine.h +++ b/tools/perf/util/machine.h @@ -100,6 +100,8 @@ static inline bool machine__kernel_ip(struct machine *machine, u64 ip) return ip >= kernel_start; } +u8 machine__addr_cpumode(struct machine *machine, u8 cpumode, u64 addr); + struct thread *machine__find_thread(struct machine *machine, pid_t pid, pid_t tid); struct comm *machine__thread_exec_comm(struct machine *machine, diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h index 30e2b4c165fe..5920c3bb8ffe 100644 --- a/tools/perf/util/thread.h +++ b/tools/perf/util/thread.h @@ -96,9 +96,13 @@ struct thread *thread__main_thread(struct machine *machine, struct thread *threa struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr, struct addr_location *al); +struct map *thread__find_map_fb(struct thread *thread, u8 cpumode, u64 addr, + struct addr_location *al); struct symbol *thread__find_symbol(struct thread *thread, u8 cpumode, u64 addr, struct addr_location *al); +struct symbol *thread__find_symbol_fb(struct thread *thread, u8 cpumode, + u64 addr, struct addr_location *al); void thread__find_cpumode_addr_location(struct thread *thread, u64 addr, struct addr_location *al); -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
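To illustrate the fallback in machine__addr_cpumode() above, here is a minimal standalone sketch. The demo_* names, the struct layout and the enum values are stand-ins chosen for the example and are not the perf definitions; the kernel/user decision is reduced to a comparison against an assumed kernel_start, as machine__kernel_ip() does in the patch context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in cpumode values for the sketch (not taken from the perf headers) */
enum demo_cpumode {
	DEMO_KERNEL = 1,
	DEMO_USER = 2,
	DEMO_GUEST_KERNEL = 4,
	DEMO_GUEST_USER = 5,
};

/* Hypothetical reduced 'struct machine' with only the fields the sketch needs */
struct demo_machine {
	bool single_address_space;
	uint64_t kernel_start;
};

/* Derive the cpumode from the address itself, when the arch allows it */
uint8_t demo_addr_cpumode(const struct demo_machine *m, uint8_t cpumode,
			  uint64_t addr)
{
	bool kernel_ip;

	/* Separate address spaces: the address alone says nothing */
	if (!m->single_address_space)
		return cpumode;

	kernel_ip = addr >= m->kernel_start;
	switch (cpumode) {
	case DEMO_KERNEL:
	case DEMO_USER:
		return kernel_ip ? DEMO_KERNEL : DEMO_USER;
	case DEMO_GUEST_KERNEL:
	case DEMO_GUEST_USER:
		return kernel_ip ? DEMO_GUEST_KERNEL : DEMO_GUEST_USER;
	default:
		return cpumode;
	}
}

/* Usage: a user-mode sample whose branch target lies in the kernel */
void demo_addr_cpumode_checks(void)
{
	struct demo_machine m = { true, 0xffff800000000000ULL };

	assert(demo_addr_cpumode(&m, DEMO_USER, 0xffff800000001000ULL) == DEMO_KERNEL);
}
```

The guest/host distinction is taken from the sample cpumode and only the kernel/user half is re-derived, which mirrors the switch in the patch.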
* [PATCH 08/22] perf tools: Use fallback for sample_addr_correlates_sym() cases
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (6 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 07/22] perf thread: Add fallback functions for cases where cpumode is insufficient Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 09/22] perf script: Use fallbacks for branch stacks Arnaldo Carvalho de Melo
` (13 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Andi Kleen, David S . Miller, Jiri Olsa, Leo Yan, Mathieu Poirier,
stable, Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
thread__resolve() is used in the sample_addr_correlates_sym() cases where
'addr' is a destination of a branch which does not necessarily have the
same cpumode as the 'ip'. Use the fallback function in that case.
This patch depends on patch "perf tools: Add fallback functions for cases
where cpumode is insufficient".
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20181106210712.12098-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 9431b20c1337..24493200cf80 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -1706,7 +1706,7 @@ bool sample_addr_correlates_sym(struct perf_event_attr *attr)
 void thread__resolve(struct thread *thread, struct addr_location *al,
 		     struct perf_sample *sample)
 {
-	thread__find_map(thread, sample->cpumode, sample->addr, al);
+	thread__find_map_fb(thread, sample->cpumode, sample->addr, al);
 	al->cpu = sample->cpu;
 	al->sym = NULL;
--
2.19.1
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 09/22] perf script: Use fallbacks for branch stacks
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (7 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 08/22] perf tools: Use fallback for sample_addr_correlates_sym() cases Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 10/22] tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c Arnaldo Carvalho de Melo
` (12 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Andi Kleen, David S . Miller, Jiri Olsa, Leo Yan, Mathieu Poirier,
stable, Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
Branch stacks do not necessarily have the same cpumode as the 'ip'. Use
the fallback functions in those cases.
This patch depends on patch "perf tools: Add fallback functions for cases
where cpumode is insufficient".
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S.
Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20181106210712.12098-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/builtin-script.c | 12 ++++++------ .../util/scripting-engines/trace-event-python.c | 16 ++++++++-------- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 04913136bac9..3ea98fe72f7f 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -724,8 +724,8 @@ static int perf_sample__fprintf_brstack(struct perf_sample *sample, if (PRINT_FIELD(DSO)) { memset(&alf, 0, sizeof(alf)); memset(&alt, 0, sizeof(alt)); - thread__find_map(thread, sample->cpumode, from, &alf); - thread__find_map(thread, sample->cpumode, to, &alt); + thread__find_map_fb(thread, sample->cpumode, from, &alf); + thread__find_map_fb(thread, sample->cpumode, to, &alt); } printed += fprintf(fp, " 0x%"PRIx64, from); @@ -771,8 +771,8 @@ static int perf_sample__fprintf_brstacksym(struct perf_sample *sample, from = br->entries[i].from; to = br->entries[i].to; - thread__find_symbol(thread, sample->cpumode, from, &alf); - thread__find_symbol(thread, sample->cpumode, to, &alt); + thread__find_symbol_fb(thread, sample->cpumode, from, &alf); + thread__find_symbol_fb(thread, sample->cpumode, to, &alt); printed += symbol__fprintf_symname_offs(alf.sym, &alf, fp); if (PRINT_FIELD(DSO)) { @@ -816,11 +816,11 @@ static int perf_sample__fprintf_brstackoff(struct perf_sample *sample, from = br->entries[i].from; to = br->entries[i].to; - if (thread__find_map(thread, sample->cpumode, from, &alf) && + if (thread__find_map_fb(thread, sample->cpumode, from, &alf) && !alf.map->dso->adjust_symbols) from = map__map_ip(alf.map, from); - if (thread__find_map(thread, sample->cpumode, to, &alt) && + if 
(thread__find_map_fb(thread, sample->cpumode, to, &alt) && !alt.map->dso->adjust_symbols) to = map__map_ip(alt.map, to); diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c index 69aa93d4ee99..0c4b050f6fc2 100644 --- a/tools/perf/util/scripting-engines/trace-event-python.c +++ b/tools/perf/util/scripting-engines/trace-event-python.c @@ -494,14 +494,14 @@ static PyObject *python_process_brstack(struct perf_sample *sample, pydict_set_item_string_decref(pyelem, "cycles", PyLong_FromUnsignedLongLong(br->entries[i].flags.cycles)); - thread__find_map(thread, sample->cpumode, - br->entries[i].from, &al); + thread__find_map_fb(thread, sample->cpumode, + br->entries[i].from, &al); dsoname = get_dsoname(al.map); pydict_set_item_string_decref(pyelem, "from_dsoname", _PyUnicode_FromString(dsoname)); - thread__find_map(thread, sample->cpumode, - br->entries[i].to, &al); + thread__find_map_fb(thread, sample->cpumode, + br->entries[i].to, &al); dsoname = get_dsoname(al.map); pydict_set_item_string_decref(pyelem, "to_dsoname", _PyUnicode_FromString(dsoname)); @@ -576,14 +576,14 @@ static PyObject *python_process_brstacksym(struct perf_sample *sample, if (!pyelem) Py_FatalError("couldn't create Python dictionary"); - thread__find_symbol(thread, sample->cpumode, - br->entries[i].from, &al); + thread__find_symbol_fb(thread, sample->cpumode, + br->entries[i].from, &al); get_symoff(al.sym, &al, true, bf, sizeof(bf)); pydict_set_item_string_decref(pyelem, "from", _PyUnicode_FromString(bf)); - thread__find_symbol(thread, sample->cpumode, - br->entries[i].to, &al); + thread__find_symbol_fb(thread, sample->cpumode, + br->entries[i].to, &al); get_symoff(al.sym, &al, true, bf, sizeof(bf)); pydict_set_item_string_decref(pyelem, "to", _PyUnicode_FromString(bf)); -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 10/22] tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (8 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 09/22] perf script: Use fallbacks for branch stacks Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 11/22] perf tests record: Allow for 'sleep' being 'coreutils' Arnaldo Carvalho de Melo
` (11 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Jiri Olsa, Steven Rostedt, Tzvetomir Stoyanov,
Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
Fix the following warnings:
event-parse.c: In function ‘tep_find_event_by_name’:
event-parse.c:3521:21: warning: ‘event’ may be used uninitialized in this function [-Wmaybe-uninitialized]
pevent->last_event = event;
~~~~~~~~~~~~~~~~~~~^~~~~~~
CC ui/gtk/hists.o
LINK plugin_mac80211.so
CC nlattr.o
event-parse.c: In function ‘tep_data_lat_fmt’:
event-parse.c:5200:4: warning: ‘migrate_disable’ may be used uninitialized in this function [-Wmaybe-uninitialized]
trace_seq_printf(s, "%d", migrate_disable);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
event-parse.c:5207:4: warning: ‘lock_depth’ may be used uninitialized in this function [-Wmaybe-uninitialized]
trace_seq_printf(s, "%d", lock_depth);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LINK plugin_sched_switch.so
LINK plugin_function.so
LINK plugin_xen.so
event-parse.c: In function ‘tep_event_info’:
event-parse.c:5047:7: warning: ‘len_arg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
trace_seq_printf(s, format, len_arg, (char)val);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
event-parse.c:4884:6: note: ‘len_arg’ was declared here
int len_arg;
^~~~~~~
event-parse.c:4338:11: warning: ‘vsize’ may
be used uninitialized in this function [-Wmaybe-uninitialized] val = tep_read_number(pevent, bptr, vsize); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ event-parse.c:4224:6: note: ‘vsize’ was declared here int vsize; ^~~~~ $ gcc --version gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502 Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Link: http://lkml.kernel.org/r/20181122112937.10582-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/lib/traceevent/event-parse.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c index 3692f29fee46..fbd6d6813fab 100644 --- a/tools/lib/traceevent/event-parse.c +++ b/tools/lib/traceevent/event-parse.c @@ -3498,7 +3498,7 @@ struct tep_event_format * tep_find_event_by_name(struct tep_handle *pevent, const char *sys, const char *name) { - struct tep_event_format *event; + struct tep_event_format *event = NULL; int i; if (pevent->last_event && @@ -4221,7 +4221,7 @@ static struct tep_print_arg *make_bprint_args(char *fmt, void *data, int size, s unsigned long long ip, val; char *ptr; void *bptr; - int vsize; + int vsize = 0; field = pevent->bprint_buf_field; ip_field = pevent->bprint_ip_field; @@ -4881,7 +4881,7 @@ static void pretty_print(struct trace_seq *s, void *data, int size, struct tep_e char format[32]; int show_func; int len_as_arg; - int len_arg; + int len_arg = 0; int len; int ls; @@ -5146,8 +5146,8 @@ void tep_data_lat_fmt(struct tep_handle *pevent, static int migrate_disable_exists; unsigned int lat_flags; unsigned int pc; - int lock_depth; - int migrate_disable; + int lock_depth = 0; + int migrate_disable = 0; int hardirq; int softirq; void *data = record->data; -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 11/22] perf tests record: Allow for 'sleep' being 'coreutils'
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (9 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 10/22] tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 12/22] perf test: Fix perf_event_attr test failure Arnaldo Carvalho de Melo
` (10 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Jiri Olsa, Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
If the 'sleep' command is provided by coreutils, then the
"PERF_RECORD_* events & perf_sample fields" test will fail because the
MMAP name is 'coreutils' not 'sleep', and there is an extra COMM event.
Fix the test to detect that case.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20181122135545.16295-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/tests/perf-record.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c index 34394cc05077..07f6bd8ed719 100644 --- a/tools/perf/tests/perf-record.c +++ b/tools/perf/tests/perf-record.c @@ -58,6 +58,7 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus char *bname, *mmap_filename; u64 prev_time = 0; bool found_cmd_mmap = false, + found_coreutils_mmap = false, found_libc_mmap = false, found_vdso_mmap = false, found_ld_mmap = false; @@ -254,6 +255,8 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus if (bname != NULL) { if (!found_cmd_mmap) found_cmd_mmap = !strcmp(bname + 1, cmd); + if (!found_coreutils_mmap) + found_coreutils_mmap = !strcmp(bname + 1, "coreutils"); if (!found_libc_mmap) found_libc_mmap = !strncmp(bname + 1, "libc", 4); if (!found_ld_mmap) @@ -292,7 +295,7 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus } found_exit: - if (nr_events[PERF_RECORD_COMM] > 1) { + if (nr_events[PERF_RECORD_COMM] > 1 + !!found_coreutils_mmap) { pr_debug("Excessive number of PERF_RECORD_COMM events!\n"); ++errs; } @@ -302,7 +305,7 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus ++errs; } - if (!found_cmd_mmap) { + if (!found_cmd_mmap && !found_coreutils_mmap) { pr_debug("PERF_RECORD_MMAP for %s missing!\n", cmd); ++errs; } -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
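The relaxed COMM check above relies on "!!" collapsing any non-zero value to 1, so exactly one extra COMM event is tolerated when the coreutils mapping was seen. A minimal standalone sketch of that threshold, with a hypothetical helper name that does not exist in the test:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when the number of COMM events exceeds what the
 * test tolerates. One event is always allowed, plus one more when 'sleep'
 * turned out to be the multi-call 'coreutils' binary.
 */
bool demo_excessive_comm_events(unsigned int nr_comm, bool found_coreutils_mmap)
{
	/* !! normalizes the flag to 0 or 1 before it is added to the limit */
	return nr_comm > 1u + !!found_coreutils_mmap;
}

/* Usage: two COMM events are fine only in the coreutils case */
void demo_excessive_comm_events_checks(void)
{
	assert(demo_excessive_comm_events(2, false));
	assert(!demo_excessive_comm_events(2, true));
}
```

With a bool argument the "!!" is redundant, it is kept here only to match the expression in the patch.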
* [PATCH 12/22] perf test: Fix perf_event_attr test failure
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (10 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 11/22] perf tests record: Allow for 'sleep' being 'coreutils' Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 13/22] tools include: Adopt ERR_CAST() from the kernel err.h header Arnaldo Carvalho de Melo
` (9 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Jiri Olsa, Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
Fix inconsistent use of tabs and spaces error:
# perf test 16 -v
16: Setup struct perf_event_attr :
--- start ---
test child forked, pid 20224
File "/usr/libexec/perf-core/tests/attr.py", line 119
log.warning("expected %s=%s, got %s" % (t, self[t], other[t]))
^
TabError: inconsistent use of tabs and spaces in indentation
test child finished with -1
---- end ----
Setup struct perf_event_attr: FAILED!
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20181122140456.16817-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/tests/attr.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/tests/attr.py b/tools/perf/tests/attr.py index ff9b60b99f52..44090a9a19f3 100644 --- a/tools/perf/tests/attr.py +++ b/tools/perf/tests/attr.py @@ -116,7 +116,7 @@ class Event(dict): if not self.has_key(t) or not other.has_key(t): continue if not data_equal(self[t], other[t]): - log.warning("expected %s=%s, got %s" % (t, self[t], other[t])) + log.warning("expected %s=%s, got %s" % (t, self[t], other[t])) # Test file description needs to have following sections: # [config] -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 13/22] tools include: Adopt ERR_CAST() from the kernel err.h header
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (11 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 12/22] perf test: Fix perf_event_attr test failure Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 14/22] perf bpf: Use ERR_CAST instead of ERR_PTR(PTR_ERR()) Arnaldo Carvalho de Melo
` (8 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, Alexander Shishkin,
David Ahern, David Howells, Jiri Olsa, Julia Lawall, Namhyung Kim,
Peter Zijlstra, Wang Nan, Wen Yang, zhong.weidong
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Add ERR_CAST(), so that tools can use it, just like the kernel.
This addresses coccinelle checks that are being performed on tools/ in
addition to kernel sources, so let's add this to cover that and to get
tools code closer to kernel coding standards.
This originally was introduced in the kernel headers in this cset: d1bc8e954452 ("Add an ERR_CAST() function to complement ERR_PTR and co.") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Julia Lawall <julia.lawall@lip6.fr> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: Wen Yang <yellowriver2010@hotmail.com> Cc: zhong.weidong@zte.com.cn Link: https://lkml.kernel.org/n/tip-tlt97p066zyhzqhl5jt86og7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/include/linux/err.h | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/tools/include/linux/err.h b/tools/include/linux/err.h index 094649667bae..2f5a12b88a86 100644 --- a/tools/include/linux/err.h +++ b/tools/include/linux/err.h @@ -59,4 +59,17 @@ static inline int __must_check PTR_ERR_OR_ZERO(__force const void *ptr) else return 0; } + +/** + * ERR_CAST - Explicitly cast an error-valued pointer to another pointer type + * @ptr: The pointer to cast. + * + * Explicitly cast an error-valued pointer to another pointer type in such a + * way as to make it clear that's what's going on. + */ +static inline void * __must_check ERR_CAST(__force const void *ptr) +{ + /* cast away the const */ + return (void *) ptr; +} #endif /* _LINUX_ERR_H */ -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 14/22] perf bpf: Use ERR_CAST instead of ERR_PTR(PTR_ERR())
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (12 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 13/22] tools include: Adopt ERR_CAST() from the kernel err.h header Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 15/22] perf top: Allow passing a kallsyms file Arnaldo Carvalho de Melo
` (7 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Wen Yang,
Alexander Shishkin, Jiri Olsa, Julia Lawall, Namhyung Kim,
Peter Zijlstra, Wen Yang, zhong.weidong, Arnaldo Carvalho de Melo
From: Wen Yang <wen.yang99@zte.com.cn>
Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)).
This makes it more readable and also fixes this warning detected by
err_cast.cocci:
tools/perf/util/bpf-loader.c:1606:11-18: WARNING: ERR_CAST can be used with op
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Julia Lawall <julia.lawall@lip6.fr>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wen Yang <yellowriver2010@hotmail.com>
Cc: zhong.weidong@zte.com.cn
Link: http://lkml.kernel.org/r/20181127090610.28488-1-wen.yang99@zte.com.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/bpf-loader.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index f9ae1a993806..9a280647d829 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -1603,7 +1603,7 @@ struct perf_evsel *bpf__setup_output_event(struct perf_evlist *evlist, const cha
 			op = bpf_map__add_newop(map, NULL);
 			if (IS_ERR(op))
-				return ERR_PTR(PTR_ERR(op));
+				return ERR_CAST(op);
 			op->op_type = BPF_MAP_OP_SET_EVSEL;
 			op->v.evsel = evsel;
 		}
--
2.19.1
^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 15/22] perf top: Allow passing a kallsyms file 2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (13 preceding siblings ...) 2018-11-30 18:26 ` [PATCH 14/22] perf bpf: Use ERR_CAST instead of ERR_PTR(PTR_ERR()) Arnaldo Carvalho de Melo @ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo 2018-11-30 18:26 ` [PATCH 16/22] perf intel-pt: Fix error with config term "pt=0" Arnaldo Carvalho de Melo ` (6 subsequent siblings) 21 siblings, 0 replies; 36+ messages in thread From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw) To: Ingo Molnar Cc: Clark Williams, linux-kernel, linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter, Alexei Starovoitov, Daniel Borkmann, David Ahern, David S . Miller, Jiri Olsa, Namhyung Kim, Wang Nan From: Arnaldo Carvalho de Melo <acme@redhat.com> This basically replicates what was done for 'perf report' in: b226a5a72901 ("perf report: Allow user to specify path to kallsyms file") This should help with resolving eBPF symbols, that are in kallsyms but, of course, not in vmlinux. Reported-by: Ivan Babrou <ibobrik@gmail.com> Tested-by: Ivan Babrou <ibobrik@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-x52mx1ybq8128rtg9hjrj5qk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/Documentation/perf-top.txt | 3 +++ tools/perf/builtin-top.c | 2 ++ 2 files changed, 5 insertions(+) diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt index 808b664343c9..44d89fb9c788 100644 --- a/tools/perf/Documentation/perf-top.txt +++ b/tools/perf/Documentation/perf-top.txt @@ -70,6 +70,9 @@ Default is to monitor all CPUS. 
--ignore-vmlinux:: Ignore vmlinux files. +--kallsyms=<file>:: + kallsyms pathname + -m <pages>:: --mmap-pages=<pages>:: Number of mmap data pages (must be a power of two) or size diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index aa0c73e57924..1252d1759064 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1289,6 +1289,8 @@ int cmd_top(int argc, const char **argv) "file", "vmlinux pathname"), OPT_BOOLEAN(0, "ignore-vmlinux", &symbol_conf.ignore_vmlinux, "don't load vmlinux even if found"), + OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name, + "file", "kallsyms pathname"), OPT_BOOLEAN('K', "hide_kernel_symbols", &top.hide_kernel_symbols, "hide kernel symbols"), OPT_CALLBACK('m', "mmap-pages", &opts->mmap_pages, "pages", -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 16/22] perf intel-pt: Fix error with config term "pt=0"
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (14 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 15/22] perf top: Allow passing a kallsyms file Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 17/22] tools build feature: Check if libaio is available Arnaldo Carvalho de Melo
` (5 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Adrian Hunter,
Jiri Olsa, Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
Users should never use 'pt=0', but if they do it may give a meaningless
error:
$ perf record -e intel_pt/pt=0/u uname
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u).
Fix that by forcing 'pt=1'.
Committer testing:
# perf record -e intel_pt/pt=0/u uname
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u).
/bin/dmesg | grep -i perf may provide additional information.
# perf record -e intel_pt/pt=0/u uname pt=0 doesn't make sense, forcing pt=1 Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data ] # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/arch/x86/util/intel-pt.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c index db0ba8caf5a2..ba8ecaf52200 100644 --- a/tools/perf/arch/x86/util/intel-pt.c +++ b/tools/perf/arch/x86/util/intel-pt.c @@ -524,10 +524,21 @@ static int intel_pt_validate_config(struct perf_pmu *intel_pt_pmu, struct perf_evsel *evsel) { int err; + char c; if (!evsel) return 0; + /* + * If supported, force pass-through config term (pt=1) even if user + * sets pt=0, which avoids senseless kernel errors. + */ + if (perf_pmu__scan_file(intel_pt_pmu, "format/pt", "%c", &c) == 1 && + !(evsel->attr.config & 1)) { + pr_warning("pt=0 doesn't make sense, forcing pt=1\n"); + evsel->attr.config |= 1; + } + err = intel_pt_val_config_term(intel_pt_pmu, "caps/cycle_thresholds", "cyc_thresh", "caps/psb_cyc", evsel->attr.config); -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
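The forcing logic in intel_pt_validate_config() above can be reduced to a small standalone sketch. demo_force_pt() is a hypothetical helper, and it takes as given what the patch itself relies on: the 'pt' term occupies bit 0 of attr.config, and the term is only forced when the PMU exposes a "format/pt" file (represented here by a plain bool).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper mirroring the check in the patch: if the PMU supports
 * the 'pt' config term and the user cleared it, warn and set it again.
 * Returns the (possibly adjusted) config value.
 */
uint64_t demo_force_pt(uint64_t config, bool pt_format_supported)
{
	if (pt_format_supported && !(config & 1)) {
		fprintf(stderr, "pt=0 doesn't make sense, forcing pt=1\n");
		config |= 1;
	}
	return config;
}

/* Usage: only bit 0 changes, other config terms are preserved */
void demo_force_pt_checks(void)
{
	assert(demo_force_pt(0x400, true) == 0x401);
	assert(demo_force_pt(0x401, true) == 0x401);
}
```

Warning and continuing, rather than failing, matches the "after" output in the committer testing, where the record session completes normally.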
* [PATCH 17/22] tools build feature: Check if libaio is available 2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (15 preceding siblings ...) 2018-11-30 18:26 ` [PATCH 16/22] perf intel-pt: Fix error with config term "pt=0" Arnaldo Carvalho de Melo @ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo 2018-11-30 18:26 ` [PATCH 18/22] perf mmap: Map data buffer for preserving collected data Arnaldo Carvalho de Melo ` (4 subsequent siblings) 21 siblings, 0 replies; 36+ messages in thread From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw) To: Ingo Molnar Cc: Clark Williams, linux-kernel, linux-perf-users, Alexey Budankov, Alexander Shishkin, Andi Kleen, Peter Zijlstra, Arnaldo Carvalho de Melo From: Alexey Budankov <alexey.budankov@linux.intel.com> This will be used by 'perf record' to speed up reading the perf ring buffer. Committer testing: $ make -C tools/perf O=/tmp/build/perf make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j8' parallel build Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] ... gtk2: [ OFF ] ... libaudit: [ OFF ] ... libbfd: [ OFF ] ... libelf: [ on ] ... libnuma: [ OFF ] ... numa_num_possible_cpus: [ OFF ] ... libperl: [ OFF ] ... libpython: [ OFF ] ... libslang: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] $ ls -la /tmp/build/perf/feature/test-libaio.* -rwxrwxr-x. 1 acme acme 18296 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.bin -rw-rw-r--. 1 acme acme 1165 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.d -rw-rw-r--. 
1 acme acme 0 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.make.output $ $ grep -i aio /tmp/build/perf/FEATURE-DUMP feature-libaio=1 $ Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5fcda10c-6c63-68df-383a-c6d9e5d1f918@linux.intel.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/build/Makefile.feature | 6 ++++-- tools/build/feature/Makefile | 6 +++++- tools/build/feature/test-all.c | 5 +++++ tools/build/feature/test-libaio.c | 16 ++++++++++++++++ tools/perf/Makefile.config | 6 ++++++ tools/perf/Makefile.perf | 7 ++++++- 6 files changed, 42 insertions(+), 4 deletions(-) create mode 100644 tools/build/feature/test-libaio.c diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 8a123834a2a3..d47b8f73e2e7 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -70,7 +70,8 @@ FEATURE_TESTS_BASIC := \ sched_getcpu \ sdt \ setns \ - libopencsd + libopencsd \ + libaio # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list # of all feature tests @@ -116,7 +117,8 @@ FEATURE_DISPLAY ?= \ zlib \ lzma \ get_cpuid \ - bpf + bpf \ + libaio # Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features. 
# If in the future we need per-feature checks/flags for features not diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 38c22e122cb0..2dbcc0d00f52 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -61,7 +61,8 @@ FILES= \ test-libopencsd.bin \ test-clang.bin \ test-llvm.bin \ - test-llvm-version.bin + test-llvm-version.bin \ + test-libaio.bin FILES := $(addprefix $(OUTPUT),$(FILES)) @@ -297,6 +298,9 @@ $(OUTPUT)test-clang.bin: -include $(OUTPUT)*.d +$(OUTPUT)test-libaio.bin: + $(BUILD) -lrt + ############################### clean: diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index 58f01b950195..20cdaa4fc112 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -174,6 +174,10 @@ # include "test-libopencsd.c" #undef main +#define main main_test_libaio +# include "test-libaio.c" +#undef main + int main(int argc, char *argv[]) { main_test_libpython(); @@ -214,6 +218,7 @@ int main(int argc, char *argv[]) main_test_sdt(); main_test_setns(); main_test_libopencsd(); + main_test_libaio(); return 0; } diff --git a/tools/build/feature/test-libaio.c b/tools/build/feature/test-libaio.c new file mode 100644 index 000000000000..932133c9a265 --- /dev/null +++ b/tools/build/feature/test-libaio.c @@ -0,0 +1,16 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <aio.h> + +int main(void) +{ + struct aiocb aiocb; + + aiocb.aio_fildes = 0; + aiocb.aio_offset = 0; + aiocb.aio_buf = 0; + aiocb.aio_nbytes = 0; + aiocb.aio_reqprio = 0; + aiocb.aio_sigevent.sigev_notify = 1 /*SIGEV_NONE*/; + + return (int)aio_return(&aiocb); +} diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index c643d5e0c26b..b66f97a04b12 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -365,6 +365,12 @@ ifeq ($(feature-glibc), 1) CFLAGS += -DHAVE_GLIBC_SUPPORT endif +ifeq ($(feature-libaio), 1) + ifndef NO_AIO + CFLAGS += -DHAVE_AIO_SUPPORT + endif +endif + 
ifdef NO_DWARF NO_LIBDW_DWARF_UNWIND := 1 endif diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index 239e7b3270f4..67e9adbe6ee8 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -101,8 +101,13 @@ include ../scripts/utilities.mak # Define LIBCLANGLLVM if you DO want builtin clang and llvm support. # When selected, pass LLVM_CONFIG=/path/to/llvm-config to `make' if # llvm-config is not in $PATH. - +# # Define NO_CORESIGHT if you do not want support for CoreSight trace decoding. +# +# Define NO_AIO if you do not want support of Posix AIO based trace +# streaming for record mode. Currently Posix AIO trace streaming is +# supported only when linking with glibc. +# # As per kernel Makefile, avoid funny character set dependencies unexport LC_ALL -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH 18/22] perf mmap: Map data buffer for preserving collected data
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (16 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 17/22] tools build feature: Check if libaio is available Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 19/22] perf record: Enable asynchronous trace writing Arnaldo Carvalho de Melo
` (3 subsequent siblings)
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users, Alexey Budankov,
Alexander Shishkin, Andi Kleen, Peter Zijlstra,
Arnaldo Carvalho de Melo

From: Alexey Budankov <alexey.budankov@linux.intel.com>

The map->data buffer is used to preserve map->base profiling data for
writing to disk. AIO map->cblock is used to queue corresponding
map->data buffer for asynchronous writing.

Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/5fcda10c-6c63-68df-383a-c6d9e5d1f918@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/evlist.c |  2 +-
 tools/perf/util/mmap.c   | 49 +++++++++++++++++++++++++++++++++++++++-
 tools/perf/util/mmap.h   | 11 ++++++++-
 3 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 36526d229315..6f010b9f0a81 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -1028,7 +1028,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	 * Its value is decided by evsel's write_backward.
 	 * So &mp should not be passed through const pointer.
 	 */
-	struct mmap_params mp;
+	struct mmap_params mp = { .nr_cblocks = 0 };

 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist, false);
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index cdb95b3a1213..47cdc3ad6546 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -153,8 +153,55 @@ void __weak auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp __mayb
 {
 }

+#ifdef HAVE_AIO_SUPPORT
+static int perf_mmap__aio_mmap(struct perf_mmap *map, struct mmap_params *mp)
+{
+	int delta_max;
+
+	if (mp->nr_cblocks) {
+		map->aio.data = malloc(perf_mmap__mmap_len(map));
+		if (!map->aio.data) {
+			pr_debug2("failed to allocate data buffer, error %m\n");
+			return -1;
+		}
+		/*
+		 * Use cblock.aio_fildes value different from -1
+		 * to denote started aio write operation on the
+		 * cblock so it requires explicit record__aio_sync()
+		 * call prior the cblock may be reused again.
+		 */
+		map->aio.cblock.aio_fildes = -1;
+		/*
+		 * Allocate cblock with max priority delta to
+		 * have faster aio write system calls.
+		 */
+		delta_max = sysconf(_SC_AIO_PRIO_DELTA_MAX);
+		map->aio.cblock.aio_reqprio = delta_max;
+	}
+
+	return 0;
+}
+
+static void perf_mmap__aio_munmap(struct perf_mmap *map)
+{
+	if (map->aio.data)
+		zfree(&map->aio.data);
+}
+#else
+static int perf_mmap__aio_mmap(struct perf_mmap *map __maybe_unused,
+			       struct mmap_params *mp __maybe_unused)
+{
+	return 0;
+}
+
+static void perf_mmap__aio_munmap(struct perf_mmap *map __maybe_unused)
+{
+}
+#endif
+
 void perf_mmap__munmap(struct perf_mmap *map)
 {
+	perf_mmap__aio_munmap(map);
 	if (map->base != NULL) {
 		munmap(map->base, perf_mmap__mmap_len(map));
 		map->base = NULL;
@@ -197,7 +244,7 @@ int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd, int c
 				&mp->auxtrace_mp, map->base, fd))
 		return -1;

-	return 0;
+	return perf_mmap__aio_mmap(map, mp);
 }

 static int overwrite_rb_find_range(void *buf, int mask, u64 *start, u64 *end)
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index cc5e2d6d17a9..3f10ad030c5e 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -6,6 +6,9 @@
 #include <linux/types.h>
 #include <linux/ring_buffer.h>
 #include <stdbool.h>
+#ifdef HAVE_AIO_SUPPORT
+#include <aio.h>
+#endif
 #include "auxtrace.h"
 #include "event.h"

@@ -26,6 +29,12 @@ struct perf_mmap {
 	bool overwrite;
 	struct auxtrace_mmap auxtrace_mmap;
 	char event_copy[PERF_SAMPLE_MAX_SIZE] __aligned(8);
+#ifdef HAVE_AIO_SUPPORT
+	struct {
+		void *data;
+		struct aiocb cblock;
+	} aio;
+#endif
 };

 /*
@@ -57,7 +66,7 @@ enum bkw_mmap_state {
 };

 struct mmap_params {
-	int prot, mask;
+	int prot, mask, nr_cblocks;
 	struct auxtrace_mmap_params auxtrace_mp;
 };

-- 
2.19.1

^ permalink raw reply related [flat|nested] 36+ messages in thread
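The aio_fildes convention described in the patch above (a control block is idle while aio_fildes is -1 and counts as started once it carries a real descriptor) may be easier to follow as a standalone sketch. The helper names cblock_mark_idle(), cblock_is_busy() and cblock_prepare() are hypothetical; perf open-codes these checks on map->aio.cblock. Nothing is queued here, so no AIO library call is made and only the struct aiocb layout from <aio.h> is used.

```c
#include <aio.h>
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: mark a control block as idle (no write started). */
static void cblock_mark_idle(struct aiocb *cb)
{
	memset(cb, 0, sizeof(*cb));
	cb->aio_fildes = -1;
}

/* Hypothetical helper: a block is busy once it carries a real descriptor. */
static bool cblock_is_busy(const struct aiocb *cb)
{
	return cb->aio_fildes != -1;
}

/*
 * Hypothetical helper: fill in a write request. This only prepares the
 * block; a real caller would then hand it to aio_write().
 */
static void cblock_prepare(struct aiocb *cb, int fd, void *buf,
			   size_t size, off_t off)
{
	assert(fd >= 0);

	cb->aio_fildes = fd;
	cb->aio_buf = buf;
	cb->aio_nbytes = size;
	cb->aio_offset = off;
	cb->aio_sigevent.sigev_notify = SIGEV_NONE;
}
```

Using the descriptor field as the state flag avoids a separate "in flight" member, at the cost of having to reset it to -1 on every completion or failed submission.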
* [PATCH 19/22] perf record: Enable asynchronous trace writing 2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (17 preceding siblings ...) 2018-11-30 18:26 ` [PATCH 18/22] perf mmap: Map data buffer for preserving collected data Arnaldo Carvalho de Melo @ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo 2018-11-30 18:26 ` [PATCH 20/22] perf record: Extend trace writing to multi AIO Arnaldo Carvalho de Melo ` (2 subsequent siblings) 21 siblings, 0 replies; 36+ messages in thread From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw) To: Ingo Molnar Cc: Clark Williams, linux-kernel, linux-perf-users, Alexey Budankov, Alexander Shishkin, Andi Kleen, Peter Zijlstra, Arnaldo Carvalho de Melo From: Alexey Budankov <alexey.budankov@linux.intel.com> The trace file offset is read once before mmaps iterating loop and written back after all performance data is enqueued for aio writing. The trace file offset is incremented linearly after every successful aio write operation. record__aio_sync() blocks till completion of the started AIO operation and then proceeds. record__aio_mmap_read_sync() implements a barrier for all incomplete aio write requests. 
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ce2d45e9-d236-871c-7c8f-1bed2d37e8ac@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/Documentation/perf-record.txt | 5 + tools/perf/builtin-record.c | 218 ++++++++++++++++++++++- tools/perf/perf.h | 1 + tools/perf/util/evlist.c | 6 +- tools/perf/util/evlist.h | 2 +- tools/perf/util/mmap.c | 77 +++++++- tools/perf/util/mmap.h | 14 ++ 7 files changed, 314 insertions(+), 9 deletions(-) diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 246dee081efd..7efb4af88a68 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -435,6 +435,11 @@ Specify vmlinux path which has debuginfo. --buildid-all:: Record build-id of all DSOs regardless whether it's actually hit or not. +--aio:: +Enable asynchronous (Posix AIO) trace writing mode. +Asynchronous mode is supported only when linking Perf tool with libc library +providing implementation for Posix AIO API. + --all-kernel:: Configure all used events to run in kernel space. 
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 488779bc4c8d..408d6477c960 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -124,6 +124,183 @@ static int record__write(struct record *rec, struct perf_mmap *map __maybe_unuse return 0; } +#ifdef HAVE_AIO_SUPPORT +static int record__aio_write(struct aiocb *cblock, int trace_fd, + void *buf, size_t size, off_t off) +{ + int rc; + + cblock->aio_fildes = trace_fd; + cblock->aio_buf = buf; + cblock->aio_nbytes = size; + cblock->aio_offset = off; + cblock->aio_sigevent.sigev_notify = SIGEV_NONE; + + do { + rc = aio_write(cblock); + if (rc == 0) { + break; + } else if (errno != EAGAIN) { + cblock->aio_fildes = -1; + pr_err("failed to queue perf data, error: %m\n"); + break; + } + } while (1); + + return rc; +} + +static int record__aio_complete(struct perf_mmap *md, struct aiocb *cblock) +{ + void *rem_buf; + off_t rem_off; + size_t rem_size; + int rc, aio_errno; + ssize_t aio_ret, written; + + aio_errno = aio_error(cblock); + if (aio_errno == EINPROGRESS) + return 0; + + written = aio_ret = aio_return(cblock); + if (aio_ret < 0) { + if (aio_errno != EINTR) + pr_err("failed to write perf data, error: %m\n"); + written = 0; + } + + rem_size = cblock->aio_nbytes - written; + + if (rem_size == 0) { + cblock->aio_fildes = -1; + /* + * md->refcount is incremented in perf_mmap__push() for + * every enqueued aio write request so decrement it because + * the request is now complete. + */ + perf_mmap__put(md); + rc = 1; + } else { + /* + * aio write request may require restart with the + * reminder if the kernel didn't write whole + * chunk at once. 
+ */ + rem_off = cblock->aio_offset + written; + rem_buf = (void *)(cblock->aio_buf + written); + record__aio_write(cblock, cblock->aio_fildes, + rem_buf, rem_size, rem_off); + rc = 0; + } + + return rc; +} + +static void record__aio_sync(struct perf_mmap *md) +{ + struct aiocb *cblock = &md->aio.cblock; + struct timespec timeout = { 0, 1000 * 1000 * 1 }; /* 1ms */ + + do { + if (cblock->aio_fildes == -1 || record__aio_complete(md, cblock)) + return; + + while (aio_suspend((const struct aiocb**)&cblock, 1, &timeout)) { + if (!(errno == EAGAIN || errno == EINTR)) + pr_err("failed to sync perf data, error: %m\n"); + } + } while (1); +} + +static int record__aio_pushfn(void *to, struct aiocb *cblock, void *bf, size_t size, off_t off) +{ + struct record *rec = to; + int ret, trace_fd = rec->session->data->file.fd; + + rec->samples++; + + ret = record__aio_write(cblock, trace_fd, bf, size, off); + if (!ret) { + rec->bytes_written += size; + if (switch_output_size(rec)) + trigger_hit(&switch_output_trigger); + } + + return ret; +} + +static off_t record__aio_get_pos(int trace_fd) +{ + return lseek(trace_fd, 0, SEEK_CUR); +} + +static void record__aio_set_pos(int trace_fd, off_t pos) +{ + lseek(trace_fd, pos, SEEK_SET); +} + +static void record__aio_mmap_read_sync(struct record *rec) +{ + int i; + struct perf_evlist *evlist = rec->evlist; + struct perf_mmap *maps = evlist->mmap; + + if (!rec->opts.nr_cblocks) + return; + + for (i = 0; i < evlist->nr_mmaps; i++) { + struct perf_mmap *map = &maps[i]; + + if (map->base) + record__aio_sync(map); + } +} + +static int nr_cblocks_default = 1; + +static int record__aio_parse(const struct option *opt, + const char *str __maybe_unused, + int unset) +{ + struct record_opts *opts = (struct record_opts *)opt->value; + + if (unset) + opts->nr_cblocks = 0; + else + opts->nr_cblocks = nr_cblocks_default; + + return 0; +} +#else /* HAVE_AIO_SUPPORT */ +static void record__aio_sync(struct perf_mmap *md __maybe_unused) +{ +} + +static int 
record__aio_pushfn(void *to __maybe_unused, struct aiocb *cblock __maybe_unused, + void *bf __maybe_unused, size_t size __maybe_unused, off_t off __maybe_unused) +{ + return -1; +} + +static off_t record__aio_get_pos(int trace_fd __maybe_unused) +{ + return -1; +} + +static void record__aio_set_pos(int trace_fd __maybe_unused, off_t pos __maybe_unused) +{ +} + +static void record__aio_mmap_read_sync(struct record *rec __maybe_unused) +{ +} +#endif + +static int record__aio_enabled(struct record *rec) +{ + return rec->opts.nr_cblocks > 0; +} + static int process_synthesized_event(struct perf_tool *tool, union perf_event *event, struct perf_sample *sample __maybe_unused, @@ -329,7 +506,7 @@ static int record__mmap_evlist(struct record *rec, if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, opts->auxtrace_mmap_pages, - opts->auxtrace_snapshot_mode) < 0) { + opts->auxtrace_snapshot_mode, opts->nr_cblocks) < 0) { if (errno == EPERM) { pr_err("Permission error mapping pages.\n" "Consider increasing " @@ -525,6 +702,8 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli int i; int rc = 0; struct perf_mmap *maps; + int trace_fd = rec->data.file.fd; + off_t off; if (!evlist) return 0; @@ -536,13 +715,29 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli if (overwrite && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING) return 0; + if (record__aio_enabled(rec)) + off = record__aio_get_pos(trace_fd); + for (i = 0; i < evlist->nr_mmaps; i++) { struct perf_mmap *map = &maps[i]; if (map->base) { - if (perf_mmap__push(map, rec, record__pushfn) != 0) { - rc = -1; - goto out; + if (!record__aio_enabled(rec)) { + if (perf_mmap__push(map, rec, record__pushfn) != 0) { + rc = -1; + goto out; + } + } else { + /* + * Call record__aio_sync() to wait till map->data buffer + * becomes available after previous aio write request. 
+ */ + record__aio_sync(map); + if (perf_mmap__aio_push(map, rec, record__aio_pushfn, &off) != 0) { + record__aio_set_pos(trace_fd, off); + rc = -1; + goto out; + } } } @@ -553,6 +748,9 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli } } + if (record__aio_enabled(rec)) + record__aio_set_pos(trace_fd, off); + /* * Mark the round finished in case we wrote * at least one event. @@ -658,6 +856,8 @@ record__switch_output(struct record *rec, bool at_exit) /* Same Size: "2015122520103046"*/ char timestamp[] = "InvalidTimestamp"; + record__aio_mmap_read_sync(rec); + record__synthesize(rec, true); if (target__none(&rec->opts.target)) record__synthesize_workload(rec, true); @@ -1168,6 +1368,8 @@ static int __cmd_record(struct record *rec, int argc, const char **argv) record__synthesize_workload(rec, true); out_child: + record__aio_mmap_read_sync(rec); + if (forks) { int exit_status; @@ -1706,6 +1908,11 @@ static struct option __record_options[] = { "signal"), OPT_BOOLEAN(0, "dry-run", &dry_run, "Parse options then exit"), +#ifdef HAVE_AIO_SUPPORT + OPT_CALLBACK_NOOPT(0, "aio", &record.opts, + NULL, "Enable asynchronous trace writing mode", + record__aio_parse), +#endif OPT_END() }; @@ -1898,6 +2105,9 @@ int cmd_record(int argc, const char **argv) goto out; } + if (verbose > 0) + pr_info("nr_cblocks: %d\n", rec->opts.nr_cblocks); + err = __cmd_record(&record, argc, argv); out: perf_evlist__delete(rec->evlist); diff --git a/tools/perf/perf.h b/tools/perf/perf.h index 0ed4a34c74c4..4d40baa45a5f 100644 --- a/tools/perf/perf.h +++ b/tools/perf/perf.h @@ -83,6 +83,7 @@ struct record_opts { clockid_t clockid; u64 clockid_res_ns; unsigned int proc_map_timeout; + int nr_cblocks; }; struct option; diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 6f010b9f0a81..e90575192209 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1018,7 +1018,7 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const 
char *str, */ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages, unsigned int auxtrace_pages, - bool auxtrace_overwrite) + bool auxtrace_overwrite, int nr_cblocks) { struct perf_evsel *evsel; const struct cpu_map *cpus = evlist->cpus; @@ -1028,7 +1028,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages, * Its value is decided by evsel's write_backward. * So &mp should not be passed through const pointer. */ - struct mmap_params mp = { .nr_cblocks = 0 }; + struct mmap_params mp = { .nr_cblocks = nr_cblocks }; if (!evlist->mmap) evlist->mmap = perf_evlist__alloc_mmap(evlist, false); @@ -1060,7 +1060,7 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages, int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages) { - return perf_evlist__mmap_ex(evlist, pages, 0, false); + return perf_evlist__mmap_ex(evlist, pages, 0, false, 0); } int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target) diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h index d108d167eb36..868294491194 100644 --- a/tools/perf/util/evlist.h +++ b/tools/perf/util/evlist.h @@ -162,7 +162,7 @@ unsigned long perf_event_mlock_kb_in_pages(void); int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages, unsigned int auxtrace_pages, - bool auxtrace_overwrite); + bool auxtrace_overwrite, int nr_cblocks); int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages); void perf_evlist__munmap(struct perf_evlist *evlist); diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c index 47cdc3ad6546..61aa381d05d0 100644 --- a/tools/perf/util/mmap.c +++ b/tools/perf/util/mmap.c @@ -158,7 +158,8 @@ static int perf_mmap__aio_mmap(struct perf_mmap *map, struct mmap_params *mp) { int delta_max; - if (mp->nr_cblocks) { + map->aio.nr_cblocks = mp->nr_cblocks; + if (map->aio.nr_cblocks) { map->aio.data = malloc(perf_mmap__mmap_len(map)); if (!map->aio.data) { pr_debug2("failed to 
allocate data buffer, error %m\n"); @@ -187,6 +188,80 @@ static void perf_mmap__aio_munmap(struct perf_mmap *map) if (map->aio.data) zfree(&map->aio.data); } + +int perf_mmap__aio_push(struct perf_mmap *md, void *to, + int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off), + off_t *off) +{ + u64 head = perf_mmap__read_head(md); + unsigned char *data = md->base + page_size; + unsigned long size, size0 = 0; + void *buf; + int rc = 0; + + rc = perf_mmap__read_init(md); + if (rc < 0) + return (rc == -EAGAIN) ? 0 : -1; + + /* + * md->base data is copied into md->data buffer to + * release space in the kernel buffer as fast as possible, + * thru perf_mmap__consume() below. + * + * That lets the kernel to proceed with storing more + * profiling data into the kernel buffer earlier than other + * per-cpu kernel buffers are handled. + * + * Coping can be done in two steps in case the chunk of + * profiling data crosses the upper bound of the kernel buffer. + * In this case we first move part of data from md->start + * till the upper bound and then the reminder from the + * beginning of the kernel buffer till the end of + * the data chunk. + */ + + size = md->end - md->start; + + if ((md->start & md->mask) + size != (md->end & md->mask)) { + buf = &data[md->start & md->mask]; + size = md->mask + 1 - (md->start & md->mask); + md->start += size; + memcpy(md->aio.data, buf, size); + size0 = size; + } + + buf = &data[md->start & md->mask]; + size = md->end - md->start; + md->start += size; + memcpy(md->aio.data + size0, buf, size); + + /* + * Increment md->refcount to guard md->data buffer + * from premature deallocation because md object can be + * released earlier than aio write request started + * on mmap->data is complete. + * + * perf_mmap__put() is done at record__aio_complete() + * after started request completion. 
+ */ + perf_mmap__get(md); + + md->prev = head; + perf_mmap__consume(md); + + rc = push(to, &md->aio.cblock, md->aio.data, size0 + size, *off); + if (!rc) { + *off += size0 + size; + } else { + /* + * Decrement md->refcount back if aio write + * operation failed to start. + */ + perf_mmap__put(md); + } + + return rc; +} #else static int perf_mmap__aio_mmap(struct perf_mmap *map __maybe_unused, struct mmap_params *mp __maybe_unused) diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h index 3f10ad030c5e..b99213ba11b5 100644 --- a/tools/perf/util/mmap.h +++ b/tools/perf/util/mmap.h @@ -12,6 +12,7 @@ #include "auxtrace.h" #include "event.h" +struct aiocb; /** * struct perf_mmap - perf's ring buffer mmap details * @@ -33,6 +34,7 @@ struct perf_mmap { struct { void *data; struct aiocb cblock; + int nr_cblocks; } aio; #endif }; @@ -94,6 +96,18 @@ union perf_event *perf_mmap__read_event(struct perf_mmap *map); int perf_mmap__push(struct perf_mmap *md, void *to, int push(struct perf_mmap *map, void *to, void *buf, size_t size)); +#ifdef HAVE_AIO_SUPPORT +int perf_mmap__aio_push(struct perf_mmap *md, void *to, + int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off), + off_t *off); +#else +static inline int perf_mmap__aio_push(struct perf_mmap *md __maybe_unused, void *to __maybe_unused, + int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off) __maybe_unused, + off_t *off __maybe_unused) +{ + return 0; +} +#endif size_t perf_mmap__mmap_len(struct perf_mmap *map); -- 2.19.1 ^ permalink raw reply related [flat|nested] 36+ messages in thread
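The two-step copy that the comment in perf_mmap__aio_push() describes can be sketched on its own. This is a hedged illustration, not the perf code: ring_copy_out() is a hypothetical name, the ring is assumed to have a power-of-two size with mask = size - 1, and start/end are free-running positions as in struct perf_mmap. The wrap test is written as an explicit bound comparison rather than the masked comparison used in the patch; the copied bytes are the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the bytes in [start, end) out of a
 * power-of-two ring buffer into a linear buffer.
 * Returns the number of bytes copied.
 */
static size_t ring_copy_out(void *dst, const unsigned char *ring,
			    uint64_t mask, uint64_t start, uint64_t end)
{
	size_t size = (size_t)(end - start);
	size_t first = 0;

	assert(size <= mask + 1);	/* a chunk cannot exceed one ring */

	/* Chunk crosses the upper bound: copy up to the bound first. */
	if ((start & mask) + size > mask + 1) {
		first = (size_t)(mask + 1 - (start & mask));
		memcpy(dst, &ring[start & mask], first);
		start += first;
	}

	/* Then copy the rest, or the whole chunk if it did not wrap. */
	memcpy((unsigned char *)dst + first, &ring[start & mask],
	       (size_t)(end - start));

	return size;
}
```

Once the data sits in the linear buffer, the ring space can be handed back to the kernel immediately, which is the reason the patch copies before queueing the asynchronous write.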
* [PATCH 20/22] perf record: Extend trace writing to multi AIO 2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo ` (18 preceding siblings ...) 2018-11-30 18:26 ` [PATCH 19/22] perf record: Enable asynchronous trace writing Arnaldo Carvalho de Melo @ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo 2018-11-30 18:26 ` [PATCH 21/22] perf beauty mmap_flags: Check if the arch has a mmap.h file Arnaldo Carvalho de Melo 2018-11-30 18:26 ` [PATCH 22/22] tools lib traceevent: Add sanity check to is_timestamp_in_us() Arnaldo Carvalho de Melo 21 siblings, 0 replies; 36+ messages in thread From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw) To: Ingo Molnar Cc: Clark Williams, linux-kernel, linux-perf-users, Alexey Budankov, Alexander Shishkin, Andi Kleen, Peter Zijlstra, Arnaldo Carvalho de Melo From: Alexey Budankov <alexey.budankov@linux.intel.com> Multi AIO trace writing allows caching more kernel data into userspace memory postponing trace writing for the sake of overall profiling data thruput increase. It could be seen as kernel data buffer extension into userspace memory. With an --aio option value different from 0 (default value is 1) the tool has capability to cache more and more data into user space along with delegating spill to AIO. That allows avoiding to suspend at record__aio_sync() between calls of record__mmap_read_evlist() and increases profiling data thruput at the cost of userspace memory. 
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Reviewed-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/050bb053-e7f3-aa83-fde7-f27ff90be7f6@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/Documentation/perf-record.txt | 4 +- tools/perf/builtin-record.c | 67 ++++++++++++++++++------ tools/perf/util/mmap.c | 64 ++++++++++++++-------- tools/perf/util/mmap.h | 9 ++-- 4 files changed, 102 insertions(+), 42 deletions(-) diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 7efb4af88a68..d232b13ea713 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -435,8 +435,8 @@ Specify vmlinux path which has debuginfo. --buildid-all:: Record build-id of all DSOs regardless whether it's actually hit or not. ---aio:: -Enable asynchronous (Posix AIO) trace writing mode. +--aio[=n]:: +Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default: 1, max: 4). Asynchronous mode is supported only when linking Perf tool with libc library providing implementation for Posix AIO API. 
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 408d6477c960..4736dc96c4ca 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -196,16 +196,35 @@ static int record__aio_complete(struct perf_mmap *md, struct aiocb *cblock) return rc; } -static void record__aio_sync(struct perf_mmap *md) +static int record__aio_sync(struct perf_mmap *md, bool sync_all) { - struct aiocb *cblock = &md->aio.cblock; + struct aiocb **aiocb = md->aio.aiocb; + struct aiocb *cblocks = md->aio.cblocks; struct timespec timeout = { 0, 1000 * 1000 * 1 }; /* 1ms */ + int i, do_suspend; do { - if (cblock->aio_fildes == -1 || record__aio_complete(md, cblock)) - return; + do_suspend = 0; + for (i = 0; i < md->aio.nr_cblocks; ++i) { + if (cblocks[i].aio_fildes == -1 || record__aio_complete(md, &cblocks[i])) { + if (sync_all) + aiocb[i] = NULL; + else + return i; + } else { + /* + * Started aio write is not complete yet + * so it has to be waited before the + * next allocation. 
+				 */
+				aiocb[i] = &cblocks[i];
+				do_suspend = 1;
+			}
+		}
+		if (!do_suspend)
+			return -1;

-		while (aio_suspend((const struct aiocb**)&cblock, 1, &timeout)) {
+		while (aio_suspend((const struct aiocb **)aiocb, md->aio.nr_cblocks, &timeout)) {
 			if (!(errno == EAGAIN || errno == EINTR))
 				pr_err("failed to sync perf data, error: %m\n");
 		}
@@ -252,28 +271,36 @@ static void record__aio_mmap_read_sync(struct record *rec)
 		struct perf_mmap *map = &maps[i];

 		if (map->base)
-			record__aio_sync(map);
+			record__aio_sync(map, true);
 	}
 }

 static int nr_cblocks_default = 1;
+static int nr_cblocks_max = 4;

 static int record__aio_parse(const struct option *opt,
-			     const char *str __maybe_unused,
+			     const char *str,
 			     int unset)
 {
 	struct record_opts *opts = (struct record_opts *)opt->value;

-	if (unset)
+	if (unset) {
 		opts->nr_cblocks = 0;
-	else
-		opts->nr_cblocks = nr_cblocks_default;
+	} else {
+		if (str)
+			opts->nr_cblocks = strtol(str, NULL, 0);
+		if (!opts->nr_cblocks)
+			opts->nr_cblocks = nr_cblocks_default;
+	}

 	return 0;
 }
 #else /* HAVE_AIO_SUPPORT */
-static void record__aio_sync(struct perf_mmap *md __maybe_unused)
+static int nr_cblocks_max = 0;
+
+static int record__aio_sync(struct perf_mmap *md __maybe_unused, bool sync_all __maybe_unused)
 {
+	return -1;
 }

 static int record__aio_pushfn(void *to __maybe_unused, struct aiocb *cblock __maybe_unused,
@@ -728,12 +755,13 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 					goto out;
 				}
 			} else {
+				int idx;
 				/*
 				 * Call record__aio_sync() to wait till map->data buffer
 				 * becomes available after previous aio write request.
 				 */
-				record__aio_sync(map);
-				if (perf_mmap__aio_push(map, rec, record__aio_pushfn, &off) != 0) {
+				idx = record__aio_sync(map, false);
+				if (perf_mmap__aio_push(map, rec, idx, record__aio_pushfn, &off) != 0) {
 					record__aio_set_pos(trace_fd, off);
 					rc = -1;
 					goto out;
@@ -1503,6 +1531,13 @@ static int perf_record_config(const char *var, const char *value, void *cb)
 		var = "call-graph.record-mode";
 		return perf_default_config(var, value, cb);
 	}
+#ifdef HAVE_AIO_SUPPORT
+	if (!strcmp(var, "record.aio")) {
+		rec->opts.nr_cblocks = strtol(value, NULL, 0);
+		if (!rec->opts.nr_cblocks)
+			rec->opts.nr_cblocks = nr_cblocks_default;
+	}
+#endif

 	return 0;
 }
@@ -1909,8 +1944,8 @@ static struct option __record_options[] = {
 	OPT_BOOLEAN(0, "dry-run", &dry_run,
 		    "Parse options then exit"),
 #ifdef HAVE_AIO_SUPPORT
-	OPT_CALLBACK_NOOPT(0, "aio", &record.opts,
-		     NULL, "Enable asynchronous trace writing mode",
+	OPT_CALLBACK_OPTARG(0, "aio", &record.opts,
+		     &nr_cblocks_default, "n", "Use <n> control blocks in asynchronous trace writing mode (default: 1, max: 4)",
 		     record__aio_parse),
 #endif
 	OPT_END()
@@ -2105,6 +2140,8 @@ int cmd_record(int argc, const char **argv)
 		goto out;
 	}

+	if (rec->opts.nr_cblocks > nr_cblocks_max)
+		rec->opts.nr_cblocks = nr_cblocks_max;
 	if (verbose > 0)
 		pr_info("nr_cblocks: %d\n", rec->opts.nr_cblocks);

diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 61aa381d05d0..ab30555d2afc 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -156,28 +156,50 @@ void __weak auxtrace_mmap_params__set_idx(struct auxtrace_mmap_params *mp __mayb
 #ifdef HAVE_AIO_SUPPORT
 static int perf_mmap__aio_mmap(struct perf_mmap *map, struct mmap_params *mp)
 {
-	int delta_max;
+	int delta_max, i, prio;

 	map->aio.nr_cblocks = mp->nr_cblocks;
 	if (map->aio.nr_cblocks) {
-		map->aio.data = malloc(perf_mmap__mmap_len(map));
+		map->aio.aiocb = calloc(map->aio.nr_cblocks, sizeof(struct aiocb *));
+		if (!map->aio.aiocb) {
+			pr_debug2("failed to allocate aiocb for data buffer, error %m\n");
+			return -1;
+		}
+		map->aio.cblocks = calloc(map->aio.nr_cblocks, sizeof(struct aiocb));
+		if (!map->aio.cblocks) {
+			pr_debug2("failed to allocate cblocks for data buffer, error %m\n");
+			return -1;
+		}
+		map->aio.data = calloc(map->aio.nr_cblocks, sizeof(void *));
 		if (!map->aio.data) {
 			pr_debug2("failed to allocate data buffer, error %m\n");
 			return -1;
 		}
-		/*
-		 * Use cblock.aio_fildes value different from -1
-		 * to denote started aio write operation on the
-		 * cblock so it requires explicit record__aio_sync()
-		 * call prior the cblock may be reused again.
-		 */
-		map->aio.cblock.aio_fildes = -1;
-		/*
-		 * Allocate cblock with max priority delta to
-		 * have faster aio write system calls.
-		 */
 		delta_max = sysconf(_SC_AIO_PRIO_DELTA_MAX);
-		map->aio.cblock.aio_reqprio = delta_max;
+		for (i = 0; i < map->aio.nr_cblocks; ++i) {
+			map->aio.data[i] = malloc(perf_mmap__mmap_len(map));
+			if (!map->aio.data[i]) {
+				pr_debug2("failed to allocate data buffer area, error %m");
+				return -1;
+			}
+			/*
+			 * Use cblock.aio_fildes value different from -1
+			 * to denote started aio write operation on the
+			 * cblock so it requires explicit record__aio_sync()
+			 * call prior the cblock may be reused again.
+			 */
+			map->aio.cblocks[i].aio_fildes = -1;
+			/*
+			 * Allocate cblocks with priority delta to have
+			 * faster aio write system calls because queued requests
+			 * are kept in separate per-prio queues and adding
+			 * a new request will iterate thru shorter per-prio
+			 * list. Blocks with numbers higher than
+			 * _SC_AIO_PRIO_DELTA_MAX go with priority 0.
+			 */
+			prio = delta_max - i;
+			map->aio.cblocks[i].aio_reqprio = prio >= 0 ? prio : 0;
+		}
 	}

 	return 0;
@@ -189,7 +211,7 @@ static void perf_mmap__aio_munmap(struct perf_mmap *map)
 		zfree(&map->aio.data);
 }

-int perf_mmap__aio_push(struct perf_mmap *md, void *to,
+int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
 			int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off),
 			off_t *off)
 {
@@ -204,7 +226,7 @@ int perf_mmap__aio_push(struct perf_mmap *md, void *to,
 		return (rc == -EAGAIN) ? 0 : -1;

 	/*
-	 * md->base data is copied into md->data buffer to
+	 * md->base data is copied into md->data[idx] buffer to
 	 * release space in the kernel buffer as fast as possible,
 	 * thru perf_mmap__consume() below.
 	 *
@@ -226,20 +248,20 @@ int perf_mmap__aio_push(struct perf_mmap *md, void *to,
 		buf = &data[md->start & md->mask];
 		size = md->mask + 1 - (md->start & md->mask);
 		md->start += size;
-		memcpy(md->aio.data, buf, size);
+		memcpy(md->aio.data[idx], buf, size);
 		size0 = size;
 	}

 	buf = &data[md->start & md->mask];
 	size = md->end - md->start;
 	md->start += size;
-	memcpy(md->aio.data + size0, buf, size);
+	memcpy(md->aio.data[idx] + size0, buf, size);

 	/*
-	 * Increment md->refcount to guard md->data buffer
+	 * Increment md->refcount to guard md->data[idx] buffer
 	 * from premature deallocation because md object can be
 	 * released earlier than aio write request started
-	 * on mmap->data is complete.
+	 * on mmap->data[idx] is complete.
 	 *
 	 * perf_mmap__put() is done at record__aio_complete()
 	 * after started request completion.
@@ -249,7 +271,7 @@ int perf_mmap__aio_push(struct perf_mmap *md, void *to,
 	md->prev = head;
 	perf_mmap__consume(md);

-	rc = push(to, &md->aio.cblock, md->aio.data, size0 + size, *off);
+	rc = push(to, &md->aio.cblocks[idx], md->aio.data[idx], size0 + size, *off);
 	if (!rc) {
 		*off += size0 + size;
 	} else {
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index b99213ba11b5..aeb6942fdb00 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -32,8 +32,9 @@ struct perf_mmap {
 	char event_copy[PERF_SAMPLE_MAX_SIZE] __aligned(8);
 #ifdef HAVE_AIO_SUPPORT
 	struct {
-		void *data;
-		struct aiocb cblock;
+		void **data;
+		struct aiocb *cblocks;
+		struct aiocb **aiocb;
 		int nr_cblocks;
 	} aio;
 #endif
@@ -97,11 +98,11 @@ union perf_event *perf_mmap__read_event(struct perf_mmap *map);
 int perf_mmap__push(struct perf_mmap *md, void *to,
 		    int push(struct perf_mmap *map, void *to, void *buf, size_t size));
 #ifdef HAVE_AIO_SUPPORT
-int perf_mmap__aio_push(struct perf_mmap *md, void *to,
+int perf_mmap__aio_push(struct perf_mmap *md, void *to, int idx,
 			int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off),
 			off_t *off);
 #else
-static inline int perf_mmap__aio_push(struct perf_mmap *md __maybe_unused, void *to __maybe_unused,
+static inline int perf_mmap__aio_push(struct perf_mmap *md __maybe_unused, void *to __maybe_unused, int idx __maybe_unused,
 	int push(void *to, struct aiocb *cblock, void *buf, size_t size, off_t off) __maybe_unused,
 	off_t *off __maybe_unused)
 {
--
2.19.1
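The option handling and priority assignment in this patch can be sketched outside of perf as follows. This is a minimal illustration only, not the perf code: aio_nr_cblocks() and aio_cblock_prio() are hypothetical helper names, the default of 1 and the cap of 4 are taken from nr_cblocks_default and nr_cblocks_max in the diff, and treating a negative value like zero is an assumption the patch does not make. In the patch itself the cap is applied later, in cmd_record(), rather than in the option callback.

```c
#include <stdlib.h>

/* Hypothetical constants mirroring nr_cblocks_default and nr_cblocks_max. */
enum { NR_CBLOCKS_DEFAULT = 1, NR_CBLOCKS_MAX = 4 };

/*
 * Turn the optional --aio[=N] argument into a control block count:
 * a missing or zero value falls back to the default and larger
 * values are capped. Negative input is treated like zero here,
 * which is an assumption of this sketch.
 */
static int aio_nr_cblocks(const char *str)
{
	long n = str ? strtol(str, NULL, 0) : 0;

	if (n <= 0)
		n = NR_CBLOCKS_DEFAULT;
	if (n > NR_CBLOCKS_MAX)
		n = NR_CBLOCKS_MAX;
	return (int)n;
}

/*
 * Request priority for control block i: start from the maximum
 * delta and step down by one per block, never going below zero.
 */
static int aio_cblock_prio(int delta_max, int i)
{
	int prio = delta_max - i;

	return prio >= 0 ? prio : 0;
}
```

With these helpers, a bare --aio (no argument) yields one control block, --aio=2 yields two, and --aio=16 is capped to four; with a delta_max of 2, blocks 0..3 would get priorities 2, 1, 0 and 0.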
* [PATCH 21/22] perf beauty mmap_flags: Check if the arch has a mmap.h file
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (19 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 20/22] perf record: Extend trace writing to multi AIO Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 22/22] tools lib traceevent: Add sanity check to is_timestamp_in_us() Arnaldo Carvalho de Melo
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Arnaldo Carvalho de Melo, Adrian Hunter, David Ahern, Jiri Olsa,
Namhyung Kim, Wang Nan
From: Arnaldo Carvalho de Melo <acme@redhat.com>

If not, then just use what is in asm-generic.

This fixes the build for my sh4, m68k and riscv64 perf test build
containers that were failing due to 80ee5668b8a7 ("perf beauty: Add a
generator for MAP_ mmap's flag constants"), that were not covered in the
cset introducing those tools/arch/*/include/uapi/asm/mman.h files.

f3539c12d819 ("tools include: Add uapi mman.h for each architecture")

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 80ee5668b8a7 ("perf beauty: Add a generator for MAP_ mmap's flag constants")
Link: https://lkml.kernel.org/n/tip-rpy9t2e0wxpnum1yvxhreafe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Makefile.perf | 2 +-
tools/perf/trace/beauty/mmap_flags.sh | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 67e9adbe6ee8..bfdaefd500ab 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -474,7 +474,7 @@ $(madvise_behavior_array): $(madvise_hdr_dir)/mman-common.h $(madvise_behavior_t
 mmap_flags_array := $(beauty_outdir)/mmap_flags_array.c
 mmap_flags_tbl := $(srctree)/tools/perf/trace/beauty/mmap_flags.sh

-$(mmap_flags_array): $(asm_generic_uapi_dir)/mman.h $(asm_generic_uapi_dir)/mman-common.h $(arch_asm_uapi_dir)/mman.h $(mmap_flags_tbl)
+$(mmap_flags_array): $(asm_generic_uapi_dir)/mman.h $(asm_generic_uapi_dir)/mman-common.h $(mmap_flags_tbl)
 	$(Q)$(SHELL) '$(mmap_flags_tbl)' $(asm_generic_uapi_dir) $(arch_asm_uapi_dir) > $@

 mount_flags_array := $(beauty_outdir)/mount_flags_array.c
diff --git a/tools/perf/trace/beauty/mmap_flags.sh b/tools/perf/trace/beauty/mmap_flags.sh
index 22c3fdca8975..cd41023107d7 100755
--- a/tools/perf/trace/beauty/mmap_flags.sh
+++ b/tools/perf/trace/beauty/mmap_flags.sh
@@ -20,12 +20,12 @@ egrep -q $regex ${arch_mman} && \
 (egrep $regex ${arch_mman} | \
 	sed -r "s/$regex/\2 \1/g" | \
 	xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n")
-egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.*' ${arch_mman} &&
+[ ! -f ${arch_mman} || egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.*' ${arch_mman} ] &&
 (egrep $regex ${header_dir}/mman-common.h | \
 	egrep -vw 'MAP_(UNINITIALIZED|TYPE|SHARED_VALIDATE)' | \
 	sed -r "s/$regex/\2 \1/g" | \
 	xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n")
-egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.h>.*' ${arch_mman} &&
+[ ! -f ${arch_mman} || egrep -q '#[[:space:]]*include[[:space:]]+<uapi/asm-generic/mman.h>.*' ${arch_mman} ] &&
 (egrep $regex ${header_dir}/mman.h | \
 	sed -r "s/$regex/\2 \1/g" | \
 	xargs printf "\t[ilog2(%s) + 1] = \"%s\",\n")
--
2.19.1
* [PATCH 22/22] tools lib traceevent: Add sanity check to is_timestamp_in_us()
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
` (20 preceding siblings ...)
2018-11-30 18:26 ` [PATCH 21/22] perf beauty mmap_flags: Check if the arch has a mmap.h file Arnaldo Carvalho de Melo
@ 2018-11-30 18:26 ` Arnaldo Carvalho de Melo
21 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-11-30 18:26 UTC (permalink / raw)
To: Ingo Molnar
Cc: Clark Williams, linux-kernel, linux-perf-users,
Tzvetomir Stoyanov, Jiri Olsa, Steven Rostedt,
Arnaldo Carvalho de Melo
From: Tzvetomir Stoyanov <tstoyanov@vmware.com>

This patch adds a sanity check to is_timestamp_in_us() input parameter
trace_clock. It avoids a potential segfault in this function for the
case trace_clock is NULL.

Reported-by: Slavomir Kaslev <kaslevs@vmware.com>
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20181128145552.68c4f87b@gandalf.local.home
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/lib/traceevent/event-parse.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index fbd6d6813fab..2b5cb33046ce 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -5409,7 +5409,7 @@ void tep_event_info(struct trace_seq *s, struct tep_event_format *event,

 static bool is_timestamp_in_us(char *trace_clock, bool use_trace_clock)
 {
-	if (!use_trace_clock)
+	if (!trace_clock || !use_trace_clock)
 		return true;

 	if (!strcmp(trace_clock, "local") || !strcmp(trace_clock, "global")
--
2.19.1
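The effect of the added check can be shown with a small stand-alone sketch. timestamp_in_us() below is a hypothetical copy of the function, not the libtraceevent code: only the two clock names visible in the hunk are compared, and returning false for every other name is an assumption, since the rest of the function is not shown in the patch context.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool timestamp_in_us(const char *trace_clock, bool use_trace_clock)
{
	/* Test the pointer first so that strcmp() never sees NULL. */
	if (!trace_clock || !use_trace_clock)
		return true;

	/* Only the clock names visible in the hunk are listed here. */
	if (!strcmp(trace_clock, "local") || !strcmp(trace_clock, "global"))
		return true;

	/* Assumption of this sketch: any other clock is not in microseconds. */
	return false;
}
```

Without the first half of the condition, a call such as timestamp_in_us(NULL, true) would reach strcmp() with a NULL argument, which is the segfault the commit message describes.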
* [GIT PULL 00/22] perf/core improvements and fixes
@ 2017-04-24 19:54 Arnaldo Carvalho de Melo
2017-04-24 20:40 ` Ingo Molnar
0 siblings, 1 reply; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-04-24 19:54 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter, Al Viro,
Andi Kleen, David Ahern, David Howells, Jiri Olsa, Kyle Huey,
Namhyung Kim, Peter Zijlstra, Stephane Eranian, Taeung Song,
Thomas Gleixner, Tony Luck, Wang Nan, Yao Jin,
Arnaldo Carvalho de Melo
Hi Ingo,
Please consider applying,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit 07590a7d4030c159b9a0d7171f81049a9ce23245:
Merge tag 'perf-core-for-mingo-4.12-20170419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-04-20 10:07:18 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170424
for you to fetch changes up to 9d43f5e8df6804ae271407500af9062e9278167a:
perf tools: Fix the code to strip command name (2017-04-24 13:43:37 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
User visible:
- Fix display of data source snoop indication in 'perf mem' (Andi Kleen)
- Fix the code to strip command name from /proc/PID/stat (Jiri Olsa)
Infrastructure:
- Continue the disentanglement of headers, especially util.h (Arnaldo Carvalho de Melo)
- Synchronize some header files with the kernel (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Andi Kleen (1):
perf mem: Fix display of data source snoop indication
Arnaldo Carvalho de Melo (20):
perf unwind: Provide only forward declarations for pointer types
perf tools: Add signal.h to places using its definitions
perf tools: Move units conversion/formatting routines to separate object
perf tools: Move timestamp routines from util.h to time-utils.h
perf kvm: Make function only used by 'perf kvm' static
perf debug: Move dump_stack() and sighandler_dump_stack() to debug.h
perf tools: Add compress.h for the *_decompress_to_file() headers
perf callchain: Move callchain specific routines from util.[ch]
perf tools: Include sys/param.h where needed
perf tools: Remove a few more needless includes from util.h
perf tools: Remove sys/ioctl.h from util.h
perf tools: Remove string.h from util.h
perf tools: Remove stale prototypes from builtin.h
perf tools: Remove string.h, unistd.h and sys/stat.h from util.h
perf tools: Remove poll.h and wait.h from util.h
perf tools: Add the right header to obtain PERF_ALIGN()
perf tools: Use just forward declarations for struct thread where possible
tools: Update asm-generic/mman-common.h copy from the kernel
tools arch: Sync arch/x86/lib/memcpy_64.S with the kernel
tools arch x86: Sync cpufeatures.h
Jiri Olsa (1):
perf tools: Fix the code to strip command name
tools/arch/x86/include/asm/cpufeatures.h | 1 +
tools/arch/x86/lib/memcpy_64.S | 2 +-
tools/include/uapi/linux/stat.h | 5 +-
tools/lib/subcmd/help.h | 1 +
tools/perf/arch/arm/util/cs-etm.c | 1 +
tools/perf/arch/arm/util/unwind-libdw.c | 1 +
tools/perf/arch/arm64/util/dwarf-regs.c | 1 +
tools/perf/arch/x86/tests/intel-cqm.c | 2 +
tools/perf/arch/x86/util/unwind-libdw.c | 1 +
tools/perf/builtin-buildid-cache.c | 1 +
tools/perf/builtin-c2c.c | 2 +
tools/perf/builtin-ftrace.c | 1 +
tools/perf/builtin-help.c | 4 +
tools/perf/builtin-inject.c | 2 +
tools/perf/builtin-kvm.c | 17 +++
tools/perf/builtin-mem.c | 4 +
tools/perf/builtin-record.c | 5 +
tools/perf/builtin-report.c | 5 +
tools/perf/builtin-script.c | 5 +
tools/perf/builtin-stat.c | 5 +
tools/perf/builtin-timechart.c | 1 +
tools/perf/builtin-top.c | 1 +
tools/perf/builtin-trace.c | 2 +
tools/perf/builtin-version.c | 3 +-
tools/perf/builtin.h | 4 -
tools/perf/perf.c | 6 +-
tools/perf/tests/attr.c | 4 +
tools/perf/tests/bpf.c | 2 +
tools/perf/tests/builtin-test.c | 1 +
tools/perf/tests/code-reading.c | 1 +
tools/perf/tests/event-times.c | 1 +
tools/perf/tests/parse-events.c | 3 +
tools/perf/tests/unit_number__scnprintf.c | 2 +-
tools/perf/trace/beauty/signum.c | 1 +
tools/perf/ui/browsers/hists.c | 2 +
tools/perf/ui/gtk/annotate.c | 1 +
tools/perf/ui/gtk/hists.c | 1 +
tools/perf/ui/stdio/hist.c | 1 +
tools/perf/util/Build | 1 +
tools/perf/util/build-id.c | 3 +
tools/perf/util/callchain.c | 103 +++++++++++++++
tools/perf/util/color.h | 2 +
tools/perf/util/comm.c | 1 +
tools/perf/util/compress.h | 12 ++
tools/perf/util/config.c | 4 +
tools/perf/util/debug.c | 33 ++++-
tools/perf/util/debug.h | 3 +
tools/perf/util/dso.c | 4 +
tools/perf/util/event.c | 12 +-
tools/perf/util/event.h | 2 +-
tools/perf/util/evlist.c | 3 +
tools/perf/util/evlist.h | 1 +
tools/perf/util/evsel.c | 1 +
tools/perf/util/header.c | 4 +
tools/perf/util/help-unknown-cmd.c | 1 +
tools/perf/util/hist.c | 2 +
tools/perf/util/llvm-utils.c | 1 +
tools/perf/util/lzma.c | 1 +
tools/perf/util/machine.c | 3 +
tools/perf/util/mem-events.c | 2 +-
tools/perf/util/namespaces.c | 1 +
tools/perf/util/parse-events.c | 2 +
tools/perf/util/pmu.c | 1 +
tools/perf/util/probe-file.c | 3 +
tools/perf/util/python-ext-sources | 1 +
tools/perf/util/python.c | 13 ++
tools/perf/util/session.c | 1 +
tools/perf/util/session.h | 3 +-
tools/perf/util/sort.c | 1 +
tools/perf/util/sort.h | 3 +-
tools/perf/util/strlist.c | 1 +
tools/perf/util/time-utils.c | 25 ++++
tools/perf/util/time-utils.h | 7 +
tools/perf/util/top.h | 2 +-
tools/perf/util/units.c | 39 ++++++
tools/perf/util/units.h | 10 ++
tools/perf/util/unwind-libdw.h | 6 +-
tools/perf/util/unwind.h | 9 +-
tools/perf/util/util.c | 211 +-----------------------------
tools/perf/util/util.h | 38 ------
tools/perf/util/xyarray.c | 2 +
tools/perf/util/zlib.c | 1 +
82 files changed, 412 insertions(+), 270 deletions(-)
create mode 100644 tools/perf/util/compress.h
create mode 100644 tools/perf/util/units.c
create mode 100644 tools/perf/util/units.h
Test results:
The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.
Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there are the 'make -C tools/perf build-test' ones, which build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in touch if you are interested in helping to put this in place.
# dm
1 alpine:3.4: Ok
2 alpine:3.5: Ok
3 alpine:edge: Ok
4 android-ndk:r12b-arm: Ok
5 archlinux:latest: Ok
6 centos:5: Ok
7 centos:6: Ok
8 centos:7: Ok
9 debian:7: Ok
10 debian:8: Ok
11 debian:9: Ok
12 debian:experimental: Ok
13 debian:experimental-x-arm64: Ok
14 debian:experimental-x-mips: Ok
15 debian:experimental-x-mips64: Ok
16 debian:experimental-x-mipsel: Ok
17 fedora:20: Ok
18 fedora:21: Ok
19 fedora:22: Ok
20 fedora:23: Ok
21 fedora:24: Ok
22 fedora:24-x-ARC-uClibc: Ok
23 fedora:25: Ok
24 fedora:rawhide: Ok
25 mageia:5: Ok
26 opensuse:13.2: Ok
27 opensuse:42.1: Ok
28 opensuse:tumbleweed: Ok
29 ubuntu:12.04.5: Ok
30 ubuntu:14.04.4: Ok
31 ubuntu:14.04.4-x-linaro-arm64: Ok
32 ubuntu:15.10: Ok
33 ubuntu:16.04: Ok
34 ubuntu:16.04-x-arm: Ok
35 ubuntu:16.04-x-arm64: Ok
36 ubuntu:16.04-x-powerpc: Ok
37 ubuntu:16.04-x-powerpc64: Ok
38 ubuntu:16.04-x-s390: Ok
39 ubuntu:16.10: Ok
40 ubuntu:17.04: Ok
#
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: Simple expression parser : Ok
7: PERF_RECORD_* events & perf_sample fields : Ok
8: Parse perf pmu format : Ok
9: DSO data read : Ok
10: DSO data cache : Ok
11: DSO data reopen : Ok
12: Roundtrip evsel->name : Ok
13: Parse sched tracepoints fields : Ok
14: syscalls:sys_enter_openat event fields : Ok
15: Setup struct perf_event_attr : Ok
16: Match and link multiple hists : Ok
17: 'import perf' in python : Ok
18: Breakpoint overflow signal handler : Ok
19: Breakpoint overflow sampling : Ok
20: Number of exit events of a simple workload : Ok
21: Software clock events period values : Ok
22: Object code reading : Ok
23: Sample parsing : Ok
24: Use a dummy software event to keep tracking: Ok
25: Parse with no sample_id_all bit set : Ok
26: Filter hist entries : Ok
27: Lookup mmap thread : Ok
28: Share thread mg : Ok
29: Sort output of hist entries : Ok
30: Cumulate child hist entries : Ok
31: Track with sched_switch : Ok
32: Filter fds with revents mask in a fdarray : Ok
33: Add fd to a fdarray, making it autogrow : Ok
34: kmod_path__parse : Ok
35: Thread map : Ok
36: LLVM search and compile :
36.1: Basic BPF llvm compile : Ok
36.2: kbuild searching : Ok
36.3: Compile source for BPF prologue generation: Ok
36.4: Compile source for BPF relocation : Ok
37: Session topology : Ok
38: BPF filter :
38.1: Basic BPF filtering : Ok
38.2: BPF pinning : Ok
38.3: BPF prologue generation : Ok
38.4: BPF relocation checker : Ok
39: Synthesize thread map : Ok
40: Remove thread map : Ok
41: Synthesize cpu map : Ok
42: Synthesize stat config : Ok
43: Synthesize stat : Ok
44: Synthesize stat round : Ok
45: Synthesize attr update : Ok
46: Event times : Ok
47: Read backward ring buffer : Ok
48: Print cpu map : Ok
49: Probe SDT events : Ok
50: is_printable_array : Ok
51: Print bitmap : Ok
52: perf hooks : Ok
53: builtin clang support : Skip (not compiled in)
54: unit_number__scnprintf : Ok
55: x86 rdpmc : Ok
56: Convert perf time to TSC : Ok
57: DWARF unwind : Ok
58: x86 instruction decoder - new instructions : Ok
59: Intel cqm nmi context read : Skip
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
make_doc_O: && make doc
make_install_prefix_O: && make install prefix=/tmp/krava
make_no_libaudit_O: && make NO_LIBAUDIT=1
make_util_map_o_O: && make util/map.o
make_no_libelf_O: && make NO_LIBELF=1
make_no_scripts_O: && make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_gtk2_O: && make NO_GTK2=1
make_no_backtrace_O: && make NO_BACKTRACE=1
make_minimal_O: && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_ui_O: && make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_auxtrace_O: && make NO_AUXTRACE=1
make_perf_o_O: && make perf.o
make_install_prefix_slash_O: && make install prefix=/tmp/krava/
make_with_babeltrace_O: && make LIBBABELTRACE=1
make_no_newt_O: && make NO_NEWT=1
make_no_libperl_O: && make NO_LIBPERL=1
make_no_demangle_O: && make NO_DEMANGLE=1
make_debug_O: && make DEBUG=1
make_no_libunwind_O: && make NO_LIBUNWIND=1
make_tags_O: && make tags
make_install_bin_O: && make install-bin
make_no_libbpf_O: && make NO_LIBBPF=1
make_help_O: && make help
make_pure_O: && make
make_static_O: && make LDFLAGS=-static
make_with_clangllvm_O: && make LIBCLANGLLVM=1
make_util_pmu_bison_o_O: && make util/pmu-bison.o
make_no_libbionic_O: && make NO_LIBBIONIC=1
make_no_slang_O: && make NO_SLANG=1
make_no_libpython_O: && make NO_LIBPYTHON=1
make_clean_all_O: && make clean all
make_no_libdw_dwarf_unwind_O: && make NO_LIBDW_DWARF_UNWIND=1
make_no_libnuma_O: && make NO_LIBNUMA=1
make_install_O: && make install
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$
* Re: [GIT PULL 00/22] perf/core improvements and fixes
2017-04-24 19:54 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
@ 2017-04-24 20:40 ` Ingo Molnar
0 siblings, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2017-04-24 20:40 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Al Viro, Andi Kleen, David Ahern,
David Howells, Jiri Olsa, Kyle Huey, Namhyung Kim, Peter Zijlstra,
Stephane Eranian, Taeung Song, Thomas Gleixner, Tony Luck, Wang Nan,
Yao Jin, Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider applying,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit 07590a7d4030c159b9a0d7171f81049a9ce23245:
>
> Merge tag 'perf-core-for-mingo-4.12-20170419' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2017-04-20 10:07:18 +0200)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-4.12-20170424
>
> for you to fetch changes up to 9d43f5e8df6804ae271407500af9062e9278167a:
>
> perf tools: Fix the code to strip command name (2017-04-24 13:43:37 -0300)
>
> [...]

Pulled, thanks a lot Arnaldo!

Ingo
* [GIT PULL 00/22] perf/core improvements and fixes
@ 2016-12-13 15:09 Arnaldo Carvalho de Melo
0 siblings, 0 replies; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-12-13 15:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Alexei Starovoitov, Alexis Berlemont,
Andi Kleen, Daniel Borkmann, David Ahern, Hemant Kumar, Jiri Olsa,
Joe Stringer, Masami Hiramatsu, Minchan Kim, Namhyung Kim,
Peter Zijlstra, Wang Nan
From: Arnaldo Carvalho de Melo <acme@redhat.com>
Hi Ingo,
Please consider pulling, I had most of this queued before your first
pull req to Linus for 4.10, most are fixes, with 'perf sched timehist --idle'
as a followup new feature to the 'perf sched timehist' command introduced in
this window.
Thanks,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit b0c1ef52959582144bbea9a2b37db7f4c9e399f7:
perf/x86: Fix exclusion of BTS and LBR for Goldmont (2016-12-11 13:06:09 +0100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161213
for you to fetch changes up to a03f73547fb6e0f7f2942c46cce9b48df50238ba:
samples/bpf: Drop unnecessary build targets. (2016-12-13 10:38:10 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
New features:
- Introduce 'perf sched timehist --idle', to analyse processes
going to/from idle state (Namhyung Kim)
- Add scanning of SDT (Statically Defined Tracing) probe arguments (Alexis Berlemont)
Fixes:
- Allow 'perf record -u user' to continue when facing races with threads
going away after having scanned them via /proc (Jiri Olsa)
- Fix 'perf mem' --all-user/--all-kernel options (Jiri Olsa)
Infrastructure:
- Switch over samples/bpf/ to tools/lib/bpf, removing libbpf duplication (Joe Stringer)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Alexis Berlemont (1):
perf sdt: Add scanning of sdt probles arguments
Arnaldo Carvalho de Melo (1):
perf tools: Remove some needless __maybe_unused
Jiri Olsa (6):
perf tools: Move headers check into bash script
perf mem: Fix --all-user/--all-kernel options
perf evsel: Use variable instead of repeating lengthy FD macro
perf thread_map: Add thread_map__remove function
perf evsel: Allow to ignore missing pid
perf record: Force ignore_missing_thread for uid option
Joe Stringer (8):
tools lib bpf: Sync {tools,}/include/uapi/linux/bpf.h
tools lib bpf: use __u32 from linux/types.h
tools lib bpf: Add flags to bpf_create_map()
samples/bpf: Make samples more libbpf-centric
samples/bpf: Switch over to libbpf
samples/bpf: Remove perf_event_open() declaration
samples/bpf: Move open_raw_sock to separate header
samples/bpf: Drop unnecessary build targets.
Namhyung Kim (6):
perf sched timehist: Split is_idle_sample()
perf sched timehist: Introduce struct idle_time_data
perf sched timehist: Save callchain when entering idle
perf sched timehist: Skip non-idle events when necessary
perf sched timehist: Add -I/--idle-hist option
perf sched timehist: Show callchains for idle stat
samples/bpf/Makefile | 60 ++---
samples/bpf/README.rst | 4 +-
samples/bpf/bpf_load.c | 20 +-
samples/bpf/fds_example.c | 10 +-
samples/bpf/lathist_user.c | 3 +-
samples/bpf/libbpf.c | 155 -------------
samples/bpf/libbpf.h | 25 +--
samples/bpf/map_perf_test_user.c | 1 +
samples/bpf/offwaketime_user.c | 10 +-
samples/bpf/sampleip_user.c | 8 +-
samples/bpf/sock_example.c | 11 +-
samples/bpf/sock_example.h | 35 +++
samples/bpf/sockex1_user.c | 9 +-
samples/bpf/sockex2_user.c | 7 +-
samples/bpf/sockex3_user.c | 7 +-
samples/bpf/spintest_user.c | 10 +-
samples/bpf/tc_l2_redirect_user.c | 4 +-
samples/bpf/test_cgrp2_array_pin.c | 4 +-
samples/bpf/test_current_task_under_cgroup_user.c | 10 +-
samples/bpf/test_maps.c | 142 ++++++------
samples/bpf/test_overhead_user.c | 2 +
samples/bpf/test_probe_write_user_user.c | 4 +-
samples/bpf/test_verifier.c | 8 +-
samples/bpf/trace_event_user.c | 24 +-
samples/bpf/trace_output_user.c | 6 +-
samples/bpf/tracex1_user.c | 2 +
samples/bpf/tracex2_user.c | 12 +-
samples/bpf/tracex3_user.c | 6 +-
samples/bpf/tracex4_user.c | 6 +-
samples/bpf/tracex5_user.c | 2 +
samples/bpf/tracex6_user.c | 7 +-
samples/bpf/xdp1_user.c | 4 +-
tools/include/uapi/linux/bpf.h | 51 +++++
tools/lib/bpf/bpf.c | 7 +-
tools/lib/bpf/bpf.h | 6 +-
tools/lib/bpf/libbpf.c | 3 +-
tools/perf/Documentation/perf-sched.txt | 4 +
tools/perf/Makefile.perf | 94 +-------
tools/perf/builtin-c2c.c | 13 +-
tools/perf/builtin-mem.c | 4 +-
tools/perf/builtin-record.c | 3 +
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 261 +++++++++++++++++++---
tools/perf/builtin-stat.c | 6 +-
tools/perf/check-headers.sh | 59 +++++
tools/perf/perf.h | 1 +
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/tests.h | 1 +
tools/perf/tests/thread-map.c | 44 ++++
tools/perf/util/evsel.c | 61 ++++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/symbol-elf.c | 25 ++-
tools/perf/util/symbol.h | 1 +
tools/perf/util/thread_map.c | 22 ++
tools/perf/util/thread_map.h | 1 +
55 files changed, 786 insertions(+), 506 deletions(-)
delete mode 100644 samples/bpf/libbpf.c
create mode 100644 samples/bpf/sock_example.h
create mode 100755 tools/perf/check-headers.sh
# uname -a
Linux jouet 4.9.0-rc8+ #1 SMP Mon Dec 12 11:20:49 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: PERF_RECORD_* events & perf_sample fields : Ok
7: Parse perf pmu format : Ok
8: DSO data read : Ok
9: DSO data cache : Ok
10: DSO data reopen : Ok
11: Roundtrip evsel->name : Ok
12: Parse sched tracepoints fields : Ok
13: syscalls:sys_enter_openat event fields : Ok
14: Setup struct perf_event_attr : Ok
15: Match and link multiple hists : Ok
16: 'import perf' in python : Ok
17: Breakpoint overflow signal handler : Ok
18: Breakpoint overflow sampling : Ok
19: Number of exit events of a simple workload : Ok
20: Software clock events period values : Ok
21: Object code reading : Ok
22: Sample parsing : Ok
23: Use a dummy software event to keep tracking: Ok
24: Parse with no sample_id_all bit set : Ok
25: Filter hist entries : Ok
26: Lookup mmap thread : Ok
27: Share thread mg : Ok
28: Sort output of hist entries : Ok
29: Cumulate child hist entries : Ok
30: Track with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: kmod_path__parse : Ok
34: Thread map : Ok
35: LLVM search and compile :
35.1: Basic BPF llvm compile : Ok
35.2: kbuild searching : Ok
35.3: Compile source for BPF prologue generation: Ok
35.4: Compile source for BPF relocation : Ok
36: Session topology : Ok
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF prologue generation : Ok
37.3: BPF relocation checker : Ok
38: Synthesize thread map : Ok
39: Remove thread map : Ok
40: Synthesize cpu map : Ok
41: Synthesize stat config : Ok
42: Synthesize stat : Ok
43: Synthesize stat round : Ok
44: Synthesize attr update : Ok
45: Event times : Ok
46: Read backward ring buffer : Ok
47: Print cpu map : Ok
48: Probe SDT events : Ok
49: is_printable_array : Ok
50: Print bitmap : Ok
51: perf hooks : Ok
52: builtin clang support : Skip (not compiled in)
53: x86 rdpmc : Ok
54: Convert perf time to TSC : Ok
55: DWARF unwind : Ok
56: x86 instruction decoder - new instructions : Ok
57: Intel cqm nmi context read : Skip
#
# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-mips64: Ok
11 fedora:20: Ok
12 fedora:21: Ok
13 fedora:22: Ok
14 fedora:23: Ok
15 fedora:24: Ok
16 fedora:24-x-ARC-uClibc: Ok
17 fedora:25: Ok
18 fedora:rawhide: Ok
19 mageia:5: Ok
20 opensuse:13.2: Ok
21 opensuse:42.1: Ok
22 opensuse:tumbleweed: Ok
23 ubuntu:12.04.5: Ok
24 ubuntu:14.04.4-x-linaro-arm64: Ok
25 ubuntu:16.04: Ok
26 ubuntu:16.04-x-arm: Ok
27 ubuntu:16.04-x-arm64: Ok
28 ubuntu:16.04-x-powerpc: Ok
29 ubuntu:16.04-x-powerpc64: Ok
30 ubuntu:16.04-x-powerpc64el: Ok
31 ubuntu:16.04-x-s390: Ok
32 ubuntu:16.10: Ok
#
[acme@felicio linux]$ make -C tools/perf build-test
make: Entering directory `/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_with_babeltrace_O: make LIBBABELTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_doc_O: make doc
make_cscope_O: make cscope
make_debug_O: make DEBUG=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_perf_o_O: make perf.o
make_install_bin_O: make install-bin
make_no_newt_O: make NO_NEWT=1
make_no_slang_O: make NO_SLANG=1
make_clean_all_O: make clean all
make_help_O: make help
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_demangle_O: make NO_DEMANGLE=1
make_no_libelf_O: make NO_LIBELF=1
make_pure_O: make
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_gtk2_O: make NO_GTK2=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_static_O: make LDFLAGS=-static
make_util_map_o_O: make util/map.o
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_install_prefix_O: make install prefix=/tmp/krava
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_tags_O: make tags
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_install_O: make install
make_no_auxtrace_O: make NO_AUXTRACE=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libunwind_O: make NO_LIBUNWIND=1
OK
make: Leaving directory `/home/acme/git/linux/tools/perf'
[acme@felicio linux]$
* [GIT PULL 00/22] perf/core improvements and fixes
@ 2016-10-04 2:36 Arnaldo Carvalho de Melo
2016-10-04 8:07 ` Ingo Molnar
0 siblings, 1 reply; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-10-04 2:36 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Linux Weekly News, Arnaldo Carvalho de Melo,
Adrian Hunter, Alexander Shishkin, Andi Kleen, Colin Ian King,
David Ahern, Jiri Olsa, linuxppc-dev, Madhavan Srinivasan,
Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, Ravi Bangoria,
Sukadev Bhattiprolu, Wang Nan, Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling,
- Arnaldo
Build and test stats at the end of the message.
The following changes since commit 41aad2a6d4fcdda8d73c9739daf7a9f3f49499d6:
Merge tag 'perf-core-for-mingo-20160929' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2016-09-29 19:09:58 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20161003
for you to fetch changes up to b42c7369e3f451e22c2b0be5d193955498d37546:
perf pmu-events: Add Skylake frontend MSR support (2016-10-03 21:52:01 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
- Allow vendors to provide JSON files describing PMU events, which then
get parsed to generate C tables that are linked into perf, allowing
the use of the event names found in the vendor documentation, such as:
# perf list l1d
List of pre-defined events (to be used in -e):
Cache:
l1d.replacement
[L1D data line replacements]
l1d_pend_miss.fb_full
[Cycles a demand request was blocked due to Fill Buffers inavailability]
l1d_pend_miss.pending
[L1D miss oustandings duration in cycles]
l1d_pend_miss.pending_cycles
[Cycles with L1D load Misses outstanding]
l1d_pend_miss.pending_cycles_any
[Cycles with L1D load Misses outstanding from any thread on physical core]
l2_trans.l1d_wb
[L1D writebacks that access L2 cache]
Pipeline:
cycle_activity.cycles_l1d_miss
[Cycles while L1 cache miss demand load is outstanding]
cycle_activity.cycles_l1d_pending
[Cycles while L1 cache miss demand load is outstanding]
cycle_activity.stalls_l1d_miss
[Execution stalls while L1 cache miss demand load is outstanding]
cycle_activity.stalls_l1d_pending
[Execution stalls while L1 cache miss demand load is outstanding]
The above example was done on a Broadwell based ThinkPad t450s after
downloading and installing such JSON files which will be added to the
tools/perf/pmu-events/ directory in a subsequent patchkit.
Now one can use those names with -e/--event in all 'perf tools'.
(Andi Kleen, Sukadev Bhattiprolu)
- Add a missing pointer dereference in 'perf probe' (Colin Ian King)
- Add support for building host programs to be used in generating files
to be used in the build process, such as fixdep and jevents, fixing
the usage of these features in a cross compilation setup (Jiri Olsa)
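For the JSON event tables in the first item above, the following is a
minimal sketch of what a generated table and a lookup over it might look
like. The struct layout and the names event_row, event_table and
find_event() are assumptions made for illustration; the names, topics and
descriptions are copied from the 'perf list l1d' output quoted above, and
the case-insensitive match mirrors the "Make alias matching
case-insensitive" patch in spirit only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical row layout; real generated tables may carry more fields. */
struct event_row {
	const char *name;
	const char *topic;
	const char *desc;
};

/* Rows taken from the 'perf list l1d' output shown in this message. */
static const struct event_row event_table[] = {
	{ "l1d.replacement", "Cache", "L1D data line replacements" },
	{ "l2_trans.l1d_wb", "Cache", "L1D writebacks that access L2 cache" },
	{ "cycle_activity.stalls_l1d_miss", "Pipeline",
	  "Execution stalls while L1 cache miss demand load is outstanding" },
};

/* Case-insensitive lookup by name; returns NULL for an unknown name. */
static const struct event_row *find_event(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(event_table) / sizeof(event_table[0]); i++) {
		if (strcasecmp(event_table[i].name, name) == 0)
			return &event_table[i];
	}
	return NULL;
}
```

Keeping the table as static const data means the lookup needs no JSON
parsing at run time, which is the point of generating C at build time.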
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Andi Kleen (12):
perf tools: Add jsmn `jasmine' JSON parser
perf jevents: Program to convert JSON file
perf tools: Support CPU id matching for x86 v2
perf jevents: Handle header line in mapfile
perf pmu: Support alias descriptions
perf tools: Query terminal width and use in perf list
perf list: Add a --no-desc flag
perf pmu: Add override support for event list CPUID
perf list jevents: Add support for event list topics
perf tools: Make alias matching case-insensitive
perf pmu-events: Fix fixed counters on Intel
perf pmu-events: Add Skylake frontend MSR support
Arnaldo Carvalho de Melo (1):
perf tools: Experiment with cppcheck
Colin Ian King (1):
perf probe: Check if *ptr2 is zero and not ptr2
Jiri Olsa (2):
tools build: Add support for host programs format
tools build: Make fixdep a hostprog
Sukadev Bhattiprolu (6):
perf pmu: Use pmu_events table to create aliases
perf powerpc: Support CPU ID matching for Powerpc
perf jevents: Add support for long descriptions
perf list: Support long jevents descriptions
perf tools: Add README for info on parsing JSON/map files
perf tools: Allow period= in perf stat CPU event descriptions.
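The 'perf probe' entry in the shortlog above ("Check if *ptr2 is zero and
not ptr2") describes a common class of bug: testing a pointer where the
character it points to was meant. A minimal sketch of that class follows;
the helper names and the "name@suffix" input format are hypothetical and
are not taken from probe-event.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does "name@suffix" carry an empty suffix? */
static bool empty_suffix_buggy(const char *arg)
{
	const char *ptr2 = strchr(arg, '@');

	if (!ptr2)
		return false;
	ptr2++;
	/* Tests the pointer itself, which cannot be NULL at this point. */
	return ptr2 == NULL;
}

static bool empty_suffix_fixed(const char *arg)
{
	const char *ptr2 = strchr(arg, '@');

	if (!ptr2)
		return false;
	ptr2++;
	/* Tests the character the pointer refers to. */
	return *ptr2 == '\0';
}
```

With the input "func@" the buggy variant reports a non-empty suffix while
the fixed one reports an empty suffix.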
tools/build/Build | 2 +
tools/build/Build.include | 5 +
tools/build/Makefile | 8 +-
tools/build/Makefile.build | 19 +-
tools/build/Makefile.include | 4 -
tools/lib/subcmd/pager.c | 16 +
tools/lib/subcmd/pager.h | 1 +
tools/perf/Documentation/perf-list.txt | 12 +-
tools/perf/Makefile.perf | 34 +-
tools/perf/arch/powerpc/util/header.c | 11 +
tools/perf/arch/x86/util/header.c | 24 +-
tools/perf/builtin-list.c | 20 +-
tools/perf/pmu-events/Build | 13 +
tools/perf/pmu-events/README | 147 ++++++
tools/perf/pmu-events/jevents.c | 812 +++++++++++++++++++++++++++++++++
tools/perf/pmu-events/jevents.h | 18 +
tools/perf/pmu-events/jsmn.c | 313 +++++++++++++
tools/perf/pmu-events/jsmn.h | 67 +++
tools/perf/pmu-events/json.c | 162 +++++++
tools/perf/pmu-events/json.h | 38 ++
tools/perf/pmu-events/pmu-events.h | 37 ++
tools/perf/util/evlist.c | 12 +-
tools/perf/util/evsel.c | 3 +-
tools/perf/util/header.h | 1 +
tools/perf/util/machine.c | 6 +-
tools/perf/util/parse-events.c | 8 +-
tools/perf/util/parse-events.h | 3 +-
tools/perf/util/pmu.c | 176 ++++++-
tools/perf/util/pmu.h | 6 +-
tools/perf/util/probe-event.c | 2 +-
tools/perf/util/strbuf.h | 3 +-
tools/perf/util/thread.c | 9 +-
32 files changed, 1926 insertions(+), 66 deletions(-)
create mode 100644 tools/perf/pmu-events/Build
create mode 100644 tools/perf/pmu-events/README
create mode 100644 tools/perf/pmu-events/jevents.c
create mode 100644 tools/perf/pmu-events/jevents.h
create mode 100644 tools/perf/pmu-events/jsmn.c
create mode 100644 tools/perf/pmu-events/jsmn.h
create mode 100644 tools/perf/pmu-events/json.c
create mode 100644 tools/perf/pmu-events/json.h
create mode 100644 tools/perf/pmu-events/pmu-events.h
# time dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 fedora:20: Ok
11 fedora:21: Ok
12 fedora:22: Ok
13 fedora:23: Ok
14 fedora:24: Ok
15 fedora:24-x-ARC-uClibc: Ok
16 fedora:rawhide: Ok
17 mageia:5: Ok
18 opensuse:13.2: Ok
19 opensuse:42.1: Ok
20 opensuse:tumbleweed: Ok
21 ubuntu:12.04.5: Ok
22 ubuntu:14.04: Ok
23 ubuntu:14.04.4: Ok
24 ubuntu:15.10: Ok
25 ubuntu:16.04: Ok
26 ubuntu:16.04-x-arm: Ok
27 ubuntu:16.04-x-arm64: Ok
28 ubuntu:16.04-x-powerpc: Ok
29 ubuntu:16.04-x-powerpc64: Ok
30 ubuntu:16.04-x-powerpc64el: Ok
31 ubuntu:16.04-x-s390: Ok
32 ubuntu:16.10: Ok
real 33m23.855s
user 0m2.128s
sys 0m2.305s
#
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: detect openat syscall event : Ok
3: detect openat syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Ok
6: Validate PERF_RECORD_* events & perf_sample fields : Ok
7: Test perf pmu format parsing : Ok
8: Test dso data read : Ok
9: Test dso data cache : Ok
10: Test dso data reopen : Ok
11: roundtrip evsel->name check : Ok
12: Check parsing of sched tracepoints fields : Ok
13: Generate and check syscalls:sys_enter_openat event fields: Ok
14: struct perf_event_attr setup : Ok
15: Test matching and linking multiple hists : Ok
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : Ok
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : Ok
29: Test cumulation of child hist entries : Ok
30: Test tracking with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: Test kmod_path__parse function : Ok
34: Test thread map : Ok
35: Test LLVM searching and compiling :
35.1: Basic BPF llvm compiling test : Ok
35.2: Test kbuild searching : Ok
35.3: Compile source for BPF prologue generation test : Ok
35.4: Compile source for BPF relocation test : Ok
36: Test topology in session : Ok
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
38: Test thread map synthesize : Ok
39: Test cpu map synthesize : Ok
40: Test stat config synthesize : Ok
41: Test stat synthesize : Ok
42: Test stat round synthesize : Ok
43: Test attr update synthesize : Ok
44: Test events times : Ok
45: Test backward reading from ring buffer : Ok
46: Test cpu map print : Ok
47: Test SDT event probing : Ok
48: Test is_printable_array function : Ok
49: Test bitmap print : Ok
50: x86 rdpmc test : Ok
51: Test converting perf time to TSC : Ok
52: Test dwarf unwind : Ok
53: Test x86 instruction decoder - new instructions : Ok
54: Test intel cqm nmi context read : Skip
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_no_libperl_O: make NO_LIBPERL=1
make_no_newt_O: make NO_NEWT=1
make_no_slang_O: make NO_SLANG=1
make_no_libnuma_O: make NO_LIBNUMA=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libbpf_O: make NO_LIBBPF=1
make_no_demangle_O: make NO_DEMANGLE=1
make_install_bin_O: make install-bin
make_install_O: make install
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_tags_O: make tags
make_perf_o_O: make perf.o
make_static_O: make LDFLAGS=-static
make_clean_all_O: make clean all
make_install_prefix_O: make install prefix=/tmp/krava
make_help_O: make help
make_util_map_o_O: make util/map.o
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_debug_O: make DEBUG=1
make_no_libelf_O: make NO_LIBELF=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_doc_O: make doc
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_gtk2_O: make NO_GTK2=1
OK
* Re: [GIT PULL 00/22] perf/core improvements and fixes
2016-10-04 2:36 Arnaldo Carvalho de Melo
@ 2016-10-04 8:07 ` Ingo Molnar
0 siblings, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2016-10-04 8:07 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Linux Weekly News, Adrian Hunter,
Alexander Shishkin, Andi Kleen, Colin Ian King, David Ahern,
Jiri Olsa, linuxppc-dev, Madhavan Srinivasan, Masami Hiramatsu,
Namhyung Kim, Peter Zijlstra, Ravi Bangoria, Sukadev Bhattiprolu,
Wang Nan, Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
Pulled, thanks a lot Arnaldo!
Ingo
* [GIT PULL 00/22] perf/core improvements and fixes
@ 2016-09-20 20:03 Arnaldo Carvalho de Melo
2016-09-20 21:34 ` Ingo Molnar
0 siblings, 1 reply; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-09-20 20:03 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexander Shishkin, Andi Kleen, Chris Riyder, David Ahern,
Don Zickus, Hemant Kumar, Jiri Olsa, Joe Mario, Kim Phillips,
Markus Trippelsdorf, Masami Hiramatsu, Mathieu Poirier,
Michael Ellerman, Milian Wolff, Namhyung Kim, Naveen N . Rao,
Pawel Moll, Peter Zijlstra, pi3orama, Ravi Bangoria, Russell King,
Taeung Song, Wang Nan, Zefan Li, Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling,
- Arnaldo
The following changes since commit cd34cd97b7b4336aa2c623c37daffab264c7c6ce:
perf/x86/intel/uncore: Add Skylake server uncore support (2016-09-10 11:18:52 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo-20160920
for you to fetch changes up to 3c028a0cb5b71f47d523bc8ad2c597cb257f41fb:
perf symbols: Do not open device files (2016-09-20 16:20:21 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
User visible:
- Support event group view with hierarchy mode in 'perf top' and 'perf report'
(Namhyung Kim)
e.g.:
$ perf record -e '{cycles,instructions}' make
$ perf report --hierarchy --stdio
...
# Overhead Command / Shared Object / Symbol
# ...................... ..................................
...
25.74% 27.18% sh
19.96% 24.14% libc-2.24.so
9.55% 14.64% [.] __strcmp_sse2
1.54% 0.00% [.] __tfind
1.07% 1.13% [.] _int_malloc
0.95% 0.00% [.] __strchr_sse2
0.89% 1.39% [.] __tsearch
0.76% 0.00% [.] strlen
- Fix the dwarf regs table for x86_64, adding a missing % to the "%di"
register, noticed with a failing 'perf test bpf' (Arnaldo Carvalho de Melo)
- Fix handling of mmap parameters in the 'perf trace' beautifier in
architectures that don't have the same mappings as x86_64 (Wang Nan)
- Handle hugetlb mappings in older systems running new kernels (Wang Nan)
- Resolve 'call' operands in 'annotate' to function names; when using
/proc/kcore they were appearing just as hexadecimal addresses
(Arnaldo Carvalho de Melo)
- Fix width computation for srcline sort entry (Jiri Olsa)
- Do not ignore call instruction with indirect target in 'annotate'
(Ravi Bangoria)
- Handle MADV_FREE in the madvise 'trace' beautifier (Wang Nan)
- Fix build of 'perf trace' mman beautifier in !x86_64 (Wang Nan)
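For the madvise beautifier item above, here is a minimal sketch of mapping
an advice value to a printable name, with a fallback define for systems
whose headers predate MADV_FREE. The function madvise_advice_name() is a
hypothetical name and not the 'perf trace' code; the fallback value 8 is
the one used by the generic Linux headers and is an assumption for other
platforms.

```c
/* Ask glibc to expose the MADV_* constants under strict -std modes. */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>

/* Fallback for older libc headers; 8 is the generic Linux value. */
#ifndef MADV_FREE
#define MADV_FREE 8
#endif

/* Hypothetical beautifier: map an madvise() advice value to a short name. */
static const char *madvise_advice_name(int advice)
{
	switch (advice) {
	case MADV_NORMAL:     return "NORMAL";
	case MADV_RANDOM:     return "RANDOM";
	case MADV_SEQUENTIAL: return "SEQUENTIAL";
	case MADV_WILLNEED:   return "WILLNEED";
	case MADV_DONTNEED:   return "DONTNEED";
	case MADV_FREE:       return "FREE";
	default:              return "UNKNOWN";
	}
}
```

The #ifndef guard is what lets a tool built against old headers still
decode a value produced by a newer kernel.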
Infrastructure:
- Add infrastructure for PMU specific configuration, allowing config
variables to be passed directly to the kernel PMU driver by prefixing
them with a '@'; part of a larger series to support Coresight (Mathieu Poirier)
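A rough sketch of the '@' convention described above: terms prefixed with '@' are set aside for the kernel PMU driver instead of being interpreted by the tool. This is Python for illustration only; the function name, the example term strings and the returned structure are assumptions and do not mirror the actual parse-events.l/parse-events.y changes:

```python
def split_config_terms(terms):
    """Split 'a,@b,@c=val' into (tool_terms, driver_terms).

    Terms starting with '@' go to driver_terms with the '@' stripped;
    everything else is left for the tool to interpret.
    """
    tool_terms, driver_terms = [], []
    for term in terms.split(","):
        term = term.strip()
        if not term:
            continue
        if term.startswith("@"):
            driver_terms.append(term[1:])
        else:
            tool_terms.append(term)
    return tool_terms, driver_terms


print(split_config_terms("name=evt,@cfg1,@cfg2=config"))
# (['name=evt'], ['cfg1', 'cfg2=config'])
```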
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Build stats at the end of this message.
----------------------------------------------------------------
Arnaldo Carvalho de Melo (5):
perf probe: Fix dwarf regs table for x86_64
perf trace beauty mmap: Fix defines for non !x86_64
perf tools: Do hugetlb handling in more systems
perf annotate: Pass the symbol's map/dso to the instruction parsers
perf annotate: Resolve 'call' operands to function names
Jiri Olsa (2):
perf hists: Fix width computation for srcline sort entry
perf symbols: Do not open device files
Mathieu Poirier (1):
perf tools: Add infrastructure for PMU specific configuration
Namhyung Kim (9):
perf hists browser: Fix event group display
perf hists: Introduce hists__match_hierarchy()
perf hists: Introduce hists__link_hierarchy()
perf hist: Initialize hierarchy tree explicitly
perf ui/stdio: Always reset output width for hierarchy
perf ui/stdio: Rename print_hierarchy_header()
perf report: Enable group view with hierarchy
perf ui/tui: Reset output width for hierarchy
perf hists: Factor out hists__reset_column_width()
Ravi Bangoria (1):
perf annotate: Do not ignore call instruction with indirect target
Wang Nan (4):
tools include: Add uapi mman.h for each architecture
perf build: Compare mman.h related headers against kernel originals
perf trace beauty mmap: Add missing MADV_FREE
tools include: Add mman macros needed by perf for all arch
tools/arch/alpha/include/uapi/asm/mman.h | 47 ++++++++
tools/arch/arc/include/uapi/asm/mman.h | 6 +
tools/arch/arm/include/uapi/asm/mman.h | 6 +
tools/arch/arm64/include/uapi/asm/mman.h | 6 +
tools/arch/frv/include/uapi/asm/mman.h | 6 +
tools/arch/h8300/include/uapi/asm/mman.h | 6 +
tools/arch/hexagon/include/uapi/asm/mman.h | 6 +
tools/arch/ia64/include/uapi/asm/mman.h | 6 +
tools/arch/m32r/include/uapi/asm/mman.h | 6 +
tools/arch/microblaze/include/uapi/asm/mman.h | 6 +
tools/arch/mips/include/uapi/asm/mman.h | 46 ++++++++
tools/arch/mn10300/include/uapi/asm/mman.h | 6 +
tools/arch/parisc/include/uapi/asm/mman.h | 47 ++++++++
tools/arch/powerpc/include/uapi/asm/mman.h | 15 +++
tools/arch/s390/include/uapi/asm/mman.h | 6 +
tools/arch/score/include/uapi/asm/mman.h | 6 +
tools/arch/sh/include/uapi/asm/mman.h | 6 +
tools/arch/sparc/include/uapi/asm/mman.h | 15 +++
tools/arch/tile/include/uapi/asm/mman.h | 15 +++
tools/arch/x86/include/uapi/asm/mman.h | 5 +
tools/arch/xtensa/include/uapi/asm/mman.h | 47 ++++++++
tools/include/uapi/asm-generic/mman-common.h | 75 ++++++++++++
tools/include/uapi/asm-generic/mman.h | 22 ++++
tools/include/uapi/linux/mman.h | 13 +++
tools/perf/Documentation/perf-record.txt | 12 ++
tools/perf/MANIFEST | 4 +
tools/perf/Makefile.perf | 9 ++
tools/perf/arch/x86/include/dwarf-regs-table.h | 2 +-
tools/perf/builtin-report.c | 1 -
tools/perf/trace/beauty/mmap.c | 72 +-----------
tools/perf/ui/browsers/hists.c | 7 +-
tools/perf/ui/hist.c | 15 +++
tools/perf/ui/stdio/hist.c | 25 +---
tools/perf/util/annotate.c | 37 +++---
tools/perf/util/annotate.h | 2 +-
tools/perf/util/dso.c | 3 +
tools/perf/util/event.c | 7 +-
tools/perf/util/evsel.h | 2 +
tools/perf/util/hist.c | 154 ++++++++++++++++++++++++-
tools/perf/util/hist.h | 1 +
tools/perf/util/map.c | 9 +-
tools/perf/util/parse-events.c | 7 +-
tools/perf/util/parse-events.h | 1 +
tools/perf/util/parse-events.l | 22 ++++
tools/perf/util/parse-events.y | 11 ++
tools/perf/util/sort.h | 1 +
46 files changed, 698 insertions(+), 131 deletions(-)
create mode 100644 tools/arch/alpha/include/uapi/asm/mman.h
create mode 100644 tools/arch/arc/include/uapi/asm/mman.h
create mode 100644 tools/arch/arm/include/uapi/asm/mman.h
create mode 100644 tools/arch/arm64/include/uapi/asm/mman.h
create mode 100644 tools/arch/frv/include/uapi/asm/mman.h
create mode 100644 tools/arch/h8300/include/uapi/asm/mman.h
create mode 100644 tools/arch/hexagon/include/uapi/asm/mman.h
create mode 100644 tools/arch/ia64/include/uapi/asm/mman.h
create mode 100644 tools/arch/m32r/include/uapi/asm/mman.h
create mode 100644 tools/arch/microblaze/include/uapi/asm/mman.h
create mode 100644 tools/arch/mips/include/uapi/asm/mman.h
create mode 100644 tools/arch/mn10300/include/uapi/asm/mman.h
create mode 100644 tools/arch/parisc/include/uapi/asm/mman.h
create mode 100644 tools/arch/powerpc/include/uapi/asm/mman.h
create mode 100644 tools/arch/s390/include/uapi/asm/mman.h
create mode 100644 tools/arch/score/include/uapi/asm/mman.h
create mode 100644 tools/arch/sh/include/uapi/asm/mman.h
create mode 100644 tools/arch/sparc/include/uapi/asm/mman.h
create mode 100644 tools/arch/tile/include/uapi/asm/mman.h
create mode 100644 tools/arch/x86/include/uapi/asm/mman.h
create mode 100644 tools/arch/xtensa/include/uapi/asm/mman.h
create mode 100644 tools/include/uapi/asm-generic/mman-common.h
create mode 100644 tools/include/uapi/asm-generic/mman.h
create mode 100644 tools/include/uapi/linux/mman.h
[root@jouet ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: detect openat syscall event : Ok
3: detect openat syscall event on all cpus : Ok
4: read samples using the mmap interface : Ok
5: parse events tests : Ok
6: Validate PERF_RECORD_* events & perf_sample fields : Ok
7: Test perf pmu format parsing : Ok
8: Test dso data read : Ok
9: Test dso data cache : Ok
10: Test dso data reopen : Ok
11: roundtrip evsel->name check : Ok
12: Check parsing of sched tracepoints fields : Ok
13: Generate and check syscalls:sys_enter_openat event fields: Ok
14: struct perf_event_attr setup : Ok
15: Test matching and linking multiple hists : Ok
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : Ok
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : Ok
29: Test cumulation of child hist entries : Ok
30: Test tracking with sched_switch : Ok
31: Filter fds with revents mask in a fdarray : Ok
32: Add fd to a fdarray, making it autogrow : Ok
33: Test kmod_path__parse function : Ok
34: Test thread map : Ok
35: Test LLVM searching and compiling :
35.1: Basic BPF llvm compiling test : Ok
35.2: Test kbuild searching : Ok
35.3: Compile source for BPF prologue generation test : Ok
35.4: Compile source for BPF relocation test : Ok
36: Test topology in session : Ok
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
38: Test thread map synthesize : Ok
39: Test cpu map synthesize : Ok
40: Test stat config synthesize : Ok
41: Test stat synthesize : Ok
42: Test stat round synthesize : Ok
43: Test attr update synthesize : Ok
44: Test events times : Ok
45: Test backward reading from ring buffer : Ok
46: Test cpu map print : Ok
47: Test SDT event probing : Ok
48: Test is_printable_array function : Ok
49: Test bitmap print : Ok
50: x86 rdpmc test : Ok
51: Test converting perf time to TSC : Ok
52: Test dwarf unwind : Ok
53: Test x86 instruction decoder - new instructions : Ok
54: Test intel cqm nmi context read : Skip
[root@jouet ~]#
Build stats:
# time dm
1 74.534 alpine:3.4: Ok
2 25.636 android-ndk:r12b-arm: Ok
3 78.066 archlinux:latest: Ok
4 41.189 centos:5: Ok
5 64.550 centos:6: Ok
6 74.689 centos:7: Ok
7 68.580 debian:7: Ok
8 75.115 debian:8: Ok
9 75.288 fedora:20: Ok
10 79.294 fedora:21: Ok
11 76.839 fedora:22: Ok
12 76.695 fedora:23: Ok
13 82.058 fedora:24: Ok
14 31.649 fedora:24-x-ARC-uClibc: Ok
15 85.826 fedora:rawhide: Ok
16 83.272 mageia:5: Ok
17 76.883 opensuse:13.2: Ok
18 78.530 opensuse:42.1: Ok
19 85.315 opensuse:tumbleweed: Ok
20 63.436 ubuntu:12.04.5: Ok
21 40.909 ubuntu:14.04: Ok
22 72.689 ubuntu:14.04.4: Ok
23 76.374 ubuntu:15.10: Ok
24 70.309 ubuntu:16.04: Ok
25 59.159 ubuntu:16.04-x-arm: Ok
26 56.011 ubuntu:16.04-x-arm64: Ok
27 56.913 ubuntu:16.04-x-powerpc64: Ok
28 57.442 ubuntu:16.04-x-powerpc64el: Ok
29 80.282 ubuntu:16.10: Ok
30 60.964 ubuntu:16.10-x-arm64: Ok
31 61.390 ubuntu:16.10-x-powerpc: Ok
32 63.167 ubuntu:16.10-x-s390: Ok
real 35m54.027s
user 0m2.855s
sys 0m2.652s
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [GIT PULL 00/22] perf/core improvements and fixes
2016-09-20 20:03 Arnaldo Carvalho de Melo
@ 2016-09-20 21:34 ` Ingo Molnar
0 siblings, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2016-09-20 21:34 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexander Shishkin, Andi Kleen,
Chris Riyder, David Ahern, Don Zickus, Hemant Kumar, Jiri Olsa,
Joe Mario, Kim Phillips, Markus Trippelsdorf, Masami Hiramatsu,
Mathieu Poirier, Michael Ellerman, Milian Wolff, Namhyung Kim,
Naveen N . Rao, Pawel Moll, Peter Zijlstra, pi3orama,
Ravi Bangoria, Russell King, Taeung Song, Wang Nan, Zefan Li,
Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> [...]
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 36+ messages in thread
* [GIT PULL 00/22] perf/core improvements and fixes
@ 2016-02-19 22:41 Arnaldo Carvalho de Melo
2016-02-20 10:56 ` Ingo Molnar
0 siblings, 1 reply; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-02-19 22:41 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Alexei Starovoitov, Andi Kleen, Brendan Gregg, Cody P Schafer,
David Ahern, Frederic Weisbecker, He Kuang, Jeremie Galarneau,
Jiri Olsa, Kirill Smelkov, Li Zefan, Masami Hiramatsu,
Namhyung Kim, Peter Zijlstra, pi3orama, Stephane Eranian,
Steven Noonan, Wang Nan, Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling,
- Arnaldo
The following changes since commit 3b364d7b587db0f0eeafde0f271e0698187de776:
perf/core: Remove unused arguments from a bunch of functions (2016-02-17 10:37:48 +0100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
for you to fetch changes up to 5b2ea6f2f6ac81a230e6cc68e1473e796a583f00:
perf report: Check error during report__collapse_hists() (2016-02-19 19:17:50 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
User visible:
- Add 'perf record' --all-user/--all-kernel options, so that one can tell
that all the events in the command line should be restricted to the user
or kernel levels (Jiri Olsa), i.e.:
perf record -e cycles:u,instructions:u
is equivalent to:
perf record --all-user -e cycles,instructions
- Fix percentage update on key press, due to the buffering code
(that creates hist_entries that will later be consumed) touching
per hists state that is used by the display thread (Namhyung Kim)
- Bail out when event modifiers not supported by 'perf stat' are
specified, e.g.: (Wang Nan)
# perf stat -e cycles/no-inherit/ usleep 1
event syntax error: 'cycles/no-inherit/'
\___ 'no-inherit' is not usable in 'perf stat'
# perf stat -e cycles/foo/ usleep 1
event syntax error: 'cycles/foo/'
\___ unknown term
valid terms: config,config1,config2,name
#
- Enable setting names for legacy cache, raw and numeric events, e.g.: (Wang Nan)
# perf record -e cycles -e 4:0x6530160/name=evtx,call-graph=fp/ -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.659 MB perf.data (844 samples) ]
# perf evlist
cycles
evtx
#
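The --all-user/--all-kernel equivalence shown above can be sketched as a small string transformation. This is only an illustration in Python: apply_level() is a hypothetical helper, and it handles a plain comma-separated event list, not groups or events that already carry modifiers:

```python
def apply_level(events, level):
    """Append ':u' or ':k' to every event in a comma-separated list.

    level is 'u' (user) or 'k' (kernel).
    """
    if level not in ("u", "k"):
        raise ValueError("level must be 'u' or 'k'")
    return ",".join(f"{ev}:{level}" for ev in events.split(",") if ev)


# What '--all-user -e cycles,instructions' amounts to in modifier form.
print(apply_level("cycles,instructions", "u"))  # cycles:u,instructions:u
```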
Miscellaneous/Infrastructure:
- Handle the scaled == -1 case for counters in 'perf stat', fixing a
recent regression present only in perf/core (Andi Kleen)
- Reference count the cpu and thread maps at set_maps(), fixing the
'object code reading' 'perf test' entry when it was requesting a
perf_event_attr.sample_freq > /proc/sys/kernel/perf_event_max_sample_rate
(Arnaldo Carvalho de Melo)
- Improve perf_evlist__strerror_open() to provide hints for -EINVAL due
to perf_event_attr.sample_freq > /proc/sys/kernel/perf_event_max_sample_rate
(Arnaldo Carvalho de Melo)
- Add checks to various callchain and histogram routines (Namhyung Kim)
- Fix checking asprintf return value when parsing additional event config terms (Wang Nan)
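As a sketch of the sample_freq hint mentioned above: the condition is simply a requested frequency larger than the value in /proc/sys/kernel/perf_event_max_sample_rate. The Python below is illustrative only; the function names and the hint wording are assumptions, not what perf_evlist__strerror_open() actually prints:

```python
MAX_RATE_PATH = "/proc/sys/kernel/perf_event_max_sample_rate"


def read_max_sample_rate(path=MAX_RATE_PATH):
    """Read the current limit from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


def check_sample_freq(sample_freq, max_sample_rate):
    """Return None if the frequency is acceptable, else a hint string."""
    if sample_freq > max_sample_rate:
        return (f"sample_freq {sample_freq} exceeds {MAX_RATE_PATH} "
                f"({max_sample_rate}); lower the frequency or raise the limit")
    return None


print(check_sample_freq(4000, 100000))  # None
print(check_sample_freq(4000, 3000))
```

On a Linux machine the second argument would come from read_max_sample_rate(); it is passed explicitly here so the check can be exercised anywhere.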
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Andi Kleen (1):
perf stat: Handled scaled == -1 case for counters
Arnaldo Carvalho de Melo (5):
perf evlist: Reference count the cpu and thread maps at set_maps()
perf evlist: Handle -EINVAL for sample_freq > max_sample_rate in strerror_open()
perf tests: Use perf_evlist__strerror_open() to provide hints about max_freq
perf test: Reduce the sample_freq for the 'object code reading' test
perf tools: Introduce opt_event_config nonterminal
Jiri Olsa (1):
perf record: Add --all-user/--all-kernel options
Namhyung Kim (8):
perf hists browser: Fix percentage update on key press
perf callchain: Check return value of add_child()
perf callchain: Check return value of fill_node()
perf callchain: Add enum match_result for match_chain()
perf callchain: Check return value of split_add_child()
perf callchain: Check return value of append_chain_children()
perf hists: Return error from hists__collapse_resort()
perf report: Check error during report__collapse_hists()
Wang Nan (7):
perf bpf: Rename bpf_prog_priv__clear() to clear_prog_priv()
perf tools: Fix checking asprintf return value
perf tools: Create config_term_names array
perf stat: Bail out on unsupported event config modifiers
perf tools: Rename and move pmu_event_name to get_config_name
perf tools: Enable config raw and numeric events
perf tools: Enable config and setting names for legacy cache events
tools/perf/Documentation/perf-record.txt | 6 ++
tools/perf/builtin-record.c | 6 ++
tools/perf/builtin-report.c | 14 ++-
tools/perf/builtin-stat.c | 3 +-
tools/perf/perf.h | 2 +
tools/perf/tests/code-reading.c | 10 +-
tools/perf/tests/parse-events.c | 52 ++++++++++
tools/perf/util/bpf-loader.c | 6 +-
tools/perf/util/callchain.c | 102 +++++++++++++-----
tools/perf/util/evlist.c | 24 ++++-
tools/perf/util/evsel.c | 10 ++
tools/perf/util/hist.c | 55 +++++++---
tools/perf/util/hist.h | 6 +-
tools/perf/util/parse-events.c | 173 ++++++++++++++++++++++++++-----
tools/perf/util/parse-events.h | 8 +-
tools/perf/util/parse-events.l | 3 +-
tools/perf/util/parse-events.y | 75 +++++++-------
17 files changed, 426 insertions(+), 129 deletions(-)
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [GIT PULL 00/22] perf/core improvements and fixes
2016-02-19 22:41 Arnaldo Carvalho de Melo
@ 2016-02-20 10:56 ` Ingo Molnar
0 siblings, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2016-02-20 10:56 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Alexei Starovoitov, Andi Kleen,
Brendan Gregg, Cody P Schafer, David Ahern, Frederic Weisbecker,
He Kuang, Jeremie Galarneau, Jiri Olsa, Kirill Smelkov, Li Zefan,
Masami Hiramatsu, Namhyung Kim, Peter Zijlstra, pi3orama,
Stephane Eranian, Steven Noonan, Wang Nan, Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> [...]
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 36+ messages in thread
* [GIT PULL 00/22] perf/core improvements and fixes
@ 2015-08-26 15:57 Arnaldo Carvalho de Melo
2015-08-28 6:24 ` Ingo Molnar
0 siblings, 1 reply; 36+ messages in thread
From: Arnaldo Carvalho de Melo @ 2015-08-26 15:57 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Adrian Hunter,
Borislav Petkov, David Ahern, Frederic Weisbecker, Jiri Olsa,
Masami Hiramatsu, Namhyung Kim, pi3orama, Stephane Eranian,
Steven Rostedt, Sukadev Bhattiprolu, Wang Nan, Zefan Li,
Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling, this replaces the previous perf-core-for-mingo
pull req, replacing the last patch in that series and adding a few more fixes from
Jiri and Wang,
Thanks,
- Arnaldo
The following changes since commit 0e53909a1cf0153736fb52c216558a65530d8c40:
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2015-08-22 08:45:46 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-core-for-mingo
for you to fetch changes up to a2fb3382edbea83c6f2bf6ac15e3673b2e254aad:
tracing/uprobes: Do not print '0x (null)' when offset is 0 (2015-08-26 10:43:01 -0300)
----------------------------------------------------------------
perf/core improvements and fixes:
User visible:
- Add support for using several Intel PT features (CYC, MTC packets). The
relevant documentation, tools/perf/Documentation/intel-pt.txt, was updated to
briefly describe those packets, their purposes, how to configure them in
the event config terms, and relevant external documentation for further
reading. (Adrian Hunter)
- Introduce support for probing at an absolute address, for user and kernel
'perf probe's, useful when one has the symbol maps on a developer machine
but not on an embedded system (Wang Nan)
- Fix 'perf probe' list results when a symbol can't be found or the
address is zero and when an offset is provided without a function (Wang Nan)
- Do not print '0x (null)' in uprobes when offset is zero (Wang Nan)
- Clear the progress bar at the end of an ordered_events flush, fixing
a UI artifact when, after ordering the events, the screen doesn't get
completely redrawn, for instance, when an error window covers just the
center of the screen and waits for user input. (Arnaldo Carvalho de Melo)
- Fix 'annotate' segfault by resetting the dso find_symbol cache when removing
symbols (Arnaldo Carvalho de Melo)
Infrastructure:
- Allow duplicate objects in the object list, just like it is possible to have
things like this, in the kernel: (Jiri Olsa)
drivers/Makefile:obj-$(CONFIG_PCI) += usb/
drivers/Makefile:obj-$(CONFIG_USB_GADGET) += usb/
- Fix Intel PT 'instructions' sample period (Adrian Hunter)
- Prevent segfault when reading probe point with absolute address (Wang Nan)
Build fixes:
- Fix tarball build broken by pt/bts (Adrian Hunter)
- Remove export.h from MANIFEST, fixing the perf tarball make target (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Adrian Hunter (11):
perf tools: Fix tarball build broken by pt/bts
perf tools: Fix Intel PT 'instructions' sample period
perf tools: Add Intel PT support for PSB periods
perf tools: Add new Intel PT packet definitions
perf tools: Pass Intel PT information for decoding MTC and CYC
perf tools: Add Intel PT support for decoding MTC packets
perf tools: Add Intel PT support for using MTC packets
perf tools: Add Intel PT support for decoding CYC packets
perf tools: Add Intel PT support for using CYC packets
perf tools: Add Intel PT support for decoding TRACESTOP packets
perf tools: Update Intel PT documentation
Arnaldo Carvalho de Melo (3):
perf annotate: Reset the dso find_symbol cache when removing symbols
perf ui tui progress: Implement the ui_progress_ops->finish() method
perf ordered_events: Clear the progress bar at the end of a flush
Jiri Olsa (2):
perf tools: Remove export.h from MANIFEST
tools build: Allow duplicate objects in the object list
Wang Nan (6):
perf probe: Prevent segfault when reading probe point with absolute address
perf probe: Fix list result when symbol can't be found
perf probe: Fix list result when address is zero
perf probe: Fix error reported when offset without function
perf probe: Support probing at absolute address
tracing/uprobes: Do not print '0x (null)' when offset is 0
kernel/trace/trace_uprobe.c | 17 +-
tools/build/Documentation/Build.txt | 1 +
tools/build/Makefile.build | 2 +-
tools/build/tests/ex/Build | 1 +
tools/perf/Documentation/intel-pt.txt | 194 ++++++-
tools/perf/MANIFEST | 1 -
tools/perf/arch/x86/util/intel-pt.c | 271 +++++++++-
tools/perf/builtin-annotate.c | 1 +
tools/perf/ui/tui/progress.c | 19 +-
tools/perf/util/dso.h | 2 +
tools/perf/util/intel-pt-decoder/inat.c | 2 +-
tools/perf/util/intel-pt-decoder/inat.h | 2 +-
tools/perf/util/intel-pt-decoder/inat_types.h | 29 ++
tools/perf/util/intel-pt-decoder/insn.c | 4 +-
tools/perf/util/intel-pt-decoder/insn.h | 2 +-
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 555 ++++++++++++++++++++-
.../perf/util/intel-pt-decoder/intel-pt-decoder.h | 5 +
.../util/intel-pt-decoder/intel-pt-insn-decoder.c | 2 +-
.../util/intel-pt-decoder/intel-pt-pkt-decoder.c | 142 +++++-
.../util/intel-pt-decoder/intel-pt-pkt-decoder.h | 6 +
tools/perf/util/intel-pt.c | 67 ++-
tools/perf/util/intel-pt.h | 5 +
tools/perf/util/ordered-events.c | 3 +
tools/perf/util/probe-event.c | 210 +++++++-
tools/perf/util/probe-event.h | 4 +
tools/perf/util/probe-finder.c | 21 +-
tools/perf/util/symbol.c | 10 +
27 files changed, 1481 insertions(+), 97 deletions(-)
create mode 100644 tools/perf/util/intel-pt-decoder/inat_types.h
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [GIT PULL 00/22] perf/core improvements and fixes
2015-08-26 15:57 Arnaldo Carvalho de Melo
@ 2015-08-28 6:24 ` Ingo Molnar
0 siblings, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2015-08-28 6:24 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Adrian Hunter, Borislav Petkov, David Ahern,
Frederic Weisbecker, Jiri Olsa, Masami Hiramatsu, Namhyung Kim,
pi3orama, Stephane Eranian, Steven Rostedt, Sukadev Bhattiprolu,
Wang Nan, Zefan Li, Arnaldo Carvalho de Melo, Peter Zijlstra
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling, this replaces the previous perf-core-for-mingo
> pull req, replacing the last patch in that series and adding a few more fixes from
> Jiri and Wang,
>
> [...]
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 36+ messages in thread
* [GIT PULL 00/22] perf/core improvements and fixes
@ 2014-05-21 13:12 Jiri Olsa
2014-05-22 9:38 ` Ingo Molnar
0 siblings, 1 reply; 36+ messages in thread
From: Jiri Olsa @ 2014-05-21 13:12 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Corey Ashford,
David Ahern, Frederic Weisbecker, Michael Lentine, Namhyung Kim,
Paul Mackerras, Peter Zijlstra, Stephane Eranian, Jiri Olsa
hi Ingo,
please consider pulling
thanks,
jirka
The following changes since commit 6480c56130ba073df84d57d61062ec4118b10bbe:
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf into perf/core (2014-05-20 08:36:09 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git tags/perf-core-for-mingo
for you to fetch changes up to eca8183699964579ca8a0b8d116bd1f4da0136f7:
perf tools: Add automatic remapping of Android libraries (2014-05-21 15:03:25 +0200)
----------------------------------------------------------------
perf/core improvements and fixes:
. Android related fixes for pager and map dso resolving (Michael Lentine)
. Add -F option for specifying output fields (Namhyung Kim)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
----------------------------------------------------------------
Michael Lentine (2):
perf tools: Add cat as fallback pager
perf tools: Add automatic remapping of Android libraries
Namhyung Kim (20):
perf tools: Add ->cmp(), ->collapse() and ->sort() to perf_hpp_fmt
perf tools: Convert sort entries to hpp formats
perf tools: Use hpp formats to sort hist entries
perf tools: Support event grouping in hpp ->sort()
perf tools: Use hpp formats to sort final output
perf tools: Consolidate output field handling to hpp format routines
perf ui: Get rid of callback from __hpp__fmt()
perf tools: Allow hpp fields to be sort keys
perf tools: Consolidate management of default sort orders
perf tools: Call perf_hpp__init() before setting up GUI browsers
perf report: Add -F option to specify output fields
perf tools: Add ->sort() member to struct sort_entry
perf report/tui: Fix a bug when --fields/sort is given
perf top: Add --fields option to specify output fields
perf tools: Skip elided sort entries
perf hists: Reset width of output fields with header length
perf tools: Get rid of obsolete hist_entry__sort_list
perf tools: Introduce reset_output_field()
perf tests: Factor out print_hists_*()
perf tests: Add a testcase for histogram output sorting
tools/perf/Documentation/perf-diff.txt | 5 +-
tools/perf/Documentation/perf-report.txt | 19 +
tools/perf/Documentation/perf-top.txt | 14 +-
tools/perf/Makefile.perf | 1 +
tools/perf/builtin-diff.c | 7 +-
tools/perf/builtin-report.c | 41 +-
tools/perf/builtin-top.c | 20 +-
tools/perf/tests/builtin-test.c | 4 +
tools/perf/tests/hists_common.c | 57 +++
tools/perf/tests/hists_common.h | 3 +
tools/perf/tests/hists_filter.c | 38 +-
tools/perf/tests/hists_link.c | 30 +-
tools/perf/tests/hists_output.c | 618 +++++++++++++++++++++++++++++++
tools/perf/tests/tests.h | 1 +
tools/perf/ui/browsers/hists.c | 104 +++---
tools/perf/ui/gtk/hists.c | 41 +-
tools/perf/ui/hist.c | 244 +++++++++---
tools/perf/ui/setup.c | 2 -
tools/perf/ui/stdio/hist.c | 79 ++--
tools/perf/util/hist.c | 83 ++---
tools/perf/util/hist.h | 27 +-
tools/perf/util/map.c | 95 ++++-
tools/perf/util/pager.c | 12 +-
tools/perf/util/sort.c | 436 ++++++++++++++++++++--
tools/perf/util/sort.h | 6 +
25 files changed, 1601 insertions(+), 386 deletions(-)
create mode 100644 tools/perf/tests/hists_output.c
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [GIT PULL 00/22] perf/core improvements and fixes
2014-05-21 13:12 Jiri Olsa
@ 2014-05-22 9:38 ` Ingo Molnar
0 siblings, 0 replies; 36+ messages in thread
From: Ingo Molnar @ 2014-05-22 9:38 UTC (permalink / raw)
To: Jiri Olsa
Cc: linux-kernel, Arnaldo Carvalho de Melo, Corey Ashford,
David Ahern, Frederic Weisbecker, Michael Lentine, Namhyung Kim,
Paul Mackerras, Peter Zijlstra, Stephane Eranian
* Jiri Olsa <jolsa@kernel.org> wrote:
> hi Ingo,
> please consider pulling
>
> thanks,
> jirka
>
> [...]
Pulled, thanks a lot Jiri!
Ingo
^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2018-11-30 18:28 UTC | newest]
Thread overview: 36+ messages
2018-11-30 18:26 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 01/22] perf build: Give better hint about devel package for libssl Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 02/22] perf stat: Fix shadow stats for clock events Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 03/22] perf stat: Fix CSV mode column output for non-cgroup events Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 04/22] perf map: Remove extra indirection from map__find() Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 05/22] perf env: Also consider env->arch == NULL as local operation Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 06/22] perf machine: Record if a arch has a single user/kernel address space Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 07/22] perf thread: Add fallback functions for cases where cpumode is insufficient Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 08/22] perf tools: Use fallback for sample_addr_correlates_sym() cases Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 09/22] perf script: Use fallbacks for branch stacks Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 10/22] tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 11/22] perf tests record: Allow for 'sleep' being 'coreutils' Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 12/22] perf test: Fix perf_event_attr test failure Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 13/22] tools include: Adopt ERR_CAST() from the kernel err.h header Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 14/22] perf bpf: Use ERR_CAST instead of ERR_PTR(PTR_ERR()) Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 15/22] perf top: Allow passing a kallsyms file Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 16/22] perf intel-pt: Fix error with config term "pt=0" Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 17/22] tools build feature: Check if libaio is available Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 18/22] perf mmap: Map data buffer for preserving collected data Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 19/22] perf record: Enable asynchronous trace writing Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 20/22] perf record: Extend trace writing to multi AIO Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 21/22] perf beauty mmap_flags: Check if the arch has a mmap.h file Arnaldo Carvalho de Melo
2018-11-30 18:26 ` [PATCH 22/22] tools lib traceevent: Add sanity check to is_timestamp_in_us() Arnaldo Carvalho de Melo
-- strict thread matches above, loose matches on Subject: below --
2017-04-24 19:54 [GIT PULL 00/22] perf/core improvements and fixes Arnaldo Carvalho de Melo
2017-04-24 20:40 ` Ingo Molnar
2016-12-13 15:09 Arnaldo Carvalho de Melo
2016-10-04 2:36 Arnaldo Carvalho de Melo
2016-10-04 8:07 ` Ingo Molnar
2016-09-20 20:03 Arnaldo Carvalho de Melo
2016-09-20 21:34 ` Ingo Molnar
2016-02-19 22:41 Arnaldo Carvalho de Melo
2016-02-20 10:56 ` Ingo Molnar
2015-08-26 15:57 Arnaldo Carvalho de Melo
2015-08-28 6:24 ` Ingo Molnar
2014-05-21 13:12 Jiri Olsa
2014-05-22 9:38 ` Ingo Molnar