* [GIT PULL 0/1] perf probe fix
@ 2017-06-23 0:02 Arnaldo Carvalho de Melo
2017-06-23 0:02 ` [PATCH 1/1] perf probe: Fix probe definition for inlined functions Arnaldo Carvalho de Melo
2017-06-23 8:04 ` [GIT PULL 0/1] perf probe fix Ingo Molnar
0 siblings, 2 replies; 3+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-23 0:02 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnaldo Carvalho de Melo, Björn Töpel,
Magnus Karlsson, Masami Hiramatsu, stable,
Arnaldo Carvalho de Melo
Hi Ingo,
Please consider pulling,
- Arnaldo
Test results at the end of this message, as usual.
The following changes since commit fb3a5055cd7098f8d1dd0cd38d7172211113255f:
perf/x86/intel: Add 1G DTLB load/store miss support for SKL (2017-06-22 11:07:08 +0200)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo-4.12-20170622
for you to fetch changes up to 7598f8bc1383ffd77686cb4e92e749bef3c75937:
perf probe: Fix probe definition for inlined functions (2017-06-22 16:08:09 -0300)
----------------------------------------------------------------
perf probe fix:
- Do not double the offset of inline expansions when using
'perf probe' on inlined functions (Björn Töpel)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
----------------------------------------------------------------
Björn Töpel (1):
perf probe: Fix probe definition for inlined functions
tools/perf/util/probe-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Test results:
The first ones are container (docker) based builds of tools/perf with and
without libelf support, objtool where it is supported and samples/bpf/, ditto.
Where clang is available, it is also used to build perf with/without libelf.
Several are cross builds, the ones with -x-ARCH, and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
# dm
1 alpine:3.4: Ok
2 alpine:3.5: Ok
3 alpine:3.6: Ok
4 alpine:edge: Ok
5 android-ndk:r12b-arm: Ok
6 archlinux:latest: Ok
7 centos:5: Ok
8 centos:6: Ok
9 centos:7: Ok
10 debian:7: Ok
11 debian:8: Ok
12 debian:9: Ok
13 debian:experimental: Ok
14 debian:experimental-x-arm64: Ok
15 debian:experimental-x-mips: Ok
16 debian:experimental-x-mips64: Ok
17 debian:experimental-x-mipsel: Ok
18 fedora:20: Ok
19 fedora:21: Ok
20 fedora:22: Ok
21 fedora:23: Ok
22 fedora:24: Ok
23 fedora:24-x-ARC-uClibc: Ok
24 fedora:25: Ok
25 fedora:rawhide: Ok
26 mageia:5: Ok
27 opensuse:13.2: Ok
28 opensuse:42.1: Ok
29 opensuse:tumbleweed: Ok
30 ubuntu:12.04.5: Ok
31 ubuntu:14.04.4: Ok
32 ubuntu:14.04.4-x-linaro-arm64: Ok
33 ubuntu:15.10: Ok
34 ubuntu:16.04: Ok
35 ubuntu:16.04-x-arm: Ok
36 ubuntu:16.04-x-arm64: Ok
37 ubuntu:16.04-x-powerpc: Ok
38 ubuntu:16.04-x-powerpc64: Ok
39 ubuntu:16.04-x-powerpc64el: Ok
40 ubuntu:16.04-x-s390: Ok
41 ubuntu:16.10: Ok
42 ubuntu:17.04: Ok
# uname -a
Linux jouet 4.12.0-rc4+ #1 SMP Fri Jun 9 12:59:23 -03 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@jouet ~]# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Parse event definition strings : Ok
6: Simple expression parser : Ok
7: PERF_RECORD_* events & perf_sample fields : Ok
8: Parse perf pmu format : Ok
9: DSO data read : Ok
10: DSO data cache : Ok
11: DSO data reopen : Ok
12: Roundtrip evsel->name : Ok
13: Parse sched tracepoints fields : Ok
14: syscalls:sys_enter_openat event fields : Ok
15: Setup struct perf_event_attr : Ok
16: Match and link multiple hists : Ok
17: 'import perf' in python : Ok
18: Breakpoint overflow signal handler : Ok
19: Breakpoint overflow sampling : Ok
20: Number of exit events of a simple workload : Ok
21: Software clock events period values : Ok
22: Object code reading : Ok
23: Sample parsing : Ok
24: Use a dummy software event to keep tracking: Ok
25: Parse with no sample_id_all bit set : Ok
26: Filter hist entries : Ok
27: Lookup mmap thread : Ok
28: Share thread mg : Ok
29: Sort output of hist entries : Ok
30: Cumulate child hist entries : Ok
31: Track with sched_switch : Ok
32: Filter fds with revents mask in a fdarray : Ok
33: Add fd to a fdarray, making it autogrow : Ok
34: kmod_path__parse : Ok
35: Thread map : Ok
36: LLVM search and compile :
36.1: Basic BPF llvm compile : Ok
36.2: kbuild searching : Ok
36.3: Compile source for BPF prologue generation: Ok
36.4: Compile source for BPF relocation : Ok
37: Session topology : Ok
38: BPF filter :
38.1: Basic BPF filtering : Ok
38.2: BPF pinning : Ok
38.3: BPF prologue generation : Ok
38.4: BPF relocation checker : Ok
39: Synthesize thread map : Ok
40: Remove thread map : Ok
41: Synthesize cpu map : Ok
42: Synthesize stat config : Ok
43: Synthesize stat : Ok
44: Synthesize stat round : Ok
45: Synthesize attr update : Ok
46: Event times : Ok
47: Read backward ring buffer : Ok
48: Print cpu map : Ok
49: Probe SDT events : Ok
50: is_printable_array : Ok
51: Print bitmap : Ok
52: perf hooks : Ok
53: builtin clang support : Skip (not compiled in)
54: unit_number__scnprintf : Ok
55: x86 rdpmc : Ok
56: Convert perf time to TSC : Ok
57: DWARF unwind : Ok
58: x86 instruction decoder - new instructions : Ok
59: Intel cqm nmi context read : Skip
#
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
make_no_newt_O: make NO_NEWT=1
make_pure_O: make
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_slang_O: make NO_SLANG=1
make_with_babeltrace_O: make LIBBABELTRACE=1
make_perf_o_O: make perf.o
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_libbpf_O: make NO_LIBBPF=1
make_no_gtk2_O: make NO_GTK2=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_static_O: make LDFLAGS=-static
make_clean_all_O: make clean all
make_no_libnuma_O: make NO_LIBNUMA=1
make_util_map_o_O: make util/map.o
make_tags_O: make tags
make_no_demangle_O: make NO_DEMANGLE=1
make_install_O: make install
make_debug_O: make DEBUG=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_install_bin_O: make install-bin
make_help_O: make help
make_no_auxtrace_O: make NO_AUXTRACE=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_no_libperl_O: make NO_LIBPERL=1
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libelf_O: make NO_LIBELF=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1
make_no_libaudit_O: make NO_LIBAUDIT=1
make_install_prefix_O: make install prefix=/tmp/krava
make_doc_O: make doc
OK
make: Leaving directory '/home/acme/git/linux/tools/perf'
$
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 1/1] perf probe: Fix probe definition for inlined functions
2017-06-23 0:02 [GIT PULL 0/1] perf probe fix Arnaldo Carvalho de Melo
@ 2017-06-23 0:02 ` Arnaldo Carvalho de Melo
2017-06-23 8:04 ` [GIT PULL 0/1] perf probe fix Ingo Molnar
1 sibling, 0 replies; 3+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-23 0:02 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Björn Töpel, stable,
Arnaldo Carvalho de Melo
From: Björn Töpel <bjorn.topel@intel.com>
In commit 613f050d68a8 ("perf probe: Fix to probe on gcc generated
functions in modules"), the offset from symbol is, incorrectly, added
to the trace point address. This leads to incorrect probe trace points
for inlined functions and when using relative line number on symbols.
Prior this patch:
$ perf probe -m nf_nat -D in_range
p:probe/in_range nf_nat:in_range.isra.9+0
$ perf probe -m i40e -D i40e_clean_rx_irq
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2212
$ perf probe -m i40e -D i40e_clean_rx_irq:16
p:probe/i40e_clean_rx_irq i40e:i40e_lan_xmit_frame+626
After:
$ perf probe -m nf_nat -D in_range
p:probe/in_range nf_nat:in_range.isra.9+0
$ perf probe -m i40e -D i40e_clean_rx_irq
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+1106
$ perf probe -m i40e -D i40e_clean_rx_irq:16
p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2665
Committer testing:
Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask what are
the functions that while not being explicitely marked as inline, were inlined
by the compiler:
# pfunct --cc_inlined /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko | head
__ew32
e1000_regdump
e1000e_dump_ps_pages
e1000_desc_unused
e1000e_systim_to_hwtstamp
e1000e_rx_hwtstamp
e1000e_update_rdt_wa
e1000e_update_tdt_wa
e1000_put_txbuf
e1000_consume_page
Then ask 'perf probe' to produce the kprobe_tracer probe definitions for two of
them:
# perf probe -m e1000e -D e1000e_rx_hwtstamp
p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+74
# perf probe -m e1000e -D e1000_consume_page
p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+876
p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+1506
p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
Now lets concentrate on the 'e1000_consume_page' one, that was inlined twice in
e1000_clean_jumbo_rx_irq(), lets see what readelf says about the DWARF tags for
that function:
$ readelf -wi /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
<SNIP>
<1><13e27b>: Abbrev Number: 121 (DW_TAG_subprogram)
<13e27c> DW_AT_name : (indirect string, offset: 0xa8945): e1000_clean_jumbo_rx_irq
<13e287> DW_AT_low_pc : 0x17a30
<3><13e6ef>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13e6f0> DW_AT_abstract_origin: <0x13ed2c>
<13e6f4> DW_AT_low_pc : 0x17be6
<SNIP>
<1><13ed2c>: Abbrev Number: 142 (DW_TAG_subprogram)
<13ed2e> DW_AT_name : (indirect string, offset: 0xa54c3): e1000_consume_page
So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is
inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s
address, gives us the offset we should use in the probe definition:
0x17be6 - 0x17a30 = 438
but above we have 876, which is twice as much.
Lets see the second inline expansion of e1000_consume_page() in
e1000_clean_jumbo_rx_irq():
<3><13e86e>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13e86f> DW_AT_abstract_origin: <0x13ed2c>
<13e873> DW_AT_low_pc : 0x17d21
0x17d21 - 0x17a30 = 753
So we where adding it at twice the offset from the containing function as we
should.
And then after this patch:
# perf probe -m e1000e -D e1000e_rx_hwtstamp
p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+37
# perf probe -m e1000e -D e1000_consume_page
p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+438
p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+753
p:probe/e1000_consume_page_2 e1000e:e1000_clean_jumbo_rx_irq+1353
#
Which matches the two first expansions and shows that because we were
doubling the offset it would spill over the next function:
readelf -sw /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko
673: 0000000000017a30 1626 FUNC LOCAL DEFAULT 2 e1000_clean_jumbo_rx_irq
674: 0000000000018090 2013 FUNC LOCAL DEFAULT 2 e1000_clean_rx_irq_ps
This is the 3rd inline expansion of e1000_consume_page() in
e1000_clean_jumbo_rx_irq():
<3><13ec77>: Abbrev Number: 119 (DW_TAG_inlined_subroutine)
<13ec78> DW_AT_abstract_origin: <0x13ed2c>
<13ec7c> DW_AT_low_pc : 0x17f79
0x17f79 - 0x17a30 = 1353
So:
0x17a30 + 2 * 1353 = 0x184c2
And:
0x184c2 - 0x18090 = 1074
Which explains the bogus third expansion for e1000_consume_page() to end up at:
p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074
All fixed now :-)
[1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 613f050d68a8 ("perf probe: Fix to probe on gcc generated functions in modules")
Link: http://lkml.kernel.org/r/20170621164134.5701-1-bjorn.topel@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/probe-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 84e7e698411e..a2670e9d652d 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -619,7 +619,7 @@ static int post_process_probe_trace_point(struct probe_trace_point *tp,
struct map *map, unsigned long offs)
{
struct symbol *sym;
- u64 addr = tp->address + tp->offset - offs;
+ u64 addr = tp->address - offs;
sym = map__find_symbol(map, addr);
if (!sym)
--
2.9.4
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [GIT PULL 0/1] perf probe fix
2017-06-23 0:02 [GIT PULL 0/1] perf probe fix Arnaldo Carvalho de Melo
2017-06-23 0:02 ` [PATCH 1/1] perf probe: Fix probe definition for inlined functions Arnaldo Carvalho de Melo
@ 2017-06-23 8:04 ` Ingo Molnar
1 sibling, 0 replies; 3+ messages in thread
From: Ingo Molnar @ 2017-06-23 8:04 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Björn Töpel, Magnus Karlsson,
Masami Hiramatsu, stable, Arnaldo Carvalho de Melo
* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit fb3a5055cd7098f8d1dd0cd38d7172211113255f:
>
> perf/x86/intel: Add 1G DTLB load/store miss support for SKL (2017-06-22 11:07:08 +0200)
>
> are available in the git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo-4.12-20170622
>
> for you to fetch changes up to 7598f8bc1383ffd77686cb4e92e749bef3c75937:
>
> perf probe: Fix probe definition for inlined functions (2017-06-22 16:08:09 -0300)
>
> ----------------------------------------------------------------
> perf probe fix:
>
> - Do not double the offset of inline expansions when using
> 'perf probe' on inlined functions (Björn Töpel)
>
> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
>
> ----------------------------------------------------------------
> Björn Töpel (1):
> perf probe: Fix probe definition for inlined functions
>
> tools/perf/util/probe-event.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Pulled, thanks a lot Arnaldo!
Ingo
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-06-23 8:04 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-23 0:02 [GIT PULL 0/1] perf probe fix Arnaldo Carvalho de Melo
2017-06-23 0:02 ` [PATCH 1/1] perf probe: Fix probe definition for inlined functions Arnaldo Carvalho de Melo
2017-06-23 8:04 ` [GIT PULL 0/1] perf probe fix Ingo Molnar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox