From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Tom Rix <trix@redhat.com>, Kan Liang <kan.liang@linux.intel.com>,
	Yang Jihong <yangjihong1@huawei.com>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	Carsten Haitzler <carsten.haitzler@arm.com>,
	Zhengjun Xing <zhengjun.xing@linux.intel.com>,
	James Clark <james.clark@arm.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	bpf@vger.kernel.org, llvm@lists.linux.dev, maskray@google.com
Subject: Re: [PATCH v1 0/4] Perf tool LTO support
Date: Mon, 24 Jul 2023 18:29:31 -0300
Message-ID: <ZL7tO4pwpfX8n0gZ@kernel.org>
In-Reply-To: <20230724201247.748146-1-irogers@google.com>

On Mon, Jul 24, 2023 at 01:12:43PM -0700, Ian Rogers wrote:
> Add a build flag, LTO=1, so that perf is built with the -flto
> flag. Address some build errors this configuration throws up.
>
> For me on my Debian derived OS, "CC=clang CXX=clang++ LD=ld.lld" works
> fine. With GCC LTO this fails with:
> ```
> lto-wrapper: warning: using serial compilation of 50 LTRANS jobs
> lto-wrapper: note: see the ‘-flto’ option documentation for more information
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel.ro+0x28): undefined reference to `memset_orig'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel.ro+0x40): undefined reference to `__memset'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel+0x28): undefined reference to `memcpy_orig'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel+0x40): undefined reference to `__memcpy'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans44.ltrans.o: in function `test__arch_unwind_sample':
> /home/irogers/kernel.org/tools/perf/arch/x86/tests/dwarf-unwind.c:72: undefined reference to `perf_regs_load'
> collect2: error: ld returned 1 exit status
> ```
>
> The issue is that we build multiple .o files in a directory and then
> link them into a .o with "ld -r" (cmd_ld_multi). This early link step
> appears to trigger GCC to remove the .S file definition of the symbol
> and break the later link step (the perf-in.o shows perf_regs_load, for
> example, going from the text section to being undefined at the link
> step which doesn't happen with clang or without LTO). It is possible
> to work around this by taking the final perf link command and adding
> the .o files generated from .S back into it, namely:
> arch/x86/tests/regs_load.o
> bench/mem-memset-x86-64-asm.o
> bench/mem-memcpy-x86-64-asm.o
>
> A quick performance check and the performance improvements from LTO
> are noticeable:
>
> Non-LTO
> ```
> $ perf bench internals synthesize
> # Running 'internals/synthesize' benchmark:
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
> Average synthesis took: 202.216 usec (+- 0.160 usec)
> Average num. events: 51.000 (+- 0.000)
> Average time per event 3.965 usec
> Average data synthesis took: 230.875 usec (+- 0.285 usec)
> Average num. events: 271.000 (+- 0.000)
> Average time per event 0.852 usec
> ```
>
> LTO
> ```
> $ perf bench internals synthesize
> # Running 'internals/synthesize' benchmark:
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
> Average synthesis took: 104.530 usec (+- 0.074 usec)
> Average num. events: 51.000 (+- 0.000)
> Average time per event 2.050 usec
> Average data synthesis took: 112.660 usec (+- 0.114 usec)
> Average num. events: 273.000 (+- 0.000)
> Average time per event 0.413 usec

Cool stuff! Applied locally, test building now on the container suite.

- Arnaldo

> ```
>
> Ian Rogers (4):
>   perf stat: Avoid uninitialized use of perf_stat_config
>   perf parse-events: Avoid use uninitialized warning
>   perf test: Avoid weak symbol for arch_tests
>   perf build: Add LTO build option
>
>  tools/perf/Makefile.config      |  5 +++++
>  tools/perf/tests/builtin-test.c | 11 ++++++++++-
>  tools/perf/tests/stat.c         |  2 +-
>  tools/perf/util/parse-events.c  |  2 +-
>  tools/perf/util/stat.c          |  2 +-
>  5 files changed, 18 insertions(+), 4 deletions(-)
>
> --
> 2.41.0.487.g6d72f3e995-goog
>
--
- Arnaldo
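
For reference, the size of the improvement in the quoted benchmark output can be worked out from the two "Average ... took" figures. The following is only a small sketch in Python: the dictionary names and the speedup() helper are made up for illustration and are not part of perf. The LTO run synthesized 273 data events against 271 for the non-LTO run, so the data synthesis ratio is approximate.

```python
# Averages copied from the two "perf bench internals synthesize" runs
# quoted in the cover letter, in microseconds.
non_lto = {"synthesis": 202.216, "data synthesis": 230.875}
lto = {"synthesis": 104.530, "data synthesis": 112.660}


def speedup(before_usec, after_usec):
    """Return how many times faster the 'after' measurement is."""
    return before_usec / after_usec


for name in non_lto:
    ratio = speedup(non_lto[name], lto[name])
    saved = 100.0 * (1.0 - lto[name] / non_lto[name])
    print(f"{name}: {ratio:.2f}x faster ({saved:.1f}% less time)")
```

Both averages come out at roughly half the non-LTO time, which matches the per-event figures in the same output (3.965 vs 2.050 usec and 0.852 vs 0.413 usec).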