bpf.vger.kernel.org archive mirror
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Tom Rix <trix@redhat.com>, Kan Liang <kan.liang@linux.intel.com>,
	Yang Jihong <yangjihong1@huawei.com>,
	Ravi Bangoria <ravi.bangoria@amd.com>,
	Carsten Haitzler <carsten.haitzler@arm.com>,
	Zhengjun Xing <zhengjun.xing@linux.intel.com>,
	James Clark <james.clark@arm.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	bpf@vger.kernel.org, llvm@lists.linux.dev, maskray@google.com
Subject: Re: [PATCH v1 0/4] Perf tool LTO support
Date: Mon, 24 Jul 2023 18:29:31 -0300
Message-ID: <ZL7tO4pwpfX8n0gZ@kernel.org>
In-Reply-To: <20230724201247.748146-1-irogers@google.com>

On Mon, Jul 24, 2023 at 01:12:43PM -0700, Ian Rogers wrote:
> Add a build flag, LTO=1, so that perf is built with the -flto
> flag. Address some build errors this configuration throws up.
> 
> For me on my Debian derived OS, "CC=clang CXX=clang++ LD=ld.lld" works
> fine. With GCC LTO this fails with:
> ```
> lto-wrapper: warning: using serial compilation of 50 LTRANS jobs
> lto-wrapper: note: see the ‘-flto’ option documentation for more information
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel.ro+0x28): undefined reference to `memset_orig'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel.ro+0x40): undefined reference to `__memset'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel+0x28): undefined reference to `memcpy_orig'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans10.ltrans.o:(.data.rel+0x40): undefined reference to `__memcpy'
> /usr/bin/ld: /tmp/ccK8kXAu.ltrans44.ltrans.o: in function `test__arch_unwind_sample':
> /home/irogers/kernel.org/tools/perf/arch/x86/tests/dwarf-unwind.c:72: undefined reference to `perf_regs_load'
> collect2: error: ld returned 1 exit status
> ```
> 
> The issue is that we build multiple .o files in a directory and then
> link them into a .o with "ld -r" (cmd_ld_multi). This early link step
> appears to trigger GCC to remove the .S file definition of the symbol
> and break the later link step (the perf-in.o shows perf_regs_load, for
> example, going from the text section to being undefined at the link
> step which doesn't happen with clang or without LTO). It is possible
> to work around this by taking the final perf link command and adding
> the .o files generated from .S back into it, namely:
> arch/x86/tests/regs_load.o
> bench/mem-memset-x86-64-asm.o
> bench/mem-memcpy-x86-64-asm.o
> 
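
The text-to-undefined transition described above can be checked by comparing `nm` output for an object before and after the "ld -r" step. A minimal sketch, assuming the default BSD-style `nm` text format; the helper name `symbol_state` and the two sample lines are made up for illustration and are not taken from an actual build:

```python
def symbol_state(nm_output: str, symbol: str) -> str:
    """Classify `symbol` from default (BSD-format) `nm` output.

    Returns "defined", "undefined", or "missing". Only the plain 'U'
    type letter is treated as undefined; weak symbols are not handled.
    """
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields (address, type, name);
        # undefined symbols have two (type, name).
        if len(fields) < 2 or fields[-1] != symbol:
            continue
        return "undefined" if fields[-2] == "U" else "defined"
    return "missing"


# Hypothetical sample lines in nm's format, not real build output.
before_partial_link = "0000000000000000 T perf_regs_load\n"
after_partial_link = "                 U perf_regs_load\n"

print(symbol_state(before_partial_link, "perf_regs_load"))
print(symbol_state(after_partial_link, "perf_regs_load"))
```

On a real tree the input would come from running `nm` on the per-directory objects and on perf-in.o, and the same check could be repeated for memset_orig, __memset, memcpy_orig and __memcpy.
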
> A quick performance check and the performance improvements from LTO
> are noticeable:
> 
> Non-LTO
> ```
> $ perf bench internals synthesize
>  # Running 'internals/synthesize' benchmark:
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
>   Average synthesis took: 202.216 usec (+- 0.160 usec)
>   Average num. events: 51.000 (+- 0.000)
>   Average time per event 3.965 usec
>   Average data synthesis took: 230.875 usec (+- 0.285 usec)
>   Average num. events: 271.000 (+- 0.000)
>   Average time per event 0.852 usec
> ```
> 
> LTO
> ```
> $ perf bench internals synthesize
>  # Running 'internals/synthesize' benchmark:
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
>   Average synthesis took: 104.530 usec (+- 0.074 usec)
>   Average num. events: 51.000 (+- 0.000)
>   Average time per event 2.050 usec
>   Average data synthesis took: 112.660 usec (+- 0.114 usec)
>   Average num. events: 273.000 (+- 0.000)
>   Average time per event 0.413 usec


Cool stuff! Applied locally, test building now on the container suite.

- Arnaldo

> ```
> 
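
To put a number on "noticeable", the ratio between the two quoted runs can be computed directly. A small sketch using only the averages quoted above; the dictionary and function names are illustrative:

```python
# Averages (usec) copied from the two `perf bench internals synthesize`
# runs quoted above.
NON_LTO_USEC = {"synthesis": 202.216, "data synthesis": 230.875}
LTO_USEC = {"synthesis": 104.530, "data synthesis": 112.660}


def speedup(before_usec: float, after_usec: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_usec / after_usec


for name, before in NON_LTO_USEC.items():
    ratio = speedup(before, LTO_USEC[name])
    print(f"{name}: {ratio:.2f}x faster with LTO")
```

Both averages come out at roughly a 2x improvement. Note that the data synthesis runs report slightly different event counts (271 vs 273), so the per-event times are the fairer comparison for that case.
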
> Ian Rogers (4):
>   perf stat: Avoid uninitialized use of perf_stat_config
>   perf parse-events: Avoid use uninitialized warning
>   perf test: Avoid weak symbol for arch_tests
>   perf build: Add LTO build option
> 
>  tools/perf/Makefile.config      |  5 +++++
>  tools/perf/tests/builtin-test.c | 11 ++++++++++-
>  tools/perf/tests/stat.c         |  2 +-
>  tools/perf/util/parse-events.c  |  2 +-
>  tools/perf/util/stat.c          |  2 +-
>  5 files changed, 18 insertions(+), 4 deletions(-)
> 
> -- 
> 2.41.0.487.g6d72f3e995-goog
> 


