From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Masahiro Yamada <masahiroy@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
the arch/x86 maintainers <x86@kernel.org>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: [GIT pull] perf/urgent for 5.7-rc2
Date: Mon, 20 Apr 2020 09:48:45 +0200
Message-ID: <20200420074845.GA72554@gmail.com>
In-Reply-To: <20200419200758.3xry3vn2a5caxapx@treble>
* Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> On Sun, Apr 19, 2020 at 11:56:51AM -0700, Linus Torvalds wrote:
>
> > So I'm wondering if there any way that objtool could be run at
> > link-time (and archive time) rather than force a re-build of all the
> > object files from source?
>
> We've actually been making progress in that direction. Peter added
> partial vmlinux.o support, for Thomas' noinstr validation. The problem
> is, linking is single-threaded so it ends up making the kernel build
> slower overall.
>
> So right now, we still do most things per compilation unit, and only do
> the noinstr validation at vmlinux.o link time. Eventually, especially
> with LTO, we'll probably end up moving everything over to link time.
Fortunately, much of what objtool does against vmlinux.o can be
parallelized in a rather straightforward fashion I believe, if we build
with -ffunction-sections.
Here are the main "objtool check" processing steps:

	int check(const char *_objname, bool orc)
	{
		...
		ret = decode_sections(&file);
		...
		ret = validate_functions(&file);
		...
		ret = validate_unwind_hints(&file);
		...
		ret = validate_reachable_instructions(&file);
		...
		ret = create_orc(&file);
		...
		ret = create_orc_sections(&file);
	}

The 'decode_sections()' step takes about 92% of the runtime against
vmlinux.o:

  $ taskset 1 perf stat --repeat 3 --sync --null tools/objtool/objtool check vmlinux.o

   Performance counter stats for 'tools/objtool/objtool check vmlinux.o' (3 runs):

          3.05757 +- 0.00247 seconds time elapsed  ( +- 0.08% )

  $ taskset 1 perf stat --repeat 3 --exit-after-decode --null tools/objtool/objtool check vmlinux.o

   Performance counter stats for 'tools/objtool/objtool check vmlinux.o' (3 runs):

          2.83132 +- 0.00272 seconds time elapsed  ( +- 0.10% )

(The --exit-after-decode hack makes it exit right after
decode_sections().)
Within decode_sections(), the main overhead is in decode_instructions()
(~75% of the total objtool overhead):
2.31325 +- 0.00609 seconds time elapsed ( +- 0.26% )
This goes through every executable section, to decode the instructions:

	static int decode_instructions(struct objtool_file *file)
	{
		...
		for_each_sec(file, sec) {
			if (!(sec->sh.sh_flags & SHF_EXECINSTR))
				continue;

The distribution of function section sizes is strongly biased towards
sections of 100 bytes or less: over 95% of all instructions in vmlinux.o
are in such a section.
In fact, over 99% of all decoded instructions are in a section of 500
bytes or smaller, so a threaded decoder where each thread batch-decodes a
handful of sections in a single processing step and then batch-inserts
the results into the (global) instructions hash should do the trick.
The batching size could be driven by section byte size, i.e. we could say
that the unit of batching is for a decoding thread to grab ~10k bytes
worth of sections from the list, build a local list of decoded
instructions, and then insert them into the global hash in a single go.
This would scale very well IMO, with the defconfig already having almost
3 million instructions, and a distro build or allmodconfig build a lot
more.
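
To make the batching scheme described above more concrete, here is a
minimal pthreads sketch. It is an illustration only, not objtool code:
struct section, struct decode_ctx, fake_decode(), decode_worker() and
decode_parallel() are hypothetical names, the "global hash" is reduced to
a counter, and the decoder simply pretends every instruction is 4 bytes
long. Only the ~10k bytes batch unit is taken from the proposal.

```c
#include <pthread.h>
#include <stddef.h>

#define BATCH_BYTES	10000	/* ~10k bytes of sections per grab */
#define MAX_THREADS	64	/* arbitrary cap for this sketch */

/* Hypothetical stand-in for an executable ELF section. */
struct section {
	size_t size;
};

struct decode_ctx {
	const struct section *secs;	/* executable sections only */
	size_t nr_secs;
	size_t next;			/* next unclaimed section index */
	size_t global_insns;		/* stand-in for the global insn hash */
	pthread_mutex_t lock;		/* protects next and global_insns */
};

/* Placeholder decoder: assumes fixed 4-byte instructions. */
static size_t fake_decode(const struct section *sec)
{
	return sec->size / 4;
}

static void *decode_worker(void *arg)
{
	struct decode_ctx *ctx = arg;

	for (;;) {
		size_t first, last, i, bytes = 0, local_insns = 0;

		/* Claim the next ~BATCH_BYTES worth of sections. */
		pthread_mutex_lock(&ctx->lock);
		first = last = ctx->next;
		while (last < ctx->nr_secs && bytes < BATCH_BYTES)
			bytes += ctx->secs[last++].size;
		ctx->next = last;
		pthread_mutex_unlock(&ctx->lock);

		if (first == last)
			return NULL;	/* no sections left */

		/* Decode into thread-local state, no locking needed. */
		for (i = first; i < last; i++)
			local_insns += fake_decode(&ctx->secs[i]);

		/* Publish the whole batch to the shared state in one go. */
		pthread_mutex_lock(&ctx->lock);
		ctx->global_insns += local_insns;
		pthread_mutex_unlock(&ctx->lock);
	}
}

/* Decode all sections with up to nr_threads workers; returns insn count. */
size_t decode_parallel(const struct section *secs, size_t nr_secs,
		       int nr_threads)
{
	struct decode_ctx ctx = { .secs = secs, .nr_secs = nr_secs };
	pthread_t tids[MAX_THREADS];
	int i, started = 0;

	if (nr_threads < 1)
		nr_threads = 1;
	if (nr_threads > MAX_THREADS)
		nr_threads = MAX_THREADS;

	pthread_mutex_init(&ctx.lock, NULL);
	for (i = 0; i < nr_threads; i++) {
		if (pthread_create(&tids[started], NULL, decode_worker, &ctx))
			break;
		started++;
	}
	if (!started)
		decode_worker(&ctx);	/* fall back to the calling thread */
	for (i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&ctx.lock);

	return ctx.global_insns;
}
```

The lock is taken twice per batch rather than once per instruction, which
is the point of batching: with mostly sub-500-byte sections, a ~10k bytes
batch covers many sections per lock round-trip. A real implementation
would insert the locally built instruction list into the hash where this
sketch bumps a counter.
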
I believe the 3.0 seconds total objtool runtime above could be reduced to
below 1.0 second on typical contemporary development systems - which
would IMHO make it a feasible model to run objtool only against the whole
kernel binary.
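
As a rough sanity check of that estimate, here is an Amdahl's-law style
calculation using the timings measured above. It assumes the
parallelized part scales perfectly with the thread count, which is
optimistic, and estimated_runtime() is a hypothetical helper, not
anything in objtool.

```c
/*
 * Amdahl-style estimate: of total_s seconds, only parallel_s seconds
 * are spread across nr_threads; the rest stays serial.
 * Assumes perfect scaling of the parallel part.
 */
double estimated_runtime(double total_s, double parallel_s, int nr_threads)
{
	double serial_s = total_s - parallel_s;

	return serial_s + parallel_s / nr_threads;
}
```

With only decode_instructions() parallelized (2.31325 of 3.05757
seconds), the serial remainder is about 0.74 seconds, so
estimated_runtime(3.05757, 2.31325, 16) comes out at roughly 0.89
seconds under these assumptions. Getting comfortably below 1.0 second on
fewer threads would mean parallelizing more of decode_sections() as
well.
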
Is there any code generation disadvantage or other quirk to
-ffunction-sections, or other complications that I missed, that would
make this difficult?
Thanks,
Ingo