Re: [GIT pull] perf/urgent for 5.7-rc2

From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	the arch/x86 maintainers <x86@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [GIT pull] perf/urgent for 5.7-rc2
Date: Mon, 20 Apr 2020 09:48:45 +0200
Message-ID: <20200420074845.GA72554@gmail.com>
In-Reply-To: <20200419200758.3xry3vn2a5caxapx@treble>

* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Sun, Apr 19, 2020 at 11:56:51AM -0700, Linus Torvalds wrote:
> 
> > So I'm wondering if there any way that objtool could be run at 
> > link-time (and archive time) rather than force a re-build of all the 
> > object files from source?
> 
> We've actually been making progress in that direction.  Peter added 
> partial vmlinux.o support, for Thomas' noinstr validation.  The problem 
> is, linking is single-threaded so it ends up making the kernel build 
> slower overall.
> 
> So right now, we still do most things per compilation unit, and only do 
> the noinstr validation at vmlinux.o link time.  Eventually, especially 
> with LTO, we'll probably end up moving everything over to link time.

Fortunately, much of what objtool does against vmlinux.o can be 
parallelized in a rather straightforward fashion I believe, if we build 
with -ffunction-sections.
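
For reference, a tiny illustration of what -ffunction-sections does with 
GCC: each function is emitted into its own .text.<name> section instead of 
one shared .text, which is what would give the decoder many small, 
independent units of work. The file and function names below are made up 
for the example:

```c
/*
 * demo.c (hypothetical file). Build and inspect with:
 *   gcc -c -ffunction-sections demo.c
 *   readelf -S demo.o
 */
int add_one(int x) { return x + 1; }	/* emitted into .text.add_one */
int add_two(int x) { return x + 2; }	/* emitted into .text.add_two */
```

Without the flag, both functions would share a single .text section.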

Here are the main "objtool check" processing steps:

int check(const char *_objname, bool orc)
{
...
        ret = decode_sections(&file);
...

        ret = validate_functions(&file);
...
        ret = validate_unwind_hints(&file);
...
                ret = validate_reachable_instructions(&file);
...
                ret = create_orc(&file);
...
                ret = create_orc_sections(&file);
}

The 'decode_sections()' step takes about 92% of the runtime against 
vmlinux.o:

 $ taskset 1 perf stat --repeat 3 --sync --null tools/objtool/objtool check vmlinux.o

 Performance counter stats for 'tools/objtool/objtool check vmlinux.o' (3 runs):

           3.05757 +- 0.00247 seconds time elapsed  ( +-  0.08% )

 $ taskset 1 perf stat --repeat 3 --exit-after-decode --null tools/objtool/objtool check vmlinux.o            

 Performance counter stats for 'tools/objtool/objtool check vmlinux.o' (3 runs):

           2.83132 +- 0.00272 seconds time elapsed  ( +-  0.10% )

(The --exit-after-decode hack makes it exit right after 
decode_sections().)

Within decode_sections(), the main overhead is in decode_instructions() 
(~75% of the total objtool overhead):

           2.31325 +- 0.00609 seconds time elapsed  ( +-  0.26% )

This goes through every executable section, to decode the instructions:

static int decode_instructions(struct objtool_file *file)
{
...
        for_each_sec(file, sec) {

                if (!(sec->sh.sh_flags & SHF_EXECINSTR))
                        continue;

The distribution of function section sizes is strongly biased towards 
small sections: over 95% of all instructions in vmlinux.o are in a 
section of 100 bytes or less.

In fact over 99% of all decoded instructions are in a section of 500 
bytes or smaller, so a threaded decoder where each thread batch-decodes a 
handful of sections in a single processing step and then batch-inserts 
the results into the (global) instructions hash should do the trick.

The batching size could be driven by section byte size, i.e. we could say 
that the unit of batching is for a decoding thread to grab ~10k bytes 
worth of sections from the list, build a local list of decoded 
instructions, and then insert them into the global hash in a single go. 
This would scale very well IMO, with the defconfig already having almost 
3 million instructions, and a distro or allmodconfig build having a lot 
more.
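
A minimal sketch of that batching scheme, using pthreads. Everything here 
is a stand-in and not objtool's real code: 'struct section', 
decode_section() (which just pretends each instruction is 4 bytes) and the 
'total_insns' counter that plays the role of the global instructions hash 
are all hypothetical. It only shows the locking pattern: claim ~10k bytes 
of sections under the lock, decode without the lock, publish the batch 
under the lock.

```c
/* Sketch only: every name below is hypothetical, not objtool's API. */
#include <pthread.h>
#include <stddef.h>

#define BATCH_BYTES	10000	/* ~10k bytes of sections per grab */
#define MAX_THREADS	64

struct section { size_t size; };	/* stand-in for an executable section */

struct work {
	const struct section *secs;	/* all executable sections */
	size_t nr_secs;
	size_t next;			/* next unclaimed section index */
	size_t total_insns;		/* stand-in for the global insn hash */
	pthread_mutex_t lock;
};

/* Fake decoder: pretend every instruction is 4 bytes long. */
static size_t decode_section(const struct section *sec)
{
	return sec->size / 4;
}

static void *decode_worker(void *arg)
{
	struct work *w = arg;

	for (;;) {
		size_t first, last, bytes = 0, local = 0;

		/* Claim a run of sections worth roughly BATCH_BYTES. */
		pthread_mutex_lock(&w->lock);
		first = last = w->next;
		while (last < w->nr_secs && bytes < BATCH_BYTES)
			bytes += w->secs[last++].size;
		w->next = last;
		pthread_mutex_unlock(&w->lock);

		if (first == last)
			return NULL;	/* nothing left to claim */

		/* Decode into thread-local state, no lock held. */
		for (size_t i = first; i < last; i++)
			local += decode_section(&w->secs[i]);

		/* Publish the whole batch in a single locked step. */
		pthread_mutex_lock(&w->lock);
		w->total_insns += local;
		pthread_mutex_unlock(&w->lock);
	}
}

/* Decode all sections with nr_threads workers; returns the insn count. */
static size_t decode_parallel(const struct section *secs, size_t nr_secs,
			      unsigned int nr_threads)
{
	struct work w = { .secs = secs, .nr_secs = nr_secs };
	pthread_t tids[MAX_THREADS];
	unsigned int started = 0;

	if (nr_threads > MAX_THREADS)
		nr_threads = MAX_THREADS;

	pthread_mutex_init(&w.lock, NULL);
	for (unsigned int i = 0; i < nr_threads; i++)
		if (pthread_create(&tids[started], NULL, decode_worker, &w) == 0)
			started++;
	if (!started)
		decode_worker(&w);	/* fall back to decoding inline */
	for (unsigned int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&w.lock);

	return w.total_insns;
}
```

Calling decode_parallel(secs, nr_secs, 4) should give the same count as a 
single-threaded run. With mostly sub-100-byte sections, each lock 
acquisition would cover on the order of a hundred sections, so lock 
traffic should stay small relative to the decoding work.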

I believe the 3.0 seconds total objtool runtime above could be reduced to 
below 1.0 second on typical contemporary development systems - which 
would IMHO make it a feasible model to run objtool only against the whole 
kernel binary.
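
A rough Amdahl-style sanity check of that estimate, using the two 
measurements above. It assumes that the whole 2.31 seconds attributed to 
decode_instructions() is parallelizable and scales perfectly, and that 
everything else stays serial; both are simplifications, and the helper 
names are made up for the example:

```c
/* Amdahl-style estimate; assumes only decode_instructions() is threaded. */
#define TOTAL_SECS	3.05757	/* full "objtool check vmlinux.o" run above */
#define DECODE_SECS	2.31325	/* decode_instructions() portion above */

/* Estimated wall-clock time with nr_threads decoding threads. */
static double est_runtime(unsigned int nr_threads)
{
	double serial = TOTAL_SECS - DECODE_SECS;	/* ~0.74 seconds */

	return serial + DECODE_SECS / nr_threads;
}

/* Smallest thread count whose estimate is below target, or 0 if none. */
static unsigned int threads_needed(double target, unsigned int max_threads)
{
	for (unsigned int n = 1; n <= max_threads; n++)
		if (est_runtime(n) < target)
			return n;
	return 0;
}
```

Under these assumptions the serial remainder is about 0.74 seconds, so 
threads_needed(1.0, 64) comes out at 10: getting below 1.0 second would 
take roughly ten decoding threads, or fewer if more of the remaining 
steps were threaded as well.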

Is there any code generation disadvantage or other quirk to 
-ffunction-sections, or other complications that I missed, that would 
make this difficult?

Thanks,

	Ingo
