public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Ingo Molnar <mingo@kernel.org>
Cc: X86 ML <x86@kernel.org>, Andy Lutomirski <luto@amacapital.net>,
	LKML <linux-kernel@vger.kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 05/15] x86/alternatives: Use optimized NOPs for padding
Date: Wed, 4 Mar 2015 09:42:40 +0100	[thread overview]
Message-ID: <20150304084240.GA3233@pd.tnic> (raw)
In-Reply-To: <20150304064303.GA16387@gmail.com>

On Wed, Mar 04, 2015 at 07:43:03AM +0100, Ingo Molnar wrote:
> But the main question is, do such alignment details ever matter to
> decoder performance?

Right, so I have some ideas about it but this all is probably very
uarch-specific so below is probably only speculation.

So both optimization manuals suggest using longer NOPs is better
probably because the less micro-ops you decode from the NOPs, the less
instructions you have in the pipe, less resources, etc etc. Thus, making
them longer with prefixes is better than having multiple unprefixed and
shorter nops.

Now, there's the dependency on operands and NOPs generally reference
rAX. On some uarches this dependency is broken much earlier so the
micro-op gets scheduled earlier. Thus freeing resources earlier, etc,
etc.

If the NOP is crossing cacheline, you obviously can't know it is a NOP
yet so you have to fetch the next cacheline to finish decoding it. So in
the best case you'll go down to L1 and in the worst case go to memory.

And on modern machines this is probably so very unnoticeable. I hardly
doubt we'll see that in a hotpath even. But it could be a good measuring
exercise :)

Btw, when we pad JMPs with NOPs, we shouldn't be affected because the
unconditional JMP will not have us decode the NOPs behind it. And
modern, hungry prefetching beasts would've probably fetched and decoded
a bunch of instructions along with the NOPs. But they should see the JMP
and stop prefetching after it though. Who knows, uarch stuff.

See, all speculations :)

Btw #2 and more importantly: this patchset of mine doesn't necessarily
enlarge the patch sites - before it, you'll have to add proper-sized
NOPs explicitly and now we let the toolchain do it for us, which
sometimes even leads to smaller NOPs depending on the instruction the
toolchain generates.

With the JMP optimizations, the instructions become smaller too.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

  reply	other threads:[~2015-03-04  8:43 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-24 11:14 [PATCH v2 00/15] x86, alternatives: Instruction padding and more robust JMPs Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 01/15] x86/lib/copy_user_64.S: Remove FIX_ALIGNMENT define Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 02/15] x86/alternatives: Cleanup DPRINTK macro Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 03/15] x86/alternatives: Add instruction padding Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 04/15] x86/alternatives: Make JMPs more robust Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 05/15] x86/alternatives: Use optimized NOPs for padding Borislav Petkov
2015-03-04  6:43   ` Ingo Molnar
2015-03-04  8:42     ` Borislav Petkov [this message]
2015-02-24 11:14 ` [PATCH v2 06/15] x86/lib/copy_page_64.S: Use generic ALTERNATIVE macro Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 07/15] x86/lib/copy_user_64.S: Convert to ALTERNATIVE_2 Borislav Petkov
2015-03-04  6:25   ` Ingo Molnar
2015-03-04  7:13     ` Ingo Molnar
2015-03-04  9:06       ` Borislav Petkov
2015-03-05  0:34         ` Ingo Molnar
2015-03-05  8:23           ` Borislav Petkov
2015-03-04  9:00     ` Borislav Petkov
2015-03-05  0:32       ` Ingo Molnar
2015-03-05  8:35         ` Borislav Petkov
2015-03-05  9:34           ` Ingo Molnar
2015-03-05  9:46             ` Ingo Molnar
2015-02-24 11:14 ` [PATCH v2 08/15] x86/smap: Use ALTERNATIVE macro Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 09/15] x86/entry_32: Convert X86_INVD_BUG to " Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 10/15] x86/lib/clear_page_64.S: Convert to ALTERNATIVE_2 macro Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 11/15] x86/asm: Use alternative_2() in rdtsc_barrier() Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 12/15] x86/asm: Cleanup prefetch primitives Borislav Petkov
2015-03-04  6:48   ` Ingo Molnar
2015-03-04  9:08     ` Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 13/15] x86/lib/memset_64.S: Convert to ALTERNATIVE_2 macro Borislav Petkov
2015-02-24 11:14 ` [PATCH v2 14/15] x86/lib/memmove_64.S: Convert memmove() to ALTERNATIVE macro Borislav Petkov
2015-03-04  7:19   ` Ingo Molnar
2015-02-24 11:14 ` [PATCH v2 15/15] x86/lib/memcpy_64.S: Convert memcpy to ALTERNATIVE_2 macro Borislav Petkov
2015-03-04  7:26   ` Ingo Molnar
2015-03-04 13:58     ` Borislav Petkov
2015-03-05  0:26       ` Ingo Molnar
2015-03-05  8:37         ` Borislav Petkov
2015-02-24 20:25 ` [PATCH v2 00/15] x86, alternatives: Instruction padding and more robust JMPs Andy Lutomirski
2015-02-26 18:13 ` Borislav Petkov
2015-02-26 18:16   ` [PATCH 1/3] perf/bench: Fix mem* routines usage after alternatives change Borislav Petkov
2015-02-26 18:16     ` [PATCH 2/3] perf/bench: Carve out mem routine benchmarking Borislav Petkov
2015-02-26 18:16     ` [PATCH 3/3] perf/bench: Add -r all so that you can run all mem* routines Borislav Petkov
2015-03-04  7:30       ` Ingo Molnar
2015-03-02 14:51   ` [PATCH v2 00/15] x86, alternatives: Instruction padding and more robust JMPs Hitoshi Mitake
2015-03-02 16:27     ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150304084240.GA3233@pd.tnic \
    --to=bp@alien8.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=mingo@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox