From: Nathan Chancellor <nathan@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, hpa@zystor.com,
	samitolvanen@google.com, kees@kernel.org,
	scott.d.constable@intel.com
Subject: Re: [PATCH] x86/kcfi: Optimize call sequence
Date: Wed, 17 Jun 2026 15:37:28 -0700
Message-ID: <20260617223728.GA3913972@ax162>
In-Reply-To: <20260612071506.GQ187714@noisy.programming.kicks-ass.net>

On Fri, Jun 12, 2026 at 09:15:06AM +0200, Peter Zijlstra wrote:
> 
> As noted in commit 85a2d4a890dc ("x86,ibt: Use UDB instead of 0xEA") Jcc should
> be assumed not-taken, however the normal kCFI (ABI) emits the following sequence:
> 
>    movl	$(-hash), %r10d
>    addl	-15(%r11), %r10d
>    je 1f
>    ud2
> 1: cs call __x86_indirect_thunk_r11
> 
> (when used in conjunction with -mretpoline-external-thunk).
> 
> Notably, the Jcc here is always taken, resulting in lower throughput than would
> be ideal. Replace it with the following sequence on boot:
> 
>    movl	$(-hash), %r10d
>    addl	-15(%r11), %r10d
>    jne . + 3
>    test $0xd6, %al
>    cs call __x86_indirect_thunk_r11
> 
> This jumps to the UDB instruction used as an immediate byte in the test
> instruction. The test instruction will clobber eflags, but that is immaterial
> since eflags is already changed by the preceding addl.
> 
> Intel recommends the FineIBT sequence on platforms that support IBT; older
> platforms are still widely used and would benefit from this.
> 
> An earlier PoC was benchmarked by Scott:
> 
> Indirect branch miss rate (br_misp_retired.indirect:k / br_inst_retired.indirect:k)
> 
> BHI_DIS_S=1
> 
>   Benchmark            Baseline             IBT            kCFI        kCFI-opt
>   -----------------------------------------------------------------------------
>   iperf3 UDP           0.103764        0.103180        0.104311        0.102945
>   hackbench            0.000885        0.000876        0.001996        0.000826
>   lmbench syscall      0.005089        0.004486        0.016990        0.005852
>   lmbench fork+exit    0.018454        0.019176        0.031085        0.015153
>   lmbench fork+exec    0.017147        0.021613        0.029129        0.016337
>   redis                0.032220        0.032655        0.045540        0.027946
>   nginx+wrk            0.109033        0.112765        0.132557        0.102417
>   fio randread         0.009704        0.009620        0.008548        0.000962
>   fio seqwrite         0.006927        0.006707        0.019372        0.004590
>   kbuild               0.056748        0.057324        0.064640        0.048136
> 
> BHI_DIS_S=0
> 
>   Benchmark            Baseline             IBT            kCFI        kCFI-opt
>   -----------------------------------------------------------------------------
>   iperf3 UDP           0.000077        0.000106        0.000186        0.000073
>   hackbench            0.000123        0.000132        0.000367        0.000097
>   lmbench syscall      0.023259        0.018319        0.040903        0.012772
>   lmbench fork+exit    0.011494        0.011887        0.029079        0.016415
>   lmbench fork+exec    0.037782        0.038994        0.055378        0.026381
>   redis                0.002481        0.003152        0.017073        0.000184
>   nginx+wrk            0.015478        0.016266        0.033637        0.000268
>   fio randread         0.009836        0.007949        0.007096        0.000143
>   fio seqwrite         0.014587        0.014165        0.041792        0.002157
>   kbuild               0.055774        0.055249        0.062590        0.046546
> 
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Kees Cook <kees@kernel.org>
> Cc: Nathan Chancellor <nathan@kernel.org>
> Cc: hpa@zystor.com
> Suggested-by: Scott D Constable <scott.d.constable@intel.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

I booted this on two machines without IBT (so using kCFI by default)
without any issues. lkdtm's CFI_FORWARD_PROTO test case still passes for
me and I can see the d6 immediate in the stacktrace.
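
For anyone who wants to check the offsets by hand, here is a rough userspace
sketch of the byte layout, not kernel code. The names kcfi_site_old,
kcfi_patch_site() and jcc_target() are made up for illustration, the hash
immediate is zeroed, and memcpy() stands in for text_poke_early(). Only the
udne[] bytes and the "+ 10" offset are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	KCFI_JCC_OFF  = 10, /* 6-byte movl + 4-byte addl, i.e. "addr + 10" */
	KCFI_SITE_LEN = 14, /* bytes up to, but not including, the call */
};

/* Illustrative copy of an unpatched call site (hash zeroed, assumption). */
static const uint8_t kcfi_site_old[KCFI_SITE_LEN] = {
	0x41, 0xba, 0x00, 0x00, 0x00, 0x00, /* movl $(-hash), %r10d  */
	0x45, 0x03, 0x53, 0xf1,             /* addl -15(%r11), %r10d */
	0x74, 0x02,                         /* je 1f                 */
	0x0f, 0x0b,                         /* ud2                   */
};

/* Same bytes as udne[] in the patch: jne +1; test $0xd6, %al */
static const uint8_t udne[4] = { 0x75, 0x01, 0xa8, 0xd6 };

/* Userspace stand-in for text_poke_early(addr + 10, udne, 4). */
static void kcfi_patch_site(uint8_t *site)
{
	assert(site[KCFI_JCC_OFF] == 0x74); /* expect the original je */
	memcpy(site + KCFI_JCC_OFF, udne, sizeof(udne));
}

/* Offset a short Jcc at jcc_off lands on: end of the Jcc plus its rel8. */
static size_t jcc_target(const uint8_t *site, size_t jcc_off)
{
	ptrdiff_t rel8 = (int8_t)site[jcc_off + 1];

	return (size_t)((ptrdiff_t)jcc_off + 2 + rel8);
}
```

With these bytes the patched jne lands on offset 13, which holds the d6
immediate, one byte past where ud2 used to start (offset 12). That lines up
with the "addr -= 1" adjustment in handle_cfi_failure().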

Tested-by: Nathan Chancellor <nathan@kernel.org>

> ---
>  arch/x86/kernel/alternative.c |   11 ++++++++++-
>  arch/x86/kernel/cfi.c         |    6 ++++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
> 
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1356,6 +1356,10 @@ early_param("cfi", cfi_parse_cmdline);
>   *  "Make conditional jumps most often not taken: The efficiency and throughput
>   *   for not-taken branches is better than for taken branches on most
>   *   processors. Therefore, it is good to place the most frequent branch first"
> + *
> + * NOTE: Update the kCFI caller sequence to make use of this observation.
> + * Replace the "je 1f; ud2" sequence with "jne +1; test $0xd6, %al". This
> + * clobbers flags, but those are clobbered by the hash test anyway.
>   */
>  
>  /*
> @@ -1518,9 +1522,10 @@ static int cfi_disable_callers(s32 *star
>  static int cfi_enable_callers(s32 *start, s32 *end)
>  {
>  	/*
> -	 * Re-enable kCFI, undo what cfi_disable_callers() did.
> +	 * Re-enable (and update) kCFI, undo what cfi_disable_callers() did.
>  	 */
>  	const u8 mov[] = { 0x41, 0xba };
> +	const u8 udne[] = { 0x75, 0x01, 0xa8, 0xd6 };
>  	s32 *s;
>  
>  	for (s = start; s < end; s++) {
> @@ -1532,6 +1537,10 @@ static int cfi_enable_callers(s32 *start
>  		if (!hash) /* nocfi callers */
>  			continue;
>  
> +		/*
> +		 * See the kCFI/FineIBT comment above -- update note.
> +		 */
> +		text_poke_early(addr + 10, udne, 4);
>  		text_poke_early(addr, mov, 2);
>  	}
>  
> --- a/arch/x86/kernel/cfi.c
> +++ b/arch/x86/kernel/cfi.c
> @@ -72,6 +72,12 @@ enum bug_trap_type handle_cfi_failure(st
>  
>  	switch (cfi_mode) {
>  	case CFI_KCFI:
> +		/*
> +		 * The updated kCFI sequence has "test $0xd6, %al" instead of
> +		 * "ud2", adjust the offset.
> +		 */
> +		addr -= 1;
> +
>  		if (!is_cfi_trap(addr))
>  			return BUG_TRAP_TYPE_NONE;
>  

-- 
Cheers,
Nathan
