Date: Wed, 17 Jun 2026 15:37:28 -0700
From: Nathan Chancellor
To: Peter Zijlstra
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, hpa@zystor.com,
 samitolvanen@google.com, kees@kernel.org, scott.d.constable@intel.com
Subject: Re: [PATCH] x86/kcfi: Optimize call sequence
Message-ID: <20260617223728.GA3913972@ax162>
References: <20260612071506.GQ187714@noisy.programming.kicks-ass.net>
In-Reply-To: <20260612071506.GQ187714@noisy.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 12, 2026 at 09:15:06AM +0200, Peter Zijlstra wrote:
>
> As noted in commit 85a2d4a890dc ("x86,ibt: Use UDB instead of 0xEA") Jcc should
> be assumed not-taken, however the normal kCFI (ABI) emits the following sequence:
>
>	movl	$(-hash), %r10d
>	addl	-15(%r11), %r10d
>	je	1f
>	ud2
> 1:	cs call	__x86_indirect_thunk_r11
>
> (when used in conjunction with -mretpoline-external-thunk).
>
> Notably, the Jcc here is always taken, resulting in lower throughput than would
> be ideal. Replace it with the following sequence on boot:
>
>	movl	$(-hash), %r10d
>	addl	-15(%r11), %r10d
>	jne	. + 3
>	test	$0xd6, %al
>	cs call	__x86_indirect_thunk_r11
>
> This jumps to the UDB instruction used as an immediate byte in the test
> instruction. The test instruction will clobber eflags, but that is immaterial,
> eflags is already changed by the preceding addl.
>
> Intel recommends the FineIBT sequence on platforms that support IBT; older
> platforms are still widely used and would benefit from this.
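
For anyone following along, here is a small user-space sketch of why
"jne . + 3" lands on the 0xd6 byte. The encodings of the mov/add/je/ud2 bytes
are my own reading of the sequence quoted above, and the helper names
(kcfi_build_orig, kcfi_patch, jcc_rel8_target) are made up for illustration.
Only the four replacement bytes and the offset of 10 come from the patch
itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: mov (6) + add (4) + Jcc (2) + 2 more bytes = 14 bytes. */
enum { KCFI_SEQ_LEN = 14, KCFI_JCC_OFF = 10 };

/* Build "movl $imm32, %r10d; addl -15(%r11), %r10d; je 1f; ud2". */
static void kcfi_build_orig(uint8_t seq[KCFI_SEQ_LEN], uint32_t neg_hash)
{
	static const uint8_t tmpl[KCFI_SEQ_LEN] = {
		0x41, 0xba, 0, 0, 0, 0,	/* movl $imm32, %r10d */
		0x45, 0x03, 0x53, 0xf1,	/* addl -15(%r11), %r10d */
		0x74, 0x02,		/* je 1f (skips the ud2) */
		0x0f, 0x0b,		/* ud2 */
	};

	memcpy(seq, tmpl, KCFI_SEQ_LEN);
	/* Store the immediate byte by byte so host endianness does not matter. */
	for (int i = 0; i < 4; i++)
		seq[2 + i] = (uint8_t)(neg_hash >> (8 * i));
}

/* Overwrite "je 1f; ud2" with "jne +1; test $0xd6, %al" (bytes from the patch). */
static void kcfi_patch(uint8_t seq[KCFI_SEQ_LEN])
{
	static const uint8_t udne[4] = { 0x75, 0x01, 0xa8, 0xd6 };

	memcpy(seq + KCFI_JCC_OFF, udne, sizeof(udne));
}

/* Offset a taken short Jcc at 'off' lands on: end of the 2-byte insn + rel8. */
static size_t jcc_rel8_target(const uint8_t *seq, size_t off)
{
	return (size_t)((long)off + 2 + (int8_t)seq[off + 1]);
}
```

With the original bytes the taken je lands at offset 14, which is the call.
After kcfi_patch() the taken jne lands at offset 13, the immediate byte of the
test instruction, which holds 0xd6.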
>
> An earlier PoC was benchmarked by Scott:
>
> Indirect branch miss rate (br_misp_retired.indirect:k / br_inst_retired.indirect:k)
>
> BHI_DIS_S=1
>
> Benchmark           Baseline   IBT        kCFI       kCFI-opt
> -------------------------------------------------------------
> iperf3 UDP          0.103764   0.103180   0.104311   0.102945
> hackbench           0.000885   0.000876   0.001996   0.000826
> lmbench syscall     0.005089   0.004486   0.016990   0.005852
> lmbench fork+exit   0.018454   0.019176   0.031085   0.015153
> lmbench fork+exec   0.017147   0.021613   0.029129   0.016337
> redis               0.032220   0.032655   0.045540   0.027946
> nginx+wrk           0.109033   0.112765   0.132557   0.102417
> fio randread        0.009704   0.009620   0.008548   0.000962
> fio seqwrite        0.006927   0.006707   0.019372   0.004590
> kbuild              0.056748   0.057324   0.064640   0.048136
>
> BHI_DIS_S=0
>
> Benchmark           Baseline   IBT        kCFI       kCFI-opt
> -------------------------------------------------------------
> iperf3 UDP          0.000077   0.000106   0.000186   0.000073
> hackbench           0.000123   0.000132   0.000367   0.000097
> lmbench syscall     0.023259   0.018319   0.040903   0.012772
> lmbench fork+exit   0.011494   0.011887   0.029079   0.016415
> lmbench fork+exec   0.037782   0.038994   0.055378   0.026381
> redis               0.002481   0.003152   0.017073   0.000184
> nginx+wrk           0.015478   0.016266   0.033637   0.000268
> fio randread        0.009836   0.007949   0.007096   0.000143
> fio seqwrite        0.014587   0.014165   0.041792   0.002157
> kbuild              0.055774   0.055249   0.062590   0.046546
>
> Cc: Sami Tolvanen
> Cc: Kees Cook
> Cc: Nathan Chancellor
> Cc: hpa@zystor.com
> Suggested-by: Scott D Constable
> Signed-off-by: Peter Zijlstra (Intel)

I booted this on two machines without IBT (so using kCFI by default) without
any issues. lkdtm's CFI_FORWARD_PROTO test case still fails for me and I can
see the d6 immediate in the stacktrace.

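
A rough model of the offset adjustment behind that d6, for reference. It
assumes (my reading, not something the patch states) that is_cfi_trap()
matches against the address where the compiler originally placed the ud2,
which is offset 12 in the caller sequence, while the patched sequence traps on
the 0xd6 byte at offset 13. The trap_table type and both helpers are
hypothetical stand-ins, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the list of compiler-recorded trap sites. */
struct trap_table {
	const uintptr_t *sites;
	size_t count;
};

/* Linear lookup: is 'addr' one of the recorded trap sites? */
static bool trap_table_contains(const struct trap_table *t, uintptr_t addr)
{
	for (size_t i = 0; i < t->count; i++) {
		if (t->sites[i] == addr)
			return true;
	}
	return false;
}

/*
 * Mirrors the "addr -= 1" in the cfi.c hunk: the fault is reported on the
 * 0xd6 immediate, one byte past where the ud2 used to start.
 */
static bool is_patched_kcfi_trap(const struct trap_table *t, uintptr_t fault_addr)
{
	return trap_table_contains(t, fault_addr - 1);
}
```

In this model a fault at base + 13 is only recognized after the one-byte
adjustment, since the table still holds base + 12.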
Tested-by: Nathan Chancellor

> ---
>  arch/x86/kernel/alternative.c | 11 ++++++++++-
>  arch/x86/kernel/cfi.c         |  6 ++++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
>
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1356,6 +1356,10 @@ early_param("cfi", cfi_parse_cmdline);
>   * "Make conditional jumps most often not taken: The efficiency and throughput
>   * for not-taken branches is better than for taken branches on most
>   * processors. Therefore, it is good to place the most frequent branch first"
> + *
> + * NOTE: Update the kCFI caller sequence to make use of this observation.
> + * Replace the "je 1f; ud2" sequence with "jne +1; test $0xd6, %al". This
> + * clobbers flags, but those are clobbered by the hash test anyway.
>   */
>
>  /*
> @@ -1518,9 +1522,10 @@ static int cfi_disable_callers(s32 *star
>  static int cfi_enable_callers(s32 *start, s32 *end)
>  {
>  	/*
> -	 * Re-enable kCFI, undo what cfi_disable_callers() did.
> +	 * Re-enable (and update) kCFI, undo what cfi_disable_callers() did.
>  	 */
>  	const u8 mov[] = { 0x41, 0xba };
> +	const u8 udne[] = { 0x75, 0x01, 0xa8, 0xd6 };
>  	s32 *s;
>
>  	for (s = start; s < end; s++) {
> @@ -1532,6 +1537,10 @@ static int cfi_enable_callers(s32 *start
>  		if (!hash) /* nocfi callers */
>  			continue;
>
> +		/*
> +		 * See the kCFI/FineIBT comment above -- update note.
> +		 */
> +		text_poke_early(addr + 10, udne, 4);
>  		text_poke_early(addr, mov, 2);
>  	}
>
> --- a/arch/x86/kernel/cfi.c
> +++ b/arch/x86/kernel/cfi.c
> @@ -72,6 +72,12 @@ enum bug_trap_type handle_cfi_failure(st
>
>  	switch (cfi_mode) {
>  	case CFI_KCFI:
> +		/*
> +		 * The updated kCFI sequence has "test $0xd6, %al" instead of
> +		 * "ud2", adjust the offset.
> +		 */
> +		addr -= 1;
> +
>  		if (!is_cfi_trap(addr))
>  			return BUG_TRAP_TYPE_NONE;
>

--
Cheers,
Nathan