From: Peter Zijlstra <peterz@infradead.org>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	tglx@kernel.org, mingo@redhat.com, bp@alien8.de,
	Nathan Chancellor <nathan@kernel.org>,
	Calvin Owens <calvin@wbinvd.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	x86-ML <x86@kernel.org>, LKML <linux-kernel@vger.kernel.org>
Subject: Re: 8aeb879baf12 - significant system call latency regression, bisected
Date: Fri, 19 Jun 2026 09:31:34 +0200
Message-ID: <20260619073134.GQ49951@noisy.programming.kicks-ass.net>
In-Reply-To: <9ae04c80-d1e2-44f0-bca2-d0889c61b45f@zytor.com>

On Thu, Jun 18, 2026 at 03:40:00PM -0700, H. Peter Anvin wrote:
> On 2026-06-17 05:37, Peter Zijlstra wrote:
> > 
> > This builds with kcfi on and seems to more or less do what is expected.
> > 
> > I've not actually tried performance measurements on my IDT based system.
> > 
> 
> I'm going to run this through its paces.
> 
> I'm still confused, though, by the claim that changing the
> patchable_function_entry() breaks the kCFI ABI. When I do a symbol check on my
> system, the __pfx symbols are still at an offset of -16, and the additional
> NOPs are located *before* them. Isn't this completely consistent with the
> existing ABI? What am I missing here?

CONFIG_CFI=y is what's missing. I suspect you're building with GCC, and
Kees is still trying to get the kCFI patches to land there :/

(assuming Debian):

$ make O=defconfig-build LLVM=-22 defconfig
$ ./scripts/config --file defconfig-build/.config --enable CFI
$ make O=defconfig-build LLVM=-22 kernel/sched/core.o
$ objdump -wdr defconfig-build/kernel/sched/core.o

...

0000000000005630 <__cfi_resched_curr>:
    5630:       b9 9c 43 4e 4d          mov    $0x4d4e439c,%ecx

0000000000005635 <.Ltmp268>:
    5635:       90                      nop
    5636:       90                      nop
    5637:       90                      nop
    5638:       90                      nop
    5639:       90                      nop
    563a:       90                      nop
    563b:       90                      nop
    563c:       90                      nop
    563d:       90                      nop
    563e:       90                      nop
    563f:       90                      nop

0000000000005640 <resched_curr>:
    5640:       f3 0f 1e fa             endbr64
    5644:       be 04 00 00 00          mov    $0x4,%esi
    5649:       eb 15                   jmp    5660 <__resched_curr>
    564b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    5650:       90                      nop
    5651:       90                      nop
    5652:       90                      nop
    5653:       90                      nop
    5654:       90                      nop

...


diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 8b791e9e9f67..32c15ea31c02 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1228,7 +1228,7 @@ void __trace_set_need_resched(struct task_struct *curr, int tif)
 }
 EXPORT_SYMBOL_GPL(__trace_set_need_resched);
 
-void resched_curr(struct rq *rq)
+__attribute__((patchable_function_entry(27, 27))) void resched_curr(struct rq *rq) 
 {
 	__resched_curr(rq, TIF_NEED_RESCHED);
 }
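
For anyone not familiar with the attribute: patchable_function_entry(N, M)
asks for N NOPs in total, M of them placed before the function's entry
label, so (27, 27) puts all 27 in front of the entry. A minimal userspace
sketch of that, not kernel code; padded_add() and count_nops_before() are
made-up names, and the byte scan is an x86-only heuristic that assumes the
compiler emits single-byte 0x90 NOPs and that the text section is readable:

```c
#include <assert.h>
#include <stdint.h>

#define PAD_NOPS 27

/* Toy stand-in for resched_curr(): all PAD_NOPS NOPs go before the entry. */
__attribute__((noinline, patchable_function_entry(PAD_NOPS, PAD_NOPS)))
int padded_add(int a, int b)
{
	return a + b;
}

/*
 * Count how many of the 'max' bytes directly before 'entry' are 0x90,
 * scanning backwards and stopping at the first other byte.
 * Hypothetical helper; it reads the program's own text section.
 */
int count_nops_before(uintptr_t entry, int max)
{
	const volatile uint8_t *p = (const volatile uint8_t *)entry;
	int n = 0;

	assert(max >= 0);
	while (n < max && p[-(n + 1)] == 0x90)
		n++;
	return n;
}
```

Calling count_nops_before((uintptr_t)padded_add, PAD_NOPS) should report
the padding in front of the entry; I would expect 27 on an x86-64 build,
but have not checked that across compilers.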

...

0000000000005630 <__cfi_resched_curr>:
    5630:       b9 9c 43 4e 4d          mov    $0x4d4e439c,%ecx

0000000000005635 <.Ltmp268>:
    5635:       90                      nop
    5636:       90                      nop
    5637:       90                      nop
    5638:       90                      nop
    5639:       90                      nop
    563a:       90                      nop
    563b:       90                      nop
    563c:       90                      nop
    563d:       90                      nop
    563e:       90                      nop
    563f:       90                      nop
    5640:       90                      nop
    5641:       90                      nop
    5642:       90                      nop
    5643:       90                      nop
    5644:       90                      nop
    5645:       90                      nop
    5646:       90                      nop
    5647:       90                      nop
    5648:       90                      nop
    5649:       90                      nop
    564a:       90                      nop
    564b:       90                      nop
    564c:       90                      nop
    564d:       90                      nop
    564e:       90                      nop
    564f:       90                      nop

0000000000005650 <resched_curr>:
    5650:       f3 0f 1e fa             endbr64
    5654:       be 04 00 00 00          mov    $0x4,%esi
    5659:       eb 15                   jmp    5670 <__resched_curr>
    565b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
    5660:       90                      nop
    5661:       90                      nop
    5662:       90                      nop
    5663:       90                      nop
    5664:       90                      nop

...


Now look at any x86_indirect_thunk site, the first being:

      30:       41 ba 5a 69 61 c7       mov    $0xc761695a,%r10d
      36:       45 03 53 f1             add    -0xf(%r11),%r10d
      3a:       74 02                   je     3e <__traceiter_sched_kthread_stop+0x2e>
      3c:       0f 0b                   ud2
      3e:       2e e8 00 00 00 00       cs call 44 <__traceiter_sched_kthread_stop+0x34>        40: R_X86_64_PLT32      __x86_indirect_thunk_r11-0x4

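This hard-codes the assumption that the hash (the immediate of that mov in
the __cfi symbol) sits at -15 from the function entry. Now observe how
that __attribute__((patchable_function_entry(27,27))) has shifted it for
this one function: __cfi_resched_curr stays at 0x5630, but resched_curr
moves from 0x5640 to 0x5650, so the hash now sits at -31.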

That is what's broken. It makes it impossible to actually do an indirect
call to these functions, which is why if you touch
patchable_function_entry, you also need nocf_check.
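
To make the failure concrete, here is a small userspace model of the
caller-side check. It is a sketch under assumptions, not the kernel's
code: build_preamble() and kcfi_check() are made-up names, the caller's
immediate is assumed to be the negated hash (so the add yields zero on a
match), and the byte layout mirrors the objdump output above (5-byte mov,
then NOPs, then the entry).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Callers read the hash at entry - 15: "add -0xf(%r11),%r10d". */
#define CFI_HASH_OFFSET 15

/*
 * Lay out a toy kCFI preamble in buf:
 *   b9 <hash32>    mov $hash,%ecx   (the __cfi_ symbol, 5 bytes)
 *   90 * nops      padding before the function entry
 * Returns the offset of the function entry within buf.
 */
size_t build_preamble(uint8_t *buf, size_t cap, uint32_t hash, size_t nops)
{
	size_t pos = 0;

	assert(1 + sizeof(hash) + nops <= cap);
	buf[pos++] = 0xb9;
	memcpy(buf + pos, &hash, sizeof(hash));	/* host byte order */
	pos += sizeof(hash);
	memset(buf + pos, 0x90, nops);
	pos += nops;
	return pos;
}

/* Model of: mov $-hash,%r10d; add -15(%r11),%r10d; je ok; ud2 */
int kcfi_check(const uint8_t *entry, uint32_t expected_hash)
{
	uint32_t stored;

	memcpy(&stored, entry - CFI_HASH_OFFSET, sizeof(stored));
	return (uint32_t)(0u - expected_hash + stored) == 0;
}
```

With 11 NOPs the entry lands at offset 16 and entry - 15 is the hash
immediate, so the check passes. With 27 NOPs the entry lands at offset 32
and entry - 15 points into the NOP padding, so the check reads 0x90 bytes
instead of the hash and would hit the ud2.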


