BPF List
From: Andrey Grodzovsky <andrey.grodzovsky@crowdstrike.com>
To: <bpf@vger.kernel.org>, <live-patching@vger.kernel.org>
Cc: DL Linux Open Source Team <linux-open-source@crowdstrike.com>,
	Petr Mladek <pmladek@suse.com>, Song Liu <song@kernel.org>,
	<andrii@kernel.org>, Raja Khan <raja.khan@crowdstrike.com>
Subject: BPF fentry/fexit trampolines stall livepatch stalls transition due to missing ORC unwind metadata
Date: Wed, 19 Nov 2025 10:41:32 -0500
Message-ID: <0e555733-c670-4e84-b2e6-abb8b84ade38@crowdstrike.com>

Hello BPF and livepatch teams,

This is something of a follow-up to
https://lists.ubuntu.com/archives/kernel-team/2025-October/163881.html
as we continue to encounter issues and conflicts between BPF and livepatch.

We've encountered an issue between BPF fentry/fexit trampolines and 
kernel livepatching (kpatch/livepatch) on x86_64 systems with ORC 
unwinder enabled. I'm reaching out to understand if this is a known 
limitation and to explore potential solutions. I assume it is known, 
since I see information along these lines in 
https://www.kernel.org/doc/Documentation/livepatch/reliable-stacktrace.rst

Problem Summary

When BPF programs attach to kernel functions using fentry/fexit hooks, 
the resulting JIT-compiled trampolines lack ORC unwind metadata. This 
causes the livepatch transition to stall when threads are blocked in 
hooked functions, because the unwinder marks their stacks as unreliable.

In our case the environment is:

- RHEL 9.6 (kernel 5.14.0-570.17.1.el9_6.x86_64)
- CONFIG_UNWINDER_ORC=y
- CONFIG_BPF_JIT_ALWAYS_ON=y
- BPF fentry/fexit hooks on inet_recvmsg()

Scenario:
1. BPF program attached to inet_recvmsg via fentry/fexit (creates BPF 
trampoline)
2. CIFS filesystem mounted (creates cifsd kernel thread)
3. cifsd thread blocks in inet_recvmsg → BPF trampoline is on the stack
4. Attempt to load kpatch module
5. Livepatch transition stalls indefinitely

Error Message (repeated every ~1 second):
livepatch: klp_try_switch_task: cifsd:2886 has an unreliable stack

Stack trace showing BPF trampoline:
cifsd           D  0  2886
Call Trace:
  wait_woken+0x50/0x60
  sk_wait_data+0x176/0x190
  tcp_recvmsg_locked+0x234/0x920
  tcp_recvmsg+0x78/0x210
  inet_recvmsg+0x5c/0x140
  bpf_trampoline_6442469985+0x89/0x130  ← NO ORC metadata
  sock_recvmsg+0x95/0xa0
  cifs_readv_from_socket+0x1ca/0x2d0 [cifs]
  ...

As far as I understand (please correct me if I'm wrong):

The failure occurs in arch/x86/kernel/unwind_orc.c:

orc = orc_find(state->signal ? state->ip : state->ip - 1);
if (!orc) {
     /*
      * As a fallback, try to assume this code uses a frame pointer.
      * This is useful for generated code, like BPF, which ORC
      * doesn't know about.  This is just a guess, so the rest of
      * the unwind is no longer considered reliable.
      */
     orc = &orc_fp_entry;
     state->error = true;  // ← Marks stack as unreliable
}

When orc_find() returns NULL for the BPF trampoline address, the 
unwinder falls back to frame pointers and marks the stack unreliable. 
This causes arch_stack_walk_reliable() to fail, which in turn causes 
livepatch's klp_check_stack() to return -EINVAL before even checking if 
to-be-patched functions are on the stack.
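
To make that failure path concrete, here is a simplified user-space model
of it. This is only a sketch and not kernel code: struct text_range,
model_orc_find() and model_walk_reliable() are hypothetical names I made
up for illustration, and the addresses used with them are arbitrary. The
real unwinder keeps walking with a frame-pointer guess and sets
state->error; the model only keeps the final verdict that the reliable
walk reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one range of kernel text that has ORC data. */
struct text_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/* Model of orc_find(): true if ip lies in a range with unwind metadata. */
static bool model_orc_find(const struct text_range *tbl, size_t n,
			   uintptr_t ip)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (ip >= tbl[i].start && ip < tbl[i].end)
			return true;
	return false;
}

/*
 * Model of a reliable stack walk over a list of return addresses.
 * Returns 0 if every frame has metadata, -1 otherwise (the kernel
 * reports -EINVAL in that case).
 */
static int model_walk_reliable(const struct text_range *tbl, size_t n,
			       const uintptr_t *frames, size_t nframes)
{
	for (size_t i = 0; i < nframes; i++) {
		/* Mirrors "state->ip - 1" for call return addresses. */
		if (!model_orc_find(tbl, n, frames[i] - 1))
			return -1;
	}
	return 0;
}
```

In this model a single frame whose address is outside every known range,
as a JIT-generated trampoline would be, is enough to make the whole walk
fail, regardless of which functions are actually on the stack.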

Key observations:
1. The kernel comment explicitly mentions "generated code, like BPF"
2. Documentation/livepatch/reliable-stacktrace.rst lists "Dynamically 
generated code (e.g. eBPF)" as causing unreliable stacks
3. Native kernel functions have ORC metadata from objtool during build
4. Ftrace trampolines have special ORC handling via orc_ftrace_find()
5. BPF JIT trampolines have no such handling. Is this correct?

Impact

This affects production systems where:
- Security/observability tools use BPF fentry/fexit hooks
- Live kernel patching is required for security updates
- Kernel threads may be blocked in hooked network/storage functions

The livepatch transition can stall for 60+ seconds before failing, 
blocking critical security patches.

Questions for the Community

1. Is this a known limitation? (I assume yes.)
2. Runtime ORC generation? Could the BPF JIT generate ORC unwind entries 
for trampolines, similar to how ftrace trampolines are handled?
3. Trampoline registration? Could BPF trampolines register their address 
ranges with the ORC unwinder to avoid the "unreliable" marking?
4. Alternative unwinding? Could livepatch use an alternative unwinding 
method when BPF trampolines are detected (e.g., frame pointers with 
validation)?
5. Workarounds? I mention one below, and I would be happy to hear if 
anyone has a better idea to propose.

The only possible workaround I see is switching everything from 
trampoline-based hooks to kprobes, since I assume kprobes won't have 
this issue.

BPF kprobes use the ftrace infrastructure with kprobe_ftrace_handler, 
which has ORC metadata and special handling in the unwinder. The stack 
remains reliable:
inet_recvmsg+0x50/0x140  ← Has ORC metadata
kprobe_ftrace_handler+... ← Has ORC metadata

The problem with kprobes is, obviously, their performance penalty.

Additional Context

From arch/x86/net/bpf_jit_comp.c:3559:
bool bpf_jit_supports_exceptions(void)
{
     /* We unwind through both kernel frames (starting from within bpf_throw
      * call) and BPF frames. Therefore we require ORC unwinder to be enabled
      * to walk kernel frames and reach BPF frames in the stack trace.
      */
     return IS_ENABLED(CONFIG_UNWINDER_ORC);
}

This shows that BPF already has some integration with ORC for exception 
handling. Could this be extended to trampolines?
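
To illustrate what I mean by "trampoline registration" in question 3,
here is a minimal user-space sketch of a range registry. Everything in it
is an assumption for discussion: dyn_register(), dyn_unregister(),
dyn_covers() and DYN_MAX_RANGES are hypothetical names, not existing
kernel interfaces, and a real implementation would also need locking and
a way to describe the trampoline's frame layout to the unwinder. What the
lookup should return for a registered range (a generated ORC entry or a
validated frame-pointer rule) is deliberately left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DYN_MAX_RANGES 8	/* arbitrary limit for the sketch */

struct dyn_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

struct dyn_registry {
	struct dyn_range ranges[DYN_MAX_RANGES];
	size_t count;
};

/* Record a generated-code range; returns 0 on success, -1 on failure. */
static int dyn_register(struct dyn_registry *reg, uintptr_t start,
			size_t size)
{
	assert(reg != NULL);

	if (size == 0 || reg->count == DYN_MAX_RANGES)
		return -1;
	reg->ranges[reg->count].start = start;
	reg->ranges[reg->count].end = start + size;
	reg->count++;
	return 0;
}

/* Drop the range that begins at start; true if one was found. */
static bool dyn_unregister(struct dyn_registry *reg, uintptr_t start)
{
	assert(reg != NULL);

	for (size_t i = 0; i < reg->count; i++) {
		if (reg->ranges[i].start == start) {
			/* Order does not matter, so fill the hole. */
			reg->ranges[i] = reg->ranges[reg->count - 1];
			reg->count--;
			return true;
		}
	}
	return false;
}

/* Would a lookup for ip hit a registered range? */
static bool dyn_covers(const struct dyn_registry *reg, uintptr_t ip)
{
	assert(reg != NULL);

	for (size_t i = 0; i < reg->count; i++)
		if (ip >= reg->ranges[i].start && ip < reg->ranges[i].end)
			return true;
	return false;
}
```

The idea would be to register a range when a trampoline image is
installed and to unregister it before the image is freed, so that a
lookup never matches stale memory.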

References

- Kernel: 5.14.0-570.17.1.el9_6.x86_64
- Code: arch/x86/kernel/unwind_orc.c:510-519
- Docs: Documentation/livepatch/reliable-stacktrace.rst lines 84-85, 111-112

I appreciate any guidance on whether this is something that could be 
addressed in the kernel, or if we should focus on user-space workarounds.

Thanks,
Andrey

Thread overview: 11+ messages
2025-11-19 15:41 Andrey Grodzovsky [this message]
2025-11-20 12:15 ` BPF fentry/fexit trampolines stall livepatch stalls transition due to missing ORC unwind metadata Miroslav Benes
2025-11-22  0:56   ` Josh Poimboeuf
2025-11-24 17:14     ` Alexei Starovoitov
2025-11-24 19:51       ` Josh Poimboeuf
2025-11-24 22:06     ` [External] " Andrey Grodzovsky
2025-11-24 22:51       ` Josh Poimboeuf
2025-11-24 22:54         ` Andrey Grodzovsky
2025-11-25  0:06           ` Josh Poimboeuf
2025-11-27 14:55             ` Andrey Grodzovsky
2025-12-01 20:59               ` Josh Poimboeuf