From: Viktor Malik <vmalik@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Andrii Nakryiko <andrii@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@linux.dev>,
Song Liu <song@kernel.org>,
Yonghong Song <yonghong.song@linux.dev>,
John Fastabend <john.fastabend@gmail.com>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>,
Jiri Olsa <jolsa@kernel.org>, Shuah Khan <shuah@kernel.org>,
Jordan Rome <linux@jordanrome.com>,
Qais Yousef <qyousef@layalina.io>, Hou Tao <houtao1@huawei.com>
Subject: Re: [PATCH bpf-next v3 0/3] Separate tests that need error injection
Date: Mon, 16 Feb 2026 08:46:28 +0100 [thread overview]
Message-ID: <fa67d3db-273f-4e4e-90a2-ba805f484bd1@redhat.com> (raw)
In-Reply-To: <CAADnVQLT7cGgExtaqusyk+XwrD=-x15kL0veJB5YPewjPo=PUg@mail.gmail.com>
On 2/14/26 00:13, Alexei Starovoitov wrote:
> On Fri, Feb 13, 2026 at 1:22 AM Viktor Malik <vmalik@redhat.com> wrote:
>>
>> On 2/12/26 17:35, Alexei Starovoitov wrote:
>>> On Wed, Feb 11, 2026 at 11:22 PM Viktor Malik <vmalik@redhat.com> wrote:
>>>>
>>>> On 2/12/26 02:42, Alexei Starovoitov wrote:
>>>>> On Wed, Feb 11, 2026 at 5:55 AM Viktor Malik <vmalik@redhat.com> wrote:
>>>>>>
>>>>>> On 2/11/26 03:42, Alexei Starovoitov wrote:
>>>>>>> On Tue, Feb 10, 2026 at 7:14 AM Viktor Malik <vmalik@redhat.com> wrote:
>>>>>>>>
>>>>>>>> Some enterprise kernels (such as RHEL) do not enable error injection via
>>>>>>>> BPF (CONFIG_FUNCTION_ERROR_INJECTION and CONFIG_BPF_KPROBE_OVERRIDE).
>>>>>>>> When running test_progs on such kernels, a lot of test cases fail since
>>>>>>>> they use sleepable fentry or fmod_ret program types which require error
>>>>>>>> injection to be enabled. While it is possible to skip these via custom
>>>>>>>> DENYLIST, some test_progs are not properly split into subtests and
>>>>>>>> therefore must be entirely skipped.
>>>>>>>
>>>>>>> Something is wrong.
>>>>>>> -SEC("fentry/" SYS_PREFIX "sys_nanosleep")
>>>>>>> -int do_probe_read(void *ctx)
>>>>>>> +SEC("?fentry/" SYS_PREFIX "sys_nanosleep")
>>>>>>> +int probe_read(void *ctx)
>>>>>>> {
>>>>>>> char buf[8];
>>>>>>>
>>>>>>> @@ -37,19 +37,19 @@ int do_probe_read(void *ctx)
>>>>>>> return 0;
>>>>>>> }
>>>>>>>
>>>>>>> -SEC("fentry.s/" SYS_PREFIX "sys_nanosleep")
>>>>>>> -int do_copy_from_user(void *ctx)
>>>>>>> +SEC("?fentry.s/" SYS_PREFIX "sys_nanosleep")
>>>>>>>
>>>>>>> sleepable should have nothing to do with CONFIG_FUNCTION_ERROR_INJECTION.
>>>>>>
>>>>>> I may be missing something but I think it is directly dependent, at
>>>>>> least for fentry.
>>>>>>
>>>>>> In bpf_check_attach_target (kernel/bpf/verifier.c), we have:
>>>>>>
>>>>>> if (prog->sleepable) {
>>>>>> ret = -EINVAL;
>>>>>> switch (prog->type) {
>>>>>> case BPF_PROG_TYPE_TRACING:
>>>>>>
>>>>>> /* fentry/fexit/fmod_ret progs can be sleepable if they are
>>>>>> * attached to ALLOW_ERROR_INJECTION and are not in denylist.
>>>>>> */
>>>>>> if (!check_non_sleepable_error_inject(btf_id) &&
>>>>>> within_error_injection_list(addr))
>>>>>
>>>>> Oh. That's an annoying side effect of disabling error inject.
>>>>> fentry progs are not injecting any errors. They just use that
>>>>> list as allowlist of good attach points.
>>>>> Let's explicitly allow all syscalls here.
>>>>
>>>> Ok, I'll post a patch.
>>>>
>>>>>
>>>>>> ret = 0;
>>>>>> /* fentry/fexit/fmod_ret progs can also be sleepable if they are
>>>>>> * in the fmodret id set with the KF_SLEEPABLE flag.
>>>>>> */
>>>>>> else {
>>>>>> u32 *flags = btf_kfunc_is_modify_return(btf, btf_id,
>>>>>> prog);
>>>>>>
>>>>>> if (flags && (*flags & KF_SLEEPABLE))
>>>>>> ret = 0;
>>>>>> }
>>>>>> break;
>>>>>> [...]
>>>>>> }
>>>>>> if (ret) {
>>>>>> module_put(mod);
>>>>>> bpf_log(log, "%s is not sleepable\n", tname);
>>>>>> return ret;
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> If I read that correctly, fentry must be on the error injection list
>>>>>> (which is populated by functions marked with ALLOW_ERROR_INJECTION when
>>>>>> CONFIG_FUNCTION_ERROR_INJECTION=y) to be allowed as sleepable.
>>>>>>
>>>>>> The other branch for fmodret id set seems to be only relevant for
>>>>>> kernel modules which explicily register their functions into
>>>>>> BTF_KFUNC_HOOK_FMODRET set.
>>>>>>
>>>>>>> While CONFIG_BPF_KPROBE_OVERRIDE affects a single helper.
>>>>>>>
>>>>>>> Both fentry and fentry.s should work fine on sys_nanosleep.
>>>>>>> fmod_ret is also completely independent of these two flags.
>>>>>>
>>>>>> For fmod_ret, we have a very similar code:
>>>>>>
>>>>>> } else if (prog->expected_attach_type == BPF_MODIFY_RETURN) {
>>>>>> [...]
>>>>>> ret = -EINVAL;
>>>>>> if (btf_kfunc_is_modify_return(btf, btf_id, prog) ||
>>>>>> !check_attach_modify_return(addr, tname))
>>>>>> ret = 0;
>>>>>> if (ret) {
>>>>>> module_put(mod);
>>>>>> bpf_log(log, "%s() is not modifiable\n", tname);
>>>>>> return ret;
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> with
>>>>>>
>>>>>> static int check_attach_modify_return(unsigned long addr, const char *func_name)
>>>>>> {
>>>>>> if (within_error_injection_list(addr) ||
>>>>>> !strncmp(SECURITY_PREFIX, func_name, sizeof(SECURITY_PREFIX) - 1))
>>>>>
>>>>> Same thing here.
>>>>> Let's wrap within_error_injection_list() with bpf specific helper
>>>>> that in case of !CONFIG_FUNCTION_ERROR_INJECTION
>>>>> will allow syscalls anyway.
>>>>
>>>> I would argue that fmod_ret should remain gated behind
>>>> CONFIG_ERROR_INJECTION. It's a similar concept to kprobe override - it
>>>> allows to override the execution of the original syscall and return a
>>>> custom error code.
>>>
>>> Definitely not. fmod_ret and error injection are two orthogonal concepts.
>>> fmod_ret was introduced to be independent of error injection in the
>>> first place. fmod_ret is a deliberate spot to allow bpf to change
>>> return code
>>> in production.
>>
>> I'm not sure that was the original intention of fmod_ret. Even the
>> original patch set mentions that error injection needs to be explicitly
>> enabled [1,2]. Changing that is IMHO a significant shift which needs
>> discussion.
>
> That discussion already happened when error_injection was forked
> as a config flag and went to default N.
Then I'm a bit confused about the purpose of error_injection. The
description of the config option says:
Add fault injections into various functions that are annotated with
ALLOW_ERROR_INJECTION() in the kernel. BPF may also modify the return
value of these functions. This is useful to test error paths of code.
If unsure, say N
But if you say that fmod_ret and error_injection are orthogonal
concepts, then what is error_injection good for without fmod_ret?
>>
>>> While error injection is a debug feature to stress test the kernel.
>>> syscalls are in fmod_ret territory, since this is not debugging
>>> and not stressing the kernel. ptrace can change syscall returns without
>>> bpf, without fmod_ret, and without error injection.
>>
>> The thing I'm worried about here is that allowing BPF programs override
>> syscalls will esentially allow to implement a BPF-based rootkit. I'm
>> well aware that BPF is root-only (although there are ways to allow
>> unprivileged users run BPF, such as tokens) but if a potential attacker
>> gains root access even for a short time, they can install the rootkit
>> which may live unnoticed from that moment. Which is not something we
>> really want to enable in production. For ptrace, my understanding is
>> that it is significantly easier to detect if it is used.
>>
>> If you don't mind, I'd start with only allowing sleepable syscalls (when
>> error injection is disabled) for now.
>
> I mind and I hate the reasoning that in kernels with error_injection=y
> bpf is prone to be a rootkit. This is just a security circus
> detached from reality.
Could you share a bit more details why? I'm not a security expert myself
but it seems to me that intercepting syscalls could be used to implement
a rootkit. For instance [1] does something similar, although they use
bpf_override_return in syscall kprobes. So, I'd love to understand why
this is detached from reality.
Thanks!
Viktor
[1] https://github.com/Gui774ume/ebpfkit
next prev parent reply other threads:[~2026-02-16 7:46 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-10 15:13 [PATCH bpf-next v3 0/3] Separate tests that need error injection Viktor Malik
2026-02-10 15:14 ` [PATCH bpf-next v3 1/3] selftests/bpf: Split module_attach into subtests Viktor Malik
2026-02-10 15:14 ` [PATCH bpf-next v3 2/3] selftests/bpf: Split read_vsyscall " Viktor Malik
2026-02-10 15:14 ` [PATCH bpf-next v3 3/3] selftests/bpf: Split sleepable fentry from LSM test Viktor Malik
2026-02-11 2:42 ` [PATCH bpf-next v3 0/3] Separate tests that need error injection Alexei Starovoitov
2026-02-11 13:54 ` Viktor Malik
2026-02-12 1:42 ` Alexei Starovoitov
2026-02-12 7:22 ` Viktor Malik
2026-02-12 16:35 ` Alexei Starovoitov
2026-02-13 9:21 ` Viktor Malik
2026-02-13 23:13 ` Alexei Starovoitov
2026-02-16 7:46 ` Viktor Malik [this message]
2026-02-23 11:52 ` Viktor Malik
2026-02-24 2:58 ` Alexei Starovoitov
2026-02-24 8:03 ` Viktor Malik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=fa67d3db-273f-4e4e-90a2-ba805f484bd1@redhat.com \
--to=vmalik@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=haoluo@google.com \
--cc=houtao1@huawei.com \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=linux@jordanrome.com \
--cc=martin.lau@linux.dev \
--cc=qyousef@layalina.io \
--cc=sdf@fomichev.me \
--cc=shuah@kernel.org \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox