From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Maxim Mikityanskiy <maxtram95@gmail.com>
Cc: Sasha Levin <sashal@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
stable@vger.kernel.org, Maxim Mikityanskiy <maxim@isovalent.com>
Subject: Re: [PATCH stable 5.10] bpf: Allow reads from uninit stack
Date: Mon, 15 Jul 2024 14:09:26 +0200 [thread overview]
Message-ID: <2024071507-conical-issuing-2e8d@gregkh> (raw)
In-Reply-To: <20240711184323.2355017-1-maxtram95@gmail.com>
On Thu, Jul 11, 2024 at 09:43:21PM +0300, Maxim Mikityanskiy wrote:
> From: Eduard Zingerman <eddyz87@gmail.com>
>
> [ Upstream commit 6715df8d5d24655b9fd368e904028112b54c7de1 ]
>
> This commits updates the following functions to allow reads from
> uninitialized stack locations when env->allow_uninit_stack option is
> enabled:
> - check_stack_read_fixed_off()
> - check_stack_range_initialized(), called from:
> - check_stack_read_var_off()
> - check_helper_mem_access()
>
> Such change allows to relax logic in stacksafe() to treat STACK_MISC
> and STACK_INVALID in a same way and make the following stack slot
> configurations equivalent:
>
> | Cached state | Current state |
> | stack slot | stack slot |
> |------------------+------------------|
> | STACK_INVALID or | STACK_INVALID or |
> | STACK_MISC | STACK_SPILL or |
> | | STACK_MISC or |
> | | STACK_ZERO or |
> | | STACK_DYNPTR |
>
> This leads to significant verification speed gains (see below).
>
> The idea was suggested by Andrii Nakryiko [1] and initial patch was
> created by Alexei Starovoitov [2].
>
> Currently the env->allow_uninit_stack is allowed for programs loaded
> by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities.
>
> A number of test cases from verifier/*.c were expecting uninitialized
> stack access to be an error. These test cases were updated to execute
> in unprivileged mode (thus preserving the tests).
>
> The test progs/test_global_func10.c expected "invalid indirect read
> from stack" error message because of the access to uninitialized
> memory region. This error is no longer possible in privileged mode.
> The test is updated to provoke an error "invalid indirect access to
> stack" because of access to invalid stack address (such error is not
> verified by progs/test_global_func*.c series of tests).
>
> The following tests had to be removed because these can't be made
> unprivileged:
> - verifier/sock.c:
> - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init
> stack_value"
> BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode.
> - verifier/var_off.c:
> - "indirect variable-offset stack access, max_off+size > max_initialized"
> - "indirect variable-offset stack access, uninitialized"
> These tests verify that access to uninitialized stack values is
> detected when stack offset is not a constant. However, variable
> stack access is prohibited in unprivileged mode, thus these tests
> are no longer valid.
>
> * * *
>
> Here is veristat log comparing this patch with current master on a
> set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
> and cilium BPF binaries (see [3]):
>
> $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log
> File Program States (A) States (B) States (DIFF)
> -------------------------- -------------------------- ---------- ---------- ----------------
> bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%)
> bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%)
> bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%)
> bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%)
> bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%)
> bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%)
> bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%)
> pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%)
> strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%)
> strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%)
> xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%)
> xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%)
> -------------------------- -------------------------- ---------- ---------- ----------------
>
> Note: I limited selection by states_pct<-30%.
>
> Inspection of differences in pyperf600_bpf_loop behavior shows that
> the following patch for the test removes almost all differences:
>
> - a/tools/testing/selftests/bpf/progs/pyperf.h
> + b/tools/testing/selftests/bpf/progs/pyperf.h
> @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args *ctx)
> }
>
> if (event->pthread_match || !pidData->use_tls) {
> - void* frame_ptr;
> - FrameData frame;
> + void* frame_ptr = 0;
> + FrameData frame = {};
> Symbol sym = {};
> int cur_cpu = bpf_get_smp_processor_id();
>
> W/o this patch the difference comes from the following pattern
> (for different variables):
>
> static bool get_frame_data(... FrameData *frame ...)
> {
> ...
> bpf_probe_read_user(&frame->f_code, ...);
> if (!frame->f_code)
> return false;
> ...
> bpf_probe_read_user(&frame->co_name, ...);
> if (frame->co_name)
> ...;
> }
>
> int __on_event(struct bpf_raw_tracepoint_args *ctx)
> {
> FrameData frame;
> ...
> get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback
> ...
> }
>
> SEC("raw_tracepoint/kfree_skb")
> int on_event(struct bpf_raw_tracepoint_args* ctx)
> {
> ...
> ret |= __on_event(ctx);
> ret |= __on_event(ctx);
> ...
> }
>
> With regards to value `frame->co_name` the following is important:
> - Because of the conditional `if (!frame->f_code)` each call to
> __on_event() produces two states, one with `frame->co_name` marked
> as STACK_MISC, another with it as is (and marked STACK_INVALID on a
> first call).
> - The call to bpf_probe_read_user() does not mark stack slots
> corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks
> these slots as BPF_MISC, this happens because of the following loop
> in the check_helper_call():
>
> for (i = 0; i < meta.access_size; i++) {
> err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B,
> BPF_WRITE, -1, false);
> if (err)
> return err;
> }
>
> Note the size of the write, it is a one byte write for each byte
> touched by a helper. The BPF_B write does not lead to write marks
> for the target stack slot.
> - Which means that w/o this patch when second __on_event() call is
> verified `if (frame->co_name)` will propagate read marks first to a
> stack slot with STACK_MISC marks and second to a stack slot with
> STACK_INVALID marks and these states would be considered different.
>
> [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/
> [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/
> [3] git@github.com:anakryiko/cilium.git
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Co-developed-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>
> Link: https://lore.kernel.org/r/20230219200427.606541-2-eddyz87@gmail.com
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
> ---
> Backporting to address the complexity regression introduced by commit
> 71f656a50176 ("bpf: Fix to preserve reg parent/live fields when copying
> range info"), that affects Cilium built with LLVM 18.
All now queued up, thanks.
greg k-h
prev parent reply other threads:[~2024-07-15 12:09 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-11 18:43 [PATCH stable 5.10] bpf: Allow reads from uninit stack Maxim Mikityanskiy
2024-07-11 18:43 ` [PATCH stable 5.15] " Maxim Mikityanskiy
2024-08-23 1:45 ` BPF selftest failed to build Sherry Yang
2024-08-23 2:50 ` Shung-Hsi Yu
2024-08-27 12:44 ` Greg KH
2024-07-11 18:43 ` [PATCH stable 6.1] bpf: Allow reads from uninit stack Maxim Mikityanskiy
2024-07-15 12:09 ` Greg Kroah-Hartman [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2024071507-conical-issuing-2e8d@gregkh \
--to=gregkh@linuxfoundation.org \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=john.fastabend@gmail.com \
--cc=maxim@isovalent.com \
--cc=maxtram95@gmail.com \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox