public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH bpf-next 00/24] Support bpf trampoline for s390x
Date: Fri, 27 Jan 2023 17:51:20 +0100	[thread overview]
Message-ID: <8e9f72c6b43361a778e623085eb5b7aea7bd0fbd.camel@linux.ibm.com> (raw)
In-Reply-To: <CAEf4BzZP5771Wbv4w1gM+8vcGwvhmFi2tH-8aSGfnzvb=ZgaJg@mail.gmail.com>

On Wed, 2023-01-25 at 16:45 -0800, Andrii Nakryiko wrote:
> On Wed, Jan 25, 2023 at 1:39 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > Hi,
> > 
> > This series implements poke, trampoline, kfunc, mixing subprogs and
> > tailcalls, and fixes a number of tests on s390x.
> > 
> > The following failures still remain:
> > 
> > #52      core_read_macros:FAIL
> > Uses BPF_PROBE_READ(), shouldn't there be BPF_PROBE_READ_KERNEL()?
> 
> BPF_PROBE_READ(), similarly to BPF_CORE_READ() both use
> bpf_probe_read_kernel() internally, as it's most common use case. We
> have separate BPF_PROBE_READ_USER() and BPF_CORE_READ_USER() for when
> bpf_probe_read_user() has to be used.

At least purely from the code perspective, BPF_PROBE_READ() seems to
delegate to bpf_probe_read(). The following therefore helps with this
test:

--- a/tools/lib/bpf/bpf_core_read.h
+++ b/tools/lib/bpf/bpf_core_read.h
@@ -364,7 +364,7 @@ enum bpf_enum_value_kind {
 
 /* Non-CO-RE variant of BPF_CORE_READ_INTO() */
 #define BPF_PROBE_READ_INTO(dst, src, a, ...) ({                     
\
-       ___core_read(bpf_probe_read, bpf_probe_read,                  
\
+       ___core_read(bpf_probe_read_kernel, bpf_probe_read_kernel,    
\
                     dst, (src), a, ##__VA_ARGS__)                    
\
 })
 
@@ -400,7 +400,7 @@ enum bpf_enum_value_kind {
 
 /* Non-CO-RE variant of BPF_CORE_READ_STR_INTO() */
 #define BPF_PROBE_READ_STR_INTO(dst, src, a, ...) ({                 
\
-       ___core_read(bpf_probe_read_str, bpf_probe_read,              
\
+       ___core_read(bpf_probe_read_kernel_str, bpf_probe_read_kernel,
\
                     dst, (src), a, ##__VA_ARGS__)                    
\
 })

but I'm not sure if there are backward compatibility concerns, or if
libbpf is supposed to rewrite this when
!ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE.

> > 
> > #82      get_stack_raw_tp:FAIL
> > get_stack_print_output:FAIL:user_stack corrupted user stack
> > Known issue:
> > We cannot reliably unwind userspace on s390x without DWARF.
> 
> like in principle, or frame pointers (or some equivalent) needs to be
> configured for this to work?
> 
> Asking also in the context of [0], where s390x was excluded. If there
> is actually a way to enable frame pointer-based stack unwinding on
> s390x, would be nice to hear from an expert.
> 
>   [0] https://pagure.io/fesco/issue/2923

For DWARFless unwinding we have -mbackchain (not to be confused with
-fno-omit-frame-pointer, which we also have, but which just hurts
performance without providing tangible benefits).
-mbackchain has a few problems though:

- It's not atomic. Here is a typical prologue with -mbackchain:

        1: stmg    %r11,%r15,88(%r15)  # save non-volatile registers
        2: lgr     %r14,%r15           # %r14 = sp
        3: lay     %r15,-160(%r15)     # sp -= 160
        4: stg     %r14,0(%r15)        # *(void **)sp = %r14

  The invariant here is that *(void **)%r15 is always a pointer to the
  next frame. This means that if we unwind from (4), we are totally
  broken. This does not happen if we unwind from any other instruction,
  but still.

- Unwinding from (1)-(3) is not particularly good either. PSW points to
  the callee, but R15 points to the caller's frame.

- Unwinding leaf functions is like the previous problem, but worse:
  they often do not establish a stack frame at all, so PSW and R15 are
  out of sync for the entire duration of the call.

Therefore .eh_frame-based unwinding is preferred, since it covers all
these corner cases and is therefore reliable. From what I understand,
adding .eh_frame unwinding to the kernel is not desirable. In an
internal discussion we had another idea though: would it be possible to
have an eBPF program that does .eh_frame parsing and unwinding? I
understand that it can be technically challenging at the moment, but
the end result would not be exploitable by crafting malicious
.eh_frame sections, won't loop endlessly, will have good performance,
etc.

> > #101     ksyms_module:FAIL
> > address of kernel function bpf_testmod_test_mod_kfunc is out of
> > range
> > Known issue:
> > Kernel and modules are too far away from each other on s390x.
> > 
> > #167     sk_assign:FAIL
> > Uses legacy map definitions in 'maps' section.
> 
> Hm.. assuming new enough iproute2, new-style .maps definition should
> be supported, right? Let's convert map definition?

Yep, that worked. Will include in v2.

> > #190     stacktrace_build_id:FAIL
> > Known issue:
> > We cannot reliably unwind userspace on s390x without DWARF.
> > 
> > #211     test_bpffs:FAIL
> > iterators.bpf.c is broken on s390x, it uses BPF_CORE_READ(),
> > shouldn't
> > there be BPF_CORE_READ_KERNEL()?
> 
> BPF_CORE_READ() is that, so must be something else
> 
> > 
> > #218     test_profiler:FAIL
> > A lot of BPF_PROBE_READ() usages.
> 
> ditto, something else
> 
> > 
> > #281     xdp_metadata:FAIL
> > See patch 24.
> > 
> > #284     xdp_synproxy:FAIL
> > Verifier error:
> > ; value = bpf_tcp_raw_gen_syncookie_ipv4(hdr->ipv4, hdr->tcp,
> > 281: (79) r1 = *(u64 *)(r10 -80)      ; R1_w=pkt(off=14,r=74,imm=0)
> > R10=fp0
> > 282: (bf) r2 = r8                     ;
> > R2_w=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c))
> > R8=pkt(id=5,off=14,r=74,umax=60,var_off=(0x0; 0x3c))
> > 283: (79) r3 = *(u64 *)(r10 -104)     ;
> > R3_w=scalar(umax=60,var_off=(0x0; 0x3c)) R10=fp0
> > 284: (85) call bpf_tcp_raw_gen_syncookie_ipv4#204
> > invalid access to packet, off=14 size=0, R2(id=5,off=14,r=74)
> > R2 offset is outside of the packet
> 
> third arg to bpf_tcp_raw_gen_syncookie_ipv4() is defined as
> ARG_CONST_SIZE, so is required to be strictly positive, which doesn't
> seem to be "proven" to verifier. Please provided bigger log, it might
> another instance of needing to sprinkle barrier_var() around.
> 
> And maybe thinking about using ARG_CONST_SIZE_OR_ZERO instead of
> ARG_CONST_SIZE.

Here is the full log:

https://gist.github.com/iii-i/8e20100c33ab6f0dffb5e6e51d1330e8

Apparently we do indeed lose a constraint established by
if (hdr->tcp_len < sizeof(*hdr->tcp)). But the naive

--- a/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
+++ b/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
@@ -401,6 +401,7 @@ static __always_inline int tcp_dissect(void *data,
void *data_end,
        hdr->tcp_len = hdr->tcp->doff * 4;
        if (hdr->tcp_len < sizeof(*hdr->tcp))
                return XDP_DROP;
+       barrier_var(hdr->tcp_len);
 
        return XDP_TX;
 }
@@ -791,6 +792,7 @@ static __always_inline int syncookie_part2(void
*ctx, void *data, void *data_end
        hdr->tcp_len = hdr->tcp->doff * 4;
        if (hdr->tcp_len < sizeof(*hdr->tcp))
                return XDP_ABORTED;
+       barrier_var(hdr->tcp_len);
 
        return hdr->tcp->syn ? syncookie_handle_syn(hdr, ctx, data,
data_end, xdp) :
                               syncookie_handle_ack(hdr);

does not help.

> 
> > 
> > None of these seem to be due to the new changes.
> > 
> > Best regards,
> > Ilya
> > 
> > Ilya Leoshkevich (24):
> >   selftests/bpf: Fix liburandom_read.so linker error
> >   selftests/bpf: Fix symlink creation error
> >   selftests/bpf: Fix fexit_stress on s390x
> >   selftests/bpf: Fix trampoline_count on s390x
> >   selftests/bpf: Fix kfree_skb on s390x
> >   selftests/bpf: Set errno when urand_spawn() fails
> >   selftests/bpf: Fix decap_sanity_ns cleanup
> >   selftests/bpf: Fix verify_pkcs7_sig on s390x
> >   selftests/bpf: Fix xdp_do_redirect on s390x
> >   selftests/bpf: Fix cgrp_local_storage on s390x
> >   selftests/bpf: Check stack_mprotect() return value
> >   selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x
> >   selftests/bpf: Add a sign-extension test for kfuncs
> >   selftests/bpf: Fix test_lsm on s390x
> >   selftests/bpf: Fix test_xdp_adjust_tail_grow2 on s390x
> >   selftests/bpf: Fix vmlinux test on s390x
> >   libbpf: Read usdt arg spec with bpf_probe_read_kernel()
> >   s390/bpf: Fix a typo in a comment
> >   s390/bpf: Add expoline to tail calls
> >   s390/bpf: Implement bpf_arch_text_poke()
> >   bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag
> >   s390/bpf: Implement arch_prepare_bpf_trampoline()
> >   s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()
> >   s390/bpf: Implement bpf_jit_supports_kfunc_call()
> > 
> >  arch/s390/net/bpf_jit_comp.c                  | 708
> > +++++++++++++++++-
> >  include/linux/bpf.h                           |   8 +
> >  include/linux/btf.h                           |  15 +-
> >  kernel/bpf/btf.c                              |  16 +-
> >  net/bpf/test_run.c                            |   9 +
> >  tools/lib/bpf/usdt.bpf.h                      |  33 +-
> >  tools/testing/selftests/bpf/Makefile          |   7 +-
> >  tools/testing/selftests/bpf/netcnt_common.h   |   6 +-
> >  .../selftests/bpf/prog_tests/bpf_cookie.c     |   6 +-
> >  .../bpf/prog_tests/cgrp_local_storage.c       |   2 +-
> >  .../selftests/bpf/prog_tests/decap_sanity.c   |   2 +-
> >  .../selftests/bpf/prog_tests/fexit_stress.c   |   6 +-
> >  .../selftests/bpf/prog_tests/kfree_skb.c      |   2 +-
> >  .../selftests/bpf/prog_tests/kfunc_call.c     |   1 +
> >  .../selftests/bpf/prog_tests/test_lsm.c       |   3 +-
> >  .../bpf/prog_tests/trampoline_count.c         |   4 +
> >  tools/testing/selftests/bpf/prog_tests/usdt.c |   1 +
> >  .../bpf/prog_tests/verify_pkcs7_sig.c         |   9 +
> >  .../bpf/prog_tests/xdp_adjust_tail.c          |   7 +-
> >  .../bpf/prog_tests/xdp_do_redirect.c          |   4 +
> >  .../selftests/bpf/progs/kfunc_call_test.c     |  18 +
> >  tools/testing/selftests/bpf/progs/lsm.c       |   7 +-
> >  .../bpf/progs/test_verify_pkcs7_sig.c         |  12 +-
> >  .../selftests/bpf/progs/test_vmlinux.c        |   4 +-
> >  .../bpf/progs/test_xdp_adjust_tail_grow.c     |   8 +-
> >  25 files changed, 816 insertions(+), 82 deletions(-)
> > 
> > --
> > 2.39.1
> > 


  reply	other threads:[~2023-01-27 16:52 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-25 21:37 [PATCH bpf-next 00/24] Support bpf trampoline for s390x Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 01/24] selftests/bpf: Fix liburandom_read.so linker error Ilya Leoshkevich
2023-01-26  1:07   ` Andrii Nakryiko
2023-01-26 13:30     ` Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 02/24] selftests/bpf: Fix symlink creation error Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 03/24] selftests/bpf: Fix fexit_stress on s390x Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 04/24] selftests/bpf: Fix trampoline_count " Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 05/24] selftests/bpf: Fix kfree_skb " Ilya Leoshkevich
2023-01-25 21:37 ` [PATCH bpf-next 06/24] selftests/bpf: Set errno when urand_spawn() fails Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 07/24] selftests/bpf: Fix decap_sanity_ns cleanup Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 08/24] selftests/bpf: Fix verify_pkcs7_sig on s390x Ilya Leoshkevich
2023-01-26  1:06   ` Andrii Nakryiko
2023-01-27 12:36     ` Ilya Leoshkevich
2023-01-27 17:26       ` Andrii Nakryiko
2023-01-25 21:38 ` [PATCH bpf-next 09/24] selftests/bpf: Fix xdp_do_redirect " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 10/24] selftests/bpf: Fix cgrp_local_storage " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 11/24] selftests/bpf: Check stack_mprotect() return value Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 12/24] selftests/bpf: Increase SIZEOF_BPF_LOCAL_STORAGE_ELEM on s390x Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 13/24] selftests/bpf: Add a sign-extension test for kfuncs Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 14/24] selftests/bpf: Fix test_lsm on s390x Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 15/24] selftests/bpf: Fix test_xdp_adjust_tail_grow2 " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 16/24] selftests/bpf: Fix vmlinux test " Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 17/24] libbpf: Read usdt arg spec with bpf_probe_read_kernel() Ilya Leoshkevich
2023-01-26  0:26   ` Andrii Nakryiko
2023-01-26 11:41     ` Ilya Leoshkevich
2023-01-26 19:03       ` Andrii Nakryiko
2023-01-27 11:01         ` Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 18/24] s390/bpf: Fix a typo in a comment Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 19/24] s390/bpf: Add expoline to tail calls Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 20/24] s390/bpf: Implement bpf_arch_text_poke() Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 21/24] bpf: btf: Add BTF_FMODEL_SIGNED_ARG flag Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 22/24] s390/bpf: Implement arch_prepare_bpf_trampoline() Ilya Leoshkevich
2023-01-26  1:15   ` Andrii Nakryiko
2023-01-26 14:30     ` Ilya Leoshkevich
2023-01-26 19:06       ` Andrii Nakryiko
2023-01-27 11:15         ` Ilya Leoshkevich
2023-01-27 17:30           ` Andrii Nakryiko
2023-01-25 21:38 ` [PATCH bpf-next 23/24] s390/bpf: Implement bpf_jit_supports_subprog_tailcalls() Ilya Leoshkevich
2023-01-25 21:38 ` [PATCH bpf-next 24/24] s390/bpf: Implement bpf_jit_supports_kfunc_call() Ilya Leoshkevich
2023-01-26  1:28   ` Alexei Starovoitov
2023-01-27 11:36     ` Ilya Leoshkevich
2023-01-27 16:04       ` Alexei Starovoitov
2023-01-26  0:45 ` [PATCH bpf-next 00/24] Support bpf trampoline for s390x Andrii Nakryiko
2023-01-27 16:51   ` Ilya Leoshkevich [this message]
2023-01-27 17:24     ` Andrii Nakryiko
2023-01-27 22:50       ` Ilya Leoshkevich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8e9f72c6b43361a778e623085eb5b7aea7bd0fbd.camel@linux.ibm.com \
    --to=iii@linux.ibm.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox