public inbox for bpf@vger.kernel.org
* [PATCH bpf-next] selftests/bpf: Future-proof connect4_prog.c
@ 2023-09-07 21:00 Stanislav Fomichev
  2023-09-08 23:42 ` Andrii Nakryiko
  0 siblings, 1 reply; 3+ messages in thread
From: Stanislav Fomichev @ 2023-09-07 21:00 UTC
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, haoluo, jolsa, Nick Desaulniers

With the new internal clang version I see the following optimization
that makes connect4 program unverifiable.

The following code:

	int do_bind()
	{
		if (bpf_bind() != 0)
			return 0;
		return 1;
	}
	int connect_v4_prog()
	{
		return do_bind() ? 1 : 0;
	}

Becomes:

	int do_bind()
	{
		if (bpf_bind() != 0)
			return 0;
		return 1;
	}
	int connect_v4_prog()
	{
		return do_bind();
	}

IOW, looks like clang is able to see that do_bind returns only 0 and
1 and the extra branch around 'return do_bind' is not needed.
This, however, seems to break the verifier, which assumes that
bpf2bpf calls can return 0-0xffffffff.
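
To illustrate the transformation, here is a minimal user-space sketch in
plain C (not BPF, and not the selftest itself). fake_bind() is a
hypothetical stand-in for bpf_bind(), and the two callers mirror the
before/after shapes shown above. Because do_bind() only ever returns 0
or 1, both callers compute the same value, which is what lets a compiler
that can see do_bind()'s body drop the '? 1 : 0'.

```c
#include <assert.h>

/* Hypothetical stand-in for bpf_bind(): 0 on success, negative on error. */
int fake_bind(int fail)
{
	return fail ? -1 : 0;
}

/* Same shape as do_bind() in the message: the result is always 0 or 1. */
int do_bind(int fail)
{
	if (fake_bind(fail) != 0)
		return 0;
	return 1;
}

/* Caller as written in the source, with the explicit clamp to 0/1. */
int caller_with_branch(int fail)
{
	return do_bind(fail) ? 1 : 0;
}

/* Caller after the optimization: the clamp is gone. */
int caller_without_branch(int fail)
{
	return do_bind(fail);
}
```

The two callers agree for every input, so the rewrite is valid C. The
problem described here is only that the verifier does not know the
callee's return range and therefore cannot prove the second form
returns 0 or 1.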

Note, I can produce those programs only with the internal fork of clang.
The latest one from git still produces correct bytecode. It might be
some options/optimizations that we enable and that are still
disabled for general upstream users, not sure. I've decided
to send this patch out anyway since it seems like a correct optimization
the compiler might do.

So, to be future-proof, reshape the code a bit to return the bpf_bind
result directly. This gives clang no hint about the return value
and forces it to generate the branch (now '? 0 : 1')
in the caller.

Good program:

0000000000000000 <do_bind>:
       0:       b4 02 00 00 7f 00 00 04 w2 = 0x400007f
       1:       63 2a f4 ff 00 00 00 00 *(u32 *)(r10 - 0xc) = r2
       2:       b4 02 00 00 02 00 00 00 w2 = 0x2
       3:       63 2a f0 ff 00 00 00 00 *(u32 *)(r10 - 0x10) = r2
       4:       b7 02 00 00 00 00 00 00 r2 = 0x0
       5:       63 2a fc ff 00 00 00 00 *(u32 *)(r10 - 0x4) = r2
       6:       63 2a f8 ff 00 00 00 00 *(u32 *)(r10 - 0x8) = r2
       7:       bf a2 00 00 00 00 00 00 r2 = r10
       8:       07 02 00 00 f0 ff ff ff r2 += -0x10
       9:       b4 03 00 00 10 00 00 00 w3 = 0x10
      10:       85 00 00 00 40 00 00 00 call 0x40
      11:       bf 01 00 00 00 00 00 00 r1 = r0
      12:       b4 00 00 00 01 00 00 00 w0 = 0x1
      13:       15 01 01 00 00 00 00 00 if r1 == 0x0 goto +0x1 <LBB0_2>
      14:       b4 00 00 00 00 00 00 00 w0 = 0x0

00000000000001b0 <LBB1_30>:
      54:       bc 60 00 00 00 00 00 00 w0 = w6
      55:       95 00 00 00 00 00 00 00 exit

0000000000000578 <LBB1_28>:
     ...
     180:       85 10 00 00 ff ff ff ff call -0x1
     181:       b4 06 00 00 01 00 00 00 w6 = 0x1
     182:       56 00 7f ff 00 00 00 00 if w0 != 0x0 goto -0x81 <LBB1_30>
     183:       b4 06 00 00 00 00 00 00 w6 = 0x0
     184:       05 00 7d ff 00 00 00 00 goto -0x83 <LBB1_30>

Bad program:
0000000000000000 <do_bind>:
       0:       b4 02 00 00 7f 00 00 04 w2 = 0x400007f
       1:       63 2a f4 ff 00 00 00 00 *(u32 *)(r10 - 0xc) = r2
       2:       b4 02 00 00 02 00 00 00 w2 = 0x2
       3:       63 2a f0 ff 00 00 00 00 *(u32 *)(r10 - 0x10) = r2
       4:       b7 02 00 00 00 00 00 00 r2 = 0x0
       5:       63 2a fc ff 00 00 00 00 *(u32 *)(r10 - 0x4) = r2
       6:       63 2a f8 ff 00 00 00 00 *(u32 *)(r10 - 0x8) = r2
       7:       bf a2 00 00 00 00 00 00 r2 = r10
       8:       07 02 00 00 f0 ff ff ff r2 += -0x10
       9:       b4 03 00 00 10 00 00 00 w3 = 0x10
      10:       85 00 00 00 40 00 00 00 call 0x40
      11:       bf 01 00 00 00 00 00 00 r1 = r0
      12:       b4 00 00 00 01 00 00 00 w0 = 0x1
      13:       15 01 01 00 00 00 00 00 if r1 == 0x0 goto +0x1 <LBB0_2>
      14:       b4 00 00 00 00 00 00 00 w0 = 0x0

00000000000001b0 <LBB1_3>:
      54:       bc 60 00 00 00 00 00 00 w0 = w6
      55:       95 00 00 00 00 00 00 00 exit

0000000000000578 <LBB1_28>:
     ...
     180:       85 10 00 00 ff ff ff ff call -0x1
     181:       bc 06 00 00 00 00 00 00 w6 = w0
     182:       05 00 7f ff 00 00 00 00 goto -0x81 <LBB1_3>
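
The branch targets in both listings can be checked by hand. In the BPF
instruction encoding, bytes 2-3 hold a signed 16-bit offset
(little-endian in these dumps), and a jump lands at the index of the
next instruction plus that offset. The helpers below are a small
user-space sketch; their names are made up for illustration and do not
come from any kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the signed 16-bit offset from bytes 2-3 of a raw instruction. */
int bpf_insn_off(const uint8_t insn[8])
{
	int v = insn[2] | (insn[3] << 8);

	/* Sign-extend manually to avoid relying on narrowing conversions. */
	if (v >= 0x8000)
		v -= 0x10000;
	return v;
}

/* A jump at instruction index idx lands at idx + 1 + offset. */
int jump_target(int idx, const uint8_t insn[8])
{
	return idx + 1 + bpf_insn_off(insn);
}
```

With the bytes from the listings, instruction 182 of the good program
(56 00 7f ff ...) has offset -0x81 and resolves to index 54, and
instruction 182 of the bad program (05 00 7f ff ...) resolves to the
same index: the 'w0 = w6' right before 'exit'. So in the bad program
the unclamped r0 from the call flows straight into the return value.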

Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/testing/selftests/bpf/progs/connect4_prog.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/connect4_prog.c b/tools/testing/selftests/bpf/progs/connect4_prog.c
index 7ef49ec04838..b7fc46a0787b 100644
--- a/tools/testing/selftests/bpf/progs/connect4_prog.c
+++ b/tools/testing/selftests/bpf/progs/connect4_prog.c
@@ -41,10 +41,7 @@ int do_bind(struct bpf_sock_addr *ctx)
 	sa.sin_port = bpf_htons(0);
 	sa.sin_addr.s_addr = bpf_htonl(SRC_REWRITE_IP4);
 
-	if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)) != 0)
-		return 0;
-
-	return 1;
+	return bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa));
 }
 
 static __inline int verify_cc(struct bpf_sock_addr *ctx,
@@ -194,7 +191,7 @@ int connect_v4_prog(struct bpf_sock_addr *ctx)
 	ctx->user_ip4 = bpf_htonl(DST_REWRITE_IP4);
 	ctx->user_port = bpf_htons(DST_REWRITE_PORT4);
 
-	return do_bind(ctx) ? 1 : 0;
+	return do_bind(ctx) ? 0 : 1;
 }
 
 char _license[] SEC("license") = "GPL";
-- 
2.42.0.283.g2d96d420d3-goog



* Re: [PATCH bpf-next] selftests/bpf: Future-proof connect4_prog.c
  2023-09-07 21:00 [PATCH bpf-next] selftests/bpf: Future-proof connect4_prog.c Stanislav Fomichev
@ 2023-09-08 23:42 ` Andrii Nakryiko
  2023-09-09  0:28   ` Stanislav Fomichev
  0 siblings, 1 reply; 3+ messages in thread
From: Andrii Nakryiko @ 2023-09-08 23:42 UTC
  To: Stanislav Fomichev
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, haoluo, jolsa, Nick Desaulniers

On Thu, Sep 7, 2023 at 2:00 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> With the new internal clang version I see the following optimization
> that makes connect4 program unverifiable.
>
> The following code:
>
>         int do_bind()

Yonghong added __weak to do_bind a few months ago ([0]), which makes
it illegal for the compiler to assume 0 or 1 return. Can you please
double check that this is the issue with __weak?

  [0] https://lore.kernel.org/bpf/20230310012410.2920570-1-yhs@fb.com/




* Re: [PATCH bpf-next] selftests/bpf: Future-proof connect4_prog.c
  2023-09-08 23:42 ` Andrii Nakryiko
@ 2023-09-09  0:28   ` Stanislav Fomichev
  0 siblings, 0 replies; 3+ messages in thread
From: Stanislav Fomichev @ 2023-09-09  0:28 UTC
  To: Andrii Nakryiko
  Cc: bpf, ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, haoluo, jolsa, Nick Desaulniers

On Fri, Sep 8, 2023 at 4:42 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 7, 2023 at 2:00 PM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > With the new internal clang version I see the following optimization
> > that makes connect4 program unverifiable.
> >
> > The following code:
> >
> >         int do_bind()
>
> Yonghong added __weak to do_bind a few months ago ([0]), which makes
> it illegal for the compiler to assume 0 or 1 return. Can you please
> double check that this is the issue with __weak?
>
>   [0] https://lore.kernel.org/bpf/20230310012410.2920570-1-yhs@fb.com/

It does indeed fix it for me, thank you! Mystery solved on "why I
can't repro this on the upstream" :-) I've completely missed that
extra __weak..


