Re: [PATCH bpf-next] selftests/bpf: Fix implementation-defined behavior in sk_lookup test

BPF List
 help / color / mirror / Atom feed

From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Jakub Sitnicki <jakub@cloudflare.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	kernel-team@cloudflare.com,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>
Subject: Re: [PATCH bpf-next] selftests/bpf: Fix implementation-defined behavior in sk_lookup test
Date: Tue, 22 Feb 2022 22:51:21 +0100	[thread overview]
Message-ID: <8a88c49e2d4f5715776c33a6cc83cc1985f4e106.camel@linux.ibm.com> (raw)
In-Reply-To: <87bkyykapm.fsf@cloudflare.com>

On Tue, 2022-02-22 at 19:24 +0100, Jakub Sitnicki wrote:
> On Tue, Feb 22, 2022 at 06:42 PM +01, Ilya Leoshkevich wrote:
> > On Tue, 2022-02-22 at 15:53 +0100, Jakub Sitnicki wrote:
> > > On Tue, Feb 22, 2022 at 03:22 AM +01, Ilya Leoshkevich wrote:
> > > > On Tue, 2022-02-22 at 01:43 +0100, Ilya Leoshkevich wrote:
> > > > > On Mon, 2022-02-21 at 22:39 +0100, Ilya Leoshkevich wrote:
> > > > > > On Mon, 2022-02-21 at 19:03 +0100, Jakub Sitnicki wrote:
> > > > > > > Shifting 16-bit type by 16 bits is implementation-defined
> > > > > > > for
> > > > > > > BPF
> > > > > > > programs.
> > > > > > > Don't rely on it in case it is causing the test failures
> > > > > > > we
> > > > > > > are
> > > > > > > seeing on
> > > > > > > s390x z15 target.
> > > > > > > 
> > > > > > > Fixes: 2ed0dc5937d3 ("selftests/bpf: Cover 4-byte load
> > > > > > > from
> > > > > > > remote_port in bpf_sk_lookup")
> > > > > > > Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
> > > > > > > Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
> > > > > > > ---
> > > > > > > 
> > > > > > > I don't have a dev env for s390x/z15 set up yet, so can't
> > > > > > > definitely
> > > > > > > confirm the fix.
> > > > > > > That said, it seems worth fixing either way.
> > > > > > > 
> > > > > > >  tools/testing/selftests/bpf/progs/test_sk_lookup.c | 3
> > > > > > > ++-
> > > > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git
> > > > > > > a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > > > b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > > > index bf5b7caefdd0..7d47276a8964 100644
> > > > > > > --- a/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > > > +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup.c
> > > > > > > @@ -65,6 +65,7 @@ static const __u32 KEY_SERVER_A =
> > > > > > > SERVER_A;
> > > > > > >  static const __u32 KEY_SERVER_B = SERVER_B;
> > > > > > >  
> > > > > > >  static const __u16 SRC_PORT = bpf_htons(8008);
> > > > > > > +static const __u32 SRC_PORT_U32 = bpf_htonl(8008U <<
> > > > > > > 16);
> > > > > > >  static const __u32 SRC_IP4 = IP4(127, 0, 0, 2);
> > > > > > >  static const __u32 SRC_IP6[] = IP6(0xfd000000, 0x0, 0x0,
> > > > > > > 0x00000002);
> > > > > > >  
> > > > > > > @@ -421,7 +422,7 @@ int ctx_narrow_access(struct
> > > > > > > bpf_sk_lookup
> > > > > > > *ctx)
> > > > > > >  
> > > > > > >         /* Load from remote_port field with zero padding
> > > > > > > (backward
> > > > > > > compatibility) */
> > > > > > >         val_u32 = *(__u32 *)&ctx->remote_port;
> > > > > > > -       if (val_u32 != bpf_htonl(bpf_ntohs(SRC_PORT) <<
> > > > > > > 16))
> > > > > > > +       if (val_u32 != SRC_PORT_U32)
> > > > > > >                 return SK_DROP;
> > > > > > >  
> > > > > > >         /* Narrow loads from local_port field. Expect
> > > > > > > DST_PORT.
> > > > > > > */
> > > > > > 
> > > > > > Unfortunately this doesn't help with the s390 problem.
> > > > > > I'll try to debug this.
> > > > > 
> > > > > I have to admit I have a hard time wrapping my head around
> > > > > the
> > > > > requirements here.
> > > > > 
> > > > > Based on the pre-9a69e2b385f4 code, do I understand correctly
> > > > > that
> > > > > for the following input
> > > > > 
> > > > > Port:     0x1f48
> > > > > SRC_PORT: 0x481f
> > > > > 
> > > > > we expect the following results for different kinds of loads:
> > > > > 
> > > > > Size   Offset  LE      BE
> > > > > BPF_B  0       0x1f    0
> > > > > BPF_B  1       0x48    0
> > > > > BPF_B  2       0       0x48
> > > > > BPF_B  3       0       0x1f
> > > > > BPF_H  0       0x481f  0
> > > > > BPF_H  1       0       0x481f
> > > > > BPF_W  0       0x481f  0x481f
> > > > > 
> > > > > and this is guaranteed by the struct bpf_sk_lookup ABI?
> > > > > Because
> > > > > then
> > > > > it
> > > > > looks as if 9a69e2b385f4 breaks it on big-endian as follows:
> > > > > 
> > > > > Size   Offset  BE-9a69e2b385f4
> > > > > BPF_B  0       0x48
> > > > > BPF_B  1       0x1f
> > > > > BPF_B  2       0
> > > > > BPF_B  3       0
> > > > > BPF_H  0       0x481f
> > > > > BPF_H  1       0
> > > > > BPF_W  0       0x481f0000
> > > > 
> > > > Sorry, I worded this incorrectly: 9a69e2b385f4 did not change
> > > > the
> > > > kernel behavior, the ABI is not broken and the old compiled
> > > > code
> > > > should
> > > > continue to work.
> > > > What the second table really shows are what the results should
> > > > be
> > > > according to the 9a69e2b385f4 struct bpf_sk_lookup definition,
> > > > which I
> > > > still think is broken on big-endian and needs to be adjusted to
> > > > match
> > > > the ABI.
> > > > 
> > > > I noticed one other strange thing in the meantime: loads from
> > > > *(__u32 *)&ctx->remote_port, *(__u16 *)&ctx->remote_port and
> > > > *((__u16 *)&ctx->remote_port + 1) all produce 8008 on s390,
> > > > which
> > > > is
> > > > clearly inconsistent. It looks as if convert_ctx_accesses()
> > > > needs
> > > > to be
> > > > adjusted to handle combinations like ctx_field_size == 4 &&
> > > > size ==
> > > > 2
> > > > && target_size == 2. I will continue with this tomorrow.
> > > > 
> > > > > Or is the old behavior a bug and this new one is desirable?
> > > > > 9a69e2b385f4 has no Fixes: tag, so I assume that's the former
> > > > > :-(
> > > > > 
> > > > > In which case, would it make sense to fix it by swapping
> > > > > remote_port
> > > > > and :16 in bpf_sk_lookup on big-endian?
> > > 
> > > Thanks for looking into it.
> > > 
> > > When it comes to requirements, my intention was to keep the same
> > > behavior as before the split up of the remote_port field in
> > > 9a69e2b385f4
> > > ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit
> > > wide").
> > > 
> > > 9a69e2b385f4 was supposed to be a formality, after a similar
> > > change
> > > in
> > > 4421a582718a ("bpf: Make dst_port field in struct bpf_sock 16-bit
> > > wide"), which went in earlier.
> > > 
> > > In 4421a582718a I've provided a bit more context. I understand
> > > that
> > > the
> > > remote_port value, even before the type changed from u32 to u16,
> > > appeared to the BPF program as if laid out in memory like so:
> > > 
> > >       offsetof(struct bpf_sk_lookup, remote_port) +0  <port MSB>
> > >                                                   +1  <port LSB>
> > >                                                   +2  0x00
> > >                                                   +3  0x00
> > > 
> > > Translating it to your handy table format, I expect should result
> > > in
> > > loads as so if port is 8008 = 0x1f48:
> > > 
> > >       Size   Offset  LE      BE
> > >       BPF_B  0       0x1f    0x1f
> > >       BPF_B  1       0x48    0x48
> > >       BPF_B  2       0       0
> > >       BPF_B  3       0       0
> > >       BPF_H  0       0x481f  0x1f48
> > >       BPF_H  1       0       0
> > >       BPF_W  0       0x481f  0x1f480000
> > 
> > Hmm, I think for big-endian the layout is different.
> > If we look at test_sk_lookup.c from 9a69e2b385f4^:
> > 
> >         /* Narrow loads from remote_port field. Expect SRC_PORT. */
> >         if (LSB(ctx->remote_port, 0) != ((SRC_PORT >> 0) & 0xff) ||
> >             LSB(ctx->remote_port, 1) != ((SRC_PORT >> 8) & 0xff) ||
> >             LSB(ctx->remote_port, 2) != 0 || LSB(ctx->remote_port,
> > 3)
> > != 0)
> >                 return SK_DROP;
> > 
> > LSB() on little-endian is just byte indexing, so it's indeed 
> > 1f,48,00,00. However, on big-endian it's indexing from the end, so
> > it's 00,00,48,1f.
> 
> I understood that LSB() is indexing from the end on BE because
> SRC_PORT
> constant value differs on LE (= 0x481f) and BE (= 0x1f48) platforms,
> so
> 
>                  LE  BE
>   SRC_PORT >> 0  1f  48
>   SRC_PORT >> 8  48  1f
> 
> So on LE we first compare remote_port MSB, then LSB.
> While on BE we start with remote_port LSB, then MSB.
> 
> But, now that you have pointed it out, I notice that
> sizeof(remote_port)
> has changed and from 4 to 2, and I can't see how LSB(…, 3) and LSB(…,
> 4)
> loads can keep working on big-endian.

Oh, right - it should be 00,00,1f,48 on big-endian.
Out-of-bounds LSB is indeed an issue. I've posted my current
thoughts as an RFC series [1], this one is addressed in patch 3.

[1]
https://lore.kernel.org/bpf/20220222182559.2865596-1-iii@linux.ibm.com/

> 
> [...]

next prev parent reply	other threads:[~2022-02-22 21:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-21 18:03 [PATCH bpf-next] selftests/bpf: Fix implementation-defined behavior in sk_lookup test Jakub Sitnicki
2022-02-21 21:39 ` Ilya Leoshkevich
2022-02-22  0:43   ` Ilya Leoshkevich
2022-02-22  2:22     ` Ilya Leoshkevich
2022-02-22 14:53       ` Jakub Sitnicki
2022-02-22 17:42         ` Ilya Leoshkevich
2022-02-22 18:24           ` Jakub Sitnicki
2022-02-22 21:51             ` Ilya Leoshkevich [this message]
2022-02-25 10:01               ` Jakub Sitnicki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8a88c49e2d4f5715776c33a6cc83cc1985f4e106.camel@linux.ibm.com \
    --to=iii@linux.ibm.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox