From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Jakub Sitnicki <jakub@cloudflare.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	bpf@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian
Date: Mon, 28 Feb 2022 11:19:10 +0100
Message-ID: <87d79308a2ffce76a805cc1e5f60d28bebc74239.camel@linux.ibm.com>
In-Reply-To: <87y21whwwl.fsf@cloudflare.com>

On Sun, 2022-02-27 at 21:30 +0100, Jakub Sitnicki wrote:
> 
> On Sat, Feb 26, 2022 at 06:44 PM -08, Alexei Starovoitov wrote:
> > On Tue, Feb 22, 2022 at 07:25:58PM +0100, Ilya Leoshkevich wrote:
> > > On big-endian, the port is available in the second __u16, not the
> > > first
> > > one. Therefore, provide a big-endian-specific definition that
> > > reflects
> > > that. Also, define remote_port_compat in order to have nicer
> > > architecture-agnostic code in the verifier and in tests.
> > > 
> > > Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct
> > > bpf_sk_lookup 16-bit wide")
> > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > ---
> > >  include/uapi/linux/bpf.h       | 17 +++++++++++++++--
> > >  net/core/filter.c              |  5 ++---
> > >  tools/include/uapi/linux/bpf.h | 17 +++++++++++++++--
> > >  3 files changed, 32 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index afe3d0d7f5f2..7b0e5efa58e0 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -10,6 +10,7 @@
> > >  
> > >  #include <linux/types.h>
> > >  #include <linux/bpf_common.h>
> > > +#include <asm/byteorder.h>
> > >  
> > >  /* Extended instruction set based on top of classic BPF */
> > >  
> > > @@ -6453,8 +6454,20 @@ struct bpf_sk_lookup {
> > >         __u32 protocol;         /* IP protocol (IPPROTO_TCP,
> > > IPPROTO_UDP) */
> > >         __u32 remote_ip4;       /* Network byte order */
> > >         __u32 remote_ip6[4];    /* Network byte order */
> > > -       __be16 remote_port;     /* Network byte order */
> > > -       __u16 :16;              /* Zero padding */
> > > +       union {
> > > +               struct {
> > > +#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN :
> > > defined(__LITTLE_ENDIAN)
> > > +                       __be16 remote_port;     /* Network byte
> > > order */
> > > +                       __u16 :16;              /* Zero padding
> > > */
> > > +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN :
> > > defined(__BIG_ENDIAN)
> > > +                       __u16 :16;              /* Zero padding
> > > */
> > > +                       __be16 remote_port;     /* Network byte
> > > order */
> > > +#else
> > > +#error unspecified endianness
> > > +#endif
> > > +               };
> > > +               __u32 remote_port_compat;
> > 
> > Sorry this hack is not an option.
> > Don't have any suggestions at this point. Pls come up with
> > something else.
> 
> I think we can keep the bpf_sk_lookup definition as is, if we leave
> the
> 4-byte load from remote_port offset quirky behavior on little-endian.
> 
> Please take a look at the test fix I've posted for 4-byte load from
> bpf_sock dst_port that works for me on x86_64 and s390. It is exactly
> the same case as we're dealing with here:
> 
> https://lore.kernel.org/bpf/20220227202757.519015-4-jakub@cloudflare.com/T/#u
> 

What about 2-byte loads?

static __noinline bool sk_dst_port__load_half(struct bpf_sock *sk)
{
	__u16 *half = (__u16 *)&sk->dst_port;
	return half[0] == bpf_htons(0xcafe);
}

requires "ca fe ?? ??" in memory on BE, while

static __noinline bool sk_dst_port__load_word(struct bpf_sock *sk)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	const __u8 SHIFT = 16;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	const __u8 SHIFT = 0;
#else
#error "Unrecognized __BYTE_ORDER__"
#endif
	__u32 *word = (__u32 *)&sk->dst_port;
	return word[0] == bpf_htonl(0xcafe << SHIFT);
}

requires "00 00 ca fe". This is inconsistent. Furthermore, one
cannot observe it with bpf_sock, because access to the padding after
dst_port is rejected:

	case offsetofend(struct bpf_sock, dst_port) ...
	     offsetof(struct bpf_sock, dst_ip4) - 1:
		return false;

With sk_lookup, however, the inconsistency is visible: loading the most
significant half of the port produces a non-zero value! So it is not
simply a quirk of the 4-byte load; it is a mutual inconsistency between
LSW loads, MSW loads and 4-byte loads.
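
To make the mismatch concrete, here is a minimal userspace sketch (not
kernel or selftest code) that models the big-endian case only. The names
load_be, be_half_load_matches, be_word_load_matches and check_layouts
are hypothetical, made up for illustration. They mimic what the two test
functions above compare on BE, where bpf_htons/bpf_htonl are no-ops and
SHIFT is 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read n bytes as a big-endian integer, independent of the host. */
static uint32_t load_be(const uint8_t *p, size_t n)
{
	uint32_t v = 0;

	for (size_t i = 0; i < n; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Models half[0] == bpf_htons(0xcafe) on BE: a 2-byte load at offset 0. */
static bool be_half_load_matches(const uint8_t field[4])
{
	return load_be(field, 2) == 0xcafe;
}

/* Models word[0] == bpf_htonl(0xcafe << 0) on BE: a 4-byte load. */
static bool be_word_load_matches(const uint8_t field[4])
{
	return load_be(field, 4) == 0x0000cafe;
}

/* Each of the two layouts satisfies exactly one of the comparisons. */
static void check_layouts(void)
{
	const uint8_t port_first[4] = { 0xca, 0xfe, 0x00, 0x00 };
	const uint8_t port_last[4]  = { 0x00, 0x00, 0xca, 0xfe };

	assert(be_half_load_matches(port_first));
	assert(!be_word_load_matches(port_first));
	assert(!be_half_load_matches(port_last));
	assert(be_word_load_matches(port_last));
}
```

Calling check_layouts() passes silently. In this model, any layout that
satisfies the 2-byte comparison has 0xca in byte 0, so its 4-byte value
cannot equal 0x0000cafe; no single layout satisfies both.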

One might argue that we can live with that, especially since all the
user-relevant tests pass - here I can only say that an inconsistency on
such a fundamental level makes me nervous.

In order to resolve this inconsistency I've implemented patch 1 of this
series. With that, "sk->dst_port == bpf_htons(0xcafe)" starts to fail,
and that's where one needs something like this patch.

Best regards,
Ilya
