From: Ilya Leoshkevich <iii@linux.ibm.com>
To: Jakub Sitnicki <jakub@cloudflare.com>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
bpf@vger.kernel.org, Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian
Date: Mon, 28 Feb 2022 11:19:10 +0100
Message-ID: <87d79308a2ffce76a805cc1e5f60d28bebc74239.camel@linux.ibm.com>
In-Reply-To: <87y21whwwl.fsf@cloudflare.com>

On Sun, 2022-02-27 at 21:30 +0100, Jakub Sitnicki wrote:
>
> On Sat, Feb 26, 2022 at 06:44 PM -08, Alexei Starovoitov wrote:
> > On Tue, Feb 22, 2022 at 07:25:58PM +0100, Ilya Leoshkevich wrote:
> > > On big-endian, the port is available in the second __u16, not the
> > > first
> > > one. Therefore, provide a big-endian-specific definition that
> > > reflects
> > > that. Also, define remote_port_compat in order to have nicer
> > > architecture-agnostic code in the verifier and in tests.
> > >
> > > Fixes: 9a69e2b385f4 ("bpf: Make remote_port field in struct
> > > bpf_sk_lookup 16-bit wide")
> > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > ---
> > > include/uapi/linux/bpf.h | 17 +++++++++++++++--
> > > net/core/filter.c | 5 ++---
> > > tools/include/uapi/linux/bpf.h | 17 +++++++++++++++--
> > > 3 files changed, 32 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index afe3d0d7f5f2..7b0e5efa58e0 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -10,6 +10,7 @@
> > >
> > > #include <linux/types.h>
> > > #include <linux/bpf_common.h>
> > > +#include <asm/byteorder.h>
> > >
> > > /* Extended instruction set based on top of classic BPF */
> > >
> > > @@ -6453,8 +6454,20 @@ struct bpf_sk_lookup {
> > > __u32 protocol; /* IP protocol (IPPROTO_TCP,
> > > IPPROTO_UDP) */
> > > __u32 remote_ip4; /* Network byte order */
> > > __u32 remote_ip6[4]; /* Network byte order */
> > > - __be16 remote_port; /* Network byte order */
> > > - __u16 :16; /* Zero padding */
> > > + union {
> > > + struct {
> > > +#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN :
> > > defined(__LITTLE_ENDIAN)
> > > + __be16 remote_port; /* Network byte
> > > order */
> > > + __u16 :16; /* Zero padding
> > > */
> > > +#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN :
> > > defined(__BIG_ENDIAN)
> > > + __u16 :16; /* Zero padding
> > > */
> > > + __be16 remote_port; /* Network byte
> > > order */
> > > +#else
> > > +#error unspecified endianness
> > > +#endif
> > > + };
> > > + __u32 remote_port_compat;
> >
> > Sorry this hack is not an option.
> > Don't have any suggestions at this point. Pls come up with
> > something else.
>
> I think we can keep the bpf_sk_lookup definition as is, if we leave
> the
> 4-byte load from remote_port offset quirky behavior on little-endian.
>
> Please take a look at the test fix I've posted for 4-byte load from
> bpf_sock dst_port that works for me on x86_64 and s390. It is exactly
> the same case as we're dealing with here:
>
> https://lore.kernel.org/bpf/20220227202757.519015-4-jakub@cloudflare.com/T/#u
>

What about 2-byte loads?

static __noinline bool sk_dst_port__load_half(struct bpf_sock *sk)
{
	__u16 *half = (__u16 *)&sk->dst_port;

	return half[0] == bpf_htons(0xcafe);
}

requires "ca fe ?? ??" in memory on BE, while

static __noinline bool sk_dst_port__load_word(struct bpf_sock *sk)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	const __u8 SHIFT = 16;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	const __u8 SHIFT = 0;
#else
#error "Unrecognized __BYTE_ORDER__"
#endif
	__u32 *word = (__u32 *)&sk->dst_port;

	return word[0] == bpf_htonl(0xcafe << SHIFT);
}

requires "00 00 ca fe". This is inconsistent. Furthermore, one cannot
observe it with bpf_sock, because access to the padding after dst_port
is rejected:

	case offsetofend(struct bpf_sock, dst_port) ...
	     offsetof(struct bpf_sock, dst_ip4) - 1:
		return false;

With sk_lookup, however, it is observable: loading the most significant
half of the port produces a non-zero value! So it's not simply a quirk
of the 4-byte load; it's a mutual inconsistency between LSW loads, MSW
loads and 4-byte loads.
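
To make the mismatch concrete, here is a minimal user-space sketch. It is
plain C, not BPF, and not part of this series. It computes the byte pattern
each of the two checks above expects at the dst_port offset. The helper
names (put_be16, put_be32, half_check_matches, word_check_matches,
run_self_check) are hypothetical and exist only for illustration; the
sketch relies on the fact that bpf_htons()/bpf_htonl() always yield
network-order bytes in memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: store a value as network-order (big-endian) bytes. */
static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)(v & 0xff);
}

/* Models "half[0] == bpf_htons(0xcafe)": the first two bytes must be ca fe. */
bool half_check_matches(const uint8_t field[4])
{
	uint8_t want[2];

	put_be16(want, 0xcafe);
	return memcmp(field, want, sizeof(want)) == 0;
}

/*
 * Models "word[0] == bpf_htonl(0xcafe << SHIFT)".
 * The test above uses SHIFT = 16 on little-endian and SHIFT = 0 on
 * big-endian.
 */
bool word_check_matches(const uint8_t field[4], unsigned int shift)
{
	uint8_t want[4];

	put_be32(want, (uint32_t)0xcafe << shift);
	return memcmp(field, want, sizeof(want)) == 0;
}

/* Walks through both candidate memory layouts. */
void run_self_check(void)
{
	const uint8_t port_first[4]  = { 0xca, 0xfe, 0x00, 0x00 };
	const uint8_t port_second[4] = { 0x00, 0x00, 0xca, 0xfe };

	/* "ca fe 00 00" satisfies the 2-byte check and the SHIFT = 16 check. */
	assert(half_check_matches(port_first));
	assert(word_check_matches(port_first, 16));
	/* The SHIFT = 0 (big-endian) check wants "00 00 ca fe" instead... */
	assert(!word_check_matches(port_first, 0));
	assert(word_check_matches(port_second, 0));
	/* ...which the 2-byte check rejects. */
	assert(!half_check_matches(port_second));
}
```

With SHIFT = 0, no single 4-byte layout satisfies both checks, which is the
inconsistency described above. Calling run_self_check() from a small main()
exercises both layouts.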

One might argue that we can live with that, especially since all the
user-relevant tests pass. Here I can only say that an inconsistency on
such a fundamental level makes me nervous.

To resolve this inconsistency I've implemented patch 1 of this series.
With that, "sk->dst_port == bpf_htons(0xcafe)" starts to fail, and
that's where one needs something like this patch.

Best regards,
Ilya

Thread overview: 18+ messages
2022-02-22 18:25 [PATCH RFC bpf-next 0/3] bpf_sk_lookup.remote_port fixes Ilya Leoshkevich
2022-02-22 18:25 ` [PATCH RFC bpf-next 1/3] bpf: Fix certain narrow loads with offsets Ilya Leoshkevich
2022-03-08 15:01 ` Jakub Sitnicki
2022-03-08 23:58 ` Ilya Leoshkevich
2022-03-09 8:36 ` Jakub Sitnicki
2022-03-09 12:34 ` Ilya Leoshkevich
2022-03-10 22:57 ` Jakub Sitnicki
2022-03-14 17:35 ` Jakub Sitnicki
2022-03-14 18:25 ` Ilya Leoshkevich
2022-03-14 20:57 ` Jakub Sitnicki
2022-02-22 18:25 ` [PATCH RFC bpf-next 2/3] bpf: Fix bpf_sk_lookup.remote_port on big-endian Ilya Leoshkevich
2022-02-27 2:44 ` Alexei Starovoitov
2022-02-27 20:30 ` Jakub Sitnicki
2022-02-28 10:19 ` Ilya Leoshkevich [this message]
2022-02-28 13:26 ` Jakub Sitnicki
2022-03-01 0:39 ` Ilya Leoshkevich
2022-03-01 0:40 ` Ilya Leoshkevich
2022-02-22 18:25 ` [PATCH RFC bpf-next 3/3] selftests/bpf: Adapt bpf_sk_lookup.remote_port loads Ilya Leoshkevich