Re: [PATCH bpf-next 07/11] bpf: Add helper to retrieve socket in BPF


From: Joe Stringer <joe@wand.net.nz>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Joe Stringer <joe@wand.net.nz>,
	daniel@iogearbox.net, netdev <netdev@vger.kernel.org>,
	ast@kernel.org, john fastabend <john.fastabend@gmail.com>,
	tgraf@suug.ch, Martin KaFai Lau <kafai@fb.com>,
	Nitin Hande <nitin.hande@gmail.com>,
	mauricio.vasquez@polito.it
Subject: Re: [PATCH bpf-next 07/11] bpf: Add helper to retrieve socket in BPF
Date: Thu, 13 Sep 2018 14:24:03 -0700
Message-ID: <CAOftzPiUXxuWLL3tmc=3VMCMRizn7LUxm4QziXs=g6wnGLPD0g@mail.gmail.com>
In-Reply-To: <20180913212205.ght2mompuoyuhd4g@ast-mbp>

On Thu, 13 Sep 2018 at 14:22, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Sep 13, 2018 at 02:17:17PM -0700, Joe Stringer wrote:
> > On Thu, 13 Sep 2018 at 14:02, Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Sep 13, 2018 at 01:55:01PM -0700, Joe Stringer wrote:
> > > > On Thu, 13 Sep 2018 at 12:06, Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > On Wed, Sep 12, 2018 at 5:06 PM, Alexei Starovoitov
> > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > On Tue, Sep 11, 2018 at 05:36:36PM -0700, Joe Stringer wrote:
> > > > > >> This patch adds new BPF helper functions, bpf_sk_lookup_tcp() and
> > > > > > >> bpf_sk_lookup_udp(), which allow BPF programs to find out if there is a
> > > > > > >> socket listening on this host, and return a socket pointer which the
> > > > > >> BPF program can then access to determine, for instance, whether to
> > > > > >> forward or drop traffic. bpf_sk_lookup_xxx() may take a reference on the
> > > > > >> socket, so when a BPF program makes use of this function, it must
> > > > > >> subsequently pass the returned pointer into the newly added sk_release()
> > > > > >> to return the reference.
> > > > > >>
> > > > > >> By way of example, the following pseudocode would filter inbound
> > > > > >> connections at XDP if there is no corresponding service listening for
> > > > > >> the traffic:
> > > > > >>
> > > > > >>   struct bpf_sock_tuple tuple;
> > > > > >>   struct bpf_sock_ops *sk;
> > > > > >>
> > > > > >>   populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
> > > > > >>   sk = bpf_sk_lookup_tcp(ctx, &tuple, sizeof tuple, netns, 0);
> > > > > > ...
> > > > > >> +struct bpf_sock_tuple {
> > > > > >> +     union {
> > > > > >> +             __be32 ipv6[4];
> > > > > >> +             __be32 ipv4;
> > > > > >> +     } saddr;
> > > > > >> +     union {
> > > > > >> +             __be32 ipv6[4];
> > > > > >> +             __be32 ipv4;
> > > > > >> +     } daddr;
> > > > > >> +     __be16 sport;
> > > > > >> +     __be16 dport;
> > > > > >> +     __u8 family;
> > > > > >> +};
> > > > > >
> > > > > > since we can pass ptr_to_packet into map lookup and other helpers now,
> > > > > > can you move 'family' out of bpf_sock_tuple and combine with netns_id arg?
> > > > > > then progs wouldn't need to copy bytes from the packet into tuple
> > > > > > to do a lookup.
> > > >
> > > > If I follow, you're proposing that users should be able to pass a
> > > > pointer to the source address field of the L3 header, and assuming
> > > > that the L3 header ends with saddr+daddr (no options/extheaders), and
> > > > is immediately followed by the sport/dport then a packet pointer
> > > > should work for performing socket lookup. Then it is up to the BPF
> > > > program writer to ensure that this is the case, or otherwise fall back
> > > > to populating a copy of the sock tuple on the stack.
> > >
> > > yep.
> > >
> > > > > have been thinking more about it.
> > > > > > since only ipv4 and ipv6 are supported, maybe use the size of bpf_sock_tuple
> > > > > > to infer the family inside the helper, so it doesn't need to be passed explicitly?
> > > >
> > > > Let me make sure I understand the proposal here.
> > > >
> > > > The current structure and function prototypes are:
> > > >
> > > > struct bpf_sock_tuple {
> > > >       union {
> > > >               __be32 ipv6[4];
> > > >               __be32 ipv4;
> > > >       } saddr;
> > > >       union {
> > > >               __be32 ipv6[4];
> > > >               __be32 ipv4;
> > > >       } daddr;
> > > >       __be16 sport;
> > > >       __be16 dport;
> > > >       __u8 family;
> > > > };
> > > ...
> > > > You're proposing something like:
> > > >
> > > > struct bpf_sock_tuple4 {
> > > >       __be32 saddr;
> > > >       __be32 daddr;
> > > >       __be16 sport;
> > > >       __be16 dport;
> > > >       __u8 family;
> > > > };
> > > >
> > > > struct bpf_sock_tuple6 {
> > > >       __be32 saddr[4];
> > > >       __be32 daddr[4];
> > > >       __be16 sport;
> > > >       __be16 dport;
> > > >       __u8 family;
> > > > };
> > >
> > > I think the split is unnecessary.
> > > I'm proposing:
> > > struct bpf_sock_tuple {
> > >       union {
> > >               __be32 ipv6[4];
> > >               __be32 ipv4;
> > >       } saddr;
> > >       union {
> > >               __be32 ipv6[4];
> > >               __be32 ipv4;
> > >       } daddr;
> > >       __be16 sport;
> > >       __be16 dport;
> > > };
> > >
> > > that points directly into the packet (when ipv4 options are not there)
> > > and bpf_sk_lookup_tcp() uses 'size' argument to figure out ipv4/ipv6 family.
> >
> > > This needs to be subtly different; otherwise the 'sport'/'dport' offset
> > > would be wrong in the IPv4 case:
>
> ahh. right.
>
> >
> > We could take my definitions above and do the following if we want to
> > try to type the helper definition:
> >
> > union bpf_sock_tuple {
> >        struct bpf_sock_tuple4 t4;
> >        struct bpf_sock_tuple6 t6;
> > };
>
> yes. sounds great to me. Much better than 'void *' in the helper.

Could even do something like this:

$ cat foo.c
#include <linux/types.h>

struct bpf_sock_tuple {
       union {
              struct {
                     __be32 saddr;
                     __be32 daddr;
                     __be16 sport;
                     __be16 dport;
              } ipv4;
              struct {
                     __be32 saddr[4];
                     __be32 daddr[4];
                     __be16 sport;
                     __be16 dport;
              } ipv6;
       };
};

int main(int argc, char *argv[]) {
       struct bpf_sock_tuple tuple;

       return 0;
}
$ gcc -g ./foo.c -o foo.o
$ pahole foo.o
struct bpf_sock_tuple {
       union {
               struct {
                       __be32     saddr;                /*     0     4 */
                       __be32     daddr;                /*     4     4 */
                       __be16     sport;                /*     8     2 */
                       __be16     dport;                /*    10     2 */
               } ipv4;                                  /*          12 */
               struct {
                       __be32     saddr[4];             /*     0    16 */
                       __be32     daddr[4];             /*    16    16 */
                       __be16     sport;                /*    32     2 */
                       __be16     dport;                /*    34     2 */
               } ipv6;                                  /*          36 */
       };                                               /*     0    36 */

       /* size: 36, cachelines: 1, members: 1 */
       /* last cacheline: 36 bytes */
};
