netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joe Stringer <joe@wand.net.nz>
To: Martin KaFai Lau <kafai@fb.com>
Cc: Joe Stringer <joe@wand.net.nz>,
	daniel@iogearbox.net, netdev <netdev@vger.kernel.org>,
	ast@kernel.org, john fastabend <john.fastabend@gmail.com>
Subject: Re: [RFC bpf-next 07/11] bpf: Add helper to retrieve socket in BPF
Date: Fri, 11 May 2018 17:54:33 -0700	[thread overview]
Message-ID: <CAOftzPinEyoaaokgsssA809LQ571x_u8hhA1rXw0HjN8a5O-7w@mail.gmail.com> (raw)
In-Reply-To: <20180511214111.hi6ax6qoe37al37q@kafai-mbp.dhcp.thefacebook.com>

On 11 May 2018 at 14:41, Martin KaFai Lau <kafai@fb.com> wrote:
> On Fri, May 11, 2018 at 02:08:01PM -0700, Joe Stringer wrote:
>> On 10 May 2018 at 22:00, Martin KaFai Lau <kafai@fb.com> wrote:
>> > On Wed, May 09, 2018 at 02:07:05PM -0700, Joe Stringer wrote:
>> >> This patch adds a new BPF helper function, sk_lookup() which allows BPF
>> >> programs to find out if there is a socket listening on this host, and
>> >> returns a socket pointer which the BPF program can then access to
>> >> determine, for instance, whether to forward or drop traffic. sk_lookup()
>> >> takes a reference on the socket, so when a BPF program makes use of this
>> >> function, it must subsequently pass the returned pointer into the newly
>> >> added sk_release() to return the reference.
>> >>
>> >> By way of example, the following pseudocode would filter inbound
>> >> connections at XDP if there is no corresponding service listening for
>> >> the traffic:
>> >>
>> >>   struct bpf_sock_tuple tuple;
>> >>   struct bpf_sock_ops *sk;
>> >>
>> >>   populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
>> >>   sk = bpf_sk_lookup(ctx, &tuple, sizeof tuple, netns, 0);
>> >>   if (!sk) {
>> >>     // Couldn't find a socket listening for this traffic. Drop.
>> >>     return TC_ACT_SHOT;
>> >>   }
>> >>   bpf_sk_release(sk, 0);
>> >>   return TC_ACT_OK;
>> >>
>> >> Signed-off-by: Joe Stringer <joe@wand.net.nz>
>> >> ---
>>
>> ...
>>
>> >> @@ -4032,6 +4036,96 @@ static const struct bpf_func_proto bpf_skb_get_xfrm_state_proto = {
>> >>  };
>> >>  #endif
>> >>
>> >> +struct sock *
>> >> +sk_lookup(struct net *net, struct bpf_sock_tuple *tuple) {
>> > Would it be possible to have another version that
>> > returns a sk without taking its refcnt?
>> > It may have performance benefit.
>>
>> Not really. The sockets are not RCU-protected, and established sockets
>> may be torn down without notice. If we don't take a reference, there's
>> no guarantee that the socket will continue to exist for the duration
>> of running the BPF program.
>>
>> From what I follow, the comment below has a hidden implication which
>> is that sockets without SOCK_RCU_FREE, eg established sockets, may be
>> directly freed regardless of RCU.
> Right, SOCK_RCU_FREE sk is the one I am concern about.
> For example, TCP_LISTEN socket does not require taking a refcnt
> now.  Doing a bpf_sk_lookup() may have a rather big
> impact on handling TCP syn flood.  or the usual intention
> is to redirect instead of passing it up to the stack?

I see, if you're only interested in listen sockets then probably this
series could be extended with a new flag, eg something like
BPF_F_SK_FIND_LISTENERS which restricts the set of possible sockets
found to only listen sockets, then the implementation would call into
__inet_lookup_listener() instead of inet_lookup(). The presence of
that flag in the relevant register during CALL instruction would show
that the verifier should not reference-track the result, then there'd
need to be a check on the release to ensure that this unreferenced
socket is never released. Just a thought, completely untested and I
could still be missing some detail..

That said, I don't completely follow how you would expect to handle
the traffic for sockets that are already established - the helper
would no longer find those sockets, so you wouldn't know whether to
pass the traffic up the stack for established traffic or not.

  reply	other threads:[~2018-05-12  0:54 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-09 21:06 [RFC bpf-next 00/11] Add socket lookup support Joe Stringer
2018-05-09 21:06 ` [RFC bpf-next 01/11] bpf: Add iterator for spilled registers Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 02/11] bpf: Simplify ptr_min_max_vals adjustment Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 03/11] bpf: Generalize ptr_or_null regs check Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 04/11] bpf: Add PTR_TO_SOCKET verifier type Joe Stringer
2018-05-15  2:37   ` Alexei Starovoitov
2018-05-16 23:56     ` Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 05/11] bpf: Macrofy stack state copy Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 06/11] bpf: Add reference tracking to verifier Joe Stringer
2018-05-15  3:04   ` Alexei Starovoitov
2018-05-17  1:05     ` Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 07/11] bpf: Add helper to retrieve socket in BPF Joe Stringer
2018-05-11  5:00   ` Martin KaFai Lau
2018-05-11 21:08     ` Joe Stringer
2018-05-11 21:41       ` Martin KaFai Lau
2018-05-12  0:54         ` Joe Stringer [this message]
2018-05-15  3:16           ` Alexei Starovoitov
2018-05-15 16:48             ` Martin KaFai Lau
2018-05-16 18:55               ` Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 08/11] selftests/bpf: Add tests for reference tracking Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 09/11] libbpf: Support loading individual progs Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 10/11] selftests/bpf: Add C tests for reference tracking Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 11/11] Documentation: Describe bpf " Joe Stringer
2018-05-15  3:19   ` Alexei Starovoitov
2018-05-16 19:05 ` [RFC bpf-next 00/11] Add socket lookup support Joe Stringer
2018-05-16 20:04   ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOftzPinEyoaaokgsssA809LQ571x_u8hhA1rXw0HjN8a5O-7w@mail.gmail.com \
    --to=joe@wand.net.nz \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).