From: Joe Stringer <joe@wand.net.nz>
To: Martin KaFai Lau <kafai@fb.com>
Cc: Joe Stringer <joe@wand.net.nz>,
daniel@iogearbox.net, netdev <netdev@vger.kernel.org>,
ast@kernel.org, john fastabend <john.fastabend@gmail.com>
Subject: Re: [RFC bpf-next 07/11] bpf: Add helper to retrieve socket in BPF
Date: Fri, 11 May 2018 17:54:33 -0700 [thread overview]
Message-ID: <CAOftzPinEyoaaokgsssA809LQ571x_u8hhA1rXw0HjN8a5O-7w@mail.gmail.com> (raw)
In-Reply-To: <20180511214111.hi6ax6qoe37al37q@kafai-mbp.dhcp.thefacebook.com>
On 11 May 2018 at 14:41, Martin KaFai Lau <kafai@fb.com> wrote:
> On Fri, May 11, 2018 at 02:08:01PM -0700, Joe Stringer wrote:
>> On 10 May 2018 at 22:00, Martin KaFai Lau <kafai@fb.com> wrote:
>> > On Wed, May 09, 2018 at 02:07:05PM -0700, Joe Stringer wrote:
>> >> This patch adds a new BPF helper function, sk_lookup() which allows BPF
>> >> programs to find out if there is a socket listening on this host, and
>> >> returns a socket pointer which the BPF program can then access to
>> >> determine, for instance, whether to forward or drop traffic. sk_lookup()
>> >> takes a reference on the socket, so when a BPF program makes use of this
>> >> function, it must subsequently pass the returned pointer into the newly
>> >> added sk_release() to return the reference.
>> >>
>> >> By way of example, the following pseudocode would filter inbound
>> >> connections at XDP if there is no corresponding service listening for
>> >> the traffic:
>> >>
>> >> struct bpf_sock_tuple tuple;
>> >> struct bpf_sock_ops *sk;
>> >>
>> >> populate_tuple(ctx, &tuple); // Extract the 5tuple from the packet
>> >> sk = bpf_sk_lookup(ctx, &tuple, sizeof tuple, netns, 0);
>> >> if (!sk) {
>> >> // Couldn't find a socket listening for this traffic. Drop.
>> >> return TC_ACT_SHOT;
>> >> }
>> >> bpf_sk_release(sk, 0);
>> >> return TC_ACT_OK;
>> >>
>> >> Signed-off-by: Joe Stringer <joe@wand.net.nz>
>> >> ---
>>
>> ...
>>
>> >> @@ -4032,6 +4036,96 @@ static const struct bpf_func_proto bpf_skb_get_xfrm_state_proto = {
>> >> };
>> >> #endif
>> >>
>> >> +struct sock *
>> >> +sk_lookup(struct net *net, struct bpf_sock_tuple *tuple) {
>> > Would it be possible to have another version that
>> > returns a sk without taking its refcnt?
>> > It may have performance benefit.
>>
>> Not really. The sockets are not RCU-protected, and established sockets
>> may be torn down without notice. If we don't take a reference, there's
>> no guarantee that the socket will continue to exist for the duration
>> of running the BPF program.
>>
>> From what I follow, the comment below has a hidden implication which
>> is that sockets without SOCK_RCU_FREE, eg established sockets, may be
>> directly freed regardless of RCU.
> Right, SOCK_RCU_FREE sk is the one I am concern about.
> For example, TCP_LISTEN socket does not require taking a refcnt
> now. Doing a bpf_sk_lookup() may have a rather big
> impact on handling TCP syn flood. or the usual intention
> is to redirect instead of passing it up to the stack?
I see, if you're only interested in listen sockets then probably this
series could be extended with a new flag, eg something like
BPF_F_SK_FIND_LISTENERS which restricts the set of possible sockets
found to only listen sockets, then the implementation would call into
__inet_lookup_listener() instead of inet_lookup(). The presence of
that flag in the relevant register during CALL instruction would show
that the verifier should not reference-track the result, then there'd
need to be a check on the release to ensure that this unreferenced
socket is never released. Just a thought, completely untested and I
could still be missing some detail..
That said, I don't completely follow how you would expect to handle
the traffic for sockets that are already established - the helper
would no longer find those sockets, so you wouldn't know whether to
pass the traffic up the stack for established traffic or not.
next prev parent reply other threads:[~2018-05-12 0:54 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-09 21:06 [RFC bpf-next 00/11] Add socket lookup support Joe Stringer
2018-05-09 21:06 ` [RFC bpf-next 01/11] bpf: Add iterator for spilled registers Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 02/11] bpf: Simplify ptr_min_max_vals adjustment Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 03/11] bpf: Generalize ptr_or_null regs check Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 04/11] bpf: Add PTR_TO_SOCKET verifier type Joe Stringer
2018-05-15 2:37 ` Alexei Starovoitov
2018-05-16 23:56 ` Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 05/11] bpf: Macrofy stack state copy Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 06/11] bpf: Add reference tracking to verifier Joe Stringer
2018-05-15 3:04 ` Alexei Starovoitov
2018-05-17 1:05 ` Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 07/11] bpf: Add helper to retrieve socket in BPF Joe Stringer
2018-05-11 5:00 ` Martin KaFai Lau
2018-05-11 21:08 ` Joe Stringer
2018-05-11 21:41 ` Martin KaFai Lau
2018-05-12 0:54 ` Joe Stringer [this message]
2018-05-15 3:16 ` Alexei Starovoitov
2018-05-15 16:48 ` Martin KaFai Lau
2018-05-16 18:55 ` Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 08/11] selftests/bpf: Add tests for reference tracking Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 09/11] libbpf: Support loading individual progs Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 10/11] selftests/bpf: Add C tests for reference tracking Joe Stringer
2018-05-09 21:07 ` [RFC bpf-next 11/11] Documentation: Describe bpf " Joe Stringer
2018-05-15 3:19 ` Alexei Starovoitov
2018-05-16 19:05 ` [RFC bpf-next 00/11] Add socket lookup support Joe Stringer
2018-05-16 20:04 ` Alexei Starovoitov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOftzPinEyoaaokgsssA809LQ571x_u8hhA1rXw0HjN8a5O-7w@mail.gmail.com \
--to=joe@wand.net.nz \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=john.fastabend@gmail.com \
--cc=kafai@fb.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).