netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alexei Starovoitov <ast@kernel.org>
To: <davem@davemloft.net>
Cc: <daniel@iogearbox.net>, <netdev@vger.kernel.org>, <kernel-team@fb.com>
Subject: [PATCH RFC bpf-next 0/6] bpf: introduce cgroup-bpf bind, connect, post-bind hooks
Date: Tue, 13 Mar 2018 20:39:28 -0700	[thread overview]
Message-ID: <20180314033934.3502167-1-ast@kernel.org> (raw)

For our container management we've been using complicated and fragile setup
consisting of LD_PRELOAD wrapper intercepting bind and connect calls from
all containerized applications.
The setup involves per-container IPs, policy, etc, so traditional
network-only solutions that involve VRFs, netns, acls are not applicable.
Changing apps is not possible and LD_PRELOAD doesn't work
for apps that don't use glibc like java and golang.
BPF+cgroup looks to be the best solution for this problem.
Hence we introduce 3 hooks:
- at entry into sys_bind and sys_connect
  to let bpf prog look and modify 'struct sockaddr' provided
  by user space and fail bind/connect when appropriate
- post sys_bind after port is allocated

The approach works great and has zero overhead for anyone who doesn't
use it and very low overhead when deployed.

The main question for Daniel and Dave is what approach to take
with prog types...

In this patch set we introduce 6 new program types to make user
experience easier:
  BPF_PROG_TYPE_CGROUP_INET4_BIND,
  BPF_PROG_TYPE_CGROUP_INET6_BIND,
  BPF_PROG_TYPE_CGROUP_INET4_CONNECT,
  BPF_PROG_TYPE_CGROUP_INET6_CONNECT,
  BPF_PROG_TYPE_CGROUP_INET4_POST_BIND,
  BPF_PROG_TYPE_CGROUP_INET6_POST_BIND,

since v4 programs should not be using 'struct bpf_sock_addr'->user_ip6 fields
and different prog type for v4 and v6 helps verifier reject such access
at load time.
Similarly bind vs connect are two different prog types too,
since only sys_connect programs can call new bpf_bind() helper.

This approach is very different from tcp-bpf where single
'struct bpf_sock_ops' and single prog type is used for different hooks.
The field checks are done at run-time instead of load time.

I think the approach taken by this patch set is justified,
but we may do better if we extend BPF_PROG_ATTACH cmd
with log_buf + log_size, then we should be able to combine
bind+connect+v4+v6 into single program type.
The idea that at load time the verifier will remember a bitmask
of fields in bpf_sock_addr used by the program and helpers
that program used, then at attach time we can check that
hook is compatible with features used by the program and
report human readable error message back via log_buf.
We cannot do this right now with just EINVAL, since combinations
of errors like 'using user_ip6 field but attaching to v4 hook'
are too high to express as errno.
This would be bigger change. If you folks think it's worth it
we can go with this approach or if you think 6 new prog types
is not too bad, we can leave the patch as-is.
Comments?
Other comments on patches are welcome.

Andrey Ignatov (6):
  bpf: Hooks for sys_bind
  selftests/bpf: Selftest for sys_bind hooks
  net: Introduce __inet_bind() and __inet6_bind
  bpf: Hooks for sys_connect
  selftests/bpf: Selftest for sys_connect hooks
  bpf: Post-hooks for sys_bind

 include/linux/bpf-cgroup.h                    |  68 +++-
 include/linux/bpf_types.h                     |   6 +
 include/linux/filter.h                        |  10 +
 include/net/inet_common.h                     |   2 +
 include/net/ipv6.h                            |   2 +
 include/net/sock.h                            |   3 +
 include/net/udp.h                             |   1 +
 include/uapi/linux/bpf.h                      |  52 ++-
 kernel/bpf/cgroup.c                           |  36 ++
 kernel/bpf/syscall.c                          |  42 ++
 kernel/bpf/verifier.c                         |   6 +
 net/core/filter.c                             | 479 ++++++++++++++++++++++-
 net/ipv4/af_inet.c                            |  60 ++-
 net/ipv4/tcp_ipv4.c                           |  16 +
 net/ipv4/udp.c                                |  14 +
 net/ipv6/af_inet6.c                           |  47 ++-
 net/ipv6/tcp_ipv6.c                           |  16 +
 net/ipv6/udp.c                                |  20 +
 tools/include/uapi/linux/bpf.h                |  39 +-
 tools/testing/selftests/bpf/Makefile          |   8 +-
 tools/testing/selftests/bpf/bpf_helpers.h     |   2 +
 tools/testing/selftests/bpf/connect4_prog.c   |  45 +++
 tools/testing/selftests/bpf/connect6_prog.c   |  61 +++
 tools/testing/selftests/bpf/test_sock_addr.c  | 541 ++++++++++++++++++++++++++
 tools/testing/selftests/bpf/test_sock_addr.sh |  57 +++
 25 files changed, 1580 insertions(+), 53 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/connect4_prog.c
 create mode 100644 tools/testing/selftests/bpf/connect6_prog.c
 create mode 100644 tools/testing/selftests/bpf/test_sock_addr.c
 create mode 100755 tools/testing/selftests/bpf/test_sock_addr.sh

-- 
2.9.5

             reply	other threads:[~2018-03-14  3:39 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-14  3:39 Alexei Starovoitov [this message]
2018-03-14  3:39 ` [PATCH RFC bpf-next 1/6] bpf: Hooks for sys_bind Alexei Starovoitov
2018-03-14  6:21   ` Eric Dumazet
2018-03-14 18:00     ` Alexei Starovoitov
2018-03-14 14:37   ` Daniel Borkmann
2018-03-14 14:55     ` Daniel Borkmann
2018-03-14 18:11     ` Alexei Starovoitov
2018-03-14 23:27       ` Daniel Borkmann
2018-03-15  0:29         ` Alexei Starovoitov
2018-03-14  3:39 ` [PATCH RFC bpf-next 2/6] selftests/bpf: Selftest for sys_bind hooks Alexei Starovoitov
2018-03-14  3:39 ` [PATCH RFC bpf-next 3/6] net: Introduce __inet_bind() and __inet6_bind Alexei Starovoitov
2018-03-14  3:39 ` [PATCH RFC bpf-next 4/6] bpf: Hooks for sys_connect Alexei Starovoitov
2018-03-14  3:39 ` [PATCH RFC bpf-next 5/6] selftests/bpf: Selftest for sys_connect hooks Alexei Starovoitov
2018-03-14  3:39 ` [PATCH RFC bpf-next 6/6] bpf: Post-hooks for sys_bind Alexei Starovoitov
2018-03-14 17:13 ` [PATCH RFC bpf-next 0/6] bpf: introduce cgroup-bpf bind, connect, post-bind hooks David Ahern
2018-03-14 18:00   ` Alexei Starovoitov
2018-03-14 17:22 ` Mahesh Bandewar (महेश बंडेवार)
2018-03-14 18:01   ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180314033934.3502167-1-ast@kernel.org \
    --to=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).