Re: [PATCH bpf-next v3 1/8] bpf: implement getsockopt and setsockopt hooks

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Stanislav Fomichev <sdf@fomichev.me>
To: Martin Lau <kafai@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"ast@kernel.org" <ast@kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	Andrii Nakryiko <andriin@fb.com>
Subject: Re: [PATCH bpf-next v3 1/8] bpf: implement getsockopt and setsockopt hooks
Date: Mon, 10 Jun 2019 09:10:15 -0700	[thread overview]
Message-ID: <20190610161015.GF9660@mini-arch> (raw)
In-Reply-To: <20190608070838.4vhwss4anyibju53@kafai-mbp.dhcp.thefacebook.com>

On 06/08, Martin Lau wrote:
> On Fri, Jun 07, 2019 at 09:29:13AM -0700, Stanislav Fomichev wrote:
> > Implement new BPF_PROG_TYPE_CGROUP_SOCKOPT program type and
> > BPF_CGROUP_{G,S}ETSOCKOPT cgroup hooks.
> > 
> > BPF_CGROUP_SETSOCKOPT get a read-only view of the setsockopt arguments.
> > BPF_CGROUP_GETSOCKOPT can modify the supplied buffer.
> > Both of them reuse existing PTR_TO_PACKET{,_END} infrastructure.
> > 
> > The buffer memory is pre-allocated (because I don't think there is
> > a precedent for working with __user memory from bpf). This might be
> > slow to do for each {s,g}etsockopt call, that's why I've added
> > __cgroup_bpf_prog_array_is_empty that exits early if there is nothing
> > attached to a cgroup. Note, however, that there is a race between
> > __cgroup_bpf_prog_array_is_empty and BPF_PROG_RUN_ARRAY where cgroup
> > program layout might have changed; this should not be a problem
> > because in general there is a race between multiple calls to
> > {s,g}etsocktop and user adding/removing bpf progs from a cgroup.
> > 
> > The return code of the BPF program is handled as follows:
> > * 0: EPERM
> > * 1: success, execute kernel {s,g}etsockopt path after BPF prog exits
> > * 2: success, do _not_ execute kernel {s,g}etsockopt path after BPF
> >      prog exits
> > 
> > v3:
> > * typos in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY comments (Andrii Nakryiko)
> > * reverse christmas tree in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY (Andrii
> >   Nakryiko)
> > * use __bpf_md_ptr instead of __u32 for optval{,_end} (Martin Lau)
> > * use BPF_FIELD_SIZEOF() for consistency (Martin Lau)
> > * new CG_SOCKOPT_ACCESS macro to wrap repeated parts
> > 
> > v2:
> > * moved bpf_sockopt_kern fields around to remove a hole (Martin Lau)
> > * aligned bpf_sockopt_kern->buf to 8 bytes (Martin Lau)
> > * bpf_prog_array_is_empty instead of bpf_prog_array_length (Martin Lau)
> > * added [0,2] return code check to verifier (Martin Lau)
> > * dropped unused buf[64] from the stack (Martin Lau)
> > * use PTR_TO_SOCKET for bpf_sockopt->sk (Martin Lau)
> > * dropped bpf_target_off from ctx rewrites (Martin Lau)
> > * use return code for kernel bypass (Martin Lau & Andrii Nakryiko)
> > 
> 
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 1b65ab0df457..4fc8429af6fc 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
> 
> [ ... ]
> 
> > +static const struct bpf_func_proto *
> > +cg_sockopt_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> > +{
> > +	switch (func_id) {
> > +	case BPF_FUNC_sk_fullsock:
> > +		return &bpf_sk_fullsock_proto;
> May be my v2 comment has been missed.
> 
> sk here (i.e. PTR_TO_SOCKET) must be a fullsock.
> bpf_sk_fullsock() will be a no-op.  Hence, there is
> no need to expose bpf_sk_fullsock_proto.
I think I missed that fact that PTR_TO_SOCKET implies fullsock.
Will remove, thanks!

> > +	case BPF_FUNC_sk_storage_get:
> > +		return &bpf_sk_storage_get_proto;
> > +	case BPF_FUNC_sk_storage_delete:
> > +		return &bpf_sk_storage_delete_proto;
> > +#ifdef CONFIG_INET
> > +	case BPF_FUNC_tcp_sock:
> > +		return &bpf_tcp_sock_proto;
> > +#endif
> > +	default:
> > +		return cgroup_base_func_proto(func_id, prog);
> > +	}
> > +}
> > +
> > +static bool cg_sockopt_is_valid_access(int off, int size,
> > +				       enum bpf_access_type type,
> > +				       const struct bpf_prog *prog,
> > +				       struct bpf_insn_access_aux *info)
> > +{
> > +	const int size_default = sizeof(__u32);
> > +
> > +	if (off < 0 || off >= sizeof(struct bpf_sockopt))
> > +		return false;
> > +
> > +	if (off % size != 0)
> > +		return false;
> > +
> > +	if (type == BPF_WRITE) {
> > +		switch (off) {
> > +		case offsetof(struct bpf_sockopt, optlen):
> > +			if (size != size_default)
> > +				return false;
> > +			return prog->expected_attach_type ==
> > +				BPF_CGROUP_GETSOCKOPT;
> > +		default:
> > +			return false;
> > +		}
> > +	}
> > +
> > +	switch (off) {
> > +	case offsetof(struct bpf_sockopt, sk):
> > +		if (size != sizeof(struct bpf_sock *))
> Based on my understanding in commit b7df9ada9a77 ("bpf: fix pointer offsets in context for 32 bit"),
> I think it should be 'size != sizeof(__u64)'
> 
> Same for the optval and optval_end below.
Good point. I was actually wondering when converting BPF_W to BPF_DW in
the tests whether that would work correctly on 32 bits. Thanks for
commit pointer, that should, indeed, always be all sizeof(__u64).

> > +			return false;
> > +		info->reg_type = PTR_TO_SOCKET;
> > +		break;
> > +	case bpf_ctx_range(struct bpf_sockopt, optval):
> offsetof(struct bpf_sockopt, optval)
Ack. No narrow loads for the pointers.

> > +		if (size != sizeof(void *))
> > +			return false;
> > +		info->reg_type = PTR_TO_PACKET;
> > +		break;
> > +	case bpf_ctx_range(struct bpf_sockopt, optval_end):
> offsetof(struct bpf_sockopt, optval_end)
> 
> > +		if (size != sizeof(void *))
> > +			return false;
> > +		info->reg_type = PTR_TO_PACKET_END;
> > +		break;
> > +	default:
> > +		if (size != size_default)
> > +			return false;
> > +		break;
> > +	}
> > +	return true;
> > +}
> > +
> 
> [ ... ]
> 
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 55bfc941d17a..4652c0a005ca 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -1835,7 +1835,7 @@ BPF_CALL_1(bpf_sk_fullsock, struct sock *, sk)
> >  	return sk_fullsock(sk) ? (unsigned long)sk : (unsigned long)NULL;
> >  }
> >  
> > -static const struct bpf_func_proto bpf_sk_fullsock_proto = {
> > +const struct bpf_func_proto bpf_sk_fullsock_proto = {
> As mentioned above, this change is also not needed.
> 
> Others LGTM.
Agreed, will not export.

> >  	.func		= bpf_sk_fullsock,
> >  	.gpl_only	= false,
> >  	.ret_type	= RET_PTR_TO_SOCKET_OR_NULL,
> > @@ -5636,7 +5636,7 @@ BPF_CALL_1(bpf_tcp_sock, struct sock *, sk)
> >  	return (unsigned long)NULL;
> >  }
> >

next prev parent reply	other threads:[~2019-06-10 16:10 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-07 16:29 [PATCH bpf-next v3 0/8] bpf: getsockopt and setsockopt hooks Stanislav Fomichev
2019-06-07 16:29 ` [PATCH bpf-next v3 1/8] bpf: implement " Stanislav Fomichev
2019-06-08  7:08   ` Martin Lau
2019-06-10 16:10     ` Stanislav Fomichev [this message]
2019-06-07 16:29 ` [PATCH bpf-next v3 2/8] bpf: sync bpf.h to tools/ Stanislav Fomichev
2019-06-07 16:29 ` [PATCH bpf-next v3 3/8] libbpf: support sockopt hooks Stanislav Fomichev
2019-06-07 16:29 ` [PATCH bpf-next v3 4/8] selftests/bpf: test sockopt section name Stanislav Fomichev
2019-06-07 16:29 ` [PATCH bpf-next v3 5/8] selftests/bpf: add sockopt test Stanislav Fomichev
2019-06-07 16:29 ` [PATCH bpf-next v3 6/8] selftests/bpf: add sockopt test that exercises sk helpers Stanislav Fomichev
2019-06-07 16:29 ` [PATCH bpf-next v3 7/8] bpf: add sockopt documentation Stanislav Fomichev
2019-06-07 16:29 ` [PATCH bpf-next v3 8/8] bpftool: support cgroup sockopt Stanislav Fomichev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190610161015.GF9660@mini-arch \
    --to=sdf@fomichev.me \
    --cc=andriin@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=kafai@fb.com \
    --cc=netdev@vger.kernel.org \
    --cc=sdf@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.