From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tushar Dave <tushar.n.dave@oracle.com>
Subject: Re: [PATCH net-next 2/5] eBPF: Add new eBPF prog type
 BPF_PROG_TYPE_SOCKET_SG_FILTER
Date: Wed, 12 Sep 2018 12:25:44 -0700
Message-ID: <42dbd9de-ae8e-da57-dc58-b608a682ca3d@oracle.com>
References: <1536694684-3200-1-git-send-email-tushar.n.dave@oracle.com>
 <1536694684-3200-3-git-send-email-tushar.n.dave@oracle.com>
 <20180912035719.ff5mcjg3gbrg52xt@ast-mbp>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: ast@kernel.org, daniel@iogearbox.net, davem@davemloft.net,
        santosh.shilimkar@oracle.com, jakub.kicinski@netronome.com,
        quentin.monnet@netronome.com, jiong.wang@netronome.com,
        sandipan@linux.vnet.ibm.com, john.fastabend@gmail.com,
        kafai@fb.com, rdna@fb.com, yhs@fb.com, netdev@vger.kernel.org,
        rds-devel@oss.oracle.com, sowmini.varadhan@oracle.com
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from aserp2120.oracle.com ([141.146.126.78]:59846 "EHLO
        aserp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1728139AbeIMAdA (ORCPT
        <rfc822;netdev@vger.kernel.org>); Wed, 12 Sep 2018 20:33:00 -0400
In-Reply-To: <20180912035719.ff5mcjg3gbrg52xt@ast-mbp>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>


On 09/11/2018 08:57 PM, Alexei Starovoitov wrote:
> On Tue, Sep 11, 2018 at 09:38:01PM +0200, Tushar Dave wrote:
>> Add new eBPF prog type BPF_PROG_TYPE_SOCKET_SG_FILTER which uses the
>> existing socket filter infrastructure for bpf program attach and load.
>> SOCKET_SG_FILTER eBPF program receives struct scatterlist as bpf context
>> contrast to SOCKET_FILTER which deals with struct skb. This is useful
>> for kernel entities that don't have skb to represent packet data but
>> want to run eBPF socket filter on packet data that is in form of struct
>> scatterlist e.g. IB/RDMA
>>
>> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
>> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
>> ---
>>   include/linux/bpf_types.h      |  1 +
>>   include/uapi/linux/bpf.h       |  1 +
>>   kernel/bpf/syscall.c           |  1 +
>>   kernel/bpf/verifier.c          |  1 +
>>   net/core/filter.c              | 55 ++++++++++++++++++++++++++++++++++++++++--
>>   samples/bpf/bpf_load.c         | 11 ++++++---
>>   tools/bpf/bpftool/prog.c       |  1 +
>>   tools/include/uapi/linux/bpf.h |  1 +
>>   tools/lib/bpf/libbpf.c         |  3 +++
>>   tools/lib/bpf/libbpf.h         |  2 ++
>

Alexi,

Thank you for reviewing the patches.

> please do not mix core kernel and user space into single patch.
> split tools/include/uapi/linux/bpf.h sync into separate patch
> and changes to tools/lib/bpf as yet another patch.

Sure, I can do that.

> 
>>   10 files changed, 72 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
>> index cd26c09..7dc1503 100644
>> --- a/include/linux/bpf_types.h
>> +++ b/include/linux/bpf_types.h
>> @@ -16,6 +16,7 @@
>>   BPF_PROG_TYPE(BPF_PROG_TYPE_SOCK_OPS, sock_ops)
>>   BPF_PROG_TYPE(BPF_PROG_TYPE_SK_SKB, sk_skb)
>>   BPF_PROG_TYPE(BPF_PROG_TYPE_SK_MSG, sk_msg)
>> +BPF_PROG_TYPE(BPF_PROG_TYPE_SOCKET_SG_FILTER, socksg_filter)
>>   #endif
>>   #ifdef CONFIG_BPF_EVENTS
>>   BPF_PROG_TYPE(BPF_PROG_TYPE_KPROBE, kprobe)
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 66917a4..6ec1e32 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -152,6 +152,7 @@ enum bpf_prog_type {
>>   	BPF_PROG_TYPE_LWT_SEG6LOCAL,
>>   	BPF_PROG_TYPE_LIRC_MODE2,
>>   	BPF_PROG_TYPE_SK_REUSEPORT,
>> +	BPF_PROG_TYPE_SOCKET_SG_FILTER,
>>   };
>>   
>>   enum bpf_attach_type {
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 3c9636f..5f302b7 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -1361,6 +1361,7 @@ static int bpf_prog_load(union bpf_attr *attr)
>>   
>>   	if (type != BPF_PROG_TYPE_SOCKET_FILTER &&
>>   	    type != BPF_PROG_TYPE_CGROUP_SKB &&
>> +	    type != BPF_PROG_TYPE_SOCKET_SG_FILTER &&
> 
> I'm not comfortable to let unpriv use this right away.
> Can you live with root-only ?

Honestly, I prep this the same as how we treat
BPF_PROG_TYPE_SOCKET_FILTER. But sure I can live with root-only.

> 
>>   	    !capable(CAP_SYS_ADMIN))
>>   		return -EPERM;
>>   
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>> index f4ff0c5..17fc4d2 100644
>> --- a/kernel/bpf/verifier.c
>> +++ b/kernel/bpf/verifier.c
>> @@ -1234,6 +1234,7 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env,
>>   	case BPF_PROG_TYPE_LWT_XMIT:
>>   	case BPF_PROG_TYPE_SK_SKB:
>>   	case BPF_PROG_TYPE_SK_MSG:
>> +	case BPF_PROG_TYPE_SOCKET_SG_FILTER:
>>   		if (meta)
>>   			return meta->pkt_access;
>>   
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 0b40f95..469c488 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -1140,7 +1140,8 @@ static void bpf_release_orig_filter(struct bpf_prog *fp)
>>   
>>   static void __bpf_prog_release(struct bpf_prog *prog)
>>   {
>> -	if (prog->type == BPF_PROG_TYPE_SOCKET_FILTER) {
>> +	if (prog->type == BPF_PROG_TYPE_SOCKET_FILTER ||
>> +	    prog->type == BPF_PROG_TYPE_SOCKET_SG_FILTER) {
>>   		bpf_prog_put(prog);
> 
> this doesn't look right.
> Why this is needed?
> Are you using old-style setsockopt to attach?

Yes.

> I think new style of attaching that all bpf prog types that came
> after socket_filter are using is preferred.
> Pls take a look at BPF_PROG_ATTACH cmd.

well, I am not sure if that is going to work.

I did this way so I can attach eBPF prog type
BPF_PROG_TYPE_SOCKET_SG_FILTER to a socket. Like today we can attach
regular socket filter bpf program (e.g. BPF_PROG_TYPE_SOCKET_FILTER) to
TCP, UDP sockets using setsockopt. The only difference between them is,
BPF_PROG_TYPE_SOCKET_FILTER deals with struct sk_buff while
BPF_PROG_TYPE_SOCKET_SG_FILTER deals with strut scatterlist.

e.g. setsockopt(sock, SOL_SOCKET, SO_ATTACH_BPF, prog_fd,
sizeof(prog_fd[0]));


> 
> Also it looks the first patch doesn't really add the useful logic, but adds
> few lines of code here and there. Then more code comes in patches 3 and 4.
> Please rearrange them that they're reviewable as logical pieces.

okay.

-Tushar
> 
>