From mboxrd@z Thu Jan  1 00:00:00 1970
From: Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [PATCH v3 5/6] net: core: run cgroup eBPF egress programs
Date: Tue, 30 Aug 2016 00:03:23 +0200
Message-ID: <57C4B12B.2070302@iogearbox.net>
References: <1472241532-11682-1-git-send-email-daniel@zonque.org> <1472241532-11682-6-git-send-email-daniel@zonque.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: davem@davemloft.net, kafai@fb.com, fw@strlen.de,
        pablo@netfilter.org, harald@redhat.com, netdev@vger.kernel.org,
        sargun@sargun.me
To: Daniel Mack <daniel@zonque.org>, htejun@fb.com, ast@fb.com
Return-path: <netdev-owner@vger.kernel.org>
Received: from www62.your-server.de ([213.133.104.62]:54433 "EHLO
        www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1754081AbcH2WDl (ORCPT
        <rfc822;netdev@vger.kernel.org>); Mon, 29 Aug 2016 18:03:41 -0400
In-Reply-To: <1472241532-11682-6-git-send-email-daniel@zonque.org>
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 08/26/2016 09:58 PM, Daniel Mack wrote:
> If the cgroup associated with the receiving socket has an eBPF
> programs installed, run them from __dev_queue_xmit().
>
> eBPF programs used in this context are expected to either return 1 to
> let the packet pass, or != 1 to drop them. The programs have access to
> the full skb, including the MAC headers.
>
> Note that cgroup_bpf_run_filter() is stubbed out as static inline nop
> for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if
> the feature is unused.
>
> Signed-off-by: Daniel Mack <daniel@zonque.org>
> ---
>   net/core/dev.c | 6 ++++++
>   1 file changed, 6 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index a75df86..17484e6 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -141,6 +141,7 @@
>   #include <linux/netfilter_ingress.h>
>   #include <linux/sctp.h>
>   #include <linux/crash_dump.h>
> +#include <linux/bpf-cgroup.h>
>
>   #include "net-sysfs.h"
>
> @@ -3329,6 +3330,11 @@ static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
>   	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
>   		__skb_tstamp_tx(skb, NULL, skb->sk, SCM_TSTAMP_SCHED);
>
> +	rc = cgroup_bpf_run_filter(skb->sk, skb,
> +				   BPF_ATTACH_TYPE_CGROUP_INET_EGRESS);
> +	if (rc)
> +		return rc;

This would leak the whole skb by the way.

Apart from that, could this be modeled w/o affecting the forwarding path (at some
local output point where we know to have a valid socket)? Then you could also drop
the !sk and sk->sk_family tests, and we wouldn't need to replicate parts of what
clsact is doing as well. Hmm, maybe access to src/dst mac could be handled to be
just zeroes since not available at that point?

>   	/* Disable soft irqs for various locks below. Also
>   	 * stops preemption for RCU.
>   	 */
>