From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jamal Hadi Salim
Subject: Re: [PATCH v4 net-next 2/2] tc: add 'needs_l2' flag to ingress qdisc
Date: Mon, 13 Apr 2015 10:16:01 -0400
Message-ID: <552BCFA1.7020502@mojatatu.com>
References: <1428708792-5872-1-git-send-email-ast@plumgrid.com> <1428708792-5872-2-git-send-email-ast@plumgrid.com>
In-Reply-To: <1428708792-5872-2-git-send-email-ast@plumgrid.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
To: Alexei Starovoitov, "David S. Miller"
Cc: Eric Dumazet, Daniel Borkmann, Thomas Graf, Jiri Pirko, netdev@vger.kernel.org
Sender: netdev-owner@vger.kernel.org

On 04/10/15 19:33, Alexei Starovoitov wrote:
> TC classifiers and actions attached to ingress and egress qdiscs see
> inconsistent skb->data. For ingress L2 header is already pulled, whereas
> for egress it's present. Introduce an optional flag for ingress qdisc
> which if set will cause ingress to push L2 header before calling
> into classifiers/actions and pull L2 back afterwards.
>
> The cls_bpf/act_bpf are now marked as 'needs_l2'. The users can use them
> on ingress qdisc created with 'needs_l2' flag and on any egress qdisc.
> The use of them with vanilla ingress is disallowed.
>
> The ingress_l2 qdisc can only be attached to devices that provide headers_ops.
>
> When ingress is not enabled static_key avoids *(skb->dev->ingress_queue)
>
> When ingress is enabled the difference old vs new to reach qdisc spinlock:
> old:
> *(skb->dev->ingress_queue), if, *(rxq->qdisc), if, *(rxq->qdisc), if
> new:
> *(skb->dev->ingress_queue), if, *(rxq->qdisc), if, if
>
> This patch provides a foundation to use ingress_l2+cls_bpf to filter
> interesting traffic and mirror small part of it to a different netdev for
> capturing. This approach is significantly faster than traditional af_packet,
> since skb_clone is called after filtering. dhclient and other tap-based tools
> may consider switching to this style.
>

Alexei,

I want to support this work, but I am having difficulties.
I see your point, as I hope you see mine.
In my opinion, it is a stalemate. We need Dave to make the call.

To repeat what I said earlier:
The only known user at this point is bpf. cls_bpf and cls_act
could both look at the AT field, find where they are being invoked
from, and react accordingly. This is not very hard for a coder to do,
and the user injecting the policy doesn't need to know about it.
If you do that, then I think you also need to inform users downstream
from bpf that they should expect to see the packet at the link header
and not the network header.

cheers,
jamal

PS: Note that __netif_receive_skb_core(), at the beginning, is what
sets all these headers.
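
For illustration only, the AT-field idea above could be sketched roughly
as follows. This is a user-space mock, not kernel code: struct mock_skb,
enum mock_at, mock_cls_run() and mock_match_ipv4() are all hypothetical
names invented for this sketch, and the real cls_bpf/act_bpf and skb
handling differ. It only shows the shape of the approach: the classifier
checks where it was invoked from, pushes the link header back on ingress,
runs the program, and restores the original view afterwards.

```c
#include <stddef.h>

/* Hypothetical stand-in for the AT field: where the classifier runs. */
enum mock_at { MOCK_AT_INGRESS, MOCK_AT_EGRESS };

/* Hypothetical, heavily simplified packet buffer (not struct sk_buff). */
struct mock_skb {
    unsigned char buf[64];
    size_t data_off;    /* offset of the current data pointer in buf */
    size_t len;         /* bytes from the data pointer to packet end */
    size_t mac_len;     /* length of the link (L2) header */
    enum mock_at at;    /* invocation point, set by the caller */
};

typedef int (*mock_prog_t)(const unsigned char *data, size_t len);

/*
 * Run prog so that it always sees data starting at the link header.
 * On ingress the L2 header has already been pulled, so push it back
 * first and pull it again afterwards. Returns -1 if the header cannot
 * be pushed, otherwise the program's result.
 */
int mock_cls_run(struct mock_skb *skb, mock_prog_t prog)
{
    int pushed = (skb->at == MOCK_AT_INGRESS);
    int ret;

    if (pushed) {
        if (skb->data_off < skb->mac_len)
            return -1;
        skb->data_off -= skb->mac_len;
        skb->len += skb->mac_len;
    }

    ret = prog(skb->buf + skb->data_off, skb->len);

    if (pushed) {
        /* Restore the network-header view for code that runs later. */
        skb->data_off += skb->mac_len;
        skb->len -= skb->mac_len;
    }
    return ret;
}

/* Example program: match EtherType 0x0800 (IPv4) in an Ethernet header. */
int mock_match_ipv4(const unsigned char *data, size_t len)
{
    return len >= 14 && data[12] == 0x08 && data[13] == 0x00;
}
```

With this shape the program sees the same layout on ingress and egress,
and the user injecting the policy does not need to know which hook it is
attached to. Anything running downstream of the program while the header
is pushed would see the packet at the link header, which is the caveat
raised above.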