From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wr0-f195.google.com ([209.85.128.195]:39456 "EHLO
        mail-wr0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752548AbeCNP4n (ORCPT );
        Wed, 14 Mar 2018 11:56:43 -0400
Received: by mail-wr0-f195.google.com with SMTP id k3so5270886wrg.6 for ;
        Wed, 14 Mar 2018 08:56:43 -0700 (PDT)
Date: Wed, 14 Mar 2018 16:56:40 +0100
From: Jiri Pirko
To: Or Gerlitz
Cc: Jiri Pirko , Rabie Loulou , John Hurley , Jakub Kicinski ,
        Simon Horman , Linux Netdev List , ASAP_Direct_Dev@mellanox.com, mlxsw
Subject: Re: [RFC net-next 2/6] driver: net: bonding: allow registration of tc offload callbacks in bond
Message-ID: <20180314155640.GF2130@nanopsycho>
References: <20180314095027.GC2130@nanopsycho>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

Wed, Mar 14, 2018 at 12:23:59PM CET, gerlitz.or@gmail.com wrote:
>On Wed, Mar 14, 2018 at 11:50 AM, Jiri Pirko wrote:
>> Tue, Mar 13, 2018 at 04:51:02PM CET, gerlitz.or@gmail.com wrote:
>>>On Wed, Mar 7, 2018 at 12:57 PM, Jiri Pirko wrote:
>
>>>This sounds nice for the case where one installs ingress tc rules on
>>>the bond (let's call them type 1, see next)
>>>
>>>One obstacle pointed out by my colleague, Rabie, is that when the upper layer
>>>issues a stat call on the filter, they will get two replies; this can confuse them
>>>and lead to wrong decisions (aging). I wonder if/how we can set a knob
>>
>> The bonding itself would not do anything on stats update
>> command (TC_CLSFLOWER_STATS for example). Only the slaves would do
>> the update. So there will only be replies from the slaves.
>>
>> Bond/team is just going to propagate block bind/unbind down. Nothing else.
>
>Do we agree that user space will get the replies of all lower (slave) devices,
>or am I missing something here?

"user space will get the replies" - not sure what exactly you mean by this.
The stats would be accumulated over all devices/drivers that registered
a block callback.

>
>>>2. bond being egress port of a rule
>>>2.1 VF rep --> uplink 0
>>>2.2 VF rep --> uplink 1
>>>
>>>and we do that in the driver (add/del two HW rules, combine the stat
>>>results, etc)
>>
>> That is up to the driver. If the driver can share a block between 2
>> devices, it can do that. If it cannot share, it will just report stats
>> for every device separately (2 block cbs registered) and tc will see
>> them both together. No need to do anything in the driver.
>
>right
>
>>>3. ingress rule on VF rep port with shared tunnel device being the
>>>egress (encap)
>>>and where the routing of the underlay (tunnel) goes through LAG.
>
>> Same as "2."
>
>ok
>
>>>4. ingress rule shared tunnel device being the ingress and VF rep port
>>>being the egress (decap)
>
>> I don't follow :(
>
>the way tunneling is handled in tc classifier/action is
>
>encap: ingress: net port, action1: tunnel key set action2: mirred to
>shared-tunnel device
>
>decap: ingress: shared tunnel device, action1: tunnel key unset
>action2: mirred to net port
>
>type 4 are the decap rules, when we offload it as a HW ACL we stretch
>the line and the ingress
>in a HW port too (e.g. uplink port in NICs)

Okay, I see. But where's the bond here? Is it the one I mentioned as
"mirred redirect to lag"?

>
>
>>>this uses the egdev facility to be offloaded into our driver, and
>>>then in the driver
>>>we will treat it like type 1, two rules need to be installed into HW,
>>>but now, we can't delegate them
>>>from the vxlan device b/c it has no direct connection with the bond.
>
>> I see another thing we need to sanitize: vxlan rule ingress match action
>> mirred redirect to lag
>
>right, we don't have for NIC but for switch ASIC, I guess it is applicable

Yes, it is. For future NICs I guess it is going to be as well.
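
To illustrate the stats point made earlier in this mail (stats accumulated
over everything that registered a block callback), here is a rough
user-space model. It is only a sketch and not kernel code: every type and
function name in it (flow_stats, block_cb, slave_stats_cb,
block_query_stats) is made up for illustration and does not correspond to
the real tc block API. It assumes the non-shared case, i.e. one callback
registered per slave.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; none of these names are real kernel APIs. */
struct flow_stats {
	uint64_t packets;
	uint64_t bytes;
};

/* A block callback adds its own counters into the caller's stats. */
typedef void (*block_cb_t)(void *cb_priv, struct flow_stats *stats);

struct block_cb {
	block_cb_t cb;
	void *cb_priv;
};

/* Per-slave HW counters for one offloaded rule (assumed layout). */
struct slave_counters {
	uint64_t packets;
	uint64_t bytes;
};

/* Callback a slave's driver would register on the block. */
static void slave_stats_cb(void *cb_priv, struct flow_stats *stats)
{
	const struct slave_counters *c = cb_priv;

	stats->packets += c->packets;
	stats->bytes += c->bytes;
}

/*
 * Stats query on the block: walk every registered callback and let each
 * one add its share. The bond itself contributes nothing here; the
 * caller sees a single accumulated result.
 */
static struct flow_stats block_query_stats(const struct block_cb *cbs,
					   size_t n)
{
	struct flow_stats total = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++)
		cbs[i].cb(cbs[i].cb_priv, &total);

	return total;
}
```

With two slaves each registering slave_stats_cb, a single query returns the
sum of both slaves' counters, which is the "tc will see them both together"
behaviour described in the quoted text, not two separate replies.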