From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jakub Kicinski Subject: Re: [PATCH net-next 4/6] nfp: flower: offload tunnel decap rules via indirect TC blocks Date: Tue, 27 Nov 2018 10:05:02 -0800 Message-ID: <20181127100502.6992916a@cakuba.netronome.com> References: <20181110052131.3306-1-jakub.kicinski@netronome.com> <20181110052131.3306-5-jakub.kicinski@netronome.com> <13ea87cc-a662-c510-5e94-d3799040a4cb@solarflare.com> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: , , , , , , , , , John Hurley To: Edward Cree Return-path: Received: from mail-pl1-f194.google.com ([209.85.214.194]:33382 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725872AbeK1FDu (ORCPT ); Wed, 28 Nov 2018 00:03:50 -0500 Received: by mail-pl1-f194.google.com with SMTP id z23so15998812plo.0 for ; Tue, 27 Nov 2018 10:05:07 -0800 (PST) In-Reply-To: <13ea87cc-a662-c510-5e94-d3799040a4cb@solarflare.com> Sender: netdev-owner@vger.kernel.org List-ID: On Tue, 27 Nov 2018 16:04:29 +0000, Edward Cree wrote: > On 10/11/18 05:21, Jakub Kicinski wrote: > > From: John Hurley > > > > Previously, TC block tunnel decap rules were only offloaded when a > > callback was triggered through registration of the rules egress device. > > This meant that the driver had no access to the ingress netdev and so > > could not verify it was the same tunnel type that the rule implied. > > > > Register tunnel devices for indirect TC block offloads in NFP, giving > > access to new rules based on the ingress device rather than egress. Use > > this to verify the netdev type of VXLAN and Geneve based rules and offl= oad > > the rules to HW if applicable. > > > > Tunnel registration is done via a netdev notifier. On notifier > > registration, this is triggered for already existing netdevs. This means > > that NFP can register for offloads from devices that exist before it is > > loaded (filter rules will be replayed from the TC core). Similarly, on > > notifier unregister, a call is triggered for each currently active netd= ev. > > This allows the driver to unregister any indirect block callbacks that = may > > still be active. > > > > Signed-off-by: John Hurley > > Reviewed-by: Jakub Kicinski =20 > I've been experimenting with this to try to understand it better, and I'm > =C2=A0struggling to understand how the various cb_ident parameters are me= ant to > =C2=A0be used.=C2=A0 In particular: > > diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/driv= ers/net/ethernet/netronome/nfp/flower/offload.c > > index 2c32edfc1a9d..222e1a98cf16 100644 > > --- a/drivers/net/ethernet/netronome/nfp/flower/offload.c > > +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c =20 >=20 > [...] >=20 > > +int nfp_flower_reg_indir_block_handler(struct nfp_app *app, > > + struct net_device *netdev, > > + unsigned long event) > > +{ > > + int err; > > + > > + if (!nfp_fl_is_netdev_to_offload(netdev)) > > + return NOTIFY_OK; > > + > > + if (event =3D=3D NETDEV_REGISTER) { > > + err =3D __tc_indr_block_cb_register(netdev, app, > > + nfp_flower_indr_setup_tc_cb, > > + netdev); =20 > Here 'netdev' is passed as cb_ident, meaning that if you have two of these > =C2=A0NICs in a system you will (AIUI) get EEXIST when the second one tri= es to > =C2=A0register its indr_block_cb, since both will have the same cb and cb= _ident. > This is because 'netdev' is the netdevice being watched, rather than the > =C2=A0netdevice doing the watching. > AFAICT the right thing to do here would be to pass in 'app' (more general= ly, > =C2=A0the same thing as cb_priv) as that should uniquely identify the wat= cher > =C2=A0(the watchee is already identified by the 'dev' parameter, which is= used to > =C2=A0obtain the indr_dev on which the cb_list resides). >=20 > Curiously, though you also pass netdev as the cb_ident to > =C2=A0tcf_block_cb_register() in nfp_flower_setup_indr_tc_block(), that d= oes not > =C2=A0appear to have the same result, as __tcf_block_cb_register() doesn'= t check > =C2=A0for an already existing entry =E2=80=94 with the result that > =C2=A0tcf_block_cb_unregister() may remove the wrong entry from the list.= =C2=A0 If two > =C2=A0callers register blocks with the same cb_ident, then one unregister= s, the > =C2=A0same block will be removed no matter which caller did the unregiste= r. Ack, thanks for pointing that out. > It is also concerning that both __tc_indr_block_cb_unregister() and > =C2=A0tcf_block_cb_unregister() return void and will silently do nothing = if the > =C2=A0passed cb_ident is not found.=C2=A0 In my experiments this did not = conduce to > =C2=A0debuggability; I think they should either WARN_ON, or at least retu= rn > =C2=A0(int) -ENOENT so that the caller has the opportunity to do so. WARN_ON() seems reasonable, returning ENOENT would only push that WARN_ON() to the drivers, I don't see what else the driver could do with an error on unregister :)