From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jakub Kicinski <jakub.kicinski@netronome.com>
Subject: Re: [PATCH net-next 4/6] nfp: flower: offload tunnel decap rules
 via indirect TC blocks
Date: Tue, 27 Nov 2018 10:05:02 -0800
Message-ID: <20181127100502.6992916a@cakuba.netronome.com>
References: <20181110052131.3306-1-jakub.kicinski@netronome.com>
        <20181110052131.3306-5-jakub.kicinski@netronome.com>
        <13ea87cc-a662-c510-5e94-d3799040a4cb@solarflare.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Cc: <davem@davemloft.net>, <oss-drivers@netronome.com>,
        <netdev@vger.kernel.org>, <jiri@resnulli.us>,
        <xiyou.wangcong@gmail.com>, <jhs@mojatatu.com>,
        <gerlitz.or@gmail.com>, <ozsh@mellanox.com>, <vladbu@mellanox.com>,
        John Hurley <john.hurley@netronome.com>
To: Edward Cree <ecree@solarflare.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from mail-pl1-f194.google.com ([209.85.214.194]:33382 "EHLO
        mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1725872AbeK1FDu (ORCPT
        <rfc822;netdev@vger.kernel.org>); Wed, 28 Nov 2018 00:03:50 -0500
Received: by mail-pl1-f194.google.com with SMTP id z23so15998812plo.0
        for <netdev@vger.kernel.org>; Tue, 27 Nov 2018 10:05:07 -0800 (PST)
In-Reply-To: <13ea87cc-a662-c510-5e94-d3799040a4cb@solarflare.com>
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On Tue, 27 Nov 2018 16:04:29 +0000, Edward Cree wrote:
> On 10/11/18 05:21, Jakub Kicinski wrote:
> > From: John Hurley <john.hurley@netronome.com>
> >
> > Previously, TC block tunnel decap rules were only offloaded when a
> > callback was triggered through registration of the rules egress device.
> > This meant that the driver had no access to the ingress netdev and so
> > could not verify it was the same tunnel type that the rule implied.
> >
> > Register tunnel devices for indirect TC block offloads in NFP, giving
> > access to new rules based on the ingress device rather than egress. Use
> > this to verify the netdev type of VXLAN and Geneve based rules and offl=
oad
> > the rules to HW if applicable.
> >
> > Tunnel registration is done via a netdev notifier. On notifier
> > registration, this is triggered for already existing netdevs. This means
> > that NFP can register for offloads from devices that exist before it is
> > loaded (filter rules will be replayed from the TC core). Similarly, on
> > notifier unregister, a call is triggered for each currently active netd=
ev.
> > This allows the driver to unregister any indirect block callbacks that =
may
> > still be active.
> >
> > Signed-off-by: John Hurley <john.hurley@netronome.com>
> > Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> =20
> I've been experimenting with this to try to understand it better, and I'm
> =C2=A0struggling to understand how the various cb_ident parameters are me=
ant to
> =C2=A0be used.=C2=A0 In particular:
> > diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/driv=
ers/net/ethernet/netronome/nfp/flower/offload.c
> > index 2c32edfc1a9d..222e1a98cf16 100644
> > --- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
> > +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c =20
>=20
> [...]
>=20
> > +int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
> > +				       struct net_device *netdev,
> > +				       unsigned long event)
> > +{
> > +	int err;
> > +
> > +	if (!nfp_fl_is_netdev_to_offload(netdev))
> > +		return NOTIFY_OK;
> > +
> > +	if (event =3D=3D NETDEV_REGISTER) {
> > +		err =3D __tc_indr_block_cb_register(netdev, app,
> > +						  nfp_flower_indr_setup_tc_cb,
> > +						  netdev); =20
> Here 'netdev' is passed as cb_ident, meaning that if you have two of these
> =C2=A0NICs in a system you will (AIUI) get EEXIST when the second one tri=
es to
> =C2=A0register its indr_block_cb, since both will have the same cb and cb=
_ident.
> This is because 'netdev' is the netdevice being watched, rather than the
> =C2=A0netdevice doing the watching.
> AFAICT the right thing to do here would be to pass in 'app' (more general=
ly,
> =C2=A0the same thing as cb_priv) as that should uniquely identify the wat=
cher
> =C2=A0(the watchee is already identified by the 'dev' parameter, which is=
 used to
> =C2=A0obtain the indr_dev on which the cb_list resides).
>=20
> Curiously, though you also pass netdev as the cb_ident to
> =C2=A0tcf_block_cb_register() in nfp_flower_setup_indr_tc_block(), that d=
oes not
> =C2=A0appear to have the same result, as __tcf_block_cb_register() doesn'=
t check
> =C2=A0for an already existing entry =E2=80=94 with the result that
> =C2=A0tcf_block_cb_unregister() may remove the wrong entry from the list.=
=C2=A0 If two
> =C2=A0callers register blocks with the same cb_ident, then one unregister=
s, the
> =C2=A0same block will be removed no matter which caller did the unregiste=
r.

Ack, thanks for pointing that out.

> It is also concerning that both __tc_indr_block_cb_unregister() and
> =C2=A0tcf_block_cb_unregister() return void and will silently do nothing =
if the
> =C2=A0passed cb_ident is not found.=C2=A0 In my experiments this did not =
conduce to
> =C2=A0debuggability; I think they should either WARN_ON, or at least retu=
rn
> =C2=A0(int) -ENOENT so that the caller has the opportunity to do so.

WARN_ON() seems reasonable, returning ENOENT would only push that
WARN_ON() to the drivers, I don't see what else the driver could do
with an error on unregister :)