Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v3 1/9] ethernet: add sun8i-emac driver
From: LABBE Corentin @ 2016-09-13 13:33 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	maxime.ripard-wi1+55ScJUtKEb57/3fJTNBPR1lH4CV8, wens-jdAy2FN1RRM,
	linux-I+IVW8TIWO2tmTQ+vhA3Yw, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	devicetree-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20160909141527.GE30871-g2DYL2Zd6BY@public.gmane.org>

On Fri, Sep 09, 2016 at 04:15:27PM +0200, Andrew Lunn wrote:
> Hi Corentin
> 
> > +static int sun8i_emac_mdio_register(struct net_device *ndev)
> > +{
> > +	struct sun8i_emac_priv *priv = netdev_priv(ndev);
> > +	struct mii_bus *bus;
> > +	int ret;
> > +
> > +	bus = mdiobus_alloc();
> 
> You can use devm_mdiobus_alloc() which will simplify your error
> handling and unregister code.
> 
> 	 Andrew

Hello

Since the mdio bus is allocated on ndev/open, it need to be removed when ndev/stop is called.
So devm_mdiobus_alloc cannot be used.

Regards

Corentin Labbe
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v5 0/6] Add eBPF hooks for cgroups
From: Daniel Mack @ 2016-09-13 13:31 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: htejun-b10kYP2dOMg, daniel-FeC+5ew28dpmcu3hnIyYJQ,
	ast-b10kYP2dOMg, davem-fT/PcQaiUtIeIZ0/mPfg9Q, kafai-b10kYP2dOMg,
	fw-HFFVJYpyMKqzQB+pC5nmwQ, harald-H+wXaHxf7aLQT0dZR+AlfA,
	netdev-u79uwXL29TY76Z2rM5mHXA, sargun-GaZTRHToo+CzQB+pC5nmwQ,
	cgroups-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20160913115627.GA4898@salvia>

Hi,

On 09/13/2016 01:56 PM, Pablo Neira Ayuso wrote:
> On Mon, Sep 12, 2016 at 06:12:09PM +0200, Daniel Mack wrote:
>> This is v5 of the patch set to allow eBPF programs for network
>> filtering and accounting to be attached to cgroups, so that they apply
>> to all sockets of all tasks placed in that cgroup. The logic also
>> allows to be extendeded for other cgroup based eBPF logic.
> 
> 1) This infrastructure can only be useful to systemd, or any similar
>    orchestration daemon. Look, you can only apply filtering policies
>    to processes that are launched by systemd, so this only works
>    for server processes.

Sorry, but both statements aren't true. The eBPF policies apply to every
process that is placed in a cgroup, and my example program in 6/6 shows
how that can be done from the command line. Also, systemd is able to
control userspace processes just fine, and it not limited to 'server
processes'.

> For client processes this infrastructure is
>    *racy*, you have to add new processes in runtime to the cgroup,
>    thus there will be time some little time where no filtering policy
>    will be applied. For quality of service, this may be an acceptable
>    race, but this is aiming to deploy a filtering policy.

That's a limitation that applies to many more control mechanisms in the
kernel, and it's something that can easily be solved with fork+exec.

> 2) This aproach looks uninfrastructured to me. This provides a hook
>    to push a bpf blob at a place in the stack that deploys a filtering
>    policy that is not visible to others.

That's just as transparent as SO_ATTACH_FILTER. What kind of
introspection mechanism do you have in mind?

> We have interfaces that allows
>    us to dump the filtering policy that is being applied, report events
>    to enable cooperation between several processes with similar
>    capabilities and so on.

Well, in practice, for netfilter, there can only be one instance in the
system that acts as central authoritative, otherwise you'll end up with
orphaned entries or with situation where some client deletes rules
behind the back of the one that originally installed it. So I really
think there is nothing wrong with demanding a single, privileged
controller to manage things.

>> After chatting with Daniel Borkmann and Alexei off-list, we concluded
>> that __dev_queue_xmit() is the place where the egress hooks should live
>> when eBPF programs need access to the L2 bits of the skb.
> 
> 3) This egress hook is coming very late, the only reason I find to
>    place it at __dev_queue_xmit() is that bpf naturally works with
>    layer 2 information in place. But this new hook is placed in
>    _everyone's output ath_ that only works for the very specific
>    usecase I exposed above.

It's about filtering outgoing network packets of applications, and
providing them with L2 information for filtering purposes. I don't think
that's a very specific use-case.

When the feature is not used at all, the added costs on the output path
are close to zero, due to the use of static branches. If used somewhere
in the system but not for the packet in flight, costs are slightly
higher but acceptable. In fact, it's not even measurable in my tests
here. How is that different from the netfilter OUTPUT hook, btw?

That said, limiting it to L3 is still an option. It's just that we need
ingress and egress to be in sync, so both would be L3 then. So far, the
possible advantages for future use-cases having access to L2 outweighed
the concerns of putting the hook to dev_queue_xmit(), but I'm open to
discussing that.

> The main concern during the workshop was that a hook only for cgroups
> is too specific, but this is actually even more specific than this.

This patch set merely implements an infrastructure that can accommodate
many more things as well in the future. We could, in theory, even add
hooks for forwarded packets specifically, or other eBPF programs, not
even for network filtering etc.

> I have nothing against systemd or the needs for more
> programmability/flexibility in the stack, but I think this needs to
> fulfill some requirements to fit into the infrastructure that we have
> in the right way.

Well, as I explained already, this patch set results from endless
discussions that went nowhere, about how such a thing can be achieved
with netfilter.


Thanks,
Daniel

^ permalink raw reply

* Re: [PATCH 0/9] Move runnable code (tests) from Documentation to selftests
From: Shuah Khan @ 2016-09-13 13:25 UTC (permalink / raw)
  To: Jani Nikula, Jonathan Corbet
  Cc: richardcochran, linux-doc, linux-kernel, netdev, linux-kselftest,
	Shuah Khan
In-Reply-To: <87sht4tav2.fsf@intel.com>

On 09/13/2016 03:20 AM, Jani Nikula wrote:
> On Sat, 10 Sep 2016, Jonathan Corbet <corbet@lwn.net> wrote:
>> On Fri,  9 Sep 2016 16:22:41 -0600
>> Shuah Khan <shuahkh@osg.samsung.com> wrote:
>>
>>> Move runnable code (tests) from Documentation to selftests and update
>>> Makefiles to work under selftests.
>>>
>>> Jon Corbet and I discussed this in an email thread and as per that
>>> discussion, this patch series moves all the tests that are under the
>>> Documentation directory to selftests. There is more runnable code in
>>> the form of examples and utils and that is going to be another patch
>>> series. I moved just the tests and left the documentation files as is.
>>
>> I'm fine with the idea, but it looks like a couple of tweaks are needed,
>> in particular to avoid leaving behind dangling references in
>> Documentation/Makefile that cause build errors.
>>
>> I think the individual patches probably need a wider CC list as well.
>> I'd use the get_maintainer script (or git) to see who has taken an
>> interest in the individual tests and make sure they are aware of the
>> move.
> 
> FWIW, I'm in favor of moving *all* the code away from Documentation, not
> just tests. Essentially removing the CONFIG_BUILD_DOCSRC config option,
> and reserving Documentation/Makefile for documentation build. After this
> series, some of the remaining code belongs under samples, some under
> tools.

I am planning another patch series to move all the examples and samples
and tools to their right location.

> 
> We could make it possible to include the code samples from samples into
> the Sphinx built documentation.
> 
> BR,
> Jani.
> 

I can't say I understand Sphinx, however, it might make sense to include
samples into Sphinx build. Is this approach different from the way they
are built under Documentation via Doc Makfiles now?

thanks,
-- Shuah

^ permalink raw reply

* RE: [RFC V3 PATCH 03/26] net/netpolicy: get device queue irq information
From: Liang, Kan @ 2016-09-13 13:22 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Sergei Shtylyov, davem@davemloft.net,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Kirsher, Jeffrey T, mingo@redhat.com, peterz@infradead.org,
	kuznet@ms2.inr.ac.ru, jmorris@namei.org, yoshfuji@linux-ipv6.org,
	kaber@trash.net, akpm@linux-foundation.org, keescook@chromium.org,
	viro@zeniv.linux.org.uk, gorcunov@openvz.org,
	john.stultz@linaro.org
In-Reply-To: <CAKgT0Uf+guzBrfPUFvMk0RnXbQOJUjkBUp0TeK9HOSb_LU7e8Q@mail.gmail.com>



> On Tue, Sep 13, 2016 at 5:23 AM, Liang, Kan <kan.liang@intel.com> wrote:
> >>
> >> Hello.
> >>
> >> On 09/12/2016 05:55 PM, kan.liang@intel.com wrote:
> >>
> >> > From: Kan Liang <kan.liang@intel.com>
> >> >
> >> > Net policy needs to know device information. Currently, it's enough
> >> > to only get irq information of rx and tx queues.
> >> >
> >> > This patch introduces ndo ops to do so, not ethtool ops.
> >> > Because there are already several ways to get irq information in
> >> > userspace. It's not necessory to extend the ethtool.
> >>
> >>     Necessary.
> >
> > OK. I will extend the ethtool in next version.
> >
> > Thanks,
> > Kan
> 
> Kan, I don't think Sergei was saying you have to extend the ethtool.
> Your spelling of necessary was incorrect in your patch description.
> 
> Sergei, please feel free to tell me I am wrong if my assumption on that is
> incorrect.
> 
> - Alex

Oh, I see. Thanks Alex. :)

Kan



^ permalink raw reply

* Re: [PATCH 2/9] selftests: update filesystems Makefile to work under selftests
From: Shuah Khan @ 2016-09-13 13:20 UTC (permalink / raw)
  To: Michael Ellerman, corbet, richardcochran
  Cc: linux-doc, linux-kernel, netdev, linux-kselftest, Shuah Khan
In-Reply-To: <87fup4ovx6.fsf@concordia.ellerman.id.au>

On 09/13/2016 05:56 AM, Michael Ellerman wrote:
> Shuah Khan <shuahkh@osg.samsung.com> writes:
> 
>> Update to work under selftests. dnotify_test will not be run as part of
>> selftests suite and will not included in install targets. It can be built
>> separately for now.
>>
>> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
>> ---
>>  tools/testing/selftests/filesystems/Makefile | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/testing/selftests/filesystems/Makefile b/tools/testing/selftests/filesystems/Makefile
>> index 883010c..f1dce5c 100644
>> --- a/tools/testing/selftests/filesystems/Makefile
>> +++ b/tools/testing/selftests/filesystems/Makefile
>> @@ -1,5 +1,7 @@
>> -# List of programs to build
>> -hostprogs-y := dnotify_test
>> +TEST_PROGS := dnotify_test
>> +all: $(TEST_PROGS)
>>  
>> -# Tell kbuild to always build the programs
>> -always := $(hostprogs-y)
>> +include ../lib.mk
>> +
>> +clean:
>> +	rm -fr dnotify_test
> 
> That's a complete rewrite of the Makefile, so I don't think there's any
> value in bringing its content across from Documentation.

Moving Makefile accomplishes delete at the same time. I can combine
the move and updating Makefile into one single patch.

> 
> Better IMHO would be to squash this with the previous patch, so we get a
> working test under selftests in a single commit.
> 
> cheers
> 

thanks,
-- Shuah

^ permalink raw reply

* Re: [Intel-wired-lan] [net-next PATCH v3 1/3] e1000: track BQL bytes regardless of skb or not
From: Alexander Duyck @ 2016-09-13 13:17 UTC (permalink / raw)
  To: Tom Herbert
  Cc: John Fastabend, Brenden Blanco, Alexei Starovoitov, Jeff Kirsher,
	Jesper Dangaard Brouer, David Miller, Cong Wang, intel-wired-lan,
	William Tu, Netdev
In-Reply-To: <CALx6S364cmWz-6trzyX5pdhk_G_AC4XNoxCga13=_6uoWb5CTg@mail.gmail.com>

On Mon, Sep 12, 2016 at 9:25 PM, Tom Herbert <tom@herbertland.com> wrote:
> On Mon, Sep 12, 2016 at 8:00 PM, Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>> On Mon, Sep 12, 2016 at 3:13 PM, John Fastabend
>> <john.fastabend@gmail.com> wrote:
>>> The BQL API does not reference the sk_buff nor does the driver need to
>>> reference the sk_buff to calculate the length of a transmitted frame.
>>> This patch removes an sk_buff reference from the xmit irq path and
>>> also allows packets sent from XDP to use BQL.
>>>
>>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>>> ---
>>>  drivers/net/ethernet/intel/e1000/e1000_main.c |    7 ++-----
>>>  1 file changed, 2 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
>>> index f42129d..62a7f8d 100644
>>> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
>>> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
>>> @@ -3882,11 +3882,8 @@ static bool e1000_clean_tx_irq(struct e1000_adapter *adapter,
>>>                         if (cleaned) {
>>>                                 total_tx_packets += buffer_info->segs;
>>>                                 total_tx_bytes += buffer_info->bytecount;
>>> -                               if (buffer_info->skb) {
>>> -                                       bytes_compl += buffer_info->skb->len;
>>> -                                       pkts_compl++;
>>> -                               }
>>> -
>>> +                               bytes_compl += buffer_info->length;
>>> +                               pkts_compl++;
>>>                         }
>>>                         e1000_unmap_and_free_tx_resource(adapter, buffer_info);
>>>                         tx_desc->upper.data = 0;
>>
>> Actually it might be worth looking into why we have two different
>> stats for tracking bytecount and segs.  From what I can tell the
>> pkts_compl value is never actually used.  The function doesn't even
>> use the value so it is just wasted cycles.  And as far as the bytes go
>
> Transmit flow steering which I posted and is pending on some testing
> uses the pkt count BQL to track inflight packets.
>
>> the accounting would be more accurate if you were to use bytecount
>> instead of buffer_info->skb->len.  You would just need to update the
>> xmit function to use that on the other side so that they match.

Okay, that makes sense.

But as I was saying we might be better off using the segs and
bytecount values instead of the skb->len in the xmit and cleanup paths
to get more accurate accounting for the total bytes/packets coming and
going from the interface.  That way we can avoid any significant
change in behavior between TSO and GSO.

- Alex

^ permalink raw reply

* Re: [RFC V3 PATCH 03/26] net/netpolicy: get device queue irq information
From: Alexander Duyck @ 2016-09-13 13:14 UTC (permalink / raw)
  To: Liang, Kan
  Cc: Sergei Shtylyov, davem@davemloft.net,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	Kirsher, Jeffrey T, mingo@redhat.com, peterz@infradead.org,
	kuznet@ms2.inr.ac.ru, jmorris@namei.org, yoshfuji@linux-ipv6.org,
	kaber@trash.net, akpm@linux-foundation.org, keescook@chromium.org,
	viro@zeniv.linux.org.uk, gorcunov@openvz.org,
	john.stultz@linaro.org
In-Reply-To: <37D7C6CF3E00A74B8858931C1DB2F07750BBF530@SHSMSX103.ccr.corp.intel.com>

On Tue, Sep 13, 2016 at 5:23 AM, Liang, Kan <kan.liang@intel.com> wrote:
>>
>> Hello.
>>
>> On 09/12/2016 05:55 PM, kan.liang@intel.com wrote:
>>
>> > From: Kan Liang <kan.liang@intel.com>
>> >
>> > Net policy needs to know device information. Currently, it's enough to
>> > only get irq information of rx and tx queues.
>> >
>> > This patch introduces ndo ops to do so, not ethtool ops.
>> > Because there are already several ways to get irq information in
>> > userspace. It's not necessory to extend the ethtool.
>>
>>     Necessary.
>
> OK. I will extend the ethtool in next version.
>
> Thanks,
> Kan

Kan, I don't think Sergei was saying you have to extend the ethtool.
Your spelling of necessary was incorrect in your patch description.

Sergei, please feel free to tell me I am wrong if my assumption on
that is incorrect.

- Alex

^ permalink raw reply

* Re: [PATCH 3/3] net-next: dsa: add new driver for qca8xxx family
From: Andrew Lunn @ 2016-09-13 13:14 UTC (permalink / raw)
  To: John Crispin
  Cc: David S. Miller, Florian Fainelli, netdev, linux-kernel,
	qsdk-review
In-Reply-To: <5c028659-db7e-9a98-7ec1-1e2d19d135f9@phrozen.org>

On Tue, Sep 13, 2016 at 11:40:43AM +0200, John Crispin wrote:
> 
> 
> On 13/09/2016 03:23, Andrew Lunn wrote:
> > So lets see if i have this right.
> > 
> > Port 0 has no internal phy.
> > Port 1 has an internal PHY at MDIO address 0.
> > Port 2 has an internal PHY at MDIO address 1.
> > ...
> > Port 5 has an internal PHY ad MDIO address 4.
> > Port 6 has no internal PHY.
> 
> Hi Andrew
> 
> correct. port 0 is the cpu port. I initially thought that port6 can also
> be used as te cpu port but there are various places in the datasheet
> stating that the cpu port is 0.

O.K, please correct the comments in the code, and make this clear in
the device tree binding.

> in some of the reference designs, port6
> is wired to a 2nd gmac of the cpu and in those cases port 6 is then
> hardwired to port 5 of the switch and called wan. right now the driver
> does not support this feature.

Hum, why not?

brctl addbr br1
brctl addif br1 port5
brctl addif br1 port6

They are just ports on a switch, so bridge them together.

> ok, i will simply substract 1 from the phy_addr inside the mdio
> callbacks. this would make the code more readable and make the DT
> binding compliant with the ePAPR spec.

It does however need well commenting. It is setting a trap for anybody
who puts an external PHY on port 6. If they access that PHY via these
functions, the address is off by one.

This is the first silicon vendor who made their MDIO addresses for
PHYs illogical. So i'm thinking we maybe should add a new function to
dsa_switch_ops.

	/* Return the MDIO address for the PHY for this port. */
        int     (*phy_port_map(struct dsa_switch *ds, int port);

This should return the MDIO address for integrated PHYs only, or
-ENODEV if the port does not have an integrated PHY. For an external
PHY, a phy-handle should be used. This phy_port_map() is used in
dsa_slave_phy_setup(). But dsa_slave_phy_setup() is already too
complex, so it needs doing with care.

	 Andrew

^ permalink raw reply

* Re: [PATCH] ath10k: Spelling and miscellaneous neatening
From: Valo, Kalle @ 2016-09-13 12:43 UTC (permalink / raw)
  To: Joe Perches
  Cc: netdev@vger.kernel.org, linux-wireless@vger.kernel.org,
	linux-kernel@vger.kernel.org, ath10k@lists.infradead.org
In-Reply-To: <95a6b65277914d88178ed2b15e182455067cca60.1472490319.git.joe@perches.com>

Joe Perches <joe@perches.com> writes:

> Correct some trivial comment typos.
> Remove unnecessary parentheses in a long line.
> Convert a return; before the end of a void function definition to just ;
>
> Signed-off-by: Joe Perches <joe@perches.com>

[...]

> --- a/drivers/net/wireless/ath/ath10k/core.c
> +++ b/drivers/net/wireless/ath/ath10k/core.c
> @@ -2118,7 +2118,7 @@ err:
>  	/* TODO: It's probably a good idea to release device from the driver
>  	 * but calling device_release_driver() here will cause a deadlock.
>  	 */
> -	return;
> +	;
>  }

I don't think this improves anything, I dropped this part from the patch
in my pending branch.

-- 
Kalle Valo

^ permalink raw reply

* [ANNOUNCE] netdev 1.2 tokyo weekly update (13th September, 2016)
From: Hajime Tazaki @ 2016-09-13 12:28 UTC (permalink / raw)
  To: netdev, netfilter-devel, linux-wireless, netdev-conference, lwn


Hello folks,

I hope you're fine and ready to trip to Tokyo.

Here is an weekly update of Netdev 1.2 Tokyo.

== Keynote talk ==

We confirmed that David Miller will give a keynote titled
"Fast Programmable Networks & Encapsulated Protocols".


== Newly accepted sessions ==

We also accepted one additional talk in the last week.

- Network interface configuration on a Linux NOS
  by Roopa Prabhu

Full list of accepted sessions is available here.

http://netdevconf.org/1.2/accepted-sessions.html


== Our sponsors ==

We got new sponsors, Cisco as a Gold sponsor, LWN.net as a
Media sponsor.  Many thanks for your support.

- Platinum
Verizon, Facebook, Cumulus Networks
- Gold
Mojatatu Networks, VMWare, Google, NTT, LinkedIn, Cisco (new)
- Silver
NetApp, IIJ, Netronome, SolarFlare, Mellanox, Sophos
- Bronze
Zen Load Balancer
- Media Sponsor
lwn.net (new)

Twitter: https://twitter.com/netdev01
Web: http://netdevconf.org/1.2/


== Others ==

Be prepared for your travel. Hotel and travel information
are available on the web pages.

http://netdevconf.org/1.2/travel.html
http://netdevconf.org/1.2/hotel.html

The early bird tickets is still available until 15th.
Please be registered - your early registration is always
helpful to organize a great conference.

http://netdevconf.org/1.2/registration.html


Looking forward to seeing you in Tokyo very soon.

-- Hajime



^ permalink raw reply

* Re: ath10k: remove unused variable ar_pci
From: Kalle Valo @ 2016-09-13 12:25 UTC (permalink / raw)
  To: Chaehyun Lim
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA, Chaehyun Lim,
	ath10k-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <20160905133802.18755-1-chaehyun.lim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Chaehyun Lim <chaehyun.lim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> Trival fix to remove unused variable ar_pci in ath10k_pci_tx_pipe_cleanup
> when building with W=1:
> drivers/net/wireless/ath/ath10k/pci.c:1696:21: warning: variable
> 'ar_pci' set but not used [-Wunused-but-set-variable]
> 
> Signed-off-by: Chaehyun Lim <chaehyun.lim-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

Thanks, 1 patch applied to ath-next branch of ath.git:

214d55394481 ath10k: remove unused variable ar_pci

-- 
Sent by pwcli
https://patchwork.kernel.org/patch/9313963/

^ permalink raw reply

* Re: [V2] ath10k: fix memory leak on caldata on error exit path
From: Kalle Valo @ 2016-09-13 12:24 UTC (permalink / raw)
  To: Colin Ian King; +Cc: ath10k, linux-wireless, netdev, linux-kernel
In-Reply-To: <20160903163819.13115-1-colin.king@canonical.com>

Colin Ian King <colin.king@canonical.com> wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> caldata is not being free'd on the error exit path, causing
> a memory leak and data definitely should not be freed. Free
> caldata instead of data.
> 
> Thanks to Kalle Valo for spotting that data should not be
> free'd.
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Thanks, 1 patch applied to ath-next branch of ath.git:

5f4761dda2ba ath10k: fix memory leak on caldata on error exit path

-- 
Sent by pwcli
https://patchwork.kernel.org/patch/9312163/

^ permalink raw reply

* [RFC PATCH] xen-netback: fix error handling on netback_probe()
From: Filipe Manco @ 2016-09-13 12:11 UTC (permalink / raw)
  To: netdev; +Cc: wei.liu2, filipe.manco

In case of error during netback_probe() (e.g. an entry missing on the
xenstore) netback_remove() is called on the new device, which will set
the device backend state to XenbusStateClosed by calling
set_backend_state(). However, the backend state wasn't initialized by
netback_probe() at this point, which will cause and invalid transaction
and set_backend_state() to BUG().

Initialize the backend state at the beginning of netback_probe() to
XenbusStateInitialising, and create a new valid state transaction on
set_backend_state(), from XenbusStateInitialising to XenbusStateClosed.

Signed-off-by: Filipe Manco <filipe.manco@neclab.eu>
---
 drivers/net/xen-netback/xenbus.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 6a31f2610c23..c0e5f6994d01 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -270,6 +270,7 @@ static int netback_probe(struct xenbus_device *dev,
 
 	be->dev = dev;
 	dev_set_drvdata(&dev->dev, be);
+	be->state = XenbusStateInitialising;
 
 	sg = 1;
 
@@ -515,6 +516,15 @@ static void set_backend_state(struct backend_info *be,
 {
 	while (be->state != state) {
 		switch (be->state) {
+		case XenbusStateInitialising:
+			switch (state) {
+			case XenbusStateClosed:
+				backend_switch_state(be, XenbusStateClosed);
+				break;
+			default:
+				BUG();
+			}
+			break;
 		case XenbusStateClosed:
 			switch (state) {
 			case XenbusStateInitWait:
-- 
2.7.4

^ permalink raw reply related

* RE: [RFC V3 PATCH 03/26] net/netpolicy: get device queue irq information
From: Liang, Kan @ 2016-09-13 12:23 UTC (permalink / raw)
  To: Sergei Shtylyov, davem@davemloft.net,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
  Cc: Kirsher, Jeffrey T, mingo@redhat.com, peterz@infradead.org,
	kuznet@ms2.inr.ac.ru, jmorris@namei.org, yoshfuji@linux-ipv6.org,
	kaber@trash.net, akpm@linux-foundation.org, keescook@chromium.org,
	viro@zeniv.linux.org.uk, gorcunov@openvz.org,
	john.stultz@linaro.org, aduyck@mirantis.com, decot@googlers.com,
	alexander.duyck@gmail.com, daniel@iogearbox.net, "tom@
In-Reply-To: <665fd0a7-fe83-742a-28fd-d597b28505b5@cogentembedded.com>

> 
> Hello.
> 
> On 09/12/2016 05:55 PM, kan.liang@intel.com wrote:
> 
> > From: Kan Liang <kan.liang@intel.com>
> >
> > Net policy needs to know device information. Currently, it's enough to
> > only get irq information of rx and tx queues.
> >
> > This patch introduces ndo ops to do so, not ethtool ops.
> > Because there are already several ways to get irq information in
> > userspace. It's not necessory to extend the ethtool.
> 
>     Necessary.

OK. I will extend the ethtool in next version.

Thanks,
Kan

> 
> > Signed-off-by: Kan Liang <kan.liang@intel.com>
> 
> [...]
> 
> MBR, Sergei


^ permalink raw reply

* RE: [RFC V3 PATCH 18/26] net/netpolicy: set tx queues according to policy
From: Liang, Kan @ 2016-09-13 12:22 UTC (permalink / raw)
  To: Tom Herbert
  Cc: David S. Miller, LKML, Linux Kernel Network Developers,
	Kirsher, Jeffrey T, Ingo Molnar, peterz@infradead.org,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, akpm@linux-foundation.org, Kees Cook,
	viro@zeniv.linux.org.uk, gorcunov@openvz.org, John Stultz,
	Alexander Duyck, David Decotigny,
	Alexander Duyck <alexander.duyc
In-Reply-To: <CALx6S36C3cJvH==e3FVf5jox=-OqEziEQTXvr8hmqshMgbGd8w@mail.gmail.com>



> -----Original Message-----
> From: Tom Herbert [mailto:tom@herbertland.com]
> Sent: Monday, September 12, 2016 4:23 PM
> To: Liang, Kan <kan.liang@intel.com>
> Cc: David S. Miller <davem@davemloft.net>; LKML <linux-
> kernel@vger.kernel.org>; Linux Kernel Network Developers
> <netdev@vger.kernel.org>; Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>;
> Ingo Molnar <mingo@redhat.com>; peterz@infradead.org; Alexey Kuznetsov
> <kuznet@ms2.inr.ac.ru>; James Morris <jmorris@namei.org>; Hideaki
> YOSHIFUJI <yoshfuji@linux-ipv6.org>; Patrick McHardy <kaber@trash.net>;
> akpm@linux-foundation.org; Kees Cook <keescook@chromium.org>;
> viro@zeniv.linux.org.uk; gorcunov@openvz.org; John Stultz
> <john.stultz@linaro.org>; Alexander Duyck <aduyck@mirantis.com>; Ben
> Hutchings <ben@decadent.org.uk>; David Decotigny <decot@googlers.com>;
> Florian Westphal <fw@strlen.de>; Alexander Duyck
> <alexander.duyck@gmail.com>; Daniel Borkmann <daniel@iogearbox.net>;
> rdunlap@infradead.org; Cong Wang <xiyou.wangcong@gmail.com>; Hannes
> Frederic Sowa <hannes@stressinduktion.org>; Stephen Hemminger
> <stephen@networkplumber.org>; Alexei Starovoitov
> <alexei.starovoitov@gmail.com>; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; Andi Kleen <andi@firstfloor.org>
> Subject: Re: [RFC V3 PATCH 18/26] net/netpolicy: set tx queues according to
> policy
> 
> On Mon, Sep 12, 2016 at 7:55 AM,  <kan.liang@intel.com> wrote:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > When the device tries to transmit a packet, netdev_pick_tx is called
> > to find the available tx queues. If the net policy is applied, it
> > picks up the assigned tx queue from net policy subsystem, and redirect
> > the traffic to the assigned queue.
> >
> > Signed-off-by: Kan Liang <kan.liang@intel.com>
> > ---
> >  include/net/sock.h |  9 +++++++++
> >  net/core/dev.c     | 20 ++++++++++++++++++--
> >  2 files changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/net/sock.h b/include/net/sock.h index
> > e1e9e3d..ca97f35 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -2280,4 +2280,13 @@ extern int sysctl_optmem_max;  extern __u32
> > sysctl_wmem_default;  extern __u32 sysctl_rmem_default;
> >
> > +/* Return netpolicy instance information from socket. */ static
> > +inline struct netpolicy_instance *netpolicy_find_instance(struct sock
> > +*sk) { #ifdef CONFIG_NETPOLICY
> > +       if (is_net_policy_valid(sk->sk_netpolicy.policy))
> > +               return &sk->sk_netpolicy; #endif
> > +       return NULL;
> > +}
> >  #endif /* _SOCK_H */
> > diff --git a/net/core/dev.c b/net/core/dev.c index 34b5322..b9a8044
> > 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3266,6 +3266,7 @@ struct netdev_queue *netdev_pick_tx(struct
> net_device *dev,
> >                                     struct sk_buff *skb,
> >                                     void *accel_priv)  {
> > +       struct sock *sk = skb->sk;
> >         int queue_index = 0;
> >
> >  #ifdef CONFIG_XPS
> > @@ -3280,8 +3281,23 @@ struct netdev_queue *netdev_pick_tx(struct
> net_device *dev,
> >                 if (ops->ndo_select_queue)
> >                         queue_index = ops->ndo_select_queue(dev, skb, accel_priv,
> >                                                             __netdev_pick_tx);
> > -               else
> > -                       queue_index = __netdev_pick_tx(dev, skb);
> > +               else {
> > +#ifdef CONFIG_NETPOLICY
> > +                       struct netpolicy_instance *instance;
> > +
> > +                       queue_index = -1;
> > +                       if (dev->netpolicy && sk) {
> > +                               instance = netpolicy_find_instance(sk);
> > +                               if (instance) {
> > +                                       if (!instance->dev)
> > +                                               instance->dev = dev;
> > +                                       queue_index = netpolicy_pick_queue(instance, false);
> > +                               }
> > +                       }
> > +                       if (queue_index < 0) #endif
> 
> I doubt this produces the intended effect. Several drivers use
> ndo_select_queue (such as mlx4) where there might do something special
> for a few packets but end up called the default handler which
> __netdev_pick_tx for most packets. So in such cases the netpolicy path would
> be routinely bypassed. Maybe this code should be in __netdev_pick_tx.

I will move the code to __netdev_pick_tx in next version.

Thanks,
Kan

> 
> Tom
> 
> > +                               queue_index = __netdev_pick_tx(dev, skb);
> > +               }
> >
> >                 if (!accel_priv)
> >                         queue_index = netdev_cap_txqueue(dev,
> > queue_index);
> > --
> > 2.5.5
> >

^ permalink raw reply

* Re: icmpv6: issue with routing table entries from link local addresses
From: Andreas Hübner @ 2016-09-13 11:59 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev
In-Reply-To: <bf36b487-1cdf-ede9-1ed8-cca17c7f5fc6@cumulusnetworks.com>

I think I found the relevant fixes:

First and foremost it's 741a11d9e410.
(net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set)

This seems to have already solved my problem, however there were two
followup fixes that I should probably also apply:

d46a9d678e4c net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set
6f21c96a78b8 ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()


So again, sorry for the noise and thanks for your help!

Andreas

^ permalink raw reply

* Re: [PATCH net 1/6] sctp: remove the unnecessary state check in sctp_outq_tail
From: Neil Horman @ 2016-09-13 11:57 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, linux-sctp, davem, Marcelo Ricardo Leitner,
	Vlad Yasevich, daniel
In-Reply-To: <CADvbK_f-09NvfeCbeaZZF8+LuMASTYnrh+02RvTz5UR-huiR3g@mail.gmail.com>

On Sat, Sep 10, 2016 at 12:03:53AM +0800, Xin Long wrote:
> > I don't know, I still don't feel safe about it.  I agree the socket lock keeps
> > the state from changing during a single transmission, which makes the use case
> > you are focused on correct.
> ok, :-)
> 
> >
> > That said, have you considered the retransmit case?  That is to say, if you
> > queue and flush the outq, and some packets fail delivery, and in the time
> > between the intial send and the expiration of the RTX timer (during which the
> > socket lock will have been released), an event may occur which changes the
> > transport state, which will then be ignored with your patch.
> Sorry, I'm not sure if I got it.
> 
> You mean "during which changes q->asoc->state", right ?
> 
> This patch removes the check of q->asoc->state in sctp_outq_tail().
> 
> sctp_outq_tail() is called for data only in:
> sctp_primitive_SEND -> sctp_do_sm -> sctp_cmd_send_msg ->
> sctp_cmd_interpreter -> sctp_cmd_send_msg() -> sctp_outq_tail()
> 
> before calling sctp_primitive_SEND, hold sock lock first.
> then sctp_primitive_SEND choose FUNC according:
> 
> #define TYPE_SCTP_PRIMITIVE_SEND  {
> ....
> 
> if asoc->state is unavailable, FUNC can't be sctp_cmd_send_msg,
> but sctp_sf_error_closed/sctp_sf_error_shutdown,  sctp_outq_tail
> can't be called, either.
> I mean sctp_primitive_SEND do the same check for asoc->state
> already actually.
> 
> so the code in sctp_outq_tail is redundant actually.
> 
> 
> 
> 
> >
> > Neil
> >
> 

Ok, you've convinced me, thanks for taking the time to go through it

Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* Re: [PATCH v5 0/6] Add eBPF hooks for cgroups
From: Pablo Neira Ayuso @ 2016-09-13 11:56 UTC (permalink / raw)
  To: Daniel Mack
  Cc: htejun, daniel, ast, davem, kafai, fw, harald, netdev, sargun,
	cgroups
In-Reply-To: <1473696735-11269-1-git-send-email-daniel@zonque.org>

Hi,

On Mon, Sep 12, 2016 at 06:12:09PM +0200, Daniel Mack wrote:
> This is v5 of the patch set to allow eBPF programs for network
> filtering and accounting to be attached to cgroups, so that they apply
> to all sockets of all tasks placed in that cgroup. The logic also
> allows to be extendeded for other cgroup based eBPF logic.

1) This infrastructure can only be useful to systemd, or any similar
   orchestration daemon. Look, you can only apply filtering policies
   to processes that are launched by systemd, so this only works
   for server processes. For client processes this infrastructure is
   *racy*, you have to add new processes in runtime to the cgroup,
   thus there will be time some little time where no filtering policy
   will be applied. For quality of service, this may be an acceptable
   race, but this is aiming to deploy a filtering policy.

2) This aproach looks uninfrastructured to me. This provides a hook
   to push a bpf blob at a place in the stack that deploys a filtering
   policy that is not visible to others. We have interfaces that allows
   us to dump the filtering policy that is being applied, report events
   to enable cooperation between several processes with similar
   capabilities and so on.  For the XDP thing, this ability to push
   blobs may be fine as long as it will not interfer with the stack so
   we can provide an alternative to DPDK in Linux. For tracing, that's
   fine too since it is innocuous. And likely for other applications is
   a good fit. But I don't think this is the case.

> After chatting with Daniel Borkmann and Alexei off-list, we concluded
> that __dev_queue_xmit() is the place where the egress hooks should live
> when eBPF programs need access to the L2 bits of the skb.

3) This egress hook is coming very late, the only reason I find to
   place it at __dev_queue_xmit() is that bpf naturally works with
   layer 2 information in place. But this new hook is placed in
   _everyone's output ath_ that only works for the very specific
   usecase I exposed above.

The main concern during the workshop was that a hook only for cgroups
is too specific, but this is actually even more specific than this.

I have nothing against systemd or the needs for more
programmability/flexibility in the stack, but I think this needs to
fulfill some requirements to fit into the infrastructure that we have
in the right way.

^ permalink raw reply

* Re: [PATCH 2/9] selftests: update filesystems Makefile to work under selftests
From: Michael Ellerman @ 2016-09-13 11:56 UTC (permalink / raw)
  To: Shuah Khan, corbet, richardcochran
  Cc: Shuah Khan, linux-doc, linux-kernel, netdev, linux-kselftest
In-Reply-To: <239195aa67e72b7eb971cdacfc9e338de912524e.1473458697.git.shuahkh@osg.samsung.com>

Shuah Khan <shuahkh@osg.samsung.com> writes:

> Update to work under selftests. dnotify_test will not be run as part of
> selftests suite and will not included in install targets. It can be built
> separately for now.
>
> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
> ---
>  tools/testing/selftests/filesystems/Makefile | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/filesystems/Makefile b/tools/testing/selftests/filesystems/Makefile
> index 883010c..f1dce5c 100644
> --- a/tools/testing/selftests/filesystems/Makefile
> +++ b/tools/testing/selftests/filesystems/Makefile
> @@ -1,5 +1,7 @@
> -# List of programs to build
> -hostprogs-y := dnotify_test
> +TEST_PROGS := dnotify_test
> +all: $(TEST_PROGS)
>  
> -# Tell kbuild to always build the programs
> -always := $(hostprogs-y)
> +include ../lib.mk
> +
> +clean:
> +	rm -fr dnotify_test

That's a complete rewrite of the Makefile, so I don't think there's any
value in bringing its content across from Documentation.

Better IMHO would be to squash this with the previous patch, so we get a
working test under selftests in a single commit.

cheers

^ permalink raw reply

* [PATCH net-next v2 5/5] cxgb4: add support for drop and redirect actions
From: Rahul Lakkireddy @ 2016-09-13 11:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, hariprasad, leedom, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1473759872.git.rahul.lakkireddy@chelsio.com>

Add support for dropping matched packets in hardware.  Also add support
for re-directing matched packets to a specified port in hardware.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c | 71 +++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
index b9fb0af..297f4f3 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
@@ -32,6 +32,9 @@
  * SOFTWARE.
  */
 
+#include <net/tc_act/tc_gact.h>
+#include <net/tc_act/tc_mirred.h>
+
 #include "cxgb4.h"
 #include "cxgb4_tc_u32_parse.h"
 #include "cxgb4_tc_u32.h"
@@ -82,6 +85,67 @@ static int fill_match_fields(struct adapter *adap,
 	return 0;
 }
 
+/* Fill ch_filter_specification with parsed action. */
+static int fill_action_fields(struct adapter *adap,
+			      struct ch_filter_specification *fs,
+			      struct tc_cls_u32_offload *cls)
+{
+	const struct tc_action *a;
+	struct tcf_exts *exts;
+	LIST_HEAD(actions);
+	unsigned int num_actions = 0;
+
+	exts = cls->knode.exts;
+	if (tc_no_actions(exts))
+		return -EINVAL;
+
+	tcf_exts_to_list(exts, &actions);
+	list_for_each_entry(a, &actions, list) {
+		/* Don't allow more than one action per rule. */
+		if (num_actions)
+			return -EINVAL;
+
+		/* Drop in hardware. */
+		if (is_tcf_gact_shot(a)) {
+			fs->action = FILTER_DROP;
+			num_actions++;
+			continue;
+		}
+
+		/* Re-direct to specified port in hardware. */
+		if (is_tcf_mirred_redirect(a)) {
+			struct net_device *n_dev;
+			unsigned int i, index;
+			bool found = false;
+
+			index = tcf_mirred_ifindex(a);
+			for_each_port(adap, i) {
+				n_dev = adap->port[i];
+				if (index == n_dev->ifindex) {
+					fs->action = FILTER_SWITCH;
+					fs->eport = i;
+					found = true;
+					break;
+				}
+			}
+
+			/* Interface doesn't belong to any port of
+			 * the underlying hardware.
+			 */
+			if (!found)
+				return -EINVAL;
+
+			num_actions++;
+			continue;
+		}
+
+		/* Un-supported action. */
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 int cxgb4_config_knode(struct net_device *dev, __be16 protocol,
 		       struct tc_cls_u32_offload *cls)
 {
@@ -234,6 +298,13 @@ int cxgb4_config_knode(struct net_device *dev, __be16 protocol,
 	if (ret)
 		goto out;
 
+	/* Fill ch_filter_specification action fields to be shipped to
+	 * hardware.
+	 */
+	ret = fill_action_fields(adapter, &fs, cls);
+	if (ret)
+		goto out;
+
 	/* The filter spec has been completely built from the info
 	 * provided from u32.  We now set some default fields in the
 	 * spec for sanity.
-- 
2.5.3

^ permalink raw reply related

* [PATCH net-next v2 4/5] cxgb4: add support for offloading u32 filters
From: Rahul Lakkireddy @ 2016-09-13 11:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, hariprasad, leedom, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1473759872.git.rahul.lakkireddy@chelsio.com>

Add support for offloading u32 filter onto hardware.  Links are stored
in a jump table to perform necessary jumps to match TCP/UDP header.
When inserting rules in the linked bucket, the TCP/UDP match fields
in the corresponding entry of the jump table are appended to the filter
rule before insertion.  If a link is deleted, then all corresponding
filters associated with the link are also deleted.  Also enable
hardware tc offload as a supported feature.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/Makefile        |   2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |   3 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    |  41 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c  | 414 +++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h  |  57 +++
 .../ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h    |  12 +
 6 files changed, 527 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h

diff --git a/drivers/net/ethernet/chelsio/cxgb4/Makefile b/drivers/net/ethernet/chelsio/cxgb4/Makefile
index da88981..c6b71f6 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/Makefile
+++ b/drivers/net/ethernet/chelsio/cxgb4/Makefile
@@ -4,7 +4,7 @@
 
 obj-$(CONFIG_CHELSIO_T4) += cxgb4.o
 
-cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o cxgb4_filter.o
+cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o cxgb4_filter.o cxgb4_tc_u32.o
 cxgb4-$(CONFIG_CHELSIO_T4_DCB) +=  cxgb4_dcb.o
 cxgb4-$(CONFIG_CHELSIO_T4_FCOE) +=  cxgb4_fcoe.o
 cxgb4-$(CONFIG_DEBUG_FS) += cxgb4_debugfs.o
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 7c256a2..a969208 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -867,6 +867,9 @@ struct adapter {
 
 	spinlock_t stats_lock;
 	spinlock_t win0_lock ____cacheline_aligned_in_smp;
+
+	/* TC u32 offload */
+	struct cxgb4_tc_u32_table *tc_u32;
 };
 
 /* Support for "sched-class" command to allow a TX Scheduling Class to be
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 8909f96..a22dab9 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -78,6 +78,7 @@
 #include "clip_tbl.h"
 #include "l2t.h"
 #include "sched.h"
+#include "cxgb4_tc_u32.h"
 
 char cxgb4_driver_name[] = KBUILD_MODNAME;
 
@@ -3027,6 +3028,35 @@ static int cxgb_set_tx_maxrate(struct net_device *dev, int index, u32 rate)
 	return err;
 }
 
+int cxgb_setup_tc(struct net_device *dev, u32 handle, __be16 proto,
+		  struct tc_to_netdev *tc)
+{
+	struct adapter *adap = netdev2adap(dev);
+	struct port_info *pi = netdev2pinfo(dev);
+
+	if (!(adap->flags & FULL_INIT_DONE)) {
+		dev_err(adap->pdev_dev,
+			"Failed to setup tc on port %d. Link Down?\n",
+			pi->port_id);
+		return -EINVAL;
+	}
+
+	if (TC_H_MAJ(handle) == TC_H_MAJ(TC_H_INGRESS) &&
+	    tc->type == TC_SETUP_CLSU32) {
+		switch (tc->cls_u32->command) {
+		case TC_CLSU32_NEW_KNODE:
+		case TC_CLSU32_REPLACE_KNODE:
+			return cxgb4_config_knode(dev, proto, tc->cls_u32);
+		case TC_CLSU32_DELETE_KNODE:
+			return cxgb4_delete_knode(dev, proto, tc->cls_u32);
+		default:
+			return -EOPNOTSUPP;
+		}
+	}
+
+	return -EOPNOTSUPP;
+}
+
 static const struct net_device_ops cxgb4_netdev_ops = {
 	.ndo_open             = cxgb_open,
 	.ndo_stop             = cxgb_close,
@@ -3050,6 +3080,7 @@ static const struct net_device_ops cxgb4_netdev_ops = {
 	.ndo_busy_poll        = cxgb_busy_poll,
 #endif
 	.ndo_set_tx_maxrate   = cxgb_set_tx_maxrate,
+	.ndo_setup_tc         = cxgb_setup_tc,
 };
 
 #ifdef CONFIG_PCI_IOV
@@ -4781,6 +4812,7 @@ static void free_some_resources(struct adapter *adapter)
 	t4_free_mem(adapter->l2t);
 	t4_cleanup_sched(adapter);
 	t4_free_mem(adapter->tids.tid_tab);
+	cxgb4_cleanup_tc_u32(adapter);
 	kfree(adapter->sge.egr_map);
 	kfree(adapter->sge.ingr_map);
 	kfree(adapter->sge.starving_fl);
@@ -5125,7 +5157,8 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		netdev->hw_features = NETIF_F_SG | TSO_FLAGS |
 			NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 			NETIF_F_RXCSUM | NETIF_F_RXHASH |
-			NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
+			NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
+			NETIF_F_HW_TC;
 		if (highdma)
 			netdev->hw_features |= NETIF_F_HIGHDMA;
 		netdev->features |= netdev->hw_features;
@@ -5213,6 +5246,12 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		dev_warn(&pdev->dev, "could not allocate TID table, "
 			 "continuing\n");
 		adapter->params.offload = 0;
+	} else {
+		adapter->tc_u32 = cxgb4_init_tc_u32(adapter,
+						    CXGB4_MAX_LINK_HANDLE);
+		if (!adapter->tc_u32)
+			dev_warn(&pdev->dev,
+				 "could not offload tc u32, continuing\n");
 	}
 
 	if (is_offload(adapter)) {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
new file mode 100644
index 0000000..b9fb0af
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
@@ -0,0 +1,414 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "cxgb4.h"
+#include "cxgb4_tc_u32_parse.h"
+#include "cxgb4_tc_u32.h"
+
+/* Fill ch_filter_specification with parsed match value/mask pair. */
+static int fill_match_fields(struct adapter *adap,
+			     struct ch_filter_specification *fs,
+			     struct tc_cls_u32_offload *cls,
+			     const struct cxgb4_match_field *entry,
+			     bool next_header)
+{
+	unsigned int i, j;
+	u32 val, mask;
+	int off, err;
+	bool found = false;
+
+	for (i = 0; i < cls->knode.sel->nkeys; i++) {
+		off = cls->knode.sel->keys[i].off;
+		val = cls->knode.sel->keys[i].val;
+		mask = cls->knode.sel->keys[i].mask;
+
+		if (next_header) {
+			/* For next headers, parse only keys with offmask */
+			if (!cls->knode.sel->keys[i].offmask)
+				continue;
+		} else {
+			/* For the remaining, parse only keys without offmask */
+			if (cls->knode.sel->keys[i].offmask)
+				continue;
+		}
+
+		found = false;
+
+		for (j = 0; entry[j].val; j++) {
+			if (off == entry[j].off) {
+				found = true;
+				err = entry[j].val(fs, val, mask);
+				if (err)
+					return err;
+				break;
+			}
+		}
+
+		if (!found)
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+int cxgb4_config_knode(struct net_device *dev, __be16 protocol,
+		       struct tc_cls_u32_offload *cls)
+{
+	struct adapter *adapter = netdev2adap(dev);
+	struct cxgb4_tc_u32_table *t;
+	const struct cxgb4_match_field *start, *link_start = NULL;
+	struct cxgb4_link *link;
+	struct ch_filter_specification fs;
+	unsigned int filter_id;
+	u32 uhtid, link_uhtid;
+	int ret;
+	bool is_ipv6 = false;
+
+	if (!can_tc_u32_offload(dev))
+		return -EOPNOTSUPP;
+
+	if (protocol != htons(ETH_P_IP) && protocol != htons(ETH_P_IPV6))
+		return -EOPNOTSUPP;
+
+	/* Fetch the location to insert the filter. */
+	filter_id = cls->knode.handle & 0xFFFFF;
+
+	if (filter_id > adapter->tids.nftids) {
+		dev_err(adapter->pdev_dev,
+			"Location %d out of range for insertion. Max: %d\n",
+			filter_id, adapter->tids.nftids);
+		return -ERANGE;
+	}
+
+	t = adapter->tc_u32;
+	uhtid = TC_U32_USERHTID(cls->knode.handle);
+	link_uhtid = TC_U32_USERHTID(cls->knode.link_handle);
+
+	/* Ensure that uhtid is either root u32 (i.e. 0x800)
+	 * or a a valid linked bucket.
+	 */
+	if (uhtid != 0x800 && uhtid >= t->size)
+		return -EINVAL;
+
+	/* Ensure link handle uhtid is sane, if specified. */
+	if (link_uhtid >= t->size)
+		return -EINVAL;
+
+	memset(&fs, 0, sizeof(fs));
+
+	if (protocol == htons(ETH_P_IPV6)) {
+		start = cxgb4_ipv6_fields;
+		is_ipv6 = true;
+	} else {
+		start = cxgb4_ipv4_fields;
+		is_ipv6 = false;
+	}
+
+	if (uhtid != 0x800) {
+		/* Link must exist from root node before insertion. */
+		if (!t->table[uhtid - 1].link_handle)
+			return -EINVAL;
+
+		/* Link must have a valid supported next header. */
+		link_start = t->table[uhtid - 1].match_field;
+		if (!link_start)
+			return -EINVAL;
+	}
+
+	/* Parse links and record them for subsequent jumps to valid
+	 * next headers.
+	 */
+	if (link_uhtid) {
+		const struct cxgb4_next_header *next;
+		unsigned int i, j;
+		int off;
+		u32 val, mask;
+		bool found = false;
+
+		if (t->table[link_uhtid - 1].link_handle) {
+			dev_err(adapter->pdev_dev,
+				"Link handle exists for: 0x%x\n",
+				link_uhtid);
+			return -EINVAL;
+		}
+
+		next = is_ipv6 ? cxgb4_ipv6_jumps : cxgb4_ipv4_jumps;
+
+		/* Try to find matches that allow jumps to next header. */
+		for (i = 0; next[i].jump; i++) {
+			if (next[i].offoff != cls->knode.sel->offoff ||
+			    next[i].shift != cls->knode.sel->offshift ||
+			    next[i].mask != cls->knode.sel->offmask ||
+			    next[i].offset != cls->knode.sel->off)
+				continue;
+
+			/* Found a possible candidate.  Find a key that
+			 * matches the corresponding offset, value, and
+			 * mask to jump to next header.
+			 */
+			for (j = 0; j < cls->knode.sel->nkeys; j++) {
+				off = cls->knode.sel->keys[j].off;
+				val = cls->knode.sel->keys[j].val;
+				mask = cls->knode.sel->keys[j].mask;
+
+				if (next[i].match_off == off &&
+				    next[i].match_val == val &&
+				    next[i].match_mask == mask) {
+					found = true;
+					break;
+				}
+			}
+
+			if (!found)
+				continue; /* Try next candidate. */
+
+			/* Candidate to jump to next header found.
+			 * Translate all keys to internal specification
+			 * and store them in jump table. This spec is copied
+			 * later to set the actual filters.
+			 */
+			ret = fill_match_fields(adapter, &fs, cls,
+						start, false);
+			if (ret)
+				goto out;
+
+			link = &t->table[link_uhtid - 1];
+			link->match_field = next[i].jump;
+			link->link_handle = cls->knode.handle;
+			memcpy(&link->fs, &fs, sizeof(fs));
+			break;
+		}
+
+		/* No candidate found to jump to next header. */
+		if (!found)
+			return -EINVAL;
+
+		return 0;
+	}
+
+	/* Fill ch_filter_specification match fields to be shipped to hardware.
+	 * Copy the linked spec (if any) first.  And then update the spec as
+	 * needed.
+	 */
+	if (uhtid != 0x800 && t->table[uhtid - 1].link_handle) {
+		/* Copy linked ch_filter_specification */
+		memcpy(&fs, &t->table[uhtid - 1].fs, sizeof(fs));
+		ret = fill_match_fields(adapter, &fs, cls,
+					link_start, true);
+		if (ret)
+			goto out;
+	}
+
+	ret = fill_match_fields(adapter, &fs, cls, start, false);
+	if (ret)
+		goto out;
+
+	/* The filter spec has been completely built from the info
+	 * provided from u32.  We now set some default fields in the
+	 * spec for sanity.
+	 */
+
+	/* Match only packets coming from the ingress port where this
+	 * filter will be created.
+	 */
+	fs.val.iport = netdev2pinfo(dev)->port_id;
+	fs.mask.iport = ~0;
+
+	/* Enable filter hit counts. */
+	fs.hitcnts = 1;
+
+	/* Set type of filter - IPv6 or IPv4 */
+	fs.type = is_ipv6 ? 1 : 0;
+
+	/* Set the filter */
+	ret = cxgb4_set_filter(dev, filter_id, &fs);
+	if (ret)
+		goto out;
+
+	/* If this is a linked bucket, then set the corresponding
+	 * entry in the bitmap to mark it as belonging to this linked
+	 * bucket.
+	 */
+	if (uhtid != 0x800 && t->table[uhtid - 1].link_handle)
+		set_bit(filter_id, t->table[uhtid - 1].tid_map);
+
+out:
+	return ret;
+}
+
+int cxgb4_delete_knode(struct net_device *dev, __be16 protocol,
+		       struct tc_cls_u32_offload *cls)
+{
+	struct adapter *adapter = netdev2adap(dev);
+	struct cxgb4_tc_u32_table *t;
+	struct cxgb4_link *link = NULL;
+	u32 handle, uhtid;
+	unsigned int filter_id;
+	unsigned int max_tids;
+	unsigned int i, j;
+	int ret;
+
+	if (!can_tc_u32_offload(dev))
+		return -EOPNOTSUPP;
+
+	/* Fetch the location to delete the filter. */
+	filter_id = cls->knode.handle & 0xFFFFF;
+
+	if (filter_id > adapter->tids.nftids) {
+		dev_err(adapter->pdev_dev,
+			"Location %d out of range for deletion. Max: %d\n",
+			filter_id, adapter->tids.nftids);
+		return -ERANGE;
+	}
+
+	t = adapter->tc_u32;
+	handle = cls->knode.handle;
+	uhtid = TC_U32_USERHTID(cls->knode.handle);
+
+	/* Ensure that uhtid is either root u32 (i.e. 0x800)
+	 * or a a valid linked bucket.
+	 */
+	if (uhtid != 0x800 && uhtid >= t->size)
+		return -EINVAL;
+
+	/* Delete the specified filter */
+	if (uhtid != 0x800) {
+		link = &t->table[uhtid - 1];
+		if (!link->link_handle)
+			return -EINVAL;
+
+		if (!test_bit(filter_id, link->tid_map))
+			return -EINVAL;
+	}
+
+	ret = cxgb4_del_filter(dev, filter_id);
+	if (ret)
+		goto out;
+
+	if (link)
+		clear_bit(filter_id, link->tid_map);
+
+	/* If a link is being deleted, then delete all filters
+	 * associated with the link.
+	 */
+	max_tids = adapter->tids.nftids;
+	for (i = 0; i < t->size; i++) {
+		link = &t->table[i];
+
+		if (link->link_handle == handle) {
+			for (j = 0; j < max_tids; j++) {
+				if (!test_bit(j, link->tid_map))
+					continue;
+
+				ret = __cxgb4_del_filter(dev, j, NULL);
+				if (ret)
+					goto out;
+
+				clear_bit(j, link->tid_map);
+			}
+
+			/* Clear the link state */
+			link->match_field = NULL;
+			link->link_handle = 0;
+			memset(&link->fs, 0, sizeof(link->fs));
+			break;
+		}
+	}
+
+out:
+	return ret;
+}
+
+void cxgb4_cleanup_tc_u32(struct adapter *adap)
+{
+	struct cxgb4_tc_u32_table *t;
+	unsigned int i;
+
+	if (!adap->tc_u32)
+		return;
+
+	/* Free up all allocated memory. */
+	t = adap->tc_u32;
+	for (i = 0; i < t->size; i++) {
+		struct cxgb4_link *link = &t->table[i];
+
+		t4_free_mem(link->tid_map);
+	}
+	t4_free_mem(adap->tc_u32);
+}
+
+struct cxgb4_tc_u32_table *cxgb4_init_tc_u32(struct adapter *adap,
+					     unsigned int size)
+{
+	struct cxgb4_tc_u32_table *t;
+	unsigned int i;
+
+	if (!size)
+		return NULL;
+
+	t = t4_alloc_mem(sizeof(*t) +
+			 (size * sizeof(struct cxgb4_link)));
+	if (!t)
+		return NULL;
+
+	t->size = size;
+
+	for (i = 0; i < t->size; i++) {
+		struct cxgb4_link *link = &t->table[i];
+		unsigned int bmap_size;
+		unsigned int max_tids;
+
+		max_tids = adap->tids.nftids;
+		bmap_size = BITS_TO_LONGS(max_tids);
+		link->tid_map = t4_alloc_mem(sizeof(unsigned long) * bmap_size);
+		if (!link->tid_map)
+			goto out_no_mem;
+		bitmap_zero(link->tid_map, max_tids);
+	}
+
+	return t;
+
+out_no_mem:
+	for (i = 0; i < t->size; i++) {
+		struct cxgb4_link *link = &t->table[i];
+
+		if (link->tid_map)
+			t4_free_mem(link->tid_map);
+	}
+
+	if (t)
+		t4_free_mem(t);
+
+	return NULL;
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h
new file mode 100644
index 0000000..6bdc885
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h
@@ -0,0 +1,57 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_TC_U32_H
+#define __CXGB4_TC_U32_H
+
+#include <net/pkt_cls.h>
+
+#define CXGB4_MAX_LINK_HANDLE 32
+
+static inline bool can_tc_u32_offload(struct net_device *dev)
+{
+	struct adapter *adap = netdev2adap(dev);
+
+	return (dev->features & NETIF_F_HW_TC) && adap->tc_u32 ? true : false;
+}
+
+int cxgb4_config_knode(struct net_device *dev, __be16 protocol,
+		       struct tc_cls_u32_offload *cls);
+int cxgb4_delete_knode(struct net_device *dev, __be16 protocol,
+		       struct tc_cls_u32_offload *cls);
+
+void cxgb4_cleanup_tc_u32(struct adapter *adapter);
+struct cxgb4_tc_u32_table *cxgb4_init_tc_u32(struct adapter *adap,
+					     unsigned int size);
+#endif /* __CXGB4_TC_U32_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h
index 261aa4a..de321bf 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h
@@ -279,4 +279,16 @@ static const struct cxgb4_next_header cxgb4_ipv6_jumps[] = {
 	  .jump = cxgb4_udp_fields },
 	{ .jump = NULL }
 };
+
+struct cxgb4_link {
+	const struct cxgb4_match_field *match_field;  /* Next header */
+	struct ch_filter_specification fs; /* Match spec associated with link */
+	u32 link_handle;         /* Knode handle associated with the link */
+	unsigned long *tid_map;  /* Bitmap for filter tids */
+};
+
+struct cxgb4_tc_u32_table {
+	unsigned int size;          /* number of entries in table */
+	struct cxgb4_link table[0]; /* Jump table */
+};
 #endif /* __CXGB4_TC_U32_PARSE_H */
-- 
2.5.3

^ permalink raw reply related

* [PATCH net-next v2 3/5] cxgb4: add parser to translate u32 filters to internal spec
From: Rahul Lakkireddy @ 2016-09-13 11:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, hariprasad, leedom, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1473759872.git.rahul.lakkireddy@chelsio.com>

Parse information sent by u32 into internal filter specification.
Add support for parsing several fields in IPv4, IPv6, TCP, and UDP.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 .../ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h    | 282 +++++++++++++++++++++
 1 file changed, 282 insertions(+)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h
new file mode 100644
index 0000000..261aa4a
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h
@@ -0,0 +1,282 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_TC_U32_PARSE_H
+#define __CXGB4_TC_U32_PARSE_H
+
+struct cxgb4_match_field {
+	int off; /* Offset from the beginning of the header to match */
+	/* Fill the value/mask pair in the spec if matched */
+	int (*val)(struct ch_filter_specification *f, u32 val, u32 mask);
+};
+
+/* IPv4 match fields */
+static inline int cxgb4_fill_ipv4_tos(struct ch_filter_specification *f,
+				      u32 val, u32 mask)
+{
+	f->val.tos  = (ntohl(val)  >> 16) & 0x000000FF;
+	f->mask.tos = (ntohl(mask) >> 16) & 0x000000FF;
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv4_frag(struct ch_filter_specification *f,
+				       u32 val, u32 mask)
+{
+	u8 frag_val;
+	u32 mask_val;
+
+	frag_val = (ntohl(val) >> 13) & 0x00000007;
+	mask_val = ntohl(mask) & 0x0000FFFF;
+
+	if (frag_val == 0x1 && mask_val != 0x3FFF) { /* MF set */
+		f->val.frag = 1;
+		f->mask.frag = 1;
+	} else if (frag_val == 0x2 && mask_val != 0x3FFF) { /* DF set */
+		f->val.frag = 0;
+		f->mask.frag = 1;
+	} else {
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv4_proto(struct ch_filter_specification *f,
+					u32 val, u32 mask)
+{
+	f->val.proto  = (ntohl(val)  >> 16) & 0x000000FF;
+	f->mask.proto = (ntohl(mask) >> 16) & 0x000000FF;
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv4_src_ip(struct ch_filter_specification *f,
+					 u32 val, u32 mask)
+{
+	memcpy(&f->val.fip[0],  &val,  sizeof(u32));
+	memcpy(&f->mask.fip[0], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv4_dst_ip(struct ch_filter_specification *f,
+					 u32 val, u32 mask)
+{
+	memcpy(&f->val.lip[0],  &val,  sizeof(u32));
+	memcpy(&f->mask.lip[0], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static const struct cxgb4_match_field cxgb4_ipv4_fields[] = {
+	{ .off = 0,  .val = cxgb4_fill_ipv4_tos },
+	{ .off = 4,  .val = cxgb4_fill_ipv4_frag },
+	{ .off = 8,  .val = cxgb4_fill_ipv4_proto },
+	{ .off = 12, .val = cxgb4_fill_ipv4_src_ip },
+	{ .off = 16, .val = cxgb4_fill_ipv4_dst_ip },
+	{ .val = NULL }
+};
+
+/* IPv6 match fields */
+static inline int cxgb4_fill_ipv6_tos(struct ch_filter_specification *f,
+				      u32 val, u32 mask)
+{
+	f->val.tos  = (ntohl(val)  >> 20) & 0x000000FF;
+	f->mask.tos = (ntohl(mask) >> 20) & 0x000000FF;
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_proto(struct ch_filter_specification *f,
+					u32 val, u32 mask)
+{
+	f->val.proto  = (ntohl(val)  >> 8) & 0x000000FF;
+	f->mask.proto = (ntohl(mask) >> 8) & 0x000000FF;
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip0(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.fip[0],  &val,  sizeof(u32));
+	memcpy(&f->mask.fip[0], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip1(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.fip[4],  &val,  sizeof(u32));
+	memcpy(&f->mask.fip[4], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip2(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.fip[8],  &val,  sizeof(u32));
+	memcpy(&f->mask.fip[8], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_src_ip3(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.fip[12],  &val,  sizeof(u32));
+	memcpy(&f->mask.fip[12], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip0(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.lip[0],  &val,  sizeof(u32));
+	memcpy(&f->mask.lip[0], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip1(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.lip[4],  &val,  sizeof(u32));
+	memcpy(&f->mask.lip[4], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip2(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.lip[8],  &val,  sizeof(u32));
+	memcpy(&f->mask.lip[8], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static inline int cxgb4_fill_ipv6_dst_ip3(struct ch_filter_specification *f,
+					  u32 val, u32 mask)
+{
+	memcpy(&f->val.lip[12],  &val,  sizeof(u32));
+	memcpy(&f->mask.lip[12], &mask, sizeof(u32));
+
+	return 0;
+}
+
+static const struct cxgb4_match_field cxgb4_ipv6_fields[] = {
+	{ .off = 0,  .val = cxgb4_fill_ipv6_tos },
+	{ .off = 4,  .val = cxgb4_fill_ipv6_proto },
+	{ .off = 8,  .val = cxgb4_fill_ipv6_src_ip0 },
+	{ .off = 12, .val = cxgb4_fill_ipv6_src_ip1 },
+	{ .off = 16, .val = cxgb4_fill_ipv6_src_ip2 },
+	{ .off = 20, .val = cxgb4_fill_ipv6_src_ip3 },
+	{ .off = 24, .val = cxgb4_fill_ipv6_dst_ip0 },
+	{ .off = 28, .val = cxgb4_fill_ipv6_dst_ip1 },
+	{ .off = 32, .val = cxgb4_fill_ipv6_dst_ip2 },
+	{ .off = 36, .val = cxgb4_fill_ipv6_dst_ip3 },
+	{ .val = NULL }
+};
+
+/* TCP/UDP match */
+static inline int cxgb4_fill_l4_ports(struct ch_filter_specification *f,
+				      u32 val, u32 mask)
+{
+	f->val.fport  = ntohl(val)  >> 16;
+	f->mask.fport = ntohl(mask) >> 16;
+	f->val.lport  = ntohl(val)  & 0x0000FFFF;
+	f->mask.lport = ntohl(mask) & 0x0000FFFF;
+
+	return 0;
+};
+
+static const struct cxgb4_match_field cxgb4_tcp_fields[] = {
+	{ .off = 0, .val = cxgb4_fill_l4_ports },
+	{ .val = NULL }
+};
+
+static const struct cxgb4_match_field cxgb4_udp_fields[] = {
+	{ .off = 0, .val = cxgb4_fill_l4_ports },
+	{ .val = NULL }
+};
+
+struct cxgb4_next_header {
+	unsigned int offset; /* Offset to next header */
+	/* offset, shift, and mask added to offset above
+	 * to get to next header.  Useful when using a header
+	 * field's value to jump to next header such as IHL field
+	 * in IPv4 header.
+	 */
+	unsigned int offoff;
+	u32 shift;
+	u32 mask;
+	/* match criteria to make this jump */
+	unsigned int match_off;
+	u32 match_val;
+	u32 match_mask;
+	/* location of jump to make */
+	const struct cxgb4_match_field *jump;
+};
+
+/* Accept a rule with a jump to transport layer header based on IHL field in
+ * IPv4 header.
+ */
+static const struct cxgb4_next_header cxgb4_ipv4_jumps[] = {
+	{ .offset = 0, .offoff = 0, .shift = 6, .mask = 0xF,
+	  .match_off = 8, .match_val = 0x600, .match_mask = 0xFF00,
+	  .jump = cxgb4_tcp_fields },
+	{ .offset = 0, .offoff = 0, .shift = 6, .mask = 0xF,
+	  .match_off = 8, .match_val = 0x1100, .match_mask = 0xFF00,
+	  .jump = cxgb4_udp_fields },
+	{ .jump = NULL }
+};
+
+/* Accept a rule with a jump directly past the 40 Bytes of IPv6 fixed header
+ * to get to transport layer header.
+ */
+static const struct cxgb4_next_header cxgb4_ipv6_jumps[] = {
+	{ .offset = 0x28, .offoff = 0, .shift = 0, .mask = 0,
+	  .match_off = 4, .match_val = 0x60000, .match_mask = 0xFF0000,
+	  .jump = cxgb4_tcp_fields },
+	{ .offset = 0x28, .offoff = 0, .shift = 0, .mask = 0,
+	  .match_off = 4, .match_val = 0x110000, .match_mask = 0xFF0000,
+	  .jump = cxgb4_udp_fields },
+	{ .jump = NULL }
+};
+#endif /* __CXGB4_TC_U32_PARSE_H */
-- 
2.5.3

^ permalink raw reply related

* [PATCH net-next v2 2/5] cxgb4: add common api support for configuring filters
From: Rahul Lakkireddy @ 2016-09-13 11:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, hariprasad, leedom, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1473759872.git.rahul.lakkireddy@chelsio.com>

Enable filters for non-offload configuration and add common api support
for setting and deleting filters in LE-TCAM region of the hardware.

IPv4 filters occupy one slot.  IPv6 filters occupy 4 slots and must
be on a 4-slot boundary.  IPv4 filters can not occupy a slot belonging
to IPv6 and the vice-versa is also true.

Filters are set and deleted asynchronously.  Use completion to wait
for reply from firmware in order to allow for synchronization if needed.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h        |   3 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 478 +++++++++++++++++++++-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h |   1 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  33 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h    |  26 +-
 5 files changed, 510 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 275c4f0..7c256a2 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -1054,7 +1054,10 @@ struct filter_entry {
 
 	u32 pending:1;          /* filter action is pending firmware reply */
 	u32 smtidx:8;           /* Source MAC Table index for smac */
+	struct filter_ctx *ctx; /* Caller's completion hook */
 	struct l2t_entry *l2t;  /* Layer Two Table entry for dmac */
+	struct net_device *dev; /* Associated net device */
+	u32 tid;                /* This will store the actual tid */
 
 	/* The filter itself.  Most of this is a straight copy of information
 	 * provided by the extended ioctl().  Some fields are translated to
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index e224bf5..53a47ad 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -33,27 +33,165 @@
  */
 
 #include "cxgb4.h"
+#include "t4_regs.h"
 #include "l2t.h"
 #include "t4fw_api.h"
 #include "cxgb4_filter.h"
 
+static inline bool is_field_set(u32 val, u32 mask)
+{
+	return val || mask;
+}
+
+static inline bool unsupported(u32 conf, u32 conf_mask, u32 val, u32 mask)
+{
+	return !(conf & conf_mask) && is_field_set(val, mask);
+}
+
+/* Validate filter spec against configuration done on the card. */
+static int validate_filter(struct net_device *dev,
+			   struct ch_filter_specification *fs)
+{
+	struct adapter *adapter = netdev2adap(dev);
+	u32 fconf, iconf;
+
+	/* Check for unconfigured fields being used. */
+	fconf = adapter->params.tp.vlan_pri_map;
+	iconf = adapter->params.tp.ingress_config;
+
+	if (unsupported(fconf, FCOE_F, fs->val.fcoe, fs->mask.fcoe) ||
+	    unsupported(fconf, PORT_F, fs->val.iport, fs->mask.iport) ||
+	    unsupported(fconf, TOS_F, fs->val.tos, fs->mask.tos) ||
+	    unsupported(fconf, ETHERTYPE_F, fs->val.ethtype,
+			fs->mask.ethtype) ||
+	    unsupported(fconf, MACMATCH_F, fs->val.macidx, fs->mask.macidx) ||
+	    unsupported(fconf, MPSHITTYPE_F, fs->val.matchtype,
+			fs->mask.matchtype) ||
+	    unsupported(fconf, FRAGMENTATION_F, fs->val.frag, fs->mask.frag) ||
+	    unsupported(fconf, PROTOCOL_F, fs->val.proto, fs->mask.proto) ||
+	    unsupported(fconf, VNIC_ID_F, fs->val.pfvf_vld,
+			fs->mask.pfvf_vld) ||
+	    unsupported(fconf, VNIC_ID_F, fs->val.ovlan_vld,
+			fs->mask.ovlan_vld) ||
+	    unsupported(fconf, VLAN_F, fs->val.ivlan_vld, fs->mask.ivlan_vld))
+		return -EOPNOTSUPP;
+
+	/* T4 inconveniently uses the same FT_VNIC_ID_W bits for both the Outer
+	 * VLAN Tag and PF/VF/VFvld fields based on VNIC_F being set
+	 * in TP_INGRESS_CONFIG.  Hense the somewhat crazy checks
+	 * below.  Additionally, since the T4 firmware interface also
+	 * carries that overlap, we need to translate any PF/VF
+	 * specification into that internal format below.
+	 */
+	if (is_field_set(fs->val.pfvf_vld, fs->mask.pfvf_vld) &&
+	    is_field_set(fs->val.ovlan_vld, fs->mask.ovlan_vld))
+		return -EOPNOTSUPP;
+	if (unsupported(iconf, VNIC_F, fs->val.pfvf_vld, fs->mask.pfvf_vld) ||
+	    (is_field_set(fs->val.ovlan_vld, fs->mask.ovlan_vld) &&
+	     (iconf & VNIC_F)))
+		return -EOPNOTSUPP;
+	if (fs->val.pf > 0x7 || fs->val.vf > 0x7f)
+		return -ERANGE;
+	fs->mask.pf &= 0x7;
+	fs->mask.vf &= 0x7f;
+
+	/* If the user is requesting that the filter action loop
+	 * matching packets back out one of our ports, make sure that
+	 * the egress port is in range.
+	 */
+	if (fs->action == FILTER_SWITCH &&
+	    fs->eport >= adapter->params.nports)
+		return -ERANGE;
+
+	/* Don't allow various trivially obvious bogus out-of-range values... */
+	if (fs->val.iport >= adapter->params.nports)
+		return -ERANGE;
+
+	/* T4 doesn't support removing VLAN Tags for loop back filters. */
+	if (is_t4(adapter->params.chip) &&
+	    fs->action == FILTER_SWITCH &&
+	    (fs->newvlan == VLAN_REMOVE ||
+	     fs->newvlan == VLAN_REWRITE))
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+static unsigned int get_filter_steerq(struct net_device *dev,
+				      struct ch_filter_specification *fs)
+{
+	struct adapter *adapter = netdev2adap(dev);
+	unsigned int iq;
+
+	/* If the user has requested steering matching Ingress Packets
+	 * to a specific Queue Set, we need to make sure it's in range
+	 * for the port and map that into the Absolute Queue ID of the
+	 * Queue Set's Response Queue.
+	 */
+	if (!fs->dirsteer) {
+		if (fs->iq)
+			return -EINVAL;
+		iq = 0;
+	} else {
+		struct port_info *pi = netdev_priv(dev);
+
+		/* If the iq id is greater than the number of qsets,
+		 * then assume it is an absolute qid.
+		 */
+		if (fs->iq < pi->nqsets)
+			iq = adapter->sge.ethrxq[pi->first_qset +
+						 fs->iq].rspq.abs_id;
+		else
+			iq = fs->iq;
+	}
+
+	return iq;
+}
+
+static int cxgb4_set_ftid(struct tid_info *t, int fidx, int family)
+{
+	spin_lock_bh(&t->ftid_lock);
+
+	if (test_bit(fidx, t->ftid_bmap)) {
+		spin_unlock_bh(&t->ftid_lock);
+		return -EBUSY;
+	}
+
+	if (family == PF_INET)
+		__set_bit(fidx, t->ftid_bmap);
+	else
+		bitmap_allocate_region(t->ftid_bmap, fidx, 2);
+
+	spin_unlock_bh(&t->ftid_lock);
+	return 0;
+}
+
+static void cxgb4_clear_ftid(struct tid_info *t, int fidx, int family)
+{
+	spin_lock_bh(&t->ftid_lock);
+	if (family == PF_INET)
+		__clear_bit(fidx, t->ftid_bmap);
+	else
+		bitmap_release_region(t->ftid_bmap, fidx, 2);
+	spin_unlock_bh(&t->ftid_lock);
+}
+
 /* Delete the filter at a specified index. */
 static int del_filter_wr(struct adapter *adapter, int fidx)
 {
 	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
 	struct sk_buff *skb;
 	struct fw_filter_wr *fwr;
-	unsigned int len, ftid;
+	unsigned int len;
 
 	len = sizeof(*fwr);
-	ftid = adapter->tids.ftid_base + fidx;
 
 	skb = alloc_skb(len, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 
 	fwr = (struct fw_filter_wr *)__skb_put(skb, len);
-	t4_mk_filtdelwr(ftid, fwr, adapter->sge.fw_evtq.abs_id);
+	t4_mk_filtdelwr(f->tid, fwr, adapter->sge.fw_evtq.abs_id);
 
 	/* Mark the filter as "pending" and ship off the Filter Work Request.
 	 * When we get the Work Request Reply we'll clear the pending status.
@@ -74,7 +212,6 @@ int set_filter_wr(struct adapter *adapter, int fidx)
 	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
 	struct sk_buff *skb;
 	struct fw_filter_wr *fwr;
-	unsigned int ftid;
 
 	skb = alloc_skb(sizeof(*fwr), GFP_KERNEL);
 	if (!skb)
@@ -94,8 +231,6 @@ int set_filter_wr(struct adapter *adapter, int fidx)
 		}
 	}
 
-	ftid = adapter->tids.ftid_base + fidx;
-
 	fwr = (struct fw_filter_wr *)__skb_put(skb, sizeof(*fwr));
 	memset(fwr, 0, sizeof(*fwr));
 
@@ -110,7 +245,7 @@ int set_filter_wr(struct adapter *adapter, int fidx)
 	fwr->op_pkd = htonl(FW_WR_OP_V(FW_FILTER_WR));
 	fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr) / 16));
 	fwr->tid_to_iq =
-		htonl(FW_FILTER_WR_TID_V(ftid) |
+		htonl(FW_FILTER_WR_TID_V(f->tid) |
 		      FW_FILTER_WR_RQTYPE_V(f->fs.type) |
 		      FW_FILTER_WR_NOREPLY_V(0) |
 		      FW_FILTER_WR_IQ_V(f->fs.iq));
@@ -235,33 +370,342 @@ void clear_filter(struct adapter *adap, struct filter_entry *f)
 	memset(f, 0, sizeof(*f));
 }
 
+void clear_all_filters(struct adapter *adapter)
+{
+	unsigned int i;
+
+	if (adapter->tids.ftid_tab) {
+		struct filter_entry *f = &adapter->tids.ftid_tab[0];
+		unsigned int max_ftid = adapter->tids.nftids +
+					adapter->tids.nsftids;
+
+		for (i = 0; i < max_ftid; i++, f++)
+			if (f->valid || f->pending)
+				clear_filter(adapter, f);
+	}
+}
+
+/* Fill up default masks for set match fields. */
+static void fill_default_mask(struct ch_filter_specification *fs)
+{
+	unsigned int i;
+	unsigned int lip = 0, lip_mask = 0;
+	unsigned int fip = 0, fip_mask = 0;
+
+	if (fs->val.iport && !fs->mask.iport)
+		fs->mask.iport |= ~0;
+	if (fs->val.fcoe && !fs->mask.fcoe)
+		fs->mask.fcoe |= ~0;
+	if (fs->val.matchtype && !fs->mask.matchtype)
+		fs->mask.matchtype |= ~0;
+	if (fs->val.macidx && !fs->mask.macidx)
+		fs->mask.macidx |= ~0;
+	if (fs->val.ethtype && !fs->mask.ethtype)
+		fs->mask.ethtype |= ~0;
+	if (fs->val.ivlan && !fs->mask.ivlan)
+		fs->mask.ivlan |= ~0;
+	if (fs->val.ovlan && !fs->mask.ovlan)
+		fs->mask.ovlan |= ~0;
+	if (fs->val.frag && !fs->mask.frag)
+		fs->mask.frag |= ~0;
+	if (fs->val.tos && !fs->mask.tos)
+		fs->mask.tos |= ~0;
+	if (fs->val.proto && !fs->mask.proto)
+		fs->mask.proto |= ~0;
+
+	for (i = 0; i < ARRAY_SIZE(fs->val.lip); i++) {
+		lip |= fs->val.lip[i];
+		lip_mask |= fs->mask.lip[i];
+		fip |= fs->val.fip[i];
+		fip_mask |= fs->mask.fip[i];
+	}
+
+	if (lip && !lip_mask)
+		memset(fs->mask.lip, ~0, sizeof(fs->mask.lip));
+
+	if (fip && !fip_mask)
+		memset(fs->mask.fip, ~0, sizeof(fs->mask.lip));
+
+	if (fs->val.lport && !fs->mask.lport)
+		fs->mask.lport = ~0;
+	if (fs->val.fport && !fs->mask.fport)
+		fs->mask.fport = ~0;
+}
+
+/* Check a Chelsio Filter Request for validity, convert it into our internal
+ * format and send it to the hardware.  Return 0 on success, an error number
+ * otherwise.  We attach any provided filter operation context to the internal
+ * filter specification in order to facilitate signaling completion of the
+ * operation.
+ */
+int __cxgb4_set_filter(struct net_device *dev, int filter_id,
+		       struct ch_filter_specification *fs,
+		       struct filter_ctx *ctx)
+{
+	struct adapter *adapter = netdev2adap(dev);
+	struct filter_entry *f;
+	u32 iconf;
+	unsigned int fidx, iq;
+	unsigned int max_fidx;
+	int ret;
+
+	max_fidx = adapter->tids.nftids;
+	if (filter_id != (max_fidx + adapter->tids.nsftids - 1) &&
+	    filter_id >= max_fidx)
+		return -E2BIG;
+
+	fill_default_mask(fs);
+
+	ret = validate_filter(dev, fs);
+	if (ret)
+		return ret;
+
+	iq = get_filter_steerq(dev, fs);
+	if (iq < 0)
+		return iq;
+
+	/* IPv6 filters occupy four slots and must be aligned on
+	 * four-slot boundaries.  IPv4 filters only occupy a single
+	 * slot and have no alignment requirements but writing a new
+	 * IPv4 filter into the middle of an existing IPv6 filter
+	 * requires clearing the old IPv6 filter and hence we prevent
+	 * insertion.
+	 */
+	if (fs->type == 0) { /* IPv4 */
+		/* If our IPv4 filter isn't being written to a
+		 * multiple of four filter index and there's an IPv6
+		 * filter at the multiple of 4 base slot, then we
+		 * prevent insertion.
+		 */
+		fidx = filter_id & ~0x3;
+		if (fidx != filter_id &&
+		    adapter->tids.ftid_tab[fidx].fs.type) {
+			f = &adapter->tids.ftid_tab[fidx];
+			if (f->valid) {
+				dev_err(adapter->pdev_dev,
+					"Invalid location. IPv6 requires 4 slots and is occupying slots %u to %u\n",
+					fidx, fidx + 3);
+				return -EINVAL;
+			}
+		}
+	} else { /* IPv6 */
+		/* Ensure that the IPv6 filter is aligned on a
+		 * multiple of 4 boundary.
+		 */
+		if (filter_id & 0x3) {
+			dev_err(adapter->pdev_dev,
+				"Invalid location. IPv6 must be aligned on a 4-slot boundary\n");
+			return -EINVAL;
+		}
+
+		/* Check all except the base overlapping IPv4 filter slots. */
+		for (fidx = filter_id + 1; fidx < filter_id + 4; fidx++) {
+			f = &adapter->tids.ftid_tab[fidx];
+			if (f->valid) {
+				dev_err(adapter->pdev_dev,
+					"Invalid location.  IPv6 requires 4 slots and an IPv4 filter exists at %u\n",
+					fidx);
+				return -EINVAL;
+			}
+		}
+	}
+
+	/* Check to make sure that provided filter index is not
+	 * already in use by someone else
+	 */
+	f = &adapter->tids.ftid_tab[filter_id];
+	if (f->valid)
+		return -EBUSY;
+
+	fidx = filter_id + adapter->tids.ftid_base;
+	ret = cxgb4_set_ftid(&adapter->tids, filter_id,
+			     fs->type ? PF_INET6 : PF_INET);
+	if (ret)
+		return ret;
+
+	/* Check to make sure the filter requested is writable ... */
+	ret = writable_filter(f);
+	if (ret) {
+		/* Clear the bits we have set above */
+		cxgb4_clear_ftid(&adapter->tids, filter_id,
+				 fs->type ? PF_INET6 : PF_INET);
+		return ret;
+	}
+
+	/* Clear out any old resources being used by the filter before
+	 * we start constructing the new filter.
+	 */
+	if (f->valid)
+		clear_filter(adapter, f);
+
+	/* Convert the filter specification into our internal format.
+	 * We copy the PF/VF specification into the Outer VLAN field
+	 * here so the rest of the code -- including the interface to
+	 * the firmware -- doesn't have to constantly do these checks.
+	 */
+	f->fs = *fs;
+	f->fs.iq = iq;
+	f->dev = dev;
+
+	iconf = adapter->params.tp.ingress_config;
+	if (iconf & VNIC_F) {
+		f->fs.val.ovlan = (fs->val.pf << 13) | fs->val.vf;
+		f->fs.mask.ovlan = (fs->mask.pf << 13) | fs->mask.vf;
+		f->fs.val.ovlan_vld = fs->val.pfvf_vld;
+		f->fs.mask.ovlan_vld = fs->mask.pfvf_vld;
+	}
+
+	/* Attempt to set the filter.  If we don't succeed, we clear
+	 * it and return the failure.
+	 */
+	f->ctx = ctx;
+	f->tid = fidx; /* Save the actual tid */
+	ret = set_filter_wr(adapter, filter_id);
+	if (ret) {
+		cxgb4_clear_ftid(&adapter->tids, filter_id,
+				 fs->type ? PF_INET6 : PF_INET);
+		clear_filter(adapter, f);
+	}
+
+	return ret;
+}
+
+/* Check a delete filter request for validity and send it to the hardware.
+ * Return 0 on success, an error number otherwise.  We attach any provided
+ * filter operation context to the internal filter specification in order to
+ * facilitate signaling completion of the operation.
+ */
+int __cxgb4_del_filter(struct net_device *dev, int filter_id,
+		       struct filter_ctx *ctx)
+{
+	struct adapter *adapter = netdev2adap(dev);
+	struct filter_entry *f;
+	unsigned int max_fidx;
+	int ret = 0;
+
+	max_fidx = adapter->tids.nftids;
+	if (filter_id != (max_fidx + adapter->tids.nsftids - 1) &&
+	    filter_id >= max_fidx)
+		return -E2BIG;
+
+	f = &adapter->tids.ftid_tab[filter_id];
+	ret = writable_filter(f);
+	if (ret)
+		return ret;
+
+	if (f->valid) {
+		f->ctx = ctx;
+		cxgb4_clear_ftid(&adapter->tids, filter_id,
+				 f->fs.type ? PF_INET6 : PF_INET);
+		return del_filter_wr(adapter, filter_id);
+	}
+
+	/* If the caller has passed in a Completion Context then we need to
+	 * mark it as a successful completion so they don't stall waiting
+	 * for it.
+	 */
+	if (ctx) {
+		ctx->result = 0;
+		complete(&ctx->completion);
+	}
+	return ret;
+}
+
+int cxgb4_set_filter(struct net_device *dev, int filter_id,
+		     struct ch_filter_specification *fs)
+{
+	struct filter_ctx ctx;
+	int ret;
+
+	init_completion(&ctx.completion);
+
+	ret = __cxgb4_set_filter(dev, filter_id, fs, &ctx);
+	if (ret)
+		goto out;
+
+	/* Wait for reply */
+	ret = wait_for_completion_timeout(&ctx.completion, 10 * HZ);
+	if (!ret)
+		return -ETIMEDOUT;
+
+	ret = ctx.result;
+out:
+	return ret;
+}
+
+int cxgb4_del_filter(struct net_device *dev, int filter_id)
+{
+	struct filter_ctx ctx;
+	int ret;
+
+	init_completion(&ctx.completion);
+
+	ret = __cxgb4_del_filter(dev, filter_id, &ctx);
+	if (ret)
+		goto out;
+
+	/* Wait for reply */
+	ret = wait_for_completion_timeout(&ctx.completion, 10 * HZ);
+	if (!ret)
+		return -ETIMEDOUT;
+
+	ret = ctx.result;
+out:
+	return ret;
+}
+
 /* Handle a filter write/deletion reply. */
 void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
 {
-	unsigned int idx = GET_TID(rpl);
-	unsigned int nidx = idx - adap->tids.ftid_base;
-	unsigned int ret;
-	struct filter_entry *f;
+	unsigned int tid = GET_TID(rpl);
+	struct filter_entry *f = NULL;
+	unsigned int max_fidx;
+	int idx;
 
-	if (idx >= adap->tids.ftid_base && nidx <
-	   (adap->tids.nftids + adap->tids.nsftids)) {
-		idx = nidx;
-		ret = TCB_COOKIE_G(rpl->cookie);
+	max_fidx = adap->tids.nftids + adap->tids.nsftids;
+	/* Get the corresponding filter entry for this tid */
+	if (adap->tids.ftid_tab) {
+		/* Check this in normal filter region */
+		idx = tid - adap->tids.ftid_base;
+		if (idx >= max_fidx)
+			return;
 		f = &adap->tids.ftid_tab[idx];
+		if (f->tid != tid)
+			return;
+	}
+
+	/* We found the filter entry for this tid */
+	if (f) {
+		unsigned int ret = TCB_COOKIE_G(rpl->cookie);
+		struct filter_ctx *ctx;
+
+		/* Pull off any filter operation context attached to the
+		 * filter.
+		 */
+		ctx = f->ctx;
+		f->ctx = NULL;
 
 		if (ret == FW_FILTER_WR_FLT_DELETED) {
 			/* Clear the filter when we get confirmation from the
 			 * hardware that the filter has been deleted.
 			 */
 			clear_filter(adap, f);
+			if (ctx)
+				ctx->result = 0;
 		} else if (ret == FW_FILTER_WR_SMT_TBL_FULL) {
 			dev_err(adap->pdev_dev, "filter %u setup failed due to full SMT\n",
 				idx);
 			clear_filter(adap, f);
+			if (ctx)
+				ctx->result = -ENOMEM;
 		} else if (ret == FW_FILTER_WR_FLT_ADDED) {
 			f->smtidx = (be64_to_cpu(rpl->oldval) >> 24) & 0xff;
 			f->pending = 0;  /* asynchronous setup completed */
 			f->valid = 1;
+			if (ctx) {
+				ctx->result = 0;
+				ctx->tid = idx;
+			}
 		} else {
 			/* Something went wrong.  Issue a warning about the
 			 * problem and clear everything out.
@@ -269,6 +713,10 @@ void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
 			dev_err(adap->pdev_dev, "filter %u setup failed with error %u\n",
 				idx, ret);
 			clear_filter(adap, f);
+			if (ctx)
+				ctx->result = -EINVAL;
 		}
+		if (ctx)
+			complete(&ctx->completion);
 	}
 }
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
index f6bd0bf..23742cb 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
@@ -44,4 +44,5 @@ int set_filter_wr(struct adapter *adapter, int fidx);
 int delete_filter(struct adapter *adapter, unsigned int fidx);
 
 int writable_filter(struct filter_entry *f);
+void clear_all_filters(struct adapter *adapter);
 #endif /* __CXGB4_FILTER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 00f4d7f..8909f96 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -1498,17 +1498,20 @@ static int tid_init(struct tid_info *t)
 {
 	size_t size;
 	unsigned int stid_bmap_size;
+	unsigned int ftid_bmap_size;
 	unsigned int natids = t->natids;
+	unsigned int max_ftids = t->nftids + t->nsftids;
 	struct adapter *adap = container_of(t, struct adapter, tids);
 
 	stid_bmap_size = BITS_TO_LONGS(t->nstids + t->nsftids);
+	ftid_bmap_size = BITS_TO_LONGS(t->nftids);
 	size = t->ntids * sizeof(*t->tid_tab) +
 	       natids * sizeof(*t->atid_tab) +
 	       t->nstids * sizeof(*t->stid_tab) +
 	       t->nsftids * sizeof(*t->stid_tab) +
 	       stid_bmap_size * sizeof(long) +
-	       t->nftids * sizeof(*t->ftid_tab) +
-	       t->nsftids * sizeof(*t->ftid_tab);
+	       max_ftids * sizeof(*t->ftid_tab) +
+	       ftid_bmap_size * sizeof(long);
 
 	t->tid_tab = t4_alloc_mem(size);
 	if (!t->tid_tab)
@@ -1518,8 +1521,10 @@ static int tid_init(struct tid_info *t)
 	t->stid_tab = (struct serv_entry *)&t->atid_tab[natids];
 	t->stid_bmap = (unsigned long *)&t->stid_tab[t->nstids + t->nsftids];
 	t->ftid_tab = (struct filter_entry *)&t->stid_bmap[stid_bmap_size];
+	t->ftid_bmap = (unsigned long *)&t->ftid_tab[max_ftids];
 	spin_lock_init(&t->stid_lock);
 	spin_lock_init(&t->atid_lock);
+	spin_lock_init(&t->ftid_lock);
 
 	t->stids_in_use = 0;
 	t->sftids_in_use = 0;
@@ -1534,12 +1539,16 @@ static int tid_init(struct tid_info *t)
 			t->atid_tab[natids - 1].next = &t->atid_tab[natids];
 		t->afree = t->atid_tab;
 	}
-	bitmap_zero(t->stid_bmap, t->nstids + t->nsftids);
-	/* Reserve stid 0 for T4/T5 adapters */
-	if (!t->stid_base &&
-	    (CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5))
-		__set_bit(0, t->stid_bmap);
 
+	if (is_offload(adap)) {
+		bitmap_zero(t->stid_bmap, t->nstids + t->nsftids);
+		/* Reserve stid 0 for T4/T5 adapters */
+		if (!t->stid_base &&
+		    CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5)
+			__set_bit(0, t->stid_bmap);
+	}
+
+	bitmap_zero(t->ftid_bmap, t->nftids);
 	return 0;
 }
 
@@ -5200,7 +5209,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 				 i);
 	}
 
-	if (is_offload(adapter) && tid_init(&adapter->tids) < 0) {
+	if (tid_init(&adapter->tids) < 0) {
 		dev_warn(&pdev->dev, "could not allocate TID table, "
 			 "continuing\n");
 		adapter->params.offload = 0;
@@ -5383,13 +5392,7 @@ static void remove_one(struct pci_dev *pdev)
 		/* If we allocated filters, free up state associated with any
 		 * valid filters ...
 		 */
-		if (adapter->tids.ftid_tab) {
-			struct filter_entry *f = &adapter->tids.ftid_tab[0];
-			for (i = 0; i < (adapter->tids.nftids +
-					adapter->tids.nsftids); i++, f++)
-				if (f->valid)
-					clear_filter(adapter, f);
-		}
+		clear_all_filters(adapter);
 
 		if (adapter->flags & FULL_INIT_DONE)
 			cxgb_down(adapter);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index ab40372..91339d3 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -1,7 +1,7 @@
 /*
  * This file is part of the Chelsio T4 Ethernet driver for Linux.
  *
- * Copyright (c) 2003-2014 Chelsio Communications, Inc. All rights reserved.
+ * Copyright (c) 2003-2016 Chelsio Communications, Inc. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -104,6 +104,7 @@ struct tid_info {
 	unsigned int atid_base;
 
 	struct filter_entry *ftid_tab;
+	unsigned long *ftid_bmap;
 	unsigned int nftids;
 	unsigned int ftid_base;
 	unsigned int aftid_base;
@@ -124,6 +125,8 @@ struct tid_info {
 	atomic_t tids_in_use;
 	/* TIDs in the HASH */
 	atomic_t hash_tids_in_use;
+	/* lock for setting/clearing filter bitmap */
+	spinlock_t ftid_lock;
 };
 
 static inline void *lookup_tid(const struct tid_info *t, unsigned int tid)
@@ -183,6 +186,27 @@ int cxgb4_create_server_filter(const struct net_device *dev, unsigned int stid,
 int cxgb4_remove_server_filter(const struct net_device *dev, unsigned int stid,
 			       unsigned int queue, bool ipv6);
 
+/* Filter operation context to allow callers of cxgb4_set_filter() and
+ * cxgb4_del_filter() to wait for an asynchronous completion.
+ */
+struct filter_ctx {
+	struct completion completion;	/* completion rendezvous */
+	void *closure;			/* caller's opaque information */
+	int result;			/* result of operation */
+	u32 tid;			/* to store tid */
+};
+
+struct ch_filter_specification;
+
+int __cxgb4_set_filter(struct net_device *dev, int filter_id,
+		       struct ch_filter_specification *fs,
+		       struct filter_ctx *ctx);
+int __cxgb4_del_filter(struct net_device *dev, int filter_id,
+		       struct filter_ctx *ctx);
+int cxgb4_set_filter(struct net_device *dev, int filter_id,
+		     struct ch_filter_specification *fs);
+int cxgb4_del_filter(struct net_device *dev, int filter_id);
+
 static inline void set_wr_txq(struct sk_buff *skb, int prio, int queue)
 {
 	skb_set_queue_mapping(skb, (queue << 1) | prio);
-- 
2.5.3

^ permalink raw reply related

* [PATCH net-next v2 1/5] cxgb4: move common filter code to separate file
From: Rahul Lakkireddy @ 2016-09-13 11:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, hariprasad, leedom, nirranjan, indranil, Rahul Lakkireddy
In-Reply-To: <cover.1473759872.git.rahul.lakkireddy@chelsio.com>

Move common filter code to separate files.  Also fix the following
checkpatch checks.

CHECK: Comparison to NULL could be written "!f->l2t"
+               if (f->l2t == NULL) {

CHECK: spaces preferred around that '/' (ctx:VxV)
+       fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr)/16));

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/Makefile       |   2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h        |  23 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c | 274 ++++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h |  47 ++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 264 +--------------------
 5 files changed, 346 insertions(+), 264 deletions(-)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h

diff --git a/drivers/net/ethernet/chelsio/cxgb4/Makefile b/drivers/net/ethernet/chelsio/cxgb4/Makefile
index 2461296..da88981 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/Makefile
+++ b/drivers/net/ethernet/chelsio/cxgb4/Makefile
@@ -4,7 +4,7 @@
 
 obj-$(CONFIG_CHELSIO_T4) += cxgb4.o
 
-cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o
+cxgb4-objs := cxgb4_main.o l2t.o t4_hw.o sge.o clip_tbl.o cxgb4_ethtool.o cxgb4_uld.o sched.o cxgb4_filter.o
 cxgb4-$(CONFIG_CHELSIO_T4_DCB) +=  cxgb4_dcb.o
 cxgb4-$(CONFIG_CHELSIO_T4_FCOE) +=  cxgb4_fcoe.o
 cxgb4-$(CONFIG_DEBUG_FS) += cxgb4_debugfs.o
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 4595569..275c4f0 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -1041,6 +1041,29 @@ enum {
 	VLAN_REWRITE
 };
 
+/* Host shadow copy of ingress filter entry.  This is in host native format
+ * and doesn't match the ordering or bit order, etc. of the hardware of the
+ * firmware command.  The use of bit-field structure elements is purely to
+ * remind ourselves of the field size limitations and save memory in the case
+ * where the filter table is large.
+ */
+struct filter_entry {
+	/* Administrative fields for filter. */
+	u32 valid:1;            /* filter allocated and valid */
+	u32 locked:1;           /* filter is administratively locked */
+
+	u32 pending:1;          /* filter action is pending firmware reply */
+	u32 smtidx:8;           /* Source MAC Table index for smac */
+	struct l2t_entry *l2t;  /* Layer Two Table entry for dmac */
+
+	/* The filter itself.  Most of this is a straight copy of information
+	 * provided by the extended ioctl().  Some fields are translated to
+	 * internal forms -- for instance the Ingress Queue ID passed in from
+	 * the ioctl() is translated into the Absolute Ingress Queue ID.
+	 */
+	struct ch_filter_specification fs;
+};
+
 static inline int is_offload(const struct adapter *adap)
 {
 	return adap->params.offload;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
new file mode 100644
index 0000000..e224bf5
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -0,0 +1,274 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2003-2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include "cxgb4.h"
+#include "l2t.h"
+#include "t4fw_api.h"
+#include "cxgb4_filter.h"
+
+/* Delete the filter at a specified index. */
+static int del_filter_wr(struct adapter *adapter, int fidx)
+{
+	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
+	struct sk_buff *skb;
+	struct fw_filter_wr *fwr;
+	unsigned int len, ftid;
+
+	len = sizeof(*fwr);
+	ftid = adapter->tids.ftid_base + fidx;
+
+	skb = alloc_skb(len, GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	fwr = (struct fw_filter_wr *)__skb_put(skb, len);
+	t4_mk_filtdelwr(ftid, fwr, adapter->sge.fw_evtq.abs_id);
+
+	/* Mark the filter as "pending" and ship off the Filter Work Request.
+	 * When we get the Work Request Reply we'll clear the pending status.
+	 */
+	f->pending = 1;
+	t4_mgmt_tx(adapter, skb);
+	return 0;
+}
+
+/* Send a Work Request to write the filter at a specified index.  We construct
+ * a Firmware Filter Work Request to have the work done and put the indicated
+ * filter into "pending" mode which will prevent any further actions against
+ * it till we get a reply from the firmware on the completion status of the
+ * request.
+ */
+int set_filter_wr(struct adapter *adapter, int fidx)
+{
+	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
+	struct sk_buff *skb;
+	struct fw_filter_wr *fwr;
+	unsigned int ftid;
+
+	skb = alloc_skb(sizeof(*fwr), GFP_KERNEL);
+	if (!skb)
+		return -ENOMEM;
+
+	/* If the new filter requires loopback Destination MAC and/or VLAN
+	 * rewriting then we need to allocate a Layer 2 Table (L2T) entry for
+	 * the filter.
+	 */
+	if (f->fs.newdmac || f->fs.newvlan) {
+		/* allocate L2T entry for new filter */
+		f->l2t = t4_l2t_alloc_switching(adapter, f->fs.vlan,
+						f->fs.eport, f->fs.dmac);
+		if (!f->l2t) {
+			kfree_skb(skb);
+			return -ENOMEM;
+		}
+	}
+
+	ftid = adapter->tids.ftid_base + fidx;
+
+	fwr = (struct fw_filter_wr *)__skb_put(skb, sizeof(*fwr));
+	memset(fwr, 0, sizeof(*fwr));
+
+	/* It would be nice to put most of the following in t4_hw.c but most
+	 * of the work is translating the cxgbtool ch_filter_specification
+	 * into the Work Request and the definition of that structure is
+	 * currently in cxgbtool.h which isn't appropriate to pull into the
+	 * common code.  We may eventually try to come up with a more neutral
+	 * filter specification structure but for now it's easiest to simply
+	 * put this fairly direct code in line ...
+	 */
+	fwr->op_pkd = htonl(FW_WR_OP_V(FW_FILTER_WR));
+	fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr) / 16));
+	fwr->tid_to_iq =
+		htonl(FW_FILTER_WR_TID_V(ftid) |
+		      FW_FILTER_WR_RQTYPE_V(f->fs.type) |
+		      FW_FILTER_WR_NOREPLY_V(0) |
+		      FW_FILTER_WR_IQ_V(f->fs.iq));
+	fwr->del_filter_to_l2tix =
+		htonl(FW_FILTER_WR_RPTTID_V(f->fs.rpttid) |
+		      FW_FILTER_WR_DROP_V(f->fs.action == FILTER_DROP) |
+		      FW_FILTER_WR_DIRSTEER_V(f->fs.dirsteer) |
+		      FW_FILTER_WR_MASKHASH_V(f->fs.maskhash) |
+		      FW_FILTER_WR_DIRSTEERHASH_V(f->fs.dirsteerhash) |
+		      FW_FILTER_WR_LPBK_V(f->fs.action == FILTER_SWITCH) |
+		      FW_FILTER_WR_DMAC_V(f->fs.newdmac) |
+		      FW_FILTER_WR_SMAC_V(f->fs.newsmac) |
+		      FW_FILTER_WR_INSVLAN_V(f->fs.newvlan == VLAN_INSERT ||
+					     f->fs.newvlan == VLAN_REWRITE) |
+		      FW_FILTER_WR_RMVLAN_V(f->fs.newvlan == VLAN_REMOVE ||
+					    f->fs.newvlan == VLAN_REWRITE) |
+		      FW_FILTER_WR_HITCNTS_V(f->fs.hitcnts) |
+		      FW_FILTER_WR_TXCHAN_V(f->fs.eport) |
+		      FW_FILTER_WR_PRIO_V(f->fs.prio) |
+		      FW_FILTER_WR_L2TIX_V(f->l2t ? f->l2t->idx : 0));
+	fwr->ethtype = htons(f->fs.val.ethtype);
+	fwr->ethtypem = htons(f->fs.mask.ethtype);
+	fwr->frag_to_ovlan_vldm =
+		(FW_FILTER_WR_FRAG_V(f->fs.val.frag) |
+		 FW_FILTER_WR_FRAGM_V(f->fs.mask.frag) |
+		 FW_FILTER_WR_IVLAN_VLD_V(f->fs.val.ivlan_vld) |
+		 FW_FILTER_WR_OVLAN_VLD_V(f->fs.val.ovlan_vld) |
+		 FW_FILTER_WR_IVLAN_VLDM_V(f->fs.mask.ivlan_vld) |
+		 FW_FILTER_WR_OVLAN_VLDM_V(f->fs.mask.ovlan_vld));
+	fwr->smac_sel = 0;
+	fwr->rx_chan_rx_rpl_iq =
+		htons(FW_FILTER_WR_RX_CHAN_V(0) |
+		      FW_FILTER_WR_RX_RPL_IQ_V(adapter->sge.fw_evtq.abs_id));
+	fwr->maci_to_matchtypem =
+		htonl(FW_FILTER_WR_MACI_V(f->fs.val.macidx) |
+		      FW_FILTER_WR_MACIM_V(f->fs.mask.macidx) |
+		      FW_FILTER_WR_FCOE_V(f->fs.val.fcoe) |
+		      FW_FILTER_WR_FCOEM_V(f->fs.mask.fcoe) |
+		      FW_FILTER_WR_PORT_V(f->fs.val.iport) |
+		      FW_FILTER_WR_PORTM_V(f->fs.mask.iport) |
+		      FW_FILTER_WR_MATCHTYPE_V(f->fs.val.matchtype) |
+		      FW_FILTER_WR_MATCHTYPEM_V(f->fs.mask.matchtype));
+	fwr->ptcl = f->fs.val.proto;
+	fwr->ptclm = f->fs.mask.proto;
+	fwr->ttyp = f->fs.val.tos;
+	fwr->ttypm = f->fs.mask.tos;
+	fwr->ivlan = htons(f->fs.val.ivlan);
+	fwr->ivlanm = htons(f->fs.mask.ivlan);
+	fwr->ovlan = htons(f->fs.val.ovlan);
+	fwr->ovlanm = htons(f->fs.mask.ovlan);
+	memcpy(fwr->lip, f->fs.val.lip, sizeof(fwr->lip));
+	memcpy(fwr->lipm, f->fs.mask.lip, sizeof(fwr->lipm));
+	memcpy(fwr->fip, f->fs.val.fip, sizeof(fwr->fip));
+	memcpy(fwr->fipm, f->fs.mask.fip, sizeof(fwr->fipm));
+	fwr->lp = htons(f->fs.val.lport);
+	fwr->lpm = htons(f->fs.mask.lport);
+	fwr->fp = htons(f->fs.val.fport);
+	fwr->fpm = htons(f->fs.mask.fport);
+	if (f->fs.newsmac)
+		memcpy(fwr->sma, f->fs.smac, sizeof(fwr->sma));
+
+	/* Mark the filter as "pending" and ship off the Filter Work Request.
+	 * When we get the Work Request Reply we'll clear the pending status.
+	 */
+	f->pending = 1;
+	set_wr_txq(skb, CPL_PRIORITY_CONTROL, f->fs.val.iport & 0x3);
+	t4_ofld_send(adapter, skb);
+	return 0;
+}
+
+/* Return an error number if the indicated filter isn't writable ... */
+int writable_filter(struct filter_entry *f)
+{
+	if (f->locked)
+		return -EPERM;
+	if (f->pending)
+		return -EBUSY;
+
+	return 0;
+}
+
+/* Delete the filter at the specified index (if valid).  The checks for all
+ * the common problems with doing this like the filter being locked, currently
+ * pending in another operation, etc.
+ */
+int delete_filter(struct adapter *adapter, unsigned int fidx)
+{
+	struct filter_entry *f;
+	int ret;
+
+	if (fidx >= adapter->tids.nftids + adapter->tids.nsftids)
+		return -EINVAL;
+
+	f = &adapter->tids.ftid_tab[fidx];
+	ret = writable_filter(f);
+	if (ret)
+		return ret;
+	if (f->valid)
+		return del_filter_wr(adapter, fidx);
+
+	return 0;
+}
+
+/* Clear a filter and release any of its resources that we own.  This also
+ * clears the filter's "pending" status.
+ */
+void clear_filter(struct adapter *adap, struct filter_entry *f)
+{
+	/* If the new or old filter have loopback rewriteing rules then we'll
+	 * need to free any existing Layer Two Table (L2T) entries of the old
+	 * filter rule.  The firmware will handle freeing up any Source MAC
+	 * Table (SMT) entries used for rewriting Source MAC Addresses in
+	 * loopback rules.
+	 */
+	if (f->l2t)
+		cxgb4_l2t_release(f->l2t);
+
+	/* The zeroing of the filter rule below clears the filter valid,
+	 * pending, locked flags, l2t pointer, etc. so it's all we need for
+	 * this operation.
+	 */
+	memset(f, 0, sizeof(*f));
+}
+
+/* Handle a filter write/deletion reply. */
+void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
+{
+	unsigned int idx = GET_TID(rpl);
+	unsigned int nidx = idx - adap->tids.ftid_base;
+	unsigned int ret;
+	struct filter_entry *f;
+
+	if (idx >= adap->tids.ftid_base && nidx <
+	   (adap->tids.nftids + adap->tids.nsftids)) {
+		idx = nidx;
+		ret = TCB_COOKIE_G(rpl->cookie);
+		f = &adap->tids.ftid_tab[idx];
+
+		if (ret == FW_FILTER_WR_FLT_DELETED) {
+			/* Clear the filter when we get confirmation from the
+			 * hardware that the filter has been deleted.
+			 */
+			clear_filter(adap, f);
+		} else if (ret == FW_FILTER_WR_SMT_TBL_FULL) {
+			dev_err(adap->pdev_dev, "filter %u setup failed due to full SMT\n",
+				idx);
+			clear_filter(adap, f);
+		} else if (ret == FW_FILTER_WR_FLT_ADDED) {
+			f->smtidx = (be64_to_cpu(rpl->oldval) >> 24) & 0xff;
+			f->pending = 0;  /* asynchronous setup completed */
+			f->valid = 1;
+		} else {
+			/* Something went wrong.  Issue a warning about the
+			 * problem and clear everything out.
+			 */
+			dev_err(adap->pdev_dev, "filter %u setup failed with error %u\n",
+				idx, ret);
+			clear_filter(adap, f);
+		}
+	}
+}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
new file mode 100644
index 0000000..f6bd0bf
--- /dev/null
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
@@ -0,0 +1,47 @@
+/*
+ * This file is part of the Chelsio T4 Ethernet driver for Linux.
+ *
+ * Copyright (c) 2003-2016 Chelsio Communications, Inc. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef __CXGB4_FILTER_H
+#define __CXGB4_FILTER_H
+
+#include "t4_msg.h"
+
+void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl);
+void clear_filter(struct adapter *adap, struct filter_entry *f);
+
+int set_filter_wr(struct adapter *adapter, int fidx);
+int delete_filter(struct adapter *adapter, unsigned int fidx);
+
+int writable_filter(struct filter_entry *f);
+#endif /* __CXGB4_FILTER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 44cc976..00f4d7f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -67,6 +67,7 @@
 #include <linux/crash_dump.h>
 
 #include "cxgb4.h"
+#include "cxgb4_filter.h"
 #include "t4_regs.h"
 #include "t4_values.h"
 #include "t4_msg.h"
@@ -87,30 +88,6 @@ char cxgb4_driver_name[] = KBUILD_MODNAME;
 const char cxgb4_driver_version[] = DRV_VERSION;
 #define DRV_DESC "Chelsio T4/T5/T6 Network Driver"
 
-/* Host shadow copy of ingress filter entry.  This is in host native format
- * and doesn't match the ordering or bit order, etc. of the hardware of the
- * firmware command.  The use of bit-field structure elements is purely to
- * remind ourselves of the field size limitations and save memory in the case
- * where the filter table is large.
- */
-struct filter_entry {
-	/* Administrative fields for filter.
-	 */
-	u32 valid:1;            /* filter allocated and valid */
-	u32 locked:1;           /* filter is administratively locked */
-
-	u32 pending:1;          /* filter action is pending firmware reply */
-	u32 smtidx:8;           /* Source MAC Table index for smac */
-	struct l2t_entry *l2t;  /* Layer Two Table entry for dmac */
-
-	/* The filter itself.  Most of this is a straight copy of information
-	 * provided by the extended ioctl().  Some fields are translated to
-	 * internal forms -- for instance the Ingress Queue ID passed in from
-	 * the ioctl() is translated into the Absolute Ingress Queue ID.
-	 */
-	struct ch_filter_specification fs;
-};
-
 #define DFLT_MSG_ENABLE (NETIF_MSG_DRV | NETIF_MSG_PROBE | NETIF_MSG_LINK | \
 			 NETIF_MSG_TIMER | NETIF_MSG_IFDOWN | NETIF_MSG_IFUP |\
 			 NETIF_MSG_RX_ERR | NETIF_MSG_TX_ERR)
@@ -532,66 +509,6 @@ static void dcb_rpl(struct adapter *adap, const struct fw_port_cmd *pcmd)
 }
 #endif /* CONFIG_CHELSIO_T4_DCB */
 
-/* Clear a filter and release any of its resources that we own.  This also
- * clears the filter's "pending" status.
- */
-static void clear_filter(struct adapter *adap, struct filter_entry *f)
-{
-	/* If the new or old filter have loopback rewriteing rules then we'll
-	 * need to free any existing Layer Two Table (L2T) entries of the old
-	 * filter rule.  The firmware will handle freeing up any Source MAC
-	 * Table (SMT) entries used for rewriting Source MAC Addresses in
-	 * loopback rules.
-	 */
-	if (f->l2t)
-		cxgb4_l2t_release(f->l2t);
-
-	/* The zeroing of the filter rule below clears the filter valid,
-	 * pending, locked flags, l2t pointer, etc. so it's all we need for
-	 * this operation.
-	 */
-	memset(f, 0, sizeof(*f));
-}
-
-/* Handle a filter write/deletion reply.
- */
-static void filter_rpl(struct adapter *adap, const struct cpl_set_tcb_rpl *rpl)
-{
-	unsigned int idx = GET_TID(rpl);
-	unsigned int nidx = idx - adap->tids.ftid_base;
-	unsigned int ret;
-	struct filter_entry *f;
-
-	if (idx >= adap->tids.ftid_base && nidx <
-	   (adap->tids.nftids + adap->tids.nsftids)) {
-		idx = nidx;
-		ret = TCB_COOKIE_G(rpl->cookie);
-		f = &adap->tids.ftid_tab[idx];
-
-		if (ret == FW_FILTER_WR_FLT_DELETED) {
-			/* Clear the filter when we get confirmation from the
-			 * hardware that the filter has been deleted.
-			 */
-			clear_filter(adap, f);
-		} else if (ret == FW_FILTER_WR_SMT_TBL_FULL) {
-			dev_err(adap->pdev_dev, "filter %u setup failed due to full SMT\n",
-				idx);
-			clear_filter(adap, f);
-		} else if (ret == FW_FILTER_WR_FLT_ADDED) {
-			f->smtidx = (be64_to_cpu(rpl->oldval) >> 24) & 0xff;
-			f->pending = 0;  /* asynchronous setup completed */
-			f->valid = 1;
-		} else {
-			/* Something went wrong.  Issue a warning about the
-			 * problem and clear everything out.
-			 */
-			dev_err(adap->pdev_dev, "filter %u setup failed with error %u\n",
-				idx, ret);
-			clear_filter(adap, f);
-		}
-	}
-}
-
 /* Response queue handler for the FW event queue.
  */
 static int fwevtq_handler(struct sge_rspq *q, const __be64 *rsp,
@@ -1198,151 +1115,6 @@ void t4_free_mem(void *addr)
 	kvfree(addr);
 }
 
-/* Send a Work Request to write the filter at a specified index.  We construct
- * a Firmware Filter Work Request to have the work done and put the indicated
- * filter into "pending" mode which will prevent any further actions against
- * it till we get a reply from the firmware on the completion status of the
- * request.
- */
-static int set_filter_wr(struct adapter *adapter, int fidx)
-{
-	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
-	struct sk_buff *skb;
-	struct fw_filter_wr *fwr;
-	unsigned int ftid;
-
-	skb = alloc_skb(sizeof(*fwr), GFP_KERNEL);
-	if (!skb)
-		return -ENOMEM;
-
-	/* If the new filter requires loopback Destination MAC and/or VLAN
-	 * rewriting then we need to allocate a Layer 2 Table (L2T) entry for
-	 * the filter.
-	 */
-	if (f->fs.newdmac || f->fs.newvlan) {
-		/* allocate L2T entry for new filter */
-		f->l2t = t4_l2t_alloc_switching(adapter, f->fs.vlan,
-						f->fs.eport, f->fs.dmac);
-		if (f->l2t == NULL) {
-			kfree_skb(skb);
-			return -ENOMEM;
-		}
-	}
-
-	ftid = adapter->tids.ftid_base + fidx;
-
-	fwr = (struct fw_filter_wr *)__skb_put(skb, sizeof(*fwr));
-	memset(fwr, 0, sizeof(*fwr));
-
-	/* It would be nice to put most of the following in t4_hw.c but most
-	 * of the work is translating the cxgbtool ch_filter_specification
-	 * into the Work Request and the definition of that structure is
-	 * currently in cxgbtool.h which isn't appropriate to pull into the
-	 * common code.  We may eventually try to come up with a more neutral
-	 * filter specification structure but for now it's easiest to simply
-	 * put this fairly direct code in line ...
-	 */
-	fwr->op_pkd = htonl(FW_WR_OP_V(FW_FILTER_WR));
-	fwr->len16_pkd = htonl(FW_WR_LEN16_V(sizeof(*fwr)/16));
-	fwr->tid_to_iq =
-		htonl(FW_FILTER_WR_TID_V(ftid) |
-		      FW_FILTER_WR_RQTYPE_V(f->fs.type) |
-		      FW_FILTER_WR_NOREPLY_V(0) |
-		      FW_FILTER_WR_IQ_V(f->fs.iq));
-	fwr->del_filter_to_l2tix =
-		htonl(FW_FILTER_WR_RPTTID_V(f->fs.rpttid) |
-		      FW_FILTER_WR_DROP_V(f->fs.action == FILTER_DROP) |
-		      FW_FILTER_WR_DIRSTEER_V(f->fs.dirsteer) |
-		      FW_FILTER_WR_MASKHASH_V(f->fs.maskhash) |
-		      FW_FILTER_WR_DIRSTEERHASH_V(f->fs.dirsteerhash) |
-		      FW_FILTER_WR_LPBK_V(f->fs.action == FILTER_SWITCH) |
-		      FW_FILTER_WR_DMAC_V(f->fs.newdmac) |
-		      FW_FILTER_WR_SMAC_V(f->fs.newsmac) |
-		      FW_FILTER_WR_INSVLAN_V(f->fs.newvlan == VLAN_INSERT ||
-					     f->fs.newvlan == VLAN_REWRITE) |
-		      FW_FILTER_WR_RMVLAN_V(f->fs.newvlan == VLAN_REMOVE ||
-					    f->fs.newvlan == VLAN_REWRITE) |
-		      FW_FILTER_WR_HITCNTS_V(f->fs.hitcnts) |
-		      FW_FILTER_WR_TXCHAN_V(f->fs.eport) |
-		      FW_FILTER_WR_PRIO_V(f->fs.prio) |
-		      FW_FILTER_WR_L2TIX_V(f->l2t ? f->l2t->idx : 0));
-	fwr->ethtype = htons(f->fs.val.ethtype);
-	fwr->ethtypem = htons(f->fs.mask.ethtype);
-	fwr->frag_to_ovlan_vldm =
-		(FW_FILTER_WR_FRAG_V(f->fs.val.frag) |
-		 FW_FILTER_WR_FRAGM_V(f->fs.mask.frag) |
-		 FW_FILTER_WR_IVLAN_VLD_V(f->fs.val.ivlan_vld) |
-		 FW_FILTER_WR_OVLAN_VLD_V(f->fs.val.ovlan_vld) |
-		 FW_FILTER_WR_IVLAN_VLDM_V(f->fs.mask.ivlan_vld) |
-		 FW_FILTER_WR_OVLAN_VLDM_V(f->fs.mask.ovlan_vld));
-	fwr->smac_sel = 0;
-	fwr->rx_chan_rx_rpl_iq =
-		htons(FW_FILTER_WR_RX_CHAN_V(0) |
-		      FW_FILTER_WR_RX_RPL_IQ_V(adapter->sge.fw_evtq.abs_id));
-	fwr->maci_to_matchtypem =
-		htonl(FW_FILTER_WR_MACI_V(f->fs.val.macidx) |
-		      FW_FILTER_WR_MACIM_V(f->fs.mask.macidx) |
-		      FW_FILTER_WR_FCOE_V(f->fs.val.fcoe) |
-		      FW_FILTER_WR_FCOEM_V(f->fs.mask.fcoe) |
-		      FW_FILTER_WR_PORT_V(f->fs.val.iport) |
-		      FW_FILTER_WR_PORTM_V(f->fs.mask.iport) |
-		      FW_FILTER_WR_MATCHTYPE_V(f->fs.val.matchtype) |
-		      FW_FILTER_WR_MATCHTYPEM_V(f->fs.mask.matchtype));
-	fwr->ptcl = f->fs.val.proto;
-	fwr->ptclm = f->fs.mask.proto;
-	fwr->ttyp = f->fs.val.tos;
-	fwr->ttypm = f->fs.mask.tos;
-	fwr->ivlan = htons(f->fs.val.ivlan);
-	fwr->ivlanm = htons(f->fs.mask.ivlan);
-	fwr->ovlan = htons(f->fs.val.ovlan);
-	fwr->ovlanm = htons(f->fs.mask.ovlan);
-	memcpy(fwr->lip, f->fs.val.lip, sizeof(fwr->lip));
-	memcpy(fwr->lipm, f->fs.mask.lip, sizeof(fwr->lipm));
-	memcpy(fwr->fip, f->fs.val.fip, sizeof(fwr->fip));
-	memcpy(fwr->fipm, f->fs.mask.fip, sizeof(fwr->fipm));
-	fwr->lp = htons(f->fs.val.lport);
-	fwr->lpm = htons(f->fs.mask.lport);
-	fwr->fp = htons(f->fs.val.fport);
-	fwr->fpm = htons(f->fs.mask.fport);
-	if (f->fs.newsmac)
-		memcpy(fwr->sma, f->fs.smac, sizeof(fwr->sma));
-
-	/* Mark the filter as "pending" and ship off the Filter Work Request.
-	 * When we get the Work Request Reply we'll clear the pending status.
-	 */
-	f->pending = 1;
-	set_wr_txq(skb, CPL_PRIORITY_CONTROL, f->fs.val.iport & 0x3);
-	t4_ofld_send(adapter, skb);
-	return 0;
-}
-
-/* Delete the filter at a specified index.
- */
-static int del_filter_wr(struct adapter *adapter, int fidx)
-{
-	struct filter_entry *f = &adapter->tids.ftid_tab[fidx];
-	struct sk_buff *skb;
-	struct fw_filter_wr *fwr;
-	unsigned int len, ftid;
-
-	len = sizeof(*fwr);
-	ftid = adapter->tids.ftid_base + fidx;
-
-	skb = alloc_skb(len, GFP_KERNEL);
-	if (!skb)
-		return -ENOMEM;
-
-	fwr = (struct fw_filter_wr *)__skb_put(skb, len);
-	t4_mk_filtdelwr(ftid, fwr, adapter->sge.fw_evtq.abs_id);
-
-	/* Mark the filter as "pending" and ship off the Filter Work Request.
-	 * When we get the Work Request Reply we'll clear the pending status.
-	 */
-	f->pending = 1;
-	t4_mgmt_tx(adapter, skb);
-	return 0;
-}
-
 static u16 cxgb_select_queue(struct net_device *dev, struct sk_buff *skb,
 			     void *accel_priv, select_queue_fallback_t fallback)
 {
@@ -2830,40 +2602,6 @@ static int cxgb_close(struct net_device *dev)
 	return t4_enable_vi(adapter, adapter->pf, pi->viid, false, false);
 }
 
-/* Return an error number if the indicated filter isn't writable ...
- */
-static int writable_filter(struct filter_entry *f)
-{
-	if (f->locked)
-		return -EPERM;
-	if (f->pending)
-		return -EBUSY;
-
-	return 0;
-}
-
-/* Delete the filter at the specified index (if valid).  The checks for all
- * the common problems with doing this like the filter being locked, currently
- * pending in another operation, etc.
- */
-static int delete_filter(struct adapter *adapter, unsigned int fidx)
-{
-	struct filter_entry *f;
-	int ret;
-
-	if (fidx >= adapter->tids.nftids + adapter->tids.nsftids)
-		return -EINVAL;
-
-	f = &adapter->tids.ftid_tab[fidx];
-	ret = writable_filter(f);
-	if (ret)
-		return ret;
-	if (f->valid)
-		return del_filter_wr(adapter, fidx);
-
-	return 0;
-}
-
 int cxgb4_create_server_filter(const struct net_device *dev, unsigned int stid,
 		__be32 sip, __be16 sport, __be16 vlan,
 		unsigned int queue, unsigned char port, unsigned char mask)
-- 
2.5.3

^ permalink raw reply related

* [PATCH net-next v2 0/5] cxgb4: add support for offloading TC u32 filters
From: Rahul Lakkireddy @ 2016-09-13 11:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, hariprasad, leedom, nirranjan, indranil, Rahul Lakkireddy

This series of patches add support to offload TC u32 filters onto
Chelsio NICs.

Patch 1 moves current common filter code to separate files
in order to provide a common api for performing packet classification
and filtering in Chelsio NICs.

Patch 2 enables filters for normal NIC configuration and implements
common api for setting and deleting filters.

Patches 3-5 add support for TC u32 offload via ndo_setup_tc.

---
v2:
Based on review and suggestions from Jiri Pirko <jiri@resnulli.us>:
- Replaced macros S and U with appropriate static helper functions.
- Moved completion code for set and delete filters to respective
  functions cxgb4_set_filter() and cxgb4_del_filter().  Renamed the
  original functions to __cxgb4_set_filter() and __cxgb4_del_filter()
  in case synchronization is not required.
- Dropped debugfs patch.
- Merged code for inserting and deleting u32 filters into a single
  patch.
- Reworked and fixed bugs with traversing the actions list.
- Removed all unnecessary extra ().

Rahul Lakkireddy (5):
  cxgb4: move common filter code to separate file
  cxgb4: add common api support for configuring filters
  cxgb4: add parser to translate u32 filters to internal spec
  cxgb4: add support for offloading u32 filters
  cxgb4: add support for drop and redirect actions

 drivers/net/ethernet/chelsio/cxgb4/Makefile        |   2 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h         |  29 +
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c  | 722 +++++++++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h  |  48 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c    | 338 ++--------
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c  | 485 ++++++++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h  |  57 ++
 .../ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h    | 294 +++++++++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h     |  26 +-
 9 files changed, 1720 insertions(+), 281 deletions(-)
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.h
 create mode 100644 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32_parse.h

-- 
2.5.3

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox