Netdev List

* Re: [PATCH 1/4] dt-bindings: net: ravb: Document optional reset-gpios property
From: Sergei Shtylyov @ 2017-09-28 20:07 UTC
  To: Geert Uytterhoeven, David S . Miller, Simon Horman, Magnus Damm
  Cc: Andrew Lunn, Florian Fainelli, Niklas Söderlund, netdev,
	linux-renesas-soc, devicetree
In-Reply-To: <1506614014-4398-2-git-send-email-geert+renesas@glider.be>

Hello!

On 09/28/2017 06:53 PM, Geert Uytterhoeven wrote:

> The optional "reset-gpios" property (part of the generic MDIO bus
> properties) lets us describe the GPIO used for resetting the Ethernet
> PHY.
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
>   Documentation/devicetree/bindings/net/renesas,ravb.txt | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/renesas,ravb.txt b/Documentation/devicetree/bindings/net/renesas,ravb.txt
> index c902261893b913f5..4a6ec1ba32d0bf16 100644
> --- a/Documentation/devicetree/bindings/net/renesas,ravb.txt
> +++ b/Documentation/devicetree/bindings/net/renesas,ravb.txt
> @@ -52,6 +52,7 @@ Optional properties:
>   			 AVB_LINK signal.
>   - renesas,ether-link-active-low: boolean, specify when the AVB_LINK signal is
>   				 active-low instead of normal active-high.
> +- reset-gpios: see mdio.txt in the same directory.

    Sigh, I can only repeat that it was a terrible prop name choice -- when
applied to a MAC node, which reset does it mean? The MAC's?

MBR, Sergei

* Re: [PATCH net-next 2/6] bpf: add meta pointer for direct access
From: Andy Gospodarek @ 2017-09-28 19:58 UTC
  To: Waskiewicz Jr, Peter
  Cc: Daniel Borkmann, davem@davemloft.net,
	alexei.starovoitov@gmail.com, john.fastabend@gmail.com,
	jakub.kicinski@netronome.com, netdev@vger.kernel.org,
	mchan@broadcom.com
In-Reply-To: <E0D909EE5BB15A4699798539EA149D7F077E53D6@ORSMSX103.amr.corp.intel.com>

On Thu, Sep 28, 2017 at 1:59 AM, Waskiewicz Jr, Peter
<peter.waskiewicz.jr@intel.com> wrote:
> On 9/26/17 10:21 AM, Andy Gospodarek wrote:
>> On Mon, Sep 25, 2017 at 08:50:28PM +0200, Daniel Borkmann wrote:
>>> On 09/25/2017 08:10 PM, Andy Gospodarek wrote:
>>> [...]
>>>> First, thanks for this detailed description.  It was helpful to read
>>>> along with the patches.
>>>>
>>>> My only concern about this area being generic is that you are now in a
>>>> state where any bpf program must know about all the bpf programs in the
>>>> receive pipeline before it can properly parse what is stored in the
>>>> meta-data and add it to an skb (or perform any other action).
>>>> Especially if each program adds its own meta-data along the way.
>>>>
>>>> Maybe this isn't a big concern based on the number of users of this
>>>> today, but it just starts to seem like a concern as there are these
>>>> hints being passed between layers that are challenging to track due to a
>>>> lack of a standard format for passing data between.
>>>
>>> Btw, we do have similar kind of programmable scratch buffer also today
>>> wrt skb cb[] that you can program from tc side, the perf ring buffer,
>>> which doesn't have any fixed layout for the slots, or a per-cpu map
>>> where you can transfer data between tail calls for example, then tail
>>> calls themselves that need to coordinate, or simply mangling of packets
>>> itself if you will, but more below to your use case ...
>>>
>>>> The main reason I bring this up is that Michael and I had discussed and
>>>> designed a way for drivers to communicate between each other that rx
>>>> resources could be freed after a tx completion on an XDP_REDIRECT
>>>> action.  Much like this code, it involved adding a new element to
>>>> struct xdp_md that could point to the important information.  Now that
>>>> there is a generic way to handle this, it would seem nice to be able to
>>>> leverage it, but I'm not sure how reliable this meta-data area would be
>>>> without the ability to mark it in some manner.
>>>>
>>>> For additional background, the minimum amount of data needed in the case
>>>> Michael and I were discussing was really 2 words.  One to serve as a
>>>> pointer to an rx_ring structure and one to have a counter to the rx
>>>> producer entry.  This data could be accessed by the driver processing the
>>>> tx completions and callback to the driver that received the frame off the wire
>>>> to perform any needed processing.  (For those curious this would also require a
>>>> new callback/netdev op to act on this data stored in the XDP buffer.)
>>>
>>> What you describe above doesn't seem to be fitting to the use-case of
>>> this set, meaning the area here is fully programmable out of the BPF
>>> program, the infrastructure you're describing is some sort of means of
>>> communication between drivers for the XDP_REDIRECT, and should be
>>> outside of the control of the BPF program to mangle.
>>
>> OK, I understand that perspective.  I think saying this is really meant
>> as a BPF<->BPF communication channel for now is fine.
>>
>>> You could probably reuse the base infra here and make a part of that
>>> inaccessible for the program with some sort of a fixed layout, but I
>>> haven't seen your code yet to be able to fully judge. Intention here
>>> is to allow for programmability within the BPF prog in a generic way,
>>> such that based on the use-case it can be populated in specific ways
>>> and propagated to the skb w/o having to define a fixed layout and
>>> bloat xdp_buff all the way to an skb while still retaining all the
>>> flexibility.
>>
>> Some level of reuse might be proper, but I'd rather it be explicit for
>> my use since it's not exclusively something that will need to be used by
>> a BPF prog, but rather the driver.  I'll produce some patches this week
>> for reference.
>
> Sorry for chiming in late, I've been offline.
>
> We're looking to add some functionality from driver to XDP inside this
> xdp_buff->data_meta region.  We want to assign it to an opaque
> structure, that would be specific per driver (think of a flex descriptor
> coming out of the hardware).  We'd like to pass these offloaded
> computations into XDP programs to help accelerate them, such as packet
> type, where headers are located, etc.  It's similar to Jesper's RFC
> patches back in May when passing through the mlx Rx descriptor to XDP.
>
> This is actually what a few of us are planning to present at NetDev 2.2
> in November.  If you're hoping to restrict this headroom in the xdp_buff
> for an exclusive use case with XDP_REDIRECT, then I'd like to discuss
> that further.
>

No sweat, PJ, thanks for replying.  I saw the notes for your accepted
session and I'm looking forward to it.

John's suggestion earlier in the thread was actually similar to the
conclusion I reached when thinking about Daniel's patch a bit more.
(I like John's better though as it doesn't get constrained by UAPI.)
Since redirect actions happen at a point where no other programs will
run on the buffer, that space can be used for this redirect data and
there are no conflicts.

It sounds like the idea behind your proposal includes populating some
data into the buffer before the XDP program is executed so that it can
be used by the program.  Would this data be useful later in the driver
or stack or are you just hoping to accelerate processing of frames in
the BPF program?

If the headroom needed for redirect info was only added after it was
clear the redirect action was needed, would this conflict with the
information you are trying to provide?  I had planned to add this just
after the XDP_REDIRECT action was selected or at the end of the
driver's ndo_xdp_xmit function -- it seems like it would not conflict.

(There's also Jesper's series from today -- I've seen it but have not
had time to fully grok all of those changes.)

Thoughts?
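
To make the "redirect info in the headroom" idea concrete, here is a
minimal userspace sketch. It is not code from any patch in this thread:
struct buf_view, struct redirect_info and push_redirect_info() are
hypothetical names, and the layout simply assumes the two words mentioned
earlier (an rx ring pointer and a producer index) are written just in
front of the current metadata start once the action is known to be
XDP_REDIRECT.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the buffer pointers discussed in the thread. */
struct buf_view {
	uint8_t *hard_start;	/* start of the headroom */
	uint8_t *data_meta;	/* start of metadata; equals data when empty */
	uint8_t *data;		/* start of packet data */
};

/* The two words mentioned earlier: rx ring pointer and producer index. */
struct redirect_info {
	void *rx_ring;
	uintptr_t rx_prod;
};

/*
 * Assumed to be called only after the program has returned XDP_REDIRECT,
 * so no other BPF program will look at this area afterwards.
 */
static int push_redirect_info(struct buf_view *b,
			      const struct redirect_info *info)
{
	size_t room = (size_t)(b->data_meta - b->hard_start);

	if (room < sizeof(*info))
		return -1;	/* not enough headroom left */

	b->data_meta -= sizeof(*info);
	memcpy(b->data_meta, info, sizeof(*info));
	return 0;
}
```

Because the record is pushed in front of whatever metadata already
exists, anything a driver or program stored there earlier is left
untouched, which is the "no conflict" property being asked about.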

* Re: [PATCH net-next RFC 3/9] net: dsa: mv88e6xxx: add support for GPIO configuration
From: Vivien Didelot @ 2017-09-28 19:57 UTC
  To: Andrew Lunn, Florian Fainelli
  Cc: Brandon Streiff, netdev, linux-kernel, David S. Miller,
	Richard Cochran, Erik Hons
In-Reply-To: <20170928180111.GF14940@lunn.ch>

Hi Brandon,

>> Would there be any value in implementing a proper gpiochip structure
>> here such that other pieces of SW can see this GPIO controller as a
>> provider and you can reference it from e.g: Device Tree using GPIO
>> descriptors?
>
> That would be my preference as well, or maybe a pinctrl driver.

Indeed seeing a gpio_chip or a pinctrl controller registered from a
gpio.c or pinctrl.c file in a separate patchset would be great.


Thanks,

        Vivien

* Re: [PATCH/RFC net-next] ravb: RX checksum offload
From: Sergei Shtylyov @ 2017-09-28 19:56 UTC
  To: Simon Horman; +Cc: David Miller, Magnus Damm, netdev, linux-renesas-soc
In-Reply-To: <20170928104918.GA11212@verge.net.au>

Hello!

On 09/28/2017 01:49 PM, Simon Horman wrote:

>>> Add support for RX checksum offload. This is enabled by default and
>>> may be disabled and re-enabled using ethtool:
>>>
>>>   # ethtool -K eth0 rx off
>>>   # ethtool -K eth0 rx on
>>>
>>> The RAVB provides a simple checksumming scheme which appears to be
>>> completely compatible with CHECKSUM_COMPLETE: a 1's complement sum of
>>
>>     Hm, the gen2/3 manuals say calculation doesn't involve bit inversion...
> 
> Yes, I believe that matches my observation of the values supplied by
> the hardware. Empirically this appears to be what the kernel expects.

    Then why do you talk of 1's complement here?
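
One possible reading of the exchange above is that a 1's complement *sum*
(16-bit addition with end-around carry) is a separate step from the final
bit inversion that produces an Internet checksum field. The sketch below
illustrates only that distinction; ones_complement_sum() is a hypothetical
helper, it assumes big-endian 16-bit words, and it makes no claim about
what the RAVB hardware actually computes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the buffer as big-endian 16-bit words using ones' complement
 * addition (carries are folded back in). No final inversion is applied.
 * The 32-bit accumulator is adequate for packet-sized buffers.
 */
static uint16_t ones_complement_sum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;	/* pad odd byte */

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);	/* end-around carry */

	return (uint16_t)sum;
}
```

For the bytes 00 01 f2 03 f4 f5 f6 f7 this returns 0xddf2; inverting that
value (0x220d) would be the extra step that the sum itself does not
perform.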

>>> all packet data after the L2 header is appended to packet data; this may
>>> be trivially read by the driver and used to update the skb accordingly.
>>>
>>> In terms of performance throughput is close to gigabit line-rate both with
>>> and without RX checksum offload enabled. Perf output, however, appears to
>>> indicate that significantly less time is spent in do_csum(). This is as
>>> expected.
>>
>> [...]
>>
>>> By inspection this also appears to be compatible with the ravb found
>>> on R-Car Gen 2 SoCs, however, this patch is currently untested on such
>>> hardware.
>>
>>     I probably won't be able to test it on gen2 too...
>>
>>> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
>>
>>     I'm generally OK with the patch but have some questions/comments below...
> 
> Thanks, I will try to address them.
> 
>>> ---
>>>   drivers/net/ethernet/renesas/ravb_main.c | 58 +++++++++++++++++++++++++++++++-
>>>   1 file changed, 57 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>>> index fdf30bfa403b..7c6438cd7de7 100644
>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>> [...]
>>> @@ -1842,6 +1859,41 @@ static int ravb_do_ioctl(struct net_device *ndev, struct ifreq *req, int cmd)
>>>   	return phy_mii_ioctl(phydev, req, cmd);
>>>   }
>>> +static void ravb_set_rx_csum(struct net_device *ndev, bool enable)
>>> +{
>>> +	struct ravb_private *priv = netdev_priv(ndev);
>>> +	unsigned long flags;
>>> +
>>> +	spin_lock_irqsave(&priv->lock, flags);
>>> +
>>> +	/* Disable TX and RX */
>>> +	ravb_rcv_snd_disable(ndev);
>>> +
>>> +	/* Modify RX Checksum setting */
>>> +	if (enable)
>>> +		ravb_modify(ndev, ECMR, 0, ECMR_RCSC);
>>
>>     Please use ECMR_RCSC as the 3rd argument too, to conform to the common
>> driver style.
>>
>>> +	else
>>> +		ravb_modify(ndev, ECMR, ECMR_RCSC, 0);
>>
>>     This *if* can easily be folded into a single ravb_modify() call...
> 
> Thanks, something like this?
> 
> 	ravb_modify(ndev, ECMR, ECMR_RCSC, enable ? ECMR_RCSC : 0);

    Yes, exactly! :-)
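
As an illustration of why the folded call is equivalent, here is a small
standalone model. It assumes the helper follows the usual
clear-then-set convention suggested by the argument order in the quoted
patch; reg_modify() and DEMO_RCSC are hypothetical stand-ins, and the bit
value is a placeholder rather than the real ECMR_RCSC definition.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RCSC 0x00800000u	/* placeholder bit, not from the datasheet */

/* Model of a read-modify-write helper: clear some bits, then set some. */
static uint32_t reg_modify(uint32_t val, uint32_t clear, uint32_t set)
{
	return (val & ~clear) | set;
}

/* The if/else form, with the mask passed as the clear argument too. */
static uint32_t set_rx_csum_branchy(uint32_t ecmr, bool enable)
{
	if (enable)
		return reg_modify(ecmr, DEMO_RCSC, DEMO_RCSC);
	return reg_modify(ecmr, DEMO_RCSC, 0);
}

/* The folded form: one call, the set argument depends on 'enable'. */
static uint32_t set_rx_csum_folded(uint32_t ecmr, bool enable)
{
	return reg_modify(ecmr, DEMO_RCSC, enable ? DEMO_RCSC : 0);
}
```

Both variants always name the bit in the clear mask, so the result does
not depend on the previous state of that bit.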

>> [...]
>>> @@ -2004,6 +2057,9 @@ static int ravb_probe(struct platform_device *pdev)
>>>   	if (!ndev)
>>>   		return -ENOMEM;
>>> +	ndev->features |= NETIF_F_RXCSUM;
>>> +	ndev->hw_features |= ndev->features;
>>
>>     Hum, both fields are 0 before this? Then why not use '=' instead of '|='?
>> Even if not, why not just use the same value as both the rvalues?
> 
> I don't feel strongly about this, how about?
> 
> 	ndev->features = NETIF_F_RXCSUM;
> 	ndev->hw_features = NETIF_F_RXCSUM;

    Yes, I think it should work...

MBR, Sergei

* Re: [PATCH RFC 3/5] Add KSZ8795 switch driver
From: Andrew Lunn @ 2017-09-28 19:34 UTC
  To: Tristram.Ha
  Cc: muvarov, pavel, nathan.leigh.conrad, vivien.didelot, f.fainelli,
	netdev, linux-kernel, Woojung.Huh
In-Reply-To: <93AF473E2DA327428DE3D46B72B1E9FD41124D5A@CHN-SV-EXMX02.mchp-main.com>

On Mon, Sep 18, 2017 at 08:27:13PM +0000, Tristram.Ha@microchip.com wrote:
> > > +/**
> > > + * Some counters do not need to be read too often because they are less
> > likely
> > > + * to increase much.
> > > + */
> > 
> > What does this comment mean? Are you caching statistics, and updating
> > different values at different rates?
> > 
> 
> There are 34 counters.  In the normal case, using generic bus I/O or PCI to read them
> is very quick, but the switch is mostly accessed using SPI, or even I2C, and SPI
> access is very slow.

How slow is it? The Marvell switches all use MDIO. It is probably a
bit faster than I2C, but it is a lot slower than MMIO or PCI.

ethtool -S lan0 takes about 25ms.

No other driver does caching. So I'm hesitant to add one which does.

>  These accesses can be getting 1588 PTP timestamps and opening/closing ports.

You could drop the mutex between each statistic read, so allowing
something else access to the switch. That should reduce the jitter PTP
experiences.

	Andrew
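
A rough userspace model of the "drop the mutex between each statistic
read" suggestion, using a pthread mutex in place of the kernel mutex.
struct demo_switch, demo_read_counter() and demo_read_all_counters() are
hypothetical; only the count of 34 comes from the thread.

```c
#include <pthread.h>
#include <stdint.h>

#define DEMO_NUM_COUNTERS 34	/* counter count quoted in the thread */

struct demo_switch {
	pthread_mutex_t lock;			/* serialises bus access */
	uint64_t hw[DEMO_NUM_COUNTERS];		/* stands in for registers */
};

/* Stand-in for one slow SPI/I2C register read; caller holds the lock. */
static uint64_t demo_read_counter(const struct demo_switch *sw, int idx)
{
	return sw->hw[idx];
}

/*
 * Take and release the lock around each individual read rather than
 * around the whole loop, so another user of the bus (for example a
 * timestamp read) waits for at most one counter access.
 */
static void demo_read_all_counters(struct demo_switch *sw, uint64_t *out)
{
	int i;

	for (i = 0; i < DEMO_NUM_COUNTERS; i++) {
		pthread_mutex_lock(&sw->lock);
		out[i] = demo_read_counter(sw, i);
		pthread_mutex_unlock(&sw->lock);
	}
}
```

The trade-off is that the snapshot is no longer atomic across all
counters, which is usually acceptable for statistics.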

* Re: [PATCH net-next RFC 5/9] net: dsa: forward hardware timestamping ioctls to switch driver
From: Vivien Didelot @ 2017-09-28 19:31 UTC
  To: Brandon Streiff, netdev
  Cc: linux-kernel, David S. Miller, Florian Fainelli, Andrew Lunn,
	Richard Cochran, Erik Hons, Brandon Streiff
In-Reply-To: <1506612341-18061-6-git-send-email-brandon.streiff@ni.com>

Hi Brandon,

Brandon Streiff <brandon.streiff@ni.com> writes:

>  static int dsa_slave_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
>  {
> +	struct dsa_slave_priv *p = netdev_priv(dev);
> +	struct dsa_switch *ds = p->dp->ds;
> +	int port = p->dp->index;
> +
>  	if (!dev->phydev)
>  		return -ENODEV;

Move this check below:

>  
> -	return phy_mii_ioctl(dev->phydev, ifr, cmd);
> +	switch (cmd) {
> +	case SIOCGMIIPHY:
> +	case SIOCGMIIREG:
> +	case SIOCSMIIREG:
> +		if (dev->phydev)
> +			return phy_mii_ioctl(dev->phydev, ifr, cmd);
> +		else
> +			return -EOPNOTSUPP;

                if (!dev->phydev)
                        return -ENODEV;

                return phy_mii_ioctl(dev->phydev, ifr, cmd);

> +	case SIOCGHWTSTAMP:
> +		if (ds->ops->port_hwtstamp_get)
> +			return ds->ops->port_hwtstamp_get(ds, port, ifr);
> +		else
> +			return -EOPNOTSUPP;

Here you can replace the else statement with break;

> +	case SIOCSHWTSTAMP:
> +		if (ds->ops->port_hwtstamp_set)
> +			return ds->ops->port_hwtstamp_set(ds, port, ifr);
> +		else
> +			return -EOPNOTSUPP;

Same here;

> +	default:
> +		return -EOPNOTSUPP;
> +	}

Then drop the default case and return -EOPNOTSUPP after the switch.

>  }
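
Putting the review comments above together, the control flow would look
roughly like the standalone model below. This is a sketch, not the final
patch: the DEMO_* command values, struct demo_switch_ops, struct
demo_port and demo_phy_ioctl() are hypothetical stand-ins for the kernel
types, and the callbacks take only a port number for brevity.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers standing in for the SIOC* ioctls. */
enum {
	DEMO_GMIIPHY = 1,
	DEMO_GMIIREG,
	DEMO_SMIIREG,
	DEMO_GHWTSTAMP,
	DEMO_SHWTSTAMP,
};

struct demo_switch_ops {
	int (*port_hwtstamp_get)(int port);	/* optional */
	int (*port_hwtstamp_set)(int port);	/* optional */
};

struct demo_port {
	const struct demo_switch_ops *ops;
	int index;
	int has_phy;	/* models dev->phydev != NULL */
};

/* Stand-in for phy_mii_ioctl(). */
static int demo_phy_ioctl(int cmd)
{
	(void)cmd;
	return 0;
}

static int demo_port_ioctl(const struct demo_port *p, int cmd)
{
	switch (cmd) {
	case DEMO_GMIIPHY:
	case DEMO_GMIIREG:
	case DEMO_SMIIREG:
		/* The phydev check now lives only in the MII cases. */
		if (!p->has_phy)
			return -ENODEV;

		return demo_phy_ioctl(cmd);
	case DEMO_GHWTSTAMP:
		if (p->ops->port_hwtstamp_get)
			return p->ops->port_hwtstamp_get(p->index);
		break;
	case DEMO_SHWTSTAMP:
		if (p->ops->port_hwtstamp_set)
			return p->ops->port_hwtstamp_set(p->index);
		break;
	}

	/* Single fall-through exit replaces the else branches and default. */
	return -EOPNOTSUPP;
}
```

With this shape a port without a PHY can still service the timestamping
ioctls, and every unsupported path shares one return statement.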


Thanks,

        Vivien

* Re: [RFC PATCH v3 7/7] i40e: Enable cloud filters via tc-flower
From: Nambiar, Amritha @ 2017-09-28 19:22 UTC
  To: Jiri Pirko
  Cc: intel-wired-lan, jeffrey.t.kirsher, alexander.h.duyck, netdev,
	mlxsw, alexander.duyck@gmail.com, Jamal Hadi Salim, Cong Wang
In-Reply-To: <82e0a065-c7d6-8fe6-aedc-154dd0dd88d6@intel.com>

On 9/14/2017 1:00 AM, Nambiar, Amritha wrote:
> On 9/13/2017 6:26 AM, Jiri Pirko wrote:
>> Wed, Sep 13, 2017 at 11:59:50AM CEST, amritha.nambiar@intel.com wrote:
>>> This patch enables tc-flower based hardware offloads. tc flower
>>> filter provided by the kernel is configured as driver specific
>>> cloud filter. The patch implements functions and admin queue
>>> commands needed to support cloud filters in the driver and
>>> adds cloud filters to configure these tc-flower filters.
>>>
>>> The only action supported is to redirect packets to a traffic class
>>> on the same device.
>>
>> So basically you are not doing redirect, you are just setting tclass for
>> matched packets, right? Why do you use mirred for this? I think that
>> you might consider extending g_act for that:
>>
>> # tc filter add dev eth0 protocol ip ingress \
>>   prio 1 flower dst_mac 3c:fd:fe:a0:d6:70 skip_sw \
>>   action tclass 0
>>
> Yes, this doesn't work like a typical egress redirect, but is aimed at
> forwarding the matched packets to a different queue-group/traffic class
> on the same device, so it is a sort of ingress redirect in the hardware. I
> may not need the mirred-redirect, as you say; I'll look into the
> g_act way of doing this with a new gact tc action.
> 

I was looking at introducing a new gact tclass action to TC. In the HW
offload path, this sets a traffic class value for certain matched
packets so they will be processed in a queue belonging to the traffic class.

# tc filter add dev eth0 protocol ip parent ffff:\
  prio 2 flower dst_ip 192.168.3.5/32\
  ip_proto udp dst_port 25 skip_sw\
  action tclass 2

But, I'm having trouble defining what this action means in the kernel
datapath. For ingress, this action could just take the default path and
do nothing and only have meaning in the HW offloaded path. For egress,
certain qdiscs like 'multiq' and 'prio' could use this 'tclass' value
for band selection, while the 'mqprio' qdisc selects the traffic class
based on the skb priority in netdev_pick_tx(), so what would this action
mean for the 'mqprio' qdisc?

It looks like the 'prio' qdisc uses band selection based on the
'classid', so I was thinking of using the 'classid' through the cls
flower filter and offload it to HW for the traffic class index, this way
we would have the same behavior in HW offload and SW fallback and there
would be no need for a separate tc action.

In HW:
# tc filter add dev eth0 protocol ip parent ffff:\
  prio 2 flower dst_ip 192.168.3.5/32\
  ip_proto udp dst_port 25 skip_sw classid 1:2

filter pref 2 flower chain 0
filter pref 2 flower chain 0 handle 0x1 classid 1:2
  eth_type ipv4
  ip_proto udp
  dst_ip 192.168.3.5
  dst_port 25
  skip_sw
  in_hw

This will be used to route packets to traffic class 2.
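
For illustration only, this is how a driver might derive the traffic
class index from such a classid, assuming the usual 16-bit major / 16-bit
minor split of a tc handle and assuming the minor number is used directly
as the traffic class (as in the "classid 1:2 -> traffic class 2" example
above). demo_make_classid() and demo_classid_to_tc() are hypothetical
helpers, not existing kernel functions.

```c
#include <stdint.h>

/* Build a 32-bit tc handle from its major and minor halves. */
static uint32_t demo_make_classid(uint16_t major, uint16_t minor)
{
	return ((uint32_t)major << 16) | minor;
}

/* Assumption: the minor half of the classid is the traffic class index. */
static unsigned int demo_classid_to_tc(uint32_t classid)
{
	return classid & 0xffffu;
}
```

Whether the minor number should map one-to-one onto a traffic class, or
be offset as some qdiscs do for their bands, is exactly the kind of
detail that would need to be agreed on for HW and SW behaviour to match.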

In SW:
# tc filter add dev eth0 protocol ip parent ffff:\
  prio 2 flower dst_ip 192.168.3.5/32\
  ip_proto udp dst_port 25 skip_hw classid 1:2

filter pref 2 flower chain 0
filter pref 2 flower chain 0 handle 0x1 classid 1:2
  eth_type ipv4
  ip_proto udp
  dst_ip 192.168.3.5
  dst_port 25
  skip_hw
  not_in_hw

>>
>>>
>>> # tc qdisc add dev eth0 ingress
>>> # ethtool -K eth0 hw-tc-offload on
>>>
>>> # tc filter add dev eth0 protocol ip parent ffff:\
>>>  prio 1 flower dst_mac 3c:fd:fe:a0:d6:70 skip_sw\
>>>  action mirred ingress redirect dev eth0 tclass 0
>>>
>>> # tc filter add dev eth0 protocol ip parent ffff:\
>>>  prio 2 flower dst_ip 192.168.3.5/32\
>>>  ip_proto udp dst_port 25 skip_sw\
>>>  action mirred ingress redirect dev eth0 tclass 1
>>>
>>> # tc filter add dev eth0 protocol ipv6 parent ffff:\
>>>  prio 3 flower dst_ip fe8::200:1\
>>>  ip_proto udp dst_port 66 skip_sw\
>>>  action mirred ingress redirect dev eth0 tclass 1
>>>
>>> Delete tc flower filter:
>>> Example:
>>>
>>> # tc filter del dev eth0 parent ffff: prio 3 handle 0x1 flower
>>> # tc filter del dev eth0 parent ffff:
>>>
>>> Flow Director Sideband is disabled while configuring cloud filters
>>> via tc-flower and stays disabled for as long as any cloud filter exists.
>>>
>>> Unsupported matches when cloud filters are added using enhanced
>>> big buffer cloud filter mode of underlying switch include:
>>> 1. source port and source IP
>>> 2. Combined MAC address and IP fields.
>>> 3. Not specifying L4 port
>>>
>>> These filter matches can however be used to redirect traffic to
>>> the main VSI (tc 0) which does not require the enhanced big buffer
>>> cloud filter support.
>>>
>>> v3: Cleaned up some lengthy function names. Changed ipv6 address to
>>> __be32 array instead of u8 array. Used macro for IP version. Minor
>>> formatting changes.
>>> v2:
>>> 1. Moved I40E_SWITCH_MODE_MASK definition to i40e_type.h
>>> 2. Moved dev_info for add/deleting cloud filters in else condition
>>> 3. Fixed some format specifier in dev_err logs
>>> 4. Refactored i40e_get_capabilities to take an additional
>>>   list_type parameter and use it to query device and function
>>>   level capabilities.
>>> 5. Fixed parsing tc redirect action to check for the is_tcf_mirred_tc()
>>>   to verify if redirect to a traffic class is supported.
>>> 6. Added comments for Geneve fix in cloud filter big buffer AQ
>>>   function definitions.
>>> 7. Cleaned up setup_tc interface to rebase and work with Jiri's
>>>   updates, separate function to process tc cls flower offloads.
>>> 8. Changes to make Flow Director Sideband and Cloud filters mutually
>>>   exclusive.
>>>
>>> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
>>> Signed-off-by: Kiran Patil <kiran.patil@intel.com>
>>> Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
>>> Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
>>> ---
>>> drivers/net/ethernet/intel/i40e/i40e.h             |   49 +
>>> drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h  |    3 
>>> drivers/net/ethernet/intel/i40e/i40e_common.c      |  189 ++++
>>> drivers/net/ethernet/intel/i40e/i40e_main.c        |  971 +++++++++++++++++++-
>>> drivers/net/ethernet/intel/i40e/i40e_prototype.h   |   16 
>>> drivers/net/ethernet/intel/i40e/i40e_type.h        |    1 
>>> .../net/ethernet/intel/i40evf/i40e_adminq_cmd.h    |    3 
>>> 7 files changed, 1202 insertions(+), 30 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
>>> index 6018fb6..b110519 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e.h
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e.h
>>> @@ -55,6 +55,8 @@
>>> #include <linux/net_tstamp.h>
>>> #include <linux/ptp_clock_kernel.h>
>>> #include <net/pkt_cls.h>
>>> +#include <net/tc_act/tc_gact.h>
>>> +#include <net/tc_act/tc_mirred.h>
>>> #include "i40e_type.h"
>>> #include "i40e_prototype.h"
>>> #include "i40e_client.h"
>>> @@ -252,9 +254,52 @@ struct i40e_fdir_filter {
>>> 	u32 fd_id;
>>> };
>>>
>>> +#define IPV4_VERSION 4
>>> +#define IPV6_VERSION 6
>>> +
>>> +#define I40E_CLOUD_FIELD_OMAC	0x01
>>> +#define I40E_CLOUD_FIELD_IMAC	0x02
>>> +#define I40E_CLOUD_FIELD_IVLAN	0x04
>>> +#define I40E_CLOUD_FIELD_TEN_ID	0x08
>>> +#define I40E_CLOUD_FIELD_IIP	0x10
>>> +
>>> +#define I40E_CLOUD_FILTER_FLAGS_OMAC	I40E_CLOUD_FIELD_OMAC
>>> +#define I40E_CLOUD_FILTER_FLAGS_IMAC	I40E_CLOUD_FIELD_IMAC
>>> +#define I40E_CLOUD_FILTER_FLAGS_IMAC_IVLAN	(I40E_CLOUD_FIELD_IMAC | \
>>> +						 I40E_CLOUD_FIELD_IVLAN)
>>> +#define I40E_CLOUD_FILTER_FLAGS_IMAC_TEN_ID	(I40E_CLOUD_FIELD_IMAC | \
>>> +						 I40E_CLOUD_FIELD_TEN_ID)
>>> +#define I40E_CLOUD_FILTER_FLAGS_OMAC_TEN_ID_IMAC (I40E_CLOUD_FIELD_OMAC | \
>>> +						  I40E_CLOUD_FIELD_IMAC | \
>>> +						  I40E_CLOUD_FIELD_TEN_ID)
>>> +#define I40E_CLOUD_FILTER_FLAGS_IMAC_IVLAN_TEN_ID (I40E_CLOUD_FIELD_IMAC | \
>>> +						   I40E_CLOUD_FIELD_IVLAN | \
>>> +						   I40E_CLOUD_FIELD_TEN_ID)
>>> +#define I40E_CLOUD_FILTER_FLAGS_IIP	I40E_CLOUD_FIELD_IIP
>>> +
>>> struct i40e_cloud_filter {
>>> 	struct hlist_node cloud_node;
>>> 	unsigned long cookie;
>>> +	/* cloud filter input set follows */
>>> +	u8 dst_mac[ETH_ALEN];
>>> +	u8 src_mac[ETH_ALEN];
>>> +	__be16 vlan_id;
>>> +	__be32 dst_ip;
>>> +	__be32 src_ip;
>>> +	__be32 dst_ipv6[4];
>>> +	__be32 src_ipv6[4];
>>> +	__be16 dst_port;
>>> +	__be16 src_port;
>>> +	u32 ip_version;
>>> +	u8 ip_proto;	/* IPPROTO value */
>>> +	/* L4 port type: src or destination port */
>>> +#define I40E_CLOUD_FILTER_PORT_SRC	0x01
>>> +#define I40E_CLOUD_FILTER_PORT_DEST	0x02
>>> +	u8 port_type;
>>> +	u32 tenant_id;
>>> +	u8 flags;
>>> +#define I40E_CLOUD_TNL_TYPE_NONE	0xff
>>> +	u8 tunnel_type;
>>> 	u16 seid;	/* filter control */
>>> };
>>>
>>> @@ -491,6 +536,8 @@ struct i40e_pf {
>>> #define I40E_FLAG_LINK_DOWN_ON_CLOSE_ENABLED	BIT(27)
>>> #define I40E_FLAG_SOURCE_PRUNING_DISABLED	BIT(28)
>>> #define I40E_FLAG_TC_MQPRIO			BIT(29)
>>> +#define I40E_FLAG_FD_SB_INACTIVE		BIT(30)
>>> +#define I40E_FLAG_FD_SB_TO_CLOUD_FILTER		BIT(31)
>>>
>>> 	struct i40e_client_instance *cinst;
>>> 	bool stat_offsets_loaded;
>>> @@ -573,6 +620,8 @@ struct i40e_pf {
>>> 	u16 phy_led_val;
>>>
>>> 	u16 override_q_count;
>>> +	u16 last_sw_conf_flags;
>>> +	u16 last_sw_conf_valid_flags;
>>> };
>>>
>>> /**
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
>>> index 2e567c2..feb3d42 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
>>> @@ -1392,6 +1392,9 @@ struct i40e_aqc_cloud_filters_element_data {
>>> 		struct {
>>> 			u8 data[16];
>>> 		} v6;
>>> +		struct {
>>> +			__le16 data[8];
>>> +		} raw_v6;
>>> 	} ipaddr;
>>> 	__le16	flags;
>>> #define I40E_AQC_ADD_CLOUD_FILTER_SHIFT			0
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
>>> index 9567702..d9c9665 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_common.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
>>> @@ -5434,5 +5434,194 @@ i40e_add_pinfo_to_list(struct i40e_hw *hw,
>>>
>>> 	status = i40e_aq_write_ppp(hw, (void *)sec, sec->data_end,
>>> 				   track_id, &offset, &info, NULL);
>>> +
>>> +	return status;
>>> +}
>>> +
>>> +/**
>>> + * i40e_aq_add_cloud_filters
>>> + * @hw: pointer to the hardware structure
>>> + * @seid: VSI seid to add cloud filters from
>>> + * @filters: Buffer which contains the filters to be added
>>> + * @filter_count: number of filters contained in the buffer
>>> + *
>>> + * Set the cloud filters for a given VSI.  The contents of the
>>> + * i40e_aqc_cloud_filters_element_data are filled in by the caller
>>> + * of the function.
>>> + *
>>> + **/
>>> +enum i40e_status_code
>>> +i40e_aq_add_cloud_filters(struct i40e_hw *hw, u16 seid,
>>> +			  struct i40e_aqc_cloud_filters_element_data *filters,
>>> +			  u8 filter_count)
>>> +{
>>> +	struct i40e_aq_desc desc;
>>> +	struct i40e_aqc_add_remove_cloud_filters *cmd =
>>> +	(struct i40e_aqc_add_remove_cloud_filters *)&desc.params.raw;
>>> +	enum i40e_status_code status;
>>> +	u16 buff_len;
>>> +
>>> +	i40e_fill_default_direct_cmd_desc(&desc,
>>> +					  i40e_aqc_opc_add_cloud_filters);
>>> +
>>> +	buff_len = filter_count * sizeof(*filters);
>>> +	desc.datalen = cpu_to_le16(buff_len);
>>> +	desc.flags |= cpu_to_le16((u16)(I40E_AQ_FLAG_BUF | I40E_AQ_FLAG_RD));
>>> +	cmd->num_filters = filter_count;
>>> +	cmd->seid = cpu_to_le16(seid);
>>> +
>>> +	status = i40e_asq_send_command(hw, &desc, filters, buff_len, NULL);
>>> +
>>> +	return status;
>>> +}
>>> +
>>> +/**
>>> + * i40e_aq_add_cloud_filters_bb
>>> + * @hw: pointer to the hardware structure
>>> + * @seid: VSI seid to add cloud filters from
>>> + * @filters: Buffer which contains the filters in big buffer to be added
>>> + * @filter_count: number of filters contained in the buffer
>>> + *
>>> + * Set the big buffer cloud filters for a given VSI.  The contents of the
>>> + * i40e_aqc_cloud_filters_element_bb are filled in by the caller of the
>>> + * function.
>>> + *
>>> + **/
>>> +i40e_status
>>> +i40e_aq_add_cloud_filters_bb(struct i40e_hw *hw, u16 seid,
>>> +			     struct i40e_aqc_cloud_filters_element_bb *filters,
>>> +			     u8 filter_count)
>>> +{
>>> +	struct i40e_aq_desc desc;
>>> +	struct i40e_aqc_add_remove_cloud_filters *cmd =
>>> +	(struct i40e_aqc_add_remove_cloud_filters *)&desc.params.raw;
>>> +	i40e_status status;
>>> +	u16 buff_len;
>>> +	int i;
>>> +
>>> +	i40e_fill_default_direct_cmd_desc(&desc,
>>> +					  i40e_aqc_opc_add_cloud_filters);
>>> +
>>> +	buff_len = filter_count * sizeof(*filters);
>>> +	desc.datalen = cpu_to_le16(buff_len);
>>> +	desc.flags |= cpu_to_le16((u16)(I40E_AQ_FLAG_BUF | I40E_AQ_FLAG_RD));
>>> +	cmd->num_filters = filter_count;
>>> +	cmd->seid = cpu_to_le16(seid);
>>> +	cmd->big_buffer_flag = I40E_AQC_ADD_CLOUD_CMD_BB;
>>> +
>>> +	for (i = 0; i < filter_count; i++) {
>>> +		u16 tnl_type;
>>> +		u32 ti;
>>> +
>>> +		tnl_type = (le16_to_cpu(filters[i].element.flags) &
>>> +			   I40E_AQC_ADD_CLOUD_TNL_TYPE_MASK) >>
>>> +			   I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT;
>>> +
>>> +		/* For Geneve, the VNI should be placed in offset shifted by a
>>> +		 * byte than the offset for the Tenant ID for rest of the
>>> +		 * tunnels.
>>> +		 */
>>> +		if (tnl_type == I40E_AQC_ADD_CLOUD_TNL_TYPE_GENEVE) {
>>> +			ti = le32_to_cpu(filters[i].element.tenant_id);
>>> +			filters[i].element.tenant_id = cpu_to_le32(ti << 8);
>>> +		}
>>> +	}
>>> +
>>> +	status = i40e_asq_send_command(hw, &desc, filters, buff_len, NULL);
>>> +
>>> +	return status;
>>> +}
>>> +
>>> +/**
>>> + * i40e_aq_rem_cloud_filters
>>> + * @hw: pointer to the hardware structure
>>> + * @seid: VSI seid to remove cloud filters from
>>> + * @filters: Buffer which contains the filters to be removed
>>> + * @filter_count: number of filters contained in the buffer
>>> + *
>>> + * Remove the cloud filters for a given VSI.  The contents of the
>>> + * i40e_aqc_cloud_filters_element_data are filled in by the caller
>>> + * of the function.
>>> + *
>>> + **/
>>> +enum i40e_status_code
>>> +i40e_aq_rem_cloud_filters(struct i40e_hw *hw, u16 seid,
>>> +			  struct i40e_aqc_cloud_filters_element_data *filters,
>>> +			  u8 filter_count)
>>> +{
>>> +	struct i40e_aq_desc desc;
>>> +	struct i40e_aqc_add_remove_cloud_filters *cmd =
>>> +	(struct i40e_aqc_add_remove_cloud_filters *)&desc.params.raw;
>>> +	enum i40e_status_code status;
>>> +	u16 buff_len;
>>> +
>>> +	i40e_fill_default_direct_cmd_desc(&desc,
>>> +					  i40e_aqc_opc_remove_cloud_filters);
>>> +
>>> +	buff_len = filter_count * sizeof(*filters);
>>> +	desc.datalen = cpu_to_le16(buff_len);
>>> +	desc.flags |= cpu_to_le16((u16)(I40E_AQ_FLAG_BUF | I40E_AQ_FLAG_RD));
>>> +	cmd->num_filters = filter_count;
>>> +	cmd->seid = cpu_to_le16(seid);
>>> +
>>> +	status = i40e_asq_send_command(hw, &desc, filters, buff_len, NULL);
>>> +
>>> +	return status;
>>> +}
>>> +
>>> +/**
>>> + * i40e_aq_rem_cloud_filters_bb
>>> + * @hw: pointer to the hardware structure
>>> + * @seid: VSI seid to remove cloud filters from
>>> + * @filters: Buffer which contains the filters in big buffer to be removed
>>> + * @filter_count: number of filters contained in the buffer
>>> + *
>>> + * Remove the big buffer cloud filters for a given VSI.  The contents of the
>>> + * i40e_aqc_cloud_filters_element_bb are filled in by the caller of the
>>> + * function.
>>> + *
>>> + **/
>>> +i40e_status
>>> +i40e_aq_rem_cloud_filters_bb(struct i40e_hw *hw, u16 seid,
>>> +			     struct i40e_aqc_cloud_filters_element_bb *filters,
>>> +			     u8 filter_count)
>>> +{
>>> +	struct i40e_aq_desc desc;
>>> +	struct i40e_aqc_add_remove_cloud_filters *cmd =
>>> +	(struct i40e_aqc_add_remove_cloud_filters *)&desc.params.raw;
>>> +	i40e_status status;
>>> +	u16 buff_len;
>>> +	int i;
>>> +
>>> +	i40e_fill_default_direct_cmd_desc(&desc,
>>> +					  i40e_aqc_opc_remove_cloud_filters);
>>> +
>>> +	buff_len = filter_count * sizeof(*filters);
>>> +	desc.datalen = cpu_to_le16(buff_len);
>>> +	desc.flags |= cpu_to_le16((u16)(I40E_AQ_FLAG_BUF | I40E_AQ_FLAG_RD));
>>> +	cmd->num_filters = filter_count;
>>> +	cmd->seid = cpu_to_le16(seid);
>>> +	cmd->big_buffer_flag = I40E_AQC_ADD_CLOUD_CMD_BB;
>>> +
>>> +	for (i = 0; i < filter_count; i++) {
>>> +		u16 tnl_type;
>>> +		u32 ti;
>>> +
>>> +		tnl_type = (le16_to_cpu(filters[i].element.flags) &
>>> +			   I40E_AQC_ADD_CLOUD_TNL_TYPE_MASK) >>
>>> +			   I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT;
>>> +
>>> +		/* For Geneve, the VNI must be placed one byte higher than
>>> +		 * the offset used for the Tenant ID of the other tunnel
>>> +		 * types.
>>> +		 */
>>> +		if (tnl_type == I40E_AQC_ADD_CLOUD_TNL_TYPE_GENEVE) {
>>> +			ti = le32_to_cpu(filters[i].element.tenant_id);
>>> +			filters[i].element.tenant_id = cpu_to_le32(ti << 8);
>>> +		}
>>> +	}
>>> +
>>> +	status = i40e_asq_send_command(hw, &desc, filters, buff_len, NULL);
>>> +
>>> 	return status;
>>> }
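
As I read the Geneve comment above, the removal path moves the tenant ID up by one byte before the buffer goes to firmware. A minimal standalone sketch of that placement rule follows. The macro values and the name place_tenant_id() are stand-ins for illustration, not the driver's definitions, and the assert encodes my assumption that a Geneve VNI fits in 24 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins, NOT the driver's real definitions. */
#define TNL_TYPE_SHIFT  9
#define TNL_TYPE_MASK   (0xFu << TNL_TYPE_SHIFT)
#define TNL_TYPE_GENEVE 2u

/* Return the tenant-id word as it would be placed for the given flags. */
static uint32_t place_tenant_id(uint16_t flags, uint32_t tenant_id)
{
	uint16_t tnl_type = (uint16_t)((flags & TNL_TYPE_MASK) >> TNL_TYPE_SHIFT);

	if (tnl_type == TNL_TYPE_GENEVE) {
		/* Assumption: a Geneve VNI is 24 bits wide. */
		assert((tenant_id >> 24) == 0);
		return tenant_id << 8;	/* VNI sits one byte higher */
	}
	return tenant_id;		/* other tunnel types: unchanged */
}
```

The quoted loop rewrites filters[i].element.tenant_id in place, so a caller that reused the same buffer for a second call would see the shift applied twice. The sketch returns a new value instead, which makes that behaviour easier to see.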
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> index afcf08a..96ee608 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
>>> @@ -69,6 +69,15 @@ static int i40e_reset(struct i40e_pf *pf);
>>> static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired);
>>> static void i40e_fdir_sb_setup(struct i40e_pf *pf);
>>> static int i40e_veb_get_bw_info(struct i40e_veb *veb);
>>> +static int i40e_add_del_cloud_filter(struct i40e_vsi *vsi,
>>> +				     struct i40e_cloud_filter *filter,
>>> +				     bool add);
>>> +static int i40e_add_del_cloud_filter_big_buf(struct i40e_vsi *vsi,
>>> +					     struct i40e_cloud_filter *filter,
>>> +					     bool add);
>>> +static int i40e_get_capabilities(struct i40e_pf *pf,
>>> +				 enum i40e_admin_queue_opc list_type);
>>> +
>>>
>>> /* i40e_pci_tbl - PCI Device ID Table
>>>  *
>>> @@ -5478,7 +5487,11 @@ int i40e_set_bw_limit(struct i40e_vsi *vsi, u16 seid, u64 max_tx_rate)
>>>  **/
>>> static void i40e_remove_queue_channels(struct i40e_vsi *vsi)
>>> {
>>> +	enum i40e_admin_queue_err last_aq_status;
>>> +	struct i40e_cloud_filter *cfilter;
>>> 	struct i40e_channel *ch, *ch_tmp;
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	struct hlist_node *node;
>>> 	int ret, i;
>>>
>>> 	/* Reset rss size that was stored when reconfiguring rss for
>>> @@ -5519,6 +5532,29 @@ static void i40e_remove_queue_channels(struct i40e_vsi *vsi)
>>> 				 "Failed to reset tx rate for ch->seid %u\n",
>>> 				 ch->seid);
>>>
>>> +		/* delete cloud filters associated with this channel */
>>> +		hlist_for_each_entry_safe(cfilter, node,
>>> +					  &pf->cloud_filter_list, cloud_node) {
>>> +			if (cfilter->seid != ch->seid)
>>> +				continue;
>>> +
>>> +			hash_del(&cfilter->cloud_node);
>>> +			if (cfilter->dst_port)
>>> +				ret = i40e_add_del_cloud_filter_big_buf(vsi,
>>> +									cfilter,
>>> +									false);
>>> +			else
>>> +				ret = i40e_add_del_cloud_filter(vsi, cfilter,
>>> +								false);
>>> +			last_aq_status = pf->hw.aq.asq_last_status;
>>> +			if (ret)
>>> +				dev_info(&pf->pdev->dev,
>>> +					 "Failed to delete cloud filter, err %s aq_err %s\n",
>>> +					 i40e_stat_str(&pf->hw, ret),
>>> +					 i40e_aq_str(&pf->hw, last_aq_status));
>>> +			kfree(cfilter);
>>> +		}
>>> +
>>> 		/* delete VSI from FW */
>>> 		ret = i40e_aq_delete_element(&vsi->back->hw, ch->seid,
>>> 					     NULL);
>>> @@ -5970,6 +6006,74 @@ static bool i40e_setup_channel(struct i40e_pf *pf, struct i40e_vsi *vsi,
>>> }
>>>
>>> /**
>>> + * i40e_validate_and_set_switch_mode - sets up switch mode correctly
>>> + * @vsi: ptr to VSI which has PF backing
>>> + * @l4type: true for TCP and false for UDP
>>> + * @port_type: true if port is destination and false if port is source
>>> + *
>>> + * Sets up the switch mode if it needs to be changed, after checking that the
>>> + * current mode is one of the allowed modes.
>>> + **/
>>> +static int i40e_validate_and_set_switch_mode(struct i40e_vsi *vsi, bool l4type,
>>> +					     bool port_type)
>>> +{
>>> +	u8 mode;
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	struct i40e_hw *hw = &pf->hw;
>>> +	int ret;
>>> +
>>> +	ret = i40e_get_capabilities(pf, i40e_aqc_opc_list_dev_capabilities);
>>> +	if (ret)
>>> +		return -EINVAL;
>>> +
>>> +	if (hw->dev_caps.switch_mode) {
>>> +		/* if switch mode is set, support mode2 (non-tunneled for
>>> +		 * cloud filter) for now
>>> +		 */
>>> +		u32 switch_mode = hw->dev_caps.switch_mode &
>>> +							I40E_SWITCH_MODE_MASK;
>>> +		if (switch_mode >= I40E_NVM_IMAGE_TYPE_MODE1) {
>>> +			if (switch_mode == I40E_NVM_IMAGE_TYPE_MODE2)
>>> +				return 0;
>>> +			dev_err(&pf->pdev->dev,
>>> +				"Invalid switch_mode (%d), only non-tunneled mode for cloud filter is supported\n",
>>> +				hw->dev_caps.switch_mode);
>>> +			return -EINVAL;
>>> +		}
>>> +	}
>>> +
>>> +	/* port_type: true for destination port and false for source port
>>> +	 * For now, supports only destination port type
>>> +	 */
>>> +	if (!port_type) {
>>> +		dev_err(&pf->pdev->dev, "src port type not supported\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	/* Set Bit 7 to be valid */
>>> +	mode = I40E_AQ_SET_SWITCH_BIT7_VALID;
>>> +
>>> +	/* Set L4type to both TCP and UDP support */
>>> +	mode |= I40E_AQ_SET_SWITCH_L4_TYPE_BOTH;
>>> +
>>> +	/* Set cloud filter mode */
>>> +	mode |= I40E_AQ_SET_SWITCH_MODE_NON_TUNNEL;
>>> +
>>> +	/* Prep mode field for set_switch_config */
>>> +	ret = i40e_aq_set_switch_config(hw, pf->last_sw_conf_flags,
>>> +					pf->last_sw_conf_valid_flags,
>>> +					mode, NULL);
>>> +	if (ret && hw->aq.asq_last_status != I40E_AQ_RC_ESRCH)
>>> +		dev_err(&pf->pdev->dev,
>>> +			"couldn't set switch config bits, err %s aq_err %s\n",
>>> +			i40e_stat_str(hw, ret),
>>> +			i40e_aq_str(hw,
>>> +				    hw->aq.asq_last_status));
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +/**
>>>  * i40e_create_queue_channel - function to create channel
>>>  * @vsi: VSI to be configured
>>>  * @ch: ptr to channel (it contains channel specific params)
>>> @@ -6735,13 +6839,726 @@ static int i40e_setup_tc(struct net_device *netdev, void *type_data)
>>> 	return ret;
>>> }
>>>
>>> +/**
>>> + * i40e_set_cld_element - sets cloud filter element data
>>> + * @filter: cloud filter rule
>>> + * @cld: ptr to cloud filter element data
>>> + *
>>> + * This is a helper function that copies data into the cloud filter element
>>> + **/
>>> +static inline void
>>> +i40e_set_cld_element(struct i40e_cloud_filter *filter,
>>> +		     struct i40e_aqc_cloud_filters_element_data *cld)
>>> +{
>>> +	int i, j;
>>> +	u32 ipa;
>>> +
>>> +	memset(cld, 0, sizeof(*cld));
>>> +	ether_addr_copy(cld->outer_mac, filter->dst_mac);
>>> +	ether_addr_copy(cld->inner_mac, filter->src_mac);
>>> +
>>> +	if (filter->ip_version == IPV6_VERSION) {
>>> +#define IPV6_MAX_INDEX	(ARRAY_SIZE(filter->dst_ipv6) - 1)
>>> +		for (i = 0, j = 0; i < 4; i++, j += 2) {
>>> +			ipa = be32_to_cpu(filter->dst_ipv6[IPV6_MAX_INDEX - i]);
>>> +			ipa = cpu_to_le32(ipa);
>>> +			memcpy(&cld->ipaddr.raw_v6.data[j], &ipa, 4);
>>> +		}
>>> +	} else {
>>> +		ipa = be32_to_cpu(filter->dst_ip);
>>> +		memcpy(&cld->ipaddr.v4.data, &ipa, 4);
>>> +	}
>>> +
>>> +	cld->inner_vlan = cpu_to_le16(ntohs(filter->vlan_id));
>>> +
>>> +	/* tenant_id is not supported by FW now, once the support is enabled
>>> +	 * fill the cld->tenant_id with cpu_to_le32(filter->tenant_id)
>>> +	 */
>>> +	if (filter->tenant_id)
>>> +		return;
>>> +}
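
The IPv6 branch above walks the four big-endian 32-bit words from last to first and stores each one little-endian, which amounts to reversing all 16 bytes of the address. The sketch below shows that with plain byte arrays. load_be32(), store_le32() and ipv6_to_fw_layout() are hypothetical helper names, and the idea that firmware expects exactly this layout is taken from the patch, not something I have verified.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from four bytes. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write a 32-bit value as four little-endian bytes. */
static void store_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)v;
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: network-order IPv6 address -> reversed layout. */
static void ipv6_to_fw_layout(const uint8_t in[16], uint8_t out[16])
{
	int i;

	assert(in != out);	/* in-place conversion would clobber input */

	for (i = 0; i < 4; i++) {
		/* last word first, each word stored little-endian */
		uint32_t word = load_be32(&in[(3 - i) * 4]);

		store_le32(&out[i * 4], word);
	}
}
```

Working on bytes avoids reusing one u32 for both the CPU-order and the little-endian value, as the quoted loop does with ipa.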
>>> +
>>> +/**
>>> + * i40e_add_del_cloud_filter - Add/del cloud filter
>>> + * @vsi: pointer to VSI
>>> + * @filter: cloud filter rule
>>> + * @add: if true, add, if false, delete
>>> + *
>>> + * Add or delete a cloud filter for a specific flow spec.
>>> + * Returns 0 if the filter was successfully added or deleted.
>>> + **/
>>> +static int i40e_add_del_cloud_filter(struct i40e_vsi *vsi,
>>> +				     struct i40e_cloud_filter *filter, bool add)
>>> +{
>>> +	struct i40e_aqc_cloud_filters_element_data cld_filter;
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	int ret;
>>> +	static const u16 flag_table[128] = {
>>> +		[I40E_CLOUD_FILTER_FLAGS_OMAC]  =
>>> +			I40E_AQC_ADD_CLOUD_FILTER_OMAC,
>>> +		[I40E_CLOUD_FILTER_FLAGS_IMAC]  =
>>> +			I40E_AQC_ADD_CLOUD_FILTER_IMAC,
>>> +		[I40E_CLOUD_FILTER_FLAGS_IMAC_IVLAN]  =
>>> +			I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN,
>>> +		[I40E_CLOUD_FILTER_FLAGS_IMAC_TEN_ID] =
>>> +			I40E_AQC_ADD_CLOUD_FILTER_IMAC_TEN_ID,
>>> +		[I40E_CLOUD_FILTER_FLAGS_OMAC_TEN_ID_IMAC] =
>>> +			I40E_AQC_ADD_CLOUD_FILTER_OMAC_TEN_ID_IMAC,
>>> +		[I40E_CLOUD_FILTER_FLAGS_IMAC_IVLAN_TEN_ID] =
>>> +			I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN_TEN_ID,
>>> +		[I40E_CLOUD_FILTER_FLAGS_IIP] =
>>> +			I40E_AQC_ADD_CLOUD_FILTER_IIP,
>>> +	};
>>> +
>>> +	if (filter->flags >= ARRAY_SIZE(flag_table))
>>> +		return I40E_ERR_CONFIG;
>>> +
>>> +	/* copy element needed to add cloud filter from filter */
>>> +	i40e_set_cld_element(filter, &cld_filter);
>>> +
>>> +	if (filter->tunnel_type != I40E_CLOUD_TNL_TYPE_NONE)
>>> +		cld_filter.flags = cpu_to_le16(filter->tunnel_type <<
>>> +					     I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT);
>>> +
>>> +	if (filter->ip_version == IPV6_VERSION)
>>> +		cld_filter.flags |= cpu_to_le16(flag_table[filter->flags] |
>>> +						I40E_AQC_ADD_CLOUD_FLAGS_IPV6);
>>> +	else
>>> +		cld_filter.flags |= cpu_to_le16(flag_table[filter->flags] |
>>> +						I40E_AQC_ADD_CLOUD_FLAGS_IPV4);
>>> +
>>> +	if (add)
>>> +		ret = i40e_aq_add_cloud_filters(&pf->hw, filter->seid,
>>> +						&cld_filter, 1);
>>> +	else
>>> +		ret = i40e_aq_rem_cloud_filters(&pf->hw, filter->seid,
>>> +						&cld_filter, 1);
>>> +	if (ret)
>>> +		dev_dbg(&pf->pdev->dev,
>>> +			"Failed to %s cloud filter using l4 port %u, err %d aq_err %d\n",
>>> +			add ? "add" : "delete", filter->dst_port, ret,
>>> +			pf->hw.aq.asq_last_status);
>>> +	else
>>> +		dev_info(&pf->pdev->dev,
>>> +			 "%s cloud filter for VSI: %d\n",
>>> +			 add ? "Added" : "Deleted", filter->seid);
>>> +	return ret;
>>> +}
>>> +
>>> +/**
>>> + * i40e_add_del_cloud_filter_big_buf - Add/del cloud filter using big_buf
>>> + * @vsi: pointer to VSI
>>> + * @filter: cloud filter rule
>>> + * @add: if true, add, if false, delete
>>> + *
>>> + * Add or delete a cloud filter for a specific flow spec using big buffer.
>>> + * Returns 0 if the filter was successfully added or deleted.
>>> + **/
>>> +static int i40e_add_del_cloud_filter_big_buf(struct i40e_vsi *vsi,
>>> +					     struct i40e_cloud_filter *filter,
>>> +					     bool add)
>>> +{
>>> +	struct i40e_aqc_cloud_filters_element_bb cld_filter;
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	int ret;
>>> +
>>> +	/* Both (Outer/Inner) valid mac_addr are not supported */
>>> +	if (is_valid_ether_addr(filter->dst_mac) &&
>>> +	    is_valid_ether_addr(filter->src_mac))
>>> +		return -EINVAL;
>>> +
>>> +	/* Make sure port is specified, otherwise bail out, for channel
>>> +	 * specific cloud filter needs 'L4 port' to be non-zero
>>> +	 */
>>> +	if (!filter->dst_port)
>>> +		return -EINVAL;
>>> +
>>> +	/* adding filter using src_port/src_ip is not supported at this stage */
>>> +	if (filter->src_port || filter->src_ip ||
>>> +	    !ipv6_addr_any((struct in6_addr *)&filter->src_ipv6))
>>> +		return -EINVAL;
>>> +
>>> +	/* copy element needed to add cloud filter from filter */
>>> +	i40e_set_cld_element(filter, &cld_filter.element);
>>> +
>>> +	if (is_valid_ether_addr(filter->dst_mac) ||
>>> +	    is_valid_ether_addr(filter->src_mac) ||
>>> +	    is_multicast_ether_addr(filter->dst_mac) ||
>>> +	    is_multicast_ether_addr(filter->src_mac)) {
>>> +		/* MAC + IP : unsupported mode */
>>> +		if (filter->dst_ip)
>>> +			return -EINVAL;
>>> +
>>> +		/* since we validated that L4 port must be valid before
>>> +		 * we get here, start with respective "flags" value
>>> +		 * and update if vlan is present or not
>>> +		 */
>>> +		cld_filter.element.flags =
>>> +			cpu_to_le16(I40E_AQC_ADD_CLOUD_FILTER_MAC_PORT);
>>> +
>>> +		if (filter->vlan_id) {
>>> +			cld_filter.element.flags =
>>> +			cpu_to_le16(I40E_AQC_ADD_CLOUD_FILTER_MAC_VLAN_PORT);
>>> +		}
>>> +
>>> +	} else if (filter->dst_ip || filter->ip_version == IPV6_VERSION) {
>>> +		cld_filter.element.flags =
>>> +				cpu_to_le16(I40E_AQC_ADD_CLOUD_FILTER_IP_PORT);
>>> +		if (filter->ip_version == IPV6_VERSION)
>>> +			cld_filter.element.flags |=
>>> +				cpu_to_le16(I40E_AQC_ADD_CLOUD_FLAGS_IPV6);
>>> +		else
>>> +			cld_filter.element.flags |=
>>> +				cpu_to_le16(I40E_AQC_ADD_CLOUD_FLAGS_IPV4);
>>> +	} else {
>>> +		dev_err(&pf->pdev->dev,
>>> +			"either mac or ip has to be valid for cloud filter\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	/* Now copy L4 port in Byte 6..7 in general fields */
>>> +	cld_filter.general_fields[I40E_AQC_ADD_CLOUD_FV_FLU_0X16_WORD0] =
>>> +						be16_to_cpu(filter->dst_port);
>>> +
>>> +	if (add) {
>>> +		bool proto_type, port_type;
>>> +
>>> +		proto_type = (filter->ip_proto == IPPROTO_TCP) ? true : false;
>>> +		port_type = (filter->port_type & I40E_CLOUD_FILTER_PORT_DEST) ?
>>> +			     true : false;
>>> +
>>> +		/* For now, src port based cloud filter for channel is not
>>> +		 * supported
>>> +		 */
>>> +		if (!port_type) {
>>> +			dev_err(&pf->pdev->dev,
>>> +				"unsupported port type (src port)\n");
>>> +			return -EOPNOTSUPP;
>>> +		}
>>> +
>>> +		/* Validate current device switch mode, change if necessary */
>>> +		ret = i40e_validate_and_set_switch_mode(vsi, proto_type,
>>> +							port_type);
>>> +		if (ret) {
>>> +			dev_err(&pf->pdev->dev,
>>> +				"failed to set switch mode, ret %d\n",
>>> +				ret);
>>> +			return ret;
>>> +		}
>>> +
>>> +		ret = i40e_aq_add_cloud_filters_bb(&pf->hw, filter->seid,
>>> +						   &cld_filter, 1);
>>> +	} else {
>>> +		ret = i40e_aq_rem_cloud_filters_bb(&pf->hw, filter->seid,
>>> +						   &cld_filter, 1);
>>> +	}
>>> +
>>> +	if (ret)
>>> +		dev_dbg(&pf->pdev->dev,
>>> +			"Failed to %s cloud filter(big buffer) err %d aq_err %d\n",
>>> +			add ? "add" : "delete", ret, pf->hw.aq.asq_last_status);
>>> +	else
>>> +		dev_info(&pf->pdev->dev,
>>> +			 "%s cloud filter for VSI: %d, L4 port: %d\n",
>>> +			 add ? "Added" : "Deleted", filter->seid,
>>> +			 ntohs(filter->dst_port));
>>> +	return ret;
>>> +}
>>> +
>>> +/**
>>> + * i40e_parse_cls_flower - Parse tc flower filters provided by kernel
>>> + * @vsi: Pointer to VSI
>>> + * @f: Pointer to struct tc_cls_flower_offload
>>> + * @filter: Pointer to cloud filter structure
>>> + *
>>> + **/
>>> +static int i40e_parse_cls_flower(struct i40e_vsi *vsi,
>>> +				 struct tc_cls_flower_offload *f,
>>> +				 struct i40e_cloud_filter *filter)
>>> +{
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	u16 addr_type = 0;
>>> +	u8 field_flags = 0;
>>> +
>>> +	if (f->dissector->used_keys &
>>> +	    ~(BIT(FLOW_DISSECTOR_KEY_CONTROL) |
>>> +	      BIT(FLOW_DISSECTOR_KEY_BASIC) |
>>> +	      BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) |
>>> +	      BIT(FLOW_DISSECTOR_KEY_VLAN) |
>>> +	      BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
>>> +	      BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) |
>>> +	      BIT(FLOW_DISSECTOR_KEY_PORTS) |
>>> +	      BIT(FLOW_DISSECTOR_KEY_ENC_KEYID))) {
>>> +		dev_err(&pf->pdev->dev, "Unsupported key used: 0x%x\n",
>>> +			f->dissector->used_keys);
>>> +		return -EOPNOTSUPP;
>>> +	}
>>> +
>>> +	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_ENC_KEYID)) {
>>> +		struct flow_dissector_key_keyid *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_ENC_KEYID,
>>> +						  f->key);
>>> +
>>> +		struct flow_dissector_key_keyid *mask =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_ENC_KEYID,
>>> +						  f->mask);
>>> +
>>> +		if (mask->keyid != 0)
>>> +			field_flags |= I40E_CLOUD_FIELD_TEN_ID;
>>> +
>>> +		filter->tenant_id = be32_to_cpu(key->keyid);
>>> +	}
>>> +
>>> +	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
>>> +		struct flow_dissector_key_basic *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_BASIC,
>>> +						  f->key);
>>> +
>>> +		filter->ip_proto = key->ip_proto;
>>> +	}
>>> +
>>> +	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
>>> +		struct flow_dissector_key_eth_addrs *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_ETH_ADDRS,
>>> +						  f->key);
>>> +
>>> +		struct flow_dissector_key_eth_addrs *mask =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_ETH_ADDRS,
>>> +						  f->mask);
>>> +
>>> +		/* use is_broadcast and is_zero to check for all 0xf or 0 */
>>> +		if (!is_zero_ether_addr(mask->dst)) {
>>> +			if (is_broadcast_ether_addr(mask->dst)) {
>>> +				field_flags |= I40E_CLOUD_FIELD_OMAC;
>>> +			} else {
>>> +				dev_err(&pf->pdev->dev, "Bad ether dest mask %pM\n",
>>> +					mask->dst);
>>> +				return I40E_ERR_CONFIG;
>>> +			}
>>> +		}
>>> +
>>> +		if (!is_zero_ether_addr(mask->src)) {
>>> +			if (is_broadcast_ether_addr(mask->src)) {
>>> +				field_flags |= I40E_CLOUD_FIELD_IMAC;
>>> +			} else {
>>> +				dev_err(&pf->pdev->dev, "Bad ether src mask %pM\n",
>>> +					mask->src);
>>> +				return I40E_ERR_CONFIG;
>>> +			}
>>> +		}
>>> +		ether_addr_copy(filter->dst_mac, key->dst);
>>> +		ether_addr_copy(filter->src_mac, key->src);
>>> +	}
>>> +
>>> +	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_VLAN)) {
>>> +		struct flow_dissector_key_vlan *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_VLAN,
>>> +						  f->key);
>>> +		struct flow_dissector_key_vlan *mask =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_VLAN,
>>> +						  f->mask);
>>> +
>>> +		if (mask->vlan_id) {
>>> +			if (mask->vlan_id == VLAN_VID_MASK) {
>>> +				field_flags |= I40E_CLOUD_FIELD_IVLAN;
>>> +
>>> +			} else {
>>> +				dev_err(&pf->pdev->dev, "Bad vlan mask 0x%04x\n",
>>> +					mask->vlan_id);
>>> +				return I40E_ERR_CONFIG;
>>> +			}
>>> +		}
>>> +
>>> +		filter->vlan_id = cpu_to_be16(key->vlan_id);
>>> +	}
>>> +
>>> +	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_CONTROL)) {
>>> +		struct flow_dissector_key_control *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_CONTROL,
>>> +						  f->key);
>>> +
>>> +		addr_type = key->addr_type;
>>> +	}
>>> +
>>> +	if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
>>> +		struct flow_dissector_key_ipv4_addrs *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_IPV4_ADDRS,
>>> +						  f->key);
>>> +		struct flow_dissector_key_ipv4_addrs *mask =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_IPV4_ADDRS,
>>> +						  f->mask);
>>> +
>>> +		if (mask->dst) {
>>> +			if (mask->dst == cpu_to_be32(0xffffffff)) {
>>> +				field_flags |= I40E_CLOUD_FIELD_IIP;
>>> +			} else {
>>> +				dev_err(&pf->pdev->dev, "Bad ip dst mask 0x%08x\n",
>>> +					be32_to_cpu(mask->dst));
>>> +				return I40E_ERR_CONFIG;
>>> +			}
>>> +		}
>>> +
>>> +		if (mask->src) {
>>> +			if (mask->src == cpu_to_be32(0xffffffff)) {
>>> +				field_flags |= I40E_CLOUD_FIELD_IIP;
>>> +			} else {
>>> +				dev_err(&pf->pdev->dev, "Bad ip src mask 0x%08x\n",
>>> +					be32_to_cpu(mask->src));
>>> +				return I40E_ERR_CONFIG;
>>> +			}
>>> +		}
>>> +
>>> +		if (field_flags & I40E_CLOUD_FIELD_TEN_ID) {
>>> +			dev_err(&pf->pdev->dev, "Tenant id not allowed for ip filter\n");
>>> +			return I40E_ERR_CONFIG;
>>> +		}
>>> +		filter->dst_ip = key->dst;
>>> +		filter->src_ip = key->src;
>>> +		filter->ip_version = IPV4_VERSION;
>>> +	}
>>> +
>>> +	if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
>>> +		struct flow_dissector_key_ipv6_addrs *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_IPV6_ADDRS,
>>> +						  f->key);
>>> +		struct flow_dissector_key_ipv6_addrs *mask =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_IPV6_ADDRS,
>>> +						  f->mask);
>>> +
>>> +		/* src and dest IPV6 address should not be LOOPBACK
>>> +		 * (0:0:0:0:0:0:0:1), which can be represented as ::1
>>> +		 */
>>> +		if (ipv6_addr_loopback(&key->dst) ||
>>> +		    ipv6_addr_loopback(&key->src)) {
>>> +			dev_err(&pf->pdev->dev,
>>> +				"Bad ipv6, addr is LOOPBACK\n");
>>> +			return I40E_ERR_CONFIG;
>>> +		}
>>> +		if (!ipv6_addr_any(&mask->dst) || !ipv6_addr_any(&mask->src))
>>> +			field_flags |= I40E_CLOUD_FIELD_IIP;
>>> +
>>> +		memcpy(&filter->src_ipv6, &key->src.s6_addr32,
>>> +		       sizeof(filter->src_ipv6));
>>> +		memcpy(&filter->dst_ipv6, &key->dst.s6_addr32,
>>> +		       sizeof(filter->dst_ipv6));
>>> +
>>> +		/* mark it as IPv6 filter, to be used later */
>>> +		filter->ip_version = IPV6_VERSION;
>>> +	}
>>> +
>>> +	if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_PORTS)) {
>>> +		struct flow_dissector_key_ports *key =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_PORTS,
>>> +						  f->key);
>>> +		struct flow_dissector_key_ports *mask =
>>> +			skb_flow_dissector_target(f->dissector,
>>> +						  FLOW_DISSECTOR_KEY_PORTS,
>>> +						  f->mask);
>>> +
>>> +		if (mask->src) {
>>> +			if (mask->src == cpu_to_be16(0xffff)) {
>>> +				field_flags |= I40E_CLOUD_FIELD_IIP;
>>> +			} else {
>>> +				dev_err(&pf->pdev->dev, "Bad src port mask 0x%04x\n",
>>> +					be16_to_cpu(mask->src));
>>> +				return I40E_ERR_CONFIG;
>>> +			}
>>> +		}
>>> +
>>> +		if (mask->dst) {
>>> +			if (mask->dst == cpu_to_be16(0xffff)) {
>>> +				field_flags |= I40E_CLOUD_FIELD_IIP;
>>> +			} else {
>>> +				dev_err(&pf->pdev->dev, "Bad dst port mask 0x%04x\n",
>>> +					be16_to_cpu(mask->dst));
>>> +				return I40E_ERR_CONFIG;
>>> +			}
>>> +		}
>>> +
>>> +		filter->dst_port = key->dst;
>>> +		filter->src_port = key->src;
>>> +
>>> +		/* For now, only the destination port is supported */
>>> +		filter->port_type |= I40E_CLOUD_FILTER_PORT_DEST;
>>> +
>>> +		switch (filter->ip_proto) {
>>> +		case IPPROTO_TCP:
>>> +		case IPPROTO_UDP:
>>> +			break;
>>> +		default:
>>> +			dev_err(&pf->pdev->dev,
>>> +				"Only UDP and TCP transport are supported\n");
>>> +			return -EINVAL;
>>> +		}
>>> +	}
>>> +	filter->flags = field_flags;
>>> +	return 0;
>>> +}
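
For the MAC, VLAN, IPv4 and port keys, the parser above accepts a field only when its mask is all zeroes (field unused) or all ones (exact match) and rejects anything in between. A minimal sketch of that three-way check over a byte string is below. classify_mask() and the enum are hypothetical names of mine, not anything in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum mask_kind { MASK_UNUSED, MASK_EXACT, MASK_PARTIAL };

/* Classify a match mask: all 0x00, all 0xff, or something in between. */
static enum mask_kind classify_mask(const uint8_t *mask, size_t len)
{
	int all_zero = 1, all_ones = 1;
	size_t i;

	assert(mask != NULL || len == 0);

	for (i = 0; i < len; i++) {
		if (mask[i] != 0x00)
			all_zero = 0;
		if (mask[i] != 0xff)
			all_ones = 0;
	}

	if (all_zero)
		return MASK_UNUSED;	/* field not part of the match */
	if (all_ones)
		return MASK_EXACT;	/* exact match, offloadable */
	return MASK_PARTIAL;		/* would be rejected */
}
```

The IPv6 branch appears to differ from the others: it sets the IIP flag for any non-zero mask and has no rejection path for a partial one.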
>>> +
>>> +/**
>>> + * i40e_handle_redirect_action: Forward to a traffic class on the device
>>> + * @vsi: Pointer to VSI
>>> + * @ifindex: ifindex of the device to forward to
>>> + * @tc: traffic class index on the device
>>> + * @filter: Pointer to cloud filter structure
>>> + *
>>> + **/
>>> +static int i40e_handle_redirect_action(struct i40e_vsi *vsi, int ifindex, u8 tc,
>>> +				       struct i40e_cloud_filter *filter)
>>> +{
>>> +	struct i40e_channel *ch, *ch_tmp;
>>> +
>>> +	/* redirect to a traffic class on the same device */
>>> +	if (vsi->netdev->ifindex == ifindex) {
>>> +		if (tc == 0) {
>>> +			filter->seid = vsi->seid;
>>> +			return 0;
>>> +		} else if (vsi->tc_config.enabled_tc & BIT(tc)) {
>>> +			if (!filter->dst_port) {
>>> +				dev_err(&vsi->back->pdev->dev,
>>> +					"Specify destination port to redirect to traffic class that is not default\n");
>>> +				return -EINVAL;
>>> +			}
>>> +			if (list_empty(&vsi->ch_list))
>>> +				return -EINVAL;
>>> +			list_for_each_entry_safe(ch, ch_tmp, &vsi->ch_list,
>>> +						 list) {
>>> +				if (ch->seid == vsi->tc_seid_map[tc])
>>> +					filter->seid = ch->seid;
>>> +			}
>>> +			return 0;
>>> +		}
>>> +	}
>>> +	return -EINVAL;
>>> +}
>>> +
>>> +/**
>>> + * i40e_parse_tc_actions - Parse tc actions
>>> + * @vsi: Pointer to VSI
>>> + * @exts: Pointer to struct tcf_exts
>>> + * @filter: Pointer to cloud filter structure
>>> + *
>>> + **/
>>> +static int i40e_parse_tc_actions(struct i40e_vsi *vsi, struct tcf_exts *exts,
>>> +				 struct i40e_cloud_filter *filter)
>>> +{
>>> +	const struct tc_action *a;
>>> +	LIST_HEAD(actions);
>>> +	int err;
>>> +
>>> +	if (!tcf_exts_has_actions(exts))
>>> +		return -EINVAL;
>>> +
>>> +	tcf_exts_to_list(exts, &actions);
>>> +	list_for_each_entry(a, &actions, list) {
>>> +		/* Drop action */
>>> +		if (is_tcf_gact_shot(a)) {
>>> +			dev_err(&vsi->back->pdev->dev,
>>> +				"Cloud filters do not support the drop action.\n");
>>> +			return -EOPNOTSUPP;
>>> +		}
>>> +
>>> +		/* Redirect to a traffic class on the same device */
>>> +		if (!is_tcf_mirred_egress_redirect(a) && is_tcf_mirred_tc(a)) {
>>> +			int ifindex = tcf_mirred_ifindex(a);
>>> +			u8 tc = tcf_mirred_tc(a);
>>> +
>>> +			err = i40e_handle_redirect_action(vsi, ifindex, tc,
>>> +							  filter);
>>> +			if (err == 0)
>>> +				return err;
>>> +		}
>>> +	}
>>> +	return -EINVAL;
>>> +}
>>> +
>>> +/**
>>> + * i40e_configure_clsflower - Configure tc flower filters
>>> + * @vsi: Pointer to VSI
>>> + * @cls_flower: Pointer to struct tc_cls_flower_offload
>>> + *
>>> + **/
>>> +static int i40e_configure_clsflower(struct i40e_vsi *vsi,
>>> +				    struct tc_cls_flower_offload *cls_flower)
>>> +{
>>> +	struct i40e_cloud_filter *filter = NULL;
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	int err = 0;
>>> +
>>> +	if (test_bit(__I40E_RESET_RECOVERY_PENDING, pf->state) ||
>>> +	    test_bit(__I40E_RESET_INTR_RECEIVED, pf->state))
>>> +		return -EBUSY;
>>> +
>>> +	if (pf->fdir_pf_active_filters ||
>>> +	    (!hlist_empty(&pf->fdir_filter_list))) {
>>> +		dev_err(&vsi->back->pdev->dev,
>>> +			"Flow Director Sideband filters exist, turn ntuple off to configure cloud filters\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	if (vsi->back->flags & I40E_FLAG_FD_SB_ENABLED) {
>>> +		dev_err(&vsi->back->pdev->dev,
>>> +			"Disable Flow Director Sideband, configuring Cloud filters via tc-flower\n");
>>> +		vsi->back->flags &= ~I40E_FLAG_FD_SB_ENABLED;
>>> +		vsi->back->flags |= I40E_FLAG_FD_SB_TO_CLOUD_FILTER;
>>> +	}
>>> +
>>> +	filter = kzalloc(sizeof(*filter), GFP_KERNEL);
>>> +	if (!filter)
>>> +		return -ENOMEM;
>>> +
>>> +	filter->cookie = cls_flower->cookie;
>>> +
>>> +	err = i40e_parse_cls_flower(vsi, cls_flower, filter);
>>> +	if (err < 0)
>>> +		goto err;
>>> +
>>> +	err = i40e_parse_tc_actions(vsi, cls_flower->exts, filter);
>>> +	if (err < 0)
>>> +		goto err;
>>> +
>>> +	/* Add cloud filter */
>>> +	if (filter->dst_port)
>>> +		err = i40e_add_del_cloud_filter_big_buf(vsi, filter, true);
>>> +	else
>>> +		err = i40e_add_del_cloud_filter(vsi, filter, true);
>>> +
>>> +	if (err) {
>>> +		dev_err(&pf->pdev->dev,
>>> +			"Failed to add cloud filter, err %s\n",
>>> +			i40e_stat_str(&pf->hw, err));
>>> +		err = i40e_aq_rc_to_posix(err, pf->hw.aq.asq_last_status);
>>> +		goto err;
>>> +	}
>>> +
>>> +	/* add filter to the ordered list */
>>> +	INIT_HLIST_NODE(&filter->cloud_node);
>>> +
>>> +	hlist_add_head(&filter->cloud_node, &pf->cloud_filter_list);
>>> +
>>> +	pf->num_cloud_filters++;
>>> +
>>> +	return err;
>>> +err:
>>> +	kfree(filter);
>>> +	return err;
>>> +}
>>> +
>>> +/**
>>> + * i40e_find_cloud_filter - Find the cloud filter in the list
>>> + * @vsi: Pointer to VSI
>>> + * @cookie: filter specific cookie
>>> + *
>>> + **/
>>> +static struct i40e_cloud_filter *i40e_find_cloud_filter(struct i40e_vsi *vsi,
>>> +							unsigned long *cookie)
>>> +{
>>> +	struct i40e_cloud_filter *filter = NULL;
>>> +	struct hlist_node *node2;
>>> +
>>> +	hlist_for_each_entry_safe(filter, node2,
>>> +				  &vsi->back->cloud_filter_list, cloud_node)
>>> +		if (!memcmp(cookie, &filter->cookie, sizeof(filter->cookie)))
>>> +			return filter;
>>> +	return NULL;
>>> +}
>>> +
>>> +/**
>>> + * i40e_delete_clsflower - Remove tc flower filters
>>> + * @vsi: Pointer to VSI
>>> + * @cls_flower: Pointer to struct tc_cls_flower_offload
>>> + *
>>> + **/
>>> +static int i40e_delete_clsflower(struct i40e_vsi *vsi,
>>> +				 struct tc_cls_flower_offload *cls_flower)
>>> +{
>>> +	struct i40e_cloud_filter *filter = NULL;
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	int err = 0;
>>> +
>>> +	filter = i40e_find_cloud_filter(vsi, &cls_flower->cookie);
>>> +
>>> +	if (!filter)
>>> +		return -EINVAL;
>>> +
>>> +	hash_del(&filter->cloud_node);
>>> +
>>> +	if (filter->dst_port)
>>> +		err = i40e_add_del_cloud_filter_big_buf(vsi, filter, false);
>>> +	else
>>> +		err = i40e_add_del_cloud_filter(vsi, filter, false);
>>> +	if (err) {
>>> +		kfree(filter);
>>> +		dev_err(&pf->pdev->dev,
>>> +			"Failed to delete cloud filter, err %s\n",
>>> +			i40e_stat_str(&pf->hw, err));
>>> +		return i40e_aq_rc_to_posix(err, pf->hw.aq.asq_last_status);
>>> +	}
>>> +
>>> +	kfree(filter);
>>> +	pf->num_cloud_filters--;
>>> +
>>> +	if (!pf->num_cloud_filters)
>>> +		if ((pf->flags & I40E_FLAG_FD_SB_TO_CLOUD_FILTER) &&
>>> +		    !(pf->flags & I40E_FLAG_FD_SB_INACTIVE)) {
>>> +			pf->flags |= I40E_FLAG_FD_SB_ENABLED;
>>> +			pf->flags &= ~I40E_FLAG_FD_SB_TO_CLOUD_FILTER;
>>> +			pf->flags &= ~I40E_FLAG_FD_SB_INACTIVE;
>>> +		}
>>> +	return 0;
>>> +}
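
Once the last cloud filter is gone, the code above hands Flow Director Sideband back, but only if it was switched off in favour of cloud filters and is not marked inactive. A small sketch of that flag transition is below. The bit positions and the name restore_fd_sb() are made up for illustration, and the assert simply restates the postcondition as I understand it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define FD_SB_ENABLED		(1u << 0)
#define FD_SB_TO_CLOUD_FILTER	(1u << 1)
#define FD_SB_INACTIVE		(1u << 2)

/* Return the flags after the "give sideband back" step. */
static uint32_t restore_fd_sb(uint32_t flags)
{
	if ((flags & FD_SB_TO_CLOUD_FILTER) && !(flags & FD_SB_INACTIVE)) {
		flags |= FD_SB_ENABLED;
		flags &= ~FD_SB_TO_CLOUD_FILTER;
	}

	/* TO_CLOUD_FILTER can only remain set if INACTIVE is set too. */
	assert(!(flags & FD_SB_TO_CLOUD_FILTER) || (flags & FD_SB_INACTIVE));
	return flags;
}
```

The sketch leaves out the clearing of the INACTIVE bit because the branch is only entered when that bit is already clear, so in the quoted code that statement looks like a no-op.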
>>> +
>>> +/**
>>> + * i40e_setup_tc_cls_flower - flower classifier offloads
>>> + * @netdev: net device to configure
>>> + * @cls_flower: offload data
>>> + **/
>>> +static int i40e_setup_tc_cls_flower(struct net_device *netdev,
>>> +				    struct tc_cls_flower_offload *cls_flower)
>>> +{
>>> +	struct i40e_netdev_priv *np = netdev_priv(netdev);
>>> +	struct i40e_vsi *vsi = np->vsi;
>>> +
>>> +	if (!is_classid_clsact_ingress(cls_flower->common.classid) ||
>>> +	    cls_flower->common.chain_index)
>>> +		return -EOPNOTSUPP;
>>> +
>>> +	switch (cls_flower->command) {
>>> +	case TC_CLSFLOWER_REPLACE:
>>> +		return i40e_configure_clsflower(vsi, cls_flower);
>>> +	case TC_CLSFLOWER_DESTROY:
>>> +		return i40e_delete_clsflower(vsi, cls_flower);
>>> +	case TC_CLSFLOWER_STATS:
>>> +		return -EOPNOTSUPP;
>>> +	default:
>>> +		return -EINVAL;
>>> +	}
>>> +}
>>> +
>>> static int __i40e_setup_tc(struct net_device *netdev, enum tc_setup_type type,
>>> 			   void *type_data)
>>> {
>>> -	if (type != TC_SETUP_MQPRIO)
>>> +	switch (type) {
>>> +	case TC_SETUP_MQPRIO:
>>> +		return i40e_setup_tc(netdev, type_data);
>>> +	case TC_SETUP_CLSFLOWER:
>>> +		return i40e_setup_tc_cls_flower(netdev, type_data);
>>> +	default:
>>> 		return -EOPNOTSUPP;
>>> -
>>> -	return i40e_setup_tc(netdev, type_data);
>>> +	}
>>> }
>>>
>>> /**
>>> @@ -6939,6 +7756,13 @@ static void i40e_cloud_filter_exit(struct i40e_pf *pf)
>>> 		kfree(cfilter);
>>> 	}
>>> 	pf->num_cloud_filters = 0;
>>> +
>>> +	if ((pf->flags & I40E_FLAG_FD_SB_TO_CLOUD_FILTER) &&
>>> +	    !(pf->flags & I40E_FLAG_FD_SB_INACTIVE)) {
>>> +		pf->flags |= I40E_FLAG_FD_SB_ENABLED;
>>> +		pf->flags &= ~I40E_FLAG_FD_SB_TO_CLOUD_FILTER;
>>> +		pf->flags &= ~I40E_FLAG_FD_SB_INACTIVE;
>>> +	}
>>> }
>>>
>>> /**
>>> @@ -8046,7 +8870,8 @@ static int i40e_reconstitute_veb(struct i40e_veb *veb)
>>>  * i40e_get_capabilities - get info about the HW
>>>  * @pf: the PF struct
>>>  **/
>>> -static int i40e_get_capabilities(struct i40e_pf *pf)
>>> +static int i40e_get_capabilities(struct i40e_pf *pf,
>>> +				 enum i40e_admin_queue_opc list_type)
>>> {
>>> 	struct i40e_aqc_list_capabilities_element_resp *cap_buf;
>>> 	u16 data_size;
>>> @@ -8061,9 +8886,8 @@ static int i40e_get_capabilities(struct i40e_pf *pf)
>>>
>>> 		/* this loads the data into the hw struct for us */
>>> 		err = i40e_aq_discover_capabilities(&pf->hw, cap_buf, buf_len,
>>> -					    &data_size,
>>> -					    i40e_aqc_opc_list_func_capabilities,
>>> -					    NULL);
>>> +						    &data_size, list_type,
>>> +						    NULL);
>>> 		/* data loaded, buffer no longer needed */
>>> 		kfree(cap_buf);
>>>
>>> @@ -8080,26 +8904,44 @@ static int i40e_get_capabilities(struct i40e_pf *pf)
>>> 		}
>>> 	} while (err);
>>>
>>> -	if (pf->hw.debug_mask & I40E_DEBUG_USER)
>>> -		dev_info(&pf->pdev->dev,
>>> -			 "pf=%d, num_vfs=%d, msix_pf=%d, msix_vf=%d, fd_g=%d, fd_b=%d, pf_max_q=%d num_vsi=%d\n",
>>> -			 pf->hw.pf_id, pf->hw.func_caps.num_vfs,
>>> -			 pf->hw.func_caps.num_msix_vectors,
>>> -			 pf->hw.func_caps.num_msix_vectors_vf,
>>> -			 pf->hw.func_caps.fd_filters_guaranteed,
>>> -			 pf->hw.func_caps.fd_filters_best_effort,
>>> -			 pf->hw.func_caps.num_tx_qp,
>>> -			 pf->hw.func_caps.num_vsis);
>>> -
>>> +	if (pf->hw.debug_mask & I40E_DEBUG_USER) {
>>> +		if (list_type == i40e_aqc_opc_list_func_capabilities) {
>>> +			dev_info(&pf->pdev->dev,
>>> +				 "pf=%d, num_vfs=%d, msix_pf=%d, msix_vf=%d, fd_g=%d, fd_b=%d, pf_max_q=%d num_vsi=%d\n",
>>> +				 pf->hw.pf_id, pf->hw.func_caps.num_vfs,
>>> +				 pf->hw.func_caps.num_msix_vectors,
>>> +				 pf->hw.func_caps.num_msix_vectors_vf,
>>> +				 pf->hw.func_caps.fd_filters_guaranteed,
>>> +				 pf->hw.func_caps.fd_filters_best_effort,
>>> +				 pf->hw.func_caps.num_tx_qp,
>>> +				 pf->hw.func_caps.num_vsis);
>>> +		} else if (list_type == i40e_aqc_opc_list_dev_capabilities) {
>>> +			dev_info(&pf->pdev->dev,
>>> +				 "switch_mode=0x%04x, function_valid=0x%08x\n",
>>> +				 pf->hw.dev_caps.switch_mode,
>>> +				 pf->hw.dev_caps.valid_functions);
>>> +			dev_info(&pf->pdev->dev,
>>> +				 "SR-IOV=%d, num_vfs for all function=%u\n",
>>> +				 pf->hw.dev_caps.sr_iov_1_1,
>>> +				 pf->hw.dev_caps.num_vfs);
>>> +			dev_info(&pf->pdev->dev,
>>> +				 "num_vsis=%u, num_rx:%u, num_tx=%u\n",
>>> +				 pf->hw.dev_caps.num_vsis,
>>> +				 pf->hw.dev_caps.num_rx_qp,
>>> +				 pf->hw.dev_caps.num_tx_qp);
>>> +		}
>>> +	}
>>> +	if (list_type == i40e_aqc_opc_list_func_capabilities) {
>>> #define DEF_NUM_VSI (1 + (pf->hw.func_caps.fcoe ? 1 : 0) \
>>> 		       + pf->hw.func_caps.num_vfs)
>>> -	if (pf->hw.revision_id == 0 && (DEF_NUM_VSI > pf->hw.func_caps.num_vsis)) {
>>> -		dev_info(&pf->pdev->dev,
>>> -			 "got num_vsis %d, setting num_vsis to %d\n",
>>> -			 pf->hw.func_caps.num_vsis, DEF_NUM_VSI);
>>> -		pf->hw.func_caps.num_vsis = DEF_NUM_VSI;
>>> +		if (pf->hw.revision_id == 0 &&
>>> +		    (pf->hw.func_caps.num_vsis < DEF_NUM_VSI)) {
>>> +			dev_info(&pf->pdev->dev,
>>> +				 "got num_vsis %d, setting num_vsis to %d\n",
>>> +				 pf->hw.func_caps.num_vsis, DEF_NUM_VSI);
>>> +			pf->hw.func_caps.num_vsis = DEF_NUM_VSI;
>>> +		}
>>> 	}
>>> -
>>> 	return 0;
>>> }
>>>
>>> @@ -8141,6 +8983,7 @@ static void i40e_fdir_sb_setup(struct i40e_pf *pf)
>>> 		if (!vsi) {
>>> 			dev_info(&pf->pdev->dev, "Couldn't create FDir VSI\n");
>>> 			pf->flags &= ~I40E_FLAG_FD_SB_ENABLED;
>>> +			pf->flags |= I40E_FLAG_FD_SB_INACTIVE;
>>> 			return;
>>> 		}
>>> 	}
>>> @@ -8163,6 +9006,48 @@ static void i40e_fdir_teardown(struct i40e_pf *pf)
>>> }
>>>
>>> /**
>>> + * i40e_rebuild_cloud_filters - Rebuilds cloud filters for VSIs
>>> + * @vsi: PF main vsi
>>> + * @seid: seid of main or channel VSIs
>>> + *
>>> + * Rebuilds cloud filters associated with main VSI and channel VSIs if they
>>> + * existed before reset
>>> + **/
>>> +static int i40e_rebuild_cloud_filters(struct i40e_vsi *vsi, u16 seid)
>>> +{
>>> +	struct i40e_cloud_filter *cfilter;
>>> +	struct i40e_pf *pf = vsi->back;
>>> +	struct hlist_node *node;
>>> +	i40e_status ret;
>>> +
>>> +	/* Add cloud filters back if they exist */
>>> +	if (hlist_empty(&pf->cloud_filter_list))
>>> +		return 0;
>>> +
>>> +	hlist_for_each_entry_safe(cfilter, node, &pf->cloud_filter_list,
>>> +				  cloud_node) {
>>> +		if (cfilter->seid != seid)
>>> +			continue;
>>> +
>>> +		if (cfilter->dst_port)
>>> +			ret = i40e_add_del_cloud_filter_big_buf(vsi, cfilter,
>>> +								true);
>>> +		else
>>> +			ret = i40e_add_del_cloud_filter(vsi, cfilter, true);
>>> +
>>> +		if (ret) {
>>> +			dev_dbg(&pf->pdev->dev,
>>> +				"Failed to rebuild cloud filter, err %s aq_err %s\n",
>>> +				i40e_stat_str(&pf->hw, ret),
>>> +				i40e_aq_str(&pf->hw,
>>> +					    pf->hw.aq.asq_last_status));
>>> +			return ret;
>>> +		}
>>> +	}
>>> +	return 0;
>>> +}
>>> +
>>> +/**
>>>  * i40e_rebuild_channels - Rebuilds channel VSIs if they existed before reset
>>>  * @vsi: PF main vsi
>>>  *
>>> @@ -8199,6 +9084,13 @@ static int i40e_rebuild_channels(struct i40e_vsi *vsi)
>>> 						I40E_BW_CREDIT_DIVISOR,
>>> 				ch->seid);
>>> 		}
>>> +		ret = i40e_rebuild_cloud_filters(vsi, ch->seid);
>>> +		if (ret) {
>>> +			dev_dbg(&vsi->back->pdev->dev,
>>> +				"Failed to rebuild cloud filters for channel VSI %u\n",
>>> +				ch->seid);
>>> +			return ret;
>>> +		}
>>> 	}
>>> 	return 0;
>>> }
>>> @@ -8365,7 +9257,7 @@ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
>>> 		i40e_verify_eeprom(pf);
>>>
>>> 	i40e_clear_pxe_mode(hw);
>>> -	ret = i40e_get_capabilities(pf);
>>> +	ret = i40e_get_capabilities(pf, i40e_aqc_opc_list_func_capabilities);
>>> 	if (ret)
>>> 		goto end_core_reset;
>>>
>>> @@ -8482,6 +9374,10 @@ static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
>>> 			goto end_unlock;
>>> 	}
>>>
>>> +	ret = i40e_rebuild_cloud_filters(vsi, vsi->seid);
>>> +	if (ret)
>>> +		goto end_unlock;
>>> +
>>> 	/* PF Main VSI is rebuild by now, go ahead and rebuild channel VSIs
>>> 	 * for this main VSI if they exist
>>> 	 */
>>> @@ -9404,6 +10300,7 @@ static int i40e_init_msix(struct i40e_pf *pf)
>>> 	    (pf->num_fdsb_msix == 0)) {
>>> 		dev_info(&pf->pdev->dev, "Sideband Flowdir disabled, not enough MSI-X vectors\n");
>>> 		pf->flags &= ~I40E_FLAG_FD_SB_ENABLED;
>>> +		pf->flags |= I40E_FLAG_FD_SB_INACTIVE;
>>> 	}
>>> 	if ((pf->flags & I40E_FLAG_VMDQ_ENABLED) &&
>>> 	    (pf->num_vmdq_msix == 0)) {
>>> @@ -9521,6 +10418,7 @@ static int i40e_init_interrupt_scheme(struct i40e_pf *pf)
>>> 				       I40E_FLAG_FD_SB_ENABLED	|
>>> 				       I40E_FLAG_FD_ATR_ENABLED	|
>>> 				       I40E_FLAG_VMDQ_ENABLED);
>>> +			pf->flags |= I40E_FLAG_FD_SB_INACTIVE;
>>>
>>> 			/* rework the queue expectations without MSIX */
>>> 			i40e_determine_queue_usage(pf);
>>> @@ -10263,9 +11161,13 @@ bool i40e_set_ntuple(struct i40e_pf *pf, netdev_features_t features)
>>> 		/* Enable filters and mark for reset */
>>> 		if (!(pf->flags & I40E_FLAG_FD_SB_ENABLED))
>>> 			need_reset = true;
>>> -		/* enable FD_SB only if there is MSI-X vector */
>>> -		if (pf->num_fdsb_msix > 0)
>>> +		/* enable FD_SB only if there is MSI-X vector and no cloud
>>> +		 * filters exist
>>> +		 */
>>> +		if (pf->num_fdsb_msix > 0 && !pf->num_cloud_filters) {
>>> 			pf->flags |= I40E_FLAG_FD_SB_ENABLED;
>>> +			pf->flags &= ~I40E_FLAG_FD_SB_INACTIVE;
>>> +		}
>>> 	} else {
>>> 		/* turn off filters, mark for reset and clear SW filter list */
>>> 		if (pf->flags & I40E_FLAG_FD_SB_ENABLED) {
>>> @@ -10274,6 +11176,8 @@ bool i40e_set_ntuple(struct i40e_pf *pf, netdev_features_t features)
>>> 		}
>>> 		pf->flags &= ~(I40E_FLAG_FD_SB_ENABLED |
>>> 			       I40E_FLAG_FD_SB_AUTO_DISABLED);
>>> +		pf->flags |= I40E_FLAG_FD_SB_INACTIVE;
>>> +
>>> 		/* reset fd counters */
>>> 		pf->fd_add_err = 0;
>>> 		pf->fd_atr_cnt = 0;
>>> @@ -10857,7 +11761,8 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)
>>> 		netdev->hw_features |= NETIF_F_NTUPLE;
>>> 	hw_features = hw_enc_features		|
>>> 		      NETIF_F_HW_VLAN_CTAG_TX	|
>>> -		      NETIF_F_HW_VLAN_CTAG_RX;
>>> +		      NETIF_F_HW_VLAN_CTAG_RX	|
>>> +		      NETIF_F_HW_TC;
>>>
>>> 	netdev->hw_features |= hw_features;
>>>
>>> @@ -12159,8 +13064,10 @@ static int i40e_setup_pf_switch(struct i40e_pf *pf, bool reinit)
>>> 	*/
>>>
>>> 	if ((pf->hw.pf_id == 0) &&
>>> -	    !(pf->flags & I40E_FLAG_TRUE_PROMISC_SUPPORT))
>>> +	    !(pf->flags & I40E_FLAG_TRUE_PROMISC_SUPPORT)) {
>>> 		flags = I40E_AQ_SET_SWITCH_CFG_PROMISC;
>>> +		pf->last_sw_conf_flags = flags;
>>> +	}
>>>
>>> 	if (pf->hw.pf_id == 0) {
>>> 		u16 valid_flags;
>>> @@ -12176,6 +13083,7 @@ static int i40e_setup_pf_switch(struct i40e_pf *pf, bool reinit)
>>> 					     pf->hw.aq.asq_last_status));
>>> 			/* not a fatal problem, just keep going */
>>> 		}
>>> +		pf->last_sw_conf_valid_flags = valid_flags;
>>> 	}
>>>
>>> 	/* first time setup */
>>> @@ -12273,6 +13181,7 @@ static void i40e_determine_queue_usage(struct i40e_pf *pf)
>>> 			       I40E_FLAG_DCB_ENABLED	|
>>> 			       I40E_FLAG_SRIOV_ENABLED	|
>>> 			       I40E_FLAG_VMDQ_ENABLED);
>>> +		pf->flags |= I40E_FLAG_FD_SB_INACTIVE;
>>> 	} else if (!(pf->flags & (I40E_FLAG_RSS_ENABLED |
>>> 				  I40E_FLAG_FD_SB_ENABLED |
>>> 				  I40E_FLAG_FD_ATR_ENABLED |
>>> @@ -12287,6 +13196,7 @@ static void i40e_determine_queue_usage(struct i40e_pf *pf)
>>> 			       I40E_FLAG_FD_ATR_ENABLED	|
>>> 			       I40E_FLAG_DCB_ENABLED	|
>>> 			       I40E_FLAG_VMDQ_ENABLED);
>>> +		pf->flags |= I40E_FLAG_FD_SB_INACTIVE;
>>> 	} else {
>>> 		/* Not enough queues for all TCs */
>>> 		if ((pf->flags & I40E_FLAG_DCB_CAPABLE) &&
>>> @@ -12310,6 +13220,7 @@ static void i40e_determine_queue_usage(struct i40e_pf *pf)
>>> 			queues_left -= 1; /* save 1 queue for FD */
>>> 		} else {
>>> 			pf->flags &= ~I40E_FLAG_FD_SB_ENABLED;
>>> +			pf->flags |= I40E_FLAG_FD_SB_INACTIVE;
>>> 			dev_info(&pf->pdev->dev, "not enough queues for Flow Director. Flow Director feature is disabled\n");
>>> 		}
>>> 	}
>>> @@ -12613,7 +13524,7 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>> 		dev_warn(&pdev->dev, "This device is a pre-production adapter/LOM. Please be aware there may be issues with your hardware. If you are experiencing problems please contact your Intel or hardware representative who provided you with this hardware.\n");
>>>
>>> 	i40e_clear_pxe_mode(hw);
>>> -	err = i40e_get_capabilities(pf);
>>> +	err = i40e_get_capabilities(pf, i40e_aqc_opc_list_func_capabilities);
>>> 	if (err)
>>> 		goto err_adminq_setup;
>>>
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_prototype.h b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
>>> index 92869f5..3bb6659 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_prototype.h
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_prototype.h
>>> @@ -283,6 +283,22 @@ i40e_status i40e_aq_query_switch_comp_bw_config(struct i40e_hw *hw,
>>> 		struct i40e_asq_cmd_details *cmd_details);
>>> i40e_status i40e_aq_resume_port_tx(struct i40e_hw *hw,
>>> 				   struct i40e_asq_cmd_details *cmd_details);
>>> +i40e_status
>>> +i40e_aq_add_cloud_filters_bb(struct i40e_hw *hw, u16 seid,
>>> +			     struct i40e_aqc_cloud_filters_element_bb *filters,
>>> +			     u8 filter_count);
>>> +enum i40e_status_code
>>> +i40e_aq_add_cloud_filters(struct i40e_hw *hw, u16 vsi,
>>> +			  struct i40e_aqc_cloud_filters_element_data *filters,
>>> +			  u8 filter_count);
>>> +enum i40e_status_code
>>> +i40e_aq_rem_cloud_filters(struct i40e_hw *hw, u16 vsi,
>>> +			  struct i40e_aqc_cloud_filters_element_data *filters,
>>> +			  u8 filter_count);
>>> +i40e_status
>>> +i40e_aq_rem_cloud_filters_bb(struct i40e_hw *hw, u16 seid,
>>> +			     struct i40e_aqc_cloud_filters_element_bb *filters,
>>> +			     u8 filter_count);
>>> i40e_status i40e_read_lldp_cfg(struct i40e_hw *hw,
>>> 			       struct i40e_lldp_variables *lldp_cfg);
>>> /* i40e_common */
>>> diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
>>> index c019f46..af38881 100644
>>> --- a/drivers/net/ethernet/intel/i40e/i40e_type.h
>>> +++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
>>> @@ -287,6 +287,7 @@ struct i40e_hw_capabilities {
>>> #define I40E_NVM_IMAGE_TYPE_MODE1	0x6
>>> #define I40E_NVM_IMAGE_TYPE_MODE2	0x7
>>> #define I40E_NVM_IMAGE_TYPE_MODE3	0x8
>>> +#define I40E_SWITCH_MODE_MASK		0xF
>>>
>>> 	u32  management_mode;
>>> 	u32  mng_protocols_over_mctp;
>>> diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h b/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h
>>> index b8c78bf..4fe27f0 100644
>>> --- a/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h
>>> +++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h
>>> @@ -1360,6 +1360,9 @@ struct i40e_aqc_cloud_filters_element_data {
>>> 		struct {
>>> 			u8 data[16];
>>> 		} v6;
>>> +		struct {
>>> +			__le16 data[8];
>>> +		} raw_v6;
>>> 	} ipaddr;
>>> 	__le16	flags;
>>> #define I40E_AQC_ADD_CLOUD_FILTER_SHIFT			0
>>>

^ permalink raw reply

* Re: [PATCH 2/4] ravb: Add optional PHY reset during system resume
From: Florian Fainelli @ 2017-09-28 19:21 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Geert Uytterhoeven, David S . Miller, Simon Horman, Magnus Damm,
	Sergei Shtylyov, Andrew Lunn, Niklas Söderlund,
	netdev@vger.kernel.org, Linux-Renesas, devicetree@vger.kernel.org
In-Reply-To: <CAMuHMdWOAnT3xONPfU8pJi9fbAgtWL2GyRbooAxrfGDb=bsB_A@mail.gmail.com>

On 09/28/2017 11:45 AM, Geert Uytterhoeven wrote:
> Hi Florian,
> 
> On Thu, Sep 28, 2017 at 7:22 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>> On 09/28/2017 08:53 AM, Geert Uytterhoeven wrote:
>>> If the optional "reset-gpios" property is specified in DT, the generic
>>> MDIO bus code takes care of resetting the PHY during device probe.
>>> However, the PHY may still have to be reset explicitly after system
>>> resume.
>>>
>>> This allows to restore Ethernet operation after resume from s2ram on
>>> Salvator-XS, where the enable pin of the regulator providing PHY power
>>> is connected to PRESETn, and PSCI suspend powers down the SoC.
>>>
>>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>>> ---
>>>  drivers/net/ethernet/renesas/ravb_main.c | 9 +++++++++
>>>  1 file changed, 9 insertions(+)
>>>
>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>>> index fdf30bfa403bf416..96d1d48e302f8c9a 100644
>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>>> @@ -19,6 +19,7 @@
>>>  #include <linux/etherdevice.h>
>>>  #include <linux/ethtool.h>
>>>  #include <linux/if_vlan.h>
>>> +#include <linux/gpio/consumer.h>
>>>  #include <linux/kernel.h>
>>>  #include <linux/list.h>
>>>  #include <linux/module.h>
>>> @@ -2268,6 +2269,7 @@ static int __maybe_unused ravb_resume(struct device *dev)
>>>  {
>>>       struct net_device *ndev = dev_get_drvdata(dev);
>>>       struct ravb_private *priv = netdev_priv(ndev);
>>> +     struct mii_bus *bus = priv->mii_bus;
>>>       int ret = 0;
>>>
>>>       if (priv->wol_enabled) {
>>> @@ -2302,6 +2304,13 @@ static int __maybe_unused ravb_resume(struct device *dev)
>>>        * reopen device if it was running before system suspended.
>>>        */
>>>
>>> +     /* PHY reset */
>>> +     if (bus->reset_gpiod) {
>>> +             gpiod_set_value_cansleep(bus->reset_gpiod, 1);
>>> +             udelay(bus->reset_delay_us);
>>> +             gpiod_set_value_cansleep(bus->reset_gpiod, 0);
>>> +     }
>>
>> This is a clever hack, but unfortunately this is also misusing the MDIO
>> bus reset line into a PHY reset line. As commented in patch 3, if this
>> reset line is tied to the PHY, then this should be a PHY property and
> 
> OK.
> 
>> you cannot (ab)use the MDIO bus GPIO reset logic anymore...
> 
> And then I should add reset-gpios support to drivers/net/phy/micrel.c?
> Or is there already generic code to handle per-PHY reset? I couldn't find it.

There is no such thing, unfortunately, but it would presumably be
called within drivers/net/phy/mdio_bus.c during bus->reset() time
because you need the PHY reset to be deasserted before you can
successfully read/write from the PHY, and if you can't read/write from
the PHY, the MDIO bus layer cannot read the PHY ID, and therefore cannot
match a PHY device with its driver, so things don't work.
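
To make the ordering constraint above concrete, here is a minimal userspace
sketch, not kernel code. Everything named demo_* is hypothetical, the ID value
is made up, and the "reads return all ones while in reset" behaviour is an
assumption of the model rather than something taken from a datasheet:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a PHY whose reset line is driven by a GPIO. */
struct demo_phy {
	bool in_reset;    /* true while the reset line is asserted */
	uint32_t phy_id;  /* made-up 32-bit ID split over registers 2 and 3 */
};

/* Assumption: a PHY held in reset does not answer, so reads return all ones. */
uint16_t demo_mdio_read(const struct demo_phy *phy, int reg)
{
	if (phy->in_reset)
		return 0xffff;
	if (reg == 2)
		return (uint16_t)(phy->phy_id >> 16);
	if (reg == 3)
		return (uint16_t)(phy->phy_id & 0xffff);
	return 0;
}

/* Combine the two ID registers into one 32-bit value. */
uint32_t demo_read_phy_id(const struct demo_phy *phy)
{
	return ((uint32_t)demo_mdio_read(phy, 2) << 16) |
	       demo_mdio_read(phy, 3);
}

/*
 * Returns true if a usable ID was read, i.e. a driver could be matched.
 * With deassert_first == false the PHY is probed while still in reset.
 */
bool demo_probe(struct demo_phy *phy, bool deassert_first, uint32_t *id)
{
	if (deassert_first)
		phy->in_reset = false;
	*id = demo_read_phy_id(phy);
	return *id != 0xffffffffu;
}
```

In this model, probing before the reset line is deasserted yields an all-ones
ID and no driver match, which is the failure mode described above.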

NB: you could move this entirely to the Micrel PHY driver if you specify
a compatible string that has the PHY OUI in it, because that bypasses
the need to match the PHY driver with the PHY device, but this may not
be an acceptable solution for non-DT platforms or other platforms where
the PHY can't be determined based on the board DTS.

I was going to suggest writing some sort of generic helper that walks
the list of child nodes of an MDIO bus device node, deasserts reset
lines and enables clocks, but there is absolutely nothing generic about
that: which of the resets should come first and, if there are multiple,
in which order, etc.

> 
>> Should you not also try to manage this reset line during ravb_open() to
>> achieve better power savings?
> 
> I don't know. The Micrel KSZ9031RNXVA datasheet doesn't mention if it's
> safe or not to assert reset for a prolonged time.
> 
> Thanks!
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
> 


-- 
Florian

^ permalink raw reply

* Re: [PATCH RFC 3/5] Add KSZ8795 switch driver
From: Florian Fainelli @ 2017-09-28 18:45 UTC (permalink / raw)
  To: Pavel Machek, Tristram.Ha
  Cc: andrew, muvarov, nathan.leigh.conrad, vivien.didelot, netdev,
	linux-kernel, Woojung.Huh
In-Reply-To: <20170928184059.GA2825@amd>

On 09/28/2017 11:40 AM, Pavel Machek wrote:
> Hi!
> 
> On Mon 2017-09-18 20:27:13, Tristram.Ha@microchip.com wrote:
>>>> +/**
>>>> + * Some counters do not need to be read too often because they are less
>>> likely
>>>> + * to increase much.
>>>> + */
>>>
>>> What does comment mean? Are you caching statistics, and updating
>>> different values at different rates?
>>>
>>
>> There are 34 counters.  Normally, reading them over generic bus I/O or PCI
>> is very quick, but the switch is mostly accessed over SPI, or even I2C.  As
>> SPI access is very slow and cannot run in interrupt context, I keep worrying
>> that reading the MIB counters in a loop for 5 or more ports will prevent
>> other critical hardware accesses from executing soon enough.  Such accesses
>> include getting 1588 PTP timestamps and opening/closing ports.  (The RSTP
>> Conformance Test sends test traffic to a port that is supposed to be
>> closed/opened after receiving a specific RSTP BPDU.)
> 
> Hmm. Ok, interesting.
> 
> I wonder how well this is going to work if userspace actively 'does
> something' with the switch.
> 
> It seems to me that even if your statistics code is careful not to do
> 'a lot' of accesses at the same time, userspace can use other parts of
> the driver to do the same, and thus cause the same unwanted effects...

A few switches have a MIB snapshot feature that is implemented such that
accessing the snapshot does not hog the remainder of the switch
registers. Is this possible on KSZ switches?
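
For reference, the pacing idea described in the quoted text (spread the
counter reads out so the slow bus is released in between) could be sketched
roughly as below. This is a userspace illustration only, not the driver's
code: every demo_* name is hypothetical, the port and counter counts are the
figures quoted in the thread, and the fake read just returns a recognizable
value.

```c
#include <stdint.h>

#define DEMO_NUM_PORTS     5   /* "5 or more ports" in the thread */
#define DEMO_NUM_COUNTERS 34   /* counter count quoted in the thread */

struct demo_mib {
	uint64_t cache[DEMO_NUM_PORTS][DEMO_NUM_COUNTERS];
	unsigned int cursor;     /* next (port, counter) slot to refresh */
	unsigned int bus_reads;  /* slow-bus transactions issued so far */
};

/* Stand-in for one slow SPI/I2C register read; returns a fake value. */
uint64_t demo_slow_read(struct demo_mib *mib, unsigned int port,
			unsigned int idx)
{
	mib->bus_reads++;
	return (uint64_t)port * 1000u + idx;
}

/*
 * Refresh at most 'budget' counters, then return so that other bus users
 * can run before the next poll. Returns the number of counters refreshed.
 */
unsigned int demo_mib_poll(struct demo_mib *mib, unsigned int budget)
{
	const unsigned int total = DEMO_NUM_PORTS * DEMO_NUM_COUNTERS;
	unsigned int done;

	for (done = 0; done < budget && done < total; done++) {
		unsigned int port = mib->cursor / DEMO_NUM_COUNTERS;
		unsigned int idx = mib->cursor % DEMO_NUM_COUNTERS;

		mib->cache[port][idx] = demo_slow_read(mib, port, idx);
		mib->cursor = (mib->cursor + 1) % total;
	}
	return done;
}
```

Note that this only bounds the work done by the statistics path; it does
nothing about other callers issuing their own bursts of accesses, which is
the concern raised in the quoted text.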

Tangential: net-next is currently open, so now would be a good time to
send a revised version of your patch series to target possibly 4.15 with
an initial implementation. Please fix the cover-letter and patch
threading such that they look like the following:

[PATCH 0/X]
   [PATCH 1/X]
   [PATCH 2/X]
   etc..

Right now this shows up as separate emails/patches and this is very
annoying to follow as a thread.

Thank you
-- 
Florian

^ permalink raw reply

* Re: [PATCH 2/4] ravb: Add optional PHY reset during system resume
From: Geert Uytterhoeven @ 2017-09-28 18:45 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Geert Uytterhoeven, David S . Miller, Simon Horman, Magnus Damm,
	Sergei Shtylyov, Andrew Lunn, Niklas Söderlund,
	netdev@vger.kernel.org, Linux-Renesas, devicetree@vger.kernel.org
In-Reply-To: <406f7aff-e386-31f3-39d3-17523443c265@gmail.com>

Hi Florian,

On Thu, Sep 28, 2017 at 7:22 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> On 09/28/2017 08:53 AM, Geert Uytterhoeven wrote:
>> If the optional "reset-gpios" property is specified in DT, the generic
>> MDIO bus code takes care of resetting the PHY during device probe.
>> However, the PHY may still have to be reset explicitly after system
>> resume.
>>
>> This allows to restore Ethernet operation after resume from s2ram on
>> Salvator-XS, where the enable pin of the regulator providing PHY power
>> is connected to PRESETn, and PSCI suspend powers down the SoC.
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> ---
>>  drivers/net/ethernet/renesas/ravb_main.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>> index fdf30bfa403bf416..96d1d48e302f8c9a 100644
>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>> @@ -19,6 +19,7 @@
>>  #include <linux/etherdevice.h>
>>  #include <linux/ethtool.h>
>>  #include <linux/if_vlan.h>
>> +#include <linux/gpio/consumer.h>
>>  #include <linux/kernel.h>
>>  #include <linux/list.h>
>>  #include <linux/module.h>
>> @@ -2268,6 +2269,7 @@ static int __maybe_unused ravb_resume(struct device *dev)
>>  {
>>       struct net_device *ndev = dev_get_drvdata(dev);
>>       struct ravb_private *priv = netdev_priv(ndev);
>> +     struct mii_bus *bus = priv->mii_bus;
>>       int ret = 0;
>>
>>       if (priv->wol_enabled) {
>> @@ -2302,6 +2304,13 @@ static int __maybe_unused ravb_resume(struct device *dev)
>>        * reopen device if it was running before system suspended.
>>        */
>>
>> +     /* PHY reset */
>> +     if (bus->reset_gpiod) {
>> +             gpiod_set_value_cansleep(bus->reset_gpiod, 1);
>> +             udelay(bus->reset_delay_us);
>> +             gpiod_set_value_cansleep(bus->reset_gpiod, 0);
>> +     }
>
> This is a clever hack, but unfortunately this is also misusing the MDIO
> bus reset line into a PHY reset line. As commented in patch 3, if this
> reset line is tied to the PHY, then this should be a PHY property and

OK.

> you cannot (ab)use the MDIO bus GPIO reset logic anymore...

And then I should add reset-gpios support to drivers/net/phy/micrel.c?
Or is there already generic code to handle per-PHY reset? I couldn't find it.

> Should you not also try to manage this reset line during ravb_open() to
> achieve better power savings?

I don't know. The Micrel KSZ9031RNXVA datasheet doesn't mention if it's
safe or not to assert reset for a prolonged time.

Thanks!

Gr{oetje,eeting}s,

                        Geert

^ permalink raw reply

* Re: [PATCH RFC 3/5] Add KSZ8795 switch driver
From: Pavel Machek @ 2017-09-28 18:40 UTC (permalink / raw)
  To: Tristram.Ha
  Cc: andrew, muvarov, nathan.leigh.conrad, vivien.didelot, f.fainelli,
	netdev, linux-kernel, Woojung.Huh
In-Reply-To: <93AF473E2DA327428DE3D46B72B1E9FD41124D5A@CHN-SV-EXMX02.mchp-main.com>

[-- Attachment #1: Type: text/plain, Size: 1419 bytes --]

Hi!

On Mon 2017-09-18 20:27:13, Tristram.Ha@microchip.com wrote:
> > > +/**
> > > + * Some counters do not need to be read too often because they are less
> > likely
> > > + * to increase much.
> > > + */
> > 
> > What does comment mean? Are you caching statistics, and updating
> > different values at different rates?
> > 
> 
> There are 34 counters.  Normally, reading them over generic bus I/O or PCI
> is very quick, but the switch is mostly accessed over SPI, or even I2C.  As
> SPI access is very slow and cannot run in interrupt context, I keep worrying
> that reading the MIB counters in a loop for 5 or more ports will prevent
> other critical hardware accesses from executing soon enough.  Such accesses
> include getting 1588 PTP timestamps and opening/closing ports.  (The RSTP
> Conformance Test sends test traffic to a port that is supposed to be
> closed/opened after receiving a specific RSTP BPDU.)

Hmm. Ok, interesting.

I wonder how well this is going to work if userspace actively 'does
something' with the switch.

It seems to me that even if your statistics code is careful not to do
'a lot' of accesses at the same time, userspace can use other parts of
the driver to do the same, and thus cause the same unwanted effects...
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply

* [PATCH V4] r8152:  add Linksys USB3GIGV1 id
From: Grant Grundler @ 2017-09-28 18:35 UTC (permalink / raw)
  To: Hayes Wang, Oliver Neukum
  Cc: linux-usb, David S . Miller, LKML, netdev, Grant Grundler

This Linksys dongle by default comes up in cdc_ether mode.
This patch allows r8152 to claim the device:
   Bus 002 Device 002: ID 13b1:0041 Linksys

Signed-off-by: Grant Grundler <grundler@chromium.org>
---
 drivers/net/usb/cdc_ether.c | 10 ++++++++++
 drivers/net/usb/r8152.c     |  2 ++
 2 files changed, 12 insertions(+)

V4: use IS_ENABLED() to check whether CONFIG_USB_RTL8152 is m or y.
    (Verified by adding #error to the new code and trying to compile.
     Thanks Doug for the tip!)
    Add LINKSYS vendor #define in same order for both drivers.

V3: for backwards compat, add #ifdef CONFIG_USB_RTL8152 around
    the cdc_ether blacklist entry so the cdc_ether driver can
    still claim the device if r8152 driver isn't configured.

V2: add LINKSYS_VENDOR_ID to cdc_ether blacklist
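
Background on the V3 to V4 change: with CONFIG_USB_RTL8152=m, Kconfig defines
CONFIG_USB_RTL8152_MODULE rather than CONFIG_USB_RTL8152, so a plain #ifdef
misses the modular case. The standalone sketch below illustrates the
difference in userspace. The DEMO_* macros are a simplified re-creation of
the preprocessor trick, not the kernel header itself, and the CONFIG_DEMO_*
symbols are made up:

```c
/* Simplified stand-in for the kernel's IS_ENABLED(); DEMO_* is hypothetical. */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND(ignored, val, ...) val
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED_(x)
#define DEMO_IS_DEFINED_(val) DEMO_IS_DEFINED__(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED__(arg_or_junk) DEMO_SECOND(arg_or_junk 1, 0, ~)
#define DEMO_IS_ENABLED(option) \
	(DEMO_IS_DEFINED(option) || DEMO_IS_DEFINED(option##_MODULE))

/* What a generated config header would contain for =y and =m options. */
#define CONFIG_DEMO_BUILTIN 1         /* CONFIG_DEMO_BUILTIN=y */
#define CONFIG_DEMO_MODULAR_MODULE 1  /* CONFIG_DEMO_MODULAR=m */
/* CONFIG_DEMO_OFF is not set, so nothing is defined for it. */

/* Plain #ifdef only looks at the bare symbol and misses the =m case. */
int demo_modular_seen_by_ifdef(void)
{
#ifdef CONFIG_DEMO_MODULAR
	return 1;
#else
	return 0;
#endif
}

/* The IS_ENABLED-style check is true for both =y and =m. */
int demo_modular_seen_by_is_enabled(void)
{
#if DEMO_IS_ENABLED(CONFIG_DEMO_MODULAR)
	return 1;
#else
	return 0;
#endif
}

int demo_builtin_seen_by_is_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_BUILTIN);
}

int demo_off_seen_by_is_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_OFF);
}
```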



diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
index 8ab281b478f2..677a85360db1 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -547,6 +547,7 @@ static const struct driver_info wwan_info = {
 #define REALTEK_VENDOR_ID	0x0bda
 #define SAMSUNG_VENDOR_ID	0x04e8
 #define LENOVO_VENDOR_ID	0x17ef
+#define LINKSYS_VENDOR_ID	0x13b1
 #define NVIDIA_VENDOR_ID	0x0955
 #define HP_VENDOR_ID		0x03f0
 #define MICROSOFT_VENDOR_ID	0x045e
@@ -737,6 +738,15 @@ static const struct usb_device_id	products[] = {
 	.driver_info = 0,
 },
 
+#if IS_ENABLED(CONFIG_USB_RTL8152)
+/* Linksys USB3GIGV1 Ethernet Adapter */
+{
+	USB_DEVICE_AND_INTERFACE_INFO(LINKSYS_VENDOR_ID, 0x0041, USB_CLASS_COMM,
+			USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+	.driver_info = 0,
+},
+#endif
+
 /* ThinkPad USB-C Dock (based on Realtek RTL8153) */
 {
 	USB_DEVICE_AND_INTERFACE_INFO(LENOVO_VENDOR_ID, 0x3062, USB_CLASS_COMM,
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index ceb78e2ea4f0..941ece08ba78 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -613,6 +613,7 @@ enum rtl8152_flags {
 #define VENDOR_ID_MICROSOFT		0x045e
 #define VENDOR_ID_SAMSUNG		0x04e8
 #define VENDOR_ID_LENOVO		0x17ef
+#define VENDOR_ID_LINKSYS		0x13b1
 #define VENDOR_ID_NVIDIA		0x0955
 
 #define MCU_TYPE_PLA			0x0100
@@ -5316,6 +5317,7 @@ static const struct usb_device_id rtl8152_table[] = {
 	{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x7205)},
 	{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x720c)},
 	{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO,  0x7214)},
+	{REALTEK_USB_DEVICE(VENDOR_ID_LINKSYS, 0x0041)},
 	{REALTEK_USB_DEVICE(VENDOR_ID_NVIDIA,  0x09ff)},
 	{}
 };
-- 
2.14.2.822.g60be5d43e6-goog

^ permalink raw reply related

* Re: [PATCH net-next v9] openvswitch: enable NSH support
From: Pravin Shelar @ 2017-09-28 18:28 UTC (permalink / raw)
  To: Yang, Yi
  Cc: Jiri Benc, netdev@vger.kernel.org, dev@openvswitch.org, e@erig.me,
	davem@davemloft.net, Jan Scheurich
In-Reply-To: <20170927013908.GA33716@localhost.localdomain>

On Tue, Sep 26, 2017 at 6:39 PM, Yang, Yi <yi.y.yang@intel.com> wrote:
> On Tue, Sep 26, 2017 at 06:49:14PM +0800, Jiri Benc wrote:
>> On Tue, 26 Sep 2017 12:55:39 +0800, Yang, Yi wrote:
>> > After push_nsh, the packet won't be recirculated to flow pipeline, so
>> > key->eth.type must be set explicitly here, but for pop_nsh, the packet
>> > will be recirculated to flow pipeline, it will be reparsed, so
>> > key->eth.type will be set in packet parse function, we needn't handle it
>> > in pop_nsh.
>>
>> This seems to be a very different approach than what we currently have.
>> Looking at the code, the requirement after "destructive" actions such
>> as pushing or popping headers is to recirculate.
>
> This is an optimization proposed by Jan Scheurich: recirculating after
> push_nsh will impact performance, while recirculating after pop_nsh is
> unavoidable. So also cc jan.scheurich@ericsson.com.
>
> Actually all the keys before push_nsh are still there after push_nsh,
> push_nsh has updated all the nsh keys, so recirculating remains avoidable.
>


We should keep the existing model for this patch. Later you can submit
an optimization patch with specific use cases and performance
improvements, so that we can evaluate code complexity and benefits.

>>
>> Setting key->eth.type to satisfy conditions in the output path without
>> updating the rest of the key looks very hacky and fragile to me. There
>> might be other conditions and dependencies that are not obvious.
>> I don't think the code was written with such code path in mind.
>>
>> I'd like to hear what Pravin thinks about this.
>>
>>  Jiri

^ permalink raw reply

* [PATCH net-next] Revert "net: dsa: bcm_sf2: Defer port enabling to calling port_enable"
From: Florian Fainelli @ 2017-09-28 18:19 UTC (permalink / raw)
  To: netdev; +Cc: Florian Fainelli, Andrew Lunn, Vivien Didelot, open list

This reverts commit e85ec74ace29 ("net: dsa: bcm_sf2: Defer port
enabling to calling port_enable") because it now makes an unbind
followed by a bind fail to connect to the integrated PHY.

What this patch missed is that we need the PHY to be enabled with
bcm_sf2_gphy_enable_set() before probing it on the MDIO bus. This is
correctly done in the ops->setup() function, but by the time
ops->port_enable() runs, this is too late. Upon unbind we would power
down the PHY, and so when we would bind again, the PHY would be left
powered off.

Fixes: e85ec74ace29 ("net: dsa: bcm_sf2: Defer port enabling to calling port_enable")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/bcm_sf2.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 898d5642b516..7aecc98d0a18 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -754,11 +754,14 @@ static int bcm_sf2_sw_setup(struct dsa_switch *ds)
 	struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
 	unsigned int port;
 
-	/* Disable unused ports and configure IMP port */
+	/* Enable all valid ports and disable those unused */
 	for (port = 0; port < priv->hw_params.num_ports; port++) {
-		if (dsa_is_cpu_port(ds, port))
+		/* IMP port receives special treatment */
+		if ((1 << port) & ds->enabled_port_mask)
+			bcm_sf2_port_setup(ds, port, NULL);
+		else if (dsa_is_cpu_port(ds, port))
 			bcm_sf2_imp_setup(ds, port);
-		else if (!((1 << port) & ds->enabled_port_mask))
+		else
 			bcm_sf2_port_disable(ds, port, NULL);
 	}
 
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH] Add a driver for Renesas uPD60620 and uPD60620A PHYs
From: Bernd Edlinger @ 2017-09-28 18:12 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev@vger.kernel.org, Florian Fainelli
In-Reply-To: <20170922175918.GD3470@lunn.ch>

On 09/22/17 19:59, Andrew Lunn wrote:
> On Fri, Sep 22, 2017 at 05:08:45PM +0000, Bernd Edlinger wrote:
>>
>> +config RENESAS_PHY
>> +	tristate "Driver for Renesas PHYs"
>> +	---help---
>> +	  Supports the uPD60620 and uPD60620A PHYs.
>> +
> 
> Hi Bernd
> 
> Please call this "Renesas PHYs" and place it in alphabetical order.
> 

Done.

>> +
>> +/* Extended Registers and values */
>> +/* PHY Special Control/Status    */
>> +#define PHY_PHYSCR         0x1F      /* PHY.31 */
>> +#define PHY_PHYSCR_10MB    0x0004    /* PHY speed = 10mb */
>> +#define PHY_PHYSCR_100MB   0x0008    /* PHY speed = 100mb */
>> +#define PHY_PHYSCR_DUPLEX  0x0010    /* PHY Duplex */
>> +#define PHY_PHYSCR_RSVD5   0x0020    /* Reserved Bit 5 */
>> +#define PHY_PHYSCR_MIIMOD  0x0040    /* Enable 4B5B MII mode */
> 
> Are any of these comments actually useful? It seems like the defines
> are pretty obvious.
> 
>> +#define PHY_PHYSCR_RSVD7   0x0080    /* Reserved Bit 7 */
>> +#define PHY_PHYSCR_RSVD8   0x0100    /* Reserved Bit 8 */
>> +#define PHY_PHYSCR_RSVD9   0x0200    /* Reserved Bit 9 */
>> +#define PHY_PHYSCR_RSVD10  0x0400    /* Reserved Bit 10 */
>> +#define PHY_PHYSCR_RSVD11  0x0800    /* Reserved Bit 11 */
>> +#define PHY_PHYSCR_ANDONE  0x1000    /* Auto negotiation done */
>> +#define PHY_PHYSCR_RSVD13  0x2000    /* Reserved Bit 13 */
>> +#define PHY_PHYSCR_RSVD14  0x4000    /* Reserved Bit 14 */
>> +#define PHY_PHYSCR_RSVD15  0x8000    /* Reserved Bit 15 */
> 
> It looks like the only register you use is SCR and SPM. Maybe delete
> all the rest? Or do you plan to add more features making use of these
> registers?
> 

No, I removed all unused defines for now.

>> +	phydev->link = 0;
>> +	phydev->lp_advertising = 0;
>> +	phydev->pause = 0;
>> +	phydev->asym_pause = 0;
>> +
>> +	if (phy_state & BMSR_ANEGCOMPLETE) {
> 
> It is worth comparing this against genphy_read_status() which is the
> reference implementation. You would normally check if auto negotiation
> is enabled, not if it has completed. If it is enabled you read the
> current negotiated state, even if it is not completed.
> 

Do you suggest that there are cases where auto negotiation does not
reach completion, and still provides a usable link status?

I have tried to connect to link partners with fixed configuration
but even then the auto negotiation always completes normally.
 

>> +		phy_state = phy_read(phydev, PHY_PHYSCR);
>> +		if (phy_state < 0)
>> +			return phy_state;
>> +
>> +		if (phy_state & (PHY_PHYSCR_10MB | PHY_PHYSCR_100MB)) {
>> +			phydev->link = 1;
>> +			phydev->speed = SPEED_10;
>> +			phydev->duplex = DUPLEX_HALF;
>> +
>> +			if (phy_state & PHY_PHYSCR_100MB)
>> +				phydev->speed = SPEED_100;
>> +			if (phy_state & PHY_PHYSCR_DUPLEX)
>> +				phydev->duplex = DUPLEX_FULL;
>> +
>> +			phy_state = phy_read(phydev, MII_LPA);
>> +			if (phy_state < 0)
>> +				return phy_state;
>> +
>> +			phydev->lp_advertising
>> +				= mii_lpa_to_ethtool_lpa_t(phy_state);
>> +
>> +			if (phydev->duplex == DUPLEX_FULL) {
>> +				if (phy_state & LPA_PAUSE_CAP)
>> +					phydev->pause = 1;
>> +				if (phy_state & LPA_PAUSE_ASYM)
>> +					phydev->asym_pause = 1;
>> +			}
>> +		}
>> +	} else if (phy_state & BMSR_LSTATUS) {
> 
> The else clause is then for a fixed configuration. Since all you are
> looking at is BMCR, you can probably just cut/paste from
> genphy_read_status().
> 

I think I can fold the fixed speed case into the auto negotiation case:
The PHYSCR always has the correct values for fixed settings.
I was initially unsure if I should look at it while autonegotiation is
not complete, but as you pointed out, that is the generally accepted
practice.
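
A minimal user-space sketch of that folded decode, assuming the PHYSCR
speed and duplex bits are valid for both negotiated and fixed settings.
The bit values come from the defines in the patch; struct link_state and
decode_physcr() are hypothetical names used only for illustration and
are not part of the driver:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values taken from the PHY_PHYSCR defines in the patch. */
#define PHY_PHYSCR_10MB    0x0004
#define PHY_PHYSCR_100MB   0x0008
#define PHY_PHYSCR_DUPLEX  0x0010

/* Hypothetical stand-in for the few phy_device fields involved. */
struct link_state {
	bool link;
	int speed;          /* Mbit/s, 0 when no speed bit is set */
	bool full_duplex;
};

/* Decode one PHYSCR value; the same path serves autoneg and fixed modes. */
void decode_physcr(uint16_t physcr, struct link_state *st)
{
	assert(st != NULL);

	st->link = false;
	st->speed = 0;
	st->full_duplex = false;

	/* Neither speed bit set: report no link, as the patch does. */
	if (!(physcr & (PHY_PHYSCR_10MB | PHY_PHYSCR_100MB)))
		return;

	st->link = true;
	st->speed = (physcr & PHY_PHYSCR_100MB) ? 100 : 10;
	st->full_duplex = (physcr & PHY_PHYSCR_DUPLEX) != 0;
}
```

In the driver itself the result would go into phydev->link, phydev->speed
and phydev->duplex, followed by the MII_LPA read for the pause bits.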


Thanks
Bernd.


From 2e101aed8466b314251972d1eaccfb43cf177078 Mon Sep 17 00:00:00 2001
From: Bernd Edlinger <bernd.edlinger@hotmail.de>
Date: Thu, 21 Sep 2017 15:46:16 +0200
Subject: [PATCH 2/5] Add a driver for Renesas uPD60620 and uPD60620A PHYs.

Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
---
 drivers/net/phy/Kconfig    |   5 +++
 drivers/net/phy/Makefile   |   1 +
 drivers/net/phy/uPD60620.c | 109 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 115 insertions(+)
 create mode 100644 drivers/net/phy/uPD60620.c

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index a9d16a3..f67943b 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -366,6 +366,11 @@ config REALTEK_PHY
 	---help---
 	  Supports the Realtek 821x PHY.
 
+config RENESAS_PHY
+	tristate "Driver for Renesas PHYs"
+	---help---
+	  Supports the Renesas PHYs uPD60620 and uPD60620A.
+
 config ROCKCHIP_PHY
         tristate "Driver for Rockchip Ethernet PHYs"
         ---help---
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 416df92..1404ad3 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -72,6 +72,7 @@ obj-$(CONFIG_MICROSEMI_PHY)	+= mscc.o
 obj-$(CONFIG_NATIONAL_PHY)	+= national.o
 obj-$(CONFIG_QSEMI_PHY)		+= qsemi.o
 obj-$(CONFIG_REALTEK_PHY)	+= realtek.o
+obj-$(CONFIG_RENESAS_PHY)	+= uPD60620.o
 obj-$(CONFIG_ROCKCHIP_PHY)	+= rockchip.o
 obj-$(CONFIG_SMSC_PHY)		+= smsc.o
 obj-$(CONFIG_STE10XP)		+= ste10Xp.o
diff --git a/drivers/net/phy/uPD60620.c b/drivers/net/phy/uPD60620.c
new file mode 100644
index 0000000..96b3347
--- /dev/null
+++ b/drivers/net/phy/uPD60620.c
@@ -0,0 +1,109 @@
+/*
+ * Driver for the Renesas PHY uPD60620.
+ *
+ * Copyright (C) 2015 Softing Industrial Automation GmbH
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/phy.h>
+
+#define UPD60620_PHY_ID    0xb8242824
+
+/* Extended Registers and values */
+/* PHY Special Control/Status    */
+#define PHY_PHYSCR         0x1F      /* PHY.31 */
+#define PHY_PHYSCR_10MB    0x0004    /* PHY speed = 10mb */
+#define PHY_PHYSCR_100MB   0x0008    /* PHY speed = 100mb */
+#define PHY_PHYSCR_DUPLEX  0x0010    /* PHY Duplex */
+
+/* PHY Special Modes */
+#define PHY_SPM            0x12      /* PHY.18 */
+
+/* Init PHY */
+
+static int upd60620_config_init(struct phy_device *phydev)
+{
+	/* Enable support for passive HUBs (could be a strap option) */
+	/* PHYMODE: All speeds, HD in parallel detect */
+	return phy_write(phydev, PHY_SPM, 0x0180 | phydev->mdio.addr);
+}
+
+/* Get PHY status from common registers */
+
+static int upd60620_read_status(struct phy_device *phydev)
+{
+	int phy_state;
+
+	/* Read negotiated state */
+	phy_state = phy_read(phydev, MII_BMSR);
+	if (phy_state < 0)
+		return phy_state;
+
+	phydev->link = 0;
+	phydev->lp_advertising = 0;
+	phydev->pause = 0;
+	phydev->asym_pause = 0;
+
+	if (phy_state & (BMSR_ANEGCOMPLETE | BMSR_LSTATUS)) {
+		phy_state = phy_read(phydev, PHY_PHYSCR);
+		if (phy_state < 0)
+			return phy_state;
+
+		if (phy_state & (PHY_PHYSCR_10MB | PHY_PHYSCR_100MB)) {
+			phydev->link = 1;
+			phydev->speed = SPEED_10;
+			phydev->duplex = DUPLEX_HALF;
+
+			if (phy_state & PHY_PHYSCR_100MB)
+				phydev->speed = SPEED_100;
+			if (phy_state & PHY_PHYSCR_DUPLEX)
+				phydev->duplex = DUPLEX_FULL;
+
+			phy_state = phy_read(phydev, MII_LPA);
+			if (phy_state < 0)
+				return phy_state;
+
+			phydev->lp_advertising
+				= mii_lpa_to_ethtool_lpa_t(phy_state);
+
+			if (phydev->duplex == DUPLEX_FULL) {
+				if (phy_state & LPA_PAUSE_CAP)
+					phydev->pause = 1;
+				if (phy_state & LPA_PAUSE_ASYM)
+					phydev->asym_pause = 1;
+			}
+		}
+	}
+	return 0;
+}
+
+MODULE_DESCRIPTION("Renesas uPD60620 PHY driver");
+MODULE_AUTHOR("Bernd Edlinger <bernd.edlinger@hotmail.de>");
+MODULE_LICENSE("GPL");
+
+static struct phy_driver upd60620_driver[1] = { {
+	.phy_id         = UPD60620_PHY_ID,
+	.phy_id_mask    = 0xfffffffe,
+	.name           = "Renesas uPD60620",
+	.features       = PHY_BASIC_FEATURES,
+	.flags          = 0,
+	.config_init    = upd60620_config_init,
+	.config_aneg    = genphy_config_aneg,
+	.read_status    = upd60620_read_status,
+} };
+
+module_phy_driver(upd60620_driver);
+
+static struct mdio_device_id __maybe_unused upd60620_tbl[] = {
+	{ UPD60620_PHY_ID, 0xfffffffe },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(mdio, upd60620_tbl);
-- 
2.7.4

^ permalink raw reply related

* [PATCH v2] lib: fix multiple strlcpy definition
From: Baruch Siach @ 2017-09-28 18:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Phil Sutter, Baruch Siach

Some C libraries, like uClibc and musl, provide BSD compatible
strlcpy(). Add check_strlcpy() to configure, and avoid defining strlcpy
and strlcat when the C library provides them.

This fixes the following static link error with uClibc-ng:

.../sysroot/usr/lib/libc.a(strlcpy.os): In function `strlcpy':
strlcpy.c:(.text+0x0): multiple definition of `strlcpy'
../lib/libutil.a(utils.o):utils.c:(.text+0x1ddc): first defined here
collect2: error: ld returned 1 exit status

Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
---
v2: Fix the order of strlcpy parameters
---
 configure    | 24 ++++++++++++++++++++++++
 lib/Makefile |  4 ++++
 lib/utils.c  |  2 ++
 3 files changed, 30 insertions(+)

diff --git a/configure b/configure
index 7be8fb113cc9..e0982f34a992 100755
--- a/configure
+++ b/configure
@@ -326,6 +326,27 @@ EOF
     rm -f $TMPDIR/dbtest.c $TMPDIR/dbtest
 }
 
+check_strlcpy()
+{
+    cat >$TMPDIR/strtest.c <<EOF
+#include <string.h>
+int main(int argc, char **argv) {
+	char dst[10];
+	strlcpy(dst, "test", sizeof(dst));
+	return 0;
+}
+EOF
+    $CC -I$INCLUDE -o $TMPDIR/strtest $TMPDIR/strtest.c >/dev/null 2>&1
+    if [ $? -eq 0 ]
+    then
+	echo "no"
+    else
+	echo "NEED_STRLCPY:=y" >>$CONFIG
+	echo "yes"
+    fi
+    rm -f $TMPDIR/strtest.c $TMPDIR/strtest
+}
+
 quiet_config()
 {
 	cat <<EOF
@@ -397,6 +418,9 @@ check_mnl
 echo -n "Berkeley DB: "
 check_berkeley_db
 
+echo -n "need for strlcpy: "
+check_strlcpy
+
 echo
 echo -n "docs:"
 check_docs
diff --git a/lib/Makefile b/lib/Makefile
index 0fbdf4c31f50..132ad00c3335 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -1,5 +1,9 @@
 include ../config.mk
 
+ifeq ($(NEED_STRLCPY),y)
+	CFLAGS += -DNEED_STRLCPY
+endif
+
 CFLAGS += -fPIC
 
 UTILOBJ = utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o \
diff --git a/lib/utils.c b/lib/utils.c
index bbd3cbc46a0e..240e7426a810 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -1231,6 +1231,7 @@ int get_real_family(int rtm_type, int rtm_family)
 	return rtm_family;
 }
 
+#ifdef NEED_STRLCPY
 size_t strlcpy(char *dst, const char *src, size_t size)
 {
 	size_t srclen = strlen(src);
@@ -1253,3 +1254,4 @@ size_t strlcat(char *dst, const char *src, size_t size)
 
 	return dlen + strlcpy(dst + dlen, src, size - dlen);
 }
+#endif
-- 
2.14.2
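
For reference, a minimal sketch of what the guarded fallback functions
are expected to do, assuming BSD semantics for the common case (copy at
most size - 1 bytes, always NUL-terminate when size is non-zero, return
the length that was attempted). The sketch_ prefix is hypothetical and
only avoids clashing with a C library that already provides strlcpy();
this is not a copy of the code in lib/utils.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst, truncating to size - 1 bytes; returns strlen(src). */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t srclen = strlen(src);

	assert(dst != NULL || size == 0);

	if (size) {
		size_t n = srclen < size - 1 ? srclen : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}

	return srclen;
}

/* Append src to dst within size bytes; returns the untruncated length. */
size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
	size_t dlen = strlen(dst);

	/* dst already fills the buffer: nothing can be appended. */
	if (dlen >= size)
		return dlen + strlen(src);

	return dlen + sketch_strlcpy(dst + dlen, src, size - dlen);
}
```

A return value greater than or equal to size tells the caller that the
result was truncated.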

^ permalink raw reply related

* Re: [PATCH net-next RFC 3/9] net: dsa: mv88e6xxx: add support for GPIO configuration
From: Andrew Lunn @ 2017-09-28 18:01 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Brandon Streiff, netdev, linux-kernel, David S. Miller,
	Vivien Didelot, Richard Cochran, Erik Hons
In-Reply-To: <659c4254-d0b7-52dc-dd9b-3921cd2f20c0@gmail.com>

On Thu, Sep 28, 2017 at 10:45:03AM -0700, Florian Fainelli wrote:
> On 09/28/2017 08:25 AM, Brandon Streiff wrote:
> > The Scratch/Misc register is a windowed interface that provides access
> > to the GPIO configuration. Provide a new method for configuration of
> > GPIO functions.
> > 
> > Signed-off-by: Brandon Streiff <brandon.streiff@ni.com>
> > ---
> 
> > +/* Offset 0x1A: Scratch and Misc. Register */
> > +static int mv88e6xxx_g2_scratch_reg_read(struct mv88e6xxx_chip *chip,
> > +					 int reg, u8 *data)
> > +{
> > +	int err;
> > +	u16 value;
> > +
> > +	err = mv88e6xxx_g2_write(chip, MV88E6XXX_G2_SCRATCH_MISC_MISC,
> > +				 reg << 8);
> > +	if (err)
> > +		return err;
> > +
> > +	err = mv88e6xxx_g2_read(chip, MV88E6XXX_G2_SCRATCH_MISC_MISC, &value);
> > +	if (err)
> > +		return err;
> > +
> > +	*data = (value & MV88E6XXX_G2_SCRATCH_MISC_DATA_MASK);
> > +
> > +	return 0;
> > +}
> 
> With the write and read acquiring and then releasing the lock
> immediately, is there not room for this sequence to be interrupted in the
> middle and end up returning inconsistent reads?

Hi Florian

The general pattern in this code is that the lock chip->reg_lock is
taken at a higher level. That protects against other threads. The
driver tends to do that at the highest levels, at the entry points
into the driver. I've not yet checked that this code follows the pattern.
However, we have a check in the low level to ensure the lock has
been taken. So it seems likely the lock is held.
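
A rough user-space model of that pattern, to make the shape concrete.
Everything here is hypothetical: the model_ names, the boolean standing
in for chip->reg_lock, the 0x00ff data mask, and the choice to return
-EPERM when the lock is not held are assumptions for illustration, not
what the driver does:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCRATCH_DATA_MASK 0x00ff    /* assumed width of the data field */

/* Hypothetical model of the chip; not the driver's structure. */
struct model_chip {
	bool reg_lock_held;         /* stands in for chip->reg_lock */
	uint16_t scratch_misc;      /* the single windowed register */
	uint8_t scratch[128];       /* bytes reachable through the window */
};

void model_lock(struct model_chip *chip)   { chip->reg_lock_held = true; }
void model_unlock(struct model_chip *chip) { chip->reg_lock_held = false; }

/* Low-level accessors check that the caller already holds the lock. */
int model_g2_write(struct model_chip *chip, uint16_t val)
{
	if (!chip->reg_lock_held)
		return -EPERM;
	chip->scratch_misc = val;
	return 0;
}

int model_g2_read(struct model_chip *chip, uint16_t *val)
{
	uint8_t idx;

	if (!chip->reg_lock_held)
		return -EPERM;
	idx = (chip->scratch_misc >> 8) & 0x7f;
	*val = (uint16_t)((idx << 8) | chip->scratch[idx]);
	return 0;
}

/* Select the index, then read the data back; takes no lock itself. */
int model_scratch_read(struct model_chip *chip, int reg, uint8_t *data)
{
	uint16_t value;
	int err;

	err = model_g2_write(chip, (uint16_t)(reg << 8));
	if (err)
		return err;

	err = model_g2_read(chip, &value);
	if (err)
		return err;

	*data = value & SCRATCH_DATA_MASK;
	return 0;
}

/* Entry point: the lock is held across the whole write/read sequence. */
int model_get_scratch(struct model_chip *chip, int reg, uint8_t *data)
{
	int err;

	assert(chip != NULL && data != NULL);

	model_lock(chip);
	err = model_scratch_read(chip, reg, data);
	model_unlock(chip);

	return err;
}
```

Because the lock spans both accesses, no other caller can move the
window between the index write and the data read.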
 
> Would there be any value in implementing a proper gpiochip structure
> here such that other pieces of SW can see this GPIO controller as a
> provider and you can reference it from e.g: Device Tree using GPIO
> descriptors?

That would be my preference as well, or maybe a pinctrl driver.

     Andrew

^ permalink raw reply

* Re: [patch net-next 3/7] ipv4: ipmr: Don't forward packets already forwarded by hardware
From: Florian Fainelli @ 2017-09-28 17:56 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, yotamg, idosch, mlxsw, nikolay, andrew, dsa, edumazet,
	willemb, johannes.berg, dcaratti, pabeni, daniel, fw, gfree.wind
In-Reply-To: <20170928173415.15551-4-jiri@resnulli.us>

On 09/28/2017 10:34 AM, Jiri Pirko wrote:
> From: Yotam Gigi <yotamg@mellanox.com>
> 
> Change the ipmr module to not forward packets if:
>  - The packet is marked with the offload_mr_fwd_mark, and
>  - Both input interface and output interface share the same parent ID.
> 
> This way, a packet can go through partial multicast forwarding in the
> hardware, where it will be forwarded only to the devices that share the
> same parent ID (AKA, reside inside the same hardware). The kernel will
> forward the packet to all other interfaces.
> 
> To do this, add the ipmr_offload_forward helper, which per skb, ingress VIF
> and egress VIF, returns whether the forwarding was offloaded to hardware.
> The ipmr_queue_xmit frees the skb and does not forward it if the result is
> a true value.
> 
> All the forwarding path code compiles out when the CONFIG_NET_SWITCHDEV is
> not set.
> 
> Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
> Reviewed-by: Ido Schimmel <idosch@mellanox.com>
> Signed-off-by: Jiri Pirko <jiri@mellanox.com>
> ---
>  net/ipv4/ipmr.c | 37 ++++++++++++++++++++++++++++++++-----
>  1 file changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
> index 4566c54..deba569 100644
> --- a/net/ipv4/ipmr.c
> +++ b/net/ipv4/ipmr.c
> @@ -1857,10 +1857,33 @@ static inline int ipmr_forward_finish(struct net *net, struct sock *sk,
>  	return dst_output(net, sk, skb);
>  }
>  
> +#ifdef CONFIG_NET_SWITCHDEV
> +static bool ipmr_forward_offloaded(struct sk_buff *skb, struct mr_table *mrt,
> +				   int in_vifi, int out_vifi)
> +{
> +	struct vif_device *out_vif = &mrt->vif_table[out_vifi];
> +	struct vif_device *in_vif = &mrt->vif_table[in_vifi];

Nit: in_vifi and out_vifi may be better named as in_vif_idx and
out_vif_idx, oh well you are just replicating the existing naming
conventions used down below, never mind then.
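
For readers following along, the rule from the commit message (skip
software forwarding only when the packet carries the mark and both VIFs
share a parent ID) can be sketched in user space as below. The model_
types and the fixed-size ID buffer are hypothetical stand-ins, not the
kernel structures, and this is not the body of ipmr_forward_offloaded():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_ID_LEN 32

/* Hypothetical stand-in for a switch parent ID. */
struct model_parent_id {
	unsigned char id[MODEL_ID_LEN];
	size_t len;                 /* 0 means "no parent ID" */
};

/* Hypothetical stand-in for the parts of a VIF that matter here. */
struct model_vif {
	struct model_parent_id parent;
};

/* Two VIFs sit in the same hardware when their parent IDs match. */
bool model_same_parent(const struct model_vif *a, const struct model_vif *b)
{
	return a->parent.len != 0 &&
	       a->parent.len == b->parent.len &&
	       memcmp(a->parent.id, b->parent.id, a->parent.len) == 0;
}

/* True when hardware is assumed to have forwarded in_vif -> out_vif. */
bool model_forward_offloaded(bool offload_mr_fwd_mark,
			     const struct model_vif *in_vif,
			     const struct model_vif *out_vif)
{
	assert(in_vif != NULL && out_vif != NULL);

	/* Unmarked packets always take the software path. */
	if (!offload_mr_fwd_mark)
		return false;

	return model_same_parent(in_vif, out_vif);
}
```

A marked packet leaving through a VIF with a different parent ID still
has to be forwarded by the kernel, which is the partial offload case.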
-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next RFC 0/9] net: dsa: PTP timestamping for mv88e6xxx
From: Florian Fainelli @ 2017-09-28 17:51 UTC (permalink / raw)
  To: Andrew Lunn, Brandon Streiff
  Cc: netdev, linux-kernel, David S. Miller, Vivien Didelot,
	Richard Cochran, Erik Hons
In-Reply-To: <20170928173629.GD14940@lunn.ch>

On 09/28/2017 10:36 AM, Andrew Lunn wrote:
>> - Patch #3: The GPIO config support is handled in a very simple manner.
>>   I suspect a longer term goal would be to use pinctrl here.
> 
> I assume ptp already has the core code to use pinctrl and Linux
> standard GPIOs? What does the device tree binding look like? How do
> you specify the GPIOs to use?
> 
> What we want to avoid is defining an ABI now, otherwise it is going to
> be hard to swap to pinctrl later.
> 
>> - Patch #6: the dsa_switch pointer and port index is plumbed from
>>   dsa_device_ops::rcv so that we can call the correct port_rxtstamp
>>   method. This involved instrumenting all of the *_tag_rcv functions in
>>   a way that's kind of a kludge and that I'm not terribly happy with.
> 
> Yes, this is ugly. I will see if i can find a better way to do
> this. 

See my reply in patch 6, I may be missing something, but once
dst->rcv() has been called, skb->dev points to the slave network device
which already contains the switch port and switch information in
dsa_slave_priv, so that should lift the need for asking the individual
taggers' rcv() callback to tell us about it.
-- 
Florian

^ permalink raw reply

* Re: [patch net-next 1/7] skbuff: Add the offload_mr_fwd_mark field
From: Andrew Lunn @ 2017-09-28 17:49 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, yotamg, idosch, mlxsw, nikolay, dsa, edumazet,
	willemb, johannes.berg, dcaratti, pabeni, daniel, f.fainelli, fw,
	gfree.wind
In-Reply-To: <20170928173415.15551-2-jiri@resnulli.us>

On Thu, Sep 28, 2017 at 07:34:09PM +0200, Jiri Pirko wrote:
> From: Yotam Gigi <yotamg@mellanox.com>
> 
> Similarly to the offload_fwd_mark field, the offload_mr_fwd_mark field is
> used to allow partial offloading of MFC multicast routes.

> The reason why the already existing "offload_fwd_mark" bit cannot be used
> is that a switchdev driver would want to make the distinction between a
> packet that has already gone through L2 forwarding but did not go through
> multicast forwarding, and a packet that has already gone through both L2
> and multicast forwarding.

Hi Jiri

So we are talking about l2 vs l3. So why not call this
offload_l3_fwd_mark?

Is there anything really specific to multicast here?

   Thanks
      Andrew

^ permalink raw reply

* Re: [PATCH] net-ipv6: remove unused IP6_ECN_clear() function
From: David Miller @ 2017-09-28 17:48 UTC (permalink / raw)
  To: zenczykowski; +Cc: maze, netdev
In-Reply-To: <20170927033722.89146-1-zenczykowski@gmail.com>

From: Maciej Żenczykowski <zenczykowski@gmail.com>
Date: Tue, 26 Sep 2017 20:37:22 -0700

> From: Maciej Żenczykowski <maze@google.com>
> 
> This function is unused, and furthermore it is buggy since it suffers
> from the same issue that requires IP6_ECN_set_ce() to take a pointer
> to the skb so that it may (in case of CHECKSUM_COMPLETE) update skb->csum
> 
> Instead of fixing it, let's just outright remove it.
> 
> Tested: builds, and 'git grep IP6_ECN_clear' comes up empty
> 
> Signed-off-by: Maciej Żenczykowski <maze@google.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH v5 4/4] ipv4: Namespaceify tcp_fastopen_blackhole_timeout knob
From: David Miller @ 2017-09-28 17:48 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, edumazet, weiwan, lucab, netdev, linux-kernel
In-Reply-To: <1506483343-11544-4-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 27 Sep 2017 11:35:43 +0800

> Applications in different namespaces might require different time periods
> (in seconds) for disabling Fastopen on active TCP sockets.
> 
> Tested:
> Simulate a situation in which the server's data gets dropped
> after 3WHS.
> C ---- syn-data ---> S
> C <--- syn/ack ----- S
> C ---- ack --------> S
> S (accept & write)
> C?  X <- data ------ S
> 	[retry and timeout]
> 
> And then print netstat of TCPFastOpenBlackhole, the counter increased as
> expected when the firewall blackhole issue is detected and active TFO is
> disabled.
> # cat /proc/net/netstat | awk '{print $91}'
> TCPFastOpenBlackhole
> 1
> 
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

^ permalink raw reply

* Re: [PATCH v5 3/4] ipv4: Namespaceify tcp_fastopen_key knob
From: David Miller @ 2017-09-28 17:47 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, edumazet, weiwan, lucab, netdev, linux-kernel
In-Reply-To: <1506483343-11544-3-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 27 Sep 2017 11:35:42 +0800

> Applications in different namespaces might require a different
> tcp_fastopen_key, independently of the host.
> 
> David Miller pointed out there is a leak without releasing the context
> of tcp_fastopen_key during netns teardown. So add the release action in
> exit_batch path.
> 
> Tested:
> 1. Container namespace:
> # cat /proc/sys/net/ipv4/tcp_fastopen_key:
> 2817fff2-f803cf97-eadfd1f3-78c0992b
> 
> cookie key in tcp syn packets:
> Fast Open Cookie
>     Kind: TCP Fast Open Cookie (34)
>     Length: 10
>     Fast Open Cookie: 1e5dd82a8c492ca9
> 
> 2. Host:
> # cat /proc/sys/net/ipv4/tcp_fastopen_key:
> 107d7c5f-68eb2ac7-02fb06e6-ed341702
> 
> cookie key in tcp syn packets:
> Fast Open Cookie
>     Kind: TCP Fast Open Cookie (34)
>     Length: 10
>     Fast Open Cookie: e213c02bf0afbc8a
> 
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

^ permalink raw reply

* Re: [PATCH v5 2/4] ipv4: Remove the 'publish' logic in tcp_fastopen_init_key_once
From: David Miller @ 2017-09-28 17:47 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, edumazet, weiwan, lucab, netdev, linux-kernel
In-Reply-To: <1506483343-11544-2-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 27 Sep 2017 11:35:41 +0800

> The 'publish' logic is not necessary after commit dfea2aa65424 ("tcp:
> Do not call tcp_fastopen_reset_cipher from interrupt context"), because
> in tcp_fastopen_cookie_gen, it wouldn't call tcp_fastopen_init_key_once.
> 
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

^ permalink raw reply

* Re: [PATCH v5 1/4] ipv4: Namespaceify tcp_fastopen knob
From: David Miller @ 2017-09-28 17:47 UTC (permalink / raw)
  To: yanhaishuang; +Cc: kuznet, edumazet, weiwan, lucab, netdev, linux-kernel
In-Reply-To: <1506483343-11544-1-git-send-email-yanhaishuang@cmss.chinamobile.com>

From: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Date: Wed, 27 Sep 2017 11:35:40 +0800

> Applications in different namespaces might need to enable the TCP Fast Open
> feature independently of the host.
> 
> This patch series continues making more of the TCP Fast Open related
> sysctl knobs be per net-namespace.
> 
> Reported-by: Luca BRUNO <lucab@debian.org>
> Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>

Applied.

^ permalink raw reply
