Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next v9 3/5] dt-bindings: net: Add RGMII internal delay for DP83869
From: Dan Murphy @ 2020-06-19 16:18 UTC (permalink / raw)
  To: andrew, f.fainelli, hkallweit1, davem, robh
  Cc: netdev, linux-kernel, devicetree, Dan Murphy
In-Reply-To: <20200619161813.2716-1-dmurphy@ti.com>

Add the internal delay values into the header and update the binding
with the internal delay properties.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
---
 .../devicetree/bindings/net/ti,dp83869.yaml      | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/net/ti,dp83869.yaml b/Documentation/devicetree/bindings/net/ti,dp83869.yaml
index 5b69ef03bbf7..71e90a3e4652 100644
--- a/Documentation/devicetree/bindings/net/ti,dp83869.yaml
+++ b/Documentation/devicetree/bindings/net/ti,dp83869.yaml
@@ -8,7 +8,7 @@ $schema: "http://devicetree.org/meta-schemas/core.yaml#"
 title: TI DP83869 ethernet PHY
 
 allOf:
-  - $ref: "ethernet-controller.yaml#"
+  - $ref: "ethernet-phy.yaml#"
 
 maintainers:
   - Dan Murphy <dmurphy@ti.com>
@@ -64,6 +64,18 @@ properties:
        Operational mode for the PHY.  If this is not set then the operational
        mode is set by the straps. see dt-bindings/net/ti-dp83869.h for values
 
+  rx-internal-delay-ps:
+    description: Delay is in pico seconds
+    enum: [ 250, 500, 750, 1000, 1250, 1500, 1750, 2000, 2250, 2500, 2750, 3000,
+            3250, 3500, 3750, 4000 ]
+    default: 2000
+
+  tx-internal-delay-ps:
+    description: Delay is in pico seconds
+    enum: [ 250, 500, 750, 1000, 1250, 1500, 1750, 2000, 2250, 2500, 2750, 3000,
+            3250, 3500, 3750, 4000 ]
+    default: 2000
+
 required:
   - reg
 
@@ -80,5 +92,7 @@ examples:
         ti,op-mode = <DP83869_RGMII_COPPER_ETHERNET>;
         ti,max-output-impedance = "true";
         ti,clk-output-sel = <DP83869_CLK_O_SEL_CHN_A_RCLK>;
+        rx-internal-delay-ps = <2000>;
+        tx-internal-delay-ps = <2000>;
       };
     };
-- 
2.26.2


^ permalink raw reply related

* [PATCH net-next v9 2/5] net: phy: Add a helper to return the index for of the internal delay
From: Dan Murphy @ 2020-06-19 16:18 UTC (permalink / raw)
  To: andrew, f.fainelli, hkallweit1, davem, robh
  Cc: netdev, linux-kernel, devicetree, Dan Murphy
In-Reply-To: <20200619161813.2716-1-dmurphy@ti.com>

Add a helper function that will return the index in the array for the
passed in internal delay value.  The helper requires the array, size and
delay value.

The helper will then return the index for the exact match or return the
index for the index to the closest smaller value.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
---
 drivers/net/phy/phy_device.c | 100 +++++++++++++++++++++++++++++++++++
 include/linux/phy.h          |   4 ++
 2 files changed, 104 insertions(+)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 04946de74fa0..0bb08bc0370c 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -31,6 +31,7 @@
 #include <linux/mdio.h>
 #include <linux/io.h>
 #include <linux/uaccess.h>
+#include <linux/property.h>
 
 MODULE_DESCRIPTION("PHY library");
 MODULE_AUTHOR("Andy Fleming");
@@ -2657,6 +2658,105 @@ void phy_get_pause(struct phy_device *phydev, bool *tx_pause, bool *rx_pause)
 }
 EXPORT_SYMBOL(phy_get_pause);
 
+#if IS_ENABLED(CONFIG_OF_MDIO)
+static int phy_get_int_delay_property(struct device *dev, const char *name)
+{
+	s32 int_delay;
+	int ret;
+
+	ret = device_property_read_u32(dev, name, &int_delay);
+	if (ret)
+		return ret;
+
+	return int_delay;
+}
+#else
+static inline int phy_get_int_delay_property(struct device *dev,
+					     const char *name)
+{
+	return -EINVAL;
+}
+#endif
+
+/**
+ * phy_get_delay_index - returns the index of the internal delay
+ * @phydev: phy_device struct
+ * @dev: pointer to the devices device struct
+ * @delay_values: array of delays the PHY supports
+ * @size: the size of the delay array
+ * @is_rx: boolean to indicate to get the rx internal delay
+ *
+ * Returns the index within the array of internal delay passed in.
+ * If the device property is not present then the interface type is checked
+ * if the interface defines use of internal delay then a 1 is returned otherwise
+ * a 0 is returned.
+ * The array must be in ascending order. If PHY does not have an ascending order
+ * array then size = 0 and the value of the delay property is returned.
+ * Return -EINVAL if the delay is invalid or cannot be found.
+ */
+s32 phy_get_internal_delay(struct phy_device *phydev, struct device *dev,
+			   const int *delay_values, int size, bool is_rx)
+{
+	int i;
+	s32 delay;
+
+	if (is_rx) {
+		delay = phy_get_int_delay_property(dev, "rx-internal-delay-ps");
+		if (delay < 0 && size == 0) {
+			if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID ||
+			    phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID)
+				return 1;
+			else
+				return 0;
+		}
+
+	} else {
+		delay = phy_get_int_delay_property(dev, "tx-internal-delay-ps");
+		if (delay < 0 && size == 0) {
+			if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID ||
+			    phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
+				return 1;
+			else
+				return 0;
+		}
+	}
+
+	if (delay < 0)
+		return delay;
+
+	if (delay && size == 0)
+		return delay;
+
+	if (delay < delay_values[0] || delay > delay_values[size - 1]) {
+		phydev_err(phydev, "Delay %d is out of range\n", delay);
+		return -EINVAL;
+	}
+
+	if (delay == delay_values[0])
+		return 0;
+
+	for (i = 1; i < size; i++) {
+		if (delay == delay_values[i])
+			return i;
+
+		/* Find an approximate index by looking up the table */
+		if (delay > delay_values[i - 1] &&
+		    delay < delay_values[i]) {
+			if (delay - delay_values[i - 1] <
+			    delay_values[i] - delay)
+				return i - 1;
+			else
+				return i;
+		}
+	}
+
+	phydev_err(phydev, "error finding internal delay index for %d\n",
+		   delay);
+
+	return -EINVAL;
+}
+EXPORT_SYMBOL(phy_get_internal_delay);
+
 static bool phy_drv_supports_irq(struct phy_driver *phydrv)
 {
 	return phydrv->config_intr && phydrv->ack_interrupt;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 8c05d0fb5c00..917bfd422e06 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1430,6 +1430,10 @@ void phy_set_asym_pause(struct phy_device *phydev, bool rx, bool tx);
 bool phy_validate_pause(struct phy_device *phydev,
 			struct ethtool_pauseparam *pp);
 void phy_get_pause(struct phy_device *phydev, bool *tx_pause, bool *rx_pause);
+
+s32 phy_get_internal_delay(struct phy_device *phydev, struct device *dev,
+			   const int *delay_values, int size, bool is_rx);
+
 void phy_resolve_pause(unsigned long *local_adv, unsigned long *partner_adv,
 		       bool *tx_pause, bool *rx_pause);
 
-- 
2.26.2


^ permalink raw reply related

* [PATCH net-next v9 4/5] net: dp83869: Add RGMII internal delay configuration
From: Dan Murphy @ 2020-06-19 16:18 UTC (permalink / raw)
  To: andrew, f.fainelli, hkallweit1, davem, robh
  Cc: netdev, linux-kernel, devicetree, Dan Murphy
In-Reply-To: <20200619161813.2716-1-dmurphy@ti.com>

Add RGMII internal delay configuration for Rx and Tx.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
---
 drivers/net/phy/dp83869.c | 53 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/dp83869.c b/drivers/net/phy/dp83869.c
index 53ed3abc26c9..58103152c601 100644
--- a/drivers/net/phy/dp83869.c
+++ b/drivers/net/phy/dp83869.c
@@ -64,6 +64,10 @@
 #define DP83869_RGMII_TX_CLK_DELAY_EN		BIT(1)
 #define DP83869_RGMII_RX_CLK_DELAY_EN		BIT(0)
 
+/* RGMIIDCTL */
+#define DP83869_RGMII_CLK_DELAY_SHIFT		4
+#define DP83869_CLK_DELAY_DEF			7
+
 /* STRAP_STS1 bits */
 #define DP83869_STRAP_OP_MODE_MASK		GENMASK(2, 0)
 #define DP83869_STRAP_STS1_RESERVED		BIT(11)
@@ -78,9 +82,6 @@
 #define DP83869_PHYCR_FIFO_DEPTH_MASK	GENMASK(15, 12)
 #define DP83869_PHYCR_RESERVED_MASK	BIT(11)
 
-/* RGMIIDCTL bits */
-#define DP83869_RGMII_TX_CLK_DELAY_SHIFT	4
-
 /* IO_MUX_CFG bits */
 #define DP83869_IO_MUX_CFG_IO_IMPEDANCE_CTRL	0x1f
 
@@ -108,6 +109,8 @@ enum {
 struct dp83869_private {
 	int tx_fifo_depth;
 	int rx_fifo_depth;
+	s32 rx_int_delay;
+	s32 tx_int_delay;
 	int io_impedance;
 	int port_mirroring;
 	bool rxctrl_strap_quirk;
@@ -177,11 +180,16 @@ static int dp83869_set_strapped_mode(struct phy_device *phydev)
 }
 
 #if IS_ENABLED(CONFIG_OF_MDIO)
+static const int dp83869_internal_delay[] = {250, 500, 750, 1000, 1250, 1500,
+					     1750, 2000, 2250, 2500, 2750, 3000,
+					     3250, 3500, 3750, 4000};
+
 static int dp83869_of_init(struct phy_device *phydev)
 {
 	struct dp83869_private *dp83869 = phydev->priv;
 	struct device *dev = &phydev->mdio.dev;
 	struct device_node *of_node = dev->of_node;
+	int delay_size = ARRAY_SIZE(dp83869_internal_delay);
 	int ret;
 
 	if (!of_node)
@@ -235,6 +243,20 @@ static int dp83869_of_init(struct phy_device *phydev)
 				 &dp83869->tx_fifo_depth))
 		dp83869->tx_fifo_depth = DP83869_PHYCR_FIFO_DEPTH_4_B_NIB;
 
+	dp83869->rx_int_delay = phy_get_internal_delay(phydev, dev,
+						       &dp83869_internal_delay[0],
+						       delay_size, true);
+	if (dp83869->rx_int_delay < 0)
+		dp83869->rx_int_delay =
+				dp83869_internal_delay[DP83869_CLK_DELAY_DEF];
+
+	dp83869->tx_int_delay = phy_get_internal_delay(phydev, dev,
+						       &dp83869_internal_delay[0],
+						       delay_size, false);
+	if (dp83869->tx_int_delay < 0)
+		dp83869->tx_int_delay =
+				dp83869_internal_delay[DP83869_CLK_DELAY_DEF];
+
 	return ret;
 }
 #else
@@ -397,6 +419,31 @@ static int dp83869_config_init(struct phy_device *phydev)
 				     dp83869->clk_output_sel <<
 				     DP83869_IO_MUX_CFG_CLK_O_SEL_SHIFT);
 
+	if (phy_interface_is_rgmii(phydev)) {
+		ret = phy_write_mmd(phydev, DP83869_DEVADDR, DP83869_RGMIIDCTL,
+				    dp83869->rx_int_delay |
+			dp83869->tx_int_delay << DP83869_RGMII_CLK_DELAY_SHIFT);
+		if (ret)
+			return ret;
+
+		val = phy_read_mmd(phydev, DP83869_DEVADDR, DP83869_RGMIICTL);
+		val &= ~(DP83869_RGMII_TX_CLK_DELAY_EN |
+			 DP83869_RGMII_RX_CLK_DELAY_EN);
+
+		if (phydev->interface == PHY_INTERFACE_MODE_RGMII_ID)
+			val |= (DP83869_RGMII_TX_CLK_DELAY_EN |
+				DP83869_RGMII_RX_CLK_DELAY_EN);
+
+		if (phydev->interface == PHY_INTERFACE_MODE_RGMII_TXID)
+			val |= DP83869_RGMII_TX_CLK_DELAY_EN;
+
+		if (phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID)
+			val |= DP83869_RGMII_RX_CLK_DELAY_EN;
+
+		ret = phy_write_mmd(phydev, DP83869_DEVADDR, DP83869_RGMIICTL,
+				    val);
+	}
+
 	return ret;
 }
 
-- 
2.26.2


^ permalink raw reply related

* [PATCH net-next v9 5/5] net: phy: DP83822: Add setting the fixed internal delay
From: Dan Murphy @ 2020-06-19 16:18 UTC (permalink / raw)
  To: andrew, f.fainelli, hkallweit1, davem, robh
  Cc: netdev, linux-kernel, devicetree, Dan Murphy
In-Reply-To: <20200619161813.2716-1-dmurphy@ti.com>

The DP83822 can be configured to use the RGMII interface. There are
independent fixed 3.5ns clock shift (aka internal delay) for the TX and RX
paths. This allow either one to be set if the MII interface is RGMII and
the value is set in the firmware node.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
---
 drivers/net/phy/dp83822.c | 79 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 69 insertions(+), 10 deletions(-)

diff --git a/drivers/net/phy/dp83822.c b/drivers/net/phy/dp83822.c
index 1dd19d0cb269..37643c468e19 100644
--- a/drivers/net/phy/dp83822.c
+++ b/drivers/net/phy/dp83822.c
@@ -26,7 +26,9 @@
 #define MII_DP83822_PHYSCR	0x11
 #define MII_DP83822_MISR1	0x12
 #define MII_DP83822_MISR2	0x13
+#define MII_DP83822_RCSR	0x17
 #define MII_DP83822_RESET_CTRL	0x1f
+#define MII_DP83822_GENCFG	0x465
 
 #define DP83822_HW_RESET	BIT(15)
 #define DP83822_SW_RESET	BIT(14)
@@ -77,6 +79,10 @@
 #define DP83822_WOL_INDICATION_SEL BIT(8)
 #define DP83822_WOL_CLR_INDICATION BIT(11)
 
+/* RSCR bits */
+#define DP83822_RX_CLK_SHIFT	BIT(12)
+#define DP83822_TX_CLK_SHIFT	BIT(11)
+
 static int dp83822_ack_interrupt(struct phy_device *phydev)
 {
 	int err;
@@ -255,7 +261,7 @@ static int dp83822_config_intr(struct phy_device *phydev)
 	return phy_write(phydev, MII_DP83822_PHYSCR, physcr_status);
 }
 
-static int dp83822_config_init(struct phy_device *phydev)
+static int dp8382x_disable_wol(struct phy_device *phydev)
 {
 	int value = DP83822_WOL_EN | DP83822_WOL_MAGIC_EN |
 		    DP83822_WOL_SECURE_ON;
@@ -264,6 +270,46 @@ static int dp83822_config_init(struct phy_device *phydev)
 				  MII_DP83822_WOL_CFG, value);
 }
 
+static int dp83822_config_init(struct phy_device *phydev)
+{
+	struct device *dev = &phydev->mdio.dev;
+	int rgmii_delay;
+	s32 rx_int_delay;
+	s32 tx_int_delay;
+	int err = 0;
+
+	if (phy_interface_is_rgmii(phydev)) {
+		rx_int_delay = phy_get_internal_delay(phydev, dev, NULL, 0,
+						      true);
+
+		if (rx_int_delay <= 0)
+			rgmii_delay = 0;
+		else
+			rgmii_delay = DP83822_RX_CLK_SHIFT;
+
+		tx_int_delay = phy_get_internal_delay(phydev, dev, NULL, 0,
+						      false);
+		if (tx_int_delay <= 0)
+			rgmii_delay &= ~DP83822_TX_CLK_SHIFT;
+		else
+			rgmii_delay |= DP83822_TX_CLK_SHIFT;
+
+		if (rgmii_delay) {
+			err = phy_set_bits_mmd(phydev, DP83822_DEVADDR,
+					       MII_DP83822_RCSR, rgmii_delay);
+			if (err)
+				return err;
+		}
+	}
+
+	return dp8382x_disable_wol(phydev);
+}
+
+static int dp8382x_config_init(struct phy_device *phydev)
+{
+	return dp8382x_disable_wol(phydev);
+}
+
 static int dp83822_phy_reset(struct phy_device *phydev)
 {
 	int err;
@@ -272,9 +318,7 @@ static int dp83822_phy_reset(struct phy_device *phydev)
 	if (err < 0)
 		return err;
 
-	dp83822_config_init(phydev);
-
-	return 0;
+	return phydev->drv->config_init(phydev);
 }
 
 static int dp83822_suspend(struct phy_device *phydev)
@@ -318,14 +362,29 @@ static int dp83822_resume(struct phy_device *phydev)
 		.resume = dp83822_resume,			\
 	}
 
+#define DP8382X_PHY_DRIVER(_id, _name)				\
+	{							\
+		PHY_ID_MATCH_MODEL(_id),			\
+		.name		= (_name),			\
+		/* PHY_BASIC_FEATURES */			\
+		.soft_reset	= dp83822_phy_reset,		\
+		.config_init	= dp8382x_config_init,		\
+		.get_wol = dp83822_get_wol,			\
+		.set_wol = dp83822_set_wol,			\
+		.ack_interrupt = dp83822_ack_interrupt,		\
+		.config_intr = dp83822_config_intr,		\
+		.suspend = dp83822_suspend,			\
+		.resume = dp83822_resume,			\
+	}
+
 static struct phy_driver dp83822_driver[] = {
 	DP83822_PHY_DRIVER(DP83822_PHY_ID, "TI DP83822"),
-	DP83822_PHY_DRIVER(DP83825I_PHY_ID, "TI DP83825I"),
-	DP83822_PHY_DRIVER(DP83826C_PHY_ID, "TI DP83826C"),
-	DP83822_PHY_DRIVER(DP83826NC_PHY_ID, "TI DP83826NC"),
-	DP83822_PHY_DRIVER(DP83825S_PHY_ID, "TI DP83825S"),
-	DP83822_PHY_DRIVER(DP83825CM_PHY_ID, "TI DP83825M"),
-	DP83822_PHY_DRIVER(DP83825CS_PHY_ID, "TI DP83825CS"),
+	DP8382X_PHY_DRIVER(DP83825I_PHY_ID, "TI DP83825I"),
+	DP8382X_PHY_DRIVER(DP83826C_PHY_ID, "TI DP83826C"),
+	DP8382X_PHY_DRIVER(DP83826NC_PHY_ID, "TI DP83826NC"),
+	DP8382X_PHY_DRIVER(DP83825S_PHY_ID, "TI DP83825S"),
+	DP8382X_PHY_DRIVER(DP83825CM_PHY_ID, "TI DP83825M"),
+	DP8382X_PHY_DRIVER(DP83825CS_PHY_ID, "TI DP83825CS"),
 };
 module_phy_driver(dp83822_driver);
 
-- 
2.26.2


^ permalink raw reply related

* [PATCH net-next v9 1/5] dt-bindings: net: Add tx and rx internal delays
From: Dan Murphy @ 2020-06-19 16:18 UTC (permalink / raw)
  To: andrew, f.fainelli, hkallweit1, davem, robh
  Cc: netdev, linux-kernel, devicetree, Dan Murphy
In-Reply-To: <20200619161813.2716-1-dmurphy@ti.com>

tx-internal-delays and rx-internal-delays are a common setting for RGMII
capable devices.

These properties are used when the phy-mode or phy-controller is set to
rgmii-id, rgmii-rxid or rgmii-txid.  These modes indicate to the
controller that the PHY will add the internal delay for the connection.

Signed-off-by: Dan Murphy <dmurphy@ti.com>
---
 .../devicetree/bindings/net/ethernet-phy.yaml       | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/ethernet-phy.yaml b/Documentation/devicetree/bindings/net/ethernet-phy.yaml
index 9b1f1147ca36..7d8265eb49d6 100644
--- a/Documentation/devicetree/bindings/net/ethernet-phy.yaml
+++ b/Documentation/devicetree/bindings/net/ethernet-phy.yaml
@@ -162,6 +162,19 @@ properties:
     description:
       Specifies a reference to a node representing a SFP cage.
 
+
+  rx-internal-delay-ps:
+    description: |
+      RGMII Receive PHY Clock Delay defined in pico seconds.  This is used for
+      PHY's that have configurable RX internal delays.  If this property is
+      present then the PHY applies the RX delay.
+
+  tx-internal-delay-ps:
+    description: |
+      RGMII Transmit PHY Clock Delay defined in pico seconds.  This is used for
+      PHY's that have configurable TX internal delays. If this property is
+      present then the PHY applies the TX delay.
+
 required:
   - reg
 
-- 
2.26.2


^ permalink raw reply related

* Re: [PATCH v1 2/3] net/fsl: acpize xgmac_mdio
From: Russell King - ARM Linux admin @ 2020-06-19 15:02 UTC (permalink / raw)
  To: Calvin Johnson
  Cc: Jeremy Linton, Jon, Cristi Sovaiala, Ioana Ciornei, Andrew Lunn,
	Andy Shevchenko, Florian Fainelli, Madalin Bucur, netdev,
	linux.cj
In-Reply-To: <20200619145927.GA12973@lsv03152.swis.in-blr01.nxp.com>

On Fri, Jun 19, 2020 at 08:29:27PM +0530, Calvin Johnson wrote:
> On Wed, Jun 17, 2020 at 06:49:31PM +0100, Russell King - ARM Linux admin wrote:
> > On Wed, Jun 17, 2020 at 10:45:34PM +0530, Calvin Johnson wrote:
> > > From: Jeremy Linton <jeremy.linton@arm.com>
> > > 
> > > Add ACPI support for xgmac MDIO bus registration while maintaining
> > > the existing DT support.
> > > 
> > > The function mdiobus_register() inside of_mdiobus_register(), brings
> > > up all the PHYs on the mdio bus and attach them to the bus.
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > Signed-off-by: Calvin Johnson <calvin.johnson@oss.nxp.com>
> > > ---
> > > 
> > >  drivers/net/ethernet/freescale/xgmac_mdio.c | 27 +++++++++++++--------
> > >  1 file changed, 17 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
> > > index c82c85ef5fb3..fb7f8caff643 100644
> > > --- a/drivers/net/ethernet/freescale/xgmac_mdio.c
> > > +++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
> > > @@ -245,14 +245,14 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
> > >  {
> > >  	struct device_node *np = pdev->dev.of_node;
> > >  	struct mii_bus *bus;
> > > -	struct resource res;
> > > +	struct resource *res;
> > >  	struct mdio_fsl_priv *priv;
> > >  	int ret;
> > >  
> > > -	ret = of_address_to_resource(np, 0, &res);
> > > -	if (ret) {
> > > +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> > > +	if (!res) {
> > >  		dev_err(&pdev->dev, "could not obtain address\n");
> > > -		return ret;
> > > +		return -EINVAL;
> > >  	}
> > 
> > I think, as you're completely rewriting the resource handling, it would
> > be a good idea to switch over to using devm_* stuff here.
> > 
> > 	void __iomem *regs;
> > 
> > 	regs = devm_platform_ioremap_resource(pdev, 0);
> 
> I had used devm_ API earlier in this place and ran into a regression.
> This mdio driver is used by both DPAA-1 and DPAA-2. In DPAA2 case, this
> works fine.
> 
> But in DPAA-1 case, the existing device tree describes the memory map in a
> hierarchical manner. The FMan of DPAA-1 area include the MDIO, Port, MAC areas.
> Therefore, we may have to continue with existing method.

And you need to document that in comments in the driver, otherwise you
will have people come along and try to "clean up" the driver.

You can still use devm_ioremap() though.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply

* Re: [PATCH 1/2] net: qrtr: Migrate nameservice to kernel from userspace
From: Qais Yousef @ 2020-06-19 15:03 UTC (permalink / raw)
  To: Manivannan Sadhasivam; +Cc: davem, kuba, bjorn.andersson, netdev, linux-kernel
In-Reply-To: <9184F012-1FDC-4F6B-8B3E-5D2B87F5DACA@linaro.org>

On 06/19/20 20:16, Manivannan Sadhasivam wrote:
> Hi, 
> 
> On 17 June 2020 12:18:06 AM IST, Qais Yousef <qais.yousef@arm.com> wrote:
> >Hi Manivannan, David
> >
> >On 02/13/20 14:44, Manivannan Sadhasivam wrote:
> >
> >[...]
> >
> >> +	trace_printk("advertising new server [%d:%x]@[%d:%d]\n",
> >> +		     srv->service, srv->instance, srv->node, srv->port);
> >
> >I can't tell exactly from the discussion whether this is the version
> >that got
> >merged into 5.7 or not, but it does match the commit message.
> >
> 
> This got merged and there was a follow up patch to replace trace_printk() with tracepoints got merged as well. 

Cool. Thanks!

--
Qais Yousef

^ permalink raw reply

* Re: [PATCH net-next 3/3] cls_flower: Allow flow offloading though masked key exist.
From: Vlad Buslov @ 2020-06-19 16:15 UTC (permalink / raw)
  To: dsatish
  Cc: davem, jhs, xiyou.wangcong, jiri, kuba, netdev, simon.horman,
	kesavac, prathibha.nagooru, intiyaz.basha, jai.rana
In-Reply-To: <20200619094156.31184-4-satish.d@oneconvergence.com>


On Fri 19 Jun 2020 at 12:41, dsatish <satish.d@oneconvergence.com> wrote:
> A packet reaches OVS user space, only if, either there is no rule in
> datapath/hardware or there is race condition that the flow is added
> to hardware but before it is processed another packet arrives.
>
> It is possible hardware as part of its limitations/optimizations
> remove certain flows. To handle such cases where the hardware lost
> the flows, tc can offload to hardware if it receives a flow for which
> there exists an entry in its flow table. To handle such cases TC when
> it returns EEXIST error, also programs the flow in hardware, if
> hardware offload is enabled.
>
> Signed-off-by: Chandra Kesava <kesavac@gmail.com>
> Signed-off-by: Prathibha Nagooru <prathibha.nagooru@oneconvergence.com>
> Signed-off-by: Satish Dhote <satish.d@oneconvergence.com>
> ---
>  net/sched/cls_flower.c | 23 +++++++++++++++++++----
>  1 file changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index f1a5352cbb04..d718233cd5b9 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -431,7 +431,8 @@ static void fl_hw_destroy_filter(struct tcf_proto *tp, struct cls_fl_filter *f,
>
>  static int fl_hw_replace_filter(struct tcf_proto *tp,
>  				struct cls_fl_filter *f, bool rtnl_held,
> -				struct netlink_ext_ack *extack)
> +				struct netlink_ext_ack *extack,
> +				unsigned long cookie)
>  {
>  	struct tcf_block *block = tp->chain->block;
>  	struct flow_cls_offload cls_flower = {};
> @@ -444,7 +445,7 @@ static int fl_hw_replace_filter(struct tcf_proto *tp,
>
>  	tc_cls_common_offload_init(&cls_flower.common, tp, f->flags, extack);
>  	cls_flower.command = FLOW_CLS_REPLACE;
> -	cls_flower.cookie = (unsigned long) f;
> +	cls_flower.cookie = cookie;
>  	cls_flower.rule->match.dissector = &f->mask->dissector;
>  	cls_flower.rule->match.mask = &f->mask->key;
>  	cls_flower.rule->match.key = &f->mkey;
> @@ -2024,11 +2025,25 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>  	fl_init_unmasked_key_dissector(&fnew->unmasked_key_dissector);
>
>  	err = fl_ht_insert_unique(fnew, fold, &in_ht);
> -	if (err)
> +	if (err) {
> +		/* It is possible Hardware lost the flow even though TC has it,
> +		 * and flow miss in hardware causes controller to offload flow again.
> +		 */
> +		if (err == -EEXIST && !tc_skip_hw(fnew->flags)) {
> +			struct cls_fl_filter *f =
> +				__fl_lookup(fnew->mask, &fnew->mkey);

You don't hold neither rcu read lock nor reference to the "f" filter
here, which means it can be concurrently destroyed at any time.

> +
> +			if (f)
> +				fl_hw_replace_filter(tp, fnew, rtnl_held,
> +						     extack,
> +						     (unsigned long)(f));
> +		}

It looks like you are inventing filter replace/overwrite functionality
here. However, such functionality is already supported. fl_change()
receives "fold" via "arg" argument, if filter with specified key exists
in classifier instance and NLM_F_EXCL netlink message flag is not set.

>  		goto errout_mask;
> +	}
>
>  	if (!tc_skip_hw(fnew->flags)) {
> -		err = fl_hw_replace_filter(tp, fnew, rtnl_held, extack);
> +		err = fl_hw_replace_filter(tp, fnew, rtnl_held, extack,
> +					   (unsigned long)fnew);
>  		if (err)
>  			goto errout_ht;
>  	}

^ permalink raw reply

* Re: [PATCH net-next 1/3] net/sched: Introduce action hash
From: Davide Caratti @ 2020-06-19 16:13 UTC (permalink / raw)
  To: Ariel Levkovich, netdev
  Cc: jiri, kuba, jhs, xiyou.wangcong, ast, daniel, Jiri Pirko,
	Davide Caratti
In-Reply-To: <20200618221548.3805-2-lariel@mellanox.com>

hello Ariel,

(I'm doing a resend because I suspect that my original reply was dropped
somewhere).

Thanks for your patch! some comments/questions below:

On Fri, 2020-06-19 at 01:15 +0300, Ariel Levkovich wrote:
> Allow setting a hash value to a packet for a future match.
> 
> The action will determine the packet's hash result according to
> the selected hash type.
> The first option is to select a basic asymmetric l4 hash calculation
> on the packet headers which will either use the skb->hash value as
> if such was already calculated and set by the device driver, or it
> will perform the kernel jenkins hash function on the packet which will
> generate the result otherwise.

If I understand correctly, this new tc action is going to change the skb
metadata based on some operation done on the packet. Linux has already a
tc module that does this job, it's act_skbedit.

Wouldn't it be possible to extend act_skbedit instead of adding a new tc
action? that would save us from some bugs we already encountered in the
past (maybe I spotted a couple of them below), and we would also leverage on
the existing tests.

> The other option is for user to provide an BPF program which is
> dedicated to calculate the hash. In such case the program is loaded
> and used by tc to perform the hash calculation and provide it to
> the hash action to be stored in skb->hash field.
> 
> The BPF option can be useful for future HW offload support of the hash
> calculation by emulating the HW hash function when it's different than
> the kernel's but yet we want to maintain consistency between the SW and
> the HW.

Like Daniel noticed, this can be done by act_bpf. Using 'jump'
or 'goto_chain' control actions, it should be possible to get to the same
result combining act_skbedit and act_bpf. WDYT?

> Usage is as follows:
> 
> $ tc filter add dev ens1f0_0 ingress \
> prio 1 chain 0 proto ip \
> flower ip_proto tcp \
> action hash bpf object-file <bpf file> \
> action goto chain 2

[...]

> diff --git a/include/net/act_api.h b/include/net/act_api.h
> index 8c3934880670..b7e5d060bd2f 100644
> --- a/include/net/act_api.h
> +++ b/include/net/act_api.h
> @@ -12,6 +12,8 @@
>  #include <net/net_namespace.h>
>  #include <net/netns/generic.h>
>  
> +#define ACT_BPF_NAME_LEN	256
> +

(BTW, line above seems to be a leftover. Correct?)

>  struct tcf_idrinfo {
>  	struct mutex	lock;
>  	struct idr	action_idr;
> 
[...]

> new file mode 100644
> index 000000000000..40a5c34f8745
> --- /dev/null
> +++ b/net/sched/act_hash.c
> @@ -0,0 +1,376 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/* -
> + * net/sched/act_hash.c  Hash calculation action
> + *
> + * Author:   Ariel Levkovich <lariel@mellanox.com>
> + */
> +
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/rtnetlink.h>
> +#include <linux/skbuff.h>
> +#include <linux/filter.h>
> +#include <net/netlink.h>
> +#include <net/pkt_sched.h>
> +#include <net/pkt_cls.h>
> +#include <linux/tc_act/tc_hash.h>
> +#include <net/tc_act/tc_hash.h>
> +
> +#define ACT_HASH_BPF_NAME_LEN	256
> +
> +static unsigned int hash_net_id;
> +static struct tc_action_ops act_hash_ops;
> +
> +static int tcf_hash_act(struct sk_buff *skb, const struct tc_action *a,
> +			struct tcf_result *res)
> +{
> +	struct tcf_hash *h = to_hash(a);
> +	struct tcf_hash_params *p;
> +	int action;
> +	u32 hash;
> +
> +	tcf_lastuse_update(&h->tcf_tm);
> +	tcf_action_update_bstats(&h->common, skb);
> +
> +	p = rcu_dereference_bh(h->hash_p);
> +
> +	action = READ_ONCE(h->tcf_action);
> +
> +	switch (p->alg) {
> +	case TCA_HASH_ALG_L4:
> +		hash = skb_get_hash(skb);
> +		/* If a hash basis was provided, add it into
> +		 * hash calculation here and re-set skb->hash
> +		 * to the new result with sw_hash indication
> +		 * and keeping the l4 hash indication.
> +		 */
> +		hash = jhash_1word(hash, p->basis);
> +		__skb_set_sw_hash(skb, hash, skb->l4_hash);

can you consider moving the above line to the data path of act_skbedit, and
extend the control plane accordingly? 

> +		break;
> +	case TCA_HASH_ALG_BPF:

here the code is assuming that the action is at tc ingress. But
theoretically we could install this action also on egress, nobody is
forbidding that.  whouldn't it be better to add proper checks (or using
directly act_bpf with appropriate control action, that already does this
job)?

> +		__skb_push(skb, skb->mac_len);
> +		bpf_compute_data_pointers(skb);
> +		hash = BPF_PROG_RUN(p->prog, skb);
> +		__skb_pull(skb, skb->mac_len);
> +		/* The BPF program hash function type is
> +		 * unknown so only the sw hash bit is set.
> +		 */
> +		__skb_set_sw_hash(skb, hash, false);
> +		break;
> +	}
> +	return action;
> +}
> +
> +static const struct nla_policy hash_policy[TCA_HASH_MAX + 1] = {
> +	[TCA_HASH_PARMS]	= { .type = NLA_EXACT_LEN, .len = sizeof(struct tc_hash) },
> +	[TCA_HASH_ALG]		= { .type = NLA_U32 },
> +	[TCA_HASH_BASIS]	= { .type = NLA_U32 },
> +	[TCA_HASH_BPF_FD]	= { .type = NLA_U32 },
> +	[TCA_HASH_BPF_NAME]	= { .type = NLA_NUL_STRING,
> +				    .len = ACT_HASH_BPF_NAME_LEN },
> +};
> +
> +static int tcf_hash_bpf_init(struct nlattr **tb, struct tcf_hash_params *params)
> +{
> +	struct bpf_prog *fp;
> +	char *name = NULL;
> +	u32 bpf_fd;
> +
> +	bpf_fd = nla_get_u32(tb[TCA_HASH_BPF_FD]);

shouldn't we check for non-NULL tb[TCA_HASH_BPF_FD] to avoid a kernel crash here?
please note, act_bpf does it:

https://elixir.bootlin.com/linux/v5.8-rc1/source/net/sched/act_bpf.c#L337

[...]

> +static int tcf_hash_init(struct net *net, struct nlattr *nla,
> +			 struct nlattr *est, struct tc_action **a,
> +			 int replace, int bind, bool rtnl_held,
> +			 struct tcf_proto *tp, u32 flags,
> +			 struct netlink_ext_ack *extack)
> +{

[...]

> +
> +	if (!tb[TCA_HASH_ALG]) {
> +		NL_SET_ERR_MSG_MOD(extack, "Missing hash algorithm selection");
> +		err = -EINVAL;
> +		goto cleanup;
> +	}
> +
> +	p->alg = nla_get_u32(tb[TCA_HASH_ALG]);

I don't understand why 'p->alg' is assigned and then validated. Wouldn't
it be better to validate it earlier, and assign only when we know it's a
good value? this would also avoid the spinlock unbalance below:

> +	spin_lock_bh(&h->tcf_lock);
> +
> +	switch (p->alg) {
> +	case TCA_HASH_ALG_L4:
> +		break;
> +	case TCA_HASH_ALG_BPF:
> +		if (res != ACT_P_CREATED) {
> +			params = rcu_dereference_protected(h->hash_p, 1);
> +			old.prog = params->prog;
> +			old.bpf_name = params->bpf_name;
> +		}
> +
> +		err = tcf_hash_bpf_init(tb, p);
> +		if (err)
> +			goto cleanup;

shouldn't we spin_unlock_bh() here?
> +
> +		break;
> +	default:
> +		NL_SET_ERR_MSG_MOD(extack, "Hash type not supported");
> +		err = -EINVAL;
> +		goto cleanup;

shouldn't we spin_unlock_bh() here?

> +	}
> +	if (tb[TCA_HASH_BASIS])
> +		p->basis = nla_get_u32(tb[TCA_HASH_BASIS]);
> +
> +	goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
> +	p = rcu_replace_pointer(h->hash_p, p,
> +				lockdep_is_held(&h->tcf_lock));
> +	spin_unlock_bh(&h->tcf_lock);
> +
> +	if (goto_ch)
> +		tcf_chain_put_by_act(goto_ch);
> +	if (p)
> +		kfree_rcu(p, rcu);
> +
> +	if (res == ACT_P_CREATED) {
> +		tcf_idr_insert(tn, *a);
> +	} else {
> +		synchronize_rcu();
> +		tcf_hash_bpf_cleanup(&old);
> +	}
> +
> +	return res;
> +
> +cleanup:
> +	if (goto_ch)
> +		tcf_chain_put_by_act(goto_ch);
> +	kfree(p);
> +
> +release_idr:
> +	tcf_idr_release(*a, bind);
> +	return err;
> +}

thank you in advance for any feedback!

-- 
davide


^ permalink raw reply

* Re: [PATCH net] ionic: tame the watchdog timer on reconfig
From: Jonathan Toppins @ 2020-06-19 16:06 UTC (permalink / raw)
  To: Shannon Nelson, netdev, davem
In-Reply-To: <20200618172904.53814-1-snelson@pensando.io>

On 6/18/20 1:29 PM, Shannon Nelson wrote:
> Even with moving netif_tx_disable() to an earlier point when
> taking down the queues for a reconfiguration, we still end
> up with the occasional netdev watchdog Tx Timeout complaint.
> The old method of using netif_trans_update() works fine for
> queue 0, but has no effect on the remaining queues.  Using
> netif_device_detach() allows us to signal to the watchdog to
> ignore us for the moment.
> 
> Fixes: beead698b173 ("ionic: Add the basic NDO callbacks for netdev support")
> Signed-off-by: Shannon Nelson <snelson@pensando.io>

Acked-by: Jonathan Toppins <jtoppins@redhat.com>


^ permalink raw reply

* [PATCH] dt-bindings: net: renesas,ether: Improve schema validation
From: Geert Uytterhoeven @ 2020-06-19 15:14 UTC (permalink / raw)
  To: Sergei Shtylyov, David S . Miller, Jakub Kicinski, Rob Herring
  Cc: netdev, linux-renesas-soc, devicetree, Geert Uytterhoeven

  - Remove pinctrl consumer properties, as they are handled by core
    dt-schema,
  - Document missing properties,
  - Document missing PHY child node,
  - Add "additionalProperties: false".

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
 .../bindings/net/renesas,ether.yaml           | 22 +++++++++++++------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/renesas,ether.yaml b/Documentation/devicetree/bindings/net/renesas,ether.yaml
index 08678af5ed9364cd..8ce5ed8a58dd76e6 100644
--- a/Documentation/devicetree/bindings/net/renesas,ether.yaml
+++ b/Documentation/devicetree/bindings/net/renesas,ether.yaml
@@ -59,9 +59,15 @@ properties:
   clocks:
     maxItems: 1
 
-  pinctrl-0: true
+  power-domains:
+    maxItems: 1
+
+  resets:
+    maxItems: 1
 
-  pinctrl-names: true
+  phy-mode: true
+
+  phy-handle: true
 
   renesas,no-ether-link:
     type: boolean
@@ -74,6 +80,11 @@ properties:
       specify when the Ether LINK signal is active-low instead of normal
       active-high
 
+patternProperties:
+  "^ethernet-phy@[0-9a-f]$":
+    type: object
+    $ref: ethernet-phy.yaml#
+
 required:
   - compatible
   - reg
@@ -83,7 +94,8 @@ required:
   - '#address-cells'
   - '#size-cells'
   - clocks
-  - pinctrl-0
+
+additionalProperties: false
 
 examples:
   # Lager board
@@ -99,8 +111,6 @@ examples:
         clocks = <&mstp8_clks R8A7790_CLK_ETHER>;
         phy-mode = "rmii";
         phy-handle = <&phy1>;
-        pinctrl-0 = <&ether_pins>;
-        pinctrl-names = "default";
         renesas,ether-link-active-low;
         #address-cells = <1>;
         #size-cells = <0>;
@@ -109,7 +119,5 @@ examples:
             reg = <1>;
             interrupt-parent = <&irqc0>;
             interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
-            pinctrl-0 = <&phy1_pins>;
-            pinctrl-names = "default";
         };
     };
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH ghak90 V8 07/16] audit: add contid support for signalling the audit daemon
From: Richard Guy Briggs @ 2020-06-19 15:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Paul Moore, nhorman, linux-api, containers, LKML, dhowells,
	linux-audit, netfilter-devel, simo, netdev, linux-fsdevel,
	Eric Paris, mpatel, Serge Hallyn
In-Reply-To: <871rol7nw3.fsf@x220.int.ebiederm.org>

On 2020-04-17 17:23, Eric W. Biederman wrote:
> Paul Moore <paul@paul-moore.com> writes:
> 
> > On Thu, Apr 16, 2020 at 4:36 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> >> Paul Moore <paul@paul-moore.com> writes:
> >> > On Mon, Mar 30, 2020 at 1:49 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> On 2020-03-30 13:34, Paul Moore wrote:
> >> >> > On Mon, Mar 30, 2020 at 12:22 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> > > On 2020-03-30 10:26, Paul Moore wrote:
> >> >> > > > On Mon, Mar 30, 2020 at 9:47 AM Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> > > > > On 2020-03-28 23:11, Paul Moore wrote:
> >> >> > > > > > On Tue, Mar 24, 2020 at 5:02 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> > > > > > > On 2020-03-23 20:16, Paul Moore wrote:
> >> >> > > > > > > > On Thu, Mar 19, 2020 at 6:03 PM Richard Guy Briggs <rgb@redhat.com> wrote:
> >> >> > > > > > > > > On 2020-03-18 18:06, Paul Moore wrote:
> >> >
> >> > ...
> >> >
> >> >> > > Well, every time a record gets generated, *any* record gets generated,
> >> >> > > we'll need to check for which audit daemons this record is in scope and
> >> >> > > generate a different one for each depending on the content and whether
> >> >> > > or not the content is influenced by the scope.
> >> >> >
> >> >> > That's the problem right there - we don't want to have to generate a
> >> >> > unique record for *each* auditd on *every* record.  That is a recipe
> >> >> > for disaster.
> >> >> >
> >> >> > Solving this for all of the known audit records is not something we
> >> >> > need to worry about in depth at the moment (although giving it some
> >> >> > casual thought is not a bad thing), but solving this for the audit
> >> >> > container ID information *is* something we need to worry about right
> >> >> > now.
> >> >>
> >> >> If you think that a different nested contid value string per daemon is
> >> >> not acceptable, then we are back to issuing a record that has only *one*
> >> >> contid listed without any nesting information.  This brings us back to
> >> >> the original problem of keeping *all* audit log history since the boot
> >> >> of the machine to be able to track the nesting of any particular contid.
> >> >
> >> > I'm not ruling anything out, except for the "let's just completely
> >> > regenerate every record for each auditd instance".
> >>
> >> Paul I am a bit confused about what you are referring to when you say
> >> regenerate every record.
> >>
> >> Are you saying that you don't want to repeat the sequence:
> >>         audit_log_start(...);
> >>         audit_log_format(...);
> >>         audit_log_end(...);
> >> for every nested audit daemon?
> >
> > If it can be avoided yes.  Audit performance is already not-awesome,
> > this would make it even worse.
> 
> As far as I can see not repeating sequences like that is fundamental
> for making this work at all.  Just because only the audit subsystem
> should know about one or multiple audit daemons.  Nothing else should
> care.
> 
> >> Or are you saying that you would like to literraly want to send the same
> >> skb to each of the nested audit daemons?
> >
> > Ideally we would reuse the generated audit messages as much as
> > possible.  Less work is better.  That's really my main concern here,
> > let's make sure we aren't going to totally tank performance when we
> > have a bunch of nested audit daemons.
> 
> So I think there are two parts of this answer.  Assuming we are talking
> about nesting audit daemons in containers we will have different
> rulesets and I expect most of the events for a nested audit daemon won't
> be of interest to the outer audit daemon.
> 
> Beyond that it should be very straight forward to keep a pointer and
> leave the buffer as a scatter gather list until audit_log_end
> and translate pids, and rewrite ACIDs attributes in audit_log_end
> when we build the final packet.  Either through collaboration with
> audit_log_format or a special audit_log command that carefully sets
> up the handful of things that need that information.
> 
> Hmm.  I am seeing that we send skbs to kauditd and then kauditd
> sends those skbs to userspace.  I presume that is primary so that
> sending messages to userspace does not block the process being audited.
> 
> Plus a little bit so that the retry logic will work.
> 
> I think the naive implementation would be to simply have 1 kauditd
> per auditd (strictly and audit context/namespace).  Although that can be
> optimized if that is a problem.
> 
> Beyond that I think we would need to look at profiles to really
> understand where the bottlenecks are.
> 
> >> Or are you thinking of something else?
> >
> > As mentioned above, I'm not thinking of anything specific, other than
> > let's please not have to regenerate *all* of the audit record strings
> > for each instance of an audit daemon, that's going to be a killer.
> >
> > Maybe we have to regenerate some, if we do, what would that look like
> > in code?  How do we handle the regeneration aspect?  I worry that is
> > going to be really ugly.
> >
> > Maybe we finally burn down the audit_log_format(...) function and pass
> > structs/TLVs to the audit subsystem and the audit subsystem generates
> > the strings in the auditd connection thread.  Some of the record
> > strings could likely be shared, others would need to be ACID/auditd
> > dependent.
> 
> I think we just a very limited amount of structs/TLVs for the cases that
> matter and one-one auditd and kauditd implementations we should still
> be able to do everything in audit_log_end.  Plus doing as much work as
> possible in audit_log_end where things are still cache hot is desirable.

So in the end, perf may show us that moving things around a bit and
knowing to which queue(s) we send an skb will help maintain performance
by writing out the field contents in audit_log_end() and sending to the
correct queue rather than deferring writing out that field contents in
the kauditd process due to cache issues.  In any case, it makes sense to
delay that formatting work until just after the daemon routing decision
is made.

> > I'm open to any ideas people may have.  We have a problem, let's solve
> > it.
> 
> It definitely makes sense to look ahead to having audit daemons running
> in containers, but in the grand scheme of things that is a nice to have.
> Probably something we will and should get to, but we have lived a long
> time without auditd running in containers so I expect we can live a
> while longer.
> 
> As I understand Richard patchset for the specific case of the ACID we
> are only talking about taking a subset of an existing string, and one
> string at that.  Not hard at all.  Especially when looking at the
> fundamental fact that we will need to send a different skb to
> userspace, for each audit daemon.
> 
> Eric

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


^ permalink raw reply

* Re: [PATCH 2/3] net: phy: marvell: Add Marvell 88E1340S support
From: Andrew Lunn @ 2020-06-19 14:59 UTC (permalink / raw)
  To: Maxim Kochetkov
  Cc: netdev, Florian Fainelli, Heiner Kallweit, Russell King,
	David S. Miller, Jakub Kicinski
In-Reply-To: <20200619084904.95432-3-fido_max@inbox.ru>

On Fri, Jun 19, 2020 at 11:49:03AM +0300, Maxim Kochetkov wrote:
> Add support for this new phy ID.
> 
> Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 1/2] net: qrtr: Migrate nameservice to kernel from userspace
From: Manivannan Sadhasivam @ 2020-06-19 14:46 UTC (permalink / raw)
  To: Qais Yousef; +Cc: davem, kuba, bjorn.andersson, netdev, linux-kernel
In-Reply-To: <20200616184805.k7eowfhdevasqite@e107158-lin.cambridge.arm.com>

Hi, 

On 17 June 2020 12:18:06 AM IST, Qais Yousef <qais.yousef@arm.com> wrote:
>Hi Manivannan, David
>
>On 02/13/20 14:44, Manivannan Sadhasivam wrote:
>
>[...]
>
>> +	trace_printk("advertising new server [%d:%x]@[%d:%d]\n",
>> +		     srv->service, srv->instance, srv->node, srv->port);
>
>I can't tell exactly from the discussion whether this is the version
>that got
>merged into 5.7 or not, but it does match the commit message.
>

This got merged and there was a follow up patch to replace trace_printk() with tracepoints got merged as well. 

Thanks, 
Mani

>This patch introduces several trace_printk() which AFAIK is intended
>for
>debugging only and shouldn't make it into mainline kernels.
>
>It causes this big message to be printed to the log too
>
>[    0.000000]
>**********************************************************
>[    0.000000] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE  
>**
>[    0.000000] **                                                     
>**
>[    0.000000] ** trace_printk() being used. Allocating extra memory. 
>**
>[    0.000000] **                                                     
>**
>[    0.000000] ** This means that this is a DEBUG kernel and it is    
>**
>[    0.000000] ** unsafe for production use.                          
>**
>[    0.000000] **                                                     
>**
>[    0.000000] ** If you see this message and you are not debugging   
>**
>[    0.000000] ** the kernel, report this immediately to your vendor! 
>**
>[    0.000000] **                                                     
>**
>[    0.000000] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE  
>**
>[    0.000000]
>**********************************************************
>
>Shouldn't this be replaced with one of pr_*() variants instead?
>
>Thanks
>
>--
>Qais Yousef

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply

* [PATCH net-next v3 1/4] xfrm: bail early on slave pass over skb
From: Jarod Wilson @ 2020-06-19 14:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jarod Wilson, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, Jeff Kirsher, Jakub Kicinski, Steffen Klassert,
	Herbert Xu, netdev, intel-wired-lan
In-Reply-To: <20200619143155.20726-1-jarod@redhat.com>

This is prep work for initial support of bonding hardware encryption
pass-through support. The bonding driver will fill in the slave_dev
pointer, and we use that to know not to skb_push() again on a given
skb that was already processed on the bond device.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 include/net/xfrm.h     |  1 +
 net/xfrm/xfrm_device.c | 34 +++++++++++++++++-----------------
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 094fe682f5d7..e20b2b27ec48 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -127,6 +127,7 @@ struct xfrm_state_walk {
 
 struct xfrm_state_offload {
 	struct net_device	*dev;
+	struct net_device	*slave_dev;
 	unsigned long		offload_handle;
 	unsigned int		num_exthdrs;
 	u8			flags;
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index f50d1f97cf8e..b8918fc5248b 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -106,6 +106,7 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
 	struct sk_buff *skb2, *nskb, *pskb = NULL;
 	netdev_features_t esp_features = features;
 	struct xfrm_offload *xo = xfrm_offload(skb);
+	struct net_device *dev = skb->dev;
 	struct sec_path *sp;
 
 	if (!xo)
@@ -119,6 +120,10 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
 	if (xo->flags & XFRM_GRO || x->xso.flags & XFRM_OFFLOAD_INBOUND)
 		return skb;
 
+	/* This skb was already validated on the master dev */
+	if ((x->xso.dev != dev) && (x->xso.slave_dev == dev))
+		return skb;
+
 	local_irq_save(flags);
 	sd = this_cpu_ptr(&softnet_data);
 	err = !skb_queue_empty(&sd->xfrm_backlog);
@@ -129,25 +134,20 @@ struct sk_buff *validate_xmit_xfrm(struct sk_buff *skb, netdev_features_t featur
 		return skb;
 	}
 
-	if (skb_is_gso(skb)) {
-		struct net_device *dev = skb->dev;
-
-		if (unlikely(x->xso.dev != dev)) {
-			struct sk_buff *segs;
+	if (skb_is_gso(skb) && unlikely(x->xso.dev != dev)) {
+		struct sk_buff *segs;
 
-			/* Packet got rerouted, fixup features and segment it. */
-			esp_features = esp_features & ~(NETIF_F_HW_ESP
-							| NETIF_F_GSO_ESP);
+		/* Packet got rerouted, fixup features and segment it. */
+		esp_features = esp_features & ~(NETIF_F_HW_ESP | NETIF_F_GSO_ESP);
 
-			segs = skb_gso_segment(skb, esp_features);
-			if (IS_ERR(segs)) {
-				kfree_skb(skb);
-				atomic_long_inc(&dev->tx_dropped);
-				return NULL;
-			} else {
-				consume_skb(skb);
-				skb = segs;
-			}
+		segs = skb_gso_segment(skb, esp_features);
+		if (IS_ERR(segs)) {
+			kfree_skb(skb);
+			atomic_long_inc(&dev->tx_dropped);
+			return NULL;
+		} else {
+			consume_skb(skb);
+			skb = segs;
 		}
 	}
 
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v3 4/4] bonding: support hardware encryption offload to slaves
From: Jarod Wilson @ 2020-06-19 14:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jarod Wilson, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, Jeff Kirsher, Jakub Kicinski, Steffen Klassert,
	Herbert Xu, netdev, intel-wired-lan
In-Reply-To: <20200619143155.20726-1-jarod@redhat.com>

Currently, this support is limited to active-backup mode, as I'm not sure
about the feasilibity of mapping an xfrm_state's offload handle to
multiple hardware devices simultaneously, and we rely on being able to
pass some hints to both the xfrm and NIC driver about whether or not
they're operating on a slave device.

I've tested this atop an Intel x520 device (ixgbe) using libreswan in
transport mode, succesfully achieving ~4.3Gbps throughput with netperf
(more or less identical to throughput on a bare NIC in this system),
as well as successful failover and recovery mid-netperf.

v2: just use CONFIG_XFRM_OFFLOAD for wrapping, isolate more code with it

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 drivers/net/bonding/bond_main.c | 127 +++++++++++++++++++++++++++++++-
 include/net/bonding.h           |   3 +
 2 files changed, 128 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 004919aea5fb..90939ccf2a94 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -79,6 +79,7 @@
 #include <net/pkt_sched.h>
 #include <linux/rculist.h>
 #include <net/flow_dissector.h>
+#include <net/xfrm.h>
 #include <net/bonding.h>
 #include <net/bond_3ad.h>
 #include <net/bond_alb.h>
@@ -278,8 +279,6 @@ const char *bond_mode_name(int mode)
 	return names[mode];
 }
 
-/*---------------------------------- VLAN -----------------------------------*/
-
 /**
  * bond_dev_queue_xmit - Prepare skb for xmit.
  *
@@ -302,6 +301,8 @@ netdev_tx_t bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
 	return dev_queue_xmit(skb);
 }
 
+/*---------------------------------- VLAN -----------------------------------*/
+
 /* In the following 2 functions, bond_vlan_rx_add_vid and bond_vlan_rx_kill_vid,
  * We don't protect the slave list iteration with a lock because:
  * a. This operation is performed in IOCTL context,
@@ -372,6 +373,84 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev,
 	return 0;
 }
 
+/*---------------------------------- XFRM -----------------------------------*/
+
+#ifdef CONFIG_XFRM_OFFLOAD
+/**
+ * bond_ipsec_add_sa - program device with a security association
+ * @xs: pointer to transformer state struct
+ **/
+static int bond_ipsec_add_sa(struct xfrm_state *xs)
+{
+	struct net_device *bond_dev = xs->xso.dev;
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *slave = rtnl_dereference(bond->curr_active_slave);
+
+	xs->xso.slave_dev = slave->dev;
+	bond->xs = xs;
+
+	if (!(slave->dev->xfrmdev_ops
+	      && slave->dev->xfrmdev_ops->xdo_dev_state_add)) {
+		slave_warn(bond_dev, slave->dev, "Slave does not support ipsec offload\n");
+		return -EINVAL;
+	}
+
+	return slave->dev->xfrmdev_ops->xdo_dev_state_add(xs);
+}
+
+/**
+ * bond_ipsec_del_sa - clear out this specific SA
+ * @xs: pointer to transformer state struct
+ **/
+static void bond_ipsec_del_sa(struct xfrm_state *xs)
+{
+	struct net_device *bond_dev = xs->xso.dev;
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *slave = rtnl_dereference(bond->curr_active_slave);
+
+	if (!slave)
+		return;
+
+	xs->xso.slave_dev = slave->dev;
+
+	if (!(slave->dev->xfrmdev_ops
+	      && slave->dev->xfrmdev_ops->xdo_dev_state_delete)) {
+		slave_warn(bond_dev, slave->dev, "%s: no slave xdo_dev_state_delete\n", __func__);
+		return;
+	}
+
+	slave->dev->xfrmdev_ops->xdo_dev_state_delete(xs);
+}
+
+/**
+ * bond_ipsec_offload_ok - can this packet use the xfrm hw offload
+ * @skb: current data packet
+ * @xs: pointer to transformer state struct
+ **/
+static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs)
+{
+	struct net_device *bond_dev = xs->xso.dev;
+	struct bonding *bond = netdev_priv(bond_dev);
+	struct slave *curr_active = rtnl_dereference(bond->curr_active_slave);
+	struct net_device *slave_dev = curr_active->dev;
+
+	if (!(slave_dev->xfrmdev_ops
+	      && slave_dev->xfrmdev_ops->xdo_dev_offload_ok)) {
+		slave_warn(bond_dev, slave_dev, "%s: no slave xdo_dev_offload_ok\n", __func__);
+		return false;
+	}
+
+	xs->xso.slave_dev = slave_dev;
+	return slave_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs);
+}
+
+static const struct xfrmdev_ops bond_xfrmdev_ops = {
+	.xdo_dev_state_add = bond_ipsec_add_sa,
+	.xdo_dev_state_delete = bond_ipsec_del_sa,
+	.xdo_dev_offload_ok = bond_ipsec_offload_ok,
+};
+#endif /* CONFIG_XFRM_OFFLOAD */
+
 /*------------------------------- Link status -------------------------------*/
 
 /* Set the carrier state for the master according to the state of its
@@ -879,6 +958,11 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
 		return;
 
 	if (new_active) {
+#ifdef CONFIG_XFRM_OFFLOAD
+		if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP) && bond->xs)
+			bond_ipsec_del_sa(bond->xs);
+#endif /* CONFIG_XFRM_OFFLOAD */
+
 		new_active->last_link_up = jiffies;
 
 		if (new_active->link == BOND_LINK_BACK) {
@@ -941,6 +1025,13 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)
 					bond_should_notify_peers(bond);
 			}
 
+#ifdef CONFIG_XFRM_OFFLOAD
+			if (old_active && bond->xs) {
+				xfrm_dev_state_flush(dev_net(bond->dev), bond->dev, true);
+				bond_ipsec_add_sa(bond->xs);
+			}
+#endif /* CONFIG_XFRM_OFFLOAD */
+
 			call_netdevice_notifiers(NETDEV_BONDING_FAILOVER, bond->dev);
 			if (should_notify_peers) {
 				bond->send_peer_notif--;
@@ -1127,15 +1218,24 @@ static netdev_features_t bond_fix_features(struct net_device *dev,
 #define BOND_ENC_FEATURES	(NETIF_F_HW_CSUM | NETIF_F_SG | \
 				 NETIF_F_RXCSUM | NETIF_F_ALL_TSO)
 
+#ifdef CONFIG_XFRM_OFFLOAD
+#define BOND_XFRM_FEATURES	(NETIF_F_HW_ESP | NETIF_F_HW_ESP_TX_CSUM | \
+				 NETIF_F_GSO_ESP)
+#endif /* CONFIG_XFRM_OFFLOAD */
+
 #define BOND_MPLS_FEATURES	(NETIF_F_HW_CSUM | NETIF_F_SG | \
 				 NETIF_F_ALL_TSO)
 
+
 static void bond_compute_features(struct bonding *bond)
 {
 	unsigned int dst_release_flag = IFF_XMIT_DST_RELEASE |
 					IFF_XMIT_DST_RELEASE_PERM;
 	netdev_features_t vlan_features = BOND_VLAN_FEATURES;
 	netdev_features_t enc_features  = BOND_ENC_FEATURES;
+#ifdef CONFIG_XFRM_OFFLOAD
+	netdev_features_t xfrm_features  = BOND_XFRM_FEATURES;
+#endif /* CONFIG_XFRM_OFFLOAD */
 	netdev_features_t mpls_features  = BOND_MPLS_FEATURES;
 	struct net_device *bond_dev = bond->dev;
 	struct list_head *iter;
@@ -1157,6 +1257,12 @@ static void bond_compute_features(struct bonding *bond)
 							 slave->dev->hw_enc_features,
 							 BOND_ENC_FEATURES);
 
+#ifdef CONFIG_XFRM_OFFLOAD
+		xfrm_features = netdev_increment_features(xfrm_features,
+							  slave->dev->hw_enc_features,
+							  BOND_XFRM_FEATURES);
+#endif /* CONFIG_XFRM_OFFLOAD */
+
 		mpls_features = netdev_increment_features(mpls_features,
 							  slave->dev->mpls_features,
 							  BOND_MPLS_FEATURES);
@@ -1176,6 +1282,9 @@ static void bond_compute_features(struct bonding *bond)
 				    NETIF_F_HW_VLAN_CTAG_TX |
 				    NETIF_F_HW_VLAN_STAG_TX |
 				    NETIF_F_GSO_UDP_L4;
+#ifdef CONFIG_XFRM_OFFLOAD
+	bond_dev->hw_enc_features |= xfrm_features;
+#endif /* CONFIG_XFRM_OFFLOAD */
 	bond_dev->mpls_features = mpls_features;
 	bond_dev->gso_max_segs = gso_max_segs;
 	netif_set_gso_max_size(bond_dev, gso_max_size);
@@ -1464,6 +1573,9 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 		slave_dbg(bond_dev, slave_dev, "is !NETIF_F_VLAN_CHALLENGED\n");
 	}
 
+	if (slave_dev->features & NETIF_F_HW_ESP)
+		slave_dbg(bond_dev, slave_dev, "is esp-hw-offload capable\n");
+
 	/* Old ifenslave binaries are no longer supported.  These can
 	 * be identified with moderate accuracy by the state of the slave:
 	 * the current ifenslave will set the interface down prior to
@@ -4540,6 +4652,13 @@ void bond_setup(struct net_device *bond_dev)
 	bond_dev->priv_flags |= IFF_BONDING | IFF_UNICAST_FLT | IFF_NO_QUEUE;
 	bond_dev->priv_flags &= ~(IFF_XMIT_DST_RELEASE | IFF_TX_SKB_SHARING);
 
+#ifdef CONFIG_XFRM_OFFLOAD
+	/* set up xfrm device ops (only supported in active-backup right now) */
+	if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
+		bond_dev->xfrmdev_ops = &bond_xfrmdev_ops;
+	bond->xs = NULL;
+#endif /* CONFIG_XFRM_OFFLOAD */
+
 	/* don't acquire bond device's netif_tx_lock when transmitting */
 	bond_dev->features |= NETIF_F_LLTX;
 
@@ -4558,6 +4677,10 @@ void bond_setup(struct net_device *bond_dev)
 				NETIF_F_HW_VLAN_CTAG_FILTER;
 
 	bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL | NETIF_F_GSO_UDP_L4;
+#ifdef CONFIG_XFRM_OFFLOAD
+	if ((BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP))
+		bond_dev->hw_features |= BOND_XFRM_FEATURES;
+#endif /* CONFIG_XFRM_OFFLOAD */
 	bond_dev->features |= bond_dev->hw_features;
 	bond_dev->features |= NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_STAG_TX;
 }
diff --git a/include/net/bonding.h b/include/net/bonding.h
index aa854a9c01e2..a00e1764e9b1 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -238,6 +238,9 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+#ifdef CONFIG_XFRM_OFFLOAD
+	struct xfrm_state *xs;
+#endif /* CONFIG_XFRM_OFFLOAD */
 };
 
 #define bond_slave_get_rcu(dev) \
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v3 3/4] mlx5: become aware of when running as a bonding slave
From: Jarod Wilson @ 2020-06-19 14:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jarod Wilson, Boris Pismenny, Saeed Mahameed, Leon Romanovsky,
	Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	Jeff Kirsher, Jakub Kicinski, Steffen Klassert, Herbert Xu,
	netdev
In-Reply-To: <20200619143155.20726-1-jarod@redhat.com>

I've been unable to get my hands on suitable supported hardware to date,
but I believe this ought to be all that is needed to enable the mlx5
driver to also work with bonding active-backup crypto offload passthru.

CC: Boris Pismenny <borisp@mellanox.com>
CC: Saeed Mahameed <saeedm@mellanox.com>
CC: Leon Romanovsky <leon@kernel.org>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
index 92eb3bad4acd..72ad6664bd73 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -210,6 +210,9 @@ static inline int mlx5e_xfrm_validate_state(struct xfrm_state *x)
 	struct net_device *netdev = x->xso.dev;
 	struct mlx5e_priv *priv;
 
+	if (x->xso.slave_dev)
+		netdev = x->xso.slave_dev;
+
 	priv = netdev_priv(netdev);
 
 	if (x->props.aalgo != SADB_AALG_NONE) {
@@ -291,6 +294,9 @@ static int mlx5e_xfrm_add_state(struct xfrm_state *x)
 	unsigned int sa_handle;
 	int err;
 
+	if (x->xso.slave_dev)
+		netdev = x->xso.slave_dev;
+
 	priv = netdev_priv(netdev);
 
 	err = mlx5e_xfrm_validate_state(x);
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v3 2/4] ixgbe_ipsec: become aware of when running as a bonding slave
From: Jarod Wilson @ 2020-06-19 14:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jarod Wilson, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, Jeff Kirsher, Jakub Kicinski, Steffen Klassert,
	Herbert Xu, netdev, intel-wired-lan, Jeff Kirsher
In-Reply-To: <20200619143155.20726-1-jarod@redhat.com>

Slave devices in a bond doing hardware encryption also need to be aware
that they're slaves, so we operate on the slave instead of the bonding
master to do the actual hardware encryption offload bits.

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Acked-by: Jeff Kirsher <Jeffrey.t.kirsher@intel.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
 .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    | 39 +++++++++++++++----
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
index 113f6087c7c9..26b0a58a064d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ipsec.c
@@ -432,6 +432,9 @@ static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
 	char *alg_name = NULL;
 	int key_len;
 
+	if (xs->xso.slave_dev)
+		dev = xs->xso.slave_dev;
+
 	if (!xs->aead) {
 		netdev_err(dev, "Unsupported IPsec algorithm\n");
 		return -EINVAL;
@@ -478,8 +481,8 @@ static int ixgbe_ipsec_parse_proto_keys(struct xfrm_state *xs,
 static int ixgbe_ipsec_check_mgmt_ip(struct xfrm_state *xs)
 {
 	struct net_device *dev = xs->xso.dev;
-	struct ixgbe_adapter *adapter = netdev_priv(dev);
-	struct ixgbe_hw *hw = &adapter->hw;
+	struct ixgbe_adapter *adapter;
+	struct ixgbe_hw *hw;
 	u32 mfval, manc, reg;
 	int num_filters = 4;
 	bool manc_ipv4;
@@ -497,6 +500,12 @@ static int ixgbe_ipsec_check_mgmt_ip(struct xfrm_state *xs)
 #define BMCIP_V6                 0x3
 #define BMCIP_MASK               0x3
 
+	if (xs->xso.slave_dev)
+		dev = xs->xso.slave_dev;
+
+	adapter = netdev_priv(dev);
+	hw = &adapter->hw;
+
 	manc = IXGBE_READ_REG(hw, IXGBE_MANC);
 	manc_ipv4 = !!(manc & MANC_EN_IPV4_FILTER);
 	mfval = IXGBE_READ_REG(hw, IXGBE_MFVAL);
@@ -561,14 +570,21 @@ static int ixgbe_ipsec_check_mgmt_ip(struct xfrm_state *xs)
 static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
 {
 	struct net_device *dev = xs->xso.dev;
-	struct ixgbe_adapter *adapter = netdev_priv(dev);
-	struct ixgbe_ipsec *ipsec = adapter->ipsec;
-	struct ixgbe_hw *hw = &adapter->hw;
+	struct ixgbe_adapter *adapter;
+	struct ixgbe_ipsec *ipsec;
+	struct ixgbe_hw *hw;
 	int checked, match, first;
 	u16 sa_idx;
 	int ret;
 	int i;
 
+	if (xs->xso.slave_dev)
+		dev = xs->xso.slave_dev;
+
+	adapter = netdev_priv(dev);
+	ipsec = adapter->ipsec;
+	hw = &adapter->hw;
+
 	if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
 		netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
 			   xs->id.proto);
@@ -746,12 +762,19 @@ static int ixgbe_ipsec_add_sa(struct xfrm_state *xs)
 static void ixgbe_ipsec_del_sa(struct xfrm_state *xs)
 {
 	struct net_device *dev = xs->xso.dev;
-	struct ixgbe_adapter *adapter = netdev_priv(dev);
-	struct ixgbe_ipsec *ipsec = adapter->ipsec;
-	struct ixgbe_hw *hw = &adapter->hw;
+	struct ixgbe_adapter *adapter;
+	struct ixgbe_ipsec *ipsec;
+	struct ixgbe_hw *hw;
 	u32 zerobuf[4] = {0, 0, 0, 0};
 	u16 sa_idx;
 
+	if (xs->xso.slave_dev)
+		dev = xs->xso.slave_dev;
+
+	adapter = netdev_priv(dev);
+	ipsec = adapter->ipsec;
+	hw = &adapter->hw;
+
 	if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
 		struct rx_sa *rsa;
 		u8 ipi;
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v3 0/4] bonding: initial support for hardware crypto offload
From: Jarod Wilson @ 2020-06-19 14:31 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jarod Wilson, Jay Vosburgh, Veaceslav Falico, Andy Gospodarek,
	David S. Miller, Jeff Kirsher, Jakub Kicinski, Steffen Klassert,
	Herbert Xu, netdev, intel-wired-lan

This is an initial functional implementation for doing pass-through of
hardware encryption from bonding device to capable slaves, in active-backup
bond setups. This was developed and tested using ixgbe-driven Intel x520
interfaces with libreswan and a transport mode connection, primarily using
netperf, with assorted connection failures forced during transmission. The
failover works quite well in my testing, and overall performance is right
on par with offload when running on a bare interface, no bond involved.

Caveats: this is ONLY enabled for active-backup, because I'm not sure
how one would manage multiple offload handles for different devices all
running at the same time in the same xfrm, and it relies on some minor
changes to both the xfrm code and slave device driver code to get things
to behave, and I don't have immediate access to any other hardware that
could function similarly, but the NIC driver changes are minimal and
straight-forward enough that I've included what I think ought to be
enough for mlx5 devices too.

v2: reordered patches, switched (back) to using CONFIG_XFRM_OFFLOAD
to wrap the code additions and wrapped overlooked additions.
v3: rebase w/net-next open, add proper cc list to cover letter

Jarod Wilson (4):
  xfrm: bail early on slave pass over skb
  ixgbe_ipsec: become aware of when running as a bonding slave
  mlx5: become aware of when running as a bonding slave
  bonding: support hardware encryption offload to slaves

CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Jakub Kicinski <kuba@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: netdev@vger.kernel.org
CC: intel-wired-lan@lists.osuosl.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>

Jarod Wilson (4):
  xfrm: bail early on slave pass over skb
  ixgbe_ipsec: become aware of when running as a bonding slave
  mlx5: become aware of when running as a bonding slave
  bonding: support hardware encryption offload to slaves

 drivers/net/bonding/bond_main.c               | 127 +++++++++++++++++-
 .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c    |  39 ++++--
 .../mellanox/mlx5/core/en_accel/ipsec.c       |   6 +
 include/net/bonding.h                         |   3 +
 include/net/xfrm.h                            |   1 +
 net/xfrm/xfrm_device.c                        |  34 ++---
 6 files changed, 183 insertions(+), 27 deletions(-)

-- 
2.20.1

^ permalink raw reply

* [PATCH net-next 4/5] cxgb4: add support to fetch ethtool n-tuple filters
From: Vishal Kulkarni @ 2020-06-19 14:21 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, rahul.lakkireddy, Vishal Kulkarni
In-Reply-To: <20200619142139.27982-1-vishal@chelsio.com>

Add support to fetch the requested ethtool n-tuple filters by
translating them from hardware spec to ethtool n-tuple spec.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 110 ++++++++++++++++++
 1 file changed, 110 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 5c588677877d..3dd28e5856ba 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -1589,10 +1589,104 @@ static struct filter_entry *cxgb4_get_filter_entry(struct adapter *adap,
 	return f;
 }
 
+static void cxgb4_fill_filter_rule(struct ethtool_rx_flow_spec *fs,
+				   struct ch_filter_specification *dfs)
+{
+	switch (dfs->val.proto) {
+	case IPPROTO_TCP:
+		if (dfs->type)
+			fs->flow_type = TCP_V6_FLOW;
+		else
+			fs->flow_type = TCP_V4_FLOW;
+		break;
+	case IPPROTO_UDP:
+		if (dfs->type)
+			fs->flow_type = UDP_V6_FLOW;
+		else
+			fs->flow_type = UDP_V4_FLOW;
+		break;
+	}
+
+	if (dfs->type) {
+		fs->h_u.tcp_ip6_spec.psrc = cpu_to_be16(dfs->val.fport);
+		fs->m_u.tcp_ip6_spec.psrc = cpu_to_be16(dfs->mask.fport);
+		fs->h_u.tcp_ip6_spec.pdst = cpu_to_be16(dfs->val.lport);
+		fs->m_u.tcp_ip6_spec.pdst = cpu_to_be16(dfs->mask.lport);
+		memcpy(&fs->h_u.tcp_ip6_spec.ip6src, &dfs->val.fip[0],
+		       sizeof(fs->h_u.tcp_ip6_spec.ip6src));
+		memcpy(&fs->m_u.tcp_ip6_spec.ip6src, &dfs->mask.fip[0],
+		       sizeof(fs->m_u.tcp_ip6_spec.ip6src));
+		memcpy(&fs->h_u.tcp_ip6_spec.ip6dst, &dfs->val.lip[0],
+		       sizeof(fs->h_u.tcp_ip6_spec.ip6dst));
+		memcpy(&fs->m_u.tcp_ip6_spec.ip6dst, &dfs->mask.lip[0],
+		       sizeof(fs->m_u.tcp_ip6_spec.ip6dst));
+		fs->h_u.tcp_ip6_spec.tclass = dfs->val.tos;
+		fs->m_u.tcp_ip6_spec.tclass = dfs->mask.tos;
+	} else {
+		fs->h_u.tcp_ip4_spec.psrc = cpu_to_be16(dfs->val.fport);
+		fs->m_u.tcp_ip4_spec.psrc = cpu_to_be16(dfs->mask.fport);
+		fs->h_u.tcp_ip4_spec.pdst = cpu_to_be16(dfs->val.lport);
+		fs->m_u.tcp_ip4_spec.pdst = cpu_to_be16(dfs->mask.lport);
+		memcpy(&fs->h_u.tcp_ip4_spec.ip4src, &dfs->val.fip[0],
+		       sizeof(fs->h_u.tcp_ip4_spec.ip4src));
+		memcpy(&fs->m_u.tcp_ip4_spec.ip4src, &dfs->mask.fip[0],
+		       sizeof(fs->m_u.tcp_ip4_spec.ip4src));
+		memcpy(&fs->h_u.tcp_ip4_spec.ip4dst, &dfs->val.lip[0],
+		       sizeof(fs->h_u.tcp_ip4_spec.ip4dst));
+		memcpy(&fs->m_u.tcp_ip4_spec.ip4dst, &dfs->mask.lip[0],
+		       sizeof(fs->m_u.tcp_ip4_spec.ip4dst));
+		fs->h_u.tcp_ip4_spec.tos = dfs->val.tos;
+		fs->m_u.tcp_ip4_spec.tos = dfs->mask.tos;
+	}
+	fs->h_ext.vlan_tci = cpu_to_be16(dfs->val.ivlan);
+	fs->m_ext.vlan_tci = cpu_to_be16(dfs->mask.ivlan);
+	fs->flow_type |= FLOW_EXT;
+
+	if (dfs->action == FILTER_DROP)
+		fs->ring_cookie = RX_CLS_FLOW_DISC;
+	else
+		fs->ring_cookie = dfs->iq;
+}
+
+static int cxgb4_ntuple_get_filter(struct net_device *dev,
+				   struct ethtool_rxnfc *cmd,
+				   unsigned int loc)
+{
+	const struct port_info *pi = netdev_priv(dev);
+	struct adapter *adap = netdev2adap(dev);
+	struct filter_entry *f;
+	int ftid;
+
+	if (!(adap->flags & CXGB4_FULL_INIT_DONE))
+		return -EAGAIN;
+
+	/* Check for maximum filter range */
+	if (!adap->ethtool_filters)
+		return -EOPNOTSUPP;
+
+	if (loc >= adap->ethtool_filters->nentries)
+		return -ERANGE;
+
+	if (!test_bit(loc, adap->ethtool_filters->port[pi->port_id].bmap))
+		return -ENOENT;
+
+	ftid = adap->ethtool_filters->port[pi->port_id].loc_array[loc];
+
+	/* Fetch filter_entry */
+	f = cxgb4_get_filter_entry(adap, ftid);
+
+	cxgb4_fill_filter_rule(&cmd->fs, &f->fs);
+
+	return 0;
+}
+
 static int get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
 		     u32 *rules)
 {
 	const struct port_info *pi = netdev_priv(dev);
+	struct adapter *adap = netdev2adap(dev);
+	unsigned int count = 0, index = 0;
+	int ret = 0;
 
 	switch (info->cmd) {
 	case ETHTOOL_GRXFH: {
@@ -1648,7 +1742,23 @@ static int get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
 	case ETHTOOL_GRXRINGS:
 		info->data = pi->nqsets;
 		return 0;
+	case ETHTOOL_GRXCLSRLCNT:
+		info->rule_cnt =
+		       adap->ethtool_filters->port[pi->port_id].in_use;
+		return 0;
+	case ETHTOOL_GRXCLSRULE:
+		return cxgb4_ntuple_get_filter(dev, info, info->fs.location);
+	case ETHTOOL_GRXCLSRLALL:
+		info->data = adap->ethtool_filters->nentries;
+		while (count < info->rule_cnt) {
+			ret = cxgb4_ntuple_get_filter(dev, info, index);
+			if (!ret)
+				rules[count++] = index;
+			index++;
+		}
+		return 0;
 	}
+
 	return -EOPNOTSUPP;
 }
 
-- 
2.21.1


^ permalink raw reply related

* [PATCH net-next 5/5] cxgb4: add action to steer flows to specific Rxq
From: Vishal Kulkarni @ 2020-06-19 14:21 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, rahul.lakkireddy, Vishal Kulkarni
In-Reply-To: <20200619142139.27982-1-vishal@chelsio.com>

Add support for queue action to steer Rx traffic
hitting the flows to specified Rxq.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 53dbb3683653..8a176190586d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -425,6 +425,11 @@ void cxgb4_process_flow_actions(struct net_device *in,
 			process_pedit_field(fs, val, mask, offset, htype);
 			}
 			break;
+		case FLOW_ACTION_QUEUE:
+			fs->action = FILTER_PASS;
+			fs->dirsteer = 1;
+			fs->iq = act->queue.index;
+			break;
 		default:
 			break;
 		}
@@ -609,6 +614,9 @@ int cxgb4_validate_flow_actions(struct net_device *dev,
 			act_pedit = true;
 			}
 			break;
+		case FLOW_ACTION_QUEUE:
+			/* Do nothing. cxgb4_set_filter will validate */
+			break;
 		default:
 			netdev_err(dev, "%s: Unsupported action\n", __func__);
 			return -EOPNOTSUPP;
-- 
2.21.1


^ permalink raw reply related

* [PATCH net-next 3/5] cxgb4: add ethtool n-tuple filter deletion
From: Vishal Kulkarni @ 2020-06-19 14:21 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, rahul.lakkireddy, Vishal Kulkarni
In-Reply-To: <20200619142139.27982-1-vishal@chelsio.com>

Add support to delete ethtool n-tuple filter. Fetch the appropriate
filter region (HPFILTER, HASH, NORMAL) in which the filter exists,
and delete it from the respective region, accordingly.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 61 +++++++++++++++++++
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.c  | 30 ++++++---
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.h  |  2 +
 3 files changed, 84 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 82fc09b6dc8e..5c588677877d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -1573,6 +1573,22 @@ static int set_rss_table(struct net_device *dev, const u32 *p, const u8 *key,
 	return -EPERM;
 }
 
+static struct filter_entry *cxgb4_get_filter_entry(struct adapter *adap,
+						   u32 ftid)
+{
+	struct tid_info *t = &adap->tids;
+	struct filter_entry *f;
+
+	if (ftid < t->nhpftids)
+		f = &adap->tids.hpftid_tab[ftid];
+	else if (ftid < t->nftids)
+		f = &adap->tids.ftid_tab[ftid - t->nhpftids];
+	else
+		f = lookup_tid(&adap->tids, ftid);
+
+	return f;
+}
+
 static int get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
 		     u32 *rules)
 {
@@ -1636,6 +1652,48 @@ static int get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
 	return -EOPNOTSUPP;
 }
 
+static int cxgb4_ntuple_del_filter(struct net_device *dev,
+				   struct ethtool_rxnfc *cmd)
+{
+	struct cxgb4_ethtool_filter_info *filter_info;
+	struct adapter *adapter = netdev2adap(dev);
+	struct port_info *pi = netdev_priv(dev);
+	struct filter_entry *f;
+	u32 filter_id;
+	int ret;
+
+	if (!(adapter->flags & CXGB4_FULL_INIT_DONE))
+		return -EAGAIN;  /* can still change nfilters */
+
+	if (!adapter->ethtool_filters)
+		return -EOPNOTSUPP;
+
+	if (cmd->fs.location >= adapter->ethtool_filters->nentries) {
+		dev_err(adapter->pdev_dev,
+			"Location must be < %u",
+			adapter->ethtool_filters->nentries);
+		return -ERANGE;
+	}
+
+	filter_info = &adapter->ethtool_filters->port[pi->port_id];
+
+	if (!test_bit(cmd->fs.location, filter_info->bmap))
+		return -ENOENT;
+
+	filter_id = filter_info->loc_array[cmd->fs.location];
+	f = cxgb4_get_filter_entry(adapter, filter_id);
+
+	ret = cxgb4_flow_rule_destroy(dev, f->fs.tc_prio, &f->fs, filter_id);
+	if (ret)
+		goto err;
+
+	clear_bit(cmd->fs.location, filter_info->bmap);
+	filter_info->in_use--;
+
+err:
+	return ret;
+}
+
 /* Add Ethtool n-tuple filters. */
 static int cxgb4_ntuple_set_filter(struct net_device *netdev,
 				   struct ethtool_rxnfc *cmd)
@@ -1702,6 +1760,9 @@ static int set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
 	case ETHTOOL_SRXCLSRLINS:
 		ret = cxgb4_ntuple_set_filter(dev, cmd);
 		break;
+	case ETHTOOL_SRXCLSRLDEL:
+		ret = cxgb4_ntuple_del_filter(dev, cmd);
+		break;
 	default:
 		break;
 	}
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 222f4bc19908..53dbb3683653 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -797,23 +797,38 @@ int cxgb4_tc_flower_replace(struct net_device *dev,
 	return ret;
 }
 
+int cxgb4_flow_rule_destroy(struct net_device *dev, u32 tc_prio,
+			    struct ch_filter_specification *fs, int tid)
+{
+	struct adapter *adap = netdev2adap(dev);
+	u8 hash;
+	int ret;
+
+	hash = fs->hash;
+
+	ret = cxgb4_del_filter(dev, tid, fs);
+	if (ret)
+		return ret;
+
+	if (hash)
+		cxgb4_tc_flower_hash_prio_del(adap, tc_prio);
+
+	return ret;
+}
+
 int cxgb4_tc_flower_destroy(struct net_device *dev,
 			    struct flow_cls_offload *cls)
 {
 	struct adapter *adap = netdev2adap(dev);
 	struct ch_tc_flower_entry *ch_flower;
-	u32 tc_prio;
-	bool hash;
 	int ret;
 
 	ch_flower = ch_flower_lookup(adap, cls->cookie);
 	if (!ch_flower)
 		return -ENOENT;
 
-	hash = ch_flower->fs.hash;
-	tc_prio = ch_flower->fs.tc_prio;
-
-	ret = cxgb4_del_filter(dev, ch_flower->filter_id, &ch_flower->fs);
+	ret = cxgb4_flow_rule_destroy(dev, ch_flower->fs.tc_prio,
+				      &ch_flower->fs, ch_flower->filter_id);
 	if (ret)
 		goto err;
 
@@ -825,9 +840,6 @@ int cxgb4_tc_flower_destroy(struct net_device *dev,
 	}
 	kfree_rcu(ch_flower, rcu);
 
-	if (hash)
-		cxgb4_tc_flower_hash_prio_del(adap, tc_prio);
-
 err:
 	return ret;
 }
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
index 7fa379749500..befa459324fb 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -124,6 +124,8 @@ int cxgb4_tc_flower_stats(struct net_device *dev,
 int cxgb4_flow_rule_replace(struct net_device *dev, struct flow_rule *rule,
 			    u32 tc_prio, struct netlink_ext_ack *extack,
 			    struct ch_filter_specification *fs, u32 *tid);
+int cxgb4_flow_rule_destroy(struct net_device *dev, u32 tc_prio,
+			    struct ch_filter_specification *fs, int tid);
 
 int cxgb4_init_tc_flower(struct adapter *adap);
 void cxgb4_cleanup_tc_flower(struct adapter *adap);
-- 
2.21.1


^ permalink raw reply related

* [PATCH net-next 2/5] cxgb4: add ethtool n-tuple filter insertion
From: Vishal Kulkarni @ 2020-06-19 14:21 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, rahul.lakkireddy, Vishal Kulkarni
In-Reply-To: <20200619142139.27982-1-vishal@chelsio.com>

Add support to parse and insert ethtool n-tuple filters.
Translate n-tuple spec to flow spec and use the existing tc-flower
offload infra to insert ethtool n-tuple filters.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 75 ++++++++++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_filter.c |  5 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  2 +-
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.c  | 97 ++++++++++---------
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.h  |  3 +
 5 files changed, 135 insertions(+), 47 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 51f1d5f87bc3..82fc09b6dc8e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -11,6 +11,7 @@
 #include "t4fw_api.h"
 #include "cxgb4_cudbg.h"
 #include "cxgb4_filter.h"
+#include "cxgb4_tc_flower.h"
 
 #define EEPROM_MAGIC 0x38E2F10C
 
@@ -1635,6 +1636,79 @@ static int get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info,
 	return -EOPNOTSUPP;
 }
 
+/* Add Ethtool n-tuple filters. */
+static int cxgb4_ntuple_set_filter(struct net_device *netdev,
+				   struct ethtool_rxnfc *cmd)
+{
+	struct ethtool_rx_flow_spec_input input = {};
+	struct cxgb4_ethtool_filter_info *filter_info;
+	struct adapter *adapter = netdev2adap(netdev);
+	struct port_info *pi = netdev_priv(netdev);
+	struct ch_filter_specification fs;
+	struct ethtool_rx_flow_rule *flow;
+	u32 tid;
+	int ret;
+
+	if (!(adapter->flags & CXGB4_FULL_INIT_DONE))
+		return -EAGAIN;  /* can still change nfilters */
+
+	if (!adapter->ethtool_filters)
+		return -EOPNOTSUPP;
+
+	if (cmd->fs.location >= adapter->ethtool_filters->nentries) {
+		dev_err(adapter->pdev_dev,
+			"Location must be < %u",
+			adapter->ethtool_filters->nentries);
+		return -ERANGE;
+	}
+
+	if (test_bit(cmd->fs.location,
+		     adapter->ethtool_filters->port[pi->port_id].bmap))
+		return -EEXIST;
+
+	memset(&fs, 0, sizeof(fs));
+
+	input.fs = &cmd->fs;
+	flow = ethtool_rx_flow_rule_create(&input);
+	if (IS_ERR(flow)) {
+		ret = PTR_ERR(flow);
+		goto exit;
+	}
+
+	fs.hitcnts = 1;
+
+	ret = cxgb4_flow_rule_replace(netdev, flow->rule, cmd->fs.location,
+				      NULL, &fs, &tid);
+	if (ret)
+		goto free;
+
+	filter_info = &adapter->ethtool_filters->port[pi->port_id];
+
+	filter_info->loc_array[cmd->fs.location] = tid;
+	set_bit(cmd->fs.location, filter_info->bmap);
+	filter_info->in_use++;
+
+free:
+	ethtool_rx_flow_rule_destroy(flow);
+exit:
+	return ret;
+}
+
+static int set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
+{
+	int ret = -EOPNOTSUPP;
+
+	switch (cmd->cmd) {
+	case ETHTOOL_SRXCLSRLINS:
+		ret = cxgb4_ntuple_set_filter(dev, cmd);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
 static int set_dump(struct net_device *dev, struct ethtool_dump *eth_dump)
 {
 	struct adapter *adapter = netdev2adap(dev);
@@ -1840,6 +1914,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
 	.get_regs_len      = get_regs_len,
 	.get_regs          = get_regs,
 	.get_rxnfc         = get_rxnfc,
+	.set_rxnfc         = set_rxnfc,
 	.get_rxfh_indir_size = get_rss_table_size,
 	.get_rxfh	   = get_rss_table,
 	.set_rxfh	   = set_rss_table,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
index 796555255207..a12df792d832 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.c
@@ -1152,6 +1152,11 @@ bool is_filter_exact_match(struct adapter *adap,
 	if (!is_hashfilter(adap))
 		return false;
 
+	if ((atomic_read(&adap->tids.hash_tids_in_use) +
+	     atomic_read(&adap->tids.tids_in_use)) >=
+	    (adap->tids.nhash + (adap->tids.stid_base - adap->tids.tid_base)))
+		return false;
+
 	 /* Keep tunnel VNI match disabled for hash-filters for now */
 	if (fs->mask.encap_vld)
 		return false;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 501917751b7f..7423980bc49a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6371,7 +6371,7 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 			NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
 			NETIF_F_RXCSUM | NETIF_F_RXHASH | NETIF_F_GRO |
 			NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
-			NETIF_F_HW_TC;
+			NETIF_F_HW_TC | NETIF_F_NTUPLE;
 
 		if (chip_ver > CHELSIO_T5) {
 			netdev->hw_enc_features |= NETIF_F_IP_CSUM |
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 4a5fa9eba0b6..222f4bc19908 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -81,19 +81,9 @@ static struct ch_tc_flower_entry *ch_flower_lookup(struct adapter *adap,
 }
 
 static void cxgb4_process_flow_match(struct net_device *dev,
-				     struct flow_cls_offload *cls,
+				     struct flow_rule *rule,
 				     struct ch_filter_specification *fs)
 {
-	struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
-	u16 addr_type = 0;
-
-	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CONTROL)) {
-		struct flow_match_control match;
-
-		flow_rule_match_control(rule, &match);
-		addr_type = match.key->addr_type;
-	}
-
 	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) {
 		struct flow_match_basic match;
 		u16 ethtype_key, ethtype_mask;
@@ -116,7 +106,7 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 		fs->mask.proto = match.mask->ip_proto;
 	}
 
-	if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV4_ADDRS)) {
 		struct flow_match_ipv4_addrs match;
 
 		flow_rule_match_ipv4_addrs(rule, &match);
@@ -131,7 +121,7 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 		memcpy(&fs->nat_fip[0], &match.key->src, sizeof(match.key->src));
 	}
 
-	if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV6_ADDRS)) {
 		struct flow_match_ipv6_addrs match;
 
 		flow_rule_match_ipv6_addrs(rule, &match);
@@ -224,9 +214,8 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 }
 
 static int cxgb4_validate_flow_match(struct net_device *dev,
-				     struct flow_cls_offload *cls)
+				     struct flow_rule *rule)
 {
-	struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
 	struct flow_dissector *dissector = rule->match.dissector;
 	u16 ethtype_mask = 0;
 	u16 ethtype_key = 0;
@@ -693,14 +682,11 @@ static void cxgb4_tc_flower_hash_prio_del(struct adapter *adap, u32 tc_prio)
 	spin_unlock_bh(&t->ftid_lock);
 }
 
-int cxgb4_tc_flower_replace(struct net_device *dev,
-			    struct flow_cls_offload *cls)
+int cxgb4_flow_rule_replace(struct net_device *dev, struct flow_rule *rule,
+			    u32 tc_prio, struct netlink_ext_ack *extack,
+			    struct ch_filter_specification *fs, u32 *tid)
 {
-	struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
-	struct netlink_ext_ack *extack = cls->common.extack;
 	struct adapter *adap = netdev2adap(dev);
-	struct ch_tc_flower_entry *ch_flower;
-	struct ch_filter_specification *fs;
 	struct filter_ctx ctx;
 	u8 inet_family;
 	int fidx, ret;
@@ -708,18 +694,10 @@ int cxgb4_tc_flower_replace(struct net_device *dev,
 	if (cxgb4_validate_flow_actions(dev, &rule->action, extack))
 		return -EOPNOTSUPP;
 
-	if (cxgb4_validate_flow_match(dev, cls))
+	if (cxgb4_validate_flow_match(dev, rule))
 		return -EOPNOTSUPP;
 
-	ch_flower = allocate_flower_entry();
-	if (!ch_flower) {
-		netdev_err(dev, "%s: ch_flower alloc failed.\n", __func__);
-		return -ENOMEM;
-	}
-
-	fs = &ch_flower->fs;
-	fs->hitcnts = 1;
-	cxgb4_process_flow_match(dev, cls, fs);
+	cxgb4_process_flow_match(dev, rule, fs);
 	cxgb4_process_flow_actions(dev, &rule->action, fs);
 
 	fs->hash = is_filter_exact_match(adap, fs);
@@ -730,12 +708,11 @@ int cxgb4_tc_flower_replace(struct net_device *dev,
 	 * existing rules.
 	 */
 	fidx = cxgb4_get_free_ftid(dev, inet_family, fs->hash,
-				   cls->common.prio);
+				   tc_prio);
 	if (fidx < 0) {
 		NL_SET_ERR_MSG_MOD(extack,
 				   "No free LETCAM index available");
-		ret = -ENOMEM;
-		goto free_entry;
+		return -ENOMEM;
 	}
 
 	if (fidx < adap->tids.nhpftids) {
@@ -749,42 +726,70 @@ int cxgb4_tc_flower_replace(struct net_device *dev,
 	if (fs->hash)
 		fidx = 0;
 
-	fs->tc_prio = cls->common.prio;
-	fs->tc_cookie = cls->cookie;
+	fs->tc_prio = tc_prio;
 
 	init_completion(&ctx.completion);
 	ret = __cxgb4_set_filter(dev, fidx, fs, &ctx);
 	if (ret) {
 		netdev_err(dev, "%s: filter creation err %d\n",
 			   __func__, ret);
-		goto free_entry;
+		return ret;
 	}
 
 	/* Wait for reply */
 	ret = wait_for_completion_timeout(&ctx.completion, 10 * HZ);
-	if (!ret) {
-		ret = -ETIMEDOUT;
-		goto free_entry;
-	}
+	if (!ret)
+		return -ETIMEDOUT;
 
-	ret = ctx.result;
 	/* Check if hw returned error for filter creation */
+	if (ctx.result)
+		return ctx.result;
+
+	*tid = ctx.tid;
+
+	if (fs->hash)
+		cxgb4_tc_flower_hash_prio_add(adap, tc_prio);
+
+	return 0;
+}
+
+int cxgb4_tc_flower_replace(struct net_device *dev,
+			    struct flow_cls_offload *cls)
+{
+	struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
+	struct netlink_ext_ack *extack = cls->common.extack;
+	struct adapter *adap = netdev2adap(dev);
+	struct ch_tc_flower_entry *ch_flower;
+	struct ch_filter_specification *fs;
+	int ret;
+
+	ch_flower = allocate_flower_entry();
+	if (!ch_flower) {
+		netdev_err(dev, "%s: ch_flower alloc failed.\n", __func__);
+		return -ENOMEM;
+	}
+
+	fs = &ch_flower->fs;
+	fs->hitcnts = 1;
+	fs->tc_cookie = cls->cookie;
+
+	ret = cxgb4_flow_rule_replace(dev, rule, cls->common.prio, extack, fs,
+				      &ch_flower->filter_id);
 	if (ret)
 		goto free_entry;
 
 	ch_flower->tc_flower_cookie = cls->cookie;
-	ch_flower->filter_id = ctx.tid;
 	ret = rhashtable_insert_fast(&adap->flower_tbl, &ch_flower->node,
 				     adap->flower_ht_params);
 	if (ret)
 		goto del_filter;
 
-	if (fs->hash)
-		cxgb4_tc_flower_hash_prio_add(adap, cls->common.prio);
-
 	return 0;
 
 del_filter:
+	if (fs->hash)
+		cxgb4_tc_flower_hash_prio_del(adap, cls->common.prio);
+
 	cxgb4_del_filter(dev, ch_flower->filter_id, &ch_flower->fs);
 
 free_entry:
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
index 0a30c96b81ff..7fa379749500 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h
@@ -121,6 +121,9 @@ int cxgb4_tc_flower_destroy(struct net_device *dev,
 			    struct flow_cls_offload *cls);
 int cxgb4_tc_flower_stats(struct net_device *dev,
 			  struct flow_cls_offload *cls);
+int cxgb4_flow_rule_replace(struct net_device *dev, struct flow_rule *rule,
+			    u32 tc_prio, struct netlink_ext_ack *extack,
+			    struct ch_filter_specification *fs, u32 *tid);
 
 int cxgb4_init_tc_flower(struct adapter *adap);
 void cxgb4_cleanup_tc_flower(struct adapter *adap);
-- 
2.21.1


^ permalink raw reply related

* [PATCH net-next 1/5] cxgb4: add skeleton for ethtool n-tuple filters
From: Vishal Kulkarni @ 2020-06-19 14:21 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, rahul.lakkireddy, Vishal Kulkarni
In-Reply-To: <20200619142139.27982-1-vishal@chelsio.com>

Allocate and manage resources required for ethtool n-tuple filters.
Also fetch the HASH filter region size and calculate nhash entries.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    | 14 ++++
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 82 +++++++++++++++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_filter.h |  2 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 38 +++++----
 .../net/ethernet/chelsio/cxgb4/cxgb4_uld.h    |  2 +
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h  |  4 +
 6 files changed, 126 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 999816273328..466a61ba23ce 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -1066,6 +1066,17 @@ struct mps_entries_ref {
 	refcount_t refcnt;
 };
 
+struct cxgb4_ethtool_filter_info {
+	u32 *loc_array; /* Array holding the actual TIDs set to filters */
+	unsigned long *bmap; /* Bitmap for managing filters in use */
+	u32 in_use; /* # of filters in use */
+};
+
+struct cxgb4_ethtool_filter {
+	u32 nentries; /* Adapter wide number of supported filters */
+	struct cxgb4_ethtool_filter_info *port; /* Per port entry */
+};
+
 struct adapter {
 	void __iomem *regs;
 	void __iomem *bar2;
@@ -1191,6 +1202,9 @@ struct adapter {
 
 	/* TC MATCHALL classifier offload */
 	struct cxgb4_tc_matchall *tc_matchall;
+
+	/* Ethtool n-tuple */
+	struct cxgb4_ethtool_filter *ethtool_filters;
 };
 
 /* Support for "sched-class" command to allow a TX Scheduling Class to be
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 0bfdc97e9083..51f1d5f87bc3 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -10,6 +10,7 @@
 #include "t4_regs.h"
 #include "t4fw_api.h"
 #include "cxgb4_cudbg.h"
+#include "cxgb4_filter.h"
 
 #define EEPROM_MAGIC 0x38E2F10C
 
@@ -1853,6 +1854,87 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
 	.set_priv_flags    = cxgb4_set_priv_flags,
 };
 
+void cxgb4_cleanup_ethtool_filters(struct adapter *adap)
+{
+	struct cxgb4_ethtool_filter_info *eth_filter_info;
+	u8 i;
+
+	if (!adap->ethtool_filters)
+		return;
+
+	eth_filter_info = adap->ethtool_filters->port;
+
+	if (eth_filter_info) {
+		for (i = 0; i < adap->params.nports; i++) {
+			kvfree(eth_filter_info[i].loc_array);
+			kfree(eth_filter_info[i].bmap);
+		}
+		kfree(eth_filter_info);
+	}
+
+	kfree(adap->ethtool_filters);
+}
+
+int cxgb4_init_ethtool_filters(struct adapter *adap)
+{
+	struct cxgb4_ethtool_filter_info *eth_filter_info;
+	struct cxgb4_ethtool_filter *eth_filter;
+	struct tid_info *tids = &adap->tids;
+	u32 nentries, i;
+	int ret;
+
+	eth_filter = kzalloc(sizeof(*eth_filter), GFP_KERNEL);
+	if (!eth_filter)
+		return -ENOMEM;
+
+	eth_filter_info = kcalloc(adap->params.nports,
+				  sizeof(*eth_filter_info),
+				  GFP_KERNEL);
+	if (!eth_filter_info) {
+		ret = -ENOMEM;
+		goto free_eth_filter;
+	}
+
+	eth_filter->port = eth_filter_info;
+
+	nentries = tids->nhpftids + tids->nftids;
+	if (is_hashfilter(adap))
+		nentries += tids->nhash +
+			    (adap->tids.stid_base - adap->tids.tid_base);
+	eth_filter->nentries = nentries;
+
+	for (i = 0; i < adap->params.nports; i++) {
+		eth_filter->port[i].loc_array = kvzalloc(nentries, GFP_KERNEL);
+		if (!eth_filter->port[i].loc_array) {
+			ret = -ENOMEM;
+			goto free_eth_finfo;
+		}
+
+		eth_filter->port[i].bmap = kcalloc(BITS_TO_LONGS(nentries),
+						   sizeof(unsigned long),
+						   GFP_KERNEL);
+		if (!eth_filter->port[i].bmap) {
+			ret = -ENOMEM;
+			goto free_eth_finfo;
+		}
+	}
+
+	adap->ethtool_filters = eth_filter;
+	return 0;
+
+free_eth_finfo:
+	while (i-- > 0) {
+		kfree(eth_filter->port[i].bmap);
+		kvfree(eth_filter->port[i].loc_array);
+	}
+	kfree(eth_filter_info);
+
+free_eth_filter:
+	kfree(eth_filter);
+
+	return ret;
+}
+
 void cxgb4_set_ethtool_ops(struct net_device *netdev)
 {
 	netdev->ethtool_ops = &cxgb_ethtool_ops;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
index b0751c0611ec..807a8dafec45 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_filter.h
@@ -53,4 +53,6 @@ void clear_all_filters(struct adapter *adapter);
 void init_hash_filter(struct adapter *adap);
 bool is_filter_exact_match(struct adapter *adap,
 			   struct ch_filter_specification *fs);
+void cxgb4_cleanup_ethtool_filters(struct adapter *adap);
+int cxgb4_init_ethtool_filters(struct adapter *adap);
 #endif /* __CXGB4_FILTER_H */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 854b1717a70d..501917751b7f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -5860,6 +5860,7 @@ static void free_some_resources(struct adapter *adapter)
 	cxgb4_cleanup_tc_mqprio(adapter);
 	cxgb4_cleanup_tc_flower(adapter);
 	cxgb4_cleanup_tc_u32(adapter);
+	cxgb4_cleanup_ethtool_filters(adapter);
 	kfree(adapter->sge.egr_map);
 	kfree(adapter->sge.ingr_map);
 	kfree(adapter->sge.starving_fl);
@@ -6493,6 +6494,24 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 				 i);
 	}
 
+	if (is_offload(adapter) || is_hashfilter(adapter)) {
+		if (t4_read_reg(adapter, LE_DB_CONFIG_A) & HASHEN_F) {
+			u32 v;
+
+			v = t4_read_reg(adapter, LE_DB_HASH_CONFIG_A);
+			if (chip_ver <= CHELSIO_T5) {
+				adapter->tids.nhash = 1 << HASHTIDSIZE_G(v);
+				v = t4_read_reg(adapter, LE_DB_TID_HASHBASE_A);
+				adapter->tids.hash_base = v / 4;
+			} else {
+				adapter->tids.nhash = HASHTBLSIZE_G(v) << 3;
+				v = t4_read_reg(adapter,
+						T6_LE_DB_HASH_TID_BASE_A);
+				adapter->tids.hash_base = v;
+			}
+		}
+	}
+
 	if (tid_init(&adapter->tids) < 0) {
 		dev_warn(&pdev->dev, "could not allocate TID table, "
 			 "continuing\n");
@@ -6514,22 +6533,9 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		if (cxgb4_init_tc_matchall(adapter))
 			dev_warn(&pdev->dev,
 				 "could not offload tc matchall, continuing\n");
-	}
-
-	if (is_offload(adapter) || is_hashfilter(adapter)) {
-		if (t4_read_reg(adapter, LE_DB_CONFIG_A) & HASHEN_F) {
-			u32 hash_base, hash_reg;
-
-			if (chip_ver <= CHELSIO_T5) {
-				hash_reg = LE_DB_TID_HASHBASE_A;
-				hash_base = t4_read_reg(adapter, hash_reg);
-				adapter->tids.hash_base = hash_base / 4;
-			} else {
-				hash_reg = T6_LE_DB_HASH_TID_BASE_A;
-				hash_base = t4_read_reg(adapter, hash_reg);
-				adapter->tids.hash_base = hash_base;
-			}
-		}
+		if (cxgb4_init_ethtool_filters(adapter))
+			dev_warn(&pdev->dev,
+				 "could not initialize ethtool filters, continuing\n");
 	}
 
 	/* See what interrupts we'll be using */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index dbce99b209d6..a963fd0b4540 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -106,6 +106,8 @@ struct tid_info {
 	unsigned long *stid_bmap;
 	unsigned int nstids;
 	unsigned int stid_base;
+
+	unsigned int nhash;
 	unsigned int hash_base;
 
 	union aopen_entry *atid_tab;
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
index 4b697550f08d..065c01c654ff 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_regs.h
@@ -3044,6 +3044,10 @@
 #define HASHTIDSIZE_M    0x3fU
 #define HASHTIDSIZE_G(x) (((x) >> HASHTIDSIZE_S) & HASHTIDSIZE_M)
 
+#define HASHTBLSIZE_S    3
+#define HASHTBLSIZE_M    0x1ffffU
+#define HASHTBLSIZE_G(x) (((x) >> HASHTBLSIZE_S) & HASHTBLSIZE_M)
+
 #define LE_DB_HASH_TID_BASE_A 0x19c30
 #define LE_DB_HASH_TBL_BASE_ADDR_A 0x19c30
 #define LE_DB_INT_CAUSE_A 0x19c3c
-- 
2.21.1


^ permalink raw reply related

* [PATCH net-next 0/5] cxgb4: add support for ethtool n-tuple filters
From: Vishal Kulkarni @ 2020-06-19 14:21 UTC (permalink / raw)
  To: netdev, davem; +Cc: nirranjan, rahul.lakkireddy, Vishal Kulkarni

Patch 1: Adds data structure to maintain list of filters and handles init/dinit
	 of the same.

Patch 2: Handles addition of filters via ETHTOOL_SRXCLSRLINS.

Patch 3: Handles deletion of filtes via ETHTOOL_SRXCLSRLDEL.

Patch 4: Handles viewing of added filters.

Patch 5: Adds FLOW_ACTION_QUEUE support.

Vishal Kulkarni (5):
  cxgb4: add skeleton for ethtool n-tuple filters
  cxgb4: add ethtool n-tuple filter insertion
  cxgb4: add ethtool n-tuple filter deletion
  cxgb4: add support to fetch ethtool n-tuple filters
  cxgb4: add action to steer flows to specific Rxq

 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h    |  15 +
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c    | 335 ++++++++++++++++++
 .../net/ethernet/chelsio/cxgb4/cxgb4_filter.c |   5 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_filter.h |   3 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  22 +-
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.c  | 131 ++++---
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.h  |   9 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_uld.h    |   2 +
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h  |   4 +
 9 files changed, 463 insertions(+), 63 deletions(-)

-- 
2.18.2


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox