Netdev List

* [PATCH v2 07/15] net: davinci_emac: use nvmem to retrieve the mac address
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

All users that store the MAC address in EEPROM now register the relevant
nvmem cells. Switch to retrieving the MAC address through the nvmem
framework. If we can't get the nvmem cell, fall back to using the
device tree.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 drivers/net/ethernet/ti/davinci_emac.c | 33 ++++++++++++++++++--------
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index a1a6445b5a7e..48b70bc7b9cf 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -67,7 +67,7 @@
 #include <linux/of_irq.h>
 #include <linux/of_net.h>
 #include <linux/mfd/syscon.h>
-
+#include <linux/nvmem-consumer.h>
 #include <asm/irq.h>
 #include <asm/page.h>
 
@@ -1696,7 +1696,6 @@ davinci_emac_of_get_pdata(struct platform_device *pdev, struct emac_priv *priv)
 	const struct of_device_id *match;
 	const struct emac_platform_data *auxdata;
 	struct emac_platform_data *pdata = NULL;
-	const u8 *mac_addr;
 
 	if (!IS_ENABLED(CONFIG_OF) || !pdev->dev.of_node)
 		return dev_get_platdata(&pdev->dev);
@@ -1708,12 +1707,6 @@ davinci_emac_of_get_pdata(struct platform_device *pdev, struct emac_priv *priv)
 	np = pdev->dev.of_node;
 	pdata->version = EMAC_VERSION_2;
 
-	if (!is_valid_ether_addr(pdata->mac_addr)) {
-		mac_addr = of_get_mac_address(np);
-		if (mac_addr)
-			ether_addr_copy(pdata->mac_addr, mac_addr);
-	}
-
 	of_property_read_u32(np, "ti,davinci-ctrl-reg-offset",
 			     &pdata->ctrl_reg_offset);
 
@@ -1783,7 +1776,9 @@ static int davinci_emac_probe(struct platform_device *pdev)
 	struct cpdma_params dma_params;
 	struct clk *emac_clk;
 	unsigned long emac_bus_frequency;
-
+	struct nvmem_cell *cell;
+	const void *mac_addr;
+	size_t mac_addr_len;
 
 	/* obtain emac clock from kernel */
 	emac_clk = devm_clk_get(&pdev->dev, NULL);
@@ -1815,8 +1810,26 @@ static int davinci_emac_probe(struct platform_device *pdev)
 		goto err_free_netdev;
 	}
 
+	cell = nvmem_cell_get(&pdev->dev, "mac-address");
+	if (!IS_ERR(cell)) {
+		mac_addr = nvmem_cell_read(cell, &mac_addr_len);
+		if (!IS_ERR(mac_addr)) {
+			if (is_valid_ether_addr(mac_addr)) {
+				dev_info(&pdev->dev,
+					 "Read MAC addr from EEPROM: %pM\n",
+					 mac_addr);
+				ether_addr_copy(priv->mac_addr, mac_addr);
+			}
+			kfree(mac_addr);
+		}
+		nvmem_cell_put(cell);
+	} else {
+		mac_addr = of_get_mac_address(np);
+		if (mac_addr)
+			ether_addr_copy(priv->mac_addr, mac_addr);
+	}
+
 	/* MAC addr and PHY mask , RMII enable info from platform_data */
-	memcpy(priv->mac_addr, pdata->mac_addr, ETH_ALEN);
 	priv->phy_id = pdata->phy_id;
 	priv->rmii_en = pdata->rmii_en;
 	priv->version = pdata->version;
-- 
2.17.1
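
The selection order described in the commit message (nvmem cell first,
device tree second) can be sketched in plain userspace C. This is only an
illustration, not the driver code: pick_mac() and mac_is_valid() are
hypothetical names, and unlike the hunk above the sketch also checks the
cell length and falls back to the device tree value when the nvmem
contents are present but invalid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Userspace stand-in for the kernel's is_valid_ether_addr():
 * rejects the all-zero address and any multicast address. */
static bool mac_is_valid(const uint8_t *addr)
{
	static const uint8_t zero[ETH_ALEN];

	if (memcmp(addr, zero, ETH_ALEN) == 0)
		return false;
	return (addr[0] & 0x01) == 0;	/* bit 0 of octet 0 marks multicast */
}

/* Hypothetical helper: try the nvmem bytes first, then the device tree
 * bytes. A NULL pointer means that source is not available.
 * Returns true and fills 'out' when a usable address was found. */
static bool pick_mac(const uint8_t *nvmem, size_t nvmem_len,
		     const uint8_t *dt, uint8_t *out)
{
	assert(out != NULL);

	if (nvmem && nvmem_len == ETH_ALEN && mac_is_valid(nvmem)) {
		memcpy(out, nvmem, ETH_ALEN);
		return true;
	}
	if (dt && mac_is_valid(dt)) {
		memcpy(out, dt, ETH_ALEN);
		return true;
	}
	return false;
}
```

Whether an invalid EEPROM value should trigger the device tree fallback is
a design choice; the sketch picks the more forgiving behaviour.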

* [PATCH v2 09/15] ARM: davinci: dm365-evm: use device properties for at24 eeprom
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: netdev, linux-omap, linux-kernel, linux-arm-kernel,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

We want to work towards phasing out the at24_platform_data structure.
There are few users and its contents can be represented using generic
device properties. Using device properties only will allow us to
significantly simplify the at24 configuration code.

Remove the at24_platform_data structure and replace it with an array
of property entries. Drop the byte_len/size property, as the model name
already implies the EEPROM's size.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/board-dm365-evm.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mach-davinci/board-dm365-evm.c b/arch/arm/mach-davinci/board-dm365-evm.c
index df640d977bfa..ffe93265f565 100644
--- a/arch/arm/mach-davinci/board-dm365-evm.c
+++ b/arch/arm/mach-davinci/board-dm365-evm.c
@@ -18,7 +18,7 @@
 #include <linux/i2c.h>
 #include <linux/io.h>
 #include <linux/clk.h>
-#include <linux/platform_data/at24.h>
+#include <linux/property.h>
 #include <linux/leds.h>
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/partitions.h>
@@ -179,18 +179,15 @@ static struct nvmem_cell_lookup dm365evm_mac_address_cell = {
 	.nvmem_name = "1-00500",
 };
 
-static struct at24_platform_data eeprom_info = {
-	.byte_len       = (256*1024) / 8,
-	.page_size      = 64,
-	.flags          = AT24_FLAG_ADDR16,
-	.setup          = davinci_get_mac_addr,
-	.context	= (void *)0x7f00,
+static const struct property_entry eeprom_properties[] = {
+	PROPERTY_ENTRY_U32("pagesize", 64),
+	{ }
 };
 
 static struct i2c_board_info i2c_info[] = {
 	{
 		I2C_BOARD_INFO("24c256", 0x50),
-		.platform_data	= &eeprom_info,
+		.properties = eeprom_properties,
 	},
 	{
 		I2C_BOARD_INFO("tlv320aic3x", 0x18),
-- 
2.17.1
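
The claim that the model name already implies the size is easy to check
by hand: "24c256" is a 256 Kbit part, and 256 * 1024 / 8 = 32768 bytes,
the same value as the removed .byte_len initializer. A small userspace
sketch of that arithmetic follows. eeprom_bytes_from_name() is a
hypothetical helper; it is not how the at24 driver derives the size, and
it only understands plain "24c<kbit>" names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the EEPROM size in bytes for names such as "24c256",
 * or 0 when the name does not follow the "24c<kbit>" pattern. */
static unsigned long eeprom_bytes_from_name(const char *name)
{
	static const char prefix[] = "24c";
	char *end;
	unsigned long kbit;

	assert(name != NULL);
	if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
		return 0;

	name += sizeof(prefix) - 1;
	if (*name < '0' || *name > '9')
		return 0;

	kbit = strtoul(name, &end, 10);
	if (*end != '\0' || kbit == 0)
		return 0;

	return kbit * 1024 / 8;	/* Kbit to bytes */
}
```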

* [PATCH v2 10/15] ARM: davinci: da830-evm: use device properties for at24 eeprom
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: netdev, linux-omap, linux-kernel, linux-arm-kernel,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

We want to work towards phasing out the at24_platform_data structure.
There are few users and its contents can be represented using generic
device properties. Using device properties only will allow us to
significantly simplify the at24 configuration code.

Remove the at24_platform_data structure and replace it with an array
of property entries. Drop the byte_len/size property, as the model name
already implies the EEPROM's size.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/board-da830-evm.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mach-davinci/board-da830-evm.c b/arch/arm/mach-davinci/board-da830-evm.c
index 3be3e93f2f18..779d09581169 100644
--- a/arch/arm/mach-davinci/board-da830-evm.c
+++ b/arch/arm/mach-davinci/board-da830-evm.c
@@ -18,7 +18,7 @@
 #include <linux/platform_device.h>
 #include <linux/i2c.h>
 #include <linux/platform_data/pcf857x.h>
-#include <linux/platform_data/at24.h>
+#include <linux/property.h>
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/partitions.h>
 #include <linux/spi/spi.h>
@@ -419,12 +419,9 @@ static struct nvmem_cell_lookup da830_evm_mac_address_cell = {
 	.nvmem_name = "1-00500",
 };
 
-static struct at24_platform_data da830_evm_i2c_eeprom_info = {
-	.byte_len	= SZ_256K / 8,
-	.page_size	= 64,
-	.flags		= AT24_FLAG_ADDR16,
-	.setup		= davinci_get_mac_addr,
-	.context	= (void *)0x7f00,
+static const struct property_entry da830_evm_i2c_eeprom_properties[] = {
+	PROPERTY_ENTRY_U32("pagesize", 64),
+	{ }
 };
 
 static int __init da830_evm_ui_expander_setup(struct i2c_client *client,
@@ -458,7 +455,7 @@ static struct pcf857x_platform_data __initdata da830_evm_ui_expander_info = {
 static struct i2c_board_info __initdata da830_evm_i2c_devices[] = {
 	{
 		I2C_BOARD_INFO("24c256", 0x50),
-		.platform_data	= &da830_evm_i2c_eeprom_info,
+		.properties = da830_evm_i2c_eeprom_properties,
 	},
 	{
 		I2C_BOARD_INFO("tlv320aic3x", 0x18),
-- 
2.17.1

* [PATCH v2 11/15] ARM: davinci: dm644x-evm: use device properties for at24 eeprom
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

We want to work towards phasing out the at24_platform_data structure.
There are few users and its contents can be represented using generic
device properties. Using device properties only will allow us to
significantly simplify the at24 configuration code.

Remove the at24_platform_data structure and replace it with an array
of property entries. Drop the byte_len/size property, as the model name
already implies the EEPROM's size.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/board-dm644x-evm.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mach-davinci/board-dm644x-evm.c b/arch/arm/mach-davinci/board-dm644x-evm.c
index adbe8630ef19..5b26a8c5bbd8 100644
--- a/arch/arm/mach-davinci/board-dm644x-evm.c
+++ b/arch/arm/mach-davinci/board-dm644x-evm.c
@@ -16,8 +16,8 @@
 #include <linux/gpio/machine.h>
 #include <linux/i2c.h>
 #include <linux/platform_data/pcf857x.h>
-#include <linux/platform_data/at24.h>
 #include <linux/platform_data/gpio-davinci.h>
+#include <linux/property.h>
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/rawnand.h>
 #include <linux/mtd/partitions.h>
@@ -486,12 +486,9 @@ static struct nvmem_cell_lookup dm6446evm_mac_address_cell = {
 	.nvmem_name = "1-00500",
 };
 
-static struct at24_platform_data eeprom_info = {
-	.byte_len	= (256*1024) / 8,
-	.page_size	= 64,
-	.flags		= AT24_FLAG_ADDR16,
-	.setup          = davinci_get_mac_addr,
-	.context	= (void *)0x7f00,
+static const struct property_entry eeprom_properties[] = {
+	PROPERTY_ENTRY_U32("pagesize", 64),
+	{ }
 };
 
 /*
@@ -601,7 +598,7 @@ static struct i2c_board_info __initdata i2c_info[] =  {
 	},
 	{
 		I2C_BOARD_INFO("24c256", 0x50),
-		.platform_data	= &eeprom_info,
+		.properties = eeprom_properties,
 	},
 	{
 		I2C_BOARD_INFO("tlv320aic33", 0x1b),
-- 
2.17.1

* [PATCH v2 12/15] ARM: davinci: dm646x-evm: use device properties for at24 eeprom
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

We want to work towards phasing out the at24_platform_data structure.
There are few users and its contents can be represented using generic
device properties. Using device properties only will allow us to
significantly simplify the at24 configuration code.

Remove the at24_platform_data structure and replace it with an array
of property entries. Drop the byte_len/size property, as the model name
already implies the EEPROM's size.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/board-dm646x-evm.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mach-davinci/board-dm646x-evm.c b/arch/arm/mach-davinci/board-dm646x-evm.c
index 4c82d38033b6..8c585e7be180 100644
--- a/arch/arm/mach-davinci/board-dm646x-evm.c
+++ b/arch/arm/mach-davinci/board-dm646x-evm.c
@@ -22,7 +22,7 @@
 #include <linux/gpio.h>
 #include <linux/platform_device.h>
 #include <linux/i2c.h>
-#include <linux/platform_data/at24.h>
+#include <linux/property.h>
 #include <linux/platform_data/pcf857x.h>
 
 #include <media/i2c/tvp514x.h>
@@ -320,12 +320,9 @@ static struct nvmem_cell_lookup dm646x_evm_mac_address_cell = {
 	.nvmem_name = "1-00500",
 };
 
-static struct at24_platform_data eeprom_info = {
-	.byte_len       = (256*1024) / 8,
-	.page_size      = 64,
-	.flags          = AT24_FLAG_ADDR16,
-	.setup          = davinci_get_mac_addr,
-	.context	= (void *)0x7f00,
+static const struct property_entry eeprom_properties[] = {
+	PROPERTY_ENTRY_U32("pagesize", 64),
+	{ }
 };
 #endif
 
@@ -396,7 +393,7 @@ static void evm_init_cpld(void)
 static struct i2c_board_info __initdata i2c_info[] =  {
 	{
 		I2C_BOARD_INFO("24c256", 0x50),
-		.platform_data  = &eeprom_info,
+		.properties  = eeprom_properties,
 	},
 	{
 		I2C_BOARD_INFO("pcf8574a", 0x38),
-- 
2.17.1

* [PATCH v2 13/15] ARM: davinci: sffsdr: fix the at24 eeprom device name
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

The currently used 24lc64 i2c device name doesn't match against any
of the devices supported by the at24 driver. Change it to the closest
compatible chip.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/board-sffsdr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-davinci/board-sffsdr.c b/arch/arm/mach-davinci/board-sffsdr.c
index e7c1728b0833..f6a4d094cbc3 100644
--- a/arch/arm/mach-davinci/board-sffsdr.c
+++ b/arch/arm/mach-davinci/board-sffsdr.c
@@ -100,7 +100,7 @@ static struct at24_platform_data eeprom_info = {
 
 static struct i2c_board_info __initdata i2c_info[] =  {
 	{
-		I2C_BOARD_INFO("24lc64", 0x50),
+		I2C_BOARD_INFO("24c64", 0x50),
 		.platform_data	= &eeprom_info,
 	},
 	/* Other I2C devices:
-- 
2.17.1
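
The reason "24lc64" never binds is that I2C device names are matched
against the driver's id table by exact string comparison. A minimal
userspace sketch of that matching is shown here. known_ids[] is an
illustrative subset chosen for this example, not the driver's full table,
and id_is_known() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative subset of device names; not the driver's full id table. */
static const char *const known_ids[] = { "24c64", "24c128", "24c256" };

/* Exact-match lookup: a name either equals a table entry or is unknown. */
static bool id_is_known(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(known_ids) / sizeof(known_ids[0]); i++)
		if (strcmp(known_ids[i], name) == 0)
			return true;
	return false;
}
```

With exact matching there is no notion of a "close enough" name, which is
why the board file has to carry a name the table actually contains.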

* [PATCH v2 14/15] ARM: davinci: sffsdr: use device properties for at24 eeprom
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

We want to work towards phasing out the at24_platform_data structure.
There are few users and its contents can be represented using generic
device properties. Using device properties only will allow us to
significantly simplify the at24 configuration code.

Remove the at24_platform_data structure and replace it with an array
of property entries. Drop the byte_len/size property, as the model name
already implies the EEPROM's size.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/board-sffsdr.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/arm/mach-davinci/board-sffsdr.c b/arch/arm/mach-davinci/board-sffsdr.c
index f6a4d094cbc3..680e5d7628a8 100644
--- a/arch/arm/mach-davinci/board-sffsdr.c
+++ b/arch/arm/mach-davinci/board-sffsdr.c
@@ -26,7 +26,7 @@
 #include <linux/init.h>
 #include <linux/platform_device.h>
 #include <linux/i2c.h>
-#include <linux/platform_data/at24.h>
+#include <linux/property.h>
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/rawnand.h>
 #include <linux/mtd/partitions.h>
@@ -92,16 +92,15 @@ static struct platform_device davinci_sffsdr_nandflash_device = {
 	.resource	= davinci_sffsdr_nandflash_resource,
 };
 
-static struct at24_platform_data eeprom_info = {
-	.byte_len	= (64*1024) / 8,
-	.page_size	= 32,
-	.flags		= AT24_FLAG_ADDR16,
+static const struct property_entry eeprom_properties[] = {
+	PROPERTY_ENTRY_U32("pagesize", 32),
+	{ },
 };
 
 static struct i2c_board_info __initdata i2c_info[] =  {
 	{
 		I2C_BOARD_INFO("24c64", 0x50),
-		.platform_data	= &eeprom_info,
+		.properties = eeprom_properties,
 	},
 	/* Other I2C devices:
 	 * MSP430,  addr 0x23 (not used)
-- 
2.17.1

* Re: [PATCH] selftests: bpf: notification about privilege required to run test_lirc_mode2.sh testing script
From: Daniel Borkmann @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Jeffrin Jose T, ast, shuah; +Cc: netdev, linux-kernel, linux-kselftest
In-Reply-To: <20180622185417.3452-1-ahiliation@gmail.com>

On 06/22/2018 08:54 PM, Jeffrin Jose T wrote:
> The test_lirc_mode2.sh script requires root privilege for the successful
> execution of the test.
> 
> This patch is to notify the user about the privilege the script
> demands for the successful execution of the test.
> 
> Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>

Applied to bpf, thanks Jeffrin!

* [PATCH v2 15/15] ARM: davinci: remove dead code
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

There are no more users of davinci_get_mac_addr(). Remove it. Also
remove the mac_addr field from the emac platform data struct.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/common.c | 15 ---------------
 include/linux/davinci_emac.h   |  2 --
 2 files changed, 17 deletions(-)

diff --git a/arch/arm/mach-davinci/common.c b/arch/arm/mach-davinci/common.c
index bcb6a7ba84e9..0c6cc354a4aa 100644
--- a/arch/arm/mach-davinci/common.c
+++ b/arch/arm/mach-davinci/common.c
@@ -28,21 +28,6 @@ EXPORT_SYMBOL(davinci_soc_info);
 void __iomem *davinci_intc_base;
 int davinci_intc_type;
 
-void davinci_get_mac_addr(struct nvmem_device *nvmem, void *context)
-{
-	char *mac_addr = davinci_soc_info.emac_pdata->mac_addr;
-	off_t offset = (off_t)context;
-
-	if (!IS_BUILTIN(CONFIG_NVMEM)) {
-		pr_warn("Cannot read MAC addr from EEPROM without CONFIG_NVMEM\n");
-		return;
-	}
-
-	/* Read MAC addr from EEPROM */
-	if (nvmem_device_read(nvmem, offset, ETH_ALEN, mac_addr) == ETH_ALEN)
-		pr_info("Read MAC addr from EEPROM: %pM\n", mac_addr);
-}
-
 static int __init davinci_init_id(struct davinci_soc_info *soc_info)
 {
 	int			i;
diff --git a/include/linux/davinci_emac.h b/include/linux/davinci_emac.h
index 05b97144d342..19888b27706d 100644
--- a/include/linux/davinci_emac.h
+++ b/include/linux/davinci_emac.h
@@ -19,7 +19,6 @@ struct mdio_platform_data {
 };
 
 struct emac_platform_data {
-	char mac_addr[ETH_ALEN];
 	u32 ctrl_reg_offset;
 	u32 ctrl_mod_reg_offset;
 	u32 ctrl_ram_offset;
@@ -46,5 +45,4 @@ enum {
 	EMAC_VERSION_2,	/* DM646x */
 };
 
-void davinci_get_mac_addr(struct nvmem_device *nvmem, void *context);
 #endif
-- 
2.17.1

* [PATCH v2 08/15] ARM: davinci: mityomapl138: don't read the MAC address from machine code
From: Bartosz Golaszewski @ 2018-06-26 10:22 UTC (permalink / raw)
  To: Sekhar Nori, Kevin Hilman, Russell King, Grygorii Strashko,
	David S . Miller, Srinivas Kandagatla, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-1-brgl@bgdev.pl>

From: Bartosz Golaszewski <bgolaszewski@baylibre.com>

This is now done by the emac driver using a registered nvmem cell.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
---
 arch/arm/mach-davinci/board-mityomapl138.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/arm/mach-davinci/board-mityomapl138.c b/arch/arm/mach-davinci/board-mityomapl138.c
index 2ec31ff61dbd..6263e6afcbf0 100644
--- a/arch/arm/mach-davinci/board-mityomapl138.c
+++ b/arch/arm/mach-davinci/board-mityomapl138.c
@@ -120,7 +120,6 @@ static void read_factory_config(struct nvmem_device *nvmem, void *context)
 {
 	int ret;
 	const char *partnum = NULL;
-	struct davinci_soc_info *soc_info = &davinci_soc_info;
 
 	if (!IS_BUILTIN(CONFIG_NVMEM)) {
 		pr_warn("Factory Config not available without CONFIG_NVMEM\n");
@@ -146,13 +145,6 @@ static void read_factory_config(struct nvmem_device *nvmem, void *context)
 		goto bad_config;
 	}
 
-	pr_info("Found MAC = %pM\n", factory_config.mac);
-	if (is_valid_ether_addr(factory_config.mac))
-		memcpy(soc_info->emac_pdata->mac_addr,
-			factory_config.mac, ETH_ALEN);
-	else
-		pr_warn("Invalid MAC found in factory config block\n");
-
 	partnum = factory_config.partnum;
 	pr_info("Part Number = %s\n", partnum);
 
-- 
2.17.1

* Re: [RESEND PATCH] bpfilter: check compiler capability in Kconfig
From: Daniel Borkmann @ 2018-06-26 10:29 UTC (permalink / raw)
  To: Masahiro Yamada, David S . Miller, netdev, Alexei Starovoitov
  Cc: Matteo Croce, Arnd Bergmann, linux-kbuild, Alexei Starovoitov,
	linux-kernel, Michal Marek
In-Reply-To: <1529985336-27522-1-git-send-email-yamada.masahiro@socionext.com>

On 06/26/2018 05:55 AM, Masahiro Yamada wrote:
> With the brand-new syntax extension of Kconfig, we can directly
> check the compiler capability in the configuration phase.
> 
> If the cc-can-link.sh fails, the BPFILTER_UMH is automatically
> hidden by the dependency.
> 
> I also deleted 'default n', which is a no-op.
> 
> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

* Re: [PATCH v3,net-next] vlan: implement vlan id and protocol changes
From: Ido Schimmel @ 2018-06-26 10:32 UTC (permalink / raw)
  To: David Ahern; +Cc: Chas Williams, davem, netdev, Roopa Prabhu, Ido Schimmel
In-Reply-To: <c39f97c6-7ae2-e66f-2226-acbe5c1352d2@cumulusnetworks.com>

On Mon, Jun 25, 2018 at 02:45:24PM -0600, David Ahern wrote:
> On 6/25/18 4:30 AM, Chas Williams wrote:
> > vlan_changelink silently ignores attempts to change the vlan id
> > or protocol id of an existing vlan interface.  Implement by adding
> > the new vlan id and protocol to the interface's vlan group and then
> > removing the old vlan id and protocol from the vlan group.
> > 
> > Signed-off-by: Chas Williams <3chas3@gmail.com>
> > ---
> >  include/linux/netdevice.h |  1 +
> >  net/8021q/vlan.c          |  4 ++--
> >  net/8021q/vlan.h          |  2 ++
> >  net/8021q/vlan_netlink.c  | 38 ++++++++++++++++++++++++++++++++++++++
> >  net/core/dev.c            |  1 +
> >  5 files changed, 44 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 3ec9850c7936..a95ae238addf 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2409,6 +2409,7 @@ enum netdev_cmd {
> >  	NETDEV_CVLAN_FILTER_DROP_INFO,
> >  	NETDEV_SVLAN_FILTER_PUSH_INFO,
> >  	NETDEV_SVLAN_FILTER_DROP_INFO,
> > +	NETDEV_CHANGEVLAN,
> >  };
> >  const char *netdev_cmd_to_name(enum netdev_cmd cmd);
> >  
> 
> You add the new notifier, but do not add any hooks to catch and process it.
> 
> Personally, I think it is a bit sketchy to change the vlan id on an
> existing device and I suspect it will cause latent errors.

+1

> 
> What's your use case for trying to implement the change versus causing
> it to generate an unsupported error?
> 
> If this patch does get accepted, I believe the mlxsw switchdev driver
> will be impacted.

Yes, at minimum we need to return an error for NETDEV_CHANGEVLAN, but
looking at the code it seems that there's no proper rollback.

Thanks for the Cc, David.
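
For reference, the "add the new id, then remove the old one" sequence with
a rollback when a listener rejects the change could look roughly like the
userspace sketch below. All names here (vlan_group_sketch, change_vid(),
notify_fn, reject_change()) are hypothetical and only model the control
flow; this is not the code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VID 4096

/* Hypothetical model of a vlan group: one "in use" flag per vlan id. */
struct vlan_group_sketch {
	bool used[MAX_VID];
};

/* Listener callback: returns 0 to accept the change, negative errno to veto. */
typedef int (*notify_fn)(int old_vid, int new_vid);

/* Sample listener that rejects every change. */
static int reject_change(int old_vid, int new_vid)
{
	(void)old_vid;
	(void)new_vid;
	return -EOPNOTSUPP;
}

static int change_vid(struct vlan_group_sketch *grp, int *cur_vid,
		      int new_vid, notify_fn notify)
{
	int old_vid, err;

	assert(grp != NULL && cur_vid != NULL);
	old_vid = *cur_vid;
	assert(old_vid >= 0 && old_vid < MAX_VID);

	if (new_vid < 0 || new_vid >= MAX_VID)
		return -EINVAL;
	if (new_vid == old_vid)
		return 0;
	if (grp->used[new_vid])
		return -EEXIST;

	/* Step 1: add the new id while the old one is still registered. */
	grp->used[new_vid] = true;
	*cur_vid = new_vid;

	/* Step 2: let listeners veto; undo step 1 if any of them does. */
	if (notify) {
		err = notify(old_vid, new_vid);
		if (err) {
			grp->used[new_vid] = false;
			*cur_vid = old_vid;
			return err;
		}
	}

	/* Step 3: only now drop the old id. */
	grp->used[old_vid] = false;
	return 0;
}
```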

* Re: [PATCH 1/2] sh_eth: fix *enum* RPADIR_BIT
From: Sergei Shtylyov @ 2018-06-26 10:37 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: netdev, David S. Miller, Linux-Renesas
In-Reply-To: <CAMuHMdXro+fWQtoURHeqC6APqmci60NiMn5JAF98xcLxrjLNEg@mail.gmail.com>

On 6/26/2018 10:25 AM, Geert Uytterhoeven wrote:

>> The *enum*  RPADIR_BIT  was declared in the commit 86a74ff21a7a ("net:
>> sh_eth: add support for Renesas SuperH Ethernet") adding SH771x support,
>> however the SH771x manual doesn't have the RPADIR register described and,
>> moreover, tells why the padding insertion must not be used. The newer SoC
>> manuals do have RPADIR documented, though with somewhat different layout --
>> update the *enum* according to these manuals...
>>
>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> 
> Thanks for your patch!
> 
> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
> 
>> --- net-next.orig/drivers/net/ethernet/renesas/sh_eth.h
>> +++ net-next/drivers/net/ethernet/renesas/sh_eth.h
>> @@ -403,8 +403,7 @@ enum DESC_I_BIT {
>>
>>   /* RPADIR */
>>   enum RPADIR_BIT {
>> -       RPADIR_PADS1 = 0x20000, RPADIR_PADS0 = 0x10000,
>> -       RPADIR_PADR = 0x0003f,
>> +       RPADIR_PADS = 0x1f0000, RPADIR_PADR = 0xffff,
> 
> Perhaps add some comments?
> 
>          RPADIR_PADS = 0x1f0000, /* Padding Size (insert N bytes of padding) */
>          RPADIR_PADR = 0xffff,   /* Padding Slot (insert padding at byte N) */

    It would be nice but inconsistent with what we do for the other registers...

>>   };
> 
> Note that none of the RPADIR enums are actually used.

    I'd surely noted that. :-)

> Gr{oetje,eeting}s,
> 
>                          Geert
> 

MBR, Sergei
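
To make the new masks concrete: 0x1f0000 covers bits 20:16 and 0xffff
covers bits 15:0, so a register value can be packed and unpacked as in
the userspace sketch below. The field meanings are taken from Geert's
suggested comments; RPADIR_PADS_SHIFT and the rpadir_*() helpers are
hypothetical names that do not exist in sh_eth.h.

```c
#include <assert.h>
#include <stdint.h>

enum RPADIR_BIT {
	RPADIR_PADS = 0x1f0000,	/* padding size field, bits 20:16 */
	RPADIR_PADR = 0xffff,	/* padding slot field, bits 15:0 */
};

#define RPADIR_PADS_SHIFT 16	/* hypothetical name, derived from the mask */

/* Pack a padding size and a padding slot into one register value. */
static uint32_t rpadir_encode(uint32_t pad_size, uint32_t pad_slot)
{
	assert(pad_size <= ((uint32_t)RPADIR_PADS >> RPADIR_PADS_SHIFT));
	assert(pad_slot <= (uint32_t)RPADIR_PADR);
	return (pad_size << RPADIR_PADS_SHIFT) | pad_slot;
}

/* Extract the padding size field from a register value. */
static uint32_t rpadir_pad_size(uint32_t reg)
{
	return (reg & (uint32_t)RPADIR_PADS) >> RPADIR_PADS_SHIFT;
}

/* Extract the padding slot field from a register value. */
static uint32_t rpadir_pad_slot(uint32_t reg)
{
	return reg & (uint32_t)RPADIR_PADR;
}
```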

* Re: [net-next PATCH v4 6/7] net-sysfs: Add interface for Rx queue(s) map per Tx queue
From: Willem de Bruijn @ 2018-06-26 10:55 UTC (permalink / raw)
  To: Amritha Nambiar
  Cc: Network Development, David Miller, Alexander Duyck,
	Samudrala, Sridhar, Alexander Duyck, Eric Dumazet,
	Hannes Frederic Sowa, Tom Herbert
In-Reply-To: <152994988629.9733.12140400900480192885.stgit@anamhost.jf.intel.com>

On Mon, Jun 25, 2018 at 7:06 PM Amritha Nambiar
<amritha.nambiar@intel.com> wrote:
>
> Extend transmit queue sysfs attribute to configure Rx queue(s) map
> per Tx queue. By default no receive queues are configured for the
> Tx queue.
>
> - /sys/class/net/eth0/queues/tx-*/xps_rxqs
>
> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> ---

> +static ssize_t xps_rxqs_show(struct netdev_queue *queue, char *buf)
> +{
> +       struct net_device *dev = queue->dev;
> +       struct xps_dev_maps *dev_maps;
> +       unsigned long *mask, index;
> +       int j, len, num_tc = 1, tc = 0;
> +
> +       mask = kcalloc(BITS_TO_LONGS(dev->num_rx_queues), sizeof(long),
> +                      GFP_KERNEL);
> +       if (!mask)
> +               return -ENOMEM;
> +
> +       index = get_netdev_queue_index(queue);
> +
> +       if (dev->num_tc) {
> +               num_tc = dev->num_tc;
> +               tc = netdev_txq_to_tc(dev, index);
> +               if (tc < 0)
> +                       return -EINVAL;

This error path must free mask before returning.

> +static ssize_t xps_rxqs_store(struct netdev_queue *queue, const char *buf,
> +                             size_t len)
> +{
> +       struct net_device *dev = queue->dev;
> +       unsigned long *mask, index;
> +       int err;
> +
> +       if (!capable(CAP_NET_ADMIN))
> +               return -EPERM;

Should this be ns_capable()?
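
The usual way to avoid the leak on the early return is a single exit
label that frees the buffer. Below is a userspace sketch of that
pattern. show_sketch(), counted_calloc(), counted_free() and live_allocs
are hypothetical names used only to make the leak observable; the real
function would use kcalloc() and kfree().

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

static int live_allocs;	/* test aid: number of outstanding allocations */

static void *counted_calloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (p)
		live_allocs++;
	return p;
}

static void counted_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

#define BITS_PER_LONG_SKETCH (sizeof(long) * CHAR_BIT)

/* Stand-in for the show path: returns a positive count on success or a
 * negative errno. Every path after the allocation leaves via out_free. */
static int show_sketch(unsigned int num_rx_queues, int tc)
{
	size_t nlongs = (num_rx_queues + BITS_PER_LONG_SKETCH - 1) /
			BITS_PER_LONG_SKETCH;
	unsigned long *mask;
	int ret;

	mask = counted_calloc(nlongs ? nlongs : 1, sizeof(long));
	if (!mask)
		return -ENOMEM;

	if (tc < 0) {
		ret = -EINVAL;
		goto out_free;	/* the early error no longer leaks mask */
	}

	mask[0] |= 1UL;		/* pretend rx queue 0 is mapped */
	ret = 1;

out_free:
	counted_free(mask);
	return ret;
}
```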

* Re: [net-next PATCH v4 3/7] net: sock: Change tx_queue_mapping in sock_common to unsigned short
From: Willem de Bruijn @ 2018-06-26 10:58 UTC (permalink / raw)
  To: Amritha Nambiar
  Cc: Network Development, David Miller, Alexander Duyck,
	Samudrala, Sridhar, Alexander Duyck, Eric Dumazet,
	Hannes Frederic Sowa, Tom Herbert
In-Reply-To: <152994986976.9733.18263514750793164132.stgit@anamhost.jf.intel.com>

On Mon, Jun 25, 2018 at 7:06 PM Amritha Nambiar
<amritha.nambiar@intel.com> wrote:
>
> Change 'skc_tx_queue_mapping' field in sock_common structure from
> 'int' to 'unsigned short' type with 0 indicating unset and
> a positive queue value being set. This way it is consistent with
> the queue_mapping field in the sk_buff. This will also accommodate
> adding a new 'unsigned short' field in sock_common in the next
> patch for rx_queue_mapping.
>
> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> ---

>  static inline void sk_tx_queue_set(struct sock *sk, int tx_queue)
>  {
> -       sk->sk_tx_queue_mapping = tx_queue;
> +       /* sk_tx_queue_mapping accepts only up to a 16-bit value */
> +       WARN_ON((unsigned short)tx_queue > USHRT_MAX);
> +       sk->sk_tx_queue_mapping = tx_queue + 1;
>  }

Use WARN_ON_ONCE to avoid flooding the kernel log buffer.
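
A side note on the quoted check: once tx_queue has been cast to unsigned
short it can never compare greater than USHRT_MAX, so the range test has
to happen on the int before narrowing. The userspace sketch below shows
the "0 means unset, otherwise queue + 1" encoding with such a check.
sock_sketch and the *_sketch() helpers are hypothetical names, and
returning false stands in for the warning.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the field: 0 = unset, otherwise queue index + 1. */
struct sock_sketch {
	unsigned short tx_queue_mapping;
};

/* Store a queue index. The range is checked on the int, before it is
 * narrowed, and leaves room for the +1 offset. */
static bool sk_tx_queue_set_sketch(struct sock_sketch *sk, int tx_queue)
{
	assert(sk != NULL);

	if (tx_queue < 0 || tx_queue >= USHRT_MAX)
		return false;

	sk->tx_queue_mapping = (unsigned short)(tx_queue + 1);
	return true;
}

/* Return the stored queue index, or -1 when nothing is set. */
static int sk_tx_queue_get_sketch(const struct sock_sketch *sk)
{
	assert(sk != NULL);
	return sk->tx_queue_mapping ? sk->tx_queue_mapping - 1 : -1;
}
```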

* Re: [net-next PATCH v4 5/7] net: Enable Tx queue selection based on Rx queues
From: Willem de Bruijn @ 2018-06-26 11:04 UTC (permalink / raw)
  To: Amritha Nambiar
  Cc: Network Development, David Miller, Alexander Duyck,
	Samudrala, Sridhar, Alexander Duyck, Eric Dumazet,
	Hannes Frederic Sowa, Tom Herbert
In-Reply-To: <152994988080.9733.10385317895413246222.stgit@anamhost.jf.intel.com>

On Mon, Jun 25, 2018 at 7:06 PM Amritha Nambiar
<amritha.nambiar@intel.com> wrote:
>
> This patch adds support to pick Tx queue based on the Rx queue(s) map
> configuration set by the admin through the sysfs attribute
> for each Tx queue. If the user configuration for receive queue(s) map
> does not apply, then the Tx queue selection falls back to CPU(s) map
> based selection and finally to hashing.
>
> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com>
> ---

> +static int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
>  {
>  #ifdef CONFIG_XPS
>         struct xps_dev_maps *dev_maps;
> -       struct xps_map *map;
> +       struct sock *sk = skb->sk;
>         int queue_index = -1;
>
>         if (!static_key_false(&xps_needed))
>                 return -1;
>
>         rcu_read_lock();
> -       dev_maps = rcu_dereference(dev->xps_cpus_map);
> +       if (!static_key_false(&xps_rxqs_needed))
> +               goto get_cpus_map;
> +
> +       dev_maps = rcu_dereference(dev->xps_rxqs_map);
>         if (dev_maps) {
> -               unsigned int tci = skb->sender_cpu - 1;
> +               int tci = sk_rx_queue_get(sk);

What if the rx device differs from the tx device?
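
For readers following along, the selection order from the commit message
(rx-queue map, then CPU map, then hashing) reduces to a simple fallback
chain. The userspace sketch below models only that order. txq_inputs,
pick_tx_queue() and NO_QUEUE are hypothetical names, and the plain modulo
is a simplification of however the stack really scales the hash.

```c
#include <assert.h>
#include <stddef.h>

#define NO_QUEUE (-1)

/* Hypothetical, already-resolved inputs for one packet. */
struct txq_inputs {
	int rx_map_queue;		/* from the rx-queue map, or NO_QUEUE */
	int cpu_map_queue;		/* from the CPU map, or NO_QUEUE */
	unsigned int flow_hash;		/* last-resort selector */
	unsigned int num_tx_queues;	/* must be non-zero */
};

/* Apply the fallback order: rx-queue map, CPU map, then hash. */
static int pick_tx_queue(const struct txq_inputs *in)
{
	assert(in != NULL && in->num_tx_queues > 0);

	if (in->rx_map_queue != NO_QUEUE)
		return in->rx_map_queue;
	if (in->cpu_map_queue != NO_QUEUE)
		return in->cpu_map_queue;
	return (int)(in->flow_hash % in->num_tx_queues);
}
```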

* Re: [PATCH v2 01/15] nvmem: add support for cell lookups
From: Srinivas Kandagatla @ 2018-06-26 11:06 UTC (permalink / raw)
  To: Bartosz Golaszewski, Sekhar Nori, Kevin Hilman, Russell King,
	Grygorii Strashko, David S . Miller, Lukas Wunner, Rob Herring,
	Florian Fainelli, Dan Carpenter, Ivan Khoronzhuk, David Lechner,
	Greg Kroah-Hartman, Andrew Lunn
  Cc: linux-arm-kernel, linux-kernel, linux-omap, netdev,
	Bartosz Golaszewski
In-Reply-To: <20180626102245.30711-2-brgl@bgdev.pl>

Thanks for the patch,

On 26/06/18 11:22, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bgolaszewski@baylibre.com>
> 
> We can currently only register nvmem cells from device tree or by
> manually calling nvmem_add_cells(). The latter option, however, forces
> users to make sure that the nvmem provider with which the cells are
> associated is registered before the call.
> 
> This patch proposes a new solution inspired by other frameworks that
> offer resource lookups (GPIO, PWM etc.). It adds a function that allows
> machine code to register nvmem lookups, which are later lazily used to
> add corresponding nvmem cells.
> 
Overall the idea looks fine to me.

This needs to be documented in ./Documentation/nvmem/nvmem.txt

> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
> ---
>   drivers/nvmem/core.c           | 57 +++++++++++++++++++++++++++++++++-
>   include/linux/nvmem-consumer.h |  6 ++++
>   include/linux/nvmem-provider.h |  6 ++++
>   3 files changed, 68 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
> index b5b0cdc21d01..a2e87b464319 100644
> --- a/drivers/nvmem/core.c
> +++ b/drivers/nvmem/core.c
> @@ -62,6 +62,9 @@ static DEFINE_IDA(nvmem_ida);
>   static LIST_HEAD(nvmem_cells);
>   static DEFINE_MUTEX(nvmem_cells_mutex);
>   
> +static LIST_HEAD(nvmem_cell_lookups);
> +static DEFINE_MUTEX(nvmem_lookup_mutex);
> +
>   #ifdef CONFIG_DEBUG_LOCK_ALLOC
>   static struct lock_class_key eeprom_lock_key;
>   #endif
> @@ -247,6 +250,23 @@ static const struct attribute_group *nvmem_ro_root_dev_groups[] = {
>   	NULL,
>   };
>   
> +/**
> + * nvmem_register_lookup() - register a number of nvmem cell lookup entries
> + *

Can we rename this to nvmem_add_lookup_table()?
"register" sounds a bit heavy here.

We should also have something like nvmem_remove_lookup_table() for 
consistency, and it should ensure that it clears the cells entry too.
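To illustrate the add/remove pairing suggested above, here is a minimal user-space sketch. It is not the nvmem implementation: the names cell_lookup, lookup_add_table(), lookup_remove_table() and lookup_count() are hypothetical, the list is hand-rolled rather than the kernel's list_head, and locking is left out. Kernel code would presumably use list_add_tail()/list_del() under nvmem_lookup_mutex, and the removal path would also need to drop any cells created from the entries, which this sketch does not model.

```c
#include <stddef.h>

/* Hypothetical lookup entry; the real struct nvmem_cell_lookup differs. */
struct cell_lookup {
	const char *cell_name;
	struct cell_lookup *prev, *next;
};

/* Circular list head that initially points at itself (empty list). */
static struct cell_lookup lookup_head = { NULL, &lookup_head, &lookup_head };

/* Append every entry of the table to the tail of the global list. */
static void lookup_add_table(struct cell_lookup *table, size_t nentries)
{
	size_t i;

	for (i = 0; i < nentries; i++) {
		struct cell_lookup *e = &table[i];

		e->prev = lookup_head.prev;
		e->next = &lookup_head;
		lookup_head.prev->next = e;
		lookup_head.prev = e;
	}
}

/* Unlink every entry of a previously added table. */
static void lookup_remove_table(struct cell_lookup *table, size_t nentries)
{
	size_t i;

	for (i = 0; i < nentries; i++) {
		struct cell_lookup *e = &table[i];

		e->prev->next = e->next;
		e->next->prev = e->prev;
		e->prev = e->next = NULL;
	}
}

/* Count the entries currently on the list. */
static size_t lookup_count(void)
{
	const struct cell_lookup *e;
	size_t n = 0;

	for (e = lookup_head.next; e != &lookup_head; e = e->next)
		n++;
	return n;
}
```

Taking the table as a unit in both directions keeps the caller from having to track individual entries, which is the consistency argument made above.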

> + * @lookup: array of nvmem cell lookup entries
> + * @nentries: number of lookup entries in the array
> + */
> +void nvmem_register_lookup(struct nvmem_cell_lookup *lookup, size_t nentries)
> +{
> +	int i;
> +
> +	mutex_lock(&nvmem_lookup_mutex);
> +	for (i = 0; i < nentries; i++)
> +		list_add_tail(&lookup[i].list, &nvmem_cell_lookups);
> +	mutex_unlock(&nvmem_lookup_mutex);
> +}
> +EXPORT_SYMBOL_GPL(nvmem_register_lookup);
> +
>   static void nvmem_release(struct device *dev)
>   {
>   	struct nvmem_device *nvmem = to_nvmem_device(dev);
> @@ -916,6 +936,37 @@ struct nvmem_cell *of_nvmem_cell_get(struct device_node *np,
>   EXPORT_SYMBOL_GPL(of_nvmem_cell_get);
>   #endif
>   

^ permalink raw reply

* Re: [PATCH net-next V3 1/2] cxgb4: Add support for FW_ETH_TX_PKT_VM_WR
From: kbuild test robot @ 2018-06-26 11:23 UTC (permalink / raw)
  To: Ganesh Goudar
  Cc: kbuild-all, netdev, davem, nirranjan, indranil, venkatesh,
	Arjun Vynipadath, Casey Leedom, Ganesh Goudar
In-Reply-To: <1530001483-17816-1-git-send-email-ganeshgr@chelsio.com>

[-- Attachment #1: Type: text/plain, Size: 12730 bytes --]

Hi Arjun,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Ganesh-Goudar/cxgb4-Add-support-for-FW_ETH_TX_PKT_VM_WR/20180626-163628
config: x86_64-randconfig-x001-201825 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   drivers/net/ethernet/chelsio/cxgb4/sge.c: In function 'cxgb4_vf_eth_xmit':
>> drivers/net/ethernet/chelsio/cxgb4/sge.c:1646:18: error: assignment of read-only variable 'fw_hdr_copy_len'
     fw_hdr_copy_len = (sizeof(wr->ethmacdst) + sizeof(wr->ethmacsrc) +
                     ^
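For readers unfamiliar with this diagnostic: a const-qualified object can only receive its value in its initializer, so declaring it first and assigning later is rejected. The standalone sketch below shows the form that compiles. The struct wr_hdr and the function hdr_copy_len() are hypothetical stand-ins, and the field sizes are assumptions made for illustration rather than taken from the driver headers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the leading fields of the work request;
 * the field sizes are assumed for this example only.
 */
struct wr_hdr {
	uint8_t  ethmacdst[6];
	uint8_t  ethmacsrc[6];
	uint16_t ethtype;
	uint16_t vlantci;
};

static size_t hdr_copy_len(void)
{
	/* Never dereferenced: sizeof does not evaluate its operand. */
	struct wr_hdr *wr;

	/* The value is supplied in the initializer.  Splitting this into
	 * "const size_t len;" followed by "len = ...;" triggers the
	 * "assignment of read-only variable" error.
	 */
	const size_t len = sizeof(wr->ethmacdst) + sizeof(wr->ethmacsrc) +
			   sizeof(wr->ethtype) + sizeof(wr->vlantci);

	return len;
}
```

The other option is to drop the const qualifier and keep the separate assignment; which of the two is preferable is a style decision for the patch author.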

vim +/fw_hdr_copy_len +1646 drivers/net/ethernet/chelsio/cxgb4/sge.c

  1622	
  1623	/**
  1624	 *	cxgb4_vf_eth_xmit - add a packet to an Ethernet TX queue
  1625	 *	@skb: the packet
  1626	 *	@dev: the egress net device
  1627	 *
  1628	 *	Add a packet to an SGE Ethernet TX queue.  Runs with softirqs disabled.
  1629	 */
  1630	static netdev_tx_t cxgb4_vf_eth_xmit(struct sk_buff *skb,
  1631					     struct net_device *dev)
  1632	{
  1633		dma_addr_t addr[MAX_SKB_FRAGS + 1];
  1634		const struct skb_shared_info *ssi;
  1635		struct fw_eth_tx_pkt_vm_wr *wr;
  1636		int qidx, credits, max_pkt_len;
  1637		const size_t fw_hdr_copy_len;
  1638		struct cpl_tx_pkt_core *cpl;
  1639		const struct port_info *pi;
  1640		unsigned int flits, ndesc;
  1641		struct sge_eth_txq *txq;
  1642		struct adapter *adapter;
  1643		u64 cntrl, *end;
  1644		u32 wr_mid;
  1645	
> 1646		fw_hdr_copy_len = (sizeof(wr->ethmacdst) + sizeof(wr->ethmacsrc) +
  1647				   sizeof(wr->ethtype) + sizeof(wr->vlantci));
  1648	
  1649		/* The chip minimum packet length is 10 octets but the firmware
  1650		 * command that we are using requires that we copy the Ethernet header
  1651		 * (including the VLAN tag) into the header so we reject anything
  1652		 * smaller than that ...
  1653		 */
  1654		if (unlikely(skb->len < fw_hdr_copy_len))
  1655			goto out_free;
  1656	
  1657		/* Discard the packet if the length is greater than mtu */
  1658		max_pkt_len = ETH_HLEN + dev->mtu;
  1659		if (skb_vlan_tag_present(skb))
  1660			max_pkt_len += VLAN_HLEN;
  1661		if (!skb_shinfo(skb)->gso_size && (unlikely(skb->len > max_pkt_len)))
  1662			goto out_free;
  1663	
  1664		/* Figure out which TX Queue we're going to use. */
  1665		pi = netdev_priv(dev);
  1666		adapter = pi->adapter;
  1667		qidx = skb_get_queue_mapping(skb);
  1668		WARN_ON(qidx >= pi->nqsets);
  1669		txq = &adapter->sge.ethtxq[pi->first_qset + qidx];
  1670	
  1671		/* Take this opportunity to reclaim any TX Descriptors whose DMA
  1672		 * transfers have completed.
  1673		 */
  1674		cxgb4_reclaim_completed_tx(adapter, &txq->q, true);
  1675	
  1676		/* Calculate the number of flits and TX Descriptors we're going to
  1677		 * need along with how many TX Descriptors will be left over after
  1678		 * we inject our Work Request.
  1679		 */
  1680		flits = t4vf_calc_tx_flits(skb);
  1681		ndesc = flits_to_desc(flits);
  1682		credits = txq_avail(&txq->q) - ndesc;
  1683	
  1684		if (unlikely(credits < 0)) {
  1685			/* Not enough room for this packet's Work Request.  Stop the
  1686			 * TX Queue and return a "busy" condition.  The queue will get
  1687			 * started later on when the firmware informs us that space
  1688			 * has opened up.
  1689			 */
  1690			eth_txq_stop(txq);
  1691			dev_err(adapter->pdev_dev,
  1692				"%s: TX ring %u full while queue awake!\n",
  1693				dev->name, qidx);
  1694			return NETDEV_TX_BUSY;
  1695		}
  1696	
  1697		if (!t4vf_is_eth_imm(skb) &&
  1698		    unlikely(cxgb4_map_skb(adapter->pdev_dev, skb, addr) < 0)) {
  1699			/* We need to map the skb into PCI DMA space (because it can't
  1700			 * be in-lined directly into the Work Request) and the mapping
  1701			 * operation failed.  Record the error and drop the packet.
  1702			 */
  1703			txq->mapping_err++;
  1704			goto out_free;
  1705		}
  1706	
  1707		wr_mid = FW_WR_LEN16_V(DIV_ROUND_UP(flits, 2));
  1708		if (unlikely(credits < ETHTXQ_STOP_THRES)) {
  1709			/* After we're done injecting the Work Request for this
  1710			 * packet, we'll be below our "stop threshold" so stop the TX
  1711			 * Queue now and schedule a request for an SGE Egress Queue
  1712			 * Update message.  The queue will get started later on when
  1713			 * the firmware processes this Work Request and sends us an
  1714			 * Egress Queue Status Update message indicating that space
  1715			 * has opened up.
  1716			 */
  1717			eth_txq_stop(txq);
  1718			wr_mid |= FW_WR_EQUEQ_F | FW_WR_EQUIQ_F;
  1719		}
  1720	
  1721		/* Start filling in our Work Request.  Note that we do _not_ handle
  1722		 * the WR Header wrapping around the TX Descriptor Ring.  If our
  1723		 * maximum header size ever exceeds one TX Descriptor, we'll need to
  1724		 * do something else here.
  1725		 */
  1726		WARN_ON(DIV_ROUND_UP(T4VF_ETHTXQ_MAX_HDR, TXD_PER_EQ_UNIT) > 1);
  1727		wr = (void *)&txq->q.desc[txq->q.pidx];
  1728		wr->equiq_to_len16 = cpu_to_be32(wr_mid);
  1729		wr->r3[0] = cpu_to_be32(0);
  1730		wr->r3[1] = cpu_to_be32(0);
  1731		skb_copy_from_linear_data(skb, (void *)wr->ethmacdst, fw_hdr_copy_len);
  1732		end = (u64 *)wr + flits;
  1733	
  1734		/* If this is a Large Send Offload packet we'll put in an LSO CPL
  1735		 * message with an encapsulated TX Packet CPL message.  Otherwise we
  1736		 * just use a TX Packet CPL message.
  1737		 */
  1738		ssi = skb_shinfo(skb);
  1739		if (ssi->gso_size) {
  1740			struct cpl_tx_pkt_lso_core *lso = (void *)(wr + 1);
  1741			bool v6 = (ssi->gso_type & SKB_GSO_TCPV6) != 0;
  1742			int l3hdr_len = skb_network_header_len(skb);
  1743			int eth_xtra_len = skb_network_offset(skb) - ETH_HLEN;
  1744	
  1745			wr->op_immdlen =
  1746				cpu_to_be32(FW_WR_OP_V(FW_ETH_TX_PKT_VM_WR) |
  1747					    FW_WR_IMMDLEN_V(sizeof(*lso) +
  1748							    sizeof(*cpl)));
  1749			 /* Fill in the LSO CPL message. */
  1750			lso->lso_ctrl =
  1751				cpu_to_be32(LSO_OPCODE_V(CPL_TX_PKT_LSO) |
  1752					    LSO_FIRST_SLICE_F |
  1753					    LSO_LAST_SLICE_F |
  1754					    LSO_IPV6_V(v6) |
  1755					    LSO_ETHHDR_LEN_V(eth_xtra_len / 4) |
  1756					    LSO_IPHDR_LEN_V(l3hdr_len / 4) |
  1757					    LSO_TCPHDR_LEN_V(tcp_hdr(skb)->doff));
  1758			lso->ipid_ofst = cpu_to_be16(0);
  1759			lso->mss = cpu_to_be16(ssi->gso_size);
  1760			lso->seqno_offset = cpu_to_be32(0);
  1761			if (is_t4(adapter->params.chip))
  1762				lso->len = cpu_to_be32(skb->len);
  1763			else
  1764				lso->len = cpu_to_be32(LSO_T5_XFER_SIZE_V(skb->len));
  1765	
  1766			/* Set up TX Packet CPL pointer, control word and perform
  1767			 * accounting.
  1768			 */
  1769			cpl = (void *)(lso + 1);
  1770	
  1771			if (CHELSIO_CHIP_VERSION(adapter->params.chip) <= CHELSIO_T5)
  1772				cntrl = TXPKT_ETHHDR_LEN_V(eth_xtra_len);
  1773			else
  1774				cntrl = T6_TXPKT_ETHHDR_LEN_V(eth_xtra_len);
  1775	
  1776			cntrl |= TXPKT_CSUM_TYPE_V(v6 ?
  1777						   TX_CSUM_TCPIP6 : TX_CSUM_TCPIP) |
  1778				 TXPKT_IPHDR_LEN_V(l3hdr_len);
  1779			txq->tso++;
  1780			txq->tx_cso += ssi->gso_segs;
  1781		} else {
  1782			int len;
  1783	
  1784			len = (t4vf_is_eth_imm(skb)
  1785			       ? skb->len + sizeof(*cpl)
  1786			       : sizeof(*cpl));
  1787			wr->op_immdlen =
  1788				cpu_to_be32(FW_WR_OP_V(FW_ETH_TX_PKT_VM_WR) |
  1789					    FW_WR_IMMDLEN_V(len));
  1790	
  1791			/* Set up TX Packet CPL pointer, control word and perform
  1792			 * accounting.
  1793			 */
  1794			cpl = (void *)(wr + 1);
  1795			if (skb->ip_summed == CHECKSUM_PARTIAL) {
  1796				cntrl = hwcsum(adapter->params.chip, skb) |
  1797					TXPKT_IPCSUM_DIS_F;
  1798				txq->tx_cso++;
  1799			} else {
  1800				cntrl = TXPKT_L4CSUM_DIS_F | TXPKT_IPCSUM_DIS_F;
  1801			}
  1802		}
  1803	
  1804		/* If there's a VLAN tag present, add that to the list of things to
  1805		 * do in this Work Request.
  1806		 */
  1807		if (skb_vlan_tag_present(skb)) {
  1808			txq->vlan_ins++;
  1809			cntrl |= TXPKT_VLAN_VLD_F | TXPKT_VLAN_V(skb_vlan_tag_get(skb));
  1810		}
  1811	
  1812		 /* Fill in the TX Packet CPL message header. */
  1813		cpl->ctrl0 = cpu_to_be32(TXPKT_OPCODE_V(CPL_TX_PKT_XT) |
  1814					 TXPKT_INTF_V(pi->port_id) |
  1815					 TXPKT_PF_V(0));
  1816		cpl->pack = cpu_to_be16(0);
  1817		cpl->len = cpu_to_be16(skb->len);
  1818		cpl->ctrl1 = cpu_to_be64(cntrl);
  1819	
  1820		/* Fill in the body of the TX Packet CPL message with either in-lined
  1821		 * data or a Scatter/Gather List.
  1822		 */
  1823		if (t4vf_is_eth_imm(skb)) {
  1824			/* In-line the packet's data and free the skb since we don't
  1825			 * need it any longer.
  1826			 */
  1827			cxgb4_inline_tx_skb(skb, &txq->q, cpl + 1);
  1828			dev_consume_skb_any(skb);
  1829		} else {
  1830			/* Write the skb's Scatter/Gather list into the TX Packet CPL
  1831			 * message and retain a pointer to the skb so we can free it
  1832			 * later when its DMA completes.  (We store the skb pointer
  1833			 * in the Software Descriptor corresponding to the last TX
  1834			 * Descriptor used by the Work Request.)
  1835			 *
  1836			 * The retained skb will be freed when the corresponding TX
  1837			 * Descriptors are reclaimed after their DMAs complete.
  1838			 * However, this could take quite a while since, in general,
  1839			 * the hardware is set up to be lazy about sending DMA
  1840			 * completion notifications to us and we mostly perform TX
  1841			 * reclaims in the transmit routine.
  1842			 *
  1843			 * This is good for performance but means that we rely on new
  1844			 * TX packets arriving to run the destructors of completed
  1845			 * packets, which open up space in their sockets' send queues.
  1846			 * Sometimes we do not get such new packets causing TX to
  1847			 * stall.  A single UDP transmitter is a good example of this
  1848			 * situation.  We have a clean up timer that periodically
  1849			 * reclaims completed packets but it doesn't run often enough
  1850			 * (nor do we want it to) to prevent lengthy stalls.  A
  1851			 * solution to this problem is to run the destructor early,
  1852			 * after the packet is queued but before it's DMAd.  A con is
  1853			 * that we lie to socket memory accounting, but the amount of
  1854			 * extra memory is reasonable (limited by the number of TX
  1855			 * descriptors), the packets do actually get freed quickly by
  1856			 * new packets almost always, and for protocols like TCP that
  1857			 * wait for acks to really free up the data the extra memory
  1858			 * is even less.  On the positive side we run the destructors
  1859			 * on the sending CPU rather than on a potentially different
  1860			 * completing CPU, usually a good thing.
  1861			 *
  1862			 * Run the destructor before telling the DMA engine about the
  1863			 * packet to make sure it doesn't complete and get freed
  1864			 * prematurely.
  1865			 */
  1866			struct ulptx_sgl *sgl = (struct ulptx_sgl *)(cpl + 1);
  1867			struct sge_txq *tq = &txq->q;
  1868			int last_desc;
  1869	
  1870			/* If the Work Request header was an exact multiple of our TX
  1871			 * Descriptor length, then it's possible that the starting SGL
  1872			 * pointer lines up exactly with the end of our TX Descriptor
  1873			 * ring.  If that's the case, wrap around to the beginning
  1874			 * here ...
  1875			 */
  1876			if (unlikely((void *)sgl == (void *)tq->stat)) {
  1877				sgl = (void *)tq->desc;
  1878				end = (void *)((void *)tq->desc +
  1879					       ((void *)end - (void *)tq->stat));
  1880			}
  1881	
  1882			cxgb4_write_sgl(skb, tq, sgl, end, 0, addr);
  1883			skb_orphan(skb);
  1884	
  1885			last_desc = tq->pidx + ndesc - 1;
  1886			if (last_desc >= tq->size)
  1887				last_desc -= tq->size;
  1888			tq->sdesc[last_desc].skb = skb;
  1889			tq->sdesc[last_desc].sgl = sgl;
  1890		}
  1891	
  1892		/* Advance our internal TX Queue state, tell the hardware about
  1893		 * the new TX descriptors and return success.
  1894		 */
  1895		txq_advance(&txq->q, ndesc);
  1896	
  1897		cxgb4_ring_tx_db(adapter, &txq->q, ndesc);
  1898		return NETDEV_TX_OK;
  1899	
  1900	out_free:
  1901		/* An error of some sort happened.  Free the TX skb and tell the
  1902		 * OS that we've "dealt" with the packet ...
  1903		 */
  1904		dev_kfree_skb_any(skb);
  1905		return NETDEV_TX_OK;
  1906	}
  1907	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36302 bytes --]

^ permalink raw reply

* Re: [PATCH rdma-next 08/12] overflow.h: Add arithmetic shift helper
From: Leon Romanovsky @ 2018-06-26 11:37 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Jason Gunthorpe, Doug Ledford, Kees Cook, RDMA mailing list,
	Hadar Hen Zion, Matan Barak, Michael J Ruhl, Noa Osherovich,
	Raed Salem, Yishai Hadas, Saeed Mahameed, linux-netdev,
	linux-kernel
In-Reply-To: <CAKwiHFiRYbyiJqDYCgKXKZYRr0KjCt8q9AwKwfqoCA1sT2KFyQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3492 bytes --]

On Tue, Jun 26, 2018 at 10:07:07AM +0200, Rasmus Villemoes wrote:
> On 25 June 2018 at 19:11, Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> > On Mon, Jun 25, 2018 at 11:26:05AM +0200, Rasmus Villemoes wrote:
> >
> > >    check_shift_overflow(a, s, d) {
> > >        unsigned _nbits = 8*sizeof(a);
> > >        typeof(a) _a = (a);
> > >        typeof(s) _s = (s);
> > >        typeof(d) _d = (d);
> > >
> > >        *_d = ((u64)(_a) << (_s & (_nbits-1)));
> > >        _s >= _nbits || (_s > 0 && (_a >> (_nbits - _s -
> > >    is_signed_type(a))) != 0);
> > >    }
> >
> > Those types are not quite right.. What about this?
> >
> >     check_shift_overflow(a, s, d) ({
> >         unsigned int _nbits = 8*sizeof(d) - is_signed_type(d);
> >         typeof(d) _a = a;  // Shift is always performed on type 'd'
> >         typeof(s) _s = s;
> >         typeof(d) _d = d;
> >
> >         *_d = (_a << (_s & (_nbits-1)));
> >
> >         (((*_d) >> (_s & (_nbits-1))) != _a);
> >     })
> >
>
> No, because, the check_*_overflow (and the __builtin_*_overflow cousins)
> functions must do their job without causing undefined behaviour, regardless
> of what crazy input values and types they are given. Also, the output must
> be completely defined for all inputs [1]. I omitted it for brevity, but I
> also wanted a and *d to have the same type, so there should also be one of
> those (void)(&_a == _d); statements. See the other check_*_overflow and the
> commit adding them. Without the (u64) cast, any signed (and negative) a
> would cause UB in your suggestion. Also, having _nbits be 31 when a (and/or
> *d) has type int, and then and'ing the shift by 30 doesn't make any sense;
> I have no idea what you're trying to do. I haven't tested the above, but I
> know from when I wrote the other ones that gcc is smart enough not to
> actually do the arithmetic in 64 bits when only <= 32 bit types are
> involved (i.e., gcc sees that the result is anyway implicitly truncated to
> 32 bits, so only bothers to compute the lower 32 bits).
>
> [1] For this one, it would probably be most consistent to say that the
> result is a*2^s computed in infinite-precision, then truncated to fit in d.
> So for too large s, that would just yield 0. But that becomes a bit
> annoying when s is negative; we don't want to start handling a negative
> left shift as a right shift. That's also why I said that one should sit
> down and think about the semantics one really wants, then implement that,
> and write tests. For a first implementation, it might be completely
> reasonable to simply say BUILD_BUG_ON(is_signed_type(a)), but that still
> leaves open what to put in *d when s is negative. But maybe another
> BUILD_BUG_ON(is_signed_type(s)) could handle that, though that's a bit
> annoying for integer literals.
>
> > And can we use mathematical invertibility to prove no overflow and
> > bound _a ? As above.
> >
>
> It's quite possible that the expression determining whether overflow
> occurred can be written differently, possibly in terms of shifting back, but
> it definitely needs to return true when s is greater than nbits;
> check_shift_overflow(1U, 32, &d) must be true. And that expression also
> must not involve UB.
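As a concrete reading of the requirements quoted above, here is a minimal sketch restricted to a single unsigned 32-bit type. It is not the macro under discussion: shl_overflow_u32() is a hypothetical name, it is a plain function rather than a type-generic macro, and it sidesteps the signed and negative-shift cases by construction. It follows the quoted expression in reporting overflow whenever the shift count reaches the type width, and stores 0 in that case so the output is defined for every input.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shift a left by s, store the truncated result in *d and return true
 * if bits were lost.  No operation here has undefined behaviour for
 * any input values.
 */
static bool shl_overflow_u32(uint32_t a, unsigned int s, uint32_t *d)
{
	const unsigned int nbits = 32;

	if (s >= nbits) {
		/* Shifting by the type width or more is undefined in C,
		 * so do not shift at all; the truncated result is 0.
		 */
		*d = 0;
		return true;
	}

	/* 0 <= s < 32 and a is unsigned, so this shift is well defined. */
	*d = a << s;

	/* Overflow iff any of the top s bits of a were set.  The s > 0
	 * test keeps the right shift count strictly below nbits.
	 */
	return s > 0 && (a >> (nbits - s)) != 0;
}
```

With this shape, shl_overflow_u32(1U, 32, &d) returns true, as required above, and a caller that only needs a yes/no answer can ignore the stored value.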

Rasmus,

RDMA doesn't really need a specific size_t variant, but wants to prevent
shift overflows from user commands, so any true/false function/macro
will work for us.

https://patchwork.kernel.org/patch/10484055/
https://patchwork.kernel.org/patch/10484053/

Thanks

>
> Rasmus

^ permalink raw reply

* [PATCH net-next V4 1/2] cxgb4: Add support for FW_ETH_TX_PKT_VM_WR
From: Ganesh Goudar @ 2018-06-26 11:40 UTC (permalink / raw)
  To: netdev, davem
  Cc: nirranjan, indranil, venkatesh, Arjun Vynipadath, Casey Leedom,
	Ganesh Goudar

From: Arjun Vynipadath <arjun@chelsio.com>

The present TX work request (FW_ETH_TX_PKT_WR) can't be used for
host->vf communication, since it doesn't loop back the outgoing
packets to virtual interfaces on the same port. This can be done
using FW_ETH_TX_PKT_VM_WR.
This fix depends on ethtool_flags to determine which WR to use for
the TX path. Support for setting these flags by the user is added in
the next commit.

Based on the original work by : Casey Leedom <leedom@chelsio.com>

Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
V4: Fixed build errors.

V3: Made eth_flags type consistent across struct adapter and
    struct port_info.

V2: Renamed t4_eth_xmit() and t4vf_eth_xmit(), since some compilers
    were warning about conflicting definition in cxgb4vf driver
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h      |  13 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |   2 +-
 drivers/net/ethernet/chelsio/cxgb4/sge.c        | 372 +++++++++++++++++++++++-
 3 files changed, 383 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 1adb968..a4ea53d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -522,6 +522,15 @@ enum {
 	MAX_INGQ = MAX_ETH_QSETS + INGQ_EXTRAS,
 };
 
+enum {
+	PRIV_FLAG_PORT_TX_VM_BIT,
+};
+
+#define PRIV_FLAG_PORT_TX_VM		BIT(PRIV_FLAG_PORT_TX_VM_BIT)
+
+#define PRIV_FLAGS_ADAP			0
+#define PRIV_FLAGS_PORT			PRIV_FLAG_PORT_TX_VM
+
 struct adapter;
 struct sge_rspq;
 
@@ -558,6 +567,7 @@ struct port_info {
 	struct hwtstamp_config tstamp_config;
 	bool ptp_enable;
 	struct sched_table *sched_tbl;
+	u32 eth_flags;
 };
 
 struct dentry;
@@ -868,6 +878,7 @@ struct adapter {
 	unsigned int flags;
 	unsigned int adap_idx;
 	enum chip_type chip;
+	u32 eth_flags;
 
 	int msg_enable;
 	__be16 vxlan_port;
@@ -1334,7 +1345,7 @@ void t4_os_link_changed(struct adapter *adap, int port_id, int link_stat);
 void t4_free_sge_resources(struct adapter *adap);
 void t4_free_ofld_rxqs(struct adapter *adap, int n, struct sge_ofld_rxq *q);
 irq_handler_t t4_intr_handler(struct adapter *adap);
-netdev_tx_t t4_eth_xmit(struct sk_buff *skb, struct net_device *dev);
+netdev_tx_t t4_start_xmit(struct sk_buff *skb, struct net_device *dev);
 int t4_ethrx_handler(struct sge_rspq *q, const __be64 *rsp,
 		     const struct pkt_gl *gl);
 int t4_mgmt_tx(struct adapter *adap, struct sk_buff *skb);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index bc03c17..d3b0f9c 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -3217,7 +3217,7 @@ static netdev_features_t cxgb_fix_features(struct net_device *dev,
 static const struct net_device_ops cxgb4_netdev_ops = {
 	.ndo_open             = cxgb_open,
 	.ndo_stop             = cxgb_close,
-	.ndo_start_xmit       = t4_eth_xmit,
+	.ndo_start_xmit       = t4_start_xmit,
 	.ndo_select_queue     =	cxgb_select_queue,
 	.ndo_get_stats64      = cxgb_get_stats,
 	.ndo_set_rx_mode      = cxgb_set_rxmode,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 395e2a0..ebb46c4 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -1288,13 +1288,13 @@ static inline void t6_fill_tnl_lso(struct sk_buff *skb,
 }
 
 /**
- *	t4_eth_xmit - add a packet to an Ethernet Tx queue
+ *	cxgb4_eth_xmit - add a packet to an Ethernet Tx queue
  *	@skb: the packet
  *	@dev: the egress net device
  *
  *	Add a packet to an SGE Ethernet Tx queue.  Runs with softirqs disabled.
  */
-netdev_tx_t t4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
+static netdev_tx_t cxgb4_eth_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	u32 wr_mid, ctrl0, op;
 	u64 cntrl, *end, *sgl;
@@ -1547,6 +1547,374 @@ out_free:	dev_kfree_skb_any(skb);
 	return NETDEV_TX_OK;
 }
 
+/* Constants ... */
+enum {
+	/* Egress Queue sizes, producer and consumer indices are all in units
+	 * of Egress Context Units bytes.  Note that as far as the hardware is
+	 * concerned, the free list is an Egress Queue (the host produces free
+	 * buffers which the hardware consumes) and free list entries are
+	 * 64-bit PCI DMA addresses.
+	 */
+	EQ_UNIT = SGE_EQ_IDXSIZE,
+	FL_PER_EQ_UNIT = EQ_UNIT / sizeof(__be64),
+	TXD_PER_EQ_UNIT = EQ_UNIT / sizeof(__be64),
+
+	T4VF_ETHTXQ_MAX_HDR = (sizeof(struct fw_eth_tx_pkt_vm_wr) +
+			       sizeof(struct cpl_tx_pkt_lso_core) +
+			       sizeof(struct cpl_tx_pkt_core)) / sizeof(__be64),
+};
+
+/**
+ *	t4vf_is_eth_imm - can an Ethernet packet be sent as immediate data?
+ *	@skb: the packet
+ *
+ *	Returns whether an Ethernet packet is small enough to fit completely as
+ *	immediate data.
+ */
+static inline int t4vf_is_eth_imm(const struct sk_buff *skb)
+{
+	/* The VF Driver uses the FW_ETH_TX_PKT_VM_WR firmware Work Request
+	 * which does not accommodate immediate data.  We could dike out all
+	 * of the support code for immediate data but that would tie our hands
+	 * too much if we ever want to enhance the firmware.  It would also
+	 * create more differences between the PF and VF Drivers.
+	 */
+	return false;
+}
+
+/**
+ *	t4vf_calc_tx_flits - calculate the number of flits for a packet TX WR
+ *	@skb: the packet
+ *
+ *	Returns the number of flits needed for a TX Work Request for the
+ *	given Ethernet packet, including the needed WR and CPL headers.
+ */
+static inline unsigned int t4vf_calc_tx_flits(const struct sk_buff *skb)
+{
+	unsigned int flits;
+
+	/* If the skb is small enough, we can pump it out as a work request
+	 * with only immediate data.  In that case we just have to have the
+	 * TX Packet header plus the skb data in the Work Request.
+	 */
+	if (t4vf_is_eth_imm(skb))
+		return DIV_ROUND_UP(skb->len + sizeof(struct cpl_tx_pkt),
+				    sizeof(__be64));
+
+	/* Otherwise, we're going to have to construct a Scatter gather list
+	 * of the skb body and fragments.  We also include the flits necessary
+	 * for the TX Packet Work Request and CPL.  We always have a firmware
+	 * Write Header (incorporated as part of the cpl_tx_pkt_lso and
+	 * cpl_tx_pkt structures), followed by either a TX Packet Write CPL
+	 * message or, if we're doing a Large Send Offload, an LSO CPL message
+	 * with an embedded TX Packet Write CPL message.
+	 */
+	flits = sgl_len(skb_shinfo(skb)->nr_frags + 1);
+	if (skb_shinfo(skb)->gso_size)
+		flits += (sizeof(struct fw_eth_tx_pkt_vm_wr) +
+			  sizeof(struct cpl_tx_pkt_lso_core) +
+			  sizeof(struct cpl_tx_pkt_core)) / sizeof(__be64);
+	else
+		flits += (sizeof(struct fw_eth_tx_pkt_vm_wr) +
+			  sizeof(struct cpl_tx_pkt_core)) / sizeof(__be64);
+	return flits;
+}
+
+/**
+ *	cxgb4_vf_eth_xmit - add a packet to an Ethernet TX queue
+ *	@skb: the packet
+ *	@dev: the egress net device
+ *
+ *	Add a packet to an SGE Ethernet TX queue.  Runs with softirqs disabled.
+ */
+static netdev_tx_t cxgb4_vf_eth_xmit(struct sk_buff *skb,
+				     struct net_device *dev)
+{
+	dma_addr_t addr[MAX_SKB_FRAGS + 1];
+	const struct skb_shared_info *ssi;
+	struct fw_eth_tx_pkt_vm_wr *wr;
+	int qidx, credits, max_pkt_len;
+	struct cpl_tx_pkt_core *cpl;
+	const struct port_info *pi;
+	unsigned int flits, ndesc;
+	struct sge_eth_txq *txq;
+	struct adapter *adapter;
+	u64 cntrl, *end;
+	u32 wr_mid;
+	const size_t fw_hdr_copy_len = sizeof(wr->ethmacdst) +
+				       sizeof(wr->ethmacsrc) +
+				       sizeof(wr->ethtype) +
+				       sizeof(wr->vlantci);
+
+	/* The chip minimum packet length is 10 octets but the firmware
+	 * command that we are using requires that we copy the Ethernet header
+	 * (including the VLAN tag) into the header so we reject anything
+	 * smaller than that ...
+	 */
+	if (unlikely(skb->len < fw_hdr_copy_len))
+		goto out_free;
+
+	/* Discard the packet if the length is greater than mtu */
+	max_pkt_len = ETH_HLEN + dev->mtu;
+	if (skb_vlan_tag_present(skb))
+		max_pkt_len += VLAN_HLEN;
+	if (!skb_shinfo(skb)->gso_size && (unlikely(skb->len > max_pkt_len)))
+		goto out_free;
+
+	/* Figure out which TX Queue we're going to use. */
+	pi = netdev_priv(dev);
+	adapter = pi->adapter;
+	qidx = skb_get_queue_mapping(skb);
+	WARN_ON(qidx >= pi->nqsets);
+	txq = &adapter->sge.ethtxq[pi->first_qset + qidx];
+
+	/* Take this opportunity to reclaim any TX Descriptors whose DMA
+	 * transfers have completed.
+	 */
+	cxgb4_reclaim_completed_tx(adapter, &txq->q, true);
+
+	/* Calculate the number of flits and TX Descriptors we're going to
+	 * need along with how many TX Descriptors will be left over after
+	 * we inject our Work Request.
+	 */
+	flits = t4vf_calc_tx_flits(skb);
+	ndesc = flits_to_desc(flits);
+	credits = txq_avail(&txq->q) - ndesc;
+
+	if (unlikely(credits < 0)) {
+		/* Not enough room for this packet's Work Request.  Stop the
+		 * TX Queue and return a "busy" condition.  The queue will get
+		 * started later on when the firmware informs us that space
+		 * has opened up.
+		 */
+		eth_txq_stop(txq);
+		dev_err(adapter->pdev_dev,
+			"%s: TX ring %u full while queue awake!\n",
+			dev->name, qidx);
+		return NETDEV_TX_BUSY;
+	}
+
+	if (!t4vf_is_eth_imm(skb) &&
+	    unlikely(cxgb4_map_skb(adapter->pdev_dev, skb, addr) < 0)) {
+		/* We need to map the skb into PCI DMA space (because it can't
+		 * be in-lined directly into the Work Request) and the mapping
+		 * operation failed.  Record the error and drop the packet.
+		 */
+		txq->mapping_err++;
+		goto out_free;
+	}
+
+	wr_mid = FW_WR_LEN16_V(DIV_ROUND_UP(flits, 2));
+	if (unlikely(credits < ETHTXQ_STOP_THRES)) {
+		/* After we're done injecting the Work Request for this
+		 * packet, we'll be below our "stop threshold" so stop the TX
+		 * Queue now and schedule a request for an SGE Egress Queue
+		 * Update message.  The queue will get started later on when
+		 * the firmware processes this Work Request and sends us an
+		 * Egress Queue Status Update message indicating that space
+		 * has opened up.
+		 */
+		eth_txq_stop(txq);
+		wr_mid |= FW_WR_EQUEQ_F | FW_WR_EQUIQ_F;
+	}
+
+	/* Start filling in our Work Request.  Note that we do _not_ handle
+	 * the WR Header wrapping around the TX Descriptor Ring.  If our
+	 * maximum header size ever exceeds one TX Descriptor, we'll need to
+	 * do something else here.
+	 */
+	WARN_ON(DIV_ROUND_UP(T4VF_ETHTXQ_MAX_HDR, TXD_PER_EQ_UNIT) > 1);
+	wr = (void *)&txq->q.desc[txq->q.pidx];
+	wr->equiq_to_len16 = cpu_to_be32(wr_mid);
+	wr->r3[0] = cpu_to_be32(0);
+	wr->r3[1] = cpu_to_be32(0);
+	skb_copy_from_linear_data(skb, (void *)wr->ethmacdst, fw_hdr_copy_len);
+	end = (u64 *)wr + flits;
+
+	/* If this is a Large Send Offload packet we'll put in an LSO CPL
+	 * message with an encapsulated TX Packet CPL message.  Otherwise we
+	 * just use a TX Packet CPL message.
+	 */
+	ssi = skb_shinfo(skb);
+	if (ssi->gso_size) {
+		struct cpl_tx_pkt_lso_core *lso = (void *)(wr + 1);
+		bool v6 = (ssi->gso_type & SKB_GSO_TCPV6) != 0;
+		int l3hdr_len = skb_network_header_len(skb);
+		int eth_xtra_len = skb_network_offset(skb) - ETH_HLEN;
+
+		wr->op_immdlen =
+			cpu_to_be32(FW_WR_OP_V(FW_ETH_TX_PKT_VM_WR) |
+				    FW_WR_IMMDLEN_V(sizeof(*lso) +
+						    sizeof(*cpl)));
+		 /* Fill in the LSO CPL message. */
+		lso->lso_ctrl =
+			cpu_to_be32(LSO_OPCODE_V(CPL_TX_PKT_LSO) |
+				    LSO_FIRST_SLICE_F |
+				    LSO_LAST_SLICE_F |
+				    LSO_IPV6_V(v6) |
+				    LSO_ETHHDR_LEN_V(eth_xtra_len / 4) |
+				    LSO_IPHDR_LEN_V(l3hdr_len / 4) |
+				    LSO_TCPHDR_LEN_V(tcp_hdr(skb)->doff));
+		lso->ipid_ofst = cpu_to_be16(0);
+		lso->mss = cpu_to_be16(ssi->gso_size);
+		lso->seqno_offset = cpu_to_be32(0);
+		if (is_t4(adapter->params.chip))
+			lso->len = cpu_to_be32(skb->len);
+		else
+			lso->len = cpu_to_be32(LSO_T5_XFER_SIZE_V(skb->len));
+
+		/* Set up TX Packet CPL pointer, control word and perform
+		 * accounting.
+		 */
+		cpl = (void *)(lso + 1);
+
+		if (CHELSIO_CHIP_VERSION(adapter->params.chip) <= CHELSIO_T5)
+			cntrl = TXPKT_ETHHDR_LEN_V(eth_xtra_len);
+		else
+			cntrl = T6_TXPKT_ETHHDR_LEN_V(eth_xtra_len);
+
+		cntrl |= TXPKT_CSUM_TYPE_V(v6 ?
+					   TX_CSUM_TCPIP6 : TX_CSUM_TCPIP) |
+			 TXPKT_IPHDR_LEN_V(l3hdr_len);
+		txq->tso++;
+		txq->tx_cso += ssi->gso_segs;
+	} else {
+		int len;
+
+		len = (t4vf_is_eth_imm(skb)
+		       ? skb->len + sizeof(*cpl)
+		       : sizeof(*cpl));
+		wr->op_immdlen =
+			cpu_to_be32(FW_WR_OP_V(FW_ETH_TX_PKT_VM_WR) |
+				    FW_WR_IMMDLEN_V(len));
+
+		/* Set up TX Packet CPL pointer, control word and perform
+		 * accounting.
+		 */
+		cpl = (void *)(wr + 1);
+		if (skb->ip_summed == CHECKSUM_PARTIAL) {
+			cntrl = hwcsum(adapter->params.chip, skb) |
+				TXPKT_IPCSUM_DIS_F;
+			txq->tx_cso++;
+		} else {
+			cntrl = TXPKT_L4CSUM_DIS_F | TXPKT_IPCSUM_DIS_F;
+		}
+	}
+
+	/* If there's a VLAN tag present, add that to the list of things to
+	 * do in this Work Request.
+	 */
+	if (skb_vlan_tag_present(skb)) {
+		txq->vlan_ins++;
+		cntrl |= TXPKT_VLAN_VLD_F | TXPKT_VLAN_V(skb_vlan_tag_get(skb));
+	}
+
+	/* Fill in the TX Packet CPL message header. */
+	cpl->ctrl0 = cpu_to_be32(TXPKT_OPCODE_V(CPL_TX_PKT_XT) |
+				 TXPKT_INTF_V(pi->port_id) |
+				 TXPKT_PF_V(0));
+	cpl->pack = cpu_to_be16(0);
+	cpl->len = cpu_to_be16(skb->len);
+	cpl->ctrl1 = cpu_to_be64(cntrl);
+
+	/* Fill in the body of the TX Packet CPL message with either in-lined
+	 * data or a Scatter/Gather List.
+	 */
+	if (t4vf_is_eth_imm(skb)) {
+		/* In-line the packet's data and free the skb since we don't
+		 * need it any longer.
+		 */
+		cxgb4_inline_tx_skb(skb, &txq->q, cpl + 1);
+		dev_consume_skb_any(skb);
+	} else {
+		/* Write the skb's Scatter/Gather list into the TX Packet CPL
+		 * message and retain a pointer to the skb so we can free it
+		 * later when its DMA completes.  (We store the skb pointer
+		 * in the Software Descriptor corresponding to the last TX
+		 * Descriptor used by the Work Request.)
+		 *
+		 * The retained skb will be freed when the corresponding TX
+		 * Descriptors are reclaimed after their DMAs complete.
+		 * However, this could take quite a while since, in general,
+		 * the hardware is set up to be lazy about sending DMA
+		 * completion notifications to us and we mostly perform TX
+		 * reclaims in the transmit routine.
+		 *
+		 * This is good for performance but means that we rely on new
+		 * TX packets arriving to run the destructors of completed
+		 * packets, which open up space in their sockets' send queues.
+		 * Sometimes we do not get such new packets causing TX to
+		 * stall.  A single UDP transmitter is a good example of this
+		 * situation.  We have a clean up timer that periodically
+		 * reclaims completed packets but it doesn't run often enough
+		 * (nor do we want it to) to prevent lengthy stalls.  A
+		 * solution to this problem is to run the destructor early,
+		 * after the packet is queued but before it's DMAd.  A con is
+		 * that we lie to socket memory accounting, but the amount of
+		 * extra memory is reasonable (limited by the number of TX
+		 * descriptors), the packets do actually get freed quickly by
+		 * new packets almost always, and for protocols like TCP that
+		 * wait for acks to really free up the data the extra memory
+		 * is even less.  On the positive side we run the destructors
+		 * on the sending CPU rather than on a potentially different
+		 * completing CPU, usually a good thing.
+		 *
+		 * Run the destructor before telling the DMA engine about the
+		 * packet to make sure it doesn't complete and get freed
+		 * prematurely.
+		 */
+		struct ulptx_sgl *sgl = (struct ulptx_sgl *)(cpl + 1);
+		struct sge_txq *tq = &txq->q;
+		int last_desc;
+
+		/* If the Work Request header was an exact multiple of our TX
+		 * Descriptor length, then it's possible that the starting SGL
+		 * pointer lines up exactly with the end of our TX Descriptor
+		 * ring.  If that's the case, wrap around to the beginning
+		 * here ...
+		 */
+		if (unlikely((void *)sgl == (void *)tq->stat)) {
+			sgl = (void *)tq->desc;
+			end = (void *)((void *)tq->desc +
+				       ((void *)end - (void *)tq->stat));
+		}
+
+		cxgb4_write_sgl(skb, tq, sgl, end, 0, addr);
+		skb_orphan(skb);
+
+		last_desc = tq->pidx + ndesc - 1;
+		if (last_desc >= tq->size)
+			last_desc -= tq->size;
+		tq->sdesc[last_desc].skb = skb;
+		tq->sdesc[last_desc].sgl = sgl;
+	}
+
+	/* Advance our internal TX Queue state, tell the hardware about
+	 * the new TX descriptors and return success.
+	 */
+	txq_advance(&txq->q, ndesc);
+
+	cxgb4_ring_tx_db(adapter, &txq->q, ndesc);
+	return NETDEV_TX_OK;
+
+out_free:
+	/* An error of some sort happened.  Free the TX skb and tell the
+	 * OS that we've "dealt" with the packet ...
+	 */
+	dev_kfree_skb_any(skb);
+	return NETDEV_TX_OK;
+}
+
+netdev_tx_t t4_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	struct port_info *pi = netdev_priv(dev);
+
+	if (unlikely(pi->eth_flags & PRIV_FLAG_PORT_TX_VM))
+		return cxgb4_vf_eth_xmit(skb, dev);
+
+	return cxgb4_eth_xmit(skb, dev);
+}
+
 /**
  *	reclaim_completed_tx_imm - reclaim completed control-queue Tx descs
  *	@q: the SGE control Tx queue
-- 
2.1.0

^ permalink raw reply related
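
The software-descriptor bookkeeping in the patch above stores the skb at the last TX descriptor used by the Work Request, wrapping around the ring when needed. A minimal user-space sketch of that index arithmetic follows; the function name `demo_last_desc` and its parameters are hypothetical stand-ins for `tq->pidx`, `ndesc` and `tq->size`, not part of the driver.

```c
#include <assert.h>

/* Hypothetical helper mirroring the patch's computation:
 *   last_desc = tq->pidx + ndesc - 1;
 *   if (last_desc >= tq->size) last_desc -= tq->size;
 * A single subtraction is enough as long as ndesc <= size.
 */
static unsigned int demo_last_desc(unsigned int pidx, unsigned int ndesc,
				   unsigned int size)
{
	unsigned int last;

	/* Assumed preconditions: valid producer index, 1..size descriptors. */
	assert(size > 0 && pidx < size && ndesc >= 1 && ndesc <= size);

	last = pidx + ndesc - 1;
	if (last >= size)
		last -= size;	/* wrapped past the end of the ring */
	return last;
}
```

For example, with an 8-entry ring, a producer index of 6 and a 3-descriptor request, the request occupies slots 6, 7 and 0, so the skb pointer would be stored in slot 0.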

* [PATCH net-next V4 2/2] cxgb4: Support ethtool private flags
From: Ganesh Goudar @ 2018-06-26 11:40 UTC (permalink / raw)
  To: netdev, davem
  Cc: nirranjan, indranil, venkatesh, Arjun Vynipadath, Casey Leedom,
	Ganesh Goudar

From: Arjun Vynipadath <arjun@chelsio.com>

Add support for ethtool private flags. The port_tx_vm_wr flag switches
a port's TX path to VM work requests (FW_ETH_TX_PKT_VM_WR), which helps
in host->vf communication.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
---
V4: No changes

V3: No changes

V2: No changes
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c | 42 ++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index f7eef93..ddb8b9e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -177,6 +177,10 @@ static char loopback_stats_strings[][ETH_GSTRING_LEN] = {
 	"bg3_frames_trunc       ",
 };
 
+static const char cxgb4_priv_flags_strings[][ETH_GSTRING_LEN] = {
+	[PRIV_FLAG_PORT_TX_VM_BIT] = "port_tx_vm_wr",
+};
+
 static int get_sset_count(struct net_device *dev, int sset)
 {
 	switch (sset) {
@@ -185,6 +189,8 @@ static int get_sset_count(struct net_device *dev, int sset)
 		       ARRAY_SIZE(adapter_stats_strings) +
 		       ARRAY_SIZE(channel_stats_strings) +
 		       ARRAY_SIZE(loopback_stats_strings);
+	case ETH_SS_PRIV_FLAGS:
+		return ARRAY_SIZE(cxgb4_priv_flags_strings);
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -235,6 +241,7 @@ static void get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
 			 FW_HDR_FW_VER_MINOR_G(exprom_vers),
 			 FW_HDR_FW_VER_MICRO_G(exprom_vers),
 			 FW_HDR_FW_VER_BUILD_G(exprom_vers));
+	info->n_priv_flags = ARRAY_SIZE(cxgb4_priv_flags_strings);
 }
 
 static void get_strings(struct net_device *dev, u32 stringset, u8 *data)
@@ -250,6 +257,9 @@ static void get_strings(struct net_device *dev, u32 stringset, u8 *data)
 		data += sizeof(channel_stats_strings);
 		memcpy(data, loopback_stats_strings,
 		       sizeof(loopback_stats_strings));
+	} else if (stringset == ETH_SS_PRIV_FLAGS) {
+		memcpy(data, cxgb4_priv_flags_strings,
+		       sizeof(cxgb4_priv_flags_strings));
 	}
 }
 
@@ -1499,6 +1509,36 @@ static int cxgb4_get_module_eeprom(struct net_device *dev,
 			 offset, len, &data[eprom->len - len]);
 }
 
+static u32 cxgb4_get_priv_flags(struct net_device *netdev)
+{
+	struct port_info *pi = netdev_priv(netdev);
+	struct adapter *adapter = pi->adapter;
+
+	return (adapter->eth_flags | pi->eth_flags);
+}
+
+/**
+ *	set_flags - set/unset specified flags if passed in new_flags
+ *	@cur_flags: pointer to current flags
+ *	@new_flags: new incoming flags
+ *	@flags: set of flags to set/unset
+ */
+static inline void set_flags(u32 *cur_flags, u32 new_flags, u32 flags)
+{
+	*cur_flags = (*cur_flags & ~flags) | (new_flags & flags);
+}
+
+static int cxgb4_set_priv_flags(struct net_device *netdev, u32 flags)
+{
+	struct port_info *pi = netdev_priv(netdev);
+	struct adapter *adapter = pi->adapter;
+
+	set_flags(&adapter->eth_flags, flags, PRIV_FLAGS_ADAP);
+	set_flags(&pi->eth_flags, flags, PRIV_FLAGS_PORT);
+
+	return 0;
+}
+
 static const struct ethtool_ops cxgb_ethtool_ops = {
 	.get_link_ksettings = get_link_ksettings,
 	.set_link_ksettings = set_link_ksettings,
@@ -1535,6 +1575,8 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
 	.get_dump_data     = get_dump_data,
 	.get_module_info   = cxgb4_get_module_info,
 	.get_module_eeprom = cxgb4_get_module_eeprom,
+	.get_priv_flags    = cxgb4_get_priv_flags,
+	.set_priv_flags    = cxgb4_set_priv_flags,
 };
 
 void cxgb4_set_ethtool_ops(struct net_device *netdev)
-- 
2.1.0

^ permalink raw reply related
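
The set_flags() helper in the patch above updates only the bits selected by a mask, which is how one incoming ethtool flags word gets split between adapter-wide and per-port state. A minimal user-space sketch of that masking is shown below; the `DEMO_*` bit assignments are hypothetical, since the real PRIV_FLAGS_ADAP and PRIV_FLAGS_PORT definitions are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag layout, for illustration only. */
#define DEMO_FLAG_ADAP_A	(1u << 0)
#define DEMO_FLAG_PORT_B	(1u << 1)
#define DEMO_FLAGS_ADAP		(DEMO_FLAG_ADAP_A)
#define DEMO_FLAGS_PORT		(DEMO_FLAG_PORT_B)

/* Same expression as set_flags() in the patch: clear the masked bits
 * in *cur, then copy those bits from new_flags.  Bits outside the
 * mask are left untouched.
 */
static void demo_set_flags(uint32_t *cur, uint32_t new_flags, uint32_t mask)
{
	assert(cur != NULL);
	*cur = (*cur & ~mask) | (new_flags & mask);
}
```

Calling demo_set_flags() twice with the same incoming word, once with the adapter mask and once with the port mask, leaves each state word holding only the bits it owns, which matches the shape of cxgb4_set_priv_flags().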

* Re: [virtio-dev] Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net
From: Cornelia Huck @ 2018-06-26 11:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Siwei Liu, Samudrala, Sridhar, Alexander Duyck, virtio-dev,
	aaron.f.brown, Jiri Pirko, Jakub Kicinski, Netdev, qemu-devel,
	virtualization, konrad.wilk, boris.ostrovsky, Joao Martins,
	Venu Busireddy, vijay.balakrishna
In-Reply-To: <20180626043839-mutt-send-email-mst@kernel.org>

On Tue, 26 Jun 2018 04:46:03 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Jun 25, 2018 at 11:55:12AM +0200, Cornelia Huck wrote:
> > On Fri, 22 Jun 2018 22:05:50 +0300
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > On Fri, Jun 22, 2018 at 05:09:55PM +0200, Cornelia Huck wrote:  
> > > > On Thu, 21 Jun 2018 21:20:13 +0300
> > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > >     
> > > > > On Thu, Jun 21, 2018 at 04:59:13PM +0200, Cornelia Huck wrote:    
> > > > > > OK, so what about the following:
> > > > > > 
> > > > > > - introduce a new feature bit, VIRTIO_NET_F_STANDBY_UUID that indicates
> > > > > >   that we have a new uuid field in the virtio-net config space
> > > > > > - in QEMU, add a property for virtio-net that allows to specify a uuid,
> > > > > >   offer VIRTIO_NET_F_STANDBY_UUID if set
> > > > > > - when configuring, set the property to the group UUID of the vfio-pci
> > > > > >   device
> > > > > > - in the guest, use the uuid from the virtio-net device's config space
> > > > > >   if applicable; else, fall back to matching by MAC as done today
> > > > > > 
> > > > > > That should work for all virtio transports.      
> > > > > 
> > > > > True. I'm a bit unhappy that it's virtio net specific though
> > > > > since down the road I expect we'll have a very similar feature
> > > > > for scsi (and maybe others).
> > > > > 
> > > > > But we do not have a way to have fields that are portable
> > > > > both across devices and transports, and I think it would
> > > > > be a useful addition. How would this work though? Any idea?    
> > > > 
> > > > Can we introduce some kind of device-independent config space area?
> > > > Pushing back the device-specific config space by a certain value if the
> > > > appropriate feature is negotiated and use that for things like the uuid?    
> > > 
> > > So config moves back and forth?
> > > Reminds me of the msi vector mess we had with pci.  
> > 
> > Yes, that would be a bit unfortunate.
> >   
> > > I'd rather have every transport add a new config.  
> > 
> > You mean via different mechanisms?  
> 
> I guess so.

Is there an alternate mechanism for pci to use? (Not so familiar with
it.)

For ccw, this needs more thought. We already introduced two commands
for reading/writing the config space (a concept that does not really
exist on s390). There's the generic read configuration data command,
but the data returned by it is not really generic enough. So we would
need one new command (or two, if we need to write as well). I'm not
sure about that yet.

> 
> > >   
> > > > But regardless of that, I'm not sure whether extending this approach to
> > > > other device types is the way to go. Tying together two different
> > > > devices is creating complicated situations at least in the hypervisor
> > > > (even if it's fairly straightforward in the guest). [I have not come
> > > > around again to look at the "how to handle visibility in QEMU"
> > > > questions due to lack of cycles, sorry about that.]
> > > > 
> > > > So, what's the goal of this approach? Only to allow migration with
> > > > vfio-pci, or also to plug in a faster device and use it instead of an
> > > > already attached paravirtualized device?    
> > > 
> > > These are two sides of the same coin, I think the second approach
> > > is closer to what we are doing here.  
> > 
> > Thinking about it, do we need any knob to keep the vfio device
> > invisible if the virtio device is not present? IOW, how does the
> > hypervisor know that the vfio device is supposed to be paired with a
> > virtio device? It seems we need an explicit tie-in.  
> 
> If we are going the way of the bridge, both bridge and
> virtio would have some kind of id.

So the presence of the id would indicate "this is one part of a pair"?

> 
> When pairing using mac, I'm less sure. Pass vfio device mac to qemu
> as a property?

That feels a bit odd. "This is the vfio device's mac, use this instead
of your usual mac property"? As we have not designed the QEMU interface
yet, just go with the id in any case? The guest can still match by mac.

> > > > What about migration of vfio devices that are not easily replaced by a
> > > > paravirtualized device? I'm thinking of vfio-ccw, where our main (and
> > > > currently only) supported device is dasd (disks) -- which can do a lot
> > > > of specialized things that virtio-blk does not support (and should not
> > > > or even cannot support).    
> > > 
> > > But maybe virtio-scsi can?  
> > 
> > I don't think so. Dasds have some channel commands that don't map
> > easily to scsi commands.  
> 
> There's always a choice of adding these to the spec.
> E.g. FC extensions were proposed, I don't remember why they
> are still stuck.

FC extensions are a completely different kind of enhancements, though.
For a start, they are not unique to a certain transport.

Also, we have a whole list of special dasd issues. Weird disk layout
for eckd, low-level disk formatting, etc. (See the list of commands in
drivers/s390/block/dasd_eckd.h for an idea. There's also no public
documentation AFAICS; https://en.wikipedia.org/wiki/ECKD does not link
to anything interesting.) I don't think we want to cram stuff like this
into a completely different framework.

^ permalink raw reply
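
The guest-side matching rule proposed earlier in this thread (use the uuid from the virtio-net config space if applicable, else fall back to matching by MAC) could be sketched roughly as follows. Everything here is an assumption for illustration: the struct, its field names and the `has_uuid` flag (standing in for "VIRTIO_NET_F_STANDBY_UUID was negotiated and a uuid is known") are hypothetical and do not come from any existing virtio or QEMU interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical identity record for one network device in the guest. */
struct demo_netdev_id {
	bool    has_uuid;	/* assumed: a group uuid is available */
	uint8_t uuid[16];
	uint8_t mac[6];
};

/* Pair by uuid when both sides carry one, otherwise by MAC as today. */
static bool demo_is_pair(const struct demo_netdev_id *standby,
			 const struct demo_netdev_id *primary)
{
	assert(standby != NULL && primary != NULL);

	if (standby->has_uuid && primary->has_uuid)
		return memcmp(standby->uuid, primary->uuid,
			      sizeof(standby->uuid)) == 0;

	return memcmp(standby->mac, primary->mac,
		      sizeof(standby->mac)) == 0;
}
```

How the guest would learn the uuid of the vfio-pci device is exactly the open question in the discussion, so the `primary->has_uuid` side of the comparison is the least certain part of this sketch.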

* Re: [PATCH net-next 3/5] sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams
From: 吉藤英明 @ 2018-06-26 12:02 UTC (permalink / raw)
  To: Xin Long
  Cc: Marcelo Ricardo Leitner, Neil Horman, David Miller, network dev,
	linux-sctp, yoshfuji, 吉藤英明
In-Reply-To: <CADvbK_dpvT1aMtEdARkXQ1b6O5b-QsXTMwBdJsS4bFYnMd=X4Q@mail.gmail.com>

2018-06-26 13:33 GMT+09:00 Xin Long <lucien.xin@gmail.com>:
> On Tue, Jun 26, 2018 at 12:31 AM, Marcelo Ricardo Leitner
> <marcelo.leitner@gmail.com> wrote:
>> Hi,
>>
>> On Tue, Jun 26, 2018 at 01:12:00AM +0900, 吉藤英明 wrote:
>>> Hi,
>>>
>>> 2018-06-25 22:03 GMT+09:00 Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>:
>>> > On Mon, Jun 25, 2018 at 07:28:47AM -0400, Neil Horman wrote:
>>> >> On Mon, Jun 25, 2018 at 04:31:26PM +0900, David Miller wrote:
>>> >> > From: Xin Long <lucien.xin@gmail.com>
>>> >> > Date: Mon, 25 Jun 2018 10:14:35 +0800
>>> >> >
>>> >> > >  struct sctp_paddrparams {
>>> >> > > @@ -773,6 +775,8 @@ struct sctp_paddrparams {
>>> >> > >   __u32                   spp_pathmtu;
>>> >> > >   __u32                   spp_sackdelay;
>>> >> > >   __u32                   spp_flags;
>>> >> > > + __u32                   spp_ipv6_flowlabel;
>>> >> > > + __u8                    spp_dscp;
>>> >> > >  } __attribute__((packed, aligned(4)));
>>> >> >
>>> >> > I don't think you can change the size of this structure like this.
>>> >> >
>>> >> > This check in sctp_setsockopt_peer_addr_params():
>>> >> >
>>> >> >     if (optlen != sizeof(struct sctp_paddrparams))
>>> >> >             return -EINVAL;
>>> >> >
>>> >> > is going to trigger in old kernels when executing programs
>>> >> > built against the new struct definition.
>>> >
>>> > That will happen, yes, but do we really care about being future-proof
>>> > here? I mean: if we also update such check(s) to support dealing with
>>> > smaller-than-supported structs, newer kernels will be able to run
>>> > programs built against the old struct, and the new one; while building
>>> > using newer headers and running on older kernel may fool the
>>> > application in other ways too (like enabling support for something
>>> > that is available on newer kernel and that is not present in the older
>>> > one).
>>>
>>> We should not break existing apps.
>>> We still accept apps of pre-2.4 era without sin6_scope_id
>>> (e.g., net/ipv6/af_inet6.c:inet6_bind()).
>>
>> Yes. That's what I tried to say. That is supporting an old app built
>> with old kernel headers and running on a newer kernel, and not the
>> other way around (an app built with fresh headers and running on an
>> old kernel).
> To do that, I will update the check like this:
>
> diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> index 1df5d07..c949d8c 100644
> --- a/net/sctp/socket.c
> +++ b/net/sctp/socket.c
> @@ -2715,13 +2715,18 @@ static int sctp_setsockopt_peer_addr_params(struct sock *sk,
>         struct sctp_sock        *sp = sctp_sk(sk);
>         int error;
>         int hb_change, pmtud_change, sackdelay_change;
> +       int plen = sizeof(params);
> +       int old_plen = plen - sizeof(u32) * 2;

if (optlen < offsetof(struct sctp_paddrparams, spp_ipv6_flowlabel))
maybe?

>
> -       if (optlen != sizeof(struct sctp_paddrparams))
> +       if (optlen != plen && optlen != old_plen)
>                 return -EINVAL;
>
>         if (copy_from_user(&params, optval, optlen))
>                 return -EFAULT;
>
> +       if (optlen == old_plen)
> +               params.spp_flags &= ~(SPP_DSCP | SPP_IPV6_FLOWLABEL);

I think we should return -EINVAL if the size is not the new one.

--yoshfuji

> +
>         /* Validate flags and value parameters. */
>         hb_change        = params.spp_flags & SPP_HB;
>         pmtud_change     = params.spp_flags & SPP_PMTUD;
> @@ -5591,10 +5596,13 @@ static int sctp_getsockopt_peer_addr_params(struct sock *sk, int len,
>         struct sctp_transport   *trans = NULL;
>         struct sctp_association *asoc = NULL;
>         struct sctp_sock        *sp = sctp_sk(sk);
> +       int plen = sizeof(params);
> +       int old_plen = plen - sizeof(u32) * 2;
>
> -       if (len < sizeof(struct sctp_paddrparams))
> +       if (len < old_plen)
>                 return -EINVAL;
> -       len = sizeof(struct sctp_paddrparams);
> +
> +       len = len >= plen ? plen : old_plen;
>         if (copy_from_user(&params, optval, len))
>                 return -EFAULT;
>
> does it look ok to you?

^ permalink raw reply
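
The length handling discussed in this message can be tried out in user space. The sketch below uses a cut-down stand-in for struct sctp_paddrparams (only the tail fields quoted in the thread) and hypothetical `DEMO_SPP_*` flag values; it combines the two-size check from the quoted diff with the offsetof() spelling of the old length suggested in the reply. It assumes a GCC or Clang style `__attribute__((packed, aligned(4)))`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; the real SPP_* values are not shown here. */
#define DEMO_SPP_IPV6_FLOWLABEL	(1u << 8)
#define DEMO_SPP_DSCP		(1u << 9)

/* Cut-down stand-in for struct sctp_paddrparams. */
struct demo_paddrparams {
	uint32_t spp_pathmtu;
	uint32_t spp_sackdelay;
	uint32_t spp_flags;
	uint32_t spp_ipv6_flowlabel;	/* new field */
	uint8_t  spp_dscp;		/* new field */
} __attribute__((packed, aligned(4)));

/* Old layout ends where the first new field starts. */
#define DEMO_OLD_LEN offsetof(struct demo_paddrparams, spp_ipv6_flowlabel)
#define DEMO_NEW_LEN sizeof(struct demo_paddrparams)

/* Accept either layout; for the old one, drop flags that refer to
 * fields the caller could not have supplied.
 */
static int demo_fixup_params(struct demo_paddrparams *p, size_t optlen)
{
	assert(p != NULL);

	if (optlen != DEMO_NEW_LEN && optlen != DEMO_OLD_LEN)
		return -EINVAL;

	if (optlen == DEMO_OLD_LEN)
		p->spp_flags &= ~(DEMO_SPP_DSCP | DEMO_SPP_IPV6_FLOWLABEL);

	return 0;
}
```

Because of aligned(4), the 5 bytes of new fields are padded to 8, so in this stand-in the new size minus the old size equals 2 * sizeof(u32), consistent with the `old_plen` computation in the quoted diff. The offsetof() form does not depend on that padding arithmetic.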

* Re: [offlist] Re: Crash in netlink/sk_filter_trim_cap on ARMv7 on 4.18rc1
From: Peter Robinson @ 2018-06-26 12:23 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Eric Dumazet, netdev, linux-arm-kernel, labbott
In-Reply-To: <CALeDE9OOrZUnaNpzkYPU30iN=4HFQaqEomjf14EO5EtcnHu8OQ@mail.gmail.com>

Hi Daniel,

>>> On 06/24/2018 11:24 AM, Peter Robinson wrote:
>>>>>> I'm seeing this netlink/sk_filter_trim_cap crash on ARMv7 across quite
>>>>>> a few ARMv7 platforms on Fedora with 4.18rc1. I've tested RPi2/RPi3
>>>>>> (doesn't happen on aarch64), AllWinner H3, BeagleBone and a few
>>>>>> others, both LPAE/normal kernels.
>>>
>>> So this is arm32 right?
>>
>> Correct.
>>
>>>>>> I'm a bit out of my depth in this part of the kernel but I'm wondering
>>>>>> if it's known, I couldn't find anything that looked obvious on a few
>>>>>> mailing lists.
>>>>>>
>>>>>> Peter
>>>>>
>>>>> Hi Peter
>>>>>
>>>>> Could you provide symbolic information ?
>>>>
>>>> I passed it through scripts/decode_stacktrace.sh; is that what you were after?
>>>>
>>>> [    8.673880] Internal error: Oops: a06 [#10] SMP ARM
>>>> [    8.673949] ---[ end trace 049df4786ea3140a ]---
>>>> [    8.678754] Modules linked in:
>>>> [    8.678766] CPU: 1 PID: 206 Comm: systemd-udevd Tainted: G      D
>>>>         4.18.0-0.rc1.git0.1.fc29.armv7hl+lpae #1
>>>> [    8.678769] Hardware name: Allwinner sun8i Family
>>>> [    8.678781] PC is at sk_filter_trim_cap ()
>>>> [    8.678790] LR is at   (null)
>>>> [    8.709463] pc : lr : psr: 60000013 ()
>>>> [    8.715722] sp : c996bd60  ip : 00000000  fp : 00000000
>>>> [    8.720939] r10: ee79dc00  r9 : c12c9f80  r8 : 00000000
>>>> [    8.726157] r7 : 00000000  r6 : 00000001  r5 : f1648000  r4 : 00000000
>>>> [    8.732674] r3 : 00000007  r2 : 00000000  r1 : 00000000  r0 : 00000000
>>>> [    8.739193] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
>>>> [    8.746318] Control: 30c5387d  Table: 6e7bc880  DAC: ffe75ece
>>>> [    8.752055] Process systemd-udevd (pid: 206, stack limit = 0x(ptrval))
>>>> [    8.758574] Stack: (0xc996bd60 to 0xc996c000)
>>>
>>> Do you have BPF JIT enabled or disabled? Does it happen with disabled?
>>
>> Enabled, I can test with it disabled, BPF configs bits are:
>> CONFIG_BPF_EVENTS=y
>> # CONFIG_BPFILTER is not set
>> CONFIG_BPF_JIT_ALWAYS_ON=y
>> CONFIG_BPF_JIT=y
>> CONFIG_BPF_STREAM_PARSER=y
>> CONFIG_BPF_SYSCALL=y
>> CONFIG_BPF=y
>> CONFIG_CGROUP_BPF=y
>> CONFIG_HAVE_EBPF_JIT=y
>> CONFIG_IPV6_SEG6_BPF=y
>> CONFIG_LWTUNNEL_BPF=y
>> # CONFIG_NBPFAXI_DMA is not set
>> CONFIG_NET_ACT_BPF=m
>> CONFIG_NET_CLS_BPF=m
>> CONFIG_NETFILTER_XT_MATCH_BPF=m
>> # CONFIG_TEST_BPF is not set
>>
>>> I can see one bug, but your stack trace seems unrelated.
>>>
>>> Anyway, could you try with this?
>>
>> Build in process.
>>
>>> diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
>>> index 6e8b716..f6a62ae 100644
>>> --- a/arch/arm/net/bpf_jit_32.c
>>> +++ b/arch/arm/net/bpf_jit_32.c
>>> @@ -1844,7 +1844,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
>>>                 /* there are 2 passes here */
>>>                 bpf_jit_dump(prog->len, image_size, 2, ctx.target);
>>>
>>> -       set_memory_ro((unsigned long)header, header->pages);
>>> +       bpf_jit_binary_lock_ro(header);
>>>         prog->bpf_func = (void *)ctx.target;
>>>         prog->jited = 1;
>>>         prog->jited_len = image_size;
>
> So with that and the other fix there was no improvement, with those
> and the BPF JIT disabled it works, I'm not sure if the two patches
> have any effect with the JIT disabled though.
>
> Will look at the other patches shortly, there's been some other issue
> introduced between rc1 and rc2 which I have to work out before I can
> test those though.

Quick update: with Linus's head as of yesterday (basically rc2 plus
davem's network fixes), it works if the JIT is disabled, i.e.:
# CONFIG_BPF_JIT_ALWAYS_ON is not set
# CONFIG_BPF_JIT is not set

If I enable it the boot breaks even worse than the errors above in
that I get no console output at all, even with earlycon, so we've gone
backwards since rc1 somehow.

I'll try with the above two reverted unless you have any other suggestions.

Peter

^ permalink raw reply

* [PATCH net-next v2 0/2] net: mscc: ocelot: add more features
From: Alexandre Belloni @ 2018-06-26 12:28 UTC (permalink / raw)
  To: David S . Miller
  Cc: Allan Nielsen, razvan.stefanescu, po.liu, Thomas Petazzoni,
	Andrew Lunn, Florian Fainelli, netdev, linux-kernel,
	Alexandre Belloni

Hi,

This series adds link aggregation and VLAN filtering hardware offload
support to the ocelot driver.

PTP support will be sent later.

changes in v2:
 - rebased on v4.18-rc1
 - check for aggregation type and only offload it when type is hash (balance-xor
   or 802.3ad)

Alexandre Belloni (1):
  net: mscc: ocelot: add bonding support

Antoine Tenart (1):
  net: mscc: ocelot: add VLAN filtering

 drivers/net/ethernet/mscc/ocelot.c | 445 ++++++++++++++++++++++++++++-
 drivers/net/ethernet/mscc/ocelot.h |   2 +-
 2 files changed, 444 insertions(+), 3 deletions(-)

-- 
2.18.0

^ permalink raw reply
