Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables
From: John Fastabend @ 2015-01-06  1:19 UTC (permalink / raw)
  To: Simon Horman; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy
In-Reply-To: <20150106010942.GD14077@vergenet.net>

On 01/05/2015 05:09 PM, Simon Horman wrote:
> On Mon, Jan 05, 2015 at 04:45:50PM -0800, John Fastabend wrote:
>> [...]
>>
>>>>> +/**
>>>>> + * @struct net_flow_field_ref
>>>>> + * @brief uniquely identify field as header:field tuple
>>>>> + */
>>>>> +struct net_flow_field_ref {
>>>>> +    int instance;
>>>>> +    int header;
>>>>> +    int field;
>>>>> +    int mask_type;
>>>>> +    int type;
>>>>> +    union {    /* Are these all the required data types */
>>>>> +        __u8 value_u8;
>>>>> +        __u16 value_u16;
>>>>> +        __u32 value_u32;
>>>>> +        __u64 value_u64;
>>>>> +    };
>>>>> +    union {    /* Are these all the required data types */
>>>>> +        __u8 mask_u8;
>>>>> +        __u16 mask_u16;
>>>>> +        __u32 mask_u32;
>>>>> +        __u64 mask_u64;
>>>>> +    };
>>>>> +};
>>>>
>>>> Does it make sense to write this as follows?
>>>
>>> Yes. I'll make this update it helps make it clear value/mask pairs are
>>> needed.
>>>
>>>>
>>>> union {
>>>>          struct {
>>>>                  __u8 value_u8;
>>>>                  __u8 mask_u8;
>>>>          };
>>>>          struct {
>>>>                  __u16 value_u16;
>>>>                  __u16 mask_u16;
>>>>          };
>>>>          ...
>>>> };
>>
>> Another thought is to pull this entirely out of the structure and hide
>> it from the UAPI so we can add more value/mask types as needed without
>> having to spin versions of net_flow_field_ref. On the other hand I've
>> been able to fit all my fields in these types so far and I can't think
>> of any additions we need at the moment.
>
> FWIW, I think it would be cleaner to break both field_ref and action_args
> out into attributes and not expose the structures to user-space. But
> perhaps there is an advantage to dealing with structures directly that
> I am missing.
>

I  came to the same conclusion just now as well. I'm reworking it now
for v2.

-- 
John Fastabend         Intel Corporation

^ permalink raw reply

* [PATCH V2] net: eth: xgene: change APM X-Gene SoC platform ethernet to support ACPI
From: Feng Kan @ 2015-01-06  1:42 UTC (permalink / raw)
  To: patches, davem, netdev, linux-kernel; +Cc: Feng Kan

This adds support for APM X-Gene ethernet driver to use ACPI table to derive
ethernet driver parameter.

Signed-off-by: Feng Kan <fkan@apm.com>
---
V2:
   - remove NO_MAC define

 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c   | 94 +++++++++++++++++------
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 95 +++++++++++++++++++-----
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h |  3 +
 3 files changed, 149 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index 7ba83ff..869d97f 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -593,10 +593,12 @@ static int xgene_enet_reset(struct xgene_enet_pdata *pdata)
 	if (!xgene_ring_mgr_init(pdata))
 		return -ENODEV;
 
-	clk_prepare_enable(pdata->clk);
-	clk_disable_unprepare(pdata->clk);
-	clk_prepare_enable(pdata->clk);
-	xgene_enet_ecc_init(pdata);
+	if (!efi_enabled(EFI_BOOT)) {
+		clk_prepare_enable(pdata->clk);
+		clk_disable_unprepare(pdata->clk);
+		clk_prepare_enable(pdata->clk);
+		xgene_enet_ecc_init(pdata);
+	}
 	xgene_enet_config_ring_if_assoc(pdata);
 
 	/* Enable auto-incr for scanning */
@@ -663,15 +665,20 @@ static int xgene_enet_phy_connect(struct net_device *ndev)
 	struct phy_device *phy_dev;
 	struct device *dev = &pdata->pdev->dev;
 
-	phy_np = of_parse_phandle(dev->of_node, "phy-handle", 0);
-	if (!phy_np) {
-		netdev_dbg(ndev, "No phy-handle found\n");
-		return -ENODEV;
+	if (dev->of_node) {
+		phy_np = of_parse_phandle(dev->of_node, "phy-handle", 0);
+		if (!phy_np) {
+			netdev_dbg(ndev, "No phy-handle found in DT\n");
+			return -ENODEV;
+		}
+		pdata->phy_dev = of_phy_find_device(phy_np);
 	}
 
-	phy_dev = of_phy_connect(ndev, phy_np, &xgene_enet_adjust_link,
-				 0, pdata->phy_mode);
-	if (!phy_dev) {
+	phy_dev = pdata->phy_dev;
+
+	if (!phy_dev ||
+	    phy_connect_direct(ndev, phy_dev, &xgene_enet_adjust_link,
+			       pdata->phy_mode)) {
 		netdev_err(ndev, "Could not connect to PHY\n");
 		return  -ENODEV;
 	}
@@ -681,32 +688,71 @@ static int xgene_enet_phy_connect(struct net_device *ndev)
 			      ~SUPPORTED_100baseT_Half &
 			      ~SUPPORTED_1000baseT_Half;
 	phy_dev->advertising = phy_dev->supported;
-	pdata->phy_dev = phy_dev;
 
 	return 0;
 }
 
-int xgene_enet_mdio_config(struct xgene_enet_pdata *pdata)
+static int xgene_mdiobus_register(struct xgene_enet_pdata *pdata,
+				  struct mii_bus *mdio)
 {
-	struct net_device *ndev = pdata->ndev;
 	struct device *dev = &pdata->pdev->dev;
+	struct net_device *ndev = pdata->ndev;
+	struct phy_device *phy;
 	struct device_node *child_np;
 	struct device_node *mdio_np = NULL;
-	struct mii_bus *mdio_bus;
 	int ret;
+	u32 phy_id;
+
+	if (dev->of_node) {
+		for_each_child_of_node(dev->of_node, child_np) {
+			if (of_device_is_compatible(child_np,
+						    "apm,xgene-mdio")) {
+				mdio_np = child_np;
+				break;
+			}
+		}
 
-	for_each_child_of_node(dev->of_node, child_np) {
-		if (of_device_is_compatible(child_np, "apm,xgene-mdio")) {
-			mdio_np = child_np;
-			break;
+		if (!mdio_np) {
+			netdev_dbg(ndev, "No mdio node in the dts\n");
+			return -ENXIO;
 		}
-	}
 
-	if (!mdio_np) {
-		netdev_dbg(ndev, "No mdio node in the dts\n");
-		return -ENXIO;
+		return of_mdiobus_register(mdio, mdio_np);
 	}
 
+	/* Mask out all PHYs from auto probing. */
+	mdio->phy_mask = ~0;
+
+	/* Register the MDIO bus */
+	ret = mdiobus_register(mdio);
+	if (ret)
+		return ret;
+
+	ret = device_property_read_u32(dev, "phy-channel", &phy_id);
+	if (ret)
+		ret = device_property_read_u32(dev, "phy-addr", &phy_id);
+	if (ret)
+		return -EINVAL;
+
+	phy = get_phy_device(mdio, phy_id, true);
+	if (!phy || IS_ERR(phy))
+		return -EIO;
+
+	ret = phy_device_register(phy);
+	if (ret)
+		phy_device_free(phy);
+	else
+		pdata->phy_dev = phy;
+
+	return ret;
+}
+
+int xgene_enet_mdio_config(struct xgene_enet_pdata *pdata)
+{
+	struct net_device *ndev = pdata->ndev;
+	struct mii_bus *mdio_bus;
+	int ret;
+
 	mdio_bus = mdiobus_alloc();
 	if (!mdio_bus)
 		return -ENOMEM;
@@ -720,7 +766,7 @@ int xgene_enet_mdio_config(struct xgene_enet_pdata *pdata)
 	mdio_bus->priv = pdata;
 	mdio_bus->parent = &ndev->dev;
 
-	ret = of_mdiobus_register(mdio_bus, mdio_np);
+	ret = xgene_mdiobus_register(pdata, mdio_bus);
 	if (ret) {
 		netdev_err(ndev, "Failed to register MDIO bus\n");
 		mdiobus_free(mdio_bus);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 83a5028..ba579c2 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -24,6 +24,10 @@
 #include "xgene_enet_sgmac.h"
 #include "xgene_enet_xgmac.h"
 
+#define RES_ENET_CSR	0
+#define RES_RING_CSR	1
+#define RES_RING_CMD	2
+
 static void xgene_enet_init_bufpool(struct xgene_enet_desc_ring *buf_pool)
 {
 	struct xgene_enet_raw_desc16 *raw_desc;
@@ -746,6 +750,41 @@ static const struct net_device_ops xgene_ndev_ops = {
 	.ndo_set_mac_address = xgene_enet_set_mac_address,
 };
 
+static int xgene_get_mac_address(struct device *dev,
+				 unsigned char *addr)
+{
+	int ret;
+
+	ret = device_property_read_u8_array(dev, "local-mac-address", addr, 6);
+	if (ret)
+		ret = device_property_read_u8_array(dev, "mac-address",
+						    addr, 6);
+	if (ret)
+		return -ENODEV;
+
+	return ETH_ALEN;
+}
+
+static int xgene_get_phy_mode(struct device *dev)
+{
+	int i, ret;
+	char *modestr;
+
+	ret = device_property_read_string(dev, "phy-connection-type",
+					  (const char **)&modestr);
+	if (ret)
+		ret = device_property_read_string(dev, "phy-mode",
+						  (const char **)&modestr);
+	if (ret)
+		return -ENODEV;
+
+	for (i = 0; i < PHY_INTERFACE_MODE_MAX; i++) {
+		if (!strcasecmp(modestr, phy_modes(i)))
+			return i;
+	}
+	return -ENODEV;
+}
+
 static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 {
 	struct platform_device *pdev;
@@ -753,29 +792,42 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 	struct device *dev;
 	struct resource *res;
 	void __iomem *base_addr;
-	const char *mac;
 	int ret;
 
 	pdev = pdata->pdev;
 	dev = &pdev->dev;
 	ndev = pdata->ndev;
 
-	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "enet_csr");
-	pdata->base_addr = devm_ioremap_resource(dev, res);
+	res = platform_get_resource(pdev, IORESOURCE_MEM, RES_ENET_CSR);
+	if (!res) {
+		dev_err(dev, "Resource enet_csr not defined\n");
+		return -ENODEV;
+	}
+	pdata->base_addr = devm_ioremap(dev, res->start, resource_size(res));
 	if (IS_ERR(pdata->base_addr)) {
 		dev_err(dev, "Unable to retrieve ENET Port CSR region\n");
 		return PTR_ERR(pdata->base_addr);
 	}
 
-	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ring_csr");
-	pdata->ring_csr_addr = devm_ioremap_resource(dev, res);
+	res = platform_get_resource(pdev, IORESOURCE_MEM, RES_RING_CSR);
+	if (!res) {
+		dev_err(dev, "Resource ring_csr not defined\n");
+		return -ENODEV;
+	}
+	pdata->ring_csr_addr = devm_ioremap(dev, res->start,
+							resource_size(res));
 	if (IS_ERR(pdata->ring_csr_addr)) {
 		dev_err(dev, "Unable to retrieve ENET Ring CSR region\n");
 		return PTR_ERR(pdata->ring_csr_addr);
 	}
 
-	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ring_cmd");
-	pdata->ring_cmd_addr = devm_ioremap_resource(dev, res);
+	res = platform_get_resource(pdev, IORESOURCE_MEM, RES_RING_CMD);
+	if (!res) {
+		dev_err(dev, "Resource ring_cmd not defined\n");
+		return -ENODEV;
+	}
+	pdata->ring_cmd_addr = devm_ioremap(dev, res->start,
+							resource_size(res));
 	if (IS_ERR(pdata->ring_cmd_addr)) {
 		dev_err(dev, "Unable to retrieve ENET Ring command region\n");
 		return PTR_ERR(pdata->ring_cmd_addr);
@@ -789,14 +841,12 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 	}
 	pdata->rx_irq = ret;
 
-	mac = of_get_mac_address(dev->of_node);
-	if (mac)
-		memcpy(ndev->dev_addr, mac, ndev->addr_len);
-	else
+	if (xgene_get_mac_address(dev, ndev->dev_addr) != ETH_ALEN)
 		eth_hw_addr_random(ndev);
+
 	memcpy(ndev->perm_addr, ndev->dev_addr, ndev->addr_len);
 
-	pdata->phy_mode = of_get_phy_mode(pdev->dev.of_node);
+	pdata->phy_mode = xgene_get_phy_mode(dev);
 	if (pdata->phy_mode < 0) {
 		dev_err(dev, "Unable to get phy-connection-type\n");
 		return pdata->phy_mode;
@@ -809,11 +859,9 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 	}
 
 	pdata->clk = devm_clk_get(&pdev->dev, NULL);
-	ret = IS_ERR(pdata->clk);
 	if (IS_ERR(pdata->clk)) {
-		dev_err(&pdev->dev, "can't get clock\n");
-		ret = PTR_ERR(pdata->clk);
-		return ret;
+		/* Firmware may have set up the clock already. */
+		pdata->clk = NULL;
 	}
 
 	base_addr = pdata->base_addr;
@@ -924,7 +972,7 @@ static int xgene_enet_probe(struct platform_device *pdev)
 		goto err;
 	}
 
-	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
+	ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(64));
 	if (ret) {
 		netdev_err(ndev, "No usable DMA configuration\n");
 		goto err;
@@ -972,7 +1020,15 @@ static int xgene_enet_remove(struct platform_device *pdev)
 	return 0;
 }
 
-static struct of_device_id xgene_enet_match[] = {
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id xgene_enet_acpi_match[] = {
+	{ "APMC0D05", },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, xgene_enet_acpi_match);
+#endif
+
+static struct of_device_id xgene_enet_of_match[] = {
 	{.compatible = "apm,xgene-enet",},
 	{},
 };
@@ -982,7 +1038,8 @@ MODULE_DEVICE_TABLE(of, xgene_enet_match);
 static struct platform_driver xgene_enet_driver = {
 	.driver = {
 		   .name = "xgene-enet",
-		   .of_match_table = xgene_enet_match,
+		   .of_match_table = of_match_ptr(xgene_enet_of_match),
+		   .acpi_match_table = ACPI_PTR(xgene_enet_acpi_match),
 	},
 	.probe = xgene_enet_probe,
 	.remove = xgene_enet_remove,
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index f9958fa..c2d465c 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -22,7 +22,10 @@
 #ifndef __XGENE_ENET_MAIN_H__
 #define __XGENE_ENET_MAIN_H__
 
+#include <linux/acpi.h>
 #include <linux/clk.h>
+#include <linux/efi.h>
+#include <linux/io.h>
 #include <linux/of_platform.h>
 #include <linux/of_net.h>
 #include <linux/of_mdio.h>
-- 
1.9.1

^ permalink raw reply related

* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables
From: Simon Horman @ 2015-01-06  2:05 UTC (permalink / raw)
  To: John Fastabend; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy
In-Reply-To: <54AB381E.3010009@gmail.com>

On Mon, Jan 05, 2015 at 05:19:26PM -0800, John Fastabend wrote:
> On 01/05/2015 05:09 PM, Simon Horman wrote:
> >On Mon, Jan 05, 2015 at 04:45:50PM -0800, John Fastabend wrote:
> >>[...]
> >>
> >>>>>+/**
> >>>>>+ * @struct net_flow_field_ref
> >>>>>+ * @brief uniquely identify field as header:field tuple
> >>>>>+ */
> >>>>>+struct net_flow_field_ref {
> >>>>>+    int instance;
> >>>>>+    int header;
> >>>>>+    int field;
> >>>>>+    int mask_type;
> >>>>>+    int type;
> >>>>>+    union {    /* Are these all the required data types */
> >>>>>+        __u8 value_u8;
> >>>>>+        __u16 value_u16;
> >>>>>+        __u32 value_u32;
> >>>>>+        __u64 value_u64;
> >>>>>+    };
> >>>>>+    union {    /* Are these all the required data types */
> >>>>>+        __u8 mask_u8;
> >>>>>+        __u16 mask_u16;
> >>>>>+        __u32 mask_u32;
> >>>>>+        __u64 mask_u64;
> >>>>>+    };
> >>>>>+};
> >>>>
> >>>>Does it make sense to write this as follows?
> >>>
> >>>Yes. I'll make this update it helps make it clear value/mask pairs are
> >>>needed.
> >>>
> >>>>
> >>>>union {
> >>>>         struct {
> >>>>                 __u8 value_u8;
> >>>>                 __u8 mask_u8;
> >>>>         };
> >>>>         struct {
> >>>>                 __u16 value_u16;
> >>>>                 __u16 mask_u16;
> >>>>         };
> >>>>         ...
> >>>>};
> >>
> >>Another thought is to pull this entirely out of the structure and hide
> >>it from the UAPI so we can add more value/mask types as needed without
> >>having to spin versions of net_flow_field_ref. On the other hand I've
> >>been able to fit all my fields in these types so far and I can't think
> >>of any additions we need at the moment.
> >
> >FWIW, I think it would be cleaner to break both field_ref and action_args
> >out into attributes and not expose the structures to user-space. But
> >perhaps there is an advantage to dealing with structures directly that
> >I am missing.
> >
> 
> I  came to the same conclusion just now as well. I'm reworking it now
> for v2.

Thanks.

BTW, I think there are a few problems with net_flow_put_flow_action().

I am not quite to the bottom of it but it seems that:
* It loops over a->args[i] and then calls net_flow_put_act_types()
  which performs a similar loop. This outer-loop appears to be incorrect.
* It passes a[i].args instead of a->args[i] to net_flow_put_act_types()

I can post a fix once I've got it working to my satisfaction.
But if you are reworking that code anyway perhaps it is easier for
you to handle it then.

^ permalink raw reply

* Re: [net-next PATCH v1 00/11] A flow API
From: Scott Feldman @ 2015-01-06  2:42 UTC (permalink / raw)
  To: John Fastabend
  Cc: Thomas Graf, Jiří Pírko, Jamal Hadi Salim,
	simon.horman@netronome.com, Netdev, David S. Miller,
	Andy Gospodarek
In-Reply-To: <20141231194057.31070.5244.stgit@nitbit.x32>

On Wed, Dec 31, 2014 at 11:45 AM, John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Finally I have more patches to add support for creating and destroying
> tables. This allows users to define the pipeline at runtime rather
> than statically as rocker does now. After this set gets some traction
> I'll look at pushing them in a next round. However it likely requires
> adding another "world" to rocker.

Yes, it would require another "world" to be added to rocker.  It would
be cool if someone could work on this.  Currently, the only world
rocker supports is OF-DPA, which is based on Broadcom's published
OF-DPA spec.  OF-DPA is a fixed pipeline with predefined tables.  I'd
like to see a "universal machine" world added to rocker that presents
a programmable pipeline and programmable tables.  The nice thing is
most of the rocker device and driver code gets reused fro this new
world.  Nothing changes about the port interface, or the DMA/MSI/PCI
interfaces, or the I/O, cmd, and event paths, so much is reused.

> Another piece that I want to add is
> a description of the actions and metadata. This way user space can
> "learn" what an action is and how metadata interacts with the system.
> This work is under development.

^ permalink raw reply

* [PATCH net-next v2 0/8] net: extend ethtool link mode bitmaps to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev, linux-kernel, linux-api
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan

From: David Decotigny <decot@googlers.com>

History:
  v2:
    - added ethtool_ops::get_compat_flags() to be able to check user
      parameters in generic ethtool layer when a driver does not
      support a new feature.
    - in generic ethtool code, EINVAL when userspace sets a link mode
      with bits 32..47 set, unless driver explicitly indicates that it
      supports it (get_compat_flags). also print a warning once (per
      lifetime, not per-interface) if driver sets bits 32..47 of link
      mode, unless it explicitly indicated that it supports 48bit link
      modes.
    - don't update phydev->advertising too early (ie. keep original behavior)
  v1:
    initial draft

This patch series increases the width of the supported/advertising
ethtool masks from 32 bits to 48. This should allow to breathe for a
couple more years (or... months?).

It should not cause any backward compatibility issues for
now. However, if user-space passed non-0 to previously-reserved fields
to ethtool_set_settings or ethtool_set_eee, they will be rejected by
non-updated (ie. no get_compat_flags) drivers. Updated drivers will
handle the new expanded link mode masks appropriately. It is
recommended we gradually adopt the new get/set API, and provide the
associated get_compat_flags(), in order to correctly support future
link modes. See ethtool.h for details.

I updated a couple drivers (mlx4, veth, tun), and some shared code in
drivers/net (phy, mii, mdio). It might be overkill for phy/mii/mdio,
and I might have missed other shared code in drivers/net. Please let
me know.

I used the compiler on my private copy to 1/ track where the
ethtool_cmd/ethtool_eee link mode fields were used 2/ track where the
link mode bitmaps are used and updated. So I believe that there is
some sort of transitive closure for the code I updated. Maybe tools
like coccinelle or clang could allow to automate these processes (1/
would probably be easy-ish, but 2/ seems a bit more complex)? I
reverted these internal "tracking" tricks for this version to minimize
impact on uapi/ethtool.h. The main resulting visible artifact of those
tricks is the new ethtool_link_mode_mask_t typedef (aka. u64). I kept
this trivial typedef here to help future refactoring, but I'd be happy
to rename it as plain "u64" if you prefer.

I can send updates to other drivers, even though it's rather pointless
to update 1G drivers at this point for example. Please let me know,
but I'd prefer to do this in follow-up patches outside this first
patch series.

############################################
# Patch Set Summary:

David Decotigny (8):
  net: ethtool: internal compatibility flags to reject non-zero reserved
    fields
  net: ethtool: extend link mode support to 48 bits
  net: phy: extend link mode support to 48 bits
  net: mii: extend link mode support to 48 bits
  net: mdio: extend link mode support to 48 bits
  net: veth: extend link mode support to 48 bits
  net: tun: extend link mode support to 48 bits
  net: mlx4_en: extend link mode support to 48 bits

 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c |  73 ++++++++-----
 drivers/net/mdio.c                              |  59 +++++-----
 drivers/net/mii.c                               |  52 +++++----
 drivers/net/phy/phy.c                           |  30 +++---
 drivers/net/phy/phy_device.c                    |   4 +-
 drivers/net/tun.c                               |   4 +-
 drivers/net/veth.c                              |   4 +-
 include/linux/ethtool.h                         |  25 ++++-
 include/linux/mdio.h                            |  15 +--
 include/linux/mii.h                             |  31 +++---
 include/linux/phy.h                             |  10 +-
 include/uapi/linux/ethtool.h                    | 137 ++++++++++++++++++++----
 net/core/ethtool.c                              |  44 ++++++++
 13 files changed, 343 insertions(+), 145 deletions(-)

-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply

* [PATCH net-next v2 1/8] net: ethtool: internal compatibility flags to reject non-zero reserved fields
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

From: David Decotigny <decot-Ypc/8FJVVoBWk0Htik3J/w@public.gmane.org>

The public ethtool API progressively shrinks "reserved" fields to
expand some other fields (eg. link mode masks). This patch allows
drivers to declare that they fully support expanded fields. When they
don't do so, the generic ethtool layer may reject (-EINVAL) userspace
requests that assign values utilizing the newly allocated areas in
these expanded fields (ie. when some reserved field receives != 0).

Signed-off-by: David Decotigny <decot-Ypc/8FJVVoBWk0Htik3J/w@public.gmane.org>
---
 include/linux/ethtool.h      | 19 +++++++++++++++++--
 include/uapi/linux/ethtool.h |  6 +++++-
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 653dc9c..dcb08c1 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -44,6 +44,17 @@ extern int __ethtool_get_settings(struct net_device *dev,
 				  struct ethtool_cmd *cmd);
 
 /**
+ * enum ethtool_compat_flags - bit indices used for %get_compat_flags() bitmaps
+ * @__ETHTOOL_COMPAT_PLACEHOLDER_BIT: to avoid a compiler error,
+ *                                    superseded by next patches
+ */
+enum ethtool_compat_flags {
+	__ETHTOOL_COMPAT_PLACEHOLDER_BIT,
+};
+
+#define __ETH_COMPAT_MASK(name)	(1UL << (ETHTOOL_COMPAT_ ## name ## _BIT))
+
+/**
  * enum ethtool_phys_id_state - indicator state for physical identification
  * @ETHTOOL_ID_INACTIVE: Physical ID indicator should be deactivated
  * @ETHTOOL_ID_ACTIVE: Physical ID indicator should be activated
@@ -99,6 +110,11 @@ static inline u32 ethtool_rxfh_indir_default(u32 index, u32 n_rx_rings)
 
 /**
  * struct ethtool_ops - optional netdev operations
+ * @get_compat_flags: Internal function. Returns the internal ethtool
+ *      compatibility flags: in the absence of this method, or of
+ *      specific compatilibilty flags, the generic layer enforces
+ *      additional constraints on the user space values before
+ *      calling the callbacks below.
  * @get_settings: Get various device settings including Ethernet link
  *	settings. The @cmd parameter is expected to have been cleared
  *	before get_settings is called. Returns a negative error code or
@@ -279,7 +295,6 @@ struct ethtool_ops {
 			       const struct ethtool_tunable *, void *);
 	int	(*set_tunable)(struct net_device *,
 			       const struct ethtool_tunable *, const void *);
-
-
+	u32	(*get_compat_flags)(struct net_device *);
 };
 #endif /* _LINUX_ETHTOOL_H */
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 5f66d9c..d063368 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -88,7 +88,8 @@
  * other than @cmd that are not described as read-only or deprecated,
  * and must ignore all fields described as read-only.
  *
- * Deprecated fields should be ignored by both users and drivers.
+ * Deprecated and reserved fields should be ignored by both users and
+ * drivers. If reserved fields must be set, store the value 0 in them.
  */
 struct ethtool_cmd {
 	__u32	cmd;
@@ -300,6 +301,9 @@ struct ethtool_eeprom {
  * @tx_lpi_timer: Time in microseconds the interface delays prior to asserting
  *	its tx lpi (after reaching 'idle' state). Effective only when eee
  *	was negotiated and tx_lpi_enabled was set.
+ *
+ * Deprecated and reserved fields should be ignored by both users and
+ * drivers. If reserved fields must be set, store the value 0 in them.
  */
 struct ethtool_eee {
 	__u32	cmd;
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* [PATCH net-next v2 2/8] net: ethtool: extend link mode support to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev, linux-kernel, linux-api
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig@gmail.com>

From: David Decotigny <decot@googlers.com>

This patch also ensures that requests from userspace asking to
advertise >= 32b link mode masks will be rejected, unless the driver
explicitly supports this.

Signed-off-by: David Decotigny <decot@googlers.com>
---
 include/linux/ethtool.h      |  12 +++-
 include/uapi/linux/ethtool.h | 131 ++++++++++++++++++++++++++++++++++++-------
 net/core/ethtool.c           |  44 +++++++++++++++
 3 files changed, 164 insertions(+), 23 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index dcb08c1..9baa80f 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -45,15 +45,21 @@ extern int __ethtool_get_settings(struct net_device *dev,
 
 /**
  * enum ethtool_compat_flags - bit indices used for %get_compat_flags() bitmaps
- * @__ETHTOOL_COMPAT_PLACEHOLDER_BIT: to avoid a compiler error,
- *                                    superseded by next patches
+ * @ETHTOOL_COMPAT_SUPPORT_LINK_MODE_48b_BIT: when set, the driver
+ *      handles 48-bit link mode requests from userspace. In its
+ *      absence, the generic %ethtool_set_settings/%ethtool_set_eee
+ *      ioctl handlers will reject the request if user passed an
+ *      advertising link mode with any of the bits 32..47 set.
  */
 enum ethtool_compat_flags {
-	__ETHTOOL_COMPAT_PLACEHOLDER_BIT,
+	ETHTOOL_COMPAT_SUPPORT_LINK_MODE_48b_BIT,
 };
 
 #define __ETH_COMPAT_MASK(name)	(1UL << (ETHTOOL_COMPAT_ ## name ## _BIT))
 
+#define ETH_COMPAT_SUPPORT_LINK_MODE_48b	\
+	__ETH_COMPAT_MASK(SUPPORT_LINK_MODE_48b)
+
 /**
  * enum ethtool_phys_id_state - indicator state for physical identification
  * @ETHTOOL_ID_INACTIVE: Physical ID indicator should be deactivated
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index d063368..ab7c11d 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -23,14 +23,16 @@
 /**
  * struct ethtool_cmd - link control and status
  * @cmd: Command number = %ETHTOOL_GSET or %ETHTOOL_SSET
- * @supported: Bitmask of %SUPPORTED_* flags for the link modes,
- *	physical connectors and other link features for which the
- *	interface supports autonegotiation or auto-detection.
- *	Read-only.
- * @advertising: Bitmask of %ADVERTISED_* flags for the link modes,
- *	physical connectors and other link features that are
- *	advertised through autonegotiation or enabled for
- *	auto-detection.
+ * @supported: Low bits of bitmask of %SUPPORTED_* flags for the link
+ *	modes, physical connectors and other link features for which
+ *	the interface supports autonegotiation or auto-detection.
+ *	Read-only. Please do not access this field directly, use the
+ *	%ethtool_cmd_supported_* family of functions instead.
+ * @advertising: Low bits of bitmask of %ADVERTISED_* flags for the
+ *	link modes, physical connectors and other link features that
+ *	are advertised through autonegotiation or enabled for
+ *	auto-detection. Please do not access this field directly, use
+ *	the %ethtool_cmd_advertising_* family of functions instead.
  * @speed: Low bits of the speed
  * @duplex: Duplex mode; one of %DUPLEX_*
  * @port: Physical connector type; one of %PORT_*
@@ -56,10 +58,22 @@
  *	yield %ETH_TP_MDI_INVALID and writes may be ignored or rejected.
  *	When written successfully, the link should be renegotiated if
  *	necessary.
- * @lp_advertising: Bitmask of %ADVERTISED_* flags for the link modes
- *	and other link features that the link partner advertised
- *	through autonegotiation; 0 if unknown or not applicable.
- *	Read-only.
+ * @lp_advertising: Low bits of bitmask of %ADVERTISED_* flags for the
+ *      link modes and other link features that the link partner
+ *      advertised through autonegotiation; 0 if unknown or not
+ *      applicable.  Read-only. Please do not access this field
+ *      directly, use the %ethtool_cmd_lp_advertising_* family of
+ *      functions instead.
+ * @supported_hi: High bits of bitmask of %SUPPORTED_* flags. See
+ *      %supported. Read-only. Please do not access this field directly,
+ *      use the %ethtool_cmd_supported_* family of functions instead.
+ * @advertising_hi: High bits of bitmask of %ADVERTISING_* flags. See
+ *      %advertising. Please do not access this field directly, use the
+ *      %ethtool_cmd_advertising_* family of functions instead.
+ * @lp_advertising_hi: High bits of bitmask of %ADVERTISING_* flags.
+ *      See %lp_advertising. Read-only. Please do not access this
+ *      field directly, use the %ethtool_cmd_lp_advertising_* family
+ *      of functions instead.
  *
  * The link speed in Mbps is split between @speed and @speed_hi.  Use
  * the ethtool_cmd_speed() and ethtool_cmd_speed_set() functions to
@@ -108,7 +122,10 @@ struct ethtool_cmd {
 	__u8	eth_tp_mdix;
 	__u8	eth_tp_mdix_ctrl;
 	__u32	lp_advertising;
-	__u32	reserved[2];
+	__u16	supported_hi;
+	__u16	advertising_hi;
+	__u16	lp_advertising_hi;
+	__u16	reserved;
 };
 
 static inline void ethtool_cmd_speed_set(struct ethtool_cmd *ep,
@@ -124,6 +141,45 @@ static inline __u32 ethtool_cmd_speed(const struct ethtool_cmd *ep)
 	return (ep->speed_hi << 16) | ep->speed;
 }
 
+/**
+ * ETHTOOL_MAKE_LINK_MODE_ACCESSORS - create the link_mode accessors
+ * Macro to generate the %ethtool_cmd_supported_*,
+ * %ethtool_cmd_advertising_*, %ethtool_cmd_lp_advertising_*,
+ * %ethtool_eee_supported_*, %ethtool_eee_advertised_*,
+ * %ethtool_eee_lp_advertised_* families of functions.
+ *
+ *  @struct_name: either %ethtool_cmd or %ethtool_eee
+ *  @field_name: name of the fields in %struct_name to
+ *      access. supported/advertising/lp_advertising for ethtool_cmd,
+ *      supported/advertised/lp_advertised for ethtool_eee
+ *
+ * Generates the following static functions:
+ *  ethtool_cmd_supported(const struct ethtool_cmd*): returns
+ *      the 64b value of %supported fields (the upper bits 63..48 are 0)
+ *  ethtool_cmd_supported_set(struct ethtool_cmd*,
+ *      ethtool_link_mode_mask_t value): set the %supported fields to
+ *      given %value (only the lower 48 bits used, upper bits 63..48
+ *      ignored)
+ *
+ * Same doc for all the other functions.
+ */
+#define ETHTOOL_MAKE_LINK_MODE_ACCESSORS(struct_name, field_name)	\
+	static inline ethtool_link_mode_mask_t				\
+	struct_name ## _ ## field_name(const struct struct_name *cmd)	\
+	{ ethtool_link_mode_mask_t r = cmd->field_name;			\
+	  r |= ((__u64)cmd->field_name ## _hi) << 32; return r; }	\
+	static inline void						\
+	struct_name ## _ ## field_name ## _set(struct struct_name *cmd,	\
+					       ethtool_link_mode_mask_t mask) \
+	{ cmd->field_name = mask & 0xffffffff;				\
+	  cmd->field_name ## _hi = (mask >> 32) & 0xffff; }
+
+typedef __u64 ethtool_link_mode_mask_t;
+
+ETHTOOL_MAKE_LINK_MODE_ACCESSORS(ethtool_cmd, supported);
+ETHTOOL_MAKE_LINK_MODE_ACCESSORS(ethtool_cmd, advertising);
+ETHTOOL_MAKE_LINK_MODE_ACCESSORS(ethtool_cmd, lp_advertising);
+
 /* Device supports clause 22 register access to PHY or peripherals
  * using the interface defined in <linux/mii.h>.  This should not be
  * set if there are known to be no such peripherals present or if
@@ -288,12 +344,18 @@ struct ethtool_eeprom {
 /**
  * struct ethtool_eee - Energy Efficient Ethernet information
  * @cmd: ETHTOOL_{G,S}EEE
- * @supported: Mask of %SUPPORTED_* flags for the speed/duplex combinations
- *	for which there is EEE support.
- * @advertised: Mask of %ADVERTISED_* flags for the speed/duplex combinations
- *	advertised as eee capable.
- * @lp_advertised: Mask of %ADVERTISED_* flags for the speed/duplex
- *	combinations advertised by the link partner as eee capable.
+ * @supported: Low bits of mask of %SUPPORTED_* flags for the
+ *	speed/duplex combinations for which there is EEE
+ *	support. Please do not access this field directly, use the
+ *	%ethtool_eee_supported_* family of functions instead.
+ * @advertised: Low bits of mask of %ADVERTISED_* flags for the
+ *	speed/duplex combinations advertised as eee capable. Please do
+ *	not access this field directly, use the
+ *	%ethtool_eee_advertised_* family of functions instead.
+ * @lp_advertised: Low bits of mask of %ADVERTISED_* flags for the
+ *	speed/duplex combinations advertised by the link partner as
+ *	eee capable.  Please do not access this field directly, use
+ *	the %ethtool_eee_lp_advertised_* family of functions instead.
  * @eee_active: Result of the eee auto negotiation.
  * @eee_enabled: EEE configured mode (enabled/disabled).
  * @tx_lpi_enabled: Whether the interface should assert its tx lpi, given
@@ -301,6 +363,16 @@ struct ethtool_eeprom {
  * @tx_lpi_timer: Time in microseconds the interface delays prior to asserting
  *	its tx lpi (after reaching 'idle' state). Effective only when eee
  *	was negotiated and tx_lpi_enabled was set.
+ * @supported_hi: High bits of mask of %SUPPORTED_* flags. See
+ *      %supported. Please do not access this field directly, use the
+ *      %ethtool_eee_supported_* family of functions instead.
+ * @advertised_hi: High bits of mask of %ADVERTISING_* flags. See
+ *      %advertising. Please do not access this field directly, use
+ *      the %ethtool_eee_advertising_* family of functions instead.
+ * @lp_advertised_hi: High bits of mask of %ADVERTISING_* flags.
+ *      See %lp_advertising. Please do not access this field directly,
+ *      use the %ethtool_eee_lp_advertising_* family of functions
+ *      instead.
  *
  * Deprecated and reserved fields should be ignored by both users and
  * drivers. If reserved fields must be set, store the value 0 in them.
@@ -314,9 +386,17 @@ struct ethtool_eee {
 	__u32	eee_enabled;
 	__u32	tx_lpi_enabled;
 	__u32	tx_lpi_timer;
-	__u32	reserved[2];
+	__u16	supported_hi;
+	__u16	advertised_hi;
+	__u16	lp_advertised_hi;
+	__u16	reserved;
 };
 
+ETHTOOL_MAKE_LINK_MODE_ACCESSORS(ethtool_eee, supported);
+ETHTOOL_MAKE_LINK_MODE_ACCESSORS(ethtool_eee, advertised);
+ETHTOOL_MAKE_LINK_MODE_ACCESSORS(ethtool_eee, lp_advertised);
+
+
 /**
  * struct ethtool_modinfo - plugin module eeprom information
  * @cmd: %ETHTOOL_GMODULEINFO
@@ -1196,6 +1276,11 @@ enum ethtool_sfeatures_retval_bits {
 #define SPARC_ETH_GSET		ETHTOOL_GSET
 #define SPARC_ETH_SSET		ETHTOOL_SSET
 
+/*
+ * Do not use the following macros directly to update
+ * ethtool_cmd::supported, ethtool_eee::supported. Please use
+ * ethtool_(cmd|eee)_supported(|_set) instead.
+ */
 #define SUPPORTED_10baseT_Half		(1 << 0)
 #define SUPPORTED_10baseT_Full		(1 << 1)
 #define SUPPORTED_100baseT_Half		(1 << 2)
@@ -1228,6 +1313,12 @@ enum ethtool_sfeatures_retval_bits {
 #define SUPPORTED_56000baseSR4_Full	(1 << 29)
 #define SUPPORTED_56000baseLR4_Full	(1 << 30)
 
+/*
+ * Do not use the following macros directly to update
+ * ethtool_cmd::advertising, ethtool_cmd::lp_advertising,
+ * ethtool_eee::advertised, ethtool_eee::lp_advertised. Please use
+ * ethtool_(cmd|eee)_*(|_set).
+ */
 #define ADVERTISED_10baseT_Half		(1 << 0)
 #define ADVERTISED_10baseT_Full		(1 << 1)
 #define ADVERTISED_100baseT_Half	(1 << 2)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 550892c..462a8f4 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -362,6 +362,19 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
 	if (err < 0)
 		return err;
 
+	/* Log a warning if driver reports high link mode bits when
+	 * it's not officially supporting them
+	 */
+	if (!dev->ethtool_ops->get_compat_flags ||
+	    !(dev->ethtool_ops->get_compat_flags(dev)
+	      & ETH_COMPAT_SUPPORT_LINK_MODE_48b)) {
+		WARN_ONCE((0 != cmd.supported_hi) ||
+			  (0 != cmd.advertising_hi) ||
+			  (0 != cmd.lp_advertising_hi),
+			  "%s: using ethtool link mode high bits",
+			  netdev_name(dev));
+	}
+
 	if (copy_to_user(useraddr, &cmd, sizeof(cmd)))
 		return -EFAULT;
 	return 0;
@@ -377,6 +390,15 @@ static int ethtool_set_settings(struct net_device *dev, void __user *useraddr)
 	if (copy_from_user(&cmd, useraddr, sizeof(cmd)))
 		return -EFAULT;
 
+	/* Reject the high advertising bits if driver doesn't support
+	 * them
+	 */
+	if (cmd.advertising_hi &&
+	    (!dev->ethtool_ops->get_compat_flags ||
+	     !(dev->ethtool_ops->get_compat_flags(dev)
+	       & ETH_COMPAT_SUPPORT_LINK_MODE_48b)))
+		return -EINVAL;
+
 	return dev->ethtool_ops->set_settings(dev, &cmd);
 }
 
@@ -955,6 +977,19 @@ static int ethtool_get_eee(struct net_device *dev, char __user *useraddr)
 	if (rc)
 		return rc;
 
+	/* Log a warning if driver reports high link mode bits when
+	 * it's not officially supporting them
+	 */
+	if (!dev->ethtool_ops->get_compat_flags ||
+	    !(dev->ethtool_ops->get_compat_flags(dev)
+	      & ETH_COMPAT_SUPPORT_LINK_MODE_48b)) {
+		WARN_ONCE((0 != edata.supported_hi) ||
+			  (0 != edata.advertised_hi) ||
+			  (0 != edata.lp_advertised_hi),
+			  "%s: using ethtool EEE link mode high bits",
+			  netdev_name(dev));
+	}
+
 	if (copy_to_user(useraddr, &edata, sizeof(edata)))
 		return -EFAULT;
 
@@ -971,6 +1006,15 @@ static int ethtool_set_eee(struct net_device *dev, char __user *useraddr)
 	if (copy_from_user(&edata, useraddr, sizeof(edata)))
 		return -EFAULT;
 
+	/* Reject the high advertising bits if driver doesn't support
+	 * them
+	 */
+	if (edata.advertised_hi &&
+	    (!dev->ethtool_ops->get_compat_flags ||
+	     !(dev->ethtool_ops->get_compat_flags(dev)
+		     & ETH_COMPAT_SUPPORT_LINK_MODE_48b)))
+		return -EINVAL;
+
 	return dev->ethtool_ops->set_eee(dev, &edata);
 }
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* [PATCH net-next v2 3/8] net: phy: extend link mode support to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

From: David Decotigny <decot-Ypc/8FJVVoBWk0Htik3J/w@public.gmane.org>

Signed-off-by: David Decotigny <decot-Ypc/8FJVVoBWk0Htik3J/w@public.gmane.org>
---
 drivers/net/phy/phy.c        | 30 ++++++++++++++++--------------
 drivers/net/phy/phy_device.c |  4 ++--
 include/linux/phy.h          | 10 +++++-----
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 767cd11..77d4eb9 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -132,7 +132,7 @@ static inline int phy_aneg_done(struct phy_device *phydev)
 struct phy_setting {
 	int speed;
 	int duplex;
-	u32 setting;
+	ethtool_link_mode_mask_t setting;
 };
 
 /* A mapping of all SUPPORTED settings to speed/duplex */
@@ -227,7 +227,8 @@ static inline unsigned int phy_find_setting(int speed, int duplex)
  *   the mask in features.  Returns the index of the last setting
  *   if nothing else matches.
  */
-static inline unsigned int phy_find_valid(unsigned int idx, u32 features)
+static inline unsigned int phy_find_valid(unsigned int idx,
+					  ethtool_link_mode_mask_t features)
 {
 	while (idx < MAX_NUM_SETTINGS && !(settings[idx].setting & features))
 		idx++;
@@ -245,7 +246,7 @@ static inline unsigned int phy_find_valid(unsigned int idx, u32 features)
  */
 static void phy_sanitize_settings(struct phy_device *phydev)
 {
-	u32 features = phydev->supported;
+	ethtool_link_mode_mask_t features = phydev->supported;
 	unsigned int idx;
 
 	/* Sanitize settings based on PHY capabilities */
@@ -274,18 +275,19 @@ static void phy_sanitize_settings(struct phy_device *phydev)
 int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd)
 {
 	u32 speed = ethtool_cmd_speed(cmd);
+	ethtool_link_mode_mask_t eth_adv = ethtool_cmd_advertising(cmd);
 
 	if (cmd->phy_address != phydev->addr)
 		return -EINVAL;
 
 	/* We make sure that we don't pass unsupported values in to the PHY */
-	cmd->advertising &= phydev->supported;
+	eth_adv &= phydev->supported;
 
 	/* Verify the settings we care about. */
 	if (cmd->autoneg != AUTONEG_ENABLE && cmd->autoneg != AUTONEG_DISABLE)
 		return -EINVAL;
 
-	if (cmd->autoneg == AUTONEG_ENABLE && cmd->advertising == 0)
+	if (cmd->autoneg == AUTONEG_ENABLE && eth_adv == 0)
 		return -EINVAL;
 
 	if (cmd->autoneg == AUTONEG_DISABLE &&
@@ -300,7 +302,7 @@ int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd)
 
 	phydev->speed = speed;
 
-	phydev->advertising = cmd->advertising;
+	phydev->advertising = eth_adv;
 
 	if (AUTONEG_ENABLE == cmd->autoneg)
 		phydev->advertising |= ADVERTISED_Autoneg;
@@ -318,10 +320,10 @@ EXPORT_SYMBOL(phy_ethtool_sset);
 
 int phy_ethtool_gset(struct phy_device *phydev, struct ethtool_cmd *cmd)
 {
-	cmd->supported = phydev->supported;
+	ethtool_cmd_supported_set(cmd, phydev->supported);
 
-	cmd->advertising = phydev->advertising;
-	cmd->lp_advertising = phydev->lp_advertising;
+	ethtool_cmd_advertising_set(cmd, phydev->advertising);
+	ethtool_cmd_lp_advertising_set(cmd, phydev->lp_advertising);
 
 	ethtool_cmd_speed_set(cmd, phydev->speed);
 	cmd->duplex = phydev->duplex;
@@ -1040,7 +1042,7 @@ int phy_init_eee(struct phy_device *phydev, bool clk_stop_enable)
 	    (phydev->interface == PHY_INTERFACE_MODE_RGMII) ||
 	     phy_is_internal(phydev))) {
 		int eee_lp, eee_cap, eee_adv;
-		u32 lp, cap, adv;
+		ethtool_link_mode_mask_t cap, adv, lp;
 		int status;
 		unsigned int idx;
 
@@ -1132,21 +1134,21 @@ int phy_ethtool_get_eee(struct phy_device *phydev, struct ethtool_eee *data)
 				    MDIO_MMD_PCS, phydev->addr);
 	if (val < 0)
 		return val;
-	data->supported = mmd_eee_cap_to_ethtool_sup_t(val);
+	ethtool_eee_supported_set(data, mmd_eee_cap_to_ethtool_sup_t(val));
 
 	/* Get advertisement EEE */
 	val = phy_read_mmd_indirect(phydev, MDIO_AN_EEE_ADV,
 				    MDIO_MMD_AN, phydev->addr);
 	if (val < 0)
 		return val;
-	data->advertised = mmd_eee_adv_to_ethtool_adv_t(val);
+	ethtool_eee_advertised_set(data, mmd_eee_adv_to_ethtool_adv_t(val));
 
 	/* Get LP advertisement EEE */
 	val = phy_read_mmd_indirect(phydev, MDIO_AN_EEE_LPABLE,
 				    MDIO_MMD_AN, phydev->addr);
 	if (val < 0)
 		return val;
-	data->lp_advertised = mmd_eee_adv_to_ethtool_adv_t(val);
+	ethtool_eee_lp_advertised_set(data, mmd_eee_adv_to_ethtool_adv_t(val));
 
 	return 0;
 }
@@ -1161,7 +1163,7 @@ EXPORT_SYMBOL(phy_ethtool_get_eee);
  */
 int phy_ethtool_set_eee(struct phy_device *phydev, struct ethtool_eee *data)
 {
-	int val = ethtool_adv_to_mmd_eee_adv_t(data->advertised);
+	int val = ethtool_adv_to_mmd_eee_adv_t(ethtool_eee_advertised(data));
 
 	phy_write_mmd_indirect(phydev, MDIO_AN_EEE_ADV, MDIO_MMD_AN,
 			       phydev->addr, val);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 3fc91e8..4391dc7 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -734,7 +734,7 @@ EXPORT_SYMBOL(phy_resume);
  */
 static int genphy_config_advert(struct phy_device *phydev)
 {
-	u32 advertise;
+	ethtool_link_mode_mask_t advertise;
 	int oldadv, adv, bmsr;
 	int err, changed = 0;
 
@@ -1086,7 +1086,7 @@ EXPORT_SYMBOL(genphy_soft_reset);
 int genphy_config_init(struct phy_device *phydev)
 {
 	int val;
-	u32 features;
+	ethtool_link_mode_mask_t features;
 
 	features = (SUPPORTED_TP | SUPPORTED_MII
 			| SUPPORTED_AUI | SUPPORTED_FIBRE |
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 22af8f8..aabf808 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -390,10 +390,10 @@ struct phy_device {
 	u32 interrupts;
 
 	/* Union of PHY and Attached devices' supported modes */
-	/* See mii.h for more info */
-	u32 supported;
-	u32 advertising;
-	u32 lp_advertising;
+	/* See ethtool.h for more info */
+	ethtool_link_mode_mask_t supported;
+	ethtool_link_mode_mask_t advertising;
+	ethtool_link_mode_mask_t lp_advertising;
 
 	int autoneg;
 
@@ -447,7 +447,7 @@ struct phy_driver {
 	u32 phy_id;
 	char *name;
 	unsigned int phy_id_mask;
-	u32 features;
+	ethtool_link_mode_mask_t features;
 	u32 flags;
 	const void *driver_data;
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* [PATCH net-next v2 4/8] net: mii: extend link mode support to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev, linux-kernel, linux-api
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig@gmail.com>

From: David Decotigny <decot@googlers.com>

Signed-off-by: David Decotigny <decot@googlers.com>
---
 drivers/net/mii.c   | 52 +++++++++++++++++++++++++++++-----------------------
 include/linux/mii.h | 31 ++++++++++++++++---------------
 2 files changed, 45 insertions(+), 38 deletions(-)

diff --git a/drivers/net/mii.c b/drivers/net/mii.c
index 4a99c39..2be59ba 100644
--- a/drivers/net/mii.c
+++ b/drivers/net/mii.c
@@ -33,7 +33,7 @@
 #include <linux/ethtool.h>
 #include <linux/mii.h>
 
-static u32 mii_get_an(struct mii_if_info *mii, u16 addr)
+static ethtool_link_mode_mask_t mii_get_an(struct mii_if_info *mii, u16 addr)
 {
 	int advert;
 
@@ -56,14 +56,15 @@ int mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 {
 	struct net_device *dev = mii->dev;
 	u16 bmcr, bmsr, ctrl1000 = 0, stat1000 = 0;
-	u32 nego;
+	ethtool_link_mode_mask_t supported_link_modes, advertising_link_modes;
+	ethtool_link_mode_mask_t nego;
 
-	ecmd->supported =
+	supported_link_modes =
 	    (SUPPORTED_10baseT_Half | SUPPORTED_10baseT_Full |
 	     SUPPORTED_100baseT_Half | SUPPORTED_100baseT_Full |
 	     SUPPORTED_Autoneg | SUPPORTED_TP | SUPPORTED_MII);
 	if (mii->supports_gmii)
-		ecmd->supported |= SUPPORTED_1000baseT_Half |
+		supported_link_modes |= SUPPORTED_1000baseT_Half |
 			SUPPORTED_1000baseT_Full;
 
 	/* only supports twisted-pair */
@@ -76,7 +77,7 @@ int mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 	ecmd->phy_address = mii->phy_id;
 	ecmd->mdio_support = ETH_MDIO_SUPPORTS_C22;
 
-	ecmd->advertising = ADVERTISED_TP | ADVERTISED_MII;
+	advertising_link_modes = ADVERTISED_TP | ADVERTISED_MII;
 
 	bmcr = mii->mdio_read(dev, mii->phy_id, MII_BMCR);
 	bmsr = mii->mdio_read(dev, mii->phy_id, MII_BMSR);
@@ -85,23 +86,25 @@ int mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 		stat1000 = mii->mdio_read(dev, mii->phy_id, MII_STAT1000);
 	}
 	if (bmcr & BMCR_ANENABLE) {
-		ecmd->advertising |= ADVERTISED_Autoneg;
+		ethtool_link_mode_mask_t lp_adv;
+
+		advertising_link_modes |= ADVERTISED_Autoneg;
 		ecmd->autoneg = AUTONEG_ENABLE;
 
-		ecmd->advertising |= mii_get_an(mii, MII_ADVERTISE);
+		advertising_link_modes |= mii_get_an(mii, MII_ADVERTISE);
 		if (mii->supports_gmii)
-			ecmd->advertising |=
+			advertising_link_modes |=
 					mii_ctrl1000_to_ethtool_adv_t(ctrl1000);
 
 		if (bmsr & BMSR_ANEGCOMPLETE) {
-			ecmd->lp_advertising = mii_get_an(mii, MII_LPA);
-			ecmd->lp_advertising |=
-					mii_stat1000_to_ethtool_lpa_t(stat1000);
+			lp_adv = mii_get_an(mii, MII_LPA);
+			lp_adv |= mii_stat1000_to_ethtool_lpa_t(stat1000);
 		} else {
-			ecmd->lp_advertising = 0;
+			lp_adv = 0;
 		}
 
-		nego = ecmd->advertising & ecmd->lp_advertising;
+		ethtool_cmd_lp_advertising_set(ecmd, lp_adv);
+		nego = advertising_link_modes & lp_adv;
 
 		if (nego & (ADVERTISED_1000baseT_Full |
 			    ADVERTISED_1000baseT_Half)) {
@@ -128,6 +131,8 @@ int mii_ethtool_gset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 	}
 
 	mii->full_duplex = ecmd->duplex;
+	ethtool_cmd_supported_set(ecmd, supported_link_modes);
+	ethtool_cmd_advertising_set(ecmd, advertising_link_modes);
 
 	/* ignore maxtxpkt, maxrxpkt for now */
 
@@ -168,13 +173,15 @@ int mii_ethtool_sset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 	if (ecmd->autoneg == AUTONEG_ENABLE) {
 		u32 bmcr, advert, tmp;
 		u32 advert2 = 0, tmp2 = 0;
-
-		if ((ecmd->advertising & (ADVERTISED_10baseT_Half |
-					  ADVERTISED_10baseT_Full |
-					  ADVERTISED_100baseT_Half |
-					  ADVERTISED_100baseT_Full |
-					  ADVERTISED_1000baseT_Half |
-					  ADVERTISED_1000baseT_Full)) == 0)
+		ethtool_link_mode_mask_t ethtool_adv;
+
+		ethtool_adv = ethtool_cmd_advertising(ecmd);
+		if ((ethtool_adv & (ADVERTISED_10baseT_Half |
+				    ADVERTISED_10baseT_Full |
+				    ADVERTISED_100baseT_Half |
+				    ADVERTISED_100baseT_Full |
+				    ADVERTISED_1000baseT_Half |
+				    ADVERTISED_1000baseT_Full)) == 0)
 			return -EINVAL;
 
 		/* advertise only what has been requested */
@@ -184,11 +191,10 @@ int mii_ethtool_sset(struct mii_if_info *mii, struct ethtool_cmd *ecmd)
 			advert2 = mii->mdio_read(dev, mii->phy_id, MII_CTRL1000);
 			tmp2 = advert2 & ~(ADVERTISE_1000HALF | ADVERTISE_1000FULL);
 		}
-		tmp |= ethtool_adv_to_mii_adv_t(ecmd->advertising);
+		tmp |= ethtool_adv_to_mii_adv_t(ethtool_adv);
 
 		if (mii->supports_gmii)
-			tmp2 |=
-			      ethtool_adv_to_mii_ctrl1000_t(ecmd->advertising);
+			tmp2 |= ethtool_adv_to_mii_ctrl1000_t(ethtool_adv);
 		if (advert != tmp) {
 			mii->mdio_write(dev, mii->phy_id, MII_ADVERTISE, tmp);
 			mii->advertising = tmp;
diff --git a/include/linux/mii.h b/include/linux/mii.h
index 47492c9..6f5336c 100644
--- a/include/linux/mii.h
+++ b/include/linux/mii.h
@@ -106,7 +106,7 @@ static inline unsigned int mii_duplex (unsigned int duplex_lock,
  * settings to phy autonegotiation advertisements for the
  * MII_ADVERTISE register.
  */
-static inline u32 ethtool_adv_to_mii_adv_t(u32 ethadv)
+static inline u32 ethtool_adv_to_mii_adv_t(ethtool_link_mode_mask_t ethadv)
 {
 	u32 result = 0;
 
@@ -133,9 +133,9 @@ static inline u32 ethtool_adv_to_mii_adv_t(u32 ethadv)
  * A small helper function that translates MII_ADVERTISE bits
  * to ethtool advertisement settings.
  */
-static inline u32 mii_adv_to_ethtool_adv_t(u32 adv)
+static inline ethtool_link_mode_mask_t mii_adv_to_ethtool_adv_t(u32 adv)
 {
-	u32 result = 0;
+	ethtool_link_mode_mask_t result = 0;
 
 	if (adv & ADVERTISE_10HALF)
 		result |= ADVERTISED_10baseT_Half;
@@ -161,7 +161,8 @@ static inline u32 mii_adv_to_ethtool_adv_t(u32 adv)
  * settings to phy autonegotiation advertisements for the
  * MII_CTRL1000 register when in 1000T mode.
  */
-static inline u32 ethtool_adv_to_mii_ctrl1000_t(u32 ethadv)
+static inline u32
+ethtool_adv_to_mii_ctrl1000_t(ethtool_link_mode_mask_t ethadv)
 {
 	u32 result = 0;
 
@@ -181,9 +182,9 @@ static inline u32 ethtool_adv_to_mii_ctrl1000_t(u32 ethadv)
  * bits, when in 1000Base-T mode, to ethtool
  * advertisement settings.
  */
-static inline u32 mii_ctrl1000_to_ethtool_adv_t(u32 adv)
+static inline ethtool_link_mode_mask_t mii_ctrl1000_to_ethtool_adv_t(u32 adv)
 {
-	u32 result = 0;
+	ethtool_link_mode_mask_t result = 0;
 
 	if (adv & ADVERTISE_1000HALF)
 		result |= ADVERTISED_1000baseT_Half;
@@ -201,9 +202,9 @@ static inline u32 mii_ctrl1000_to_ethtool_adv_t(u32 adv)
  * bits, when in 1000Base-T mode, to ethtool
  * LP advertisement settings.
  */
-static inline u32 mii_lpa_to_ethtool_lpa_t(u32 lpa)
+static inline ethtool_link_mode_mask_t mii_lpa_to_ethtool_lpa_t(u32 lpa)
 {
-	u32 result = 0;
+	ethtool_link_mode_mask_t result = 0;
 
 	if (lpa & LPA_LPACK)
 		result |= ADVERTISED_Autoneg;
@@ -219,9 +220,9 @@ static inline u32 mii_lpa_to_ethtool_lpa_t(u32 lpa)
  * bits, when in 1000Base-T mode, to ethtool
  * advertisement settings.
  */
-static inline u32 mii_stat1000_to_ethtool_lpa_t(u32 lpa)
+static inline ethtool_link_mode_mask_t mii_stat1000_to_ethtool_lpa_t(u32 lpa)
 {
-	u32 result = 0;
+	ethtool_link_mode_mask_t result = 0;
 
 	if (lpa & LPA_1000HALF)
 		result |= ADVERTISED_1000baseT_Half;
@@ -239,7 +240,7 @@ static inline u32 mii_stat1000_to_ethtool_lpa_t(u32 lpa)
  * settings to phy autonegotiation advertisements for the
  * MII_CTRL1000 register when in 1000Base-X mode.
  */
-static inline u32 ethtool_adv_to_mii_adv_x(u32 ethadv)
+static inline u32 ethtool_adv_to_mii_adv_x(ethtool_link_mode_mask_t ethadv)
 {
 	u32 result = 0;
 
@@ -263,9 +264,9 @@ static inline u32 ethtool_adv_to_mii_adv_x(u32 ethadv)
  * bits, when in 1000Base-X mode, to ethtool
  * advertisement settings.
  */
-static inline u32 mii_adv_to_ethtool_adv_x(u32 adv)
+static inline ethtool_link_mode_mask_t mii_adv_to_ethtool_adv_x(u32 adv)
 {
-	u32 result = 0;
+	ethtool_link_mode_mask_t result = 0;
 
 	if (adv & ADVERTISE_1000XHALF)
 		result |= ADVERTISED_1000baseT_Half;
@@ -287,9 +288,9 @@ static inline u32 mii_adv_to_ethtool_adv_x(u32 adv)
  * bits, when in 1000Base-X mode, to ethtool
  * LP advertisement settings.
  */
-static inline u32 mii_lpa_to_ethtool_lpa_x(u32 lpa)
+static inline ethtool_link_mode_mask_t mii_lpa_to_ethtool_lpa_x(u32 lpa)
 {
-	u32 result = 0;
+	ethtool_link_mode_mask_t result = 0;
 
 	if (lpa & LPA_LPACK)
 		result |= ADVERTISED_Autoneg;
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* [PATCH net-next v2 5/8] net: mdio: extend link mode support to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev, linux-kernel, linux-api
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig@gmail.com>

From: David Decotigny <decot@googlers.com>

Signed-off-by: David Decotigny <decot@googlers.com>
---
 drivers/net/mdio.c   | 59 +++++++++++++++++++++++++++++-----------------------
 include/linux/mdio.h | 15 +++++++------
 2 files changed, 42 insertions(+), 32 deletions(-)

diff --git a/drivers/net/mdio.c b/drivers/net/mdio.c
index 3e027ed..5cac2ac 100644
--- a/drivers/net/mdio.c
+++ b/drivers/net/mdio.c
@@ -148,9 +148,10 @@ int mdio45_nway_restart(const struct mdio_if_info *mdio)
 }
 EXPORT_SYMBOL(mdio45_nway_restart);
 
-static u32 mdio45_get_an(const struct mdio_if_info *mdio, u16 addr)
+static ethtool_link_mode_mask_t
+mdio45_get_an(const struct mdio_if_info *mdio, u16 addr)
 {
-	u32 result = 0;
+	ethtool_link_mode_mask_t result = 0;
 	int reg;
 
 	reg = mdio->mdio_read(mdio->dev, mdio->prtad, MDIO_MMD_AN, addr);
@@ -185,9 +186,11 @@ static u32 mdio45_get_an(const struct mdio_if_info *mdio, u16 addr)
  */
 void mdio45_ethtool_gset_npage(const struct mdio_if_info *mdio,
 			       struct ethtool_cmd *ecmd,
-			       u32 npage_adv, u32 npage_lpa)
+			       ethtool_link_mode_mask_t npage_adv,
+			       ethtool_link_mode_mask_t npage_lpa)
 {
 	int reg;
+	ethtool_link_mode_mask_t supported_link_modes, advertising_link_modes;
 	u32 speed;
 
 	BUILD_BUG_ON(MDIO_SUPPORTS_C22 != ETH_MDIO_SUPPORTS_C22);
@@ -206,64 +209,64 @@ void mdio45_ethtool_gset_npage(const struct mdio_if_info *mdio,
 	case MDIO_PMA_CTRL2_100BTX:
 	case MDIO_PMA_CTRL2_10BT:
 		ecmd->port = PORT_TP;
-		ecmd->supported = SUPPORTED_TP;
+		supported_link_modes = SUPPORTED_TP;
 		reg = mdio->mdio_read(mdio->dev, mdio->prtad, MDIO_MMD_PMAPMD,
 				      MDIO_SPEED);
 		if (reg & MDIO_SPEED_10G)
-			ecmd->supported |= SUPPORTED_10000baseT_Full;
+			supported_link_modes |= SUPPORTED_10000baseT_Full;
 		if (reg & MDIO_PMA_SPEED_1000)
-			ecmd->supported |= (SUPPORTED_1000baseT_Full |
+			supported_link_modes |= (SUPPORTED_1000baseT_Full |
 					    SUPPORTED_1000baseT_Half);
 		if (reg & MDIO_PMA_SPEED_100)
-			ecmd->supported |= (SUPPORTED_100baseT_Full |
+			supported_link_modes |= (SUPPORTED_100baseT_Full |
 					    SUPPORTED_100baseT_Half);
 		if (reg & MDIO_PMA_SPEED_10)
-			ecmd->supported |= (SUPPORTED_10baseT_Full |
+			supported_link_modes |= (SUPPORTED_10baseT_Full |
 					    SUPPORTED_10baseT_Half);
-		ecmd->advertising = ADVERTISED_TP;
+		advertising_link_modes = ADVERTISED_TP;
 		break;
 
 	case MDIO_PMA_CTRL2_10GBCX4:
 		ecmd->port = PORT_OTHER;
-		ecmd->supported = 0;
-		ecmd->advertising = 0;
+		supported_link_modes = 0;
+		advertising_link_modes = 0;
 		break;
 
 	case MDIO_PMA_CTRL2_10GBKX4:
 	case MDIO_PMA_CTRL2_10GBKR:
 	case MDIO_PMA_CTRL2_1000BKX:
 		ecmd->port = PORT_OTHER;
-		ecmd->supported = SUPPORTED_Backplane;
+		supported_link_modes = SUPPORTED_Backplane;
 		reg = mdio->mdio_read(mdio->dev, mdio->prtad, MDIO_MMD_PMAPMD,
 				      MDIO_PMA_EXTABLE);
 		if (reg & MDIO_PMA_EXTABLE_10GBKX4)
-			ecmd->supported |= SUPPORTED_10000baseKX4_Full;
+			supported_link_modes |= SUPPORTED_10000baseKX4_Full;
 		if (reg & MDIO_PMA_EXTABLE_10GBKR)
-			ecmd->supported |= SUPPORTED_10000baseKR_Full;
+			supported_link_modes |= SUPPORTED_10000baseKR_Full;
 		if (reg & MDIO_PMA_EXTABLE_1000BKX)
-			ecmd->supported |= SUPPORTED_1000baseKX_Full;
+			supported_link_modes |= SUPPORTED_1000baseKX_Full;
 		reg = mdio->mdio_read(mdio->dev, mdio->prtad, MDIO_MMD_PMAPMD,
 				      MDIO_PMA_10GBR_FECABLE);
 		if (reg & MDIO_PMA_10GBR_FECABLE_ABLE)
-			ecmd->supported |= SUPPORTED_10000baseR_FEC;
-		ecmd->advertising = ADVERTISED_Backplane;
+			supported_link_modes |= SUPPORTED_10000baseR_FEC;
+		advertising_link_modes = ADVERTISED_Backplane;
 		break;
 
 	/* All the other defined modes are flavours of optical */
 	default:
 		ecmd->port = PORT_FIBRE;
-		ecmd->supported = SUPPORTED_FIBRE;
-		ecmd->advertising = ADVERTISED_FIBRE;
+		supported_link_modes = SUPPORTED_FIBRE;
+		advertising_link_modes = ADVERTISED_FIBRE;
 		break;
 	}
 
 	if (mdio->mmds & MDIO_DEVS_AN) {
-		ecmd->supported |= SUPPORTED_Autoneg;
+		supported_link_modes |= SUPPORTED_Autoneg;
 		reg = mdio->mdio_read(mdio->dev, mdio->prtad, MDIO_MMD_AN,
 				      MDIO_CTRL1);
 		if (reg & MDIO_AN_CTRL1_ENABLE) {
 			ecmd->autoneg = AUTONEG_ENABLE;
-			ecmd->advertising |=
+			advertising_link_modes |=
 				ADVERTISED_Autoneg |
 				mdio45_get_an(mdio, MDIO_AN_ADVERTISE) |
 				npage_adv;
@@ -275,21 +278,22 @@ void mdio45_ethtool_gset_npage(const struct mdio_if_info *mdio,
 	}
 
 	if (ecmd->autoneg) {
-		u32 modes = 0;
+		ethtool_link_mode_mask_t modes = 0;
 		int an_stat = mdio->mdio_read(mdio->dev, mdio->prtad,
 					      MDIO_MMD_AN, MDIO_STAT1);
 
 		/* If AN is complete and successful, report best common
 		 * mode, otherwise report best advertised mode. */
 		if (an_stat & MDIO_AN_STAT1_COMPLETE) {
-			ecmd->lp_advertising =
+			ethtool_link_mode_mask_t lp_adv =
 				mdio45_get_an(mdio, MDIO_AN_LPA) | npage_lpa;
 			if (an_stat & MDIO_AN_STAT1_LPABLE)
-				ecmd->lp_advertising |= ADVERTISED_Autoneg;
-			modes = ecmd->advertising & ecmd->lp_advertising;
+				lp_adv |= ADVERTISED_Autoneg;
+			ethtool_cmd_lp_advertising_set(ecmd, lp_adv);
+			modes = advertising_link_modes & lp_adv;
 		}
 		if ((modes & ~ADVERTISED_Autoneg) == 0)
-			modes = ecmd->advertising;
+			modes = advertising_link_modes;
 
 		if (modes & (ADVERTISED_10000baseT_Full |
 			     ADVERTISED_10000baseKX4_Full |
@@ -338,6 +342,9 @@ void mdio45_ethtool_gset_npage(const struct mdio_if_info *mdio,
 			break;
 		}
 	}
+
+	ethtool_cmd_supported_set(ecmd, supported_link_modes);
+	ethtool_cmd_advertising_set(ecmd, advertising_link_modes);
 }
 EXPORT_SYMBOL(mdio45_ethtool_gset_npage);
 
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index b42963b..18c649a 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -69,7 +69,8 @@ extern int mdio45_links_ok(const struct mdio_if_info *mdio, u32 mmds);
 extern int mdio45_nway_restart(const struct mdio_if_info *mdio);
 extern void mdio45_ethtool_gset_npage(const struct mdio_if_info *mdio,
 				      struct ethtool_cmd *ecmd,
-				      u32 npage_adv, u32 npage_lpa);
+				      ethtool_link_mode_mask_t npage_adv,
+				      ethtool_link_mode_mask_t npage_lpa);
 
 /**
  * mdio45_ethtool_gset - get settings for ETHTOOL_GSET
@@ -97,9 +98,10 @@ extern int mdio_mii_ioctl(const struct mdio_if_info *mdio,
  * A small helper function that translates MMD EEE Capability (3.20) bits
  * to ethtool supported settings.
  */
-static inline u32 mmd_eee_cap_to_ethtool_sup_t(u16 eee_cap)
+static inline ethtool_link_mode_mask_t
+mmd_eee_cap_to_ethtool_sup_t(u16 eee_cap)
 {
-	u32 supported = 0;
+	ethtool_link_mode_mask_t supported = 0;
 
 	if (eee_cap & MDIO_EEE_100TX)
 		supported |= SUPPORTED_100baseT_Full;
@@ -125,9 +127,10 @@ static inline u32 mmd_eee_cap_to_ethtool_sup_t(u16 eee_cap)
  * and MMD EEE Link Partner Ability (7.61) bits to ethtool advertisement
  * settings.
  */
-static inline u32 mmd_eee_adv_to_ethtool_adv_t(u16 eee_adv)
+static inline ethtool_link_mode_mask_t
+mmd_eee_adv_to_ethtool_adv_t(u16 eee_adv)
 {
-	u32 adv = 0;
+	ethtool_link_mode_mask_t adv = 0;
 
 	if (eee_adv & MDIO_EEE_100TX)
 		adv |= ADVERTISED_100baseT_Full;
@@ -153,7 +156,7 @@ static inline u32 mmd_eee_adv_to_ethtool_adv_t(u16 eee_adv)
  * to EEE advertisements for the MMD EEE Advertisement (7.60) and
  * MMD EEE Link Partner Ability (7.61) registers.
  */
-static inline u16 ethtool_adv_to_mmd_eee_adv_t(u32 adv)
+static inline u16 ethtool_adv_to_mmd_eee_adv_t(ethtool_link_mode_mask_t adv)
 {
 	u16 reg = 0;
 
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* [PATCH net-next v2 6/8] net: veth: extend link mode support to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev, linux-kernel, linux-api
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig@gmail.com>

From: David Decotigny <decot@googlers.com>

Signed-off-by: David Decotigny <decot@googlers.com>
---
 drivers/net/veth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 8ad5965..902e127 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -49,8 +49,8 @@ static struct {
 
 static int veth_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-	cmd->supported		= 0;
-	cmd->advertising	= 0;
+	ethtool_cmd_supported_set(cmd, 0);
+	ethtool_cmd_advertising_set(cmd, 0);
 	ethtool_cmd_speed_set(cmd, SPEED_10000);
 	cmd->duplex		= DUPLEX_FULL;
 	cmd->port		= PORT_TP;
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* [PATCH net-next v2 8/8] net: mlx4_en: extend link mode support to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev, linux-kernel, linux-api
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig@gmail.com>

From: David Decotigny <decot@googlers.com>

Signed-off-by: David Decotigny <decot@googlers.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_ethtool.c | 73 ++++++++++++++++---------
 1 file changed, 46 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
index 90e0f04..8ccbe3f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_ethtool.c
@@ -46,6 +46,11 @@
 #define EN_ETHTOOL_SHORT_MASK cpu_to_be16(0xffff)
 #define EN_ETHTOOL_WORD_MASK  cpu_to_be32(0xffffffff)
 
+static u32 mlx4_en_get_compat_flags(struct net_device *unused_netdev)
+{
+	return ETH_COMPAT_SUPPORT_LINK_MODE_48b;
+}
+
 static int mlx4_en_moderation_update(struct mlx4_en_priv *priv)
 {
 	int i;
@@ -388,7 +393,8 @@ static u32 mlx4_en_autoneg_get(struct net_device *dev)
 	return autoneg;
 }
 
-static u32 ptys_get_supported_port(struct mlx4_ptys_reg *ptys_reg)
+static ethtool_link_mode_mask_t
+ptys_get_supported_port(struct mlx4_ptys_reg *ptys_reg)
 {
 	u32 eth_proto = be32_to_cpu(ptys_reg->eth_proto_cap);
 
@@ -465,7 +471,7 @@ enum ethtool_report {
 };
 
 /* Translates mlx4 link mode to equivalent ethtool Link modes/speed */
-static u32 ptys2ethtool_map[MLX4_LINK_MODES_SZ][3] = {
+static u64 ptys2ethtool_map[MLX4_LINK_MODES_SZ][3] = {
 	[MLX4_100BASE_TX] = {
 		SUPPORTED_100baseT_Full,
 		ADVERTISED_100baseT_Full,
@@ -558,10 +564,11 @@ static u32 ptys2ethtool_map[MLX4_LINK_MODES_SZ][3] = {
 		},
 };
 
-static u32 ptys2ethtool_link_modes(u32 eth_proto, enum ethtool_report report)
+static ethtool_link_mode_mask_t
+ptys2ethtool_link_modes(u32 eth_proto, enum ethtool_report report)
 {
 	int i;
-	u32 link_modes = 0;
+	ethtool_link_mode_mask_t link_modes = 0;
 
 	for (i = 0; i < MLX4_LINK_MODES_SZ; i++) {
 		if (eth_proto & MLX4_PROT_MASK(i))
@@ -570,7 +577,8 @@ static u32 ptys2ethtool_link_modes(u32 eth_proto, enum ethtool_report report)
 	return link_modes;
 }
 
-static u32 ethtool2ptys_link_modes(u32 link_modes, enum ethtool_report report)
+static u32 ethtool2ptys_link_modes(ethtool_link_mode_mask_t link_modes,
+				   enum ethtool_report report)
 {
 	int i;
 	u32 ptys_modes = 0;
@@ -601,6 +609,9 @@ static int ethtool_get_ptys_settings(struct net_device *dev,
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_ptys_reg ptys_reg;
 	u32 eth_proto;
+	ethtool_link_mode_mask_t eth_supported;
+	ethtool_link_mode_mask_t eth_advertising;
+	ethtool_link_mode_mask_t eth_lp_advertising;
 	int ret;
 
 	memset(&ptys_reg, 0, sizeof(ptys_reg));
@@ -624,41 +635,44 @@ static int ethtool_get_ptys_settings(struct net_device *dev,
 	en_dbg(DRV, priv, "ptys_reg.eth_proto_lp_adv %x\n",
 	       be32_to_cpu(ptys_reg.eth_proto_lp_adv));
 
-	cmd->supported = 0;
-	cmd->advertising = 0;
+	eth_supported = 0;
+	eth_advertising = 0;
 
-	cmd->supported |= ptys_get_supported_port(&ptys_reg);
+	eth_supported |= ptys_get_supported_port(&ptys_reg);
 
 	eth_proto = be32_to_cpu(ptys_reg.eth_proto_cap);
-	cmd->supported |= ptys2ethtool_link_modes(eth_proto, SUPPORTED);
+	eth_supported |= ptys2ethtool_link_modes(eth_proto, SUPPORTED);
 
 	eth_proto = be32_to_cpu(ptys_reg.eth_proto_admin);
-	cmd->advertising |= ptys2ethtool_link_modes(eth_proto, ADVERTISED);
+	eth_advertising |= ptys2ethtool_link_modes(eth_proto, ADVERTISED);
 
-	cmd->supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
-	cmd->advertising |= (priv->prof->tx_pause) ? ADVERTISED_Pause : 0;
+	eth_supported |= SUPPORTED_Pause | SUPPORTED_Asym_Pause;
+	eth_advertising |= (priv->prof->tx_pause) ? ADVERTISED_Pause : 0;
 
-	cmd->advertising |= (priv->prof->tx_pause ^ priv->prof->rx_pause) ?
+	eth_advertising |= (priv->prof->tx_pause ^ priv->prof->rx_pause) ?
 		ADVERTISED_Asym_Pause : 0;
 
 	cmd->port = ptys_get_active_port(&ptys_reg);
-	cmd->transceiver = (SUPPORTED_TP & cmd->supported) ?
+	cmd->transceiver = (SUPPORTED_TP & eth_supported) ?
 		XCVR_EXTERNAL : XCVR_INTERNAL;
 
 	if (mlx4_en_autoneg_get(dev)) {
-		cmd->supported |= SUPPORTED_Autoneg;
-		cmd->advertising |= ADVERTISED_Autoneg;
+		eth_supported |= SUPPORTED_Autoneg;
+		eth_advertising |= ADVERTISED_Autoneg;
 	}
 
 	cmd->autoneg = (priv->port_state.flags & MLX4_EN_PORT_ANC) ?
 		AUTONEG_ENABLE : AUTONEG_DISABLE;
 
 	eth_proto = be32_to_cpu(ptys_reg.eth_proto_lp_adv);
-	cmd->lp_advertising = ptys2ethtool_link_modes(eth_proto, ADVERTISED);
+	eth_lp_advertising = ptys2ethtool_link_modes(eth_proto, ADVERTISED);
 
-	cmd->lp_advertising |= (priv->port_state.flags & MLX4_EN_PORT_ANC) ?
+	eth_lp_advertising |= (priv->port_state.flags & MLX4_EN_PORT_ANC) ?
 			ADVERTISED_Autoneg : 0;
 
+	ethtool_cmd_supported_set(cmd, eth_supported);
+	ethtool_cmd_advertising_set(cmd, eth_advertising);
+	ethtool_cmd_lp_advertising_set(cmd, eth_lp_advertising);
 	cmd->phy_address = 0;
 	cmd->mdio_support = 0;
 	cmd->maxtxpkt = 0;
@@ -673,27 +687,30 @@ static void ethtool_get_default_settings(struct net_device *dev,
 					 struct ethtool_cmd *cmd)
 {
 	struct mlx4_en_priv *priv = netdev_priv(dev);
+	ethtool_link_mode_mask_t eth_supported, eth_advertising;
 	int trans_type;
 
 	cmd->autoneg = AUTONEG_DISABLE;
-	cmd->supported = SUPPORTED_10000baseT_Full;
-	cmd->advertising = ADVERTISED_10000baseT_Full;
+	eth_supported = SUPPORTED_10000baseT_Full;
+	eth_advertising = ADVERTISED_10000baseT_Full;
 	trans_type = priv->port_state.transceiver;
 
 	if (trans_type > 0 && trans_type <= 0xC) {
 		cmd->port = PORT_FIBRE;
 		cmd->transceiver = XCVR_EXTERNAL;
-		cmd->supported |= SUPPORTED_FIBRE;
-		cmd->advertising |= ADVERTISED_FIBRE;
+		eth_supported |= SUPPORTED_FIBRE;
+		eth_advertising |= ADVERTISED_FIBRE;
 	} else if (trans_type == 0x80 || trans_type == 0) {
 		cmd->port = PORT_TP;
 		cmd->transceiver = XCVR_INTERNAL;
-		cmd->supported |= SUPPORTED_TP;
-		cmd->advertising |= ADVERTISED_TP;
+		eth_supported |= SUPPORTED_TP;
+		eth_advertising |= ADVERTISED_TP;
 	} else  {
 		cmd->port = -1;
 		cmd->transceiver = -1;
 	}
+	ethtool_cmd_supported_set(cmd, eth_supported);
+	ethtool_cmd_advertising_set(cmd, eth_advertising);
 }
 
 static int mlx4_en_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
@@ -747,13 +764,14 @@ static int mlx4_en_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 	struct mlx4_en_priv *priv = netdev_priv(dev);
 	struct mlx4_ptys_reg ptys_reg;
 	__be32 proto_admin;
+	const ethtool_link_mode_mask_t eth_adv = ethtool_cmd_advertising(cmd);
 	int ret;
 
-	u32 ptys_adv = ethtool2ptys_link_modes(cmd->advertising, ADVERTISED);
+	u32 ptys_adv = ethtool2ptys_link_modes(eth_adv, ADVERTISED);
 	int speed = ethtool_cmd_speed(cmd);
 
-	en_dbg(DRV, priv, "Set Speed=%d adv=0x%x autoneg=%d duplex=%d\n",
-	       speed, cmd->advertising, cmd->autoneg, cmd->duplex);
+	en_dbg(DRV, priv, "Set Speed=%d adv=0x%llx autoneg=%d duplex=%d\n",
+	       speed, eth_adv, cmd->autoneg, cmd->duplex);
 
 	if (!(priv->mdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_ETH_PROT_CTRL) ||
 	    (cmd->duplex == DUPLEX_HALF))
@@ -1821,6 +1839,7 @@ static int mlx4_en_get_module_eeprom(struct net_device *dev,
 }
 
 const struct ethtool_ops mlx4_en_ethtool_ops = {
+	.get_compat_flags = mlx4_en_get_compat_flags,
 	.get_drvinfo = mlx4_en_get_drvinfo,
 	.get_settings = mlx4_en_get_settings,
 	.set_settings = mlx4_en_set_settings,
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* [PATCH net-next v2 7/8] net: tun: extend link mode support to 48 bits
From: David Decotigny @ 2015-01-06  2:54 UTC (permalink / raw)
  To: Amir Vadai, Florian Fainelli, netdev, linux-kernel, linux-api
  Cc: David Decotigny, David S. Miller, Jason Wang, Michael S. Tsirkin,
	Herbert Xu, Al Viro, Ben Hutchings, Masatake YAMATO, Xi Wang,
	Neil Horman, WANG Cong, Flavio Leitner, Tom Gundersen, Jiri Pirko,
	Vlad Yasevich, Eric W. Biederman, Saeed Mahameed, Venkata Duvvuru,
	Govindarajulu Varadarajan
In-Reply-To: <1420512850-24699-1-git-send-email-ddecotig@gmail.com>

From: David Decotigny <decot@googlers.com>

Signed-off-by: David Decotigny <decot@googlers.com>
---
 drivers/net/tun.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index c0df872..88f9078 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2257,8 +2257,8 @@ static struct miscdevice tun_miscdev = {
 
 static int tun_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 {
-	cmd->supported		= 0;
-	cmd->advertising	= 0;
+	ethtool_cmd_supported_set(cmd, 0);
+	ethtool_cmd_advertising_set(cmd, 0);
 	ethtool_cmd_speed_set(cmd, SPEED_10);
 	cmd->duplex		= DUPLEX_FULL;
 	cmd->port		= PORT_TP;
-- 
2.2.0.rc0.207.ga3a616c

^ permalink raw reply related

* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables
From: Simon Horman @ 2015-01-06  2:54 UTC (permalink / raw)
  To: John Fastabend; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy
In-Reply-To: <20150106020514.GA24057@vergenet.net>

On Tue, Jan 06, 2015 at 11:05:14AM +0900, Simon Horman wrote:
> On Mon, Jan 05, 2015 at 05:19:26PM -0800, John Fastabend wrote:
> > On 01/05/2015 05:09 PM, Simon Horman wrote:
> > >On Mon, Jan 05, 2015 at 04:45:50PM -0800, John Fastabend wrote:
> > >>[...]
> > >>
> > >>>>>+/**
> > >>>>>+ * @struct net_flow_field_ref
> > >>>>>+ * @brief uniquely identify field as header:field tuple
> > >>>>>+ */
> > >>>>>+struct net_flow_field_ref {
> > >>>>>+    int instance;
> > >>>>>+    int header;
> > >>>>>+    int field;
> > >>>>>+    int mask_type;
> > >>>>>+    int type;
> > >>>>>+    union {    /* Are these all the required data types */
> > >>>>>+        __u8 value_u8;
> > >>>>>+        __u16 value_u16;
> > >>>>>+        __u32 value_u32;
> > >>>>>+        __u64 value_u64;
> > >>>>>+    };
> > >>>>>+    union {    /* Are these all the required data types */
> > >>>>>+        __u8 mask_u8;
> > >>>>>+        __u16 mask_u16;
> > >>>>>+        __u32 mask_u32;
> > >>>>>+        __u64 mask_u64;
> > >>>>>+    };
> > >>>>>+};
> > >>>>
> > >>>>Does it make sense to write this as follows?
> > >>>
> > >>>Yes. I'll make this update it helps make it clear value/mask pairs are
> > >>>needed.
> > >>>
> > >>>>
> > >>>>union {
> > >>>>         struct {
> > >>>>                 __u8 value_u8;
> > >>>>                 __u8 mask_u8;
> > >>>>         };
> > >>>>         struct {
> > >>>>                 __u16 value_u16;
> > >>>>                 __u16 mask_u16;
> > >>>>         };
> > >>>>         ...
> > >>>>};
> > >>
> > >>Another thought is to pull this entirely out of the structure and hide
> > >>it from the UAPI so we can add more value/mask types as needed without
> > >>having to spin versions of net_flow_field_ref. On the other hand I've
> > >>been able to fit all my fields in these types so far and I can't think
> > >>of any additions we need at the moment.
> > >
> > >FWIW, I think it would be cleaner to break both field_ref and action_args
> > >out into attributes and not expose the structures to user-space. But
> > >perhaps there is an advantage to dealing with structures directly that
> > >I am missing.
> > >
> > 
> > I  came to the same conclusion just now as well. I'm reworking it now
> > for v2.
> 
> Thanks.
> 
> BTW, I think there are a few problems with net_flow_put_flow_action().
> 
> I am not quite to the bottom of it but it seems that:
> * It loops over a->args[i] and then calls net_flow_put_act_types()
>   which performs a similar loop. This outer-loop appears to be incorrect.
> * It passes a[i].args instead of a->args[i] to net_flow_put_act_types()
> 
> I can post a fix once I've got it working to my satisfaction.
> But if you are reworking that code anyway perhaps it is easier for
> you to handle it then.

FWIW this got the current scheme working for me:

diff --git a/net/core/flow_table.c b/net/core/flow_table.c
index 5dbdc13..598afa2 100644
--- a/net/core/flow_table.c
+++ b/net/core/flow_table.c
@@ -946,7 +946,7 @@ static int net_flow_put_flow_action(struct sk_buff *skb,
 				    struct net_flow_action *a)
 {
 	struct nlattr *action, *sigs;
-	int i, err = 0;
+	int err = 0;
 
 	action = nla_nest_start(skb, NET_FLOW_ACTION);
 	if (!action)
@@ -958,21 +958,19 @@ static int net_flow_put_flow_action(struct sk_buff *skb,
 	if (!a->args)
 		goto done;
 
-	for (i = 0; a->args[i].type; i++) {
-		sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE);
-		if (!sigs) {
-			nla_nest_cancel(skb, action);
-			return -EMSGSIZE;
-		}
+	sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE);
+	if (!sigs) {
+		nla_nest_cancel(skb, action);
+		return -EMSGSIZE;
+	}
 
-		err = net_flow_put_act_types(skb, a[i].args);
-		if (err) {
-			nla_nest_cancel(skb, sigs);
-			nla_nest_cancel(skb, action);
-			return err;
-		}
-		nla_nest_end(skb, sigs);
+	err = net_flow_put_act_types(skb, a->args);
+	if (err) {
+		nla_nest_cancel(skb, sigs);
+		nla_nest_cancel(skb, action);
+		return err;
 	}
+	nla_nest_end(skb, sigs);
 
 done:
 	nla_nest_end(skb, action);
@@ -1103,6 +1101,7 @@ static int net_flow_get_action(struct net_flow_action *a, struct nlattr *attr)
 		}
 
 		a->args[count] = *(struct net_flow_action_arg *)nla_data(args);
+		count++;
 	}
 	return 0;
 }

^ permalink raw reply related

* [Question]when work == weight in net_rx_action
From: Dennis Chen @ 2015-01-06  3:28 UTC (permalink / raw)
  To: netdev

in net_rx_action() function, I found that if the work done returned
from the n->poll() equal the weight, then it will enter the following
path:
if (unlikely(work == weight)) {
            if (unlikely(napi_disable_pending(n))) {
                local_irq_enable();
                napi_complete(n);
                local_irq_disable();
            } else {
                if (n->gro_list) {
                    /* flush too old packets
                     * If HZ < 1000, flush all packets.
                     */
                    local_irq_enable();
                    napi_gro_flush(n, HZ >= 1000);
                    local_irq_disable();
                }
                list_move_tail(&n->poll_list, &sd->poll_list);
            }
        }

suppose the napi instance is not disabled and n->gro_list = NULL, so
this function just moves the napi instance node to the tail of the
sd->poll_list, because the context here is device interrupt disabled,
is there any chance for the net_rx_action will be called again?

-- 
Den

^ permalink raw reply

* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables
From: John Fastabend @ 2015-01-06  3:31 UTC (permalink / raw)
  To: Simon Horman; +Cc: Thomas Graf, sfeldma, jiri, jhs, netdev, davem, andy
In-Reply-To: <20150106025456.GB24057@vergenet.net>


[...]

>>
>> BTW, I think there are a few problems with net_flow_put_flow_action().
>>
>> I am not quite to the bottom of it but it seems that:
>> * It loops over a->args[i] and then calls net_flow_put_act_types()
>>    which performs a similar loop. This outer-loop appears to be incorrect.
>> * It passes a[i].args instead of a->args[i] to net_flow_put_act_types()
>>
>> I can post a fix once I've got it working to my satisfaction.
>> But if you are reworking that code anyway perhaps it is easier for
>> you to handle it then.
>
> FWIW this got the current scheme working for me:
>

Thanks Simon. I'll roll this in as well.

> diff --git a/net/core/flow_table.c b/net/core/flow_table.c
> index 5dbdc13..598afa2 100644
> --- a/net/core/flow_table.c
> +++ b/net/core/flow_table.c
> @@ -946,7 +946,7 @@ static int net_flow_put_flow_action(struct sk_buff *skb,
>   				    struct net_flow_action *a)
>   {
>   	struct nlattr *action, *sigs;
> -	int i, err = 0;
> +	int err = 0;
>
>   	action = nla_nest_start(skb, NET_FLOW_ACTION);
>   	if (!action)
> @@ -958,21 +958,19 @@ static int net_flow_put_flow_action(struct sk_buff *skb,
>   	if (!a->args)
>   		goto done;
>
> -	for (i = 0; a->args[i].type; i++) {
> -		sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE);
> -		if (!sigs) {
> -			nla_nest_cancel(skb, action);
> -			return -EMSGSIZE;
> -		}
> +	sigs = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE);
> +	if (!sigs) {
> +		nla_nest_cancel(skb, action);
> +		return -EMSGSIZE;
> +	}
>
> -		err = net_flow_put_act_types(skb, a[i].args);
> -		if (err) {
> -			nla_nest_cancel(skb, sigs);
> -			nla_nest_cancel(skb, action);
> -			return err;
> -		}
> -		nla_nest_end(skb, sigs);
> +	err = net_flow_put_act_types(skb, a->args);
> +	if (err) {
> +		nla_nest_cancel(skb, sigs);
> +		nla_nest_cancel(skb, action);
> +		return err;
>   	}
> +	nla_nest_end(skb, sigs);
>
>   done:
>   	nla_nest_end(skb, action);
> @@ -1103,6 +1101,7 @@ static int net_flow_get_action(struct net_flow_action *a, struct nlattr *attr)
>   		}
>
>   		a->args[count] = *(struct net_flow_action_arg *)nla_data(args);
> +		count++;
>   	}
>   	return 0;
>   }
>


-- 
John Fastabend         Intel Corporation

^ permalink raw reply

* Re: [PATCH V2] net: eth: xgene: change APM X-Gene SoC platform ethernet to support ACPI
From: David Miller @ 2015-01-06  3:41 UTC (permalink / raw)
  To: fkan; +Cc: patches, netdev, linux-kernel
In-Reply-To: <1420508576-10480-1-git-send-email-fkan@apm.com>

From: Feng Kan <fkan@apm.com>
Date: Mon,  5 Jan 2015 17:42:56 -0800

> This adds support for APM X-Gene ethernet driver to use ACPI table to derive
> ethernet driver parameter.
> 
> Signed-off-by: Feng Kan <fkan@apm.com>

Applied.

^ permalink raw reply

* Re: [PATCH V2] net: eth: xgene: change APM X-Gene SoC platform ethernet to support ACPI
From: David Miller @ 2015-01-06  3:42 UTC (permalink / raw)
  To: fkan; +Cc: patches, netdev, linux-kernel
In-Reply-To: <20150105.224137.894770670718334587.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Mon, 05 Jan 2015 22:41:37 -0500 (EST)

> From: Feng Kan <fkan@apm.com>
> Date: Mon,  5 Jan 2015 17:42:56 -0800
> 
>> This adds support for APM X-Gene ethernet driver to use ACPI table to derive
>> ethernet driver parameter.
>> 
>> Signed-off-by: Feng Kan <fkan@apm.com>
> 
> Applied.

Actually, reverted, this breaks the build:

drivers/net/ethernet/apm/xgene/xgene_enet_main.c:1031:28: warning: ‘xgene_enet_of_match’ defined but not used [-Wunused-variable]
 static struct of_device_id xgene_enet_of_match[] = {
                            ^
In file included from drivers/net/ethernet/apm/xgene/xgene_enet_main.h:32:0,
                 from drivers/net/ethernet/apm/xgene/xgene_enet_main.c:22:
include/linux/module.h:138:40: error: ‘__mod_of__xgene_enet_match_device_table’ aliased to undefined symbol ‘xgene_enet_match’
   extern const struct type##_device_id __mod_##type##__##name##_device_table \
                                        ^
drivers/net/ethernet/apm/xgene/xgene_enet_main.c:1036:1: note: in expansion of macro ‘MODULE_DEVICE_TABLE’
 MODULE_DEVICE_TABLE(of, xgene_enet_match);
 ^

^ permalink raw reply

* Re: [PATCH net-next v3 0/4] ip: Support checksum returned in csmg
From: David Miller @ 2015-01-06  3:47 UTC (permalink / raw)
  To: therbert; +Cc: netdev
In-Reply-To: <1420494977-15026-1-git-send-email-therbert@google.com>

From: Tom Herbert <therbert@google.com>
Date: Mon,  5 Jan 2015 13:56:13 -0800

> This patch set allows the packet checksum for a datagram socket
> to be returned in csum data in recvmsg. This allows userspace
> to implement its own checksum over the data, for instance if an
> IP tunnel was be implemented in user space, the inner checksum
> could be validated.
> 
> Changes in this patch set:
>   - Move checksum conversion to inet_sock from udp_sock. This
>     generalizes checksum conversion for use with other protocols.
>   - Move IP cmsg constants to a header file and make processing
>     of the flags more efficient in ip_cmsg_recv
>   - Return checksum value in cmsg. This is specifically the unfolded
>     32 bit checksum of the full packet starting from the first byte
>     returned in recvmsg
> 
> Tested: Wrote a little server to get checksums in cmsg for UDP and
>         verfied correct checksum is returned.

Series applied, thanks Tom.

^ permalink raw reply

* Re: [PATCH net-next v4] net: Do not call ndo_dflt_fdb_dump if ndo_fdb_dump is defined
From: David Miller @ 2015-01-06  3:52 UTC (permalink / raw)
  To: hubert.sokolowski; +Cc: netdev, ray.kinsella
In-Reply-To: <54AAC9F1.9060409@intel.com>

From: Hubert Sokolowski <hubert.sokolowski@intel.com>
Date: Mon, 05 Jan 2015 17:29:21 +0000

> Add checking whether the call to ndo_dflt_fdb_dump is needed.
> It is not expected to call ndo_dflt_fdb_dump unconditionally
> by some drivers (i.e. qlcnic or macvlan) that defines
> own ndo_fdb_dump. Other drivers define own ndo_fdb_dump
> and don't want ndo_dflt_fdb_dump to be called at all.
> At the same time it is desirable to call the default dump
> function on a bridge device.
> Fix attributes that are passed to dev->netdev_ops->ndo_fdb_dump.
> Add extra checking in br_fdb_dump to avoid duplicate entries
> as now filter_dev can be NULL.
> 
> Following tests for filtering have been performed before
> the change and after the patch was applied to make sure
> they are the same and it doesn't break the filtering algorithm.
 ...
> Signed-off-by: Hubert Sokolowski <hubert.sokolowski@intel.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH net-next v3 0/6] net: allow setting congctl via routing table
From: David Miller @ 2015-01-06  3:55 UTC (permalink / raw)
  To: dborkman; +Cc: hannes, fw, netdev
In-Reply-To: <1420498668-4660-1-git-send-email-dborkman@redhat.com>

From: Daniel Borkmann <dborkman@redhat.com>
Date: Mon,  5 Jan 2015 23:57:42 +0100

> This is the second part of our work and allows for setting the congestion
> control algorithm via routing table. For details, please see individual
> patches.
> 
> Since patch 1 is a bug fix, we suggest applying patch 1 to net, and then
> merging net into net-next, for example, and following up with the remaining
> feature patches wrt dependencies.
> 
> Joint work with Florian Westphal, suggested by Hannes Frederic Sowa.
> 
> Patch for iproute2 is available under [1], but will be reposted with along
> with the man-page update when this set hits net-next.
> 
>   [1] http://patchwork.ozlabs.org/patch/418149/

Looks good, series applied, thanks!

^ permalink raw reply

* Re: [PATCH V2] net: eth: xgene: change APM X-Gene SoC platform ethernet to support ACPI
From: Feng Kan @ 2015-01-06  4:05 UTC (permalink / raw)
  To: David Miller; +Cc: patches, netdev, linux-kernel@vger.kernel.org
In-Reply-To: <20150105.224235.1941007145649816230.davem@davemloft.net>

On Mon, Jan 5, 2015 at 7:42 PM, David Miller <davem@davemloft.net> wrote:
> From: David Miller <davem@davemloft.net>
> Date: Mon, 05 Jan 2015 22:41:37 -0500 (EST)
>
>> From: Feng Kan <fkan@apm.com>
>> Date: Mon,  5 Jan 2015 17:42:56 -0800
>>
>>> This adds support for APM X-Gene ethernet driver to use ACPI table to derive
>>> ethernet driver parameter.
>>>
>>> Signed-off-by: Feng Kan <fkan@apm.com>
>>
>> Applied.
>
> Actually, reverted, this breaks the build:
>
> drivers/net/ethernet/apm/xgene/xgene_enet_main.c:1031:28: warning: 'xgene_enet_of_match' defined but not used [-Wunused-variable]
>  static struct of_device_id xgene_enet_of_match[] = {
>                             ^
> In file included from drivers/net/ethernet/apm/xgene/xgene_enet_main.h:32:0,
>                  from drivers/net/ethernet/apm/xgene/xgene_enet_main.c:22:
> include/linux/module.h:138:40: error: '__mod_of__xgene_enet_match_device_table' aliased to undefined symbol 'xgene_enet_match'
>    extern const struct type##_device_id __mod_##type##__##name##_device_table \
>                                         ^
> drivers/net/ethernet/apm/xgene/xgene_enet_main.c:1036:1: note: in expansion of macro 'MODULE_DEVICE_TABLE'
>  MODULE_DEVICE_TABLE(of, xgene_enet_match);
>  ^
I see, I initially tried on the net-next tree which did not build for
me, failed somewhere else but builds xgene enet ok.
Then I tested on net/master which builds fine for me.
Would you please point me to the branch you are using?

^ permalink raw reply

* Re: [PATCH V2] net: eth: xgene: change APM X-Gene SoC platform ethernet to support ACPI
From: David Miller @ 2015-01-06  4:10 UTC (permalink / raw)
  To: fkan; +Cc: patches, netdev, linux-kernel
In-Reply-To: <CAL85gmCGK54yK3sThVhGsatMVUS8+76LRwdDt4w9PUfppMJ3og@mail.gmail.com>

From: Feng Kan <fkan@apm.com>
Date: Mon, 5 Jan 2015 20:05:28 -0800

> I see, I initially tried on the net-next tree which did not build for
> me, failed somewhere else but builds xgene enet ok.
> Then I tested on net/master which builds fine for me.
> Would you please point me to the branch you are using?

net-next

^ permalink raw reply

* Re: [net-next PATCH v1 01/11] net: flow_table: create interface for hw match/action tables
From: Scott Feldman @ 2015-01-06  5:25 UTC (permalink / raw)
  To: John Fastabend
  Cc: Thomas Graf, Jiří Pírko, Jamal Hadi Salim,
	simon.horman@netronome.com, Netdev, David S. Miller,
	Andy Gospodarek
In-Reply-To: <20141231194544.31070.30335.stgit@nitbit.x32>

Nice work John.  Some nits inline...

On Wed, Dec 31, 2014 at 11:45 AM, John Fastabend
<john.fastabend@gmail.com> wrote:
>
> diff --git a/include/linux/if_flow.h b/include/linux/if_flow.h
> new file mode 100644
> index 0000000..1b6c1ea
> --- /dev/null
> +++ b/include/linux/if_flow.h
> @@ -0,0 +1,93 @@
> +/*
> + * include/linux/net/if_flow.h - Flow table interface for Switch devices
> + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com>
> + *
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * The full GNU General Public License is included in this distribution in
> + * the file called "COPYING".
> + *
> + * Author: John Fastabend <john.r.fastabend@intel.com>
> + */
> +
> +#ifndef _IF_FLOW_H
> +#define _IF_FLOW_H
> +
> +#include <uapi/linux/if_flow.h>
> +
> +/**
> + * @struct net_flow_header
> + * @brief defines a match (header/field) an endpoint can use
> + *
> + * @uid unique identifier for header
> + * @field_sz number of fields are in the set
> + * @fields the set of fields in the net_flow_header
> + */
> +struct net_flow_header {
> +       char name[NET_FLOW_NAMSIZ];
> +       int uid;
> +       int field_sz;
> +       struct net_flow_field *fields;
> +};
> +
> +/**
> + * @struct net_flow_action
> + * @brief a description of a endpoint defined action
> + *
> + * @name printable name
> + * @uid unique action identifier
> + * @types NET_FLOW_ACTION_TYPE_NULL terminated list of action types

s/types/args?

> + */
> +struct net_flow_action {
> +       char name[NET_FLOW_NAMSIZ];
> +       int uid;
> +       struct net_flow_action_arg *args;
> +};
> +
> +/**
> + * @struct net_flow_table
> + * @brief define flow table with supported match/actions
> + *
> + * @uid unique identifier for table
> + * @source uid of parent table

Is parent table the table previous in the pipeline?  If so, what if
you can get to table from N different parent tables, what goes in
source?

> + * @size max number of entries for table or -1 for unbounded
> + * @matches null terminated set of supported match types given by match uid
> + * @actions null terminated set of supported action types given by action uid
> + * @flows set of flows
> + */
> +struct net_flow_table {
> +       char name[NET_FLOW_NAMSIZ];
> +       int uid;
> +       int source;
> +       int size;
> +       struct net_flow_field_ref *matches;
> +       int *actions;
> +};
> +
> +/* net_flow_hdr_node: node in a header graph of header fields.
> + *
> + * @uid : unique id of the graph node
> + * @flwo_header_ref : identify the hdrs that can handled by this node

s/flwo_header_ref/hdrs?

> + * @net_flow_jump_table : give a case jump statement

s/net_flow_jump_table/jump

> + */
> +struct net_flow_hdr_node {
> +       char name[NET_FLOW_NAMSIZ];
> +       int uid;
> +       int *hdrs;
> +       struct net_flow_jump_table *jump;
> +};
> +
> +struct net_flow_tbl_node {
> +       int uid;
> +       __u32 flags;
> +       struct net_flow_jump_table *jump;
> +};
> +#endif
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 29c92ee..3c3c856 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -52,6 +52,11 @@
>  #include <linux/neighbour.h>
>  #include <uapi/linux/netdevice.h>
>
> +#ifdef CONFIG_NET_FLOW_TABLES
> +#include <linux/if_flow.h>
> +#include <uapi/linux/if_flow.h>

linux/if_flow.h already included uapi file

> +#endif
> +
>  struct netpoll_info;
>  struct device;
>  struct phy_device;
> @@ -1186,6 +1191,13 @@ struct net_device_ops {
>         int                     (*ndo_switch_port_stp_update)(struct net_device *dev,
>                                                               u8 state);
>  #endif
> +#ifdef CONFIG_NET_FLOW_TABLES
> +       struct net_flow_action  **(*ndo_flow_get_actions)(struct net_device *dev);
> +       struct net_flow_table   **(*ndo_flow_get_tables)(struct net_device *dev);
> +       struct net_flow_header  **(*ndo_flow_get_headers)(struct net_device *dev);
> +       struct net_flow_hdr_node **(*ndo_flow_get_hdr_graph)(struct net_device *dev);

hdr or header?  pick one, probably hdr.

> +       struct net_flow_tbl_node **(*ndo_flow_get_tbl_graph)(struct net_device *dev);

move this up next to get_tables

> +#endif
>  };
>
>  /**
> diff --git a/include/uapi/linux/if_flow.h b/include/uapi/linux/if_flow.h
> new file mode 100644
> index 0000000..2acdb38
> --- /dev/null
> +++ b/include/uapi/linux/if_flow.h
> @@ -0,0 +1,363 @@
> +/*
> + * include/uapi/linux/if_flow.h - Flow table interface for Switch devices
> + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com>
> + *
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * The full GNU General Public License is included in this distribution in
> + * the file called "COPYING".
> + *
> + * Author: John Fastabend <john.r.fastabend@intel.com>
> + */
> +
> +/* Netlink description:
> + *
> + * Table definition used to describe running tables. The following
> + * describes the netlink message returned from a flow API messages.

message?


> +
> +enum {
> +       NET_FLOW_MASK_TYPE_UNSPEC,
> +       NET_FLOW_MASK_TYPE_EXACT,
> +       NET_FLOW_MASK_TYPE_LPM,

As discussed in another thread, need third mask type that's not LPM;
e.g. 0b0101.


> +#define NET_FLOW_TABLE_GRAPH_NODE_MAX (__NET_FLOW_TABLE_GRAPH_NODE_MAX - 1)
> +
> +enum {
> +       NET_FLOW_TABLE_GRAPH_UNSPEC,
> +       NET_FLOW_TABLE_GRAPH_NODE,
> +       __NET_FLOW_TABLE_GRAPH_MAX,
> +};
> +#define NET_FLOW_TABLE_GRAPH_MAX (__NET_FLOW_TABLE_GRAPH_MAX - 1)
> +
> +enum {
> +       NET_FLOW_IDENTIFIER_IFINDEX, /* net_device ifindex */

Maybe add an NET_FLOW_IDENTIFIER_UNSPEC so NET_FLOW_IDENTIFIER_IFINDEX
isn't zero.

> diff --git a/net/Kconfig b/net/Kconfig
> index ff9ffc1..8380bfe 100644
> --- a/net/Kconfig
> +++ b/net/Kconfig
> @@ -293,6 +293,13 @@ config NET_FLOW_LIMIT
>           with many clients some protection against DoS by a single (spoofed)
>           flow that greatly exceeds average workload.
>
> +config NET_FLOW_TABLES
> +       boolean "Support network flow tables"
> +       ---help---
> +       This feature provides an interface for device drivers to report
> +       flow tables and supported matches and actions. If you do not
> +       want to support hardware offloads for flow tables, say N here.
> +
>  menu "Network testing"
>
>  config NET_PKTGEN
> diff --git a/net/core/Makefile b/net/core/Makefile
> index 235e6c5..1eea785 100644
> --- a/net/core/Makefile
> +++ b/net/core/Makefile
> @@ -23,3 +23,4 @@ obj-$(CONFIG_NETWORK_PHY_TIMESTAMPING) += timestamping.o
>  obj-$(CONFIG_NET_PTP_CLASSIFY) += ptp_classifier.o
>  obj-$(CONFIG_CGROUP_NET_PRIO) += netprio_cgroup.o
>  obj-$(CONFIG_CGROUP_NET_CLASSID) += netclassid_cgroup.o
> +obj-$(CONFIG_NET_FLOW_TABLES) += flow_table.o
> diff --git a/net/core/flow_table.c b/net/core/flow_table.c
> new file mode 100644
> index 0000000..ec3f06d
> --- /dev/null
> +++ b/net/core/flow_table.c
> @@ -0,0 +1,837 @@
> +/*
> + * include/uapi/linux/if_flow.h - Flow table interface for Switch devices
> + * Copyright (c) 2014 John Fastabend <john.r.fastabend@intel.com>
> + *
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * The full GNU General Public License is included in this distribution in
> + * the file called "COPYING".
> + *
> + * Author: John Fastabend <john.r.fastabend@intel.com>
> + */
> +
> +#include <uapi/linux/if_flow.h>
> +#include <linux/if_flow.h>
> +#include <linux/if_bridge.h>
> +#include <linux/types.h>
> +#include <net/netlink.h>
> +#include <net/genetlink.h>
> +#include <net/rtnetlink.h>
> +#include <linux/module.h>
> +
> +static struct genl_family net_flow_nl_family = {
> +       .id             = GENL_ID_GENERATE,
> +       .name           = NET_FLOW_GENL_NAME,
> +       .version        = NET_FLOW_GENL_VERSION,
> +       .maxattr        = NET_FLOW_MAX,
> +       .netnsok        = true,
> +};
> +
> +static struct net_device *net_flow_get_dev(struct genl_info *info)
> +{
> +       struct net *net = genl_info_net(info);
> +       int type, ifindex;
> +
> +       if (!info->attrs[NET_FLOW_IDENTIFIER_TYPE] ||
> +           !info->attrs[NET_FLOW_IDENTIFIER])
> +               return NULL;
> +
> +       type = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER_TYPE]);
> +       switch (type) {
> +       case NET_FLOW_IDENTIFIER_IFINDEX:
> +               ifindex = nla_get_u32(info->attrs[NET_FLOW_IDENTIFIER]);
> +               break;
> +       default:
> +               return NULL;
> +       }
> +
> +       return dev_get_by_index(net, ifindex);
> +}
> +
> +static int net_flow_put_act_types(struct sk_buff *skb,
> +                                 struct net_flow_action_arg *args)
> +{
> +       int i, err;
> +
> +       for (i = 0; args[i].type; i++) {
> +               err = nla_put(skb, NET_FLOW_ACTION_ARG,
> +                             sizeof(struct net_flow_action_arg), &args[i]);
> +               if (err)
> +                       return -EMSGSIZE;
> +       }
> +       return 0;
> +}
> +
> +static const
> +struct nla_policy net_flow_action_policy[NET_FLOW_ACTION_ATTR_MAX + 1] = {
> +       [NET_FLOW_ACTION_ATTR_NAME]      = {.type = NLA_STRING,
> +                                           .len = NET_FLOW_NAMSIZ-1 },
> +       [NET_FLOW_ACTION_ATTR_UID]       = {.type = NLA_U32 },
> +       [NET_FLOW_ACTION_ATTR_SIGNATURE] = {.type = NLA_NESTED },
> +};
> +
> +static int net_flow_put_action(struct sk_buff *skb, struct net_flow_action *a)
> +{
> +       struct net_flow_action_arg *this;
> +       struct nlattr *nest;
> +       int err, args = 0;
> +
> +       if (a->name && nla_put_string(skb, NET_FLOW_ACTION_ATTR_NAME, a->name))
> +               return -EMSGSIZE;
> +
> +       if (nla_put_u32(skb, NET_FLOW_ACTION_ATTR_UID, a->uid))
> +               return -EMSGSIZE;
> +
> +       if (!a->args)
> +               return 0;
> +
> +       for (this = &a->args[0]; strlen(this->name) > 0; this++)
> +               args++;
> +

Since you only need to know that there are > 0 args, but don't need
the actual count, can you simplify test with something like:

   bool has_args = strlen(a->args->name) > 0;

or

  bool has_args = !!a->args->type;

> +       if (args) {
> +               nest = nla_nest_start(skb, NET_FLOW_ACTION_ATTR_SIGNATURE);
> +               if (!nest)
> +                       goto nest_put_failure;

Maybe just return -EMSGSIZE here and skip goto.

> +
> +               err = net_flow_put_act_types(skb, a->args);
> +               if (err) {
> +                       nla_nest_cancel(skb, nest);
> +                       return err;
> +               }
> +               nla_nest_end(skb, nest);
> +       }
> +
> +       return 0;
> +nest_put_failure:
> +       return -EMSGSIZE;
> +}
> +
> +static int net_flow_put_actions(struct sk_buff *skb,
> +                               struct net_flow_action **acts)
> +{
> +       struct nlattr *actions;
> +       int err, i;
> +
> +       actions = nla_nest_start(skb, NET_FLOW_ACTIONS);
> +       if (!actions)
> +               return -EMSGSIZE;
> +
> +       for (i = 0; acts[i]->uid; i++) {

Using for(act = acts; act->udi; act++) will make code a little nicer.

> +               struct nlattr *action = nla_nest_start(skb, NET_FLOW_ACTION);
> +
> +               if (!action)
> +                       goto action_put_failure;
> +
> +               err = net_flow_put_action(skb, acts[i]);
> +               if (err)
> +                       goto action_put_failure;
> +               nla_nest_end(skb, action);
> +       }
> +       nla_nest_end(skb, actions);
> +
> +       return 0;
> +action_put_failure:
> +       nla_nest_cancel(skb, actions);
> +       return -EMSGSIZE;
> +}
> +
> +struct sk_buff *net_flow_build_actions_msg(struct net_flow_action **a,
> +                                          struct net_device *dev,
> +                                          u32 portid, int seq, u8 cmd)
> +{
> +       struct genlmsghdr *hdr;
> +       struct sk_buff *skb;
> +       int err = -ENOBUFS;
> +
> +       skb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
> +       if (!skb)
> +               return ERR_PTR(-ENOBUFS);
> +
> +       hdr = genlmsg_put(skb, portid, seq, &net_flow_nl_family, 0, cmd);
> +       if (!hdr)
> +               goto out;
> +
> +       if (nla_put_u32(skb,
> +                       NET_FLOW_IDENTIFIER_TYPE,
> +                       NET_FLOW_IDENTIFIER_IFINDEX) ||
> +           nla_put_u32(skb, NET_FLOW_IDENTIFIER, dev->ifindex)) {
> +               err = -ENOBUFS;
> +               goto out;
> +       }
> +
> +       err = net_flow_put_actions(skb, a);
> +       if (err < 0)
> +               goto out;
> +
> +       err = genlmsg_end(skb, hdr);
> +       if (err < 0)
> +               goto out;
> +
> +       return skb;
> +out:
> +       nlmsg_free(skb);
> +       return ERR_PTR(err);
> +}
> +
> +static int net_flow_cmd_get_actions(struct sk_buff *skb,
> +                                   struct genl_info *info)
> +{
> +       struct net_flow_action **a;
> +       struct net_device *dev;
> +       struct sk_buff *msg;
> +
> +       dev = net_flow_get_dev(info);
> +       if (!dev)
> +               return -EINVAL;
> +
> +       if (!dev->netdev_ops->ndo_flow_get_actions) {
> +               dev_put(dev);
> +               return -EOPNOTSUPP;
> +       }
> +
> +       a = dev->netdev_ops->ndo_flow_get_actions(dev);
> +       if (!a)
> +               return -EBUSY;

Is it assumed ndo_flow_get_actions() returns a pointer to a static
list of actions?  What if the device wants to give up a dynamic list
of actions?  I'm trying to understand the lifetime of pointer 'a'.
What would cause -EBUSY condition?

> +
> +       msg = net_flow_build_actions_msg(a, dev,
> +                                        info->snd_portid,
> +                                        info->snd_seq,
> +                                        NET_FLOW_TABLE_CMD_GET_ACTIONS);
> +       dev_put(dev);
> +
> +       if (IS_ERR(msg))
> +               return PTR_ERR(msg);
> +
> +       return genlmsg_reply(msg, info);
> +}
> +
> +static int net_flow_put_table(struct net_device *dev,
> +                             struct sk_buff *skb,
> +                             struct net_flow_table *t)
> +{
> +       struct nlattr *matches, *actions;
> +       int i;
> +
> +       if (nla_put_string(skb, NET_FLOW_TABLE_ATTR_NAME, t->name) ||
> +           nla_put_u32(skb, NET_FLOW_TABLE_ATTR_UID, t->uid) ||
> +           nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SOURCE, t->source) ||
> +           nla_put_u32(skb, NET_FLOW_TABLE_ATTR_SIZE, t->size))
> +               return -EMSGSIZE;
> +
> +       matches = nla_nest_start(skb, NET_FLOW_TABLE_ATTR_MATCHES);
> +       if (!matches)
> +               return -EMSGSIZE;
> +
> +       for (i = 0; t->matches[i].instance; i++)

pointer-based loop better than i-based?  my personal preference, i guess.

> +               nla_put(skb, NET_FLOW_FIELD_REF,
> +                       sizeof(struct net_flow_field_ref),
> +                       &t->matches[i]);
> +       nla_nest_end(skb, matches);
> +

^ permalink raw reply

* [PATCH v4 net-net 0/2] Increase the limit of tuntap queues
From: Pankaj Gupta @ 2015-01-06  5:39 UTC (permalink / raw)
  To: linux-kernel, netdev
  Cc: davem, jasowang, mst, dgibson, vfalico, edumazet, vyasevic, hkchu,
	wuzhy, xemul, therbert, bhutchings, xii, stephen, jiri,
	sergei.shtylyov, Pankaj Gupta

Networking under KVM works best if we allocate a per-vCPU rx and tx
queue in a virtual NIC. This requires a per-vCPU queue on the host side.
Modern physical NICs have multiqueue support for large number of queues.
To scale vNIC to run multiple queues parallel to maximum number of vCPU's
we need to increase number of queues support in tuntap.   

Changes from v3:

PATCH1: Michael.S.Tsirkin - Some cleanups and updated commit message.

Changes from v2:
PATCH 3: David Miller     - flex array adds extra level of indirection
                            for preallocated array.(dropped, as flow array
			    is allocated using kzalloc with failover to zalloc). 
Changes from v1:
PATCH 2: David Miller     - sysctl changes to limit number of queues 
                            not required for unprivileged users(dropped).

Changes from RFC
PATCH 1: Sergei Shtylyov  - Add an empty line after declarations.
PATCH 2: Jiri Pirko -       Do not introduce new module paramaters.
	 Michael.S.Tsirkin- We can use sysctl for limiting max number
                            of queues.

This series is to increase the number of tuntap queues. Original work is being 
done by 'jasowang@redhat.com'. I am taking this 'https://lkml.org/lkml/2013/6/19/29' 
patch series as a reference. As per discussion in the patch series:

There were two reasons which prevented us from increasing number of tun queues:

- The netdev_queue array in netdevice were allocated through kmalloc, which may 
  cause a high order memory allocation too when we have several queues. 
  E.g. sizeof(netdev_queue) is 320, which means a high order allocation would 
  happens when the device has more than 16 queues.

- We store the hash buckets in tun_struct which results a very large size of
  tun_struct, this high order memory allocation fail easily when the memory is
  fragmented.

The patch 60877a32bce00041528576e6b8df5abe9251fa73 increases the number of tx 
queues. Memory allocation fallback to vzalloc() when kmalloc() fails.

This series tries to address following issues:

- Increase the number of netdev_queue queues for rx similarly its done for tx 
  queues by falling back to vzalloc() when memory allocation with kmalloc() fails.

- Increase number of queues to 256, maximum number is equal to maximum number 
  of vCPUS allowed in a guest.

I have also done testing with multiple parallel Netperf sessions for different 
combination of queues and CPU's. It seems to be working fine without much increase 
in cpu load with increase in number of queues. I also see good increase in throughput
with increase in number of queues. Though i had limitation of 8 physical CPU's. 

For this test: Two Hosts(Host1 & Host2) are directly connected with cable  
Host1 is running Guest1. Data is sent from Host2 to Guest1 via Host1.

Host kernel: 3.19.0-rc2+, AMD Opteron(tm) Processor 6320
NIC : Emulex Corporation OneConnect 10Gb NIC (be3)

Patch Applied  %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle  throughput
Single Queue, 2 vCPU's
-------------
Before Patch :all    0.19    0.00    0.16    0.07    0.04    0.10    0.00    0.18    0.00   99.26  57864.18
After  Patch :all    0.99    0.00    0.64    0.69    0.07    0.26    0.00    1.58    0.00   95.77  57735.77

With 2 Queues, 2 vCPU's    
---------------
Before Patch :all    0.19    0.00    0.19    0.10    0.04    0.11    0.00    0.28    0.00   99.08  63083.09
After  Patch :all    0.87    0.00    0.73    0.78    0.09    0.35    0.00    2.04    0.00   95.14  62917.03

With 4 Queues, 4 vCPU's
--------------
Before Patch :all    0.20    0.00    0.21    0.11    0.04    0.12    0.00    0.32    0.00   99.00  80865.06
After  Patch :all    0.71    0.00    0.93    0.85    0.11    0.51    0.00    2.62    0.00   94.27  86463.19

With 8 Queues, 8 vCPU's
--------------
Before Patch :all    0.19    0.00    0.18    0.09    0.04    0.11    0.00    0.23    0.00   99.17  86795.31
After  Patch :all    0.65    0.00    1.18    0.93    0.13    0.68    0.00    3.38    0.00   93.05  89459.93

With 16 Queues, 8 vCPU's
--------------
After  Patch :all    0.61    0.00    1.59    0.97    0.18    0.92    0.00    4.32    0.00   91.41  120951.60

Patches Summary:
  net: allow large number of rx queues
  tuntap: Increase the number of queues in tun

 drivers/net/tun.c |    9 +++++----
 net/core/dev.c    |   13 ++++++++-----
 2 files changed, 13 insertions(+), 9 deletions(-)

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox