Netdev List
* [PATCH 1/4] bindings: net: stmmac: correct note about TSO
From: Niklas Cassel @ 2016-11-23 14:24 UTC
  To: Rob Herring, Mark Rutland, David S. Miller, Giuseppe CAVALLARO,
	Alexandre TORGUE, Phil Reid, Niklas Cassel, Eric Engestrom
  Cc: Niklas Cassel, netdev@vger.kernel.org,
	devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org

From: Niklas Cassel <niklas.cassel@axis.com>

snps,tso was previously placed under AXI BUS Mode parameters,
suggesting that the property should be in the stmmac-axi-config node.

TSO (TCP Segmentation Offloading) has nothing to do with AXI BUS Mode
parameters, and the parser actually expects it to be in the root node,
not in the stmmac-axi-config.

Also added a note about snps,tso only being available on GMAC4 and newer.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
---
 Documentation/devicetree/bindings/net/stmmac.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/stmmac.txt b/Documentation/devicetree/bindings/net/stmmac.txt
index 41b49e6075f5..b95ff998ba73 100644
--- a/Documentation/devicetree/bindings/net/stmmac.txt
+++ b/Documentation/devicetree/bindings/net/stmmac.txt
@@ -1,7 +1,7 @@
 * STMicroelectronics 10/100/1000 Ethernet driver (GMAC)
 
 Required properties:
-- compatible: Should be "snps,dwmac-<ip_version>" "snps,dwmac"
+- compatible: Should be "snps,dwmac-<ip_version>", "snps,dwmac"
 	For backwards compatibility: "st,spear600-gmac" is also supported.
 - reg: Address and length of the register set for the device
 - interrupt-parent: Should be the phandle for the interrupt controller
@@ -50,6 +50,8 @@ Optional properties:
 - snps,ps-speed: port selection speed that can be passed to the core when
 		 PCS is supported. For example, this is used in case of SGMII
 		 and MAC2MAC connection.
+- snps,tso: this enables the TSO feature otherwise it will be managed by
+		 MAC HW capability register. Only for GMAC4 and newer.
 - AXI BUS Mode parameters: below the list of all the parameters to program the
 			   AXI register inside the DMA module:
 	- snps,lpi_en: enable Low Power Interface
@@ -62,8 +64,6 @@ Optional properties:
 	- snps,fb: fixed-burst
 	- snps,mb: mixed-burst
 	- snps,rb: rebuild INCRx Burst
-	- snps,tso: this enables the TSO feature otherwise it will be managed by
-	    MAC HW capability register.
 - mdio: with compatible = "snps,dwmac-mdio", create and register mdio bus.
 
 Examples:
-- 
2.1.4

* [PATCH] dwc_eth_qos: drop duplicate headers
From: Geliang Tang @ 2016-11-23 14:24 UTC
  To: Lars Persson; +Cc: Geliang Tang, netdev, linux-kernel

Drop duplicate headers types.h and delay.h from dwc_eth_qos.c.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 drivers/net/ethernet/synopsys/dwc_eth_qos.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/synopsys/dwc_eth_qos.c b/drivers/net/ethernet/synopsys/dwc_eth_qos.c
index 7053301..acce385 100644
--- a/drivers/net/ethernet/synopsys/dwc_eth_qos.c
+++ b/drivers/net/ethernet/synopsys/dwc_eth_qos.c
@@ -33,7 +33,6 @@
 #include <linux/stat.h>
 #include <linux/types.h>
 
-#include <linux/types.h>
 #include <linux/slab.h>
 #include <linux/delay.h>
 #include <linux/mm.h>
@@ -43,7 +42,6 @@
 
 #include <linux/phy.h>
 #include <linux/mii.h>
-#include <linux/delay.h>
 #include <linux/dma-mapping.h>
 #include <linux/vmalloc.h>
 
-- 
2.9.3

* [PATCH 3/4] net: stmmac: dwmac-generic: add missing compatible strings
From: Niklas Cassel @ 2016-11-23 14:25 UTC
  To: Giuseppe Cavallaro, Alexandre Torgue; +Cc: Niklas Cassel, netdev, linux-kernel

From: Niklas Cassel <niklas.cassel@axis.com>

devicetree binding for stmmac states:
- compatible: Should be "snps,dwmac-<ip_version>", "snps,dwmac"
	For backwards compatibility: "st,spear600-gmac" is also supported.

Since dwmac-generic.c calls stmmac_probe_config_dt explicitly,
another alternative would have been to remove all compatible strings
other than "snps,dwmac" and "st,spear600-gmac" from dwmac-generic.c.

However, that would probably do more harm than good, since when trying
to figure out what hardware a certain driver supports, you usually look
at the compatible strings in the struct of_device_id, and not in some
function defined in a completely different file.

No functional change intended.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c
index b1e5f24708c9..52cd365b8e5e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c
@@ -58,9 +58,12 @@ static int dwmac_generic_probe(struct platform_device *pdev)
 
 static const struct of_device_id dwmac_generic_match[] = {
 	{ .compatible = "st,spear600-gmac"},
+	{ .compatible = "snps,dwmac-3.50a"},
 	{ .compatible = "snps,dwmac-3.610"},
 	{ .compatible = "snps,dwmac-3.70a"},
 	{ .compatible = "snps,dwmac-3.710"},
+	{ .compatible = "snps,dwmac-4.00"},
+	{ .compatible = "snps,dwmac-4.10a"},
 	{ .compatible = "snps,dwmac"},
 	{ }
 };
-- 
2.1.4
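
The "no functional change" claim above can be illustrated with a small
userspace sketch. It assumes a simplified matcher (any string in the node's
compatible list equal to any table entry counts as a match); the real OF
matching code is more involved, and all demo_* names are hypothetical. A node
that follows the binding and lists "snps,dwmac" as fallback matches both the
old and the new table:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Table as it was before this patch. */
static const char *const demo_old_table[] = {
	"st,spear600-gmac", "snps,dwmac-3.610", "snps,dwmac-3.70a",
	"snps,dwmac-3.710", "snps,dwmac",
};

/* Table after this patch, with the three added entries. */
static const char *const demo_new_table[] = {
	"st,spear600-gmac", "snps,dwmac-3.50a", "snps,dwmac-3.610",
	"snps,dwmac-3.70a", "snps,dwmac-3.710", "snps,dwmac-4.00",
	"snps,dwmac-4.10a", "snps,dwmac",
};

/* A node written as the binding requires: specific string, then fallback. */
static const char *const demo_gmac4_node[] = {
	"snps,dwmac-4.10a", "snps,dwmac",
};

/* Simplified matcher: true if any node string equals any table entry. */
static bool demo_node_matches(const char *const *node, size_t n_node,
			      const char *const *table, size_t n_table)
{
	for (size_t i = 0; i < n_node; i++)
		for (size_t j = 0; j < n_table; j++)
			if (strcmp(node[i], table[j]) == 0)
				return true;
	return false;
}
```

In this model the node binds either way, so the added entries only document
which IP versions the driver is meant to cover.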

* [PATCH 4/4] net: stmmac: stmmac_platform: use correct setup function for gmac4
From: Niklas Cassel @ 2016-11-23 14:25 UTC
  To: Giuseppe Cavallaro, Alexandre Torgue; +Cc: Niklas Cassel, netdev, linux-kernel

From: Niklas Cassel <niklas.cassel@axis.com>

devicetree binding for stmmac states:
- compatible: Should be "snps,dwmac-<ip_version>", "snps,dwmac"
	For backwards compatibility: "st,spear600-gmac" is also supported.

Previously, when specifying "snps,dwmac-4.10a", "snps,dwmac" as your
compatible string, plat_stmmacenet_data would have both has_gmac and
has_gmac4 set.

This would lead to stmmac_hw_init calling dwmac1000_setup rather than
dwmac4_setup, resulting in a non-functional driver.
This happened since the check for has_gmac is done before the check for
has_gmac4. However, the order should not matter, so it does not make sense
to have both set.

If something is valid for both, you should do as the stmmac_interrupt does:
if (priv->plat->has_gmac || priv->plat->has_gmac4) ...

The places where it was obvious that the author actually meant
if (has_gmac || has_gmac4) rather than if (has_gmac) have been updated.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c  | 4 ++--
 drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 1 +
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index d5a8122b6033..dd5b38e4cd1f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -263,7 +263,7 @@ static void stmmac_ethtool_getdrvinfo(struct net_device *dev,
 {
 	struct stmmac_priv *priv = netdev_priv(dev);
 
-	if (priv->plat->has_gmac)
+	if (priv->plat->has_gmac || priv->plat->has_gmac4)
 		strlcpy(info->driver, GMAC_ETHTOOL_NAME, sizeof(info->driver));
 	else
 		strlcpy(info->driver, MAC100_ETHTOOL_NAME,
@@ -448,7 +448,7 @@ static void stmmac_ethtool_gregs(struct net_device *dev,
 
 	memset(reg_space, 0x0, REG_SPACE_SIZE);
 
-	if (!priv->plat->has_gmac) {
+	if (!(priv->plat->has_gmac || priv->plat->has_gmac4)) {
 		/* MAC registers */
 		for (i = 0; i < 12; i++)
 			reg_space[i] = readl(priv->ioaddr + (i * 4));
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
index 4d544c34c1f2..c8a59f396c6e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
@@ -291,6 +291,7 @@ stmmac_probe_config_dt(struct platform_device *pdev, const char **mac)
 	if (of_device_is_compatible(np, "snps,dwmac-4.00") ||
 	    of_device_is_compatible(np, "snps,dwmac-4.10a")) {
 		plat->has_gmac4 = 1;
+		plat->has_gmac = 0;
 		plat->pmt = 1;
 		plat->tso_en = of_property_read_bool(np, "snps,tso");
 	}
-- 
2.1.4
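
A userspace sketch of the ordering problem described in the commit message.
The struct, enum and function names are hypothetical stand-ins, and the
dwmac100 fallback branch is an assumption about the remaining case; only the
"has_gmac is checked before has_gmac4" order is taken from the text above.

```c
#include <stdbool.h>

/* Stand-in for the two relevant flags in plat_stmmacenet_data. */
struct demo_plat {
	bool has_gmac;
	bool has_gmac4;
};

enum demo_setup { DEMO_DWMAC100, DEMO_DWMAC1000, DEMO_DWMAC4 };

/* Same check order as described above: has_gmac wins if both are set. */
static enum demo_setup demo_pick_setup(const struct demo_plat *plat)
{
	if (plat->has_gmac)
		return DEMO_DWMAC1000;
	if (plat->has_gmac4)
		return DEMO_DWMAC4;
	return DEMO_DWMAC100;	/* assumed fallback for the sketch */
}

/* Models parsing of compatible = "snps,dwmac-4.10a", "snps,dwmac". */
static struct demo_plat demo_parse_gmac4_node(bool with_fix)
{
	struct demo_plat plat = { false, false };

	plat.has_gmac = true;		/* from the "snps,dwmac" fallback */
	plat.has_gmac4 = true;		/* from "snps,dwmac-4.10a" */
	if (with_fix)
		plat.has_gmac = false;	/* the line added by this patch */
	return plat;
}
```

Without the fix the model selects the dwmac1000 setup for a GMAC4 node; with
has_gmac cleared it selects the dwmac4 setup.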

* Re: [PATCH] cpsw: ethtool: add support for getting/setting EEE registers
From: Rami Rosen @ 2016-11-23 14:31 UTC
  To: yegorslists
  Cc: Netdev, Linux OMAP Mailing List, grygorii.strashko, mugunthanvnm
In-Reply-To: <1479897248-5923-1-git-send-email-yegorslists@googlemail.com>

Hi, Yegor,

Minor comment: these methods should be static.

+int cpsw_get_eee(struct net_device *ndev, struct ethtool_eee *edata)
+{
...
...
+int cpsw_set_eee(struct net_device *ndev, struct ethtool_eee *edata)
...

Regards,
Rami Rosen

* [PATCH] can: bcm: fix support for CAN FD frames
From: Marc Kleine-Budde @ 2016-11-23 14:34 UTC
  To: netdev
  Cc: davem, linux-can, kernel, Oliver Hartkopp, linux-stable,
	Marc Kleine-Budde
In-Reply-To: <20161123143430.24985-1-mkl@pengutronix.de>

From: Oliver Hartkopp <socketcan@hartkopp.net>

Since commit 6f3b911d5f29b98 ("can: bcm: add support for CAN FD frames") the
CAN broadcast manager supports CAN and CAN FD data frames.

As these data frames are embedded in struct can[fd]_frames, which have
different lengths, access to the provided array of CAN frames became
dependent on op->cfsiz. Because a struct canfd_frame pointer was used for the
array of CAN frames, the new offset calculation based on op->cfsiz was
accidentally applied to CAN FD frame element lengths.

This fix makes the pointer to the arrays of the different CAN frame types a
void pointer so that the offset calculation in bytes accesses the correct CAN
frame elements.

Reference: http://marc.info/?l=linux-netdev&m=147980658909653

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 net/can/bcm.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/net/can/bcm.c b/net/can/bcm.c
index 8af9d25ff988..436a7537e6a9 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -77,7 +77,7 @@
 		     (CAN_EFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG) : \
 		     (CAN_SFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG))
 
-#define CAN_BCM_VERSION "20160617"
+#define CAN_BCM_VERSION "20161123"
 
 MODULE_DESCRIPTION("PF_CAN broadcast manager protocol");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -109,8 +109,9 @@ struct bcm_op {
 	u32 count;
 	u32 nframes;
 	u32 currframe;
-	struct canfd_frame *frames;
-	struct canfd_frame *last_frames;
+	/* void pointers to arrays of struct can[fd]_frame */
+	void *frames;
+	void *last_frames;
 	struct canfd_frame sframe;
 	struct canfd_frame last_sframe;
 	struct sock *sk;
@@ -681,7 +682,7 @@ static void bcm_rx_handler(struct sk_buff *skb, void *data)
 
 	if (op->flags & RX_FILTER_ID) {
 		/* the easiest case */
-		bcm_rx_update_and_send(op, &op->last_frames[0], rxframe);
+		bcm_rx_update_and_send(op, op->last_frames, rxframe);
 		goto rx_starttimer;
 	}
 
@@ -1068,7 +1069,7 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 
 		if (msg_head->nframes) {
 			/* update CAN frames content */
-			err = memcpy_from_msg((u8 *)op->frames, msg,
+			err = memcpy_from_msg(op->frames, msg,
 					      msg_head->nframes * op->cfsiz);
 			if (err < 0)
 				return err;
@@ -1118,7 +1119,7 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 		}
 
 		if (msg_head->nframes) {
-			err = memcpy_from_msg((u8 *)op->frames, msg,
+			err = memcpy_from_msg(op->frames, msg,
 					      msg_head->nframes * op->cfsiz);
 			if (err < 0) {
 				if (op->frames != &op->sframe)
@@ -1163,6 +1164,7 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 	/* check flags */
 
 	if (op->flags & RX_RTR_FRAME) {
+		struct canfd_frame *frame0 = op->frames;
 
 		/* no timers in RTR-mode */
 		hrtimer_cancel(&op->thrtimer);
@@ -1174,8 +1176,8 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 		 * prevent a full-load-loopback-test ... ;-]
 		 */
 		if ((op->flags & TX_CP_CAN_ID) ||
-		    (op->frames[0].can_id == op->can_id))
-			op->frames[0].can_id = op->can_id & ~CAN_RTR_FLAG;
+		    (frame0->can_id == op->can_id))
+			frame0->can_id = op->can_id & ~CAN_RTR_FLAG;
 
 	} else {
 		if (op->flags & SETTIMER) {
-- 
2.10.2
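
The pointer arithmetic issue can be reproduced in userspace. The sketch below
uses simplified stand-in structs (demo_can_frame, demo_canfd_frame) rather
than the real UAPI definitions, and assumes the problematic expression had
the shape "typed pointer + cfsiz * index", where the compiler scales the
index by the element size a second time. The kernel fix relies on void
pointer arithmetic (a GNU extension); the sketch uses char * for portability.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a classic CAN frame: 16 bytes on common ABIs. */
struct demo_can_frame {
	uint32_t can_id;
	uint8_t len;
	uint8_t pad[3];
	uint8_t data[8];
};

/* Stand-in for a CAN FD frame: 72 bytes on common ABIs. */
struct demo_canfd_frame {
	uint32_t can_id;
	uint8_t len;
	uint8_t flags;
	uint8_t res[2];
	uint8_t data[64];
};

/*
 * Byte offset that results from "fd_ptr + cfsiz * i" when fd_ptr is a
 * struct demo_canfd_frame *: the byte count is multiplied by the element
 * size again.
 */
static size_t demo_buggy_offset(size_t cfsiz, unsigned int i)
{
	return cfsiz * i * sizeof(struct demo_canfd_frame);
}

/* Byte-based addressing: frame i starts exactly cfsiz * i bytes in. */
static void *demo_frame_at(void *frames, size_t cfsiz, unsigned int i)
{
	return (char *)frames + cfsiz * i;
}
```

For an array of classic frames (cfsiz of 16 in this model), the buggy form
lands far outside the array already for index 1, which is consistent with the
out-of-bounds access reported against the original code.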

* [patch net-next v2 00/11] ipv4: fib: Allow modules to dump FIB tables
From: Jiri Pirko @ 2016-11-23 14:34 UTC
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber

From: Jiri Pirko <jiri@mellanox.com>

Ido says:

In kernel 4.9 the switchdev-specific FIB offload mechanism was replaced
by a new FIB notification chain to which modules could register in order
to be notified about the addition and deletion of FIB entries. The
motivation for this change was that switchdev drivers need to be able to
reflect the entire FIB table and not only FIBs configured on top of the
port netdevs themselves. This is useful in case of in-band management.

The fundamental problem with this approach is that upon registration
listeners lose all the information previously sent in the chain and
thus have an incomplete view of the FIB tables, which can result in
packet loss. This patchset fixes that by introducing a new API to dump
the FIB tables.

The entire dump process is done under RCU and thus the FIB notification
chain is converted to be atomic. The listeners are modified accordingly.
This is done in the first seven patches.

The eighth patch adds a change sequence counter to ensure the integrity
of the FIB dump, which is finally introduced in the following patch. The
last two patches modify current listeners of the FIB notification chain
to invoke the dump during their init.

---
v1->v2:
- Add a sequence counter to ensure the integrity of the FIB dump
  (David S. Miller, Hannes Frederic Sowa).
- Protect notifications from re-ordering in listeners by using an
  ordered workqueue (Hannes Frederic Sowa).
- Introduce fib_info_hold() (Jiri Pirko).
- Relieve rocker from the need to invoke the FIB dump by registering
  to the FIB notification chain prior to ports creation.
 
Ido Schimmel (11):
  ipv4: fib: Export free_fib_info()
  ipv4: fib: Add fib_info_hold() helper
  mlxsw: core: Create an ordered workqueue for FIB offload
  mlxsw: spectrum_router: Implement FIB offload in deferred work
  rocker: Create an ordered workqueue for FIB offload
  rocker: Implement FIB offload in deferred work
  ipv4: fib: Convert FIB notification chain to be atomic
  ipv4: fib: Allow for consistent FIB dumping
  ipv4: fib: Add an API to request a FIB dump
  mlxsw: spectrum_router: Request a dump of FIB tables during init
  rocker: Register FIB notifier before creating ports

 drivers/net/ethernet/mellanox/mlxsw/core.c         |  22 ++++
 drivers/net/ethernet/mellanox/mlxsw/core.h         |   2 +
 .../net/ethernet/mellanox/mlxsw/spectrum_router.c  |  88 ++++++++++++--
 drivers/net/ethernet/rocker/rocker.h               |   1 +
 drivers/net/ethernet/rocker/rocker_main.c          |  78 +++++++++++--
 drivers/net/ethernet/rocker/rocker_ofdpa.c         |   1 +
 include/net/ip_fib.h                               |   6 +
 include/net/netns/ipv4.h                           |   2 +
 net/ipv4/fib_frontend.c                            |   2 +
 net/ipv4/fib_semantics.c                           |   1 +
 net/ipv4/fib_trie.c                                | 126 ++++++++++++++++++++-
 11 files changed, 303 insertions(+), 26 deletions(-)

-- 
2.7.4
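
One way to picture the change sequence counter mentioned above is the
following userspace sketch. It is not the code from the patchset: the
demo_* names, the single-threaded "hook" that stands in for a concurrent
writer, and the retry-until-stable policy are all assumptions made for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_table {
	unsigned int seq;	/* bumped on every change */
	int entries;
};

typedef void (*demo_hook)(struct demo_table *t);

/* Hypothetical writer: every modification bumps the counter. */
static void demo_table_change(struct demo_table *t, int delta)
{
	t->entries += delta;
	t->seq++;
}

/* Hook that models a writer racing with the dump. */
static void demo_racing_writer(struct demo_table *t)
{
	demo_table_change(t, 1);
}

/* One dump attempt; returns true if the table did not change meanwhile. */
static bool demo_dump_once(struct demo_table *t, int *snapshot,
			   demo_hook mid_dump)
{
	unsigned int seq = t->seq;

	*snapshot = t->entries;
	if (mid_dump)
		mid_dump(t);
	return seq == t->seq;
}

/* Retries until a consistent snapshot is taken; returns the attempt count. */
static int demo_dump(struct demo_table *t, int *snapshot,
		     demo_hook first_attempt_hook)
{
	int attempts = 1;

	if (demo_dump_once(t, snapshot, first_attempt_hook))
		return attempts;
	do {
		attempts++;
	} while (!demo_dump_once(t, snapshot, NULL));
	return attempts;
}
```

A dump that races with one change is detected through the counter mismatch
and repeated, so the listener ends up with the post-change view.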

* pull-request: can 2016-11-23
From: Marc Kleine-Budde @ 2016-11-23 14:34 UTC
  To: netdev; +Cc: davem, linux-can, kernel

Hello David,

this is a pull request for net/master.

The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the CAN-FD
support, which may cause an out-of-bounds access otherwise.

regards,
Marc
---

The following changes since commit c9b8af1330198ae241cd545e1f040019010d44d9:

  flow_dissect: call init_default_flow_dissectors() earlier (2016-11-22 14:44:01 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can.git tags/linux-can-fixes-for-4.9-20161123

for you to fetch changes up to 5499a6b22e5508b921c447757685b0a5e40a07ed:

  can: bcm: fix support for CAN FD frames (2016-11-23 15:22:18 +0100)

----------------------------------------------------------------
linux-can-fixes-for-4.9-20161123

----------------------------------------------------------------
Oliver Hartkopp (1):
      can: bcm: fix support for CAN FD frames

 net/can/bcm.c | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

* [patch net-next v2 01/11] ipv4: fib: Export free_fib_info()
From: Jiri Pirko @ 2016-11-23 14:34 UTC
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

The FIB notification chain is going to be converted to an atomic chain,
which means switchdev drivers will have to offload FIB entries in
deferred work, as hardware operations entail sleeping.

However, while the work is queued fib info might be freed, so a
reference must be taken. To release the reference (and potentially free
the fib info) fib_info_put() will be called, which in turn calls
free_fib_info().

Export free_fib_info() so that modules will be able to invoke
fib_info_put().

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/ipv4/fib_semantics.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 388d3e2..c1bc1e9 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -234,6 +234,7 @@ void free_fib_info(struct fib_info *fi)
 #endif
 	call_rcu(&fi->rcu, free_fib_info_rcu);
 }
+EXPORT_SYMBOL_GPL(free_fib_info);
 
 void fib_release_info(struct fib_info *fi)
 {
-- 
2.7.4

* [patch net-next v2 02/11] ipv4: fib: Add fib_info_hold() helper
From: Jiri Pirko @ 2016-11-23 14:34 UTC
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

As explained in the previous commit, modules are going to need to take a
reference on fib info and then drop it using fib_info_put().

Add the fib_info_hold() helper to make the code more readable and also
symmetric with fib_info_put().

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/ip_fib.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index f390c3b..6c67b93 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -397,6 +397,11 @@ static inline void fib_combine_itag(u32 *itag, const struct fib_result *res)
 
 void free_fib_info(struct fib_info *fi);
 
+static inline void fib_info_hold(struct fib_info *fi)
+{
+	atomic_inc(&fi->fib_clntref);
+}
+
 static inline void fib_info_put(struct fib_info *fi)
 {
 	if (atomic_dec_and_test(&fi->fib_clntref))
-- 
2.7.4
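
The hold/put symmetry can be sketched in userspace with C11 atomics. The
demo_* names are hypothetical, and the "freed" flag stands in for the call to
free_fib_info() so that the effect can be observed.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_fib_info {
	atomic_int clntref;
	bool freed;	/* set where the real code would free the object */
};

/* Take an additional reference. */
static void demo_fib_info_hold(struct demo_fib_info *fi)
{
	atomic_fetch_add(&fi->clntref, 1);
}

/* Drop a reference; the caller that drops the last one releases the object. */
static void demo_fib_info_put(struct demo_fib_info *fi)
{
	/* atomic_fetch_sub() returns the value before the decrement. */
	if (atomic_fetch_sub(&fi->clntref, 1) == 1)
		fi->freed = true;
}
```

A notifier that takes a reference before queueing work keeps the object
alive until the matching put in the work handler, even if the original owner
drops its own reference in between.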

* [patch net-next v2 03/11] mlxsw: core: Create an ordered workqueue for FIB offload
From: Jiri Pirko @ 2016-11-23 14:34 UTC
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

We're going to start processing FIB entries addition / deletion events
in deferred work. These work items must be processed in the order they
were submitted or otherwise we can have differences between the kernel's
FIB table and the device's.

Solve this by creating an ordered workqueue to which these work items
will be submitted. Note that we can't simply convert the current
workqueue to be ordered, as EMAD retransmissions are also processed in
deferred work.

Later on, we can migrate other work items to this workqueue, such as FDB
notification processing and nexthop resolution, since they all take the
same lock anyway.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/core.c | 22 ++++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlxsw/core.h |  2 ++
 2 files changed, 24 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.c b/drivers/net/ethernet/mellanox/mlxsw/core.c
index bcd7251..a8d9a9c 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.c
@@ -77,6 +77,7 @@ static const char mlxsw_core_driver_name[] = "mlxsw_core";
 static struct dentry *mlxsw_core_dbg_root;
 
 static struct workqueue_struct *mlxsw_wq;
+static struct workqueue_struct *mlxsw_owq;
 
 struct mlxsw_core_pcpu_stats {
 	u64			trap_rx_packets[MLXSW_TRAP_ID_MAX];
@@ -1857,6 +1858,18 @@ int mlxsw_core_schedule_dw(struct delayed_work *dwork, unsigned long delay)
 }
 EXPORT_SYMBOL(mlxsw_core_schedule_dw);
 
+int mlxsw_core_schedule_odw(struct delayed_work *dwork, unsigned long delay)
+{
+	return queue_delayed_work(mlxsw_owq, dwork, delay);
+}
+EXPORT_SYMBOL(mlxsw_core_schedule_odw);
+
+void mlxsw_core_flush_owq(void)
+{
+	flush_workqueue(mlxsw_owq);
+}
+EXPORT_SYMBOL(mlxsw_core_flush_owq);
+
 static int __init mlxsw_core_module_init(void)
 {
 	int err;
@@ -1864,6 +1877,12 @@ static int __init mlxsw_core_module_init(void)
 	mlxsw_wq = alloc_workqueue(mlxsw_core_driver_name, WQ_MEM_RECLAIM, 0);
 	if (!mlxsw_wq)
 		return -ENOMEM;
+	mlxsw_owq = alloc_ordered_workqueue("%s_ordered", WQ_MEM_RECLAIM,
+					    mlxsw_core_driver_name);
+	if (!mlxsw_owq) {
+		err = -ENOMEM;
+		goto err_alloc_ordered_workqueue;
+	}
 	mlxsw_core_dbg_root = debugfs_create_dir(mlxsw_core_driver_name, NULL);
 	if (!mlxsw_core_dbg_root) {
 		err = -ENOMEM;
@@ -1872,6 +1891,8 @@ static int __init mlxsw_core_module_init(void)
 	return 0;
 
 err_debugfs_create_dir:
+	destroy_workqueue(mlxsw_owq);
+err_alloc_ordered_workqueue:
 	destroy_workqueue(mlxsw_wq);
 	return err;
 }
@@ -1879,6 +1900,7 @@ static int __init mlxsw_core_module_init(void)
 static void __exit mlxsw_core_module_exit(void)
 {
 	debugfs_remove_recursive(mlxsw_core_dbg_root);
+	destroy_workqueue(mlxsw_owq);
 	destroy_workqueue(mlxsw_wq);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlxsw/core.h b/drivers/net/ethernet/mellanox/mlxsw/core.h
index 3de8955..f676ee9 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/core.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/core.h
@@ -156,6 +156,8 @@ enum devlink_port_type mlxsw_core_port_type_get(struct mlxsw_core *mlxsw_core,
 						u8 local_port);
 
 int mlxsw_core_schedule_dw(struct delayed_work *dwork, unsigned long delay);
+int mlxsw_core_schedule_odw(struct delayed_work *dwork, unsigned long delay);
+void mlxsw_core_flush_owq(void);
 
 #define MLXSW_CONFIG_PROFILE_SWID_COUNT 8
 
-- 
2.7.4
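
Why submission order matters can be shown with a tiny model. The sketch
assumes a single route and two hypothetical event types; it only illustrates
that applying the same events in a different order yields a different final
state, which is what the ordered workqueue is meant to rule out.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_event { DEMO_ENTRY_ADD, DEMO_ENTRY_DEL };

/*
 * Applies the events in array order to one route and reports whether the
 * route is present in the (modelled) device table afterwards.
 */
static bool demo_apply(const enum demo_event *events, size_t n)
{
	bool present = false;

	for (size_t i = 0; i < n; i++)
		present = (events[i] == DEMO_ENTRY_ADD);
	return present;
}
```

If the kernel emitted ADD followed by DEL, the route is gone from the kernel's
table; a device that processed DEL before ADD would still have it programmed.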

* [patch net-next v2 04/11] mlxsw: spectrum_router: Implement FIB offload in deferred work
From: Jiri Pirko @ 2016-11-23 14:34 UTC
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

FIB offload is currently done in process context with RTNL held, but
we're about to dump the FIB tables in RCU critical section, so we can no
longer sleep.

Instead, defer the operation to process context using deferred work. Make
sure fib info isn't freed while the work is queued by taking a reference
on it and releasing it after the operation is done.

Deferring the operation is valid because the upper layers always assume
the operation was successful. If it's not, then the driver-specific
abort mechanism is called and all routed traffic is directed to slow
path.

The work items are submitted to an ordered workqueue to prevent a
mismatch between the kernel's FIB table and the device's.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 .../net/ethernet/mellanox/mlxsw/spectrum_router.c  | 72 +++++++++++++++++++---
 1 file changed, 62 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 683f045..14bed1d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -593,6 +593,14 @@ static void mlxsw_sp_router_fib_flush(struct mlxsw_sp *mlxsw_sp);
 
 static void mlxsw_sp_vrs_fini(struct mlxsw_sp *mlxsw_sp)
 {
+	/* At this stage we're guaranteed not to have new incoming
+	 * FIB notifications and the work queue is free from FIBs
+	 * sitting on top of mlxsw netdevs. However, we can still
+	 * have other FIBs queued. Flush the queue before flushing
+	 * the device's tables. No need for locks, as we're the only
+	 * writer.
+	 */
+	mlxsw_core_flush_owq();
 	mlxsw_sp_router_fib_flush(mlxsw_sp);
 	kfree(mlxsw_sp->router.vrs);
 }
@@ -1948,30 +1956,74 @@ static void __mlxsw_sp_router_fini(struct mlxsw_sp *mlxsw_sp)
 	kfree(mlxsw_sp->rifs);
 }
 
-static int mlxsw_sp_router_fib_event(struct notifier_block *nb,
-				     unsigned long event, void *ptr)
+struct mlxsw_sp_fib_event_work {
+	struct delayed_work dw;
+	struct fib_entry_notifier_info fen_info;
+	struct mlxsw_sp *mlxsw_sp;
+	unsigned long event;
+};
+
+static void mlxsw_sp_router_fib_event_work(struct work_struct *work)
 {
-	struct mlxsw_sp *mlxsw_sp = container_of(nb, struct mlxsw_sp, fib_nb);
-	struct fib_entry_notifier_info *fen_info = ptr;
+	struct mlxsw_sp_fib_event_work *fib_work =
+		container_of(work, struct mlxsw_sp_fib_event_work, dw.work);
+	struct mlxsw_sp *mlxsw_sp = fib_work->mlxsw_sp;
 	int err;
 
-	if (!net_eq(fen_info->info.net, &init_net))
-		return NOTIFY_DONE;
-
-	switch (event) {
+	/* Protect internal structures from changes */
+	rtnl_lock();
+	switch (fib_work->event) {
 	case FIB_EVENT_ENTRY_ADD:
-		err = mlxsw_sp_router_fib4_add(mlxsw_sp, fen_info);
+		err = mlxsw_sp_router_fib4_add(mlxsw_sp, &fib_work->fen_info);
 		if (err)
 			mlxsw_sp_router_fib4_abort(mlxsw_sp);
+		fib_info_put(fib_work->fen_info.fi);
 		break;
 	case FIB_EVENT_ENTRY_DEL:
-		mlxsw_sp_router_fib4_del(mlxsw_sp, fen_info);
+		mlxsw_sp_router_fib4_del(mlxsw_sp, &fib_work->fen_info);
+		fib_info_put(fib_work->fen_info.fi);
 		break;
 	case FIB_EVENT_RULE_ADD: /* fall through */
 	case FIB_EVENT_RULE_DEL:
 		mlxsw_sp_router_fib4_abort(mlxsw_sp);
 		break;
 	}
+	rtnl_unlock();
+	kfree(fib_work);
+}
+
+/* Called with rcu_read_lock() */
+static int mlxsw_sp_router_fib_event(struct notifier_block *nb,
+				     unsigned long event, void *ptr)
+{
+	struct mlxsw_sp *mlxsw_sp = container_of(nb, struct mlxsw_sp, fib_nb);
+	struct mlxsw_sp_fib_event_work *fib_work;
+	struct fib_notifier_info *info = ptr;
+
+	if (!net_eq(info->net, &init_net))
+		return NOTIFY_DONE;
+
+	fib_work = kzalloc(sizeof(*fib_work), GFP_ATOMIC);
+	if (WARN_ON(!fib_work))
+		return NOTIFY_BAD;
+
+	INIT_DELAYED_WORK(&fib_work->dw, mlxsw_sp_router_fib_event_work);
+	fib_work->mlxsw_sp = mlxsw_sp;
+	fib_work->event = event;
+
+	switch (event) {
+	case FIB_EVENT_ENTRY_ADD: /* fall through */
+	case FIB_EVENT_ENTRY_DEL:
+		memcpy(&fib_work->fen_info, ptr, sizeof(fib_work->fen_info));
+		/* Take reference on fib_info to prevent it from being
+		 * freed while work is queued. Release it afterwards.
+		 */
+		fib_info_hold(fib_work->fen_info.fi);
+		break;
+	}
+
+	mlxsw_core_schedule_odw(&fib_work->dw, 0);
+
 	return NOTIFY_DONE;
 }
 
-- 
2.7.4
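
The deferred-work pattern in this patch embeds the work item in a larger
context struct and recovers that context in the handler via container_of().
A userspace sketch of just that mechanism follows; the demo_* names are
hypothetical, and demo_run() stands in for the workqueue by calling the
handler directly.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_work {
	void (*fn)(struct demo_work *work);
};

/* Context copied at notification time and consumed later by the handler. */
struct demo_fib_event_work {
	unsigned long event;
	struct demo_work work;
	unsigned long handled_event;
};

/* Handler only receives the embedded member and walks back to the context. */
static void demo_fib_event_handler(struct demo_work *work)
{
	struct demo_fib_event_work *fib_work =
		demo_container_of(work, struct demo_fib_event_work, work);

	fib_work->handled_event = fib_work->event;
}

/* Stand-in for queueing: runs the callback immediately. */
static void demo_run(struct demo_work *work)
{
	work->fn(work);
}
```

Because the notifier copies everything the handler needs into the context
struct, the handler does not depend on data that is only valid during the
notification call.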

* [patch net-next v2 05/11] rocker: Create an ordered workqueue for FIB offload
From: Jiri Pirko @ 2016-11-23 14:34 UTC
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

As explained in the previous patches, we need to process FIB entries
addition / deletion events in FIFO order or otherwise we can have a
mismatch between the kernel's FIB table and the device's.

Create an ordered workqueue for rocker to which these work items will be
submitted.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/rocker/rocker.h      |  1 +
 drivers/net/ethernet/rocker/rocker_main.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker.h b/drivers/net/ethernet/rocker/rocker.h
index 2eb9b49..ee9675d 100644
--- a/drivers/net/ethernet/rocker/rocker.h
+++ b/drivers/net/ethernet/rocker/rocker.h
@@ -72,6 +72,7 @@ struct rocker {
 	struct rocker_dma_ring_info event_ring;
 	struct notifier_block fib_nb;
 	struct rocker_world_ops *wops;
+	struct workqueue_struct *rocker_owq;
 	void *wpriv;
 };
 
diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 67df4cf..424be96 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -28,6 +28,7 @@
 #include <linux/if_bridge.h>
 #include <linux/bitops.h>
 #include <linux/ctype.h>
+#include <linux/workqueue.h>
 #include <net/switchdev.h>
 #include <net/rtnetlink.h>
 #include <net/netevent.h>
@@ -2754,6 +2755,13 @@ static int rocker_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 		goto err_request_event_irq;
 	}
 
+	rocker->rocker_owq = alloc_ordered_workqueue(rocker_driver_name,
+						     WQ_MEM_RECLAIM);
+	if (!rocker->rocker_owq) {
+		err = -ENOMEM;
+		goto err_alloc_ordered_workqueue;
+	}
+
 	rocker->hw.id = rocker_read64(rocker, SWITCH_ID);
 
 	err = rocker_probe_ports(rocker);
@@ -2771,6 +2779,8 @@ static int rocker_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	return 0;
 
 err_probe_ports:
+	destroy_workqueue(rocker->rocker_owq);
+err_alloc_ordered_workqueue:
 	free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_EVENT), rocker);
 err_request_event_irq:
 	free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_CMD), rocker);
@@ -2799,6 +2809,7 @@ static void rocker_remove(struct pci_dev *pdev)
 	unregister_fib_notifier(&rocker->fib_nb);
 	rocker_write32(rocker, CONTROL, ROCKER_CONTROL_RESET);
 	rocker_remove_ports(rocker);
+	destroy_workqueue(rocker->rocker_owq);
 	free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_EVENT), rocker);
 	free_irq(rocker_msix_vector(rocker, ROCKER_MSIX_VEC_CMD), rocker);
 	rocker_dma_rings_fini(rocker);
-- 
2.7.4

* [patch net-next v2 06/11] rocker: Implement FIB offload in deferred work
From: Jiri Pirko @ 2016-11-23 14:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

Convert rocker to offload FIBs in deferred work in a similar fashion to
mlxsw, which was converted in the previous patches.
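
The deferral pattern this patch applies can be sketched in plain userspace C. This is only an illustration under stated assumptions: `route_info`, `entry_info`, `event_work`, `work_queue`, `notify_event()`, `run_pending()` and `record_event()` are hypothetical stand-ins, not kernel or rocker APIs. A simple FIFO list models the ordered workqueue, and a plain integer models the reference count that the kernel keeps atomically.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the refcounted object (fib_info in the patch). */
struct route_info {
	int refcnt;		/* plain counter; the kernel uses an atomic */
};

/* Hypothetical stand-in for the notifier payload. */
struct entry_info {
	unsigned int dst;
	int dst_len;
	struct route_info *ri;
};

struct event_work {
	struct event_work *next;	/* FIFO link, models an ordered queue */
	unsigned long event;
	struct entry_info info;		/* private copy of the caller's data */
};

struct work_queue {
	struct event_work *head, *tail;
};

/* Atomic-context side: must not block, so only copy, pin and enqueue. */
static int notify_event(struct work_queue *q, unsigned long event,
			const struct entry_info *info)
{
	struct event_work *w;

	assert(info && info->ri);
	w = calloc(1, sizeof(*w));
	if (!w)
		return -1;
	w->event = event;
	memcpy(&w->info, info, sizeof(w->info));
	w->info.ri->refcnt++;		/* keep ri alive while queued */

	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
	return 0;
}

/* Process-context side: run items in submission order, then unpin. */
static int run_pending(struct work_queue *q,
		       void (*handler)(unsigned long, const struct entry_info *))
{
	int handled = 0;

	while (q->head) {
		struct event_work *w = q->head;

		q->head = w->next;
		if (!q->head)
			q->tail = NULL;
		handler(w->event, &w->info);
		w->info.ri->refcnt--;	/* release the reference taken above */
		free(w);
		handled++;
	}
	return handled;
}

/* Example handler: remembers the order in which events were seen. */
static unsigned long seen[8];
static int nseen;

static void record_event(unsigned long event, const struct entry_info *info)
{
	(void)info;
	if (nseen < 8)
		seen[nseen++] = event;
}
```

The point of the copy plus reference is that the notifier's argument is only valid for the duration of the call, while the queued item may run much later.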

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/rocker/rocker_main.c  | 58 +++++++++++++++++++++++++-----
 drivers/net/ethernet/rocker/rocker_ofdpa.c |  1 +
 2 files changed, 51 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index 424be96..914e9e1 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -2166,28 +2166,70 @@ static const struct switchdev_ops rocker_port_switchdev_ops = {
 	.switchdev_port_obj_dump	= rocker_port_obj_dump,
 };
 
-static int rocker_router_fib_event(struct notifier_block *nb,
-				   unsigned long event, void *ptr)
+struct rocker_fib_event_work {
+	struct work_struct work;
+	struct fib_entry_notifier_info fen_info;
+	struct rocker *rocker;
+	unsigned long event;
+};
+
+static void rocker_router_fib_event_work(struct work_struct *work)
 {
-	struct rocker *rocker = container_of(nb, struct rocker, fib_nb);
-	struct fib_entry_notifier_info *fen_info = ptr;
+	struct rocker_fib_event_work *fib_work =
+		container_of(work, struct rocker_fib_event_work, work);
+	struct rocker *rocker = fib_work->rocker;
 	int err;
 
-	switch (event) {
+	/* Protect internal structures from changes */
+	rtnl_lock();
+	switch (fib_work->event) {
 	case FIB_EVENT_ENTRY_ADD:
-		err = rocker_world_fib4_add(rocker, fen_info);
+		err = rocker_world_fib4_add(rocker, &fib_work->fen_info);
 		if (err)
 			rocker_world_fib4_abort(rocker);
-		else
+		fib_info_put(fib_work->fen_info.fi);
 		break;
 	case FIB_EVENT_ENTRY_DEL:
-		rocker_world_fib4_del(rocker, fen_info);
+		rocker_world_fib4_del(rocker, &fib_work->fen_info);
+		fib_info_put(fib_work->fen_info.fi);
 		break;
 	case FIB_EVENT_RULE_ADD: /* fall through */
 	case FIB_EVENT_RULE_DEL:
 		rocker_world_fib4_abort(rocker);
 		break;
 	}
+	rtnl_unlock();
+	kfree(fib_work);
+}
+
+/* Called with rcu_read_lock() */
+static int rocker_router_fib_event(struct notifier_block *nb,
+				   unsigned long event, void *ptr)
+{
+	struct rocker *rocker = container_of(nb, struct rocker, fib_nb);
+	struct rocker_fib_event_work *fib_work;
+
+	fib_work = kzalloc(sizeof(*fib_work), GFP_ATOMIC);
+	if (WARN_ON(!fib_work))
+		return NOTIFY_BAD;
+
+	INIT_WORK(&fib_work->work, rocker_router_fib_event_work);
+	fib_work->rocker = rocker;
+	fib_work->event = event;
+
+	switch (event) {
+	case FIB_EVENT_ENTRY_ADD: /* fall through */
+	case FIB_EVENT_ENTRY_DEL:
+		memcpy(&fib_work->fen_info, ptr, sizeof(fib_work->fen_info));
+		/* Take a reference on fib_info to prevent it from being
+		 * freed while work is queued. Release it afterwards.
+		 */
+		fib_info_hold(fib_work->fen_info.fi);
+		break;
+	}
+
+	queue_work(rocker->rocker_owq, &fib_work->work);
+
 	return NOTIFY_DONE;
 }
 
diff --git a/drivers/net/ethernet/rocker/rocker_ofdpa.c b/drivers/net/ethernet/rocker/rocker_ofdpa.c
index 4ca4613..7cd76b6 100644
--- a/drivers/net/ethernet/rocker/rocker_ofdpa.c
+++ b/drivers/net/ethernet/rocker/rocker_ofdpa.c
@@ -2516,6 +2516,7 @@ static void ofdpa_fini(struct rocker *rocker)
 	int bkt;
 
 	del_timer_sync(&ofdpa->fdb_cleanup_timer);
+	flush_workqueue(rocker->rocker_owq);
 
 	spin_lock_irqsave(&ofdpa->flow_tbl_lock, flags);
 	hash_for_each_safe(ofdpa->flow_tbl, bkt, tmp, flow_entry, entry)
-- 
2.7.4

^ permalink raw reply related

* [patch net-next v2 07/11] ipv4: fib: Convert FIB notification chain to be atomic
From: Jiri Pirko @ 2016-11-23 14:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

In order not to hold RTNL for long periods of time we're going to dump
the FIB tables using RCU.

Convert the FIB notification chain to be atomic, as we can't block in
RCU critical sections.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 net/ipv4/fib_trie.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 026f309..9bfce0d 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -84,17 +84,17 @@
 #include <trace/events/fib.h>
 #include "fib_lookup.h"
 
-static BLOCKING_NOTIFIER_HEAD(fib_chain);
+static ATOMIC_NOTIFIER_HEAD(fib_chain);
 
 int register_fib_notifier(struct notifier_block *nb)
 {
-	return blocking_notifier_chain_register(&fib_chain, nb);
+	return atomic_notifier_chain_register(&fib_chain, nb);
 }
 EXPORT_SYMBOL(register_fib_notifier);
 
 int unregister_fib_notifier(struct notifier_block *nb)
 {
-	return blocking_notifier_chain_unregister(&fib_chain, nb);
+	return atomic_notifier_chain_unregister(&fib_chain, nb);
 }
 EXPORT_SYMBOL(unregister_fib_notifier);
 
@@ -102,7 +102,7 @@ int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
 		       struct fib_notifier_info *info)
 {
 	info->net = net;
-	return blocking_notifier_call_chain(&fib_chain, event_type, info);
+	return atomic_notifier_call_chain(&fib_chain, event_type, info);
 }
 
 static int call_fib_entry_notifiers(struct net *net,
-- 
2.7.4

^ permalink raw reply related

* [patch net-next v2 08/11] ipv4: fib: Allow for consistent FIB dumping
From: Jiri Pirko @ 2016-11-23 14:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

The next patch will enable listeners of the FIB notification chain to
request a dump of the FIB tables. However, since RTNL isn't taken during
the dump, it's possible for the FIB tables to change mid-dump, which
will result in inconsistency between the listener's table and the
kernel's.

Allow listeners to know about changes that occurred mid-dump, by adding
a change sequence counter to each net namespace. The counter is
incremented just before a notification is sent in the FIB chain.
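
As a rough userspace illustration of the idea, the check can be modelled with C11 atomics. Everything here is a hypothetical stand-in: `table_ns`, `table_change()`, `table_dump()` and `add_one_route()` are invented for the sketch, and the `mid_dump` callback merely simulates a writer that runs while the dump is in progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-namespace state. */
struct table_ns {
	atomic_int change_seq;	/* bumped before every notification */
	int routes;		/* toy payload: number of routes */
};

/* Writer side: bump the counter first, then apply the change. */
static void table_change(struct table_ns *ns, int delta)
{
	atomic_fetch_add(&ns->change_seq, 1);
	ns->routes += delta;
}

/* Reader side: returns true only if no change was announced mid-dump.
 * mid_dump, if non-NULL, simulates a concurrent writer.
 */
static bool table_dump(struct table_ns *ns, int *snapshot,
		       void (*mid_dump)(struct table_ns *))
{
	int seq = atomic_load(&ns->change_seq);

	assert(snapshot != NULL);
	*snapshot = ns->routes;
	if (mid_dump)
		mid_dump(ns);
	return atomic_load(&ns->change_seq) == seq;
}

/* Simulated concurrent writer used by the example. */
static void add_one_route(struct table_ns *ns)
{
	table_change(ns, 1);
}
```

A dump that returns false tells the listener only that its snapshot may be stale; what to do about it (for instance flush and retry) is left to the caller.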

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/netns/ipv4.h | 2 ++
 net/ipv4/fib_frontend.c  | 2 ++
 net/ipv4/fib_trie.c      | 1 +
 3 files changed, 5 insertions(+)

diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 7adf438..d236c08 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -136,5 +136,7 @@ struct netns_ipv4 {
 	int sysctl_fib_multipath_use_neigh;
 #endif
 	atomic_t	rt_genid;
+
+	atomic_t	fib_seq;
 };
 #endif
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 121384b..cf8c867 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1219,6 +1219,8 @@ static int __net_init ip_fib_net_init(struct net *net)
 	int err;
 	size_t size = sizeof(struct hlist_head) * FIB_TABLE_HASHSZ;
 
+	atomic_set(&net->ipv4.fib_seq, 0);
+
 	/* Avoid false sharing : Use at least a full cache line */
 	size = max_t(size_t, size, L1_CACHE_BYTES);
 
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 9bfce0d..b1d2d09 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -101,6 +101,7 @@ EXPORT_SYMBOL(unregister_fib_notifier);
 int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
 		       struct fib_notifier_info *info)
 {
+	atomic_inc(&net->ipv4.fib_seq);
 	info->net = net;
 	return atomic_notifier_call_chain(&fib_chain, event_type, info);
 }
-- 
2.7.4

^ permalink raw reply related

* [patch net-next v2 09/11] ipv4: fib: Add an API to request a FIB dump
From: Jiri Pirko @ 2016-11-23 14:34 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

Commit b90eb7549499 ("fib: introduce FIB notification infrastructure")
introduced a new notification chain to notify listeners (e.g., switchdev
drivers) about addition and deletion of routes.

However, upon registration to the chain the FIB tables can already be
populated, which means potential listeners will have an incomplete view
of the tables.

Solve that by adding an API to request a FIB dump. The dump itself is
done using RCU in order not to starve consumers that need RTNL to make
progress.

For each net namespace the integrity of the dump is ensured by reading
the atomic change sequence counter before and after the dump. This
allows us to avoid the problematic situation in which the dumping
process sends an ENTRY_ADD notification following an ENTRY_DEL generated by
another process holding RTNL.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 include/net/ip_fib.h |   1 +
 net/ipv4/fib_trie.c  | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 6c67b93..c76303e 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -221,6 +221,7 @@ enum fib_event_type {
 	FIB_EVENT_RULE_DEL,
 };
 
+bool fib_notifier_dump(struct notifier_block *nb);
 int register_fib_notifier(struct notifier_block *nb);
 int unregister_fib_notifier(struct notifier_block *nb);
 int call_fib_notifiers(struct net *net, enum fib_event_type event_type,
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index b1d2d09..9770edfe 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -86,6 +86,67 @@
 
 static ATOMIC_NOTIFIER_HEAD(fib_chain);
 
+static int call_fib_notifier(struct notifier_block *nb, struct net *net,
+			     enum fib_event_type event_type,
+			     struct fib_notifier_info *info)
+{
+	info->net = net;
+	return nb->notifier_call(nb, event_type, info);
+}
+
+static void fib_rules_notify(struct net *net, struct notifier_block *nb,
+			     enum fib_event_type event_type)
+{
+#ifdef CONFIG_IP_MULTIPLE_TABLES
+	struct fib_notifier_info info;
+
+	if (net->ipv4.fib_has_custom_rules)
+		call_fib_notifier(nb, net, event_type, &info);
+#endif
+}
+
+static void fib_notify(struct net *net, struct notifier_block *nb,
+		       enum fib_event_type event_type);
+
+static int call_fib_entry_notifier(struct notifier_block *nb, struct net *net,
+				   enum fib_event_type event_type, u32 dst,
+				   int dst_len, struct fib_info *fi,
+				   u8 tos, u8 type, u32 tb_id, u32 nlflags)
+{
+	struct fib_entry_notifier_info info = {
+		.dst = dst,
+		.dst_len = dst_len,
+		.fi = fi,
+		.tos = tos,
+		.type = type,
+		.tb_id = tb_id,
+		.nlflags = nlflags,
+	};
+	return call_fib_notifier(nb, net, event_type, &info.info);
+}
+
+bool fib_notifier_dump(struct notifier_block *nb)
+{
+	struct net *net;
+	bool ret = true;
+
+	rcu_read_lock();
+	for_each_net_rcu(net) {
+		int fib_seq = atomic_read(&net->ipv4.fib_seq);
+
+		fib_rules_notify(net, nb, FIB_EVENT_RULE_ADD);
+		fib_notify(net, nb, FIB_EVENT_ENTRY_ADD);
+		if (atomic_read(&net->ipv4.fib_seq) != fib_seq) {
+			ret = false;
+			goto out_unlock;
+		}
+	}
+out_unlock:
+	rcu_read_unlock();
+	return ret;
+}
+EXPORT_SYMBOL(fib_notifier_dump);
+
 int register_fib_notifier(struct notifier_block *nb)
 {
 	return atomic_notifier_chain_register(&fib_chain, nb);
@@ -1902,6 +1963,62 @@ int fib_table_flush(struct net *net, struct fib_table *tb)
 	return found;
 }
 
+static void fib_leaf_notify(struct net *net, struct key_vector *l,
+			    struct fib_table *tb, struct notifier_block *nb,
+			    enum fib_event_type event_type)
+{
+	struct fib_alias *fa;
+
+	hlist_for_each_entry_rcu(fa, &l->leaf, fa_list) {
+		struct fib_info *fi = fa->fa_info;
+
+		if (!fi)
+			continue;
+
+		/* local and main table can share the same trie,
+		 * so don't notify twice for the same entry.
+		 */
+		if (tb->tb_id != fa->tb_id)
+			continue;
+
+		call_fib_entry_notifier(nb, net, event_type, l->key,
+					KEYLENGTH - fa->fa_slen, fi, fa->fa_tos,
+					fa->fa_type, fa->tb_id, 0);
+	}
+}
+
+static void fib_table_notify(struct net *net, struct fib_table *tb,
+			     struct notifier_block *nb,
+			     enum fib_event_type event_type)
+{
+	struct trie *t = (struct trie *)tb->tb_data;
+	struct key_vector *l, *tp = t->kv;
+	t_key key = 0;
+
+	while ((l = leaf_walk_rcu(&tp, key)) != NULL) {
+		fib_leaf_notify(net, l, tb, nb, event_type);
+
+		key = l->key + 1;
+		/* stop in case of wrap around */
+		if (key < l->key)
+			break;
+	}
+}
+
+static void fib_notify(struct net *net, struct notifier_block *nb,
+		       enum fib_event_type event_type)
+{
+	unsigned int h;
+
+	for (h = 0; h < FIB_TABLE_HASHSZ; h++) {
+		struct hlist_head *head = &net->ipv4.fib_table_hash[h];
+		struct fib_table *tb;
+
+		hlist_for_each_entry_rcu(tb, head, tb_hlist)
+			fib_table_notify(net, tb, nb, event_type);
+	}
+}
+
 static void __trie_free_rcu(struct rcu_head *head)
 {
 	struct fib_table *tb = container_of(head, struct fib_table, rcu);
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH] can: bcm: fix support for CAN FD frames
From: Marc Kleine-Budde @ 2016-11-23 14:35 UTC (permalink / raw)
  To: Oliver Hartkopp, linux-can, Andrey Konovalov, netdev
In-Reply-To: <20161123133325.1812-1-socketcan@hartkopp.net>



On 11/23/2016 02:33 PM, Oliver Hartkopp wrote:
> Since commit 6f3b911d5f29b98 ("can: bcm: add support for CAN FD frames") the
> CAN broadcast manager supports CAN and CAN FD data frames.
> 
> As these data frames are embedded in struct can[fd]_frames, which have
> different lengths, access to the provided array of CAN frames became
> dependent on op->cfsiz. By using a struct canfd_frame pointer for the array of
> CAN frames, the new offset calculation based on op->cfsiz was accidentally
> applied to CAN FD frame element lengths.
> 
> This fix makes the pointer to the arrays of the different CAN frame types a
> void pointer so that the offset calculation in bytes accesses the correct CAN
> frame elements.
> 
> Reference: http://marc.info/?l=linux-netdev&m=147980658909653
> 
> Reported-by: Andrey Konovalov <andreyknvl@google.com>
> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
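
The pointer-arithmetic pitfall described in the quoted commit message can be illustrated in userspace C. This is a sketch, not the bcm code: `small_frame` and `big_frame` are hypothetical stand-ins for the two frame layouts, and `frame_at()` and `buggy_byte_offset()` are invented names. The only assumption taken from the message is that the element size is carried in a byte count (`cfsiz`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the two frame types: a common header followed by
 * 8 or 64 data bytes.
 */
struct small_frame {
	uint32_t id;
	uint8_t len;
	uint8_t pad[3];
	uint8_t data[8];
};

struct big_frame {
	uint32_t id;
	uint8_t len;
	uint8_t pad[3];
	uint8_t data[64];
};

/* Correct: byte-granular offset from an untyped base pointer. */
static void *frame_at(void *frames, size_t cfsiz, unsigned int idx)
{
	assert(frames != NULL);
	return (uint8_t *)frames + cfsiz * idx;
}

/* Buggy pattern: with a "struct big_frame *base", the expression
 * "base + cfsiz * idx" is scaled by sizeof(struct big_frame) by the
 * compiler, so a byte count used as an index is scaled twice. This
 * returns the byte offset such an expression would produce.
 */
static size_t buggy_byte_offset(size_t cfsiz, unsigned int idx)
{
	return sizeof(struct big_frame) * (cfsiz * idx);
}
```

With an untyped base, the same helper serves both layouts because the caller supplies the element size explicitly.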

Added to can and sent a pull request to David.

Thanks,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |



^ permalink raw reply

* [PATCH v2] cpsw: ethtool: add support for getting/setting EEE registers
From: yegorslists @ 2016-11-23 14:38 UTC (permalink / raw)
  To: netdev
  Cc: linux-omap, grygorii.strashko, mugunthanvnm, roszenrami,
	Yegor Yefremov

From: Yegor Yefremov <yegorslists@googlemail.com>

Add the ability to query and set Energy Efficient Ethernet parameters
via ethtool for applicable devices.

Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
---
Changes:
	v2: make routines static (Rami Rosen)

 drivers/net/ethernet/ti/cpsw.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index c6cff3d..c706540 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -2239,6 +2239,30 @@ static int cpsw_set_channels(struct net_device *ndev,
 	return ret;
 }
 
+static int cpsw_get_eee(struct net_device *ndev, struct ethtool_eee *edata)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int slave_no = cpsw_slave_index(cpsw, priv);
+
+	if (cpsw->slaves[slave_no].phy)
+		return phy_ethtool_get_eee(cpsw->slaves[slave_no].phy, edata);
+	else
+		return -EOPNOTSUPP;
+}
+
+static int cpsw_set_eee(struct net_device *ndev, struct ethtool_eee *edata)
+{
+	struct cpsw_priv *priv = netdev_priv(ndev);
+	struct cpsw_common *cpsw = priv->cpsw;
+	int slave_no = cpsw_slave_index(cpsw, priv);
+
+	if (cpsw->slaves[slave_no].phy)
+		return phy_ethtool_set_eee(cpsw->slaves[slave_no].phy, edata);
+	else
+		return -EOPNOTSUPP;
+}
+
 static const struct ethtool_ops cpsw_ethtool_ops = {
 	.get_drvinfo	= cpsw_get_drvinfo,
 	.get_msglevel	= cpsw_get_msglevel,
@@ -2262,6 +2286,8 @@ static const struct ethtool_ops cpsw_ethtool_ops = {
 	.complete	= cpsw_ethtool_op_complete,
 	.get_channels	= cpsw_get_channels,
 	.set_channels	= cpsw_set_channels,
+	.get_eee	= cpsw_get_eee,
+	.set_eee	= cpsw_set_eee,
 };
 
 static void cpsw_slave_init(struct cpsw_slave *slave, struct cpsw_common *cpsw,
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH] cpsw: ethtool: add support for getting/setting EEE registers
From: Yegor Yefremov @ 2016-11-23 14:40 UTC (permalink / raw)
  To: Rami Rosen
  Cc: Netdev, Linux OMAP Mailing List, Grygorii Strashko,
	N, Mugunthan V
In-Reply-To: <CAKoUArn8ndLaaEEr55DNFOqJL1hcRvpZCZ1e7WqhWZiaAHBiNw@mail.gmail.com>

Hi Rami,

On Wed, Nov 23, 2016 at 3:31 PM, Rami Rosen <roszenrami@gmail.com> wrote:
> Hi, Yegor,
>
> Minor comment: these methods should be static.
>
> +int cpsw_get_eee(struct net_device *ndev, struct ethtool_eee *edata)
> +{
> ...
> ...
> +int cpsw_set_eee(struct net_device *ndev, struct ethtool_eee *edata)
> ...

ACK. Fixed in v2.

Thanks.

Yegor

^ permalink raw reply

* [PATCH] drivers: net: davinci_mdio: use builtin_platform_driver
From: Geliang Tang @ 2016-11-23 14:45 UTC (permalink / raw)
  To: Mugunthan V N, Grygorii Strashko
  Cc: Geliang Tang, linux-omap, netdev, linux-kernel

Use builtin_platform_driver() helper to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
---
 drivers/net/ethernet/ti/davinci_mdio.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_mdio.c b/drivers/net/ethernet/ti/davinci_mdio.c
index 33df340..b3f0a12 100644
--- a/drivers/net/ethernet/ti/davinci_mdio.c
+++ b/drivers/net/ethernet/ti/davinci_mdio.c
@@ -536,11 +536,7 @@ static struct platform_driver davinci_mdio_driver = {
 	.remove = davinci_mdio_remove,
 };
 
-static int __init davinci_mdio_init(void)
-{
-	return platform_driver_register(&davinci_mdio_driver);
-}
-device_initcall(davinci_mdio_init);
+builtin_platform_driver(davinci_mdio_driver);
 
 static void __exit davinci_mdio_exit(void)
 {
-- 
2.9.3

^ permalink raw reply related

* Re: [PATCH net-next 4/5] net: phy: bcm7xxx: Add support for downshift/Wirespeed
From: Andrew Lunn @ 2016-11-23 14:46 UTC (permalink / raw)
  To: Allan W. Nielsen
  Cc: Florian Fainelli, netdev, davem, bcm-kernel-feedback-list,
	raju.lakkaraju, vivien.didelot
In-Reply-To: <20161123114512.GB25778@microsemi.com>

> > > Maybe we should think about this locking a bit. It is normal for the
> > > lock to be held when using ops in the phy driver structure. The
> > > exception is suspend/resume. Maybe we should also take the lock before
> > > calling the phydev->drv->get_tunable() and phydev->drv->set_tunable()?
> > 
> > Yes, that certainly seems like a good approach to me, let me cook a
> > patch doing that.
> 
> Just for my understanding (such that I will not make the same mistake again)...
> 
> Why is it that phy functions such as get_wol need to take the phy_lock and
> others like get_tunable do not?
> 
> I do understand the arguments on why the lock should be held by the caller of
> get_tunable, but I do not understand why the same argument does not apply for
> get_wol.

Hi Allan

phy_ethtool_get_wol and friends probably should take the
phy_lock. This inconsistency is probably leading to locking
bugs. e.g. at803x_set_wol() does a read-modify-write, and does not
take the lock.
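
To make the concern concrete, here is a small userspace sketch of a read-modify-write op that is only safe when the caller holds the lock, plus a wrapper that takes it. All names (`toy_phy`, `toy_lock()`, `toy_unlock()`, `toy_set_bits()`, `toy_ethtool_set()`) are hypothetical, and a boolean flag stands in for the real mutex so the example stays single-threaded and self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PHY model: one 16-bit register and a toy lock. */
struct toy_phy {
	bool locked;		/* stand-in for a mutex */
	uint16_t reg;
};

static void toy_lock(struct toy_phy *phy)
{
	assert(!phy->locked);
	phy->locked = true;
}

static void toy_unlock(struct toy_phy *phy)
{
	assert(phy->locked);
	phy->locked = false;
}

/* Driver op: read-modify-write. Refuses to run without the lock, since
 * another writer could change the register between read and write.
 */
static int toy_set_bits(struct toy_phy *phy, uint16_t mask, uint16_t val)
{
	uint16_t v;

	if (!phy->locked)
		return -1;
	v = phy->reg;					/* read */
	v = (uint16_t)((v & ~mask) | (val & mask));	/* modify */
	phy->reg = v;					/* write */
	return 0;
}

/* Core wrapper: takes the lock around the op, so every driver gets the
 * same guarantee without open-coding it.
 */
static int toy_ethtool_set(struct toy_phy *phy, uint16_t mask, uint16_t val)
{
	int ret;

	toy_lock(phy);
	ret = toy_set_bits(phy, mask, val);
	toy_unlock(phy);
	return ret;
}
```

Putting the lock in the wrapper rather than in each driver op is the design choice under discussion here; the sketch does not model suspend/resume.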

There is no comment in the patch adding phy_ethtool_set_wol() to say
why the lock is not taken, and a quick look at the code does not
suggest a reason why it could not be taken/released by
phy_ethtool_set_wol().

I think it would be a good idea to change this.

phy_suspend()/phy_resume() might have good reasons to avoid the lock,
I've no idea how it is supposed to work. Is there a danger something
else is holding the lock and has already been suspended? I guess not,
otherwise there is little hope suspend would work at all.

	  Andrew

^ permalink raw reply

* Re: [PATCH v2] cpsw: ethtool: add support for getting/setting EEE registers
From: Rami Rosen @ 2016-11-23 14:47 UTC (permalink / raw)
  To: yegorslists
  Cc: Netdev, Linux OMAP Mailing List, grygorii.strashko, mugunthanvnm
In-Reply-To: <1479911913-1761-1-git-send-email-yegorslists@googlemail.com>

Acked-by: Rami Rosen <roszenrami@gmail.com>

^ permalink raw reply

* Re: net/arp: ARP cache aging failed.
From: Hannes Frederic Sowa @ 2016-11-23 14:37 UTC (permalink / raw)
  To: Eric Dumazet, Julian Anastasov; +Cc: yuehaibing, davem, netdev
In-Reply-To: <1479902729.8455.479.camel@edumazet-glaptop3.roam.corp.google.com>

On 23.11.2016 13:05, Eric Dumazet wrote:
> On Wed, 2016-11-23 at 10:33 +0200, Julian Anastasov wrote:
>> 	Hello,
>>
>> On Wed, 23 Nov 2016, yuehaibing wrote:
>>
>>> 	In my topology, HOST1 and HOST3 share one route on HOST2, so a TCP connection between HOST2 and HOST3 may call tcp_ack, which sets dst->pending_confirm.
>>> 	
>>> So dst_neigh_output may wrongly refresh n->confirmed for the neighbour entry of HOST1, even though HOST1's MAC address has changed.
>>>
>>> 	The likelihood of this increases significantly when ping and the TCP transaction are given the same processor affinity on HOST2.
>>>
>>> 	It seems that the issue was introduced by commit 5110effee8fde2edfacac9cd12a9960ab2dc39ea ("net: Do delayed neigh confirmation.").
>>
>> 	Bad news. Problem is not in delayed confirmation but
>> in the mechanism to use same dst for different neighbours on
>> LAN. We don't have a dst->neighbour reference anymore.
>>
>> 	For IPv4 this is related to rt->rt_uses_gateway but
>> also to DST_NOCACHE. In the other cases we can not call
>> dst_confirm; maybe we should look up the neigh entry instead.
>> But we need a way to reduce such lookups on every packet,
>> for example, by remembering in struct sock and checking if
>> some bits of jiffies (at least 4-5) are changed from
>> previous lookup.
> 
> 
> I thought bonding would keep the MAC address 'alive'.

I wonder about this, too.

> If TCP packets are confirmed, this means the old MAC address is still
> valid, what am I missing here ?

Regardless of whether bonding should keep the MAC address
alive, a MAC address can certainly change below a TCP connection.

dst_entry is 1:n to neigh_entry and as such we can end up confirming an
aging neighbor while sending a reply with dst->pending_confirm set while
the confirming packet actually came from a different neighbor.

I agree with Julian, pending_confirm became useless in this way.

Bye,
Hannes

^ permalink raw reply

* [patch net-next v2 10/11] mlxsw: spectrum_router: Request a dump of FIB tables during init
From: Jiri Pirko @ 2016-11-23 14:48 UTC (permalink / raw)
  To: netdev
  Cc: davem, idosch, eladr, yotamg, nogahf, arkadis, ogerlitz, roopa,
	dsa, nikolay, andy, vivien.didelot, andrew, f.fainelli,
	alexander.h.duyck, hannes, kaber
In-Reply-To: <1479911670-4525-1-git-send-email-jiri@resnulli.us>

From: Ido Schimmel <idosch@mellanox.com>

Make sure the device has a complete view of the FIB tables by invoking
their dump during module init.
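
The retry-until-consistent shape of this init step can be sketched in userspace C as follows. This is a hedged illustration only: `mirror`, `dump_fn`, `sync_mirror()` and `flaky_dump()` are hypothetical names, and the `max_tries` bound is an assumption added so the example terminates; the patch itself loops until a dump succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device-side mirror of the routing table. */
struct mirror {
	int entries;		/* entries programmed so far */
	int flushes;		/* how many times we had to start over */
};

/* Dump callback: fills the mirror and returns false if the source
 * changed while dumping.
 */
typedef bool (*dump_fn)(struct mirror *m, void *ctx);

/* Retry until one dump completes without interference, discarding the
 * partial result before each new attempt. Returns the number of
 * attempts used, or -1 if max_tries was exhausted.
 */
static int sync_mirror(struct mirror *m, dump_fn dump, void *ctx,
		       int max_tries)
{
	int tries = 0;

	assert(max_tries > 0);
	while (tries++ < max_tries) {
		if (dump(m, ctx))
			return tries;
		m->entries = 0;		/* flush the partially filled mirror */
		m->flushes++;
	}
	return -1;
}

/* Toy dump: reports interference for the first *ctx calls, then succeeds. */
static bool flaky_dump(struct mirror *m, void *ctx)
{
	int *failures_left = ctx;

	m->entries += 5;
	if (*failures_left > 0) {
		(*failures_left)--;
		return false;
	}
	return true;
}
```

Flushing before each retry matters: without it, entries from the aborted attempt would be programmed a second time by the next dump.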

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 14bed1d..36a71d2 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -2027,6 +2027,21 @@ static int mlxsw_sp_router_fib_event(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+static void mlxsw_sp_router_fib_dump(struct mlxsw_sp *mlxsw_sp)
+{
+	while (!fib_notifier_dump(&mlxsw_sp->fib_nb)) {
+		/* Flush pending FIB notifications and then flush the
+		 * device's table before requesting another dump. Do
+		 * that with RTNL held, as FIB notification block is
+		 * already registered.
+		 */
+		mlxsw_core_flush_owq();
+		rtnl_lock();
+		mlxsw_sp_router_fib_flush(mlxsw_sp);
+		rtnl_unlock();
+	}
+}
+
 int mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
 {
 	int err;
@@ -2048,6 +2063,7 @@ int mlxsw_sp_router_init(struct mlxsw_sp *mlxsw_sp)
 
 	mlxsw_sp->fib_nb.notifier_call = mlxsw_sp_router_fib_event;
 	register_fib_notifier(&mlxsw_sp->fib_nb);
+	mlxsw_sp_router_fib_dump(mlxsw_sp);
 	return 0;
 
 err_neigh_init:
-- 
2.7.4

^ permalink raw reply related

