Netdev List
* [RFC net-next 0/5] net: dsa: LAG support
From: Florian Fainelli @ 2017-10-01 19:46 UTC
  To: netdev
  Cc: andrew, vivien.didelot, jiri, idosch, Woojung.Huh, john,
	sean.wang, Florian Fainelli

Hi all,

This patch series is sent as RFC since I have only been able to test LAG
with dsa-loop and not with real HW yet (that should be tomorrow). I also
looked at how the Marvell DSDT API is defined for adding ports to "trunk"
groups and the API proposed here should work there too. Can't speak about
QCA, Mediatek or KSZ switches though.

A few open questions that may need solving now or later:

- on Broadcom switches, we should not allow enslaving a port as a LAG group
  member if its speed does not match that of the other members of the group

- not sure what to do with a switch fabric: naively, when adding two ports
  of two distinct switches to a LAG group, we may have to propagate that
  to the "dsa" cross-chip interfaces as well
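
As a rough illustration of the first question, assuming the intent is that
all members of a group must run at the same speed, a check could look like
the sketch below. It is plain userspace C; struct lag_group_speed and both
helpers are hypothetical and not part of this series.

```c
#include <stdbool.h>

/* Hypothetical per-group state: speed shared by the current members, in
 * Mb/s, plus a member count. Not part of this series.
 */
struct lag_group_speed {
	unsigned int speed;
	unsigned int members;
};

/* Return true if a port running at port_speed may join the group. */
static bool lag_speed_compatible(const struct lag_group_speed *grp,
				 unsigned int port_speed)
{
	/* An empty group accepts any speed */
	if (grp->members == 0)
		return true;

	return grp->speed == port_speed;
}

/* Record a new member; the caller must have checked compatibility first. */
static void lag_speed_add_member(struct lag_group_speed *grp,
				 unsigned int port_speed)
{
	if (grp->members == 0)
		grp->speed = port_speed;
	grp->members++;
}
```

Where such a check would live (DSA core or individual drivers) is left
open here, as in the question itself.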

Thanks!

Florian Fainelli (5):
  net: dsa: Add infrastructure to support LAG
  net: dsa: b53: Define MAC trunking/bonding registers
  net: dsa: b53: Add support for LAG
  net: dsa: bcm_sf2: Add support for LAG
  net: dsa: loop: Add support for LAG

 drivers/net/dsa/b53/b53_common.c |  94 ++++++++++++++++++++++-
 drivers/net/dsa/b53/b53_priv.h   |   6 ++
 drivers/net/dsa/b53/b53_regs.h   |  18 +++++
 drivers/net/dsa/bcm_sf2.c        |   3 +
 drivers/net/dsa/dsa_loop.c       |  54 +++++++++++++-
 include/net/dsa.h                |  25 +++++++
 net/dsa/dsa2.c                   |  12 +++
 net/dsa/dsa_priv.h               |   7 ++
 net/dsa/port.c                   |  92 +++++++++++++++++++++++
 net/dsa/slave.c                  | 157 ++++++++++++++++++++++++++++++++++++---
 net/dsa/switch.c                 |  30 ++++++++
 11 files changed, 484 insertions(+), 14 deletions(-)

-- 
2.11.0


* [RFC net-next 1/5] net: dsa: Add infrastructure to support LAG
From: Florian Fainelli @ 2017-10-01 19:46 UTC
  To: netdev
  Cc: andrew, vivien.didelot, jiri, idosch, Woojung.Huh, john,
	sean.wang, Florian Fainelli
In-Reply-To: <20171001194639.8647-1-f.fainelli@gmail.com>

Add the necessary logic to handle network device events targeting LAG
devices; this is loosely inspired by mlxsw/spectrum.c.

In the process, dsa_slave_changeupper() is made more generic and is called
from both the LAG event path and the normal bridge enslaving event path.

The DSA layer takes care of managing the LAG group identifiers, how many LAGs
may be supported by a switch, and how many members per LAG are supported by a
switch device. When a LAG group is identified, the port is then configured to
be a part of that group. When a LAG group no longer has any users, we remove it
and we tell the drivers whether it is safe to disable trunking altogether.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 include/net/dsa.h  |  25 +++++++++
 net/dsa/dsa2.c     |  12 ++++
 net/dsa/dsa_priv.h |   7 +++
 net/dsa/port.c     |  92 +++++++++++++++++++++++++++++++
 net/dsa/slave.c    | 157 +++++++++++++++++++++++++++++++++++++++++++++++++----
 net/dsa/switch.c   |  30 ++++++++++
 6 files changed, 312 insertions(+), 11 deletions(-)
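
For readers who want the bookkeeping at a glance, here is a minimal
userspace sketch of the identifier and reference-count handling described
in the commit message. It follows the logic of dsa_switch_lag_get_index()
and of the join/leave paths, but struct lag_group, lag_join() and
lag_leave() are simplified stand-ins written for illustration, not the
kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for struct dsa_lag_group; lag_dev is an opaque key. */
struct lag_group {
	unsigned int ref_count;
	const void *lag_dev;
};

/* Return the slot already bound to lag_dev, else the first free slot,
 * else -EBUSY when every slot is in use by another LAG device.
 */
static int lag_get_index(const struct lag_group *lags, size_t n,
			 const void *lag_dev)
{
	int free_idx = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (lags[i].ref_count) {
			if (lags[i].lag_dev == lag_dev)
				return (int)i;
		} else if (free_idx < 0) {
			free_idx = (int)i;
		}
	}

	return free_idx < 0 ? -EBUSY : free_idx;
}

/* Join: bind the slot on first use, then take a reference. */
static int lag_join(struct lag_group *lags, size_t n, const void *lag_dev)
{
	int id = lag_get_index(lags, n, lag_dev);

	if (id < 0)
		return id;
	if (!lags[id].ref_count)
		lags[id].lag_dev = lag_dev;
	lags[id].ref_count++;

	return id;
}

/* Leave: drop a reference and return 1 when the group became empty. */
static int lag_leave(struct lag_group *lags, int id)
{
	return --lags[id].ref_count == 0;
}
```

lag_leave() returning 1 corresponds to the case where the patch passes
lag_disable = true to the driver.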

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 10dceccd9ce8..247ea58add68 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -182,12 +182,20 @@ struct dsa_port {
 	u8			stp_state;
 	struct net_device	*bridge_dev;
 	struct devlink_port	devlink_port;
+	u8			lag_id;
+	bool			lagged;
 	/*
 	 * Original copy of the master netdev ethtool_ops
 	 */
 	const struct ethtool_ops *orig_ethtool_ops;
 };
 
+struct dsa_lag_group {
+	/* Used to know when we can disable lag on the switch */
+	unsigned int		ref_count;
+	struct net_device	*lag_dev;
+};
+
 struct dsa_switch {
 	struct device *dev;
 
@@ -242,6 +250,12 @@ struct dsa_switch {
 	/* Number of switch port queues */
 	unsigned int		num_tx_queues;
 
+	/* Number of lag groups */
+	unsigned int		max_lags;
+	struct dsa_lag_group	*lags;
+	/* Number of members per lag group */
+	unsigned int		max_lag_members;
+
 	/* Dynamically allocated ports, keep last */
 	size_t num_ports;
 	struct dsa_port ports[];
@@ -431,6 +445,16 @@ struct dsa_switch_ops {
 					 int port, struct net_device *br);
 	void	(*crosschip_bridge_leave)(struct dsa_switch *ds, int sw_index,
 					  int port, struct net_device *br);
+
+	/*
+	 * Link aggregation
+	 */
+	bool	(*port_lag_member)(struct dsa_switch *ds, int port, u8 lag_id);
+	int	(*port_lag_join)(struct dsa_switch *ds, int port, u8 lag_id);
+	void	(*port_lag_leave)(struct dsa_switch *ds, int port, u8 lag_id,
+				  bool lag_disable);
+	int	(*port_lag_change)(struct dsa_switch *ds, int port,
+				   struct netdev_lag_lower_state_info *info);
 };
 
 struct dsa_switch_driver {
@@ -455,6 +479,7 @@ static inline bool netdev_uses_dsa(struct net_device *dev)
 }
 
 struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n);
+int dsa_switch_alloc_lags(struct dsa_switch *ds, size_t n);
 void dsa_unregister_switch(struct dsa_switch *ds);
 int dsa_register_switch(struct dsa_switch *ds);
 #ifdef CONFIG_PM_SLEEP
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 54ed054777bd..dddf8128ba04 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -813,6 +813,18 @@ struct dsa_switch *dsa_switch_alloc(struct device *dev, size_t n)
 }
 EXPORT_SYMBOL_GPL(dsa_switch_alloc);
 
+int dsa_switch_alloc_lags(struct dsa_switch *ds, size_t n)
+{
+	ds->max_lags = n;
+	ds->lags = devm_kcalloc(ds->dev, ds->max_lags,
+			        sizeof(*ds->lags), GFP_KERNEL);
+	if (!ds->lags)
+		return -ENOMEM;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dsa_switch_alloc_lags);
+
 int dsa_register_switch(struct dsa_switch *ds)
 {
 	int err;
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 2850077cc9cc..0bd964bd9642 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -159,6 +159,11 @@ int dsa_port_vlan_add(struct dsa_port *dp,
 		      struct switchdev_trans *trans);
 int dsa_port_vlan_del(struct dsa_port *dp,
 		      const struct switchdev_obj_port_vlan *vlan);
+int dsa_port_lag_join(struct dsa_port *dp, struct net_device *lag_dev);
+int dsa_port_lag_leave(struct dsa_port *dp, struct net_device *lag_dev);
+int dsa_port_lag_change(struct dsa_port *dp,
+			struct netdev_lag_lower_state_info *info);
+
 /* slave.c */
 extern const struct dsa_device_ops notag_netdev_ops;
 void dsa_slave_mii_bus_init(struct dsa_switch *ds);
@@ -170,6 +175,8 @@ int dsa_slave_register_notifier(void);
 void dsa_slave_unregister_notifier(void);
 
 /* switch.c */
+int dsa_switch_lag_get_index(struct dsa_switch *ds, struct net_device *lag_dev,
+			     u8 *lag_id);
 int dsa_switch_register_notifier(struct dsa_switch *ds);
 void dsa_switch_unregister_notifier(struct dsa_switch *ds);
 
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 72c8dbd3d3f2..d62fa7bfab4b 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -264,3 +264,95 @@ int dsa_port_vlan_del(struct dsa_port *dp,
 
 	return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
 }
+
+static int dsa_port_lag_member(struct dsa_port *dp, u8 lag_id)
+{
+	struct dsa_switch *ds = dp->ds;
+	int err = -EOPNOTSUPP;
+	unsigned int i;
+
+	if (!ds->ops->port_lag_member || !ds->max_lag_members)
+		return err;
+
+	for (i = 0; i < ds->max_lag_members; i++) {
+		if (!ds->ops->port_lag_member(ds, i, lag_id)) {
+			return 0;
+		}
+	}
+
+	return -EBUSY;
+}
+
+int dsa_port_lag_join(struct dsa_port *dp, struct net_device *lag_dev)
+{
+	struct dsa_switch *ds = dp->ds;
+	struct dsa_lag_group *lag;
+	int err = -EOPNOTSUPP;
+	u8 lag_id;
+
+	if (!ds->ops->port_lag_join)
+		return err;
+
+	/* Obtain a new lag identifier */
+	err = dsa_switch_lag_get_index(ds, lag_dev, &lag_id);
+	if (err)
+		return err;
+
+	/* Create a lag group if non-existent */
+	lag = &ds->lags[lag_id];
+	if (!lag->ref_count)
+		lag->lag_dev = lag_dev;
+
+	err = dsa_port_lag_member(dp, lag_id);
+	if (err)
+		return err;
+
+	err = ds->ops->port_lag_join(ds, dp->index, lag_id);
+	if (err)
+		return err;
+
+	dp->lag_id = lag_id;
+	dp->lagged = true;
+	lag->ref_count++;
+
+	return err;
+}
+
+int dsa_port_lag_leave(struct dsa_port *dp, struct net_device *lag_dev)
+{
+	struct dsa_switch *ds = dp->ds;
+	struct dsa_lag_group *lag;
+	bool lag_disable = false;
+	int err = -EOPNOTSUPP;
+	u8 lag_id;
+
+	if (!ds->ops->port_lag_leave)
+		return err;
+
+	if (!dp->lagged)
+		return 0;
+
+	lag_id = dp->lag_id;
+	lag = &ds->lags[lag_id];
+	WARN_ON(lag->ref_count == 0);
+
+	if (lag->ref_count == 1)
+		lag_disable = true;
+
+	ds->ops->port_lag_leave(ds, dp->index, lag_id, lag_disable);
+	dp->lagged = false;
+	lag->ref_count--;
+
+	return 0;
+}
+
+int dsa_port_lag_change(struct dsa_port *dp,
+			struct netdev_lag_lower_state_info *info)
+{
+	struct dsa_switch *ds = dp->ds;
+
+	if (!ds->ops->port_lag_change)
+		return -EOPNOTSUPP;
+
+	return ds->ops->port_lag_change(ds, dp->index, info);
+}
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 4b634db05cee..b64320aa20f1 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1216,36 +1216,171 @@ static bool dsa_slave_dev_check(struct net_device *dev)
 	return dev->netdev_ops == &dsa_slave_netdev_ops;
 }
 
-static int dsa_slave_changeupper(struct net_device *dev,
-				 struct netdev_notifier_changeupper_info *info)
+static bool dsa_slave_lag_check(struct net_device *dev, struct net_device *lag_dev,
+				struct netdev_lag_upper_info *lag_upper_info)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	u8 lag_id;
+
+	/* No more lag identifiers available or already in use */
+	if (dsa_switch_lag_get_index(p->dp->ds, lag_dev, &lag_id) != 0)
+		return false;
+
+	if (lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH)
+		return false;
+
+	return true;
+}
+
+static int dsa_slave_changeupper_bridge(struct net_device *dev,
+					struct netdev_notifier_changeupper_info *info)
 {
 	struct dsa_slave_priv *p = netdev_priv(dev);
 	struct dsa_port *dp = p->dp;
 	int err = NOTIFY_DONE;
 
-	if (netif_is_bridge_master(info->upper_dev)) {
-		if (info->linking) {
-			err = dsa_port_bridge_join(dp, info->upper_dev);
-			err = notifier_from_errno(err);
-		} else {
-			dsa_port_bridge_leave(dp, info->upper_dev);
-			err = NOTIFY_OK;
-		}
+	if (info->linking) {
+		err = dsa_port_bridge_join(dp, info->upper_dev);
+		err = notifier_from_errno(err);
+	} else {
+		dsa_port_bridge_leave(dp, info->upper_dev);
+		err = NOTIFY_OK;
+	}
+
+	return err;
+}
+
+static int dsa_slave_changeupper_lag(struct net_device *dev,
+				     struct netdev_notifier_changeupper_info *info)
+{
+	struct dsa_slave_priv *p = netdev_priv(dev);
+	struct dsa_port *dp = p->dp;
+	int err = NOTIFY_DONE;
+
+	if (info->linking) {
+		err = dsa_port_lag_join(dp, info->upper_dev);
+		err = notifier_from_errno(err);
+	} else {
+		err = dsa_port_lag_leave(dp, info->upper_dev);
+		err = NOTIFY_OK;
+	}
+
+	return err;
+}
+
+static int dsa_slave_upper_event(struct net_device *lower_dev,
+				 struct net_device *slave_dev, unsigned long event,
+				 void *ptr)
+{
+	struct netdev_notifier_changeupper_info *info;
+	struct net_device *upper_dev;
+	struct dsa_slave_priv *p;
+	int err = 0;
+
+	info = ptr;
+	p = netdev_priv(slave_dev);
+
+	switch (event) {
+	case NETDEV_PRECHANGEUPPER:
+		upper_dev = info->upper_dev;
+		if (!is_vlan_dev(upper_dev) &&
+		    !netif_is_lag_master(upper_dev) &&
+		    !netif_is_bridge_master(upper_dev))
+			return -EINVAL;
+
+		if (!info->linking)
+			break;
+
+		if (netdev_has_any_upper_dev(upper_dev))
+			return -EINVAL;
+
+		if (netif_is_lag_master(upper_dev) &&
+		    !dsa_slave_lag_check(lower_dev, upper_dev, info->upper_info))
+			return -EINVAL;
+
+		break;
+	case NETDEV_CHANGEUPPER:
+		upper_dev = info->upper_dev;
+		if (netif_is_bridge_master(upper_dev))
+			err = dsa_slave_changeupper_bridge(lower_dev, info);
+		else if (netif_is_lag_master(upper_dev))
+			err = dsa_slave_changeupper_lag(lower_dev, info);
+		break;
 	}
 
 	return err;
 }
 
+static int dsa_slave_lower_event(struct net_device *dev,
+				 unsigned long event, void *ptr)
+{
+	struct netdev_notifier_changelowerstate_info *info;
+	struct dsa_slave_priv *p;
+	int err;
+
+	p = netdev_priv(dev);
+	info = ptr;
+
+	switch (event) {
+	case NETDEV_CHANGELOWERSTATE:
+		if (netif_is_lag_port(dev) && p->dp->lagged) {
+			err = dsa_port_lag_change(p->dp, info->lower_state_info);
+			if (err)
+				netdev_err(dev, "Failed to reflect LAG\n");
+		}
+		break;
+	}
+
+	return 0;
+}
+
+static int dsa_slave_upper_lower_event(struct net_device *lower_dev,
+				       struct net_device *slave_dev,
+				       unsigned long event, void *ptr)
+{
+	switch (event) {
+	case NETDEV_PRECHANGEUPPER:
+	case NETDEV_CHANGEUPPER:
+		return dsa_slave_upper_event(lower_dev, slave_dev, event, ptr);
+	case NETDEV_CHANGELOWERSTATE:
+		return dsa_slave_lower_event(slave_dev, event, ptr);
+	}
+
+	return NOTIFY_OK;
+}
+
+static int dsa_slave_lag_event(struct net_device *lag_dev,
+				unsigned long event, void *ptr)
+{
+	struct net_device *dev;
+	struct list_head *iter;
+	int err;
+
+	netdev_for_each_lower_dev(lag_dev, dev, iter) {
+		if (dsa_slave_dev_check(dev)) {
+			err = dsa_slave_upper_lower_event(lag_dev, dev,
+							  event, ptr);
+			if (err)
+				return err;
+		}
+	}
+
+	return NOTIFY_OK;
+}
+
 static int dsa_slave_netdevice_event(struct notifier_block *nb,
 				     unsigned long event, void *ptr)
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 
+	if (netif_is_lag_master(dev))
+		return dsa_slave_lag_event(dev, event, ptr);
+
 	if (!dsa_slave_dev_check(dev))
 		return NOTIFY_DONE;
 
 	if (event == NETDEV_CHANGEUPPER)
-		return dsa_slave_changeupper(dev, ptr);
+		return dsa_slave_upper_event(dev, dev, event, ptr);
 
 	return NOTIFY_DONE;
 }
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index e6c06aa349a6..9ce1b25bf197 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -252,6 +252,36 @@ static int dsa_switch_event(struct notifier_block *nb,
 	return notifier_from_errno(err);
 }
 
+int dsa_switch_lag_get_index(struct dsa_switch *ds, struct net_device *lag_dev,
+			     u8 *lag_id)
+{
+	struct dsa_lag_group *lag;
+	int free_lag_idx = -1;
+	unsigned int i;
+
+	if (!ds->max_lags)
+		return -EOPNOTSUPP;
+
+	for (i = 0; i < ds->max_lags; i++) {
+		lag = &ds->lags[i];
+		if (lag->ref_count) {
+			if (lag->lag_dev == lag_dev) {
+				*lag_id = i;
+				return 0;
+			}
+		} else if (free_lag_idx < 0) {
+			free_lag_idx = i;
+		}
+	}
+
+	if (free_lag_idx < 0)
+		return -EBUSY;
+
+	*lag_id = free_lag_idx;
+
+	return 0;
+}
+
 int dsa_switch_register_notifier(struct dsa_switch *ds)
 {
 	ds->nb.notifier_call = dsa_switch_event;
-- 
2.11.0


* [RFC net-next 2/5] net: dsa: b53: Define MAC trunking/bonding registers
From: Florian Fainelli @ 2017-10-01 19:46 UTC
  To: netdev
  Cc: andrew, vivien.didelot, jiri, idosch, Woojung.Huh, john,
	sean.wang, Florian Fainelli
In-Reply-To: <20171001194639.8647-1-f.fainelli@gmail.com>

Define the MAC trunking page offset and its register layout to implement
bonding in the next patches.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/b53/b53_regs.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
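
To make the layout concrete, the sketch below copies the new definitions
into a userspace file and composes a control register value from them. The
macro values come from this patch; the helper trunk_ctrl_compose() is
hypothetical and only shows how the hash index field and the enable bit
fit together in the 8-bit register.

```c
#include <stdint.h>

/* Values copied from the definitions this patch adds to b53_regs.h */
#define B53_MAC_TRUNK_CTRL	0x00
#define TRK_HASH_IDX_DA_SA	0
#define TRK_HASH_IDX_DA		1
#define TRK_HASH_IDX_SA		2
#define TRK_HASH_IDX_MASK	0x3
#define MAC_BASE_TRNK_EN	(1u << 3)	/* BIT(3) in the kernel */
#define B53_MAC_TRUNK_GROUP(x)	(0x10 + (x) * 2)

/* Compose a control register value: keep unrelated bits, select the hash
 * index, and set or clear the trunking enable bit.
 */
static uint8_t trunk_ctrl_compose(uint8_t old, unsigned int hash_idx,
				  int enable)
{
	uint8_t val = old & (uint8_t)~(TRK_HASH_IDX_MASK | MAC_BASE_TRNK_EN);

	val |= hash_idx & TRK_HASH_IDX_MASK;
	if (enable)
		val |= MAC_BASE_TRNK_EN;

	return val;
}
```

Since the group registers are 16 bits wide, consecutive groups sit two
bytes apart: group 0 at offset 0x10 and group 1 at offset 0x12.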

diff --git a/drivers/net/dsa/b53/b53_regs.h b/drivers/net/dsa/b53/b53_regs.h
index 2a9f421680aa..d02e4f7dda10 100644
--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -44,6 +44,9 @@
 /* Port VLAN Page */
 #define B53_PVLAN_PAGE			0x31
 
+/* Trunking Page */
+#define B53_TRUNK_PAGE			0x32
+
 /* VLAN Registers */
 #define B53_VLAN_PAGE			0x34
 
@@ -360,6 +363,21 @@
 #define B53_JOIN_ALL_VLAN_EN		0x50
 
 /*************************************************************************
+ * Trunking Registers
+ *************************************************************************/
+
+/* MAC Trunking Control Register (8 bit) */
+#define B53_MAC_TRUNK_CTRL		0x00
+#define  TRK_HASH_IDX_DA_SA		0
+#define  TRK_HASH_IDX_DA		1
+#define  TRK_HASH_IDX_SA		2
+#define  TRK_HASH_IDX_MASK		0x3
+#define  MAC_BASE_TRNK_EN		BIT(3)
+
+/* MAC Trunking Group Register (16 bit) */
+#define B53_MAC_TRUNK_GROUP(x)		(0x10 + (x) * 2)
+
+/*************************************************************************
  * 802.1Q Page Registers
  *************************************************************************/
 
-- 
2.11.0


* [RFC net-next 3/5] net: dsa: b53: Add support for LAG
From: Florian Fainelli @ 2017-10-01 19:46 UTC
  To: netdev
  Cc: andrew, vivien.didelot, jiri, idosch, Woojung.Huh, john,
	sean.wang, Florian Fainelli
In-Reply-To: <20171001194639.8647-1-f.fainelli@gmail.com>

Add support for LAG in the b53 driver by implementing the port_lag_join,
port_lag_leave and port_lag_member operations. port_lag_change is not
supported since the HW does not let us change anything regarding whether a
member port is tx_enabled or not.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/b53/b53_common.c | 94 +++++++++++++++++++++++++++++++++++++++-
 drivers/net/dsa/b53/b53_priv.h   |  6 +++
 2 files changed, 99 insertions(+), 1 deletion(-)
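
The register updates done by the three operations can be tried out in
userspace with the sketch below. It replaces the b53_read*/b53_write*
accessors with a plain in-memory structure, so struct fake_trunk_page and
the fake_lag_*() helpers are stand-ins for illustration only; the bit
manipulation is meant to match what the patch does.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_TRUNK_GROUPS	2	/* .num_lags in this patch */
#define TRK_HASH_IDX_MASK	0x3
#define MAC_BASE_TRNK_EN	(1u << 3)

/* In-memory stand-in for the trunking page; not the real register access */
struct fake_trunk_page {
	uint8_t ctrl;
	uint16_t group[NUM_TRUNK_GROUPS];
};

static bool fake_lag_member(const struct fake_trunk_page *pg, int port,
			    uint8_t lag_id)
{
	return (pg->group[lag_id] >> port) & 1u;
}

static void fake_lag_join(struct fake_trunk_page *pg, int port, uint8_t lag_id)
{
	/* Add the port to the group bitmap */
	pg->group[lag_id] |= (uint16_t)(1u << port);

	/* Select hash index 0 (DA+SA) and enable trunking */
	pg->ctrl &= (uint8_t)~TRK_HASH_IDX_MASK;
	pg->ctrl |= MAC_BASE_TRNK_EN;
}

static void fake_lag_leave(struct fake_trunk_page *pg, int port,
			   uint8_t lag_id, bool lag_disable)
{
	/* Remove the port from the group bitmap */
	pg->group[lag_id] &= (uint16_t)~(1u << port);

	/* Clear hash index and enable bit when the group goes away */
	if (lag_disable)
		pg->ctrl &= (uint8_t)~(TRK_HASH_IDX_MASK | MAC_BASE_TRNK_EN);
}
```

Joining ports 1 and 2 to group 0 leaves the group bitmap at 0x0006 and the
control value at 0x08; the last leave with lag_disable set clears both.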

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index d4ce092def83..e9903947f050 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1603,6 +1603,58 @@ int b53_set_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e)
 }
 EXPORT_SYMBOL(b53_set_mac_eee);
 
+bool b53_lag_member(struct dsa_switch *ds, int port, u8 lag_id)
+{
+	struct b53_device *dev = ds->priv;
+	u16 reg;
+
+	b53_read16(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_GROUP(lag_id), &reg);
+
+	return !!(BIT(port) & reg);
+}
+EXPORT_SYMBOL(b53_lag_member);
+
+int b53_lag_join(struct dsa_switch *ds, int port, u8 lag_id)
+{
+	struct b53_device *dev = ds->priv;
+	u8 trunk_ctl;
+	u16 lag;
+
+	/* Add this port to the trunking group */
+	b53_read16(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_GROUP(lag_id), &lag);
+	lag |= BIT(port);
+	b53_write16(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_GROUP(lag_id), lag);
+
+	/* Enable MAC DA,SA hashing, enable trunking */
+	b53_read8(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_CTRL, &trunk_ctl);
+	trunk_ctl &= ~TRK_HASH_IDX_MASK;
+	trunk_ctl |= MAC_BASE_TRNK_EN;
+	b53_write8(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_CTRL, trunk_ctl);
+
+	return 0;
+}
+EXPORT_SYMBOL(b53_lag_join);
+
+void b53_lag_leave(struct dsa_switch *ds, int port, u8 lag_id, bool lag_disable)
+{
+	struct b53_device *dev = ds->priv;
+	u8 trunk_ctl;
+	u16 lag;
+
+	/* Remove this port from the trunking group */
+	b53_read16(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_GROUP(lag_id), &lag);
+	lag &= ~BIT(port);
+	b53_write16(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_GROUP(lag_id), lag);
+
+	/* Disable trunking if the lag group is being removed */
+	if (lag_disable) {
+		b53_read8(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_CTRL, &trunk_ctl);
+		trunk_ctl &= ~(TRK_HASH_IDX_MASK | MAC_BASE_TRNK_EN);
+		b53_write8(dev, B53_TRUNK_PAGE, B53_MAC_TRUNK_CTRL, trunk_ctl);
+	}
+}
+EXPORT_SYMBOL(b53_lag_leave);
+
 static const struct dsa_switch_ops b53_switch_ops = {
 	.get_tag_protocol	= b53_get_tag_protocol,
 	.setup			= b53_setup,
@@ -1629,6 +1681,9 @@ static const struct dsa_switch_ops b53_switch_ops = {
 	.port_fdb_del		= b53_fdb_del,
 	.port_mirror_add	= b53_mirror_add,
 	.port_mirror_del	= b53_mirror_del,
+	.port_lag_member	= b53_lag_member,
+	.port_lag_join		= b53_lag_join,
+	.port_lag_leave		= b53_lag_leave,
 };
 
 struct b53_chip_data {
@@ -1642,6 +1697,8 @@ struct b53_chip_data {
 	u8 duplex_reg;
 	u8 jumbo_pm_reg;
 	u8 jumbo_size_reg;
+	unsigned int num_lags;
+	unsigned int max_lag_members;
 };
 
 #define B53_VTA_REGS	\
@@ -1681,6 +1738,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM5397_DEVICE_ID,
@@ -1693,6 +1752,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM5398_DEVICE_ID,
@@ -1705,6 +1766,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53115_DEVICE_ID,
@@ -1717,6 +1780,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53125_DEVICE_ID,
@@ -1729,6 +1794,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53128_DEVICE_ID,
@@ -1741,6 +1808,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM63XX_DEVICE_ID,
@@ -1753,6 +1822,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_63XX,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK_63XX,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE_63XX,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53010_DEVICE_ID,
@@ -1765,6 +1836,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53011_DEVICE_ID,
@@ -1777,6 +1850,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53012_DEVICE_ID,
@@ -1789,6 +1864,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53018_DEVICE_ID,
@@ -1801,6 +1878,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM53019_DEVICE_ID,
@@ -1813,6 +1892,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM58XX_DEVICE_ID,
@@ -1825,6 +1906,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM7445_DEVICE_ID,
@@ -1837,6 +1920,8 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 	{
 		.chip_id = BCM7278_DEVICE_ID,
@@ -1849,11 +1934,14 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.duplex_reg = B53_DUPLEX_STAT_GE,
 		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
 		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+		.num_lags = 2,
+		.max_lag_members = 4,
 	},
 };
 
 static int b53_switch_init(struct b53_device *dev)
 {
+	struct dsa_switch *ds = dev->ds;
 	unsigned int i;
 	int ret;
 
@@ -1872,6 +1960,8 @@ static int b53_switch_init(struct b53_device *dev)
 			dev->cpu_port = chip->cpu_port;
 			dev->num_vlans = chip->vlans;
 			dev->num_arl_entries = chip->arl_entries;
+			dev->num_lags = chip->num_lags;
+			dev->max_lag_members = chip->max_lag_members;
 			break;
 		}
 	}
@@ -1933,7 +2023,9 @@ static int b53_switch_init(struct b53_device *dev)
 			return ret;
 	}
 
-	return 0;
+	ds->max_lag_members = dev->max_lag_members;
+
+	return dsa_switch_alloc_lags(ds, dev->num_lags);
 }
 
 struct b53_device *b53_switch_alloc(struct device *base,
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 603c66d240d8..0f48184c9b54 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -118,6 +118,9 @@ struct b53_device {
 	struct b53_vlan *vlans;
 	unsigned int num_ports;
 	struct b53_port *ports;
+
+	unsigned int num_lags;
+	unsigned int max_lag_members;
 };
 
 #define b53_for_each_port(dev, i) \
@@ -318,5 +321,8 @@ void b53_eee_enable_set(struct dsa_switch *ds, int port, bool enable);
 int b53_eee_init(struct dsa_switch *ds, int port, struct phy_device *phy);
 int b53_get_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e);
 int b53_set_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e);
+bool b53_lag_member(struct dsa_switch *ds, int port, u8 lag_id);
+int b53_lag_join(struct dsa_switch *ds, int port, u8 lag_id);
+void b53_lag_leave(struct dsa_switch *ds, int port, u8 lag_id, bool lag_disable);
 
 #endif
-- 
2.11.0


* [RFC net-next 4/5] net: dsa: bcm_sf2: Add support for LAG
From: Florian Fainelli @ 2017-10-01 19:46 UTC
  To: netdev
  Cc: andrew, vivien.didelot, jiri, idosch, Woojung.Huh, john,
	sean.wang, Florian Fainelli
In-Reply-To: <20171001194639.8647-1-f.fainelli@gmail.com>

Utilize the recently introduced b53 LAG operations and use those as-is.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/bcm_sf2.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 7aecc98d0a18..41a06fde7510 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -901,6 +901,9 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
 	.set_rxnfc		= bcm_sf2_set_rxnfc,
 	.port_mirror_add	= b53_mirror_add,
 	.port_mirror_del	= b53_mirror_del,
+	.port_lag_member	= b53_lag_member,
+	.port_lag_join		= b53_lag_join,
+	.port_lag_leave		= b53_lag_leave,
 };
 
 struct bcm_sf2_of_data {
-- 
2.11.0


* [RFC net-next 5/5] net: dsa: loop: Add support for LAG
From: Florian Fainelli @ 2017-10-01 19:46 UTC
  To: netdev
  Cc: andrew, vivien.didelot, jiri, idosch, Woojung.Huh, john,
	sean.wang, Florian Fainelli
In-Reply-To: <20171001194639.8647-1-f.fainelli@gmail.com>

Implement LAG operations for the DSA loopback/mock-up driver in order to
exercise the DSA core code. This just maintains a software bitmask of ports
belonging to a LAG group; we allow up to 2 LAG groups and up to 2 members
per LAG group.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/dsa/dsa_loop.c | 54 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 52 insertions(+), 2 deletions(-)
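
As an aside, the "up to 2 members per LAG group" limit can also be checked
directly against the software bitmask by counting its set bits. The sketch
below is a hypothetical userspace variant and is not how this series
enforces the limit (the core does that through port_lag_member);
lag_member_count() and loop_lag_try_join() are names made up for the
example.

```c
#include <errno.h>
#include <stdint.h>

#define LOOP_MAX_LAG_MEMBERS	2	/* ds->max_lag_members in this patch */

/* Count set bits in the member bitmask (Kernighan's method). */
static unsigned int lag_member_count(uint8_t members)
{
	unsigned int n = 0;

	while (members) {
		members &= (uint8_t)(members - 1);
		n++;
	}

	return n;
}

/* Hypothetical join helper: refuse a new port once the group is full. */
static int loop_lag_try_join(uint8_t *members, unsigned int port)
{
	/* The u8 bitmask can only describe ports 0 to 7 */
	if (port >= 8)
		return -EINVAL;
	if (*members & (1u << port))
		return 0;	/* already a member */
	if (lag_member_count(*members) >= LOOP_MAX_LAG_MEMBERS)
		return -EBUSY;

	*members |= (uint8_t)(1u << port);

	return 0;
}
```

With ports 0 and 3 already in the group, a third port is refused with
-EBUSY while re-adding port 3 is a no-op.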

diff --git a/drivers/net/dsa/dsa_loop.c b/drivers/net/dsa/dsa_loop.c
index d55051abf4ed..776130c69c7f 100644
--- a/drivers/net/dsa/dsa_loop.c
+++ b/drivers/net/dsa/dsa_loop.c
@@ -1,7 +1,7 @@
 /*
  * Distributed Switch Architecture loopback driver
  *
- * Copyright (C) 2016, Florian Fainelli <f.fainelli@gmail.com>
+ * Copyright (C) 2016-2017, Florian Fainelli <f.fainelli@gmail.com>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -27,6 +27,10 @@ struct dsa_loop_vlan {
 	u16 untagged;
 };
 
+struct dsa_loop_lag {
+	u8 members;
+};
+
 struct dsa_loop_mib_entry {
 	char name[ETH_GSTRING_LEN];
 	unsigned long val;
@@ -52,6 +56,7 @@ struct dsa_loop_port {
 };
 
 #define DSA_LOOP_VLANS	5
+#define DSA_LOOP_LAGS	2
 
 struct dsa_loop_priv {
 	struct mii_bus	*bus;
@@ -60,6 +65,7 @@ struct dsa_loop_priv {
 	struct net_device *netdev;
 	struct dsa_loop_port ports[DSA_MAX_PORTS];
 	u16 pvid;
+	struct dsa_loop_lag lags[DSA_LOOP_LAGS];
 };
 
 static struct phy_device *phydevs[PHY_MAX_ADDR];
@@ -257,6 +263,40 @@ static int dsa_loop_port_vlan_del(struct dsa_switch *ds, int port,
 	return 0;
 }
 
+static bool dsa_loop_lag_member(struct dsa_switch *ds, int port, u8 lag_id)
+{
+	struct dsa_loop_priv *priv = ds->priv;
+	struct dsa_loop_lag *lag = &priv->lags[lag_id];
+
+	return !!(BIT(port) & lag->members);
+}
+
+static int dsa_loop_lag_join(struct dsa_switch *ds, int port, u8 lag_id)
+{
+	struct dsa_loop_priv *priv = ds->priv;
+	struct dsa_loop_lag *lag = &priv->lags[lag_id];
+
+	lag->members |= BIT(port);
+
+	dev_dbg(ds->dev, "%s, added port %d to lag: %d\n",
+		__func__, port, lag_id);
+
+	return 0;
+}
+
+static void dsa_loop_lag_leave(struct dsa_switch *ds, int port, u8 lag_id,
+				bool lag_disable)
+{
+	struct dsa_loop_priv *priv = ds->priv;
+	struct dsa_loop_lag *lag = &priv->lags[lag_id];
+
+
+	lag->members &= ~BIT(port);
+
+	dev_dbg(ds->dev, "%s, removed port %d from lag: %d\n",
+		__func__, port, lag_id);
+}
+
 static const struct dsa_switch_ops dsa_loop_driver = {
 	.get_tag_protocol	= dsa_loop_get_protocol,
 	.setup			= dsa_loop_setup,
@@ -273,6 +313,9 @@ static const struct dsa_switch_ops dsa_loop_driver = {
 	.port_vlan_prepare	= dsa_loop_port_vlan_prepare,
 	.port_vlan_add		= dsa_loop_port_vlan_add,
 	.port_vlan_del		= dsa_loop_port_vlan_del,
+	.port_lag_member	= dsa_loop_lag_member,
+	.port_lag_join		= dsa_loop_lag_join,
+	.port_lag_leave		= dsa_loop_lag_leave,
 };
 
 static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
@@ -280,6 +323,7 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
 	struct dsa_loop_pdata *pdata = mdiodev->dev.platform_data;
 	struct dsa_loop_priv *ps;
 	struct dsa_switch *ds;
+	int ret;
 
 	if (!pdata)
 		return -ENODEV;
@@ -291,6 +335,13 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
 	if (!ds)
 		return -ENOMEM;
 
+	ds->dev = &mdiodev->dev;
+	ds->max_lag_members = 2;
+
+	ret = dsa_switch_alloc_lags(ds, DSA_LOOP_LAGS);
+	if (ret)
+		return ret;
+
 	ps = devm_kzalloc(&mdiodev->dev, sizeof(*ps), GFP_KERNEL);
 	if (!ps)
 		return -ENOMEM;
@@ -301,7 +352,6 @@ static int dsa_loop_drv_probe(struct mdio_device *mdiodev)
 
 	pdata->cd.netdev[DSA_LOOP_CPU_PORT] = &ps->netdev->dev;
 
-	ds->dev = &mdiodev->dev;
 	ds->ops = &dsa_loop_driver;
 	ds->priv = ps;
 	ps->bus = mdiodev->bus;
-- 
2.11.0


* [PATCH 0/2] atm: Adjustments for adummy_init()
From: SF Markus Elfring @ 2017-10-01 19:48 UTC
  To: linux-atm-general, netdev, Chas Williams; +Cc: LKML, kernel-janitors

From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sun, 1 Oct 2017 21:43:21 +0200

Two update suggestions from static source code analysis
were taken into account.

Markus Elfring (2):
  Delete an error message for a failed memory allocation
  Improve a size determination

 drivers/atm/adummy.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

-- 
2.14.2


* [PATCH 1/2] atm: Delete an error message for a failed memory allocation in adummy_init()
From: SF Markus Elfring @ 2017-10-01 19:49 UTC
  To: linux-atm-general, netdev, Chas Williams; +Cc: LKML, kernel-janitors
In-Reply-To: <8fe2ca8a-e1f7-d492-db69-c52b66143ac6@users.sourceforge.net>

From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sun, 1 Oct 2017 21:31:32 +0200

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
 drivers/atm/adummy.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/atm/adummy.c b/drivers/atm/adummy.c
index 8d98130ecd40..94b73ddfb731 100644
--- a/drivers/atm/adummy.c
+++ b/drivers/atm/adummy.c
@@ -150,7 +150,6 @@ static int __init adummy_init(void)
 	adummy_dev = kzalloc(sizeof(struct adummy_dev),
 						   GFP_KERNEL);
 	if (!adummy_dev) {
-		printk(KERN_ERR DEV_LABEL ": kzalloc() failed\n");
 		err = -ENOMEM;
 		goto out;
 	}
-- 
2.14.2

^ permalink raw reply related

* [PATCH 2/2] atm: Improve a size determination in adummy_init()
From: SF Markus Elfring @ 2017-10-01 19:50 UTC (permalink / raw)
  To: linux-atm-general, netdev, Chas Williams; +Cc: LKML, kernel-janitors
In-Reply-To: <8fe2ca8a-e1f7-d492-db69-c52b66143ac6@users.sourceforge.net>

From: Markus Elfring <elfring@users.sourceforge.net>
Date: Sun, 1 Oct 2017 21:35:18 +0200

Replace the specification of a data structure by a pointer dereference
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
---
 drivers/atm/adummy.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/atm/adummy.c b/drivers/atm/adummy.c
index 94b73ddfb731..1ef2d8ee8d67 100644
--- a/drivers/atm/adummy.c
+++ b/drivers/atm/adummy.c
@@ -146,9 +146,7 @@ static int __init adummy_init(void)
 	int err = 0;
 
 	printk(KERN_ERR "adummy: version %s\n", DRV_VERSION);
-
-	adummy_dev = kzalloc(sizeof(struct adummy_dev),
-						   GFP_KERNEL);
+	adummy_dev = kzalloc(sizeof(*adummy_dev), GFP_KERNEL);
 	if (!adummy_dev) {
 		err = -ENOMEM;
 		goto out;
-- 
2.14.2
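
The size determination described above can be illustrated outside the kernel.
This is a minimal userspace sketch, not the adummy code: `struct demo_dev` and
`demo_dev_alloc()` are made-up names, and calloc() stands in for kzalloc().
It shows that sizeof(*ptr) follows the pointer's type automatically, so the
allocation stays correct if the type of the variable is changed later.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device structure, used only for this illustration. */
struct demo_dev {
	int id;
	char name[16];
};

/* Allocate one zeroed object, sized from the pointer rather than the type. */
static struct demo_dev *demo_dev_alloc(void)
{
	struct demo_dev *dev;

	/* Both spellings give the same size today ... */
	assert(sizeof(*dev) == sizeof(struct demo_dev));

	/* ... but only this one keeps tracking the type of "dev". */
	dev = calloc(1, sizeof(*dev));
	return dev;
}
```

The caller still has to check the result for NULL, as adummy_init() does.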

^ permalink raw reply related

* Re: [PATCH v2 iproute2] tc: fix ipv6 filter selector attribute for some prefix lengths
From: Stephen Hemminger @ 2017-10-01 20:42 UTC (permalink / raw)
  To: Yulia Kartseva; +Cc: netdev, yulia.kartseva
In-Reply-To: <20171001031840.1252538-1-hex@fb.com>

On Sat, 30 Sep 2017 20:18:40 -0700
Yulia Kartseva <hex@fb.com> wrote:

> Wrong TCA_U32_SEL attribute packing if prefixLen AND 0x1f equals 0x1f.
> These are  /31, /63, /95 and /127 prefix lengths.
> 
> Example:
> ip6 dst face:b00f::/31
> filter parent b: protocol ipv6 pref 2307 u32
> filter parent b: protocol ipv6 pref 2307 u32 fh 800: ht divisor 1
> filter parent b: protocol ipv6 pref 2307 u32 fh 800::800 order 2048
> key ht 800 bkt 0
>   match faceb00f/ffffffff at 24
> 
> v2: previous patch was made with a wrong repo
> 
> Signed-off-by: Yulia Kartseva <hex@fb.com>

That came through correctly. Applied.
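
For readers who want to check which prefix lengths the quoted condition
covers, here is a small sketch. The helper names are made up for this note
and are not part of iproute2; the only thing taken from the patch
description is the test "prefixLen AND 0x1f equals 0x1f".

```c
#include <assert.h>
#include <stdbool.h>

/* True when the prefix ends one bit short of a 32-bit word boundary. */
static bool prefix_hits_corner_case(unsigned int prefix_len)
{
	assert(prefix_len <= 128);	/* IPv6 prefixes are at most 128 bits */
	return (prefix_len & 0x1f) == 0x1f;
}

/* Store every affected prefix length in out[] and return how many there are. */
static int collect_corner_cases(unsigned int out[4])
{
	int n = 0;

	for (unsigned int len = 0; len <= 128; len++)
		if (prefix_hits_corner_case(len))
			out[n++] = len;
	return n;
}
```

Walking 0..128 this way yields exactly the four lengths named in the
description: 31, 63, 95 and 127.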

^ permalink raw reply

* Re: [PATCH iproute2] ip xfrm: use correct key length for netlink message
From: Stephen Hemminger @ 2017-10-01 20:48 UTC (permalink / raw)
  To: Michal Kubecek; +Cc: netdev
In-Reply-To: <20170929114105.2D971A0F6B@unicorn.suse.cz>

On Fri, 29 Sep 2017 13:41:05 +0200 (CEST)
Michal Kubecek <mkubecek@suse.cz> wrote:

> When SA is added manually using "ip xfrm state add", xfrm_state_modify()
> uses alg_key_len field of struct xfrm_algo for the length of key passed to
> kernel in the netlink message. However alg_key_len is bit length of the key
> while we need byte length here. This is usually harmless as kernel ignores
> the excess data but when the bit length of the key exceeds 512
> (XFRM_ALGO_KEY_BUF_SIZE), it can result in buffer overflow.
> 
> We can simply divide by 8 here as the only place setting alg_key_len is in
> xfrm_algo_parse() where it is always set to a multiple of 8 (and there are
> already multiple places using "algo->alg_key_len / 8").
> 
> Signed-off-by: Michal Kubecek <mkubecek@suse.cz>

This looks correct, applied.
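
The bit/byte mix-up described in the quoted commit message can be sketched as
follows. This is an illustration only, not the iproute2 code: KEY_BUF_SIZE and
both helpers are hypothetical names, and the value 512 is simply the buffer
size quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define KEY_BUF_SIZE 512	/* illustrative; mirrors the 512 quoted above */

/* Convert a key length given in bits to the number of bytes to copy. */
static size_t key_len_bytes(unsigned int key_len_bits)
{
	/* The message states the bit length is always a multiple of 8. */
	assert(key_len_bits % 8 == 0);
	return key_len_bits / 8;
}

/* Would copying copy_len bytes stay inside the key buffer? */
static bool copy_len_is_safe(size_t copy_len)
{
	return copy_len <= KEY_BUF_SIZE;
}
```

With a 1024-bit key, using the bit count as a byte count asks for 1024 bytes
and fails the check, while key_len_bytes(1024) gives 128 and passes.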

^ permalink raw reply

* [BUG] bpf is broken in net-next
From: Stephen Hemminger @ 2017-10-01 21:02 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: netdev

Recent regression in net-next: building bpf.c in samples/bpf is now broken.

$ make samples/bpf/


  HOSTCC  samples/bpf/../../tools/lib/bpf/bpf.o
samples/bpf/../../tools/lib/bpf/bpf.c: In function ‘bpf_create_map_node’:
samples/bpf/../../tools/lib/bpf/bpf.c:76:13: error: ‘union bpf_attr’ has no member named ‘map_name’; did you mean ‘map_type’?
  memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
             ^
samples/bpf/../../tools/lib/bpf/bpf.c:76:44: error: ‘BPF_OBJ_NAME_LEN’ undeclared (first use in this function)
  memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
                                            ^
samples/bpf/../../tools/lib/bpf/bpf.c:49:27: note: in definition of macro ‘min’
 #define min(x, y) ((x) < (y) ? (x) : (y))
                           ^
samples/bpf/../../tools/lib/bpf/bpf.c:76:44: note: each undeclared identifier is reported only once for each function it appears in
  memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
                                            ^
samples/bpf/../../tools/lib/bpf/bpf.c:49:27: note: in definition of macro ‘min’
 #define min(x, y) ((x) < (y) ? (x) : (y))
                           ^
samples/bpf/../../tools/lib/bpf/bpf.c: In function ‘bpf_create_map_in_map_node’:
samples/bpf/../../tools/lib/bpf/bpf.c:116:13: error: ‘union bpf_attr’ has no member named ‘map_name’; did you mean ‘map_type’?
  memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
             ^
samples/bpf/../../tools/lib/bpf/bpf.c:116:44: error: ‘BPF_OBJ_NAME_LEN’ undeclared (first use in this function)
  memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
                                            ^
samples/bpf/../../tools/lib/bpf/bpf.c:49:27: note: in definition of macro ‘min’
 #define min(x, y) ((x) < (y) ? (x) : (y))
                           ^
samples/bpf/../../tools/lib/bpf/bpf.c: In function ‘bpf_load_program_name’:
samples/bpf/../../tools/lib/bpf/bpf.c:154:13: error: ‘union bpf_attr’ has no member named ‘prog_name’; did you mean ‘prog_type’?
  memcpy(attr.prog_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
             ^
samples/bpf/../../tools/lib/bpf/bpf.c:154:45: error: ‘BPF_OBJ_NAME_LEN’ undeclared (first use in this function)
  memcpy(attr.prog_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
                                             ^
samples/bpf/../../tools/lib/bpf/bpf.c:49:27: note: in definition of macro ‘min’
 #define min(x, y) ((x) < (y) ? (x) : (y))
                           ^
scripts/Makefile.host:118: recipe for target 'samples/bpf/../../tools/lib/bpf/bpf.o' failed

^ permalink raw reply

* [PATCH net-next] samples/bpf: fix warnings in xdp_monitor_user
From: Stephen Hemminger @ 2017-10-01 21:07 UTC (permalink / raw)
  To: ast, daniel; +Cc: netdev, Stephen Hemminger, Stephen Hemminger

Make local functions static to fix

  HOSTCC  samples/bpf/xdp_monitor_user.o
samples/bpf/xdp_monitor_user.c:64:7: warning: no previous prototype for ‘gettime’ [-Wmissing-prototypes]
 __u64 gettime(void)
       ^~~~~~~
samples/bpf/xdp_monitor_user.c:209:6: warning: no previous prototype for ‘print_bpf_prog_info’ [-Wmissing-prototypes]
 void print_bpf_prog_info(void)
      ^~~~~~~~~~~~~~~~~~~
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 samples/bpf/xdp_monitor_user.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/xdp_monitor_user.c b/samples/bpf/xdp_monitor_user.c
index b51b4f5e3257..c5ab8b776973 100644
--- a/samples/bpf/xdp_monitor_user.c
+++ b/samples/bpf/xdp_monitor_user.c
@@ -61,7 +61,7 @@ static void usage(char *argv[])
 }
 
 #define NANOSEC_PER_SEC 1000000000 /* 10^9 */
-__u64 gettime(void)
+static __u64 gettime(void)
 {
 	struct timespec t;
 	int res;
@@ -206,7 +206,7 @@ static void stats_poll(int interval, bool err_only)
 	}
 }
 
-void print_bpf_prog_info(void)
+static void print_bpf_prog_info(void)
 {
 	int i;
 
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH 00/18] use ARRAY_SIZE macro
From: Tobin C. Harding @ 2017-10-01 22:01 UTC (permalink / raw)
  To: Jérémy Lefaure
  Cc: alsa-devel, nouveau, dri-devel, dm-devel, brcm80211-dev-list,
	devel, linux-scsi, linux-rdma, amd-gfx, Jason Gunthorpe,
	linux-acpi, linux-video, intel-wired-lan, linux-media, intel-gfx,
	ecryptfs, linux-nfs, linux-raid, openipmi-developer,
	intel-gvt-dev, devel, brcm80211-dev-list.pdl, netdev, linux-usb,
	linux-wireless, linux-kernel
In-Reply-To: <20171001193101.8898-1-jeremy.lefaure@lse.epita.fr>

On Sun, Oct 01, 2017 at 03:30:38PM -0400, Jérémy Lefaure wrote:
> Hi everyone,
> Using ARRAY_SIZE improves the code readability. I used coccinelle (I
> made a change to the array_size.cocci file [1]) to find several places
> where ARRAY_SIZE could be used instead of other macros or sizeof
> division.
> 
> I tried to divide the changes into a patch per subsystem (excepted for
> staging). If one of the patch should be split into several patches, let
> me know.
> 
> In order to reduce the size of the To: and Cc: lines, each patch of the
> series is sent only to the maintainers and lists concerned by the patch.
> This cover letter is sent to every list concerned by this series.

Why don't you just send individual patches for each subsystem? I'm not a maintainer but I don't see
how any one person is going to be able to apply this whole series, it is making it hard for
maintainers if they have to pick patches out from among the series (if indeed any will bother
doing that).

I get that this will be more work for you but AFAIK we are optimizing for maintainers.

Good luck,
Tobin.

^ permalink raw reply

* [RFC] compat SIOCADDRT problems
From: Al Viro @ 2017-10-01 22:13 UTC (permalink / raw)
  To: netdev; +Cc: Ralf Baechle

	Handling of SIOC{ADD,DEL}RT for 32bit is somewhat odd.  AFAICS,
the rules for native ioctl look like this:

AF_APPLETALK, AF_INET, AF_IPX, AF_PACKET: take struct rtentry.  The last one
doesn't have ->compat_ioctl() and 32bit automatically hits routing_ioctl()
in net/socket.c, the rest have ->compat_ioctl() but it doesn't recognize
SIOC{ADD,DEL}RT, so it ends up handled by the same code.

AF_INET6: takes struct in6_rtmsg.  Hits routing_ioctl(), which recognizes ipv6
and does the right thing.

AF_X25: takes x25_route_struct.  Layout is apparently identical for 32bit and
64bit.  Has ->compat_ioctl(), which does the same thing as ->ioctl() on those
two.

AF_AX25: takes struct ax25_routes_struct.  Again, identical layout on 32bit
and 64bit.  Unfortunately, there's no ->compat_ioctl() in this one, so we
end up hitting routing_ioctl() and get screwed.
AF_NETROM: same as previous, except that it takes struct nr_route_struct.
Apparently broken.
AF_ROSE: ditto, with struct rose_route_struct.

AF_QIPCRTR: explicitly recognizes and fails with -EINVAL.  Odd (other protocol
families without SIOCADDRT support fail with -ENOTTY), but clearly not an issue
for compat code.

Everything else: fails with -ENOTTY.


	Are AF_{AX25,NETROM,ROSE} really broken for 32bit processes on biarch
hosts, or am I missing something subtle in there?
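
To make the layout point concrete, here is a userspace sketch. None of these
structures are the kernel's; all three are hypothetical stand-ins. The first
contains a long and a pointer, so its size depends on the ABI and a 32-bit
caller's copy needs translating on a 64-bit kernel. The last uses only
fixed-width fields, so it looks the same to both, and running it through a
translation meant for the first kind of structure would corrupt it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request whose layout follows the ABI. */
struct native_route {
	unsigned long flags;	/* 4 bytes on ILP32, 8 bytes on LP64 */
	char *dev;		/* pointer width differs as well */
};

/* The same request as a 32-bit caller would lay it out. */
struct compat_route {
	uint32_t flags;
	uint32_t dev;		/* user pointer carried as a 32-bit value */
};

/* Hypothetical request built only from fixed-width fields. */
struct fixed_route {
	uint8_t callsign[7];
	uint8_t ndigis;
	uint32_t metric;
};

/* Nonzero when the native and 32-bit layouts differ on this build. */
static int native_needs_translation(void)
{
	return sizeof(struct native_route) != sizeof(struct compat_route);
}

/* The fixed-width request has one size regardless of the ABI. */
static size_t fixed_route_size(void)
{
	size_t size = sizeof(struct fixed_route);

	assert(size == 12);
	return size;
}
```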

^ permalink raw reply

* [PATCH net 0/3] net: TCP/IP: A few minor cleanups
From: Michael Witten @ 2017-10-01 22:19 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI
  Cc: Stephen Hemminger, Eric Dumazet, netdev, linux-kernel
In-Reply-To: <45aab5effc0c424a992646a97cf2ec14-mfwitten@gmail.com>

The following patch series is an ad hoc "cleanup" that I made
while perusing the code (I'm not well versed in this code, so I
would not be surprised if there were objections to the changes):

  [1] net: __sock_cmsg_send(): Remove unused parameter `msg'
  [2] net: inet_recvmsg(): Remove unnecessary bitwise operation
  [3] net: skb_queue_purge(): lock/unlock the queue only once

Each patch will be sent as an individual reply to this email;
the total diff is appended below for your convenience.

You may also fetch these patches from GitHub:

  git checkout --detach 5969d1bb3082b41eba8fd2c826559abe38ccb6df
  git pull https://github.com/mfwitten/linux.git net/tcp-ip/01-cleanup/02

Overall:

  include/net/sock.h     |  2 +-
  net/core/skbuff.c      | 26 ++++++++++++++++++--------
  net/core/sock.c        |  4 ++--
  net/ipv4/af_inet.c     |  2 +-
  net/ipv4/ip_sockglue.c |  2 +-
  net/ipv6/datagram.c    |  2 +-
  6 files changed, 24 insertions(+), 14 deletions(-)

Sincerely,
Michael Witten

diff --git a/include/net/sock.h b/include/net/sock.h
index 03a362568357..83373d7148a9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1562,7 +1562,7 @@ struct sockcm_cookie {
 	u16 tsflags;
 };
 
-int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 		     struct sockcm_cookie *sockc);
 int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
 		   struct sockcm_cookie *sockc);
diff --git a/net/core/sock.c b/net/core/sock.c
index 9b7b6bbb2a23..425e03fe1c56 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2091,7 +2091,7 @@ struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
 }
 EXPORT_SYMBOL(sock_alloc_send_skb);
 
-int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 		     struct sockcm_cookie *sockc)
 {
 	u32 tsflags;
@@ -2137,7 +2137,7 @@ int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
 			return -EINVAL;
 		if (cmsg->cmsg_level != SOL_SOCKET)
 			continue;
-		ret = __sock_cmsg_send(sk, msg, cmsg, sockc);
+		ret = __sock_cmsg_send(sk, cmsg, sockc);
 		if (ret)
 			return ret;
 	}
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index e558e4f9597b..c79b7822b0b9 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -263,7 +263,7 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 		}
 #endif
 		if (cmsg->cmsg_level == SOL_SOCKET) {
-			err = __sock_cmsg_send(sk, msg, cmsg, &ipc->sockc);
+			err = __sock_cmsg_send(sk, cmsg, &ipc->sockc);
 			if (err)
 				return err;
 			continue;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index a1f918713006..1d1926a4cbe2 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -756,7 +756,7 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 		}
 
 		if (cmsg->cmsg_level == SOL_SOCKET) {
-			err = __sock_cmsg_send(sk, msg, cmsg, sockc);
+			err = __sock_cmsg_send(sk, cmsg, sockc);
 			if (err)
 				return err;
 			continue;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index e31108e5ef79..2dbed042a412 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -791,7 +791,7 @@ int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	sock_rps_record_flow(sk);
 
 	err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
-				   flags & ~MSG_DONTWAIT, &addr_len);
+				   flags, &addr_len);
 	if (err >= 0)
 		msg->msg_namelen = addr_len;
 	return err;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 68065d7d383f..bd26b0bde784 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2825,18 +2825,28 @@
 EXPORT_SYMBOL(skb_dequeue_tail);
 
 /**
- *	skb_queue_purge - empty a list
- *	@list: list to empty
+ *	skb_queue_purge - empty a queue
+ *	@q: the queue to empty
  *
- *	Delete all buffers on an &sk_buff list. Each buffer is removed from
- *	the list and one reference dropped. This function takes the list
- *	lock and is atomic with respect to other list locking functions.
+ *	Dequeue and free each socket buffer that is in @q.
+ *
+ *	This function is atomic with respect to other queue-locking functions.
  */
-void skb_queue_purge(struct sk_buff_head *list)
+void skb_queue_purge(struct sk_buff_head *q)
 {
-	struct sk_buff *skb;
-	while ((skb = skb_dequeue(list)) != NULL)
+	unsigned long flags;
+	struct sk_buff *skb, *next, *head = (struct sk_buff *)q;
+
+	spin_lock_irqsave(&q->lock, flags);
+	skb = q->next;
+	__skb_queue_head_init(q);
+	spin_unlock_irqrestore(&q->lock, flags);
+
+	while (skb != head) {
+		next = skb->next;
 		kfree_skb(skb);
+		skb = next;
+	}
 }
 EXPORT_SYMBOL(skb_queue_purge);
 
-- 
2.14.1

^ permalink raw reply related

* [PATCH net 1/3] net: __sock_cmsg_send(): Remove unused parameter `msg'
From: Michael Witten @ 2017-10-01 22:19 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI
  Cc: Stephen Hemminger, Eric Dumazet, netdev, linux-kernel
In-Reply-To: <14527e6c082e4ea282a3f833118c68df-mfwitten@gmail.com>

Date: Thu, 7 Sep 2017 03:21:38 +0000
Signed-off-by: Michael Witten <mfwitten@gmail.com>
---
 include/net/sock.h     | 2 +-
 net/core/sock.c        | 4 ++--
 net/ipv4/ip_sockglue.c | 2 +-
 net/ipv6/datagram.c    | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 03a362568357..83373d7148a9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1562,7 +1562,7 @@ struct sockcm_cookie {
 	u16 tsflags;
 };
 
-int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 		     struct sockcm_cookie *sockc);
 int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
 		   struct sockcm_cookie *sockc);
diff --git a/net/core/sock.c b/net/core/sock.c
index 9b7b6bbb2a23..425e03fe1c56 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2091,7 +2091,7 @@ struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
 }
 EXPORT_SYMBOL(sock_alloc_send_skb);
 
-int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 		     struct sockcm_cookie *sockc)
 {
 	u32 tsflags;
@@ -2137,7 +2137,7 @@ int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
 			return -EINVAL;
 		if (cmsg->cmsg_level != SOL_SOCKET)
 			continue;
-		ret = __sock_cmsg_send(sk, msg, cmsg, sockc);
+		ret = __sock_cmsg_send(sk, cmsg, sockc);
 		if (ret)
 			return ret;
 	}
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index e558e4f9597b..c79b7822b0b9 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -263,7 +263,7 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 		}
 #endif
 		if (cmsg->cmsg_level == SOL_SOCKET) {
-			err = __sock_cmsg_send(sk, msg, cmsg, &ipc->sockc);
+			err = __sock_cmsg_send(sk, cmsg, &ipc->sockc);
 			if (err)
 				return err;
 			continue;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index a1f918713006..1d1926a4cbe2 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -756,7 +756,7 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 		}
 
 		if (cmsg->cmsg_level == SOL_SOCKET) {
-			err = __sock_cmsg_send(sk, msg, cmsg, sockc);
+			err = __sock_cmsg_send(sk, cmsg, sockc);
 			if (err)
 				return err;
 			continue;
-- 
2.14.1

^ permalink raw reply related

* [PATCH net 2/3] net: inet_recvmsg(): Remove unnecessary bitwise operation
From: Michael Witten @ 2017-10-01 22:19 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI
  Cc: Stephen Hemminger, Eric Dumazet, netdev, linux-kernel
In-Reply-To: <14527e6c082e4ea282a3f833118c68df-mfwitten@gmail.com>

Date: Fri, 8 Sep 2017 00:47:49 +0000
The flag `MSG_DONTWAIT' is handled by passing an argument through
the dedicated parameter `nonblock' of the function `tcp_recvmsg()'.

Presumably because `MSG_DONTWAIT' is handled so explicitly, it is
unset in the collection of flags that are passed to `tcp_recvmsg()';
yet, this unsetting appears to be unnecessary, and so this commit
removes the bitwise operation that performs the unsetting.

Signed-off-by: Michael Witten <mfwitten@gmail.com>
---
 net/ipv4/af_inet.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index e31108e5ef79..2dbed042a412 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -791,7 +791,7 @@ int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	sock_rps_record_flow(sk);
 
 	err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
-				   flags & ~MSG_DONTWAIT, &addr_len);
+				   flags, &addr_len);
 	if (err >= 0)
 		msg->msg_namelen = addr_len;
 	return err;
-- 
2.14.1
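
The bitwise operation that this patch removes can be shown in isolation. The
sketch below is userspace C with made-up names (struct recv_args,
split_before(), split_after()); it only demonstrates what the two argument
values look like before and after the change. Whether the callee cares about
the extra bit is the open question of the patch and is not answered here.

```c
#include <assert.h>
#include <sys/socket.h>

/* The two values handed to the protocol's recvmsg handler. */
struct recv_args {
	int nonblock;	/* nonzero when MSG_DONTWAIT was requested */
	int flags;	/* remaining message flags */
};

/* Before the patch: MSG_DONTWAIT is stripped from the flags argument. */
static struct recv_args split_before(int flags)
{
	struct recv_args a = {
		.nonblock = flags & MSG_DONTWAIT,
		.flags = flags & ~MSG_DONTWAIT,
	};

	assert((a.flags & MSG_DONTWAIT) == 0);
	return a;
}

/* After the patch: the flags are passed through unchanged. */
static struct recv_args split_after(int flags)
{
	struct recv_args a = {
		.nonblock = flags & MSG_DONTWAIT,
		.flags = flags,
	};

	return a;
}
```

In both variants the nonblock value is identical; only the flags argument
differs, by exactly the MSG_DONTWAIT bit.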

^ permalink raw reply related

* [PATCH net 3/3] net: skb_queue_purge(): lock/unlock the queue only once
From: Michael Witten @ 2017-10-01 22:19 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI
  Cc: Stephen Hemminger, Eric Dumazet, netdev, linux-kernel
In-Reply-To: <60c8906b751d4915be456009c220516e-mfwitten@gmail.com>

Date: Sat, 9 Sep 2017 05:50:23 +0000
Hitherto, the queue's lock has been locked/unlocked every time
an item is dequeued; this seems not only inefficient, but also
incorrect, as the whole point of `skb_queue_purge()' is to clear
the queue, presumably without giving any other thread a chance to
manipulate the queue in the interim.

With this commit, the queue's lock is locked/unlocked only once
when `skb_queue_purge()' is called, and in a way that disables
the IRQs for only a minimal amount of time.

This is achieved by atomically re-initializing the queue (thereby
clearing it), and then freeing each of the items as though it were
enqueued in a private queue that doesn't require locking.

Signed-off-by: Michael Witten <mfwitten@gmail.com>
---
 net/core/skbuff.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 68065d7d383f..bd26b0bde784 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2825,18 +2825,28 @@ struct sk_buff *skb_dequeue_tail(struct sk_buff_head *list)
 EXPORT_SYMBOL(skb_dequeue_tail);
 
 /**
- *	skb_queue_purge - empty a list
- *	@list: list to empty
+ *	skb_queue_purge - empty a queue
+ *	@q: the queue to empty
  *
- *	Delete all buffers on an &sk_buff list. Each buffer is removed from
- *	the list and one reference dropped. This function takes the list
- *	lock and is atomic with respect to other list locking functions.
+ *	Dequeue and free each socket buffer that is in @q.
+ *
+ *	This function is atomic with respect to other queue-locking functions.
  */
-void skb_queue_purge(struct sk_buff_head *list)
+void skb_queue_purge(struct sk_buff_head *q)
 {
-	struct sk_buff *skb;
-	while ((skb = skb_dequeue(list)) != NULL)
+	unsigned long flags;
+	struct sk_buff *skb, *next, *head = (struct sk_buff *)q;
+
+	spin_lock_irqsave(&q->lock, flags);
+	skb = q->next;
+	__skb_queue_head_init(q);
+	spin_unlock_irqrestore(&q->lock, flags);
+
+	while (skb != head) {
+		next = skb->next;
 		kfree_skb(skb);
+		skb = next;
+	}
 }
 EXPORT_SYMBOL(skb_queue_purge);
 
-- 
2.14.1

^ permalink raw reply related

* Re: [PATCH 00/18] use ARRAY_SIZE macro
From: Jérémy Lefaure @ 2017-10-02  0:52 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: alsa-devel, nouveau, dri-devel, dm-devel, brcm80211-dev-list,
	devel, linux-scsi, linux-rdma, amd-gfx, Jason Gunthorpe,
	linux-acpi, linux-video, intel-wired-lan, linux-media, intel-gfx,
	ecryptfs, linux-nfs, linux-raid, openipmi-developer,
	intel-gvt-dev, devel, brcm80211-dev-list.pdl, netdev, linux-usb,
	linux-wireless, linux-kernel, linux-integrity
In-Reply-To: <20171001220131.GA11812@eros>

On Mon, 2 Oct 2017 09:01:31 +1100
"Tobin C. Harding" <me@tobin.cc> wrote:

> > In order to reduce the size of the To: and Cc: lines, each patch of the
> > series is sent only to the maintainers and lists concerned by the patch.
> > This cover letter is sent to every list concerned by this series.  
> 
> Why don't you just send individual patches for each subsystem? I'm not a maintainer but I don't see
> how any one person is going to be able to apply this whole series, it is making it hard for
> maintainers if they have to pick patches out from among the series (if indeed any will bother
> doing that).
Yeah, maybe it would have been better to send individual patches.

From my point of view it's a series because the patches are related (I
did a git format-patch from my local branch). But from the maintainers'
point of view, they are individual patches.

As each patch in this series is standalone, the maintainers can pick
the patches they want and apply them individually. So yeah, maybe I
should have sent individual patches. But I also wanted to tie all
patches together as it's the same change.

Anyway, I can tell each maintainer that they can apply the patches
they're concerned about and next time I may send individual patches.

Thank you for your answer,
Jérémy

^ permalink raw reply

* Re: [PATCH net 3/3] net: skb_queue_purge(): lock/unlock the queue only once
From: Stephen Hemminger @ 2017-10-02  0:59 UTC (permalink / raw)
  To: Michael Witten
  Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Eric Dumazet, netdev, linux-kernel
In-Reply-To: <ccfa23a0f76c4c9ebd07fc991db2f829-mfwitten@gmail.com>

On Sun, 01 Oct 2017 22:19:20 -0000
Michael Witten <mfwitten@gmail.com> wrote:

> +	spin_lock_irqsave(&q->lock, flags);
> +	skb = q->next;
> +	__skb_queue_head_init(q);
> +	spin_unlock_irqrestore(&q->lock, flags);

Other code manipulating lists uses a splice operation and
a temporary sk_buff_head on the stack. That would be easier
to understand.

	struct sk_buff_head head;

	__skb_queue_head_init(&head);
	spin_lock_irqsave(&q->lock, flags);
	skb_queue_splice_init(q, &head);
	spin_unlock_irqrestore(&q->lock, flags);


> +	while (skb != head) {
> +		next = skb->next;
>  		kfree_skb(skb);
> +		skb = next;
> +	}

It would be cleaner if you could use
skb_queue_walk_safe rather than open coding the loop.

	skb_queue_walk_safe(&head, skb,  tmp)
		kfree_skb(skb);
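
The pattern under discussion (take the lock once, detach the whole list, then
free the entries without holding the lock) can be tried out in userspace. The
sketch below is an analogue under stated assumptions, not kernel code: it uses
a singly linked list instead of sk_buff_head, a C11 atomic_flag as a stand-in
for the spinlock, and free() instead of kfree_skb(). All names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
	struct node *next;
};

struct queue {
	atomic_flag lock;	/* stand-in for the queue's spinlock */
	struct node *head;	/* singly linked, NULL-terminated */
	size_t len;
};

static void queue_lock(struct queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

static void queue_init(struct queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = NULL;
	q->len = 0;
}

/* Add one freshly allocated entry; returns 0 on success, -1 on failure. */
static int queue_push(struct queue *q)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	queue_lock(q);
	n->next = q->head;
	q->head = n;
	q->len++;
	queue_unlock(q);
	return 0;
}

/* Empty the queue with a single lock/unlock pair; returns entries freed. */
static size_t queue_purge(struct queue *q)
{
	struct node *n, *next;
	size_t freed = 0;

	assert(q != NULL);

	/* Detach the whole list while holding the lock ... */
	queue_lock(q);
	n = q->head;
	q->head = NULL;
	q->len = 0;
	queue_unlock(q);

	/* ... then free the now private list without any locking. */
	while (n) {
		next = n->next;
		free(n);
		n = next;
		freed++;
	}
	return freed;
}
```

The lock hold time no longer grows with the queue length, at the cost that
entries added by other threads after the detach survive the purge.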

^ permalink raw reply

* Re: [BUG] bpf is broken in net-next
From: Alexei Starovoitov @ 2017-10-02  1:34 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Martin KaFai Lau, netdev
In-Reply-To: <20171001140230.6a494b48@xeon-e3>

On Sun, Oct 01, 2017 at 02:02:30PM -0700, Stephen Hemminger wrote:
> Recent regression in net-next building bpf.c in samples/bpf now broken.
> 
> $ make samples/bpf/
> 
> 
>   HOSTCC  samples/bpf/../../tools/lib/bpf/bpf.o
> samples/bpf/../../tools/lib/bpf/bpf.c: In function ‘bpf_create_map_node’:
> samples/bpf/../../tools/lib/bpf/bpf.c:76:13: error: ‘union bpf_attr’ has no member named ‘map_name’; did you mean ‘map_type’?
>   memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));
>              ^
> samples/bpf/../../tools/lib/bpf/bpf.c:76:44: error: ‘BPF_OBJ_NAME_LEN’ undeclared (first use in this function)
>   memcpy(attr.map_name, name, min(name_len, BPF_OBJ_NAME_LEN - 1));

everything works fine for me...
did you do 'make headers_install' ?

^ permalink raw reply

* Re: [PATCH net-next] samples/bpf: fix warnings in xdp_monitor_user
From: Alexei Starovoitov @ 2017-10-02  1:35 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: ast, daniel, netdev, Stephen Hemminger, Jesper Dangaard Brouer
In-Reply-To: <20171001210734.27010-1-sthemmin@microsoft.com>

On Sun, Oct 01, 2017 at 02:07:34PM -0700, Stephen Hemminger wrote:
> Make local functions static to fix
> 
>   HOSTCC  samples/bpf/xdp_monitor_user.o
> samples/bpf/xdp_monitor_user.c:64:7: warning: no previous prototype for ‘gettime’ [-Wmissing-prototypes]
>  __u64 gettime(void)
>        ^~~~~~~
> samples/bpf/xdp_monitor_user.c:209:6: warning: no previous prototype for ‘print_bpf_prog_info’ [-Wmissing-prototypes]
>  void print_bpf_prog_info(void)
>       ^~~~~~~~~~~~~~~~~~~
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

Fixes: 3ffab5460264 ("samples/bpf: xdp_monitor tool based on tracepoints")
Acked-by: Alexei Starovoitov <ast@kernel.org>

^ permalink raw reply

* Re: [RFC net-next 1/5] net: dsa: Add infrastructure to support LAG
From: Andrew Lunn @ 2017-10-02  2:03 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev, vivien.didelot, jiri, idosch, Woojung.Huh, john,
	sean.wang
In-Reply-To: <20171001194639.8647-2-f.fainelli@gmail.com>

On Sun, Oct 01, 2017 at 12:46:35PM -0700, Florian Fainelli wrote:
> Add the necessary logic to support network device events targetting LAG events,
> this is loosely inspired from mlxsw/spectrum.c.
> 
> In the process we change dsa_slave_changeupper() to be more generic and be called
> from both LAG events as well as normal bridge enslaving events paths.
> 
> The DSA layer takes care of managing the LAG group identifiers, how many LAGs
> may be supported by a switch, and how many members per LAG are supported by a
> switch device. When a LAG group is identified, the port is then configured to
> be a part of that group. When a LAG group no longer has any users, we remove it
> and we tell the drivers whether it is safe to disable trunking altogether.
> 
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  include/net/dsa.h  |  25 +++++++++
>  net/dsa/dsa2.c     |  12 ++++
>  net/dsa/dsa_priv.h |   7 +++
>  net/dsa/port.c     |  92 +++++++++++++++++++++++++++++++
>  net/dsa/slave.c    | 157 +++++++++++++++++++++++++++++++++++++++++++++++++----
>  net/dsa/switch.c   |  30 ++++++++++
>  6 files changed, 312 insertions(+), 11 deletions(-)
> 
> diff --git a/include/net/dsa.h b/include/net/dsa.h
> index 10dceccd9ce8..247ea58add68 100644
> --- a/include/net/dsa.h
> +++ b/include/net/dsa.h
> @@ -182,12 +182,20 @@ struct dsa_port {
>  	u8			stp_state;
>  	struct net_device	*bridge_dev;
>  	struct devlink_port	devlink_port;
> +	u8			lag_id;
> +	bool			lagged;
>  	/*
>  	 * Original copy of the master netdev ethtool_ops
>  	 */
>  	const struct ethtool_ops *orig_ethtool_ops;
>  };
>  
> +struct dsa_lag_group {
> +	/* Used to know when we can disable lag on the switch */
> +	unsigned int		ref_count;

Hi Florian

In what contexts is ref_count manipulated? Normally you would use
refcount_t and the operations in linux/refcount.h. But if you know
there is some other protection, e.g. rtnl, an unsigned int is O.K.
Maybe scatter some ASSERT_RTNL() in the code?

> +static bool dsa_slave_lag_check(struct net_device *dev, struct net_device *lag_dev,
> +				struct netdev_lag_upper_info *lag_upper_info)
> +{
> +	struct dsa_slave_priv *p = netdev_priv(dev);
> +	u8 lag_id;
> +
> +	/* No more lag identifiers available or already in use */
> +	if (dsa_switch_lag_get_index(p->dp->ds, lag_dev, &lag_id) != 0)
> +		return false;
> +
> +	if (lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH)
> +		return false;

I wonder if the driver needs to decide this? Can different hardware
support different tx_types?

	Andrew

^ permalink raw reply

* Re: [PATCH net-next] vhost_net: do not stall on zerocopy depletion
From: Michael S. Tsirkin @ 2017-10-02  4:08 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David Miller, Jason Wang, Koichiro Den,
	virtualization, Willem de Bruijn
In-Reply-To: <CAF=yD-KotdpHs96GomMKR-BqG3Gyrvo+to0sk2=a6E5BKjgpkg@mail.gmail.com>

On Fri, Sep 29, 2017 at 09:25:27PM -0400, Willem de Bruijn wrote:
> On Fri, Sep 29, 2017 at 3:38 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Wed, Sep 27, 2017 at 08:25:56PM -0400, Willem de Bruijn wrote:
> >> From: Willem de Bruijn <willemb@google.com>
> >>
> >> Vhost-net has a hard limit on the number of zerocopy skbs in flight.
> >> When reached, transmission stalls. Stalls cause latency, as well as
> >> head-of-line blocking of other flows that do not use zerocopy.
> >>
> >> Instead of stalling, revert to copy-based transmission.
> >>
> >> Tested by sending two udp flows from guest to host, one with payload
> >> of VHOST_GOODCOPY_LEN, the other too small for zerocopy (1B). The
> >> large flow is redirected to a netem instance with 1MBps rate limit
> >> and deep 1000 entry queue.
> >>
> >>   modprobe ifb
> >>   ip link set dev ifb0 up
> >>   tc qdisc add dev ifb0 root netem limit 1000 rate 1MBit
> >>
> >>   tc qdisc add dev tap0 ingress
> >>   tc filter add dev tap0 parent ffff: protocol ip \
> >>       u32 match ip dport 8000 0xffff \
> >>       action mirred egress redirect dev ifb0
> >>
> >> Before the delay, both flows process around 80K pps. With the delay,
> >> before this patch, both process around 400. After this patch, the
> >> large flow is still rate limited, while the small reverts to its
> >> original rate. See also discussion in the first link, below.
> >>
> >> The limit in vhost_exceeds_maxpend must be carefully chosen. With a
> >> limit of vq->num >> 1, the flows remain correlated. This value
> >> happens to correspond to VHOST_MAX_PENDING for vq->num == 256. Allow
> >> smaller fractions and ensure correctness also for much smaller
> >> values of vq->num, by testing the min() of both explicitly. See also
> >> the discussion in the second link below.
> >>
> >> Link: http://lkml.kernel.org/r/CAF=yD-+Wk9sc9dXMUq1+x_hh=3ThTXa6BnZkygP3tgVpjbp93g@mail.gmail.com
> >> Link: http://lkml.kernel.org/r/20170819064129.27272-1-den@klaipeden.com
> >> Signed-off-by: Willem de Bruijn <willemb@google.com>
> >
> > I'd like to see the effect on the non rate limited case though.
> > If guest is quick won't we have lots of copies then?
> 
> Yes, but not significantly more than without this patch.
> 
> I ran 1, 10 and 100 flow tcp_stream throughput tests from a sender
> in the guest to a receiver in the host.
> 
> To answer the other benchmark question first, I did not see anything
> noteworthy when increasing vq->num from 256 to 1024.
> 
> With 1 and 10 flows, without this patch, all packets use zerocopy.
> With the patch, fewer than 1% eschew zerocopy.
> 
> With 100 flows, even without this patch, 90+% of packets are copied.
> Some zerocopy packets from vhost_net fail this test in tun.c:
> 
>     if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
> 
> These packets carry up to 21 frags. I'm not sure yet why or
> what the fraction of these packets is. But this in turn can
> disable zcopy_used in vhost_net_tx_select_zcopy for a
> larger share of packets:
> 
>         return !net->tx_flush &&
>                 net->tx_packets / 64 >= net->tx_zcopy_err;
> 
> Because the numbers of copied and zerocopy packets are the
> same before and after the patch, so are the overall throughput
> numbers.
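
The error-ratio check quoted above can be sketched in userspace as
follows. This is only an illustration, not the vhost_net code: the
struct tx_stats type and the select_zcopy() name are hypothetical
stand-ins for the three struct vhost_net fields that the quoted
condition reads, and how those counters are updated or reset is not
modelled here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields read by the quoted condition. */
struct tx_stats {
	unsigned int tx_packets;   /* packets counted so far */
	unsigned int tx_zcopy_err; /* zerocopy failures counted so far */
	bool tx_flush;             /* true while a flush is pending */
};

/*
 * Same expression as the quoted snippet: zerocopy is selected only when
 * no flush is pending and there is at most one failure per 64 packets.
 * The division is an integer division, so 127 packets tolerate one
 * failure and 128 packets tolerate two.
 */
static bool select_zcopy(const struct tx_stats *s)
{
	return !s->tx_flush && s->tx_packets / 64 >= s->tx_zcopy_err;
}
```

With tx_packets == 640, ten failures still allow zerocopy and an
eleventh disables it, which is how a burst of tun.c rejections can push
a larger share of packets onto the copy path.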

OK, thanks!
Are you looking into the new warnings that the kbuild system reported
with this patch?

Thanks,

-- 
MST

^ permalink raw reply

