Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net] geneve: update skb dst pmtu on tx path
From: David Miller @ 2018-01-02 17:36 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev
In-Reply-To: <fe176067f43de7928afcbc9810b13acb26f7adf0.1514184238.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Mon, 25 Dec 2017 14:43:58 +0800

> Commit a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") has fixed
> a performance issue caused by the change of lower dev's mtu for vxlan.
> 
> The same thing needs to be done for geneve as well.
> 
> Note that geneve cannot adjust it's mtu according to lower dev's mtu
> when creating it. The performance is very low later when netperfing
> over it without fixing the mtu manually. This patch could also avoid
> this issue.
> 
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] ip6_tunnel: disable dst caching if tunnel is dual-stack
From: David Miller @ 2018-01-02 17:33 UTC (permalink / raw)
  To: elicooper; +Cc: netdev
In-Reply-To: <20171225024349.4879-1-elicooper@gmx.com>

From: Eli Cooper <elicooper@gmx.com>
Date: Mon, 25 Dec 2017 10:43:49 +0800

> When an ip6_tunnel is in mode 'any', where the transport layer
> protocol can be either 4 or 41, dst_cache must be disabled.
> 
> This is because xfrm policies might apply to only one of the two
> protocols. Caching dst would cause xfrm policies for one protocol
> incorrectly used for the other.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Eli Cooper <elicooper@gmx.com>

Please do not CC: stable on networking changes, as per the netdev
FAQ I queue these up by hand.

> -	} else if (!(t->parms.flags &
> +	} else if (t->parms.proto != 0 && !(t->parms.flags &
>  		     (IP6_TNL_F_USE_ORIG_TCLASS | IP6_TNL_F_USE_ORIG_FWMARK))) {

When you adjust the indentation of an inner-expression, you must reindent the
subsequent lines that are also part of that inner-expression.

I've fixed up both of these issues and applied your patch, but please
take care of this yourself next time.

Thanks.

^ permalink raw reply

* Re: [PATCH] sky2: Replace mdelay with msleep in sky2_vpd_wait
From: David Miller @ 2018-01-02 17:27 UTC (permalink / raw)
  To: baijiaju1990; +Cc: mlindner, stephen, shemminger, netdev, linux-kernel
In-Reply-To: <1514632187-14849-1-git-send-email-baijiaju1990@gmail.com>

From: Jia-Ju Bai <baijiaju1990@gmail.com>
Date: Sat, 30 Dec 2017 19:09:47 +0800

> sky2_vpd_wait is not called in an interrupt handler nor holding a spinlock.
> The function mdelay in it can be replaced with msleep, to reduce busy wait.
> 
> Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2] net: sh-eth: Add flag to determine the type of TSU register
From: Sergei Shtylyov @ 2018-01-02 17:27 UTC (permalink / raw)
  To: Nobuhiro Iwamatsu, netdev; +Cc: yoshihiro.shimoda.uh
In-Reply-To: <1367903291-32323-1-git-send-email-nobuhiro.iwamatsu.yj@renesas.com>

Hello!

On 05/07/2013 09:08 AM, Nobuhiro Iwamatsu wrote:

> Some sh-eth devices may have two ether devices inside.
> And the function of TSU is accessed from each ether device.
> In this case, sh-eth need to remap address using devm_ioremap(),
> without using devm_ioremap_resource().
> tsu_shared_reg of sh_eth_cpu_data is used for this control.
> 
> Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
> ---
> V2:
>    - Change variable name from tsu_multi_reg to tsu_shared_reg.
>    - Update commit message.
> 
>   drivers/net/ethernet/renesas/sh_eth.c |   11 ++++++++++-
>   drivers/net/ethernet/renesas/sh_eth.h |    1 +
>   2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
> index 33dc6f2..489be0e 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
[...]
> @@ -2750,7 +2754,12 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
>   			ret = -ENODEV;
>   			goto out_release;
>   		}
> -		mdp->tsu_addr = devm_ioremap_resource(&pdev->dev, rtsu);
> +
> +		if (mdp->cd->tsu_shared_reg)
> +			mdp->tsu_addr = devm_ioremap(&pdev->dev, rtsu->start,
> +					resource_size(rtsu));

    I now think there's no need for a special flag -- we can just use 'devno' 
(at least for the pure platform devices). The DT probing case remains unsolved 
though (might make sense to look at the resource #0, whether it has 0x800 set 
or not)...

> +		else
> +			mdp->tsu_addr = devm_ioremap_resource(&pdev->dev, rtsu);
>   		if (IS_ERR(mdp->tsu_addr)) {
>   			ret = PTR_ERR(mdp->tsu_addr);
>   			goto out_release;

MBR, Sergei

^ permalink raw reply

* Re: [PATCH 00/10] Convert mvneta to phylink
From: Russell King - ARM Linux @ 2018-01-02 17:27 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

On Tue, Jan 02, 2018 at 05:22:42PM +0000, Russell King - ARM Linux wrote:
> Hi,
> 
> This series converts mvneta to use phylink, which is necessary to
> support the SFP cages on SolidRun's Clearfog platform.  This series just
> converts mvneta without adding the DT parts - having discussed with
> Andrew, we believe we're too close to the merge window to submit that
> patch.
> 
> I've split the "net: mvneta: convert to phylink" patch up to make it
> easier to review, and in doing so, spotted some minor corner cases that
> needed to be fixed along the way.
> 
> This series depends on the previously merged phylink patches in netdev,
> along with the recently reviewed 7 patch series "Resolve races in phy
> accessors" without which, the race described in patch 5 of that series
> is very evident when triggering a dummy hibernate cycle.
> 
> This series also illustrates how to convert mvpp2 to phylink.
> 
> mvneta is the only user of the fixed_phy_update_state() API, and this
> becomes redundant with the conversion.
> 
> It would be good to get this series not only reviewed, but also
> independently tested to ensure that I haven't missed anything - I only
> have the Clearfog platform to test on, and that doesn't support all the
> different interface modes that mvneta supports.
> 
> A particularly interesting side effect of this series is that DSA
> switches no longer need the "CPU" port and DSA facing MAC ethernet
> instance to be marked as a fixed link anymore with mvneta - we can use
> 1000BaseX mode, and the DSA to CPU link will use the 802.3z negotiation
> to determine the link properties without needing the link parameters to
> be explicitly stated in DT - that is a subject of a future patch.
> 
>  drivers/net/ethernet/marvell/Kconfig  |   2 +-
>  drivers/net/ethernet/marvell/mvneta.c | 687 ++++++++++++++++++++--------------
>  drivers/net/phy/fixed_phy.c           |  31 --
>  include/linux/phy_fixed.h             |   9 -
>  4 files changed, 405 insertions(+), 324 deletions(-)

Of course, this series is for net-next, not net... forgot to add that
annotation, sorry.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply

* [PATCH 10/10] net: phy: fixed-phy: remove fixed_phy_update_state()
From: Russell King @ 2018-01-02 17:25 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

mvneta is the only user of fixed_phy_update_state(), which has been
converted to use phylink instead.  Remove fixed_phy_update_state().

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/net/phy/fixed_phy.c | 31 -------------------------------
 include/linux/phy_fixed.h   |  9 ---------
 2 files changed, 40 deletions(-)

diff --git a/drivers/net/phy/fixed_phy.c b/drivers/net/phy/fixed_phy.c
index eb5167210681..001fe1df7557 100644
--- a/drivers/net/phy/fixed_phy.c
+++ b/drivers/net/phy/fixed_phy.c
@@ -115,37 +115,6 @@ int fixed_phy_set_link_update(struct phy_device *phydev,
 }
 EXPORT_SYMBOL_GPL(fixed_phy_set_link_update);
 
-int fixed_phy_update_state(struct phy_device *phydev,
-			   const struct fixed_phy_status *status,
-			   const struct fixed_phy_status *changed)
-{
-	struct fixed_mdio_bus *fmb = &platform_fmb;
-	struct fixed_phy *fp;
-
-	if (!phydev || phydev->mdio.bus != fmb->mii_bus)
-		return -EINVAL;
-
-	list_for_each_entry(fp, &fmb->phys, node) {
-		if (fp->addr == phydev->mdio.addr) {
-			write_seqcount_begin(&fp->seqcount);
-#define _UPD(x) if (changed->x) \
-	fp->status.x = status->x
-			_UPD(link);
-			_UPD(speed);
-			_UPD(duplex);
-			_UPD(pause);
-			_UPD(asym_pause);
-#undef _UPD
-			fixed_phy_update(fp);
-			write_seqcount_end(&fp->seqcount);
-			return 0;
-		}
-	}
-
-	return -ENOENT;
-}
-EXPORT_SYMBOL(fixed_phy_update_state);
-
 int fixed_phy_add(unsigned int irq, int phy_addr,
 		  struct fixed_phy_status *status,
 		  int link_gpio)
diff --git a/include/linux/phy_fixed.h b/include/linux/phy_fixed.h
index cf6392de6eb0..ee54453a40a0 100644
--- a/include/linux/phy_fixed.h
+++ b/include/linux/phy_fixed.h
@@ -24,9 +24,6 @@ extern void fixed_phy_unregister(struct phy_device *phydev);
 extern int fixed_phy_set_link_update(struct phy_device *phydev,
 			int (*link_update)(struct net_device *,
 					   struct fixed_phy_status *));
-extern int fixed_phy_update_state(struct phy_device *phydev,
-			   const struct fixed_phy_status *status,
-			   const struct fixed_phy_status *changed);
 #else
 static inline int fixed_phy_add(unsigned int irq, int phy_id,
 				struct fixed_phy_status *status,
@@ -50,12 +47,6 @@ static inline int fixed_phy_set_link_update(struct phy_device *phydev,
 {
 	return -ENODEV;
 }
-static inline int fixed_phy_update_state(struct phy_device *phydev,
-			   const struct fixed_phy_status *status,
-			   const struct fixed_phy_status *changed)
-{
-	return -ENODEV;
-}
 #endif /* CONFIG_FIXED_PHY */
 
 #endif /* __PHY_FIXED_H */
-- 
2.7.4

^ permalink raw reply related

* [PATCH 09/10] net: mvneta: add module EEPROM reading support
From: Russell King @ 2018-01-02 17:25 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

Add support for reading the SFF module's EEPROM via the ethtool API.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 6a018cfa36f6..25e9a551cc8c 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -4073,6 +4073,22 @@ static int mvneta_ethtool_set_wol(struct net_device *dev,
 	return ret;
 }
 
+static int mvneta_ethtool_get_module_info(struct net_device *dev,
+					  struct ethtool_modinfo *modinfo)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+
+	return phylink_ethtool_get_module_info(pp->phylink, modinfo);
+}
+
+static int mvneta_ethtool_get_module_eeprom(struct net_device *dev,
+					    struct ethtool_eeprom *ee, u8 *buf)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+
+	return phylink_ethtool_get_module_eeprom(pp->phylink, ee, buf);
+}
+
 static int mvneta_ethtool_get_eee(struct net_device *dev,
 				  struct ethtool_eee *eee)
 {
@@ -4147,6 +4163,8 @@ static const struct ethtool_ops mvneta_eth_tool_ops = {
 	.set_link_ksettings = mvneta_ethtool_set_link_ksettings,
 	.get_wol        = mvneta_ethtool_get_wol,
 	.set_wol        = mvneta_ethtool_set_wol,
+	.get_module_info = mvneta_ethtool_get_module_info,
+	.get_module_eeprom = mvneta_ethtool_get_module_eeprom,
 	.get_eee	= mvneta_ethtool_get_eee,
 	.set_eee	= mvneta_ethtool_set_eee,
 };
-- 
2.7.4

^ permalink raw reply related

* [PATCH 08/10] net: mvneta: disable MVNETA_CAUSE_PSC_SYNC_CHANGE interrupt
From: Russell King @ 2018-01-02 17:25 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

The PSC sync change interrupt can fire multiple times while the link is
down, which is caused by noise on the serdes lines. As this isn't
information we make use of, it's pointless having the interrupt enabled.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 44d665887b50..6a018cfa36f6 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2704,8 +2704,7 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
 		mvreg_write(pp, MVNETA_INTR_MISC_CAUSE, 0);
 
 		if (cause_misc & (MVNETA_CAUSE_PHY_STATUS_CHANGE |
-				  MVNETA_CAUSE_LINK_CHANGE |
-				  MVNETA_CAUSE_PSC_SYNC_CHANGE))
+				  MVNETA_CAUSE_LINK_CHANGE))
 			mvneta_link_change(pp);
 	}
 
@@ -3044,8 +3043,7 @@ static void mvneta_start_dev(struct mvneta_port *pp)
 
 	mvreg_write(pp, MVNETA_INTR_MISC_MASK,
 		    MVNETA_CAUSE_PHY_STATUS_CHANGE |
-		    MVNETA_CAUSE_LINK_CHANGE |
-		    MVNETA_CAUSE_PSC_SYNC_CHANGE);
+		    MVNETA_CAUSE_LINK_CHANGE);
 
 	phylink_start(pp->phylink);
 	netif_tx_start_all_queues(pp->dev);
@@ -3570,8 +3568,7 @@ static int mvneta_cpu_online(unsigned int cpu, struct hlist_node *node)
 	on_each_cpu(mvneta_percpu_unmask_interrupt, pp, true);
 	mvreg_write(pp, MVNETA_INTR_MISC_MASK,
 		    MVNETA_CAUSE_PHY_STATUS_CHANGE |
-		    MVNETA_CAUSE_LINK_CHANGE |
-		    MVNETA_CAUSE_PSC_SYNC_CHANGE);
+		    MVNETA_CAUSE_LINK_CHANGE);
 	netif_tx_start_all_queues(pp->dev);
 	spin_unlock(&pp->lock);
 	return 0;
@@ -3612,8 +3609,7 @@ static int mvneta_cpu_dead(unsigned int cpu, struct hlist_node *node)
 	on_each_cpu(mvneta_percpu_unmask_interrupt, pp, true);
 	mvreg_write(pp, MVNETA_INTR_MISC_MASK,
 		    MVNETA_CAUSE_PHY_STATUS_CHANGE |
-		    MVNETA_CAUSE_LINK_CHANGE |
-		    MVNETA_CAUSE_PSC_SYNC_CHANGE);
+		    MVNETA_CAUSE_LINK_CHANGE);
 	netif_tx_start_all_queues(pp->dev);
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related

* [PATCH 07/10] net: mvneta: add EEE support
From: Russell King @ 2018-01-02 17:25 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

Add support for EEE to mvneta.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 99 +++++++++++++++++++++++++++++++++--
 1 file changed, 94 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 0b8701d3ba38..44d665887b50 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -244,6 +244,12 @@
 #define MVNETA_TXQ_TOKEN_SIZE_REG(q)             (0x3e40 + ((q) << 2))
 #define      MVNETA_TXQ_TOKEN_SIZE_MAX           0x7fffffff
 
+#define MVNETA_LPI_CTRL_0                        0x2cc0
+#define MVNETA_LPI_CTRL_1                        0x2cc4
+#define      MVNETA_LPI_REQUEST_ENABLE           BIT(0)
+#define MVNETA_LPI_CTRL_2                        0x2cc8
+#define MVNETA_LPI_STATUS                        0x2ccc
+
 #define MVNETA_CAUSE_TXQ_SENT_DESC_ALL_MASK	 0xff
 
 /* Descriptor ring Macros */
@@ -320,6 +326,11 @@
 #define MVNETA_RX_GET_BM_POOL_ID(rxd) \
 	(((rxd)->status & MVNETA_RXD_BM_POOL_MASK) >> MVNETA_RXD_BM_POOL_SHIFT)
 
+enum {
+	ETHTOOL_STAT_EEE_WAKEUP,
+	ETHTOOL_MAX_STATS,
+};
+
 struct mvneta_statistic {
 	unsigned short offset;
 	unsigned short type;
@@ -328,6 +339,7 @@ struct mvneta_statistic {
 
 #define T_REG_32	32
 #define T_REG_64	64
+#define T_SW		1
 
 static const struct mvneta_statistic mvneta_statistics[] = {
 	{ 0x3000, T_REG_64, "good_octets_received", },
@@ -362,6 +374,7 @@ static const struct mvneta_statistic mvneta_statistics[] = {
 	{ 0x304c, T_REG_32, "broadcast_frames_sent", },
 	{ 0x3054, T_REG_32, "fc_sent", },
 	{ 0x300c, T_REG_32, "internal_mac_transmit_err", },
+	{ ETHTOOL_STAT_EEE_WAKEUP, T_SW, "eee_wakeup_errors", },
 };
 
 struct mvneta_pcpu_stats {
@@ -424,6 +437,10 @@ struct mvneta_port {
 	struct mvneta_bm_pool *pool_short;
 	int bm_win_id;
 
+	bool eee_enabled;
+	bool eee_active;
+	bool tx_lpi_enabled;
+
 	u64 ethtool_stats[ARRAY_SIZE(mvneta_statistics)];
 
 	u32 indir[MVNETA_RSS_LU_TABLE_SIZE];
@@ -3369,6 +3386,18 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 	}
 }
 
+static void mvneta_set_eee(struct mvneta_port *pp, bool enable)
+{
+	u32 lpi_ctl1;
+
+	lpi_ctl1 = mvreg_read(pp, MVNETA_LPI_CTRL_1);
+	if (enable)
+		lpi_ctl1 |= MVNETA_LPI_REQUEST_ENABLE;
+	else
+		lpi_ctl1 &= ~MVNETA_LPI_REQUEST_ENABLE;
+	mvreg_write(pp, MVNETA_LPI_CTRL_1, lpi_ctl1);
+}
+
 static void mvneta_mac_link_down(struct net_device *ndev, unsigned int mode)
 {
 	struct mvneta_port *pp = netdev_priv(ndev);
@@ -3382,6 +3411,9 @@ static void mvneta_mac_link_down(struct net_device *ndev, unsigned int mode)
 		val |= MVNETA_GMAC_FORCE_LINK_DOWN;
 		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
 	}
+
+	pp->eee_active = false;
+	mvneta_set_eee(pp, false);
 }
 
 static void mvneta_mac_link_up(struct net_device *ndev, unsigned int mode,
@@ -3398,6 +3430,11 @@ static void mvneta_mac_link_up(struct net_device *ndev, unsigned int mode,
 	}
 
 	mvneta_port_up(pp);
+
+	if (phy && pp->eee_enabled) {
+		pp->eee_active = phy_init_eee(phy, 0) >= 0;
+		mvneta_set_eee(pp, pp->eee_active && pp->tx_lpi_enabled);
+	}
 }
 
 static const struct phylink_mac_ops mvneta_phylink_ops = {
@@ -3859,26 +3896,35 @@ static void mvneta_ethtool_update_stats(struct mvneta_port *pp)
 {
 	const struct mvneta_statistic *s;
 	void __iomem *base = pp->base;
-	u32 high, low, val;
-	u64 val64;
+	u32 high, low;
+	u64 val;
 	int i;
 
 	for (i = 0, s = mvneta_statistics;
 	     s < mvneta_statistics + ARRAY_SIZE(mvneta_statistics);
 	     s++, i++) {
+		val = 0;
+
 		switch (s->type) {
 		case T_REG_32:
 			val = readl_relaxed(base + s->offset);
-			pp->ethtool_stats[i] += val;
 			break;
 		case T_REG_64:
 			/* Docs say to read low 32-bit then high */
 			low = readl_relaxed(base + s->offset);
 			high = readl_relaxed(base + s->offset + 4);
-			val64 = (u64)high << 32 | low;
-			pp->ethtool_stats[i] += val64;
+			val = (u64)high << 32 | low;
+			break;
+		case T_SW:
+			switch (s->offset) {
+			case ETHTOOL_STAT_EEE_WAKEUP:
+				val = phylink_get_eee_err(pp->phylink);
+				break;
+			}
 			break;
 		}
+
+		pp->ethtool_stats[i] += val;
 	}
 }
 
@@ -4031,6 +4077,47 @@ static int mvneta_ethtool_set_wol(struct net_device *dev,
 	return ret;
 }
 
+static int mvneta_ethtool_get_eee(struct net_device *dev,
+				  struct ethtool_eee *eee)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+	u32 lpi_ctl0;
+
+	lpi_ctl0 = mvreg_read(pp, MVNETA_LPI_CTRL_0);
+
+	eee->eee_enabled = pp->eee_enabled;
+	eee->eee_active = pp->eee_active;
+	eee->tx_lpi_enabled = pp->tx_lpi_enabled;
+	eee->tx_lpi_timer = (lpi_ctl0) >> 8; // * scale;
+
+	return phylink_ethtool_get_eee(pp->phylink, eee);
+}
+
+static int mvneta_ethtool_set_eee(struct net_device *dev,
+				  struct ethtool_eee *eee)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+	u32 lpi_ctl0;
+
+	/* The Armada 37x documents do not give limits for this other than
+	 * it being an 8-bit register. */
+	if (eee->tx_lpi_enabled &&
+	    (eee->tx_lpi_timer < 0 || eee->tx_lpi_timer > 255))
+		return -EINVAL;
+
+	lpi_ctl0 = mvreg_read(pp, MVNETA_LPI_CTRL_0);
+	lpi_ctl0 &= ~(0xff << 8);
+	lpi_ctl0 |= eee->tx_lpi_timer << 8;
+	mvreg_write(pp, MVNETA_LPI_CTRL_0, lpi_ctl0);
+
+	pp->eee_enabled = eee->eee_enabled;
+	pp->tx_lpi_enabled = eee->tx_lpi_enabled;
+
+	mvneta_set_eee(pp, eee->tx_lpi_enabled && eee->eee_enabled);
+
+	return phylink_ethtool_set_eee(pp->phylink, eee);
+}
+
 static const struct net_device_ops mvneta_netdev_ops = {
 	.ndo_open            = mvneta_open,
 	.ndo_stop            = mvneta_stop,
@@ -4064,6 +4151,8 @@ static const struct ethtool_ops mvneta_eth_tool_ops = {
 	.set_link_ksettings = mvneta_ethtool_set_link_ksettings,
 	.get_wol        = mvneta_ethtool_get_wol,
 	.set_wol        = mvneta_ethtool_set_wol,
+	.get_eee	= mvneta_ethtool_get_eee,
+	.set_eee	= mvneta_ethtool_set_eee,
 };
 
 /* Initialize hw */
-- 
2.7.4

^ permalink raw reply related

* [PATCH 06/10] net: mvneta: add flow control support
From: Russell King @ 2018-01-02 17:24 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

Add support for flow control to mvneta.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 73643dbd5cb0..0b8701d3ba38 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3211,6 +3211,8 @@ static void mvneta_validate(struct net_device *ndev, unsigned long *supported,
 	phylink_set(mask, Autoneg);
 	phylink_set_port_modes(mask);
 
+	/* Asymmetric pause is unsupported */
+	phylink_set(mask, Pause);
 	/* Half-duplex at speeds higher than 100Mbit is unsupported */
 	phylink_set(mask, 1000baseT_Full);
 	phylink_set(mask, 1000baseX_Full);
@@ -3249,6 +3251,10 @@ static int mvneta_mac_link_state(struct net_device *ndev,
 	state->duplex = !!(gmac_stat & MVNETA_GMAC_FULL_DUPLEX);
 
 	state->pause = 0;
+	if (gmac_stat & MVNETA_GMAC_RX_FLOW_CTRL_ENABLE)
+		state->pause |= MLO_PAUSE_RX;
+	if (gmac_stat & MVNETA_GMAC_TX_FLOW_CTRL_ENABLE)
+		state->pause |= MLO_PAUSE_TX;
 
 	return 1;
 }
@@ -3298,6 +3304,11 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 	    phy_interface_mode_is_8023z(state->interface))
 		new_ctrl2 |= MVNETA_GMAC2_PCS_ENABLE;
 
+	if (phylink_test(state->advertising, Pause))
+		new_an |= MVNETA_GMAC_ADVERT_SYM_FLOW_CTRL;
+	if (state->pause & MLO_PAUSE_TXRX_MASK)
+		new_an |= MVNETA_GMAC_CONFIG_FLOW_CTRL;
+
 	if (!phylink_autoneg_inband(mode)) {
 		/* Phy or fixed speed */
 		if (state->duplex)
@@ -3326,6 +3337,9 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 			 MVNETA_GMAC_CONFIG_GMII_SPEED |
 			 /* The MAC only supports FD mode */
 			 MVNETA_GMAC_CONFIG_FULL_DUPLEX;
+
+		if (state->pause & MLO_PAUSE_AN && state->an_enabled)
+			new_an |= MVNETA_GMAC_AN_FLOW_CTRL_EN;
 	}
 
 	/* Armada 370 documentation says we can only change the port mode
@@ -3813,6 +3827,22 @@ static int mvneta_ethtool_set_ringparam(struct net_device *dev,
 	return 0;
 }
 
+static void mvneta_ethtool_get_pauseparam(struct net_device *dev,
+					  struct ethtool_pauseparam *pause)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+
+	phylink_ethtool_get_pauseparam(pp->phylink, pause);
+}
+
+static int mvneta_ethtool_set_pauseparam(struct net_device *dev,
+					 struct ethtool_pauseparam *pause)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
+
+	return phylink_ethtool_set_pauseparam(pp->phylink, pause);
+}
+
 static void mvneta_ethtool_get_strings(struct net_device *netdev, u32 sset,
 				       u8 *data)
 {
@@ -4021,6 +4051,8 @@ static const struct ethtool_ops mvneta_eth_tool_ops = {
 	.get_drvinfo    = mvneta_ethtool_get_drvinfo,
 	.get_ringparam  = mvneta_ethtool_get_ringparam,
 	.set_ringparam	= mvneta_ethtool_set_ringparam,
+	.get_pauseparam	= mvneta_ethtool_get_pauseparam,
+	.set_pauseparam	= mvneta_ethtool_set_pauseparam,
 	.get_strings	= mvneta_ethtool_get_strings,
 	.get_ethtool_stats = mvneta_ethtool_get_stats,
 	.get_sset_count	= mvneta_ethtool_get_sset_count,
-- 
2.7.4

^ permalink raw reply related

* [PATCH 05/10] net: mvneta: add 1000BaseX support
From: Russell King @ 2018-01-02 17:24 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

Add support for 1000BaseX link modes.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 59 +++++++++++++++++++++++++++++------
 1 file changed, 50 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index ef54a8fc9515..73643dbd5cb0 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -189,6 +189,7 @@
 #define MVNETA_GMAC_CTRL_0                       0x2c00
 #define      MVNETA_GMAC_MAX_RX_SIZE_SHIFT       2
 #define      MVNETA_GMAC_MAX_RX_SIZE_MASK        0x7ffc
+#define      MVNETA_GMAC0_PORT_1000BASE_X        BIT(1)
 #define      MVNETA_GMAC0_PORT_ENABLE            BIT(0)
 #define MVNETA_GMAC_CTRL_2                       0x2c08
 #define      MVNETA_GMAC2_INBAND_AN_ENABLE       BIT(0)
@@ -210,9 +211,13 @@
 #define      MVNETA_GMAC_FORCE_LINK_DOWN         BIT(0)
 #define      MVNETA_GMAC_FORCE_LINK_PASS         BIT(1)
 #define      MVNETA_GMAC_INBAND_AN_ENABLE        BIT(2)
+#define      MVNETA_GMAC_AN_BYPASS_ENABLE        BIT(3)
+#define      MVNETA_GMAC_INBAND_RESTART_AN       BIT(4)
 #define      MVNETA_GMAC_CONFIG_MII_SPEED        BIT(5)
 #define      MVNETA_GMAC_CONFIG_GMII_SPEED       BIT(6)
 #define      MVNETA_GMAC_AN_SPEED_EN             BIT(7)
+#define      MVNETA_GMAC_CONFIG_FLOW_CTRL        BIT(8)
+#define      MVNETA_GMAC_ADVERT_SYM_FLOW_CTRL    BIT(9)
 #define      MVNETA_GMAC_AN_FLOW_CTRL_EN         BIT(11)
 #define      MVNETA_GMAC_CONFIG_FULL_DUPLEX      BIT(12)
 #define      MVNETA_GMAC_AN_DUPLEX_EN            BIT(13)
@@ -3192,10 +3197,11 @@ static void mvneta_validate(struct net_device *ndev, unsigned long *supported,
 {
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
 
-	/* We only support QSGMII, SGMII and RGMII modes */
+	/* We only support QSGMII, SGMII, 802.3z and RGMII modes */
 	if (state->interface != PHY_INTERFACE_MODE_NA &&
 	    state->interface != PHY_INTERFACE_MODE_QSGMII &&
 	    state->interface != PHY_INTERFACE_MODE_SGMII &&
+	    !phy_interface_mode_is_8023z(state->interface) &&
 	    !phy_interface_mode_is_rgmii(state->interface)) {
 		bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
 		return;
@@ -3208,10 +3214,14 @@ static void mvneta_validate(struct net_device *ndev, unsigned long *supported,
 	/* Half-duplex at speeds higher than 100Mbit is unsupported */
 	phylink_set(mask, 1000baseT_Full);
 	phylink_set(mask, 1000baseX_Full);
-	phylink_set(mask, 10baseT_Half);
-	phylink_set(mask, 10baseT_Full);
-	phylink_set(mask, 100baseT_Half);
-	phylink_set(mask, 100baseT_Full);
+
+	if (!phy_interface_mode_is_8023z(state->interface)) {
+		/* 10M and 100M are only supported in non-802.3z mode */
+		phylink_set(mask, 10baseT_Half);
+		phylink_set(mask, 10baseT_Full);
+		phylink_set(mask, 100baseT_Half);
+		phylink_set(mask, 100baseT_Full);
+	}
 
 	bitmap_and(supported, supported, mask,
 		   __ETHTOOL_LINK_MODE_MASK_NBITS);
@@ -3243,14 +3253,27 @@ static int mvneta_mac_link_state(struct net_device *ndev,
 	return 1;
 }
 
+static void mvneta_mac_an_restart(struct net_device *ndev)
+{
+	struct mvneta_port *pp = netdev_priv(ndev);
+	u32 gmac_an = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
+
+	mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG,
+		    gmac_an | MVNETA_GMAC_INBAND_RESTART_AN);
+	mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG,
+		    gmac_an & ~MVNETA_GMAC_INBAND_RESTART_AN);
+}
+
 static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 	const struct phylink_link_state *state)
 {
 	struct mvneta_port *pp = netdev_priv(ndev);
+	u32 new_ctrl0, gmac_ctrl0 = mvreg_read(pp, MVNETA_GMAC_CTRL_0);
 	u32 new_ctrl2, gmac_ctrl2 = mvreg_read(pp, MVNETA_GMAC_CTRL_2);
 	u32 new_clk, gmac_clk = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
 	u32 new_an, gmac_an = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
 
+	new_ctrl0 = gmac_ctrl0 & ~MVNETA_GMAC0_PORT_1000BASE_X;
 	new_ctrl2 = gmac_ctrl2 & ~(MVNETA_GMAC2_INBAND_AN_ENABLE |
 				   MVNETA_GMAC2_PORT_RESET);
 	new_clk = gmac_clk & ~MVNETA_GMAC_1MS_CLOCK_ENABLE;
@@ -3259,6 +3282,8 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 			     MVNETA_GMAC_CONFIG_MII_SPEED |
 			     MVNETA_GMAC_CONFIG_GMII_SPEED |
 			     MVNETA_GMAC_AN_SPEED_EN |
+			     MVNETA_GMAC_ADVERT_SYM_FLOW_CTRL |
+			     MVNETA_GMAC_CONFIG_FLOW_CTRL |
 			     MVNETA_GMAC_AN_FLOW_CTRL_EN |
 			     MVNETA_GMAC_CONFIG_FULL_DUPLEX |
 			     MVNETA_GMAC_AN_DUPLEX_EN);
@@ -3269,7 +3294,8 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 	new_ctrl2 |= MVNETA_GMAC2_PORT_RGMII;
 
 	if (state->interface == PHY_INTERFACE_MODE_QSGMII ||
-	    state->interface == PHY_INTERFACE_MODE_SGMII)
+	    state->interface == PHY_INTERFACE_MODE_SGMII ||
+	    phy_interface_mode_is_8023z(state->interface))
 		new_ctrl2 |= MVNETA_GMAC2_PCS_ENABLE;
 
 	if (!phylink_autoneg_inband(mode)) {
@@ -3281,7 +3307,7 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 			new_an |= MVNETA_GMAC_CONFIG_GMII_SPEED;
 		else if (state->speed == SPEED_100)
 			new_an |= MVNETA_GMAC_CONFIG_MII_SPEED;
-	} else {
+	} else if (state->interface == PHY_INTERFACE_MODE_SGMII) {
 		/* SGMII mode receives the state from the PHY */
 		new_ctrl2 |= MVNETA_GMAC2_INBAND_AN_ENABLE;
 		new_clk |= MVNETA_GMAC_1MS_CLOCK_ENABLE;
@@ -3290,18 +3316,31 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 			 MVNETA_GMAC_INBAND_AN_ENABLE |
 			 MVNETA_GMAC_AN_SPEED_EN |
 			 MVNETA_GMAC_AN_DUPLEX_EN;
+	} else {
+		/* 802.3z negotiation - only 1000base-X */
+		new_ctrl0 |= MVNETA_GMAC0_PORT_1000BASE_X;
+		new_clk |= MVNETA_GMAC_1MS_CLOCK_ENABLE;
+		new_an = (new_an & ~(MVNETA_GMAC_FORCE_LINK_DOWN |
+				     MVNETA_GMAC_FORCE_LINK_PASS)) |
+			 MVNETA_GMAC_INBAND_AN_ENABLE |
+			 MVNETA_GMAC_CONFIG_GMII_SPEED |
+			 /* The MAC only supports FD mode */
+			 MVNETA_GMAC_CONFIG_FULL_DUPLEX;
 	}
 
 	/* Armada 370 documentation says we can only change the port mode
 	 * and in-band enable when the link is down, so force it down
 	 * while making these changes. We also do this for GMAC_CTRL2 */
-	if ((new_ctrl2 ^ gmac_ctrl2) & MVNETA_GMAC2_INBAND_AN_ENABLE ||
+	if ((new_ctrl0 ^ gmac_ctrl0) & MVNETA_GMAC0_PORT_1000BASE_X ||
+	    (new_ctrl2 ^ gmac_ctrl2) & MVNETA_GMAC2_INBAND_AN_ENABLE ||
 	    (new_an  ^ gmac_an) & MVNETA_GMAC_INBAND_AN_ENABLE) {
 		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG,
 			    (gmac_an & ~MVNETA_GMAC_FORCE_LINK_PASS) |
 			    MVNETA_GMAC_FORCE_LINK_DOWN);
 	}
 
+	if (new_ctrl0 != gmac_ctrl0)
+		mvreg_write(pp, MVNETA_GMAC_CTRL_0, new_ctrl0);
 	if (new_ctrl2 != gmac_ctrl2)
 		mvreg_write(pp, MVNETA_GMAC_CTRL_2, new_ctrl2);
 	if (new_clk != gmac_clk)
@@ -3350,6 +3389,7 @@ static void mvneta_mac_link_up(struct net_device *ndev, unsigned int mode,
 static const struct phylink_mac_ops mvneta_phylink_ops = {
 	.validate = mvneta_validate,
 	.mac_link_state = mvneta_mac_link_state,
+	.mac_an_restart = mvneta_mac_an_restart,
 	.mac_config = mvneta_mac_config,
 	.mac_link_down = mvneta_mac_link_down,
 	.mac_link_up = mvneta_mac_link_up,
@@ -4096,7 +4136,8 @@ static int mvneta_port_power_up(struct mvneta_port *pp, int phy_mode)
 
 	if (phy_mode == PHY_INTERFACE_MODE_QSGMII)
 		mvreg_write(pp, MVNETA_SERDES_CFG, MVNETA_QSGMII_SERDES_PROTO);
-	else if (phy_mode == PHY_INTERFACE_MODE_SGMII)
+	else if (phy_mode == PHY_INTERFACE_MODE_SGMII ||
+		 phy_mode == PHY_INTERFACE_MODE_1000BASEX)
 		mvreg_write(pp, MVNETA_SERDES_CFG, MVNETA_SGMII_SERDES_PROTO);
 	else if (!phy_interface_mode_is_rgmii(phy_mode))
 		return -EINVAL;
-- 
2.7.4

^ permalink raw reply related

* [PATCH 04/10] net: mvneta: move port configuration
From: Russell King @ 2018-01-02 17:24 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

Move the port configuration and release of reset to mvneta_mac_config()
along side the rest of the port mode configuration.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 51 ++++++++++++++---------------------
 1 file changed, 20 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index fabe17bd39f9..ef54a8fc9515 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3251,7 +3251,8 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 	u32 new_clk, gmac_clk = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
 	u32 new_an, gmac_an = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
 
-	new_ctrl2 = gmac_ctrl2 & ~MVNETA_GMAC2_INBAND_AN_ENABLE;
+	new_ctrl2 = gmac_ctrl2 & ~(MVNETA_GMAC2_INBAND_AN_ENABLE |
+				   MVNETA_GMAC2_PORT_RESET);
 	new_clk = gmac_clk & ~MVNETA_GMAC_1MS_CLOCK_ENABLE;
 	new_an = gmac_an & ~(MVNETA_GMAC_INBAND_AN_ENABLE |
 			     MVNETA_GMAC_INBAND_RESTART_AN |
@@ -3262,6 +3263,15 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 			     MVNETA_GMAC_CONFIG_FULL_DUPLEX |
 			     MVNETA_GMAC_AN_DUPLEX_EN);
 
+	/* Even though it might look weird, when we're configured in
+	 * SGMII or QSGMII mode, the RGMII bit needs to be set.
+	 */
+	new_ctrl2 |= MVNETA_GMAC2_PORT_RGMII;
+
+	if (state->interface == PHY_INTERFACE_MODE_QSGMII ||
+	    state->interface == PHY_INTERFACE_MODE_SGMII)
+		new_ctrl2 |= MVNETA_GMAC2_PCS_ENABLE;
+
 	if (!phylink_autoneg_inband(mode)) {
 		/* Phy or fixed speed */
 		if (state->duplex)
@@ -3298,6 +3308,12 @@ static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
 		mvreg_write(pp, MVNETA_GMAC_CLOCK_DIVIDER, new_clk);
 	if (new_an != gmac_an)
 		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, new_an);
+
+	if (gmac_ctrl2 & MVNETA_GMAC2_PORT_RESET) {
+		while ((mvreg_read(pp, MVNETA_GMAC_CTRL_2) &
+			MVNETA_GMAC2_PORT_RESET) != 0)
+			continue;
+	}
 }
 
 static void mvneta_mac_link_down(struct net_device *ndev, unsigned int mode)
@@ -4075,42 +4091,15 @@ static void mvneta_conf_mbus_windows(struct mvneta_port *pp,
 /* Power up the port */
 static int mvneta_port_power_up(struct mvneta_port *pp, int phy_mode)
 {
-	u32 ctrl;
-
 	/* MAC Cause register should be cleared */
 	mvreg_write(pp, MVNETA_UNIT_INTR_CAUSE, 0);
 
-	ctrl = mvreg_read(pp, MVNETA_GMAC_CTRL_2);
-
-	/* Even though it might look weird, when we're configured in
-	 * SGMII or QSGMII mode, the RGMII bit needs to be set.
-	 */
-	switch(phy_mode) {
-	case PHY_INTERFACE_MODE_QSGMII:
+	if (phy_mode == PHY_INTERFACE_MODE_QSGMII)
 		mvreg_write(pp, MVNETA_SERDES_CFG, MVNETA_QSGMII_SERDES_PROTO);
-		ctrl |= MVNETA_GMAC2_PCS_ENABLE | MVNETA_GMAC2_PORT_RGMII;
-		break;
-	case PHY_INTERFACE_MODE_SGMII:
+	else if (phy_mode == PHY_INTERFACE_MODE_SGMII)
 		mvreg_write(pp, MVNETA_SERDES_CFG, MVNETA_SGMII_SERDES_PROTO);
-		ctrl |= MVNETA_GMAC2_PCS_ENABLE | MVNETA_GMAC2_PORT_RGMII;
-		break;
-	case PHY_INTERFACE_MODE_RGMII:
-	case PHY_INTERFACE_MODE_RGMII_ID:
-	case PHY_INTERFACE_MODE_RGMII_RXID:
-	case PHY_INTERFACE_MODE_RGMII_TXID:
-		ctrl |= MVNETA_GMAC2_PORT_RGMII;
-		break;
-	default:
+	else if (!phy_interface_mode_is_rgmii(phy_mode))
 		return -EINVAL;
-	}
-
-	/* Cancel Port Reset */
-	ctrl &= ~MVNETA_GMAC2_PORT_RESET;
-	mvreg_write(pp, MVNETA_GMAC_CTRL_2, ctrl);
-
-	while ((mvreg_read(pp, MVNETA_GMAC_CTRL_2) &
-		MVNETA_GMAC2_PORT_RESET) != 0)
-		continue;
 
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related

* [PATCH 03/10] net: mvneta: convert to phylink
From: Russell King @ 2018-01-02 17:24 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

Convert mvneta to use phylink, which models the MAC to PHY link in
a generic, reusable form.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

- remove unused sync status
---
 drivers/net/ethernet/marvell/Kconfig  |   2 +-
 drivers/net/ethernet/marvell/mvneta.c | 417 ++++++++++++++--------------------
 2 files changed, 176 insertions(+), 243 deletions(-)

diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig
index da6fb825afea..ebe5c9148935 100644
--- a/drivers/net/ethernet/marvell/Kconfig
+++ b/drivers/net/ethernet/marvell/Kconfig
@@ -60,7 +60,7 @@ config MVNETA
 	depends on ARCH_MVEBU || COMPILE_TEST
 	depends on HAS_DMA
 	select MVMDIO
-	select FIXED_PHY
+	select PHYLINK
 	---help---
 	  This driver supports the network interface units in the
 	  Marvell ARMADA XP, ARMADA 370, ARMADA 38x and
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 3dfe4b3edef2..fabe17bd39f9 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -28,7 +28,7 @@
 #include <linux/of_mdio.h>
 #include <linux/of_net.h>
 #include <linux/phy.h>
-#include <linux/phy_fixed.h>
+#include <linux/phylink.h>
 #include <linux/platform_device.h>
 #include <linux/skbuff.h>
 #include <net/hwbm.h>
@@ -204,6 +204,8 @@
 #define      MVNETA_GMAC_TX_FLOW_CTRL_ENABLE     BIT(5)
 #define      MVNETA_GMAC_RX_FLOW_CTRL_ACTIVE     BIT(6)
 #define      MVNETA_GMAC_TX_FLOW_CTRL_ACTIVE     BIT(7)
+#define      MVNETA_GMAC_AN_COMPLETE             BIT(11)
+#define      MVNETA_GMAC_SYNC_OK                 BIT(14)
 #define MVNETA_GMAC_AUTONEG_CONFIG               0x2c0c
 #define      MVNETA_GMAC_FORCE_LINK_DOWN         BIT(0)
 #define      MVNETA_GMAC_FORCE_LINK_PASS         BIT(1)
@@ -407,14 +409,10 @@ struct mvneta_port {
 	u16 tx_ring_size;
 	u16 rx_ring_size;
 
-	struct mii_bus *mii_bus;
 	phy_interface_t phy_interface;
-	struct device_node *phy_node;
-	unsigned int link;
-	unsigned int duplex;
-	unsigned int speed;
+	struct device_node *dn;
 	unsigned int tx_csum_limit;
-	unsigned int use_inband_status:1;
+	struct phylink *phylink;
 
 	struct mvneta_bm *bm_priv;
 	struct mvneta_bm_pool *pool_long;
@@ -1214,10 +1212,6 @@ static void mvneta_port_disable(struct mvneta_port *pp)
 	val &= ~MVNETA_GMAC0_PORT_ENABLE;
 	mvreg_write(pp, MVNETA_GMAC_CTRL_0, val);
 
-	pp->link = 0;
-	pp->duplex = -1;
-	pp->speed = 0;
-
 	udelay(200);
 }
 
@@ -1277,44 +1271,6 @@ static void mvneta_set_other_mcast_table(struct mvneta_port *pp, int queue)
 		mvreg_write(pp, MVNETA_DA_FILT_OTH_MCAST + offset, val);
 }
 
-static void mvneta_set_autoneg(struct mvneta_port *pp, int enable)
-{
-	u32 val;
-
-	if (enable) {
-		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
-		val &= ~(MVNETA_GMAC_FORCE_LINK_PASS |
-			 MVNETA_GMAC_FORCE_LINK_DOWN |
-			 MVNETA_GMAC_AN_FLOW_CTRL_EN);
-		val |= MVNETA_GMAC_INBAND_AN_ENABLE |
-		       MVNETA_GMAC_AN_SPEED_EN |
-		       MVNETA_GMAC_AN_DUPLEX_EN;
-		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
-
-		val = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
-		val |= MVNETA_GMAC_1MS_CLOCK_ENABLE;
-		mvreg_write(pp, MVNETA_GMAC_CLOCK_DIVIDER, val);
-
-		val = mvreg_read(pp, MVNETA_GMAC_CTRL_2);
-		val |= MVNETA_GMAC2_INBAND_AN_ENABLE;
-		mvreg_write(pp, MVNETA_GMAC_CTRL_2, val);
-	} else {
-		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
-		val &= ~(MVNETA_GMAC_INBAND_AN_ENABLE |
-		       MVNETA_GMAC_AN_SPEED_EN |
-		       MVNETA_GMAC_AN_DUPLEX_EN);
-		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
-
-		val = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
-		val &= ~MVNETA_GMAC_1MS_CLOCK_ENABLE;
-		mvreg_write(pp, MVNETA_GMAC_CLOCK_DIVIDER, val);
-
-		val = mvreg_read(pp, MVNETA_GMAC_CTRL_2);
-		val &= ~MVNETA_GMAC2_INBAND_AN_ENABLE;
-		mvreg_write(pp, MVNETA_GMAC_CTRL_2, val);
-	}
-}
-
 static void mvneta_percpu_unmask_interrupt(void *arg)
 {
 	struct mvneta_port *pp = arg;
@@ -1467,7 +1423,6 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
 	val &= ~MVNETA_PHY_POLLING_ENABLE;
 	mvreg_write(pp, MVNETA_UNIT_CONTROL, val);
 
-	mvneta_set_autoneg(pp, pp->use_inband_status);
 	mvneta_set_ucast_table(pp, -1);
 	mvneta_set_special_mcast_table(pp, -1);
 	mvneta_set_other_mcast_table(pp, -1);
@@ -2692,26 +2647,11 @@ static irqreturn_t mvneta_percpu_isr(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-static int mvneta_fixed_link_update(struct mvneta_port *pp,
-				    struct phy_device *phy)
+static void mvneta_link_change(struct mvneta_port *pp)
 {
-	struct fixed_phy_status status;
-	struct fixed_phy_status changed = {};
 	u32 gmac_stat = mvreg_read(pp, MVNETA_GMAC_STATUS);
 
-	status.link = !!(gmac_stat & MVNETA_GMAC_LINK_UP);
-	if (gmac_stat & MVNETA_GMAC_SPEED_1000)
-		status.speed = SPEED_1000;
-	else if (gmac_stat & MVNETA_GMAC_SPEED_100)
-		status.speed = SPEED_100;
-	else
-		status.speed = SPEED_10;
-	status.duplex = !!(gmac_stat & MVNETA_GMAC_FULL_DUPLEX);
-	changed.link = 1;
-	changed.speed = 1;
-	changed.duplex = 1;
-	fixed_phy_update_state(phy, &status, &changed);
-	return 0;
+	phylink_mac_change(pp->phylink, !!(gmac_stat & MVNETA_GMAC_LINK_UP));
 }
 
 /* NAPI handler
@@ -2727,7 +2667,6 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
 	u32 cause_rx_tx;
 	int rx_queue;
 	struct mvneta_port *pp = netdev_priv(napi->dev);
-	struct net_device *ndev = pp->dev;
 	struct mvneta_pcpu_port *port = this_cpu_ptr(pp->ports);
 
 	if (!netif_running(pp->dev)) {
@@ -2741,12 +2680,11 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
 		u32 cause_misc = mvreg_read(pp, MVNETA_INTR_MISC_CAUSE);
 
 		mvreg_write(pp, MVNETA_INTR_MISC_CAUSE, 0);
-		if (pp->use_inband_status && (cause_misc &
-				(MVNETA_CAUSE_PHY_STATUS_CHANGE |
-				 MVNETA_CAUSE_LINK_CHANGE |
-				 MVNETA_CAUSE_PSC_SYNC_CHANGE))) {
-			mvneta_fixed_link_update(pp, ndev->phydev);
-		}
+
+		if (cause_misc & (MVNETA_CAUSE_PHY_STATUS_CHANGE |
+				  MVNETA_CAUSE_LINK_CHANGE |
+				  MVNETA_CAUSE_PSC_SYNC_CHANGE))
+			mvneta_link_change(pp);
 	}
 
 	/* Release Tx descriptors */
@@ -3060,7 +2998,6 @@ static int mvneta_setup_txqs(struct mvneta_port *pp)
 static void mvneta_start_dev(struct mvneta_port *pp)
 {
 	int cpu;
-	struct net_device *ndev = pp->dev;
 
 	mvneta_max_rx_size_set(pp, pp->pkt_size);
 	mvneta_txq_max_tx_size_set(pp, pp->pkt_size);
@@ -3088,16 +3025,15 @@ static void mvneta_start_dev(struct mvneta_port *pp)
 		    MVNETA_CAUSE_LINK_CHANGE |
 		    MVNETA_CAUSE_PSC_SYNC_CHANGE);
 
-	phy_start(ndev->phydev);
+	phylink_start(pp->phylink);
 	netif_tx_start_all_queues(pp->dev);
 }
 
 static void mvneta_stop_dev(struct mvneta_port *pp)
 {
 	unsigned int cpu;
-	struct net_device *ndev = pp->dev;
 
-	phy_stop(ndev->phydev);
+	phylink_stop(pp->phylink);
 
 	if (!pp->neta_armada3700) {
 		for_each_online_cpu(cpu) {
@@ -3251,55 +3187,141 @@ static int mvneta_set_mac_addr(struct net_device *dev, void *addr)
 	return 0;
 }
 
-static void mvneta_mac_config(struct net_device *ndev)
+static void mvneta_validate(struct net_device *ndev, unsigned long *supported,
+			    struct phylink_link_state *state)
+{
+	__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
+
+	/* We only support QSGMII, SGMII and RGMII modes */
+	if (state->interface != PHY_INTERFACE_MODE_NA &&
+	    state->interface != PHY_INTERFACE_MODE_QSGMII &&
+	    state->interface != PHY_INTERFACE_MODE_SGMII &&
+	    !phy_interface_mode_is_rgmii(state->interface)) {
+		bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
+		return;
+	}
+
+	/* Allow all the expected bits */
+	phylink_set(mask, Autoneg);
+	phylink_set_port_modes(mask);
+
+	/* Half-duplex at speeds higher than 100Mbit is unsupported */
+	phylink_set(mask, 1000baseT_Full);
+	phylink_set(mask, 1000baseX_Full);
+	phylink_set(mask, 10baseT_Half);
+	phylink_set(mask, 10baseT_Full);
+	phylink_set(mask, 100baseT_Half);
+	phylink_set(mask, 100baseT_Full);
+
+	bitmap_and(supported, supported, mask,
+		   __ETHTOOL_LINK_MODE_MASK_NBITS);
+	bitmap_and(state->advertising, state->advertising, mask,
+		   __ETHTOOL_LINK_MODE_MASK_NBITS);
+}
+
+static int mvneta_mac_link_state(struct net_device *ndev,
+				 struct phylink_link_state *state)
 {
 	struct mvneta_port *pp = netdev_priv(ndev);
-	struct phy_device *phydev = ndev->phydev;
-	u32 val;
+	u32 gmac_stat;
 
-	if ((pp->speed != phydev->speed) ||
-	    (pp->duplex != phydev->duplex)) {
-		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
-		val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
-			 MVNETA_GMAC_CONFIG_GMII_SPEED |
-			 MVNETA_GMAC_CONFIG_FULL_DUPLEX);
+	gmac_stat = mvreg_read(pp, MVNETA_GMAC_STATUS);
 
-		if (phydev->duplex)
-			val |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
+	if (gmac_stat & MVNETA_GMAC_SPEED_1000)
+		state->speed = SPEED_1000;
+	else if (gmac_stat & MVNETA_GMAC_SPEED_100)
+		state->speed = SPEED_100;
+	else
+		state->speed = SPEED_10;
 
-		if (phydev->speed == SPEED_1000)
-			val |= MVNETA_GMAC_CONFIG_GMII_SPEED;
-		else if (phydev->speed == SPEED_100)
-			val |= MVNETA_GMAC_CONFIG_MII_SPEED;
+	state->an_complete = !!(gmac_stat & MVNETA_GMAC_AN_COMPLETE);
+	state->link = !!(gmac_stat & MVNETA_GMAC_LINK_UP);
+	state->duplex = !!(gmac_stat & MVNETA_GMAC_FULL_DUPLEX);
 
-		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
+	state->pause = 0;
+
+	return 1;
+}
+
+static void mvneta_mac_config(struct net_device *ndev, unsigned int mode,
+	const struct phylink_link_state *state)
+{
+	struct mvneta_port *pp = netdev_priv(ndev);
+	u32 new_ctrl2, gmac_ctrl2 = mvreg_read(pp, MVNETA_GMAC_CTRL_2);
+	u32 new_clk, gmac_clk = mvreg_read(pp, MVNETA_GMAC_CLOCK_DIVIDER);
+	u32 new_an, gmac_an = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
+
+	new_ctrl2 = gmac_ctrl2 & ~MVNETA_GMAC2_INBAND_AN_ENABLE;
+	new_clk = gmac_clk & ~MVNETA_GMAC_1MS_CLOCK_ENABLE;
+	new_an = gmac_an & ~(MVNETA_GMAC_INBAND_AN_ENABLE |
+			     MVNETA_GMAC_INBAND_RESTART_AN |
+			     MVNETA_GMAC_CONFIG_MII_SPEED |
+			     MVNETA_GMAC_CONFIG_GMII_SPEED |
+			     MVNETA_GMAC_AN_SPEED_EN |
+			     MVNETA_GMAC_AN_FLOW_CTRL_EN |
+			     MVNETA_GMAC_CONFIG_FULL_DUPLEX |
+			     MVNETA_GMAC_AN_DUPLEX_EN);
+
+	if (!phylink_autoneg_inband(mode)) {
+		/* Phy or fixed speed */
+		if (state->duplex)
+			new_an |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
+
+		if (state->speed == SPEED_1000)
+			new_an |= MVNETA_GMAC_CONFIG_GMII_SPEED;
+		else if (state->speed == SPEED_100)
+			new_an |= MVNETA_GMAC_CONFIG_MII_SPEED;
+	} else {
+		/* SGMII mode receives the state from the PHY */
+		new_ctrl2 |= MVNETA_GMAC2_INBAND_AN_ENABLE;
+		new_clk |= MVNETA_GMAC_1MS_CLOCK_ENABLE;
+		new_an = (new_an & ~(MVNETA_GMAC_FORCE_LINK_DOWN |
+				     MVNETA_GMAC_FORCE_LINK_PASS)) |
+			 MVNETA_GMAC_INBAND_AN_ENABLE |
+			 MVNETA_GMAC_AN_SPEED_EN |
+			 MVNETA_GMAC_AN_DUPLEX_EN;
+	}
 
-		pp->duplex = phydev->duplex;
-		pp->speed  = phydev->speed;
+	/* Armada 370 documentation says we can only change the port mode
+	 * and in-band enable when the link is down, so force it down
+	 * while making these changes. We also do this for GMAC_CTRL2 */
+	if ((new_ctrl2 ^ gmac_ctrl2) & MVNETA_GMAC2_INBAND_AN_ENABLE ||
+	    (new_an  ^ gmac_an) & MVNETA_GMAC_INBAND_AN_ENABLE) {
+		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG,
+			    (gmac_an & ~MVNETA_GMAC_FORCE_LINK_PASS) |
+			    MVNETA_GMAC_FORCE_LINK_DOWN);
 	}
+
+	if (new_ctrl2 != gmac_ctrl2)
+		mvreg_write(pp, MVNETA_GMAC_CTRL_2, new_ctrl2);
+	if (new_clk != gmac_clk)
+		mvreg_write(pp, MVNETA_GMAC_CLOCK_DIVIDER, new_clk);
+	if (new_an != gmac_an)
+		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, new_an);
 }
 
-static void mvneta_mac_link_down(struct net_device *ndev, bool autoneg)
+static void mvneta_mac_link_down(struct net_device *ndev, unsigned int mode)
 {
 	struct mvneta_port *pp = netdev_priv(ndev);
 	u32 val;
 
-	if (!autoneg) {
+	mvneta_port_down(pp);
+
+	if (!phylink_autoneg_inband(mode)) {
 		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
 		val &= ~MVNETA_GMAC_FORCE_LINK_PASS;
 		val |= MVNETA_GMAC_FORCE_LINK_DOWN;
 		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
 	}
-
-	mvneta_port_down(pp);
 }
 
-static void mvneta_mac_link_up(struct net_device *ndev, bool autoneg)
+static void mvneta_mac_link_up(struct net_device *ndev, unsigned int mode,
+			       struct phy_device *phy)
 {
 	struct mvneta_port *pp = netdev_priv(ndev);
 	u32 val;
 
-	if (!autoneg) {
+	if (!phylink_autoneg_inband(mode)) {
 		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
 		val &= ~MVNETA_GMAC_FORCE_LINK_DOWN;
 		val |= MVNETA_GMAC_FORCE_LINK_PASS;
@@ -3309,64 +3331,31 @@ static void mvneta_mac_link_up(struct net_device *ndev, bool autoneg)
 	mvneta_port_up(pp);
 }
 
-static void mvneta_adjust_link(struct net_device *ndev)
-{
-	struct mvneta_port *pp = netdev_priv(ndev);
-	struct phy_device *phydev = ndev->phydev;
-	int status_change = 0;
-
-	if (phydev->link)
-		mvneta_mac_config(ndev);
-
-	if (phydev->link != pp->link) {
-		if (!phydev->link) {
-			pp->duplex = -1;
-			pp->speed = 0;
-		}
-
-		pp->link = phydev->link;
-		status_change = 1;
-	}
-
-	if (status_change) {
-		if (phydev->link)
-			mvneta_mac_link_down(ndev, pp->use_inband_status);
-		else
-			mvneta_mac_link_up(ndev, pp->use_inband_status);
-		phy_print_status(phydev);
-	}
-}
+static const struct phylink_mac_ops mvneta_phylink_ops = {
+	.validate = mvneta_validate,
+	.mac_link_state = mvneta_mac_link_state,
+	.mac_config = mvneta_mac_config,
+	.mac_link_down = mvneta_mac_link_down,
+	.mac_link_up = mvneta_mac_link_up,
+};
 
 static int mvneta_mdio_probe(struct mvneta_port *pp)
 {
-	struct phy_device *phy_dev;
 	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
+	int err = phylink_of_phy_connect(pp->phylink, pp->dn, 0);
 
-	phy_dev = of_phy_connect(pp->dev, pp->phy_node, mvneta_adjust_link, 0,
-				 pp->phy_interface);
-	if (!phy_dev) {
-		netdev_err(pp->dev, "could not find the PHY\n");
-		return -ENODEV;
-	}
+	if (err)
+		netdev_err(pp->dev, "could not attach PHY: %d\n", err);
 
-	phy_ethtool_get_wol(phy_dev, &wol);
+	phylink_ethtool_get_wol(pp->phylink, &wol);
 	device_set_wakeup_capable(&pp->dev->dev, !!wol.supported);
 
-	phy_dev->supported &= PHY_GBIT_FEATURES;
-	phy_dev->advertising = phy_dev->supported;
-
-	pp->link    = 0;
-	pp->duplex  = 0;
-	pp->speed   = 0;
-
-	return 0;
+	return err;
 }
 
 static void mvneta_mdio_remove(struct mvneta_port *pp)
 {
-	struct net_device *ndev = pp->dev;
-
-	phy_disconnect(ndev->phydev);
+	phylink_disconnect_phy(pp->phylink);
 }
 
 /* Electing a CPU must be done in an atomic way: it should be done
@@ -3645,10 +3634,9 @@ static int mvneta_stop(struct net_device *dev)
 
 static int mvneta_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
-	if (!dev->phydev)
-		return -ENOTSUPP;
+	struct mvneta_port *pp = netdev_priv(dev);
 
-	return phy_mii_ioctl(dev->phydev, ifr, cmd);
+	return phylink_mii_ioctl(pp->phylink, ifr, cmd);
 }
 
 /* Ethtool methods */
@@ -3659,44 +3647,25 @@ mvneta_ethtool_set_link_ksettings(struct net_device *ndev,
 				  const struct ethtool_link_ksettings *cmd)
 {
 	struct mvneta_port *pp = netdev_priv(ndev);
-	struct phy_device *phydev = ndev->phydev;
-
-	if (!phydev)
-		return -ENODEV;
-
-	if ((cmd->base.autoneg == AUTONEG_ENABLE) != pp->use_inband_status) {
-		u32 val;
-
-		mvneta_set_autoneg(pp, cmd->base.autoneg == AUTONEG_ENABLE);
 
-		if (cmd->base.autoneg == AUTONEG_DISABLE) {
-			val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
-			val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
-				 MVNETA_GMAC_CONFIG_GMII_SPEED |
-				 MVNETA_GMAC_CONFIG_FULL_DUPLEX);
-
-			if (phydev->duplex)
-				val |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
-
-			if (phydev->speed == SPEED_1000)
-				val |= MVNETA_GMAC_CONFIG_GMII_SPEED;
-			else if (phydev->speed == SPEED_100)
-				val |= MVNETA_GMAC_CONFIG_MII_SPEED;
+	return phylink_ethtool_ksettings_set(pp->phylink, cmd);
+}
 
-			mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
-		}
+/* Get link ksettings for ethtools */
+static int
+mvneta_ethtool_get_link_ksettings(struct net_device *ndev,
+				  struct ethtool_link_ksettings *cmd)
+{
+	struct mvneta_port *pp = netdev_priv(ndev);
 
-		pp->use_inband_status = (cmd->base.autoneg == AUTONEG_ENABLE);
-		netdev_info(pp->dev, "autoneg status set to %i\n",
-			    pp->use_inband_status);
+	return phylink_ethtool_ksettings_get(pp->phylink, cmd);
+}
 
-		if (netif_running(ndev)) {
-			mvneta_port_down(pp);
-			mvneta_port_up(pp);
-		}
-	}
+static int mvneta_ethtool_nway_reset(struct net_device *dev)
+{
+	struct mvneta_port *pp = netdev_priv(dev);
 
-	return phy_ethtool_ksettings_set(ndev->phydev, cmd);
+	return phylink_ethtool_nway_reset(pp->phylink);
 }
 
 /* Set interrupt coalescing for ethtools */
@@ -3958,22 +3927,18 @@ static int mvneta_ethtool_get_rxfh(struct net_device *dev, u32 *indir, u8 *key,
 static void mvneta_ethtool_get_wol(struct net_device *dev,
 				   struct ethtool_wolinfo *wol)
 {
-	wol->supported = 0;
-	wol->wolopts = 0;
+	struct mvneta_port *pp = netdev_priv(dev);
 
-	if (dev->phydev)
-		phy_ethtool_get_wol(dev->phydev, wol);
+	phylink_ethtool_get_wol(pp->phylink, wol);
 }
 
 static int mvneta_ethtool_set_wol(struct net_device *dev,
 				  struct ethtool_wolinfo *wol)
 {
+	struct mvneta_port *pp = netdev_priv(dev);
 	int ret;
 
-	if (!dev->phydev)
-		return -EOPNOTSUPP;
-
-	ret = phy_ethtool_set_wol(dev->phydev, wol);
+	ret = phylink_ethtool_set_wol(pp->phylink, wol);
 	if (!ret)
 		device_set_wakeup_enable(&dev->dev, !!wol->wolopts);
 
@@ -3993,7 +3958,7 @@ static const struct net_device_ops mvneta_netdev_ops = {
 };
 
 static const struct ethtool_ops mvneta_eth_tool_ops = {
-	.nway_reset	= phy_ethtool_nway_reset,
+	.nway_reset	= mvneta_ethtool_nway_reset,
 	.get_link       = ethtool_op_get_link,
 	.set_coalesce   = mvneta_ethtool_set_coalesce,
 	.get_coalesce   = mvneta_ethtool_get_coalesce,
@@ -4007,7 +3972,7 @@ static const struct ethtool_ops mvneta_eth_tool_ops = {
 	.get_rxnfc	= mvneta_ethtool_get_rxnfc,
 	.get_rxfh	= mvneta_ethtool_get_rxfh,
 	.set_rxfh	= mvneta_ethtool_set_rxfh,
-	.get_link_ksettings = phy_ethtool_get_link_ksettings,
+	.get_link_ksettings = mvneta_ethtool_get_link_ksettings,
 	.set_link_ksettings = mvneta_ethtool_set_link_ksettings,
 	.get_wol        = mvneta_ethtool_get_wol,
 	.set_wol        = mvneta_ethtool_set_wol,
@@ -4155,14 +4120,13 @@ static int mvneta_probe(struct platform_device *pdev)
 {
 	struct resource *res;
 	struct device_node *dn = pdev->dev.of_node;
-	struct device_node *phy_node;
 	struct device_node *bm_node;
 	struct mvneta_port *pp;
 	struct net_device *dev;
+	struct phylink *phylink;
 	const char *dt_mac_addr;
 	char hw_mac_addr[ETH_ALEN];
 	const char *mac_from;
-	const char *managed;
 	int tx_csum_limit;
 	int phy_mode;
 	int err;
@@ -4178,31 +4142,18 @@ static int mvneta_probe(struct platform_device *pdev)
 		goto err_free_netdev;
 	}
 
-	phy_node = of_parse_phandle(dn, "phy", 0);
-	if (!phy_node) {
-		if (!of_phy_is_fixed_link(dn)) {
-			dev_err(&pdev->dev, "no PHY specified\n");
-			err = -ENODEV;
-			goto err_free_irq;
-		}
-
-		err = of_phy_register_fixed_link(dn);
-		if (err < 0) {
-			dev_err(&pdev->dev, "cannot register fixed PHY\n");
-			goto err_free_irq;
-		}
-
-		/* In the case of a fixed PHY, the DT node associated
-		 * to the PHY is the Ethernet MAC DT node.
-		 */
-		phy_node = of_node_get(dn);
-	}
-
 	phy_mode = of_get_phy_mode(dn);
 	if (phy_mode < 0) {
 		dev_err(&pdev->dev, "incorrect phy-mode\n");
 		err = -EINVAL;
-		goto err_put_phy_node;
+		goto err_free_irq;
+	}
+
+	phylink = phylink_create(dev, pdev->dev.fwnode, phy_mode,
+				 &mvneta_phylink_ops);
+	if (IS_ERR(phylink)) {
+		err = PTR_ERR(phylink);
+		goto err_free_irq;
 	}
 
 	dev->tx_queue_len = MVNETA_MAX_TXD;
@@ -4213,12 +4164,9 @@ static int mvneta_probe(struct platform_device *pdev)
 
 	pp = netdev_priv(dev);
 	spin_lock_init(&pp->lock);
-	pp->phy_node = phy_node;
+	pp->phylink = phylink;
 	pp->phy_interface = phy_mode;
-
-	err = of_property_read_string(dn, "managed", &managed);
-	pp->use_inband_status = (err == 0 &&
-				 strcmp(managed, "in-band-status") == 0);
+	pp->dn = dn;
 
 	pp->rxq_def = rxq_def;
 
@@ -4240,7 +4188,7 @@ static int mvneta_probe(struct platform_device *pdev)
 		pp->clk = devm_clk_get(&pdev->dev, NULL);
 	if (IS_ERR(pp->clk)) {
 		err = PTR_ERR(pp->clk);
-		goto err_put_phy_node;
+		goto err_free_phylink;
 	}
 
 	clk_prepare_enable(pp->clk);
@@ -4377,14 +4325,6 @@ static int mvneta_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, pp->dev);
 
-	if (pp->use_inband_status) {
-		struct phy_device *phy = of_phy_find_device(dn);
-
-		mvneta_fixed_link_update(pp, phy);
-
-		put_device(&phy->mdio.dev);
-	}
-
 	return 0;
 
 err_netdev:
@@ -4401,10 +4341,9 @@ static int mvneta_probe(struct platform_device *pdev)
 err_clk:
 	clk_disable_unprepare(pp->clk_bus);
 	clk_disable_unprepare(pp->clk);
-err_put_phy_node:
-	of_node_put(phy_node);
-	if (of_phy_is_fixed_link(dn))
-		of_phy_deregister_fixed_link(dn);
+err_free_phylink:
+	if (pp->phylink)
+		phylink_destroy(pp->phylink);
 err_free_irq:
 	irq_dispose_mapping(dev->irq);
 err_free_netdev:
@@ -4416,7 +4355,6 @@ static int mvneta_probe(struct platform_device *pdev)
 static int mvneta_remove(struct platform_device *pdev)
 {
 	struct net_device  *dev = platform_get_drvdata(pdev);
-	struct device_node *dn = pdev->dev.of_node;
 	struct mvneta_port *pp = netdev_priv(dev);
 
 	unregister_netdev(dev);
@@ -4424,10 +4362,8 @@ static int mvneta_remove(struct platform_device *pdev)
 	clk_disable_unprepare(pp->clk);
 	free_percpu(pp->ports);
 	free_percpu(pp->stats);
-	if (of_phy_is_fixed_link(dn))
-		of_phy_deregister_fixed_link(dn);
 	irq_dispose_mapping(dev->irq);
-	of_node_put(pp->phy_node);
+	phylink_destroy(pp->phylink);
 	free_netdev(dev);
 
 	if (pp->bm_priv) {
@@ -4481,9 +4417,6 @@ static int mvneta_resume(struct device *device)
 		return err;
 	}
 
-	if (pp->use_inband_status)
-		mvneta_fixed_link_update(pp, dev->phydev);
-
 	netif_device_attach(dev);
 	rtnl_lock();
 	if (netif_running(dev)) {
-- 
2.7.4

^ permalink raw reply related

* [PATCH 02/10] net: mvneta: prepare to convert to phylink
From: Russell King @ 2018-01-02 17:24 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

Prepare to convert mvneta to phylink by splitting the adjust_link
function into its consituent parts.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 101 ++++++++++++++++++++--------------
 1 file changed, 60 insertions(+), 41 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index e08bd20352cb..3dfe4b3edef2 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3251,37 +3251,73 @@ static int mvneta_set_mac_addr(struct net_device *dev, void *addr)
 	return 0;
 }
 
-static void mvneta_adjust_link(struct net_device *ndev)
+static void mvneta_mac_config(struct net_device *ndev)
 {
 	struct mvneta_port *pp = netdev_priv(ndev);
 	struct phy_device *phydev = ndev->phydev;
-	int status_change = 0;
+	u32 val;
 
-	if (phydev->link) {
-		if ((pp->speed != phydev->speed) ||
-		    (pp->duplex != phydev->duplex)) {
-			u32 val;
+	if ((pp->speed != phydev->speed) ||
+	    (pp->duplex != phydev->duplex)) {
+		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
+		val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
+			 MVNETA_GMAC_CONFIG_GMII_SPEED |
+			 MVNETA_GMAC_CONFIG_FULL_DUPLEX);
 
-			val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
-			val &= ~(MVNETA_GMAC_CONFIG_MII_SPEED |
-				 MVNETA_GMAC_CONFIG_GMII_SPEED |
-				 MVNETA_GMAC_CONFIG_FULL_DUPLEX);
+		if (phydev->duplex)
+			val |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
 
-			if (phydev->duplex)
-				val |= MVNETA_GMAC_CONFIG_FULL_DUPLEX;
+		if (phydev->speed == SPEED_1000)
+			val |= MVNETA_GMAC_CONFIG_GMII_SPEED;
+		else if (phydev->speed == SPEED_100)
+			val |= MVNETA_GMAC_CONFIG_MII_SPEED;
 
-			if (phydev->speed == SPEED_1000)
-				val |= MVNETA_GMAC_CONFIG_GMII_SPEED;
-			else if (phydev->speed == SPEED_100)
-				val |= MVNETA_GMAC_CONFIG_MII_SPEED;
+		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
 
-			mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
+		pp->duplex = phydev->duplex;
+		pp->speed  = phydev->speed;
+	}
+}
 
-			pp->duplex = phydev->duplex;
-			pp->speed  = phydev->speed;
-		}
+static void mvneta_mac_link_down(struct net_device *ndev, bool autoneg)
+{
+	struct mvneta_port *pp = netdev_priv(ndev);
+	u32 val;
+
+	if (!autoneg) {
+		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
+		val &= ~MVNETA_GMAC_FORCE_LINK_PASS;
+		val |= MVNETA_GMAC_FORCE_LINK_DOWN;
+		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
 	}
 
+	mvneta_port_down(pp);
+}
+
+static void mvneta_mac_link_up(struct net_device *ndev, bool autoneg)
+{
+	struct mvneta_port *pp = netdev_priv(ndev);
+	u32 val;
+
+	if (!autoneg) {
+		val = mvreg_read(pp, MVNETA_GMAC_AUTONEG_CONFIG);
+		val &= ~MVNETA_GMAC_FORCE_LINK_DOWN;
+		val |= MVNETA_GMAC_FORCE_LINK_PASS;
+		mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
+	}
+
+	mvneta_port_up(pp);
+}
+
+static void mvneta_adjust_link(struct net_device *ndev)
+{
+	struct mvneta_port *pp = netdev_priv(ndev);
+	struct phy_device *phydev = ndev->phydev;
+	int status_change = 0;
+
+	if (phydev->link)
+		mvneta_mac_config(ndev);
+
 	if (phydev->link != pp->link) {
 		if (!phydev->link) {
 			pp->duplex = -1;
@@ -3293,27 +3329,10 @@ static void mvneta_adjust_link(struct net_device *ndev)
 	}
 
 	if (status_change) {
-		if (phydev->link) {
-			if (!pp->use_inband_status) {
-				u32 val = mvreg_read(pp,
-						  MVNETA_GMAC_AUTONEG_CONFIG);
-				val &= ~MVNETA_GMAC_FORCE_LINK_DOWN;
-				val |= MVNETA_GMAC_FORCE_LINK_PASS;
-				mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG,
-					    val);
-			}
-			mvneta_port_up(pp);
-		} else {
-			if (!pp->use_inband_status) {
-				u32 val = mvreg_read(pp,
-						  MVNETA_GMAC_AUTONEG_CONFIG);
-				val &= ~MVNETA_GMAC_FORCE_LINK_PASS;
-				val |= MVNETA_GMAC_FORCE_LINK_DOWN;
-				mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG,
-					    val);
-			}
-			mvneta_port_down(pp);
-		}
+		if (phydev->link)
+			mvneta_mac_link_down(ndev, pp->use_inband_status);
+		else
+			mvneta_mac_link_up(ndev, pp->use_inband_status);
 		phy_print_status(phydev);
 	}
 }
-- 
2.7.4

^ permalink raw reply related

* [PATCH 01/10] net: mvneta: ensure PM paths take the rtnl lock
From: Russell King @ 2018-01-02 17:24 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev
In-Reply-To: <20180102172242.GE28752@n2100.armlinux.org.uk>

The netdev core always ensures that the rtnl lock is held while calling
the ndo_open() and ndo_stop() methods. However, the suspend/resume paths
do not hold the rtnl lock. phylink will expect the rtnl lock to be held
when the MAC driver calls it, so we end up with kernel warnings. Take
the lock to ensure that these functions are called in a consistent
manner.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/ethernet/marvell/mvneta.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index a539263cd79c..e08bd20352cb 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -4426,8 +4426,10 @@ static int mvneta_suspend(struct device *device)
 	struct net_device *dev = dev_get_drvdata(device);
 	struct mvneta_port *pp = netdev_priv(dev);
 
+	rtnl_lock();
 	if (netif_running(dev))
 		mvneta_stop(dev);
+	rtnl_unlock();
 	netif_device_detach(dev);
 	clk_disable_unprepare(pp->clk_bus);
 	clk_disable_unprepare(pp->clk);
@@ -4464,10 +4466,12 @@ static int mvneta_resume(struct device *device)
 		mvneta_fixed_link_update(pp, dev->phydev);
 
 	netif_device_attach(dev);
+	rtnl_lock();
 	if (netif_running(dev)) {
 		mvneta_open(dev);
 		mvneta_set_rx_mode(dev);
 	}
+	rtnl_unlock();
 
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related

* Re: general protection fault in skb_segment
From: Willem de Bruijn @ 2018-01-02 17:23 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: syzbot, David Miller, LKML, linux-sctp, Network Development,
	Neil Horman, syzkaller-bugs, Vladislav Yasevich
In-Reply-To: <CAF=yD-K9M1yps+SogVLSVgoi+ngKzKrwWH+xMr4-m+v-pqyXXQ@mail.gmail.com>

> Actually, changes just to inet_gso_segment and ipv6_gso_segment
> will suffice:
>
>        bool udpfrag = false, fixedid = false, gso_partial, encap;
>         struct sk_buff *segs = ERR_PTR(-EINVAL);
> +       unsigned int offset = 0, gso_type;
>         const struct net_offload *ops;
> -       unsigned int offset = 0;
>         struct iphdr *iph;
>         int proto, tot_len;
>         int nhoff;
> @@ -1258,6 +1258,22 @@ struct sk_buff *inet_gso_segment(struct sk_buff *skb,
>
>         skb_reset_transport_header(skb);
>
> +       gso_type = skb_shinfo(skb)->gso_type;
> +       if (gso_type & SKB_GSO_DODGY) {
> +               switch (gso_type & (SKB_GSO_TCPV4 | SKB_GSO_UDP)) {
> +               case SKB_GSO_TCPV4:
> +                       if (proto != IPPROTO_TCP)
> +                               goto out;
> +                       break;
> +               case SKB_GSO_UDP:
> +                       if (proto != IPPROTO_UDP)
> +                               goto out;
> +                       break;
> +               default:
> +                       goto out;
> +               }
> +       }
>
> and analogous for IPv6. For a real patch I would deduplicate this
> logic between them and move it to a separate helper function
> (in a header file, then).

This approach would also need an skb->protocol check either in
virtio_net_hdr_to_skb or skb_mac_gso_segment.

^ permalink raw reply

* [PATCH 00/10] Convert mvneta to phylink
From: Russell King - ARM Linux @ 2018-01-02 17:22 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Thomas Petazzoni; +Cc: netdev

Hi,

This series converts mvneta to use phylink, which is necessary to
support the SFP cages on SolidRun's Clearfog platform.  This series just
converts mvneta without adding the DT parts - having discussed with
Andrew, we believe we're too close to the merge window to submit that
patch.

I've split the "net: mvneta: convert to phylink" patch up to make it
easier to review, and in doing so, spotted some minor corner cases that
needed to be fixed along the way.

This series depends on the previously merged phylink patches in netdev,
along with the recently reviewed 7 patch series "Resolve races in phy
accessors" without which, the race described in patch 5 of that series
is very evident when triggering a dummy hibernate cycle.

This series also illustrates how to convert mvpp2 to phylink.

mvneta is the only user of the fixed_phy_update_state() API, and this
becomes redundant with the conversion.

It would be good to get this series not only reviewed, but also
independently tested to ensure that I haven't missed anything - I only
have the Clearfog platform to test on, and that doesn't support all the
different interface modes that mvneta supports.

A particularly interesting side effect of this series is that DSA
switches no longer need the "CPU" port and DSA facing MAC ethernet
instance to be marked as a fixed link anymore with mvneta - we can use
1000BaseX mode, and the DSA to CPU link will use the 802.3z negotiation
to determine the link properties without needing the link parameters to
be explicitly stated in DT - that is a subject of a future patch.

 drivers/net/ethernet/marvell/Kconfig  |   2 +-
 drivers/net/ethernet/marvell/mvneta.c | 687 ++++++++++++++++++++--------------
 drivers/net/phy/fixed_phy.c           |  31 --
 include/linux/phy_fixed.h             |   9 -
 4 files changed, 405 insertions(+), 324 deletions(-)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

^ permalink raw reply

* Re: [net-next: PATCH v2 5/5] net: mvpp2: enable ACPI support in the driver
From: Florian Fainelli @ 2018-01-02 17:22 UTC (permalink / raw)
  To: Marcin Wojtas, Andrew Lunn
  Cc: linux-kernel, linux-arm-kernel, netdev, linux-acpi,
	Graeme Gregory, David S. Miller, Russell King - ARM Linux,
	Rafael J. Wysocki, Antoine Ténart, Thomas Petazzoni,
	Gregory Clément, Ezequiel Garcia, nadavh,
	Neta Zur Hershkovits, Ard Biesheuvel, Grzegorz Jaszczyk,
	Tomasz Nowicki
In-Reply-To: <CAPv3WKdD0HwdFVOBx-yeTCfcMF7DXxxFaUt+Zm06oxuVuty8Rg@mail.gmail.com>

On January 2, 2018 7:05:35 AM PST, Marcin Wojtas <mw@semihalf.com> wrote:
>2018-01-02 15:08 GMT+01:00 Andrew Lunn <andrew@lunn.ch>:
>>> Indeed in of_mdio_bus_register_phy, there is of_irq_get. This is
>more
>>> a discussion for a MDIO bus / ACPI patchset, but we either find a
>way
>>> to use IRQs with ACPI obtained from child nodes or for this world
>the
>>> functionality will be limited (at least for the beginning).
>>
>> Hi Marcin
>>
>> What i want to avoid is adding something which partially works, and
>> then have to throw it all away and start again in order to add full
>> support.
>>
>> If ACPI really limits interrupts to devices, maybe we need a totally
>> different representation of MDIO and PHYs in ACPI to what it used in
>> device tree? The same may be true for the Ethernet ports of the
>mvpp2?
>> They might have to be represented as real devices, not children of a
>> device? Maybe trying to map DT to ACPI on a one-to-one basis is the
>> wrong approach?
>>
>
>In terms of PP2 controller, I'd prefer to keep as much as possible to
>describing how real hardware looks like, i.e. single common controller
>with multiple ports as its children. Those considerations are
>reflected in the DT description shape and how the driver enumerates,
>which was part of the design of the initial support. Bending the
>driver (huge amount of shared initialization and resources) to
>multiple instances just for the sake of possible avoidance of IRQ
>description in ACPI is IMO a huge and unnecessary overkill.

True, although keep in mind that Device Tree, as implemented in Linux allows for a lot of flexibility in how parent/child nodes are represented and backed or not by a corresponding struct device. I suspect ACPI is much less permissive than that and we might want to have a struct device for the whole mvpp2 controller as well as for the child ports within that controller (something you could today with device tree too, just it would be more overhead). This does not necessarily have to influence the representation within the description language but we should not be biased by how the current implementation using Device Tree has shaped both representation and implementation.

-- 
Florian

^ permalink raw reply

* Re: [net-next PATCH] net: ptr_ring: otherwise safe empty checks can overrun array bounds
From: Michael S. Tsirkin @ 2018-01-02 17:17 UTC (permalink / raw)
  To: David Miller; +Cc: john.fastabend, jakub.kicinski, xiyou.wangcong, jiri, netdev
In-Reply-To: <20180102190107-mutt-send-email-mst@kernel.org>

On Tue, Jan 02, 2018 at 07:01:33PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jan 02, 2018 at 11:52:19AM -0500, David Miller wrote:
> > From: John Fastabend <john.fastabend@gmail.com>
> > Date: Wed, 27 Dec 2017 19:50:25 -0800
> > 
> > > When running consumer and/or producer operations and empty checks in
> > > parallel its possible to have the empty check run past the end of the
> > > array. The scenario occurs when an empty check is run while
> > > __ptr_ring_discard_one() is in progress. Specifically after the
> > > consumer_head is incremented but before (consumer_head >= ring_size)
> > > check is made and the consumer head is zeroe'd.
> > > 
> > > To resolve this, without having to rework how consumer/producer ops
> > > work on the array, simply add an extra dummy slot to the end of the
> > > array. Even if we did a rework to avoid the extra slot it looks
> > > like the normal case checks would suffer some so best to just
> > > allocate an extra pointer.
> > > 
> > > Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > > Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
> > > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > 
> > Applied, thanks John.
> 
> I think that patch is wrong. I'd rather it got reverted.

And replaced with something like the following - stil untested, but
apparently there's some urgency to fix it so posting for review ASAP.

John, others, could you pls confirm it's not too bad performance-wise?
I'll then split it up properly and re-post.

-->

net: don't misuse ptr_ring_peek

ptr_ring_peek only claims to be safe if the result is never
dereferenced, which isn't the case for its use in sch_generic.
Add locked API variants and use the bh one here.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 6866df4..f3d96c7 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -236,6 +236,51 @@ static inline bool ptr_ring_empty_bh(struct ptr_ring *r)
 	return ret;
 }
 
+static inline void *ptr_ring_peek(struct ptr_ring *r)
+{
+	void *ret;
+
+	spin_lock(&r->consumer_lock);
+	ret = __ptr_ring_peek(r);
+	spin_unlock(&r->consumer_lock);
+
+	return ret;
+}
+
+static inline void *ptr_ring_peek_irq(struct ptr_ring *r)
+{
+	void *ret;
+
+	spin_lock_irq(&r->consumer_lock);
+	ret = __ptr_ring_peek(r);
+	spin_unlock_irq(&r->consumer_lock);
+
+	return ret;
+}
+
+static inline void *ptr_ring_peek_any(struct ptr_ring *r)
+{
+	unsigned long flags;
+	void *ret;
+
+	spin_lock_irqsave(&r->consumer_lock, flags);
+	ret = __ptr_ring_peek(r);
+	spin_unlock_irqrestore(&r->consumer_lock, flags);
+
+	return ret;
+}
+
+static inline void *ptr_ring_peek_bh(struct ptr_ring *r)
+{
+	void *ret;
+
+	spin_lock_bh(&r->consumer_lock);
+	ret = __ptr_ring_peek(r);
+	spin_unlock_bh(&r->consumer_lock);
+
+	return ret;
+}
+
 /* Must only be called after __ptr_ring_peek returned !NULL */
 static inline void __ptr_ring_discard_one(struct ptr_ring *r)
 {
diff --git a/include/linux/skb_array.h b/include/linux/skb_array.h
index c7addf3..ee3625c 100644
--- a/include/linux/skb_array.h
+++ b/include/linux/skb_array.h
@@ -97,6 +97,26 @@ static inline bool skb_array_empty_any(struct skb_array *a)
 	return ptr_ring_empty_any(&a->ring);
 }
 
+static inline struct sk_buff *skb_array_peek(struct skb_array *a)
+{
+	return ptr_ring_peek(&a->ring);
+}
+
+static inline struct sk_buff *skb_array_peek_bh(struct skb_array *a)
+{
+	return ptr_ring_peek_bh(&a->ring);
+}
+
+static inline struct sk_buff *skb_array_peek_irq(struct skb_array *a)
+{
+	return ptr_ring_peek_irq(&a->ring);
+}
+
+static inline struct sk_buff *skb_array_peek_any(struct skb_array *a)
+{
+	return ptr_ring_peek_any(&a->ring);
+}
+
 static inline struct sk_buff *skb_array_consume(struct skb_array *a)
 {
 	return ptr_ring_consume(&a->ring);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cc069b2..9288e84 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -659,7 +659,7 @@ static struct sk_buff *pfifo_fast_peek(struct Qdisc *qdisc)
 	for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
 		struct skb_array *q = band2list(priv, band);
 
-		skb = __skb_array_peek(q);
+		skb = skb_array_peek_bh(q);
 	}
 
 	return skb;

^ permalink raw reply related

* [net-next] ipv6: sr: export some functions of seg6local
From: Ahmed Abdelsalam @ 2017-12-29 21:09 UTC (permalink / raw)
  To: davem, kuznet, yoshfuji; +Cc: david.lebrun, netdev, amsalam20

Some functions of seg6local are very useful to process SRv6
encapsulated packets.

This patch exports some functions of seg6local that are useful and
can be re-used at different parts of the kernel.

The set of exported functions are:
(1) get_srh()
(2) advance_nextseg()
(3) lookup_nexthop

Signed-off-by: Ahmed Abdelsalam <amsalam20@gmail.com>
---
I'm writing some extensions to netfilter framework to support
Segment Routing. These function are useful to process
SR-encapsulated packets

 include/net/seg6.h    |  4 ++++
 net/ipv6/seg6_local.c | 11 +++++++----
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/include/net/seg6.h b/include/net/seg6.h
index 099bad5..4058f23 100644
--- a/include/net/seg6.h
+++ b/include/net/seg6.h
@@ -60,6 +60,10 @@ extern int seg6_local_init(void);
 extern void seg6_local_exit(void);
 
 extern bool seg6_validate_srh(struct ipv6_sr_hdr *srh, int len);
+extern struct ipv6_sr_hdr *get_srh(struct sk_buff *skb);
+extern void advance_nextseg(struct ipv6_sr_hdr *srh, struct in6_addr *daddr);
+extern void lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr,
+			u32 tbl_id);
 extern int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh,
 			     int proto);
 extern int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh);
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 825b8e0..5661d6c 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -59,7 +59,7 @@ static struct seg6_local_lwt *seg6_local_lwtunnel(struct lwtunnel_state *lwt)
 	return (struct seg6_local_lwt *)lwt->data;
 }
 
-static struct ipv6_sr_hdr *get_srh(struct sk_buff *skb)
+struct ipv6_sr_hdr *get_srh(struct sk_buff *skb)
 {
 	struct ipv6_sr_hdr *srh;
 	int len, srhoff = 0;
@@ -82,6 +82,7 @@ static struct ipv6_sr_hdr *get_srh(struct sk_buff *skb)
 
 	return srh;
 }
+EXPORT_SYMBOL_GPL(get_srh);
 
 static struct ipv6_sr_hdr *get_and_validate_srh(struct sk_buff *skb)
 {
@@ -131,7 +132,7 @@ static bool decap_and_validate(struct sk_buff *skb, int proto)
 	return true;
 }
 
-static void advance_nextseg(struct ipv6_sr_hdr *srh, struct in6_addr *daddr)
+void advance_nextseg(struct ipv6_sr_hdr *srh, struct in6_addr *daddr)
 {
 	struct in6_addr *addr;
 
@@ -139,9 +140,10 @@ static void advance_nextseg(struct ipv6_sr_hdr *srh, struct in6_addr *daddr)
 	addr = srh->segments + srh->segments_left;
 	*daddr = *addr;
 }
+EXPORT_SYMBOL_GPL(advance_nextseg);
 
-static void lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr,
-			   u32 tbl_id)
+void lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr,
+		    u32 tbl_id)
 {
 	struct net *net = dev_net(skb->dev);
 	struct ipv6hdr *hdr = ipv6_hdr(skb);
@@ -188,6 +190,7 @@ static void lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr,
 	skb_dst_drop(skb);
 	skb_dst_set(skb, dst);
 }
+EXPORT_SYMBOL_GPL(lookup_nexthop);
 
 /* regular endpoint function */
 static int input_action_end(struct sk_buff *skb, struct seg6_local_lwt *slwt)
-- 
2.1.4

^ permalink raw reply related

* Re: [net-next PATCH] net: ptr_ring: otherwise safe empty checks can overrun array bounds
From: Michael S. Tsirkin @ 2018-01-02 17:01 UTC (permalink / raw)
  To: John Fastabend; +Cc: jakub.kicinski, davem, xiyou.wangcong, jiri, netdev
In-Reply-To: <20180102183322-mutt-send-email-mst@kernel.org>

On Tue, Jan 02, 2018 at 06:53:08PM +0200, Michael S. Tsirkin wrote:
> On Wed, Dec 27, 2017 at 07:50:25PM -0800, John Fastabend wrote:
> > When running consumer and/or producer operations and empty checks in
> > parallel its possible to have the empty check run past the end of the
> > array. The scenario occurs when an empty check is run while
> > __ptr_ring_discard_one() is in progress. Specifically after the
> > consumer_head is incremented but before (consumer_head >= ring_size)
> > check is made and the consumer head is zeroe'd.
> > 
> > To resolve this, without having to rework how consumer/producer ops
> > work on the array, simply add an extra dummy slot to the end of the
> > array. Even if we did a rework to avoid the extra slot it looks
> > like the normal case checks would suffer some so best to just
> > allocate an extra pointer.
> > 
> > Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> 
> 
> 
> 
> > ---
> >  include/linux/ptr_ring.h |    7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
> > index 6866df4..13fb06a 100644
> > --- a/include/linux/ptr_ring.h
> > +++ b/include/linux/ptr_ring.h
> > @@ -447,7 +447,12 @@ static inline int ptr_ring_consume_batched_bh(struct ptr_ring *r,
> >  
> >  static inline void **__ptr_ring_init_queue_alloc(unsigned int size, gfp_t gfp)
> >  {
> > -	return kcalloc(size, sizeof(void *), gfp);
> > +	/* Allocate an extra dummy element at end of ring to avoid consumer head
> > +	 * or produce head access past the end of the array. Possible when
> > +	 * producer/consumer operations and __ptr_ring_peek operations run in
> > +	 * parallel.
> > +	 */
> > +	return kcalloc(size + 1, sizeof(void *), gfp);
> >  }
> >  
> >  static inline void __ptr_ring_set_size(struct ptr_ring *r, int size)
> 
> 
> Well the peek will return a false negative then, won't it?
> 
> So I kind of prefer just fixing the consumer.  The first step I think
> would look something like the below untested patch.  Pls take a look.  I
> suspect we'll need a memory barrier too.
> 
> I wonder though: are false positives or negatives ever a problem?
> 
> Would it be a big deal to just take a lock there, and
> avoid trying to support a lockless peek?
> 
> 
> It would definitely be more straight-forward to just
> remove the promise to support a lockless peek.
> 
> Thoughts?

In fact, the API says:

 * Callers must take consumer_lock
 * if they dereference the pointer - see e.g. PTR_RING_PEEK_CALL.
 * If ring is never resized, and if the pointer is merely
 * tested, there's no need to take the lock - see e.g.  __ptr_ring_empty.

So it looks like the API is actually misused here as callers
will dereferences the skb returned.


> -->
> 
> ptr_ring: keep consumer_head valid at all times
> 
> The comment near __ptr_ring_peek says: 
>  
>  * If ring is never resized, and if the pointer is merely 
>  * tested, there's no need to take the lock - see e.g.  __ptr_ring_empty. 
> 
> but this was in fact never possible.
> 
> Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
> 
> diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
> index 37b4bb2..802375f 100644
> --- a/include/linux/ptr_ring.h
> +++ b/include/linux/ptr_ring.h
> @@ -236,22 +236,28 @@ static inline void __ptr_ring_discard_one(struct ptr_ring *r)
>  	/* Fundamentally, what we want to do is update consumer
>  	 * index and zero out the entry so producer can reuse it.
>  	 * Doing it naively at each consume would be as simple as:
> -	 *       r->queue[r->consumer++] = NULL;
> -	 *       if (unlikely(r->consumer >= r->size))
> -	 *               r->consumer = 0;
> +	 *       consumer = r->consumer;
> +	 *       r->queue[consumer++] = NULL;
> +	 *       if (unlikely(consumer >= r->size))
> +	 *               consumer = 0;
> +	 *       r->consumer = consumer;
>  	 * but that is suboptimal when the ring is full as producer is writing
>  	 * out new entries in the same cache line.  Defer these updates until a
>  	 * batch of entries has been consumed.
>  	 */
> -	int head = r->consumer_head++;
> +	/* Note: we must keep consumer_head valid at all times for __ptr_ring_peek
> +	 * to work correctly.
> +	 */
> +	int consumer_head = r->consumer_head;
> +	int head = consumer_head++;
>  
>  	/* Once we have processed enough entries invalidate them in
>  	 * the ring all at once so producer can reuse their space in the ring.
>  	 * We also do this when we reach end of the ring - not mandatory
>  	 * but helps keep the implementation simple.
>  	 */
> -	if (unlikely(r->consumer_head - r->consumer_tail >= r->batch ||
> -		     r->consumer_head >= r->size)) {
> +	if (unlikely(consumer_head - r->consumer_tail >= r->batch ||
> +		     consumer_head >= r->size)) {
>  		/* Zero out entries in the reverse order: this way we touch the
>  		 * cache line that producer might currently be reading the last;
>  		 * producer won't make progress and touch other cache lines
> @@ -259,12 +265,13 @@ static inline void __ptr_ring_discard_one(struct ptr_ring *r)
>  		 */
>  		while (likely(head >= r->consumer_tail))
>  			r->queue[head--] = NULL;
> -		r->consumer_tail = r->consumer_head;
> +		r->consumer_tail = consumer_head;
>  	}
> -	if (unlikely(r->consumer_head >= r->size)) {
> -		r->consumer_head = 0;
> +	if (unlikely(consumer_head >= r->size)) {
> +		consumer_head = 0;
>  		r->consumer_tail = 0;
>  	}
> +	r->consumer_head = consumer_head;
>  }
>  
>  static inline void *__ptr_ring_consume(struct ptr_ring *r)

^ permalink raw reply

* Re: [net-next PATCH] net: ptr_ring: otherwise safe empty checks can overrun array bounds
From: Michael S. Tsirkin @ 2018-01-02 17:01 UTC (permalink / raw)
  To: David Miller; +Cc: john.fastabend, jakub.kicinski, xiyou.wangcong, jiri, netdev
In-Reply-To: <20180102.115219.1101472320429215260.davem@davemloft.net>

On Tue, Jan 02, 2018 at 11:52:19AM -0500, David Miller wrote:
> From: John Fastabend <john.fastabend@gmail.com>
> Date: Wed, 27 Dec 2017 19:50:25 -0800
> 
> > When running consumer and/or producer operations and empty checks in
> > parallel its possible to have the empty check run past the end of the
> > array. The scenario occurs when an empty check is run while
> > __ptr_ring_discard_one() is in progress. Specifically after the
> > consumer_head is incremented but before (consumer_head >= ring_size)
> > check is made and the consumer head is zeroe'd.
> > 
> > To resolve this, without having to rework how consumer/producer ops
> > work on the array, simply add an extra dummy slot to the end of the
> > array. Even if we did a rework to avoid the extra slot it looks
> > like the normal case checks would suffer some so best to just
> > allocate an extra pointer.
> > 
> > Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
> > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> 
> Applied, thanks John.

I think that patch is wrong. I'd rather it got reverted.

^ permalink raw reply

* Re: [net-next PATCH] net: ptr_ring: otherwise safe empty checks can overrun array bounds
From: Michael S. Tsirkin @ 2018-01-02 16:53 UTC (permalink / raw)
  To: John Fastabend; +Cc: jakub.kicinski, davem, xiyou.wangcong, jiri, netdev
In-Reply-To: <20171228035024.14699.69453.stgit@john-Precision-Tower-5810>

On Wed, Dec 27, 2017 at 07:50:25PM -0800, John Fastabend wrote:
> When running consumer and/or producer operations and empty checks in
> parallel its possible to have the empty check run past the end of the
> array. The scenario occurs when an empty check is run while
> __ptr_ring_discard_one() is in progress. Specifically after the
> consumer_head is incremented but before (consumer_head >= ring_size)
> check is made and the consumer head is zeroe'd.
> 
> To resolve this, without having to rework how consumer/producer ops
> work on the array, simply add an extra dummy slot to the end of the
> array. Even if we did a rework to avoid the extra slot it looks
> like the normal case checks would suffer some so best to just
> allocate an extra pointer.
> 
> Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>




> ---
>  include/linux/ptr_ring.h |    7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
> index 6866df4..13fb06a 100644
> --- a/include/linux/ptr_ring.h
> +++ b/include/linux/ptr_ring.h
> @@ -447,7 +447,12 @@ static inline int ptr_ring_consume_batched_bh(struct ptr_ring *r,
>  
>  static inline void **__ptr_ring_init_queue_alloc(unsigned int size, gfp_t gfp)
>  {
> -	return kcalloc(size, sizeof(void *), gfp);
> +	/* Allocate an extra dummy element at end of ring to avoid consumer head
> +	 * or produce head access past the end of the array. Possible when
> +	 * producer/consumer operations and __ptr_ring_peek operations run in
> +	 * parallel.
> +	 */
> +	return kcalloc(size + 1, sizeof(void *), gfp);
>  }
>  
>  static inline void __ptr_ring_set_size(struct ptr_ring *r, int size)


Well the peek will return a false negative then, won't it?

So I kind of prefer just fixing the consumer.  The first step I think
would look something like the below untested patch.  Pls take a look.  I
suspect we'll need a memory barrier too.

I wonder though: are false positives or negatives ever a problem?

Would it be a big deal to just take a lock there, and
avoid trying to support a lockless peek?


It would definitely be more straight-forward to just
remove the promise to support a lockless peek.

Thoughts?

-->

ptr_ring: keep consumer_head valid at all times

The comment near __ptr_ring_peek says: 
 
 * If ring is never resized, and if the pointer is merely 
 * tested, there's no need to take the lock - see e.g.  __ptr_ring_empty. 

but this was in fact never possible.

Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 37b4bb2..802375f 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -236,22 +236,28 @@ static inline void __ptr_ring_discard_one(struct ptr_ring *r)
 	/* Fundamentally, what we want to do is update consumer
 	 * index and zero out the entry so producer can reuse it.
 	 * Doing it naively at each consume would be as simple as:
-	 *       r->queue[r->consumer++] = NULL;
-	 *       if (unlikely(r->consumer >= r->size))
-	 *               r->consumer = 0;
+	 *       consumer = r->consumer;
+	 *       r->queue[consumer++] = NULL;
+	 *       if (unlikely(consumer >= r->size))
+	 *               consumer = 0;
+	 *       r->consumer = consumer;
 	 * but that is suboptimal when the ring is full as producer is writing
 	 * out new entries in the same cache line.  Defer these updates until a
 	 * batch of entries has been consumed.
 	 */
-	int head = r->consumer_head++;
+	/* Note: we must keep consumer_head valid at all times for __ptr_ring_peek
+	 * to work correctly.
+	 */
+	int consumer_head = r->consumer_head;
+	int head = consumer_head++;
 
 	/* Once we have processed enough entries invalidate them in
 	 * the ring all at once so producer can reuse their space in the ring.
 	 * We also do this when we reach end of the ring - not mandatory
 	 * but helps keep the implementation simple.
 	 */
-	if (unlikely(r->consumer_head - r->consumer_tail >= r->batch ||
-		     r->consumer_head >= r->size)) {
+	if (unlikely(consumer_head - r->consumer_tail >= r->batch ||
+		     consumer_head >= r->size)) {
 		/* Zero out entries in the reverse order: this way we touch the
 		 * cache line that producer might currently be reading the last;
 		 * producer won't make progress and touch other cache lines
@@ -259,12 +265,13 @@ static inline void __ptr_ring_discard_one(struct ptr_ring *r)
 		 */
 		while (likely(head >= r->consumer_tail))
 			r->queue[head--] = NULL;
-		r->consumer_tail = r->consumer_head;
+		r->consumer_tail = consumer_head;
 	}
-	if (unlikely(r->consumer_head >= r->size)) {
-		r->consumer_head = 0;
+	if (unlikely(consumer_head >= r->size)) {
+		consumer_head = 0;
 		r->consumer_tail = 0;
 	}
+	r->consumer_head = consumer_head;
 }
 
 static inline void *__ptr_ring_consume(struct ptr_ring *r)

^ permalink raw reply related

* Re: [net-next PATCH] net: ptr_ring: otherwise safe empty checks can overrun array bounds
From: David Miller @ 2018-01-02 16:52 UTC (permalink / raw)
  To: john.fastabend; +Cc: jakub.kicinski, mst, xiyou.wangcong, jiri, netdev
In-Reply-To: <20171228035024.14699.69453.stgit@john-Precision-Tower-5810>

From: John Fastabend <john.fastabend@gmail.com>
Date: Wed, 27 Dec 2017 19:50:25 -0800

> When running consumer and/or producer operations and empty checks in
> parallel its possible to have the empty check run past the end of the
> array. The scenario occurs when an empty check is run while
> __ptr_ring_discard_one() is in progress. Specifically after the
> consumer_head is incremented but before (consumer_head >= ring_size)
> check is made and the consumer head is zeroe'd.
> 
> To resolve this, without having to rework how consumer/producer ops
> work on the array, simply add an extra dummy slot to the end of the
> array. Even if we did a rework to avoid the extra slot it looks
> like the normal case checks would suffer some so best to just
> allocate an extra pointer.
> 
> Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Fixes: c5ad119fb6c09 ("net: sched: pfifo_fast use skb_array")
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Applied, thanks John.

^ permalink raw reply

* Re: [net] Revert "net: core: maybe return -EEXIST in __dev_alloc_name"
From: Johannes Berg @ 2018-01-02 16:52 UTC (permalink / raw)
  To: David Miller, mpe; +Cc: linux, michael, j, netdev, linuxppc-dev
In-Reply-To: <20180102.115008.2038929402603091054.davem@davemloft.net>

On Tue, 2018-01-02 at 11:50 -0500, David Miller wrote:
> From: Michael Ellerman <mpe@ellerman.id.au>
> Date: Fri, 22 Dec 2017 15:22:22 +1100
> 
> >> On Tue, Dec 19 2017, Michael Ellerman <michael@concordia.ellerman.id.au> wrote:
> >>> This revert seems to have broken networking on one of my powerpc
> >>> machines, according to git bisect.
> >>>
> >>> The symptom is DHCP fails and I don't get a link, I didn't dig any
> >>> further than that. I can if it's helpful.
> >>>
> >>> I think the problem is that 87c320e51519 ("net: core: dev_get_valid_name
> >>> is now the same as dev_alloc_name_ns") only makes sense while
> >>> d6f295e9def0 remains in the tree.
> >>
> >> I'm sorry about all of this, I really didn't think there would be such
> >> consequences of changing an errno return. Indeed, d6f29 was preparation
> >> for unifying the two functions that do the exact same thing (and how we
> >> ever got into that situation is somewhat unclear), except for
> >> their behaviour in the case the requested name already exists. So one of
> >> the two interfaces had to change its return value, and as I wrote, I
> >> thought EEXIST was the saner choice when an explicit name (no %d) had
> >> been requested.
> > 
> > No worries.
> > 
> >>> ie. before the entire series, dev_get_valid_name() would return EEXIST,
> >>> and that was retained when 87c320e51519 was merged, but now that
> >>> d6f295e9def0 has been reverted dev_get_valid_name() is returning ENFILE.
> >>>
> >>> I can get the network up again if I also revert 87c320e51519 ("net:
> >>> core: dev_get_valid_name is now the same as dev_alloc_name_ns"), or with
> >>> the gross patch below.
> >>
> >> I don't think changing -ENFILE to -EEXIST would be right either, since
> >> dev_get_valid_name() used to be able to return both (-EEXIST in the case
> >> where there's no %d, -ENFILE in the case where we end up calling
> >> dev_alloc_name_ns()). If anything, we could do the check for the old
> >> -EEXIST condition first, and then call dev_alloc_name_ns(). But I'm also
> >> fine with reverting.
> > 
> > Yeah I think a revert would be best, given it's nearly rc5.
> > 
> > My userspace is not exotic AFAIK, just debian something, so presumably
> > this will affect other people too.
> 
> I've just queued up the following revert, thanks!
> 
> ====================
> From 5047543928139184f060c8f3bccb788b3df4c1ea Mon Sep 17 00:00:00 2001
> From: "David S. Miller" <davem@davemloft.net>
> Date: Tue, 2 Jan 2018 11:45:07 -0500
> Subject: [PATCH] Revert "net: core: dev_get_valid_name is now the same as
>  dev_alloc_name_ns"
> 
> This reverts commit 87c320e51519a83c496ab7bfb4e96c8f9c001e89.
> 
> Changing the error return code in some situations turns out to
> be harmful in practice.  In particular Michael Ellerman reports
> that DHCP fails on his powerpc machines, and this revert gets
> things working again.
> 
> Johannes Berg agrees that this revert is the best course of
> action for now.

I'm not sure my voice matters much, I merely did the first revert of
these two patches ... :)

But I agree with Michael that you can't really salvage this without the
other patch, and that one caused problems in wifi ...

Thanks :)

johannes 

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox