Netdev List
 help / color / mirror / Atom feed
* [PATCH net] net: sched: fix reordering issues
From: Eric Dumazet @ 2019-09-05 12:20 UTC (permalink / raw)
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet, John Fastabend

Whenever MQ is not used on a multiqueue device, we experience
serious reordering problems. Bisection found the cited
commit.

The issue can be described this way :

- A single qdisc hierarchy is shared by all transmit queues.
  (eg : tc qdisc replace dev eth0 root fq_codel)

- When/if try_bulk_dequeue_skb_slow() dequeues a packet targetting
  a different transmit queue than the one used to build a packet train,
  we stop building the current list and save the 'bad' skb (P1) in a
  special queue. (bad_txq)

- When dequeue_skb() calls qdisc_dequeue_skb_bad_txq() and finds this
  skb (P1), it checks if the associated transmit queues is still in frozen
  state. If the queue is still blocked (by BQL or NIC tx ring full),
  we leave the skb in bad_txq and return NULL.

- dequeue_skb() calls q->dequeue() to get another packet (P2)

  The other packet can target the problematic queue (that we found
  in frozen state for the bad_txq packet), but another cpu just ran
  TX completion and made room in the txq that is now ready to accept
  new packets.

- Packet P2 is sent while P1 is still held in bad_txq, P1 might be sent
  at next round. In practice P2 is the lead of a big packet train
  (P2,P3,P4 ...) filling the BQL budget and delaying P1 by many packets :/

To solve this problem, we have to block the dequeue process as long
as the first packet in bad_txq can not be sent. Reordering issues
disappear and no side effects have been seen.

Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
 net/sched/sch_generic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 137db1cbde8538e124133c6e148acc3e92e56a74..ac28f6a5d70e0b38ae8ce7858f08e9d15778c22f 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -46,6 +46,8 @@ EXPORT_SYMBOL(default_qdisc_ops);
  * - updates to tree and tree walking are only done under the rtnl mutex.
  */
 
+#define SKB_XOFF_MAGIC ((struct sk_buff *)1UL)
+
 static inline struct sk_buff *__skb_dequeue_bad_txq(struct Qdisc *q)
 {
 	const struct netdev_queue *txq = q->dev_queue;
@@ -71,7 +73,7 @@ static inline struct sk_buff *__skb_dequeue_bad_txq(struct Qdisc *q)
 				q->q.qlen--;
 			}
 		} else {
-			skb = NULL;
+			skb = SKB_XOFF_MAGIC;
 		}
 	}
 
@@ -253,8 +255,11 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate,
 		return skb;
 
 	skb = qdisc_dequeue_skb_bad_txq(q);
-	if (unlikely(skb))
+	if (unlikely(skb)) {
+		if (skb == SKB_XOFF_MAGIC)
+			return NULL;
 		goto bulk;
+	}
 	skb = q->dequeue(q);
 	if (skb) {
 bulk:
-- 
2.23.0.187.g17f5b7556c-goog


^ permalink raw reply related

* [PATCH v3 net-next] net: stmmac: Add support for MDIO interrupts
From: Voon Weifeng @ 2019-09-05 12:05 UTC (permalink / raw)
  To: David S. Miller, Maxime Coquelin
  Cc: netdev, linux-kernel, Jose Abreu, Giuseppe Cavallaro, Andrew Lunn,
	Alexandre Torgue, Ong Boon Leong, Voon Weifeng

From: "Chuah, Kim Tatt" <kim.tatt.chuah@intel.com>

DW EQoS v5.xx controllers added capability for interrupt generation
when MDIO interface is done (GMII Busy bit is cleared).
This patch adds support for this interrupt on supported HW to avoid
polling on GMII Busy bit.

stmmac_mdio_read() & stmmac_mdio_write() will sleep until wake_up() is
called by the interrupt handler.

Reviewed-by: Voon Weifeng <weifeng.voon@intel.com>
Reviewed-by: Kweh, Hock Leong <hock.leong.kweh@intel.com>
Reviewed-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Chuah, Kim Tatt <kim.tatt.chuah@intel.com>
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>

Changelog v3
*Move the changelog to before the --- marker line
Changelog v2
*mdio interrupt mode or polling mode will depends on mdio interrupt
 enable bit
*Disable the mdio interrupt enable bit in stmmac_release
*Remove the condition for initialize wait queues
*Applied reverse Christmas tree
---
 drivers/net/ethernet/stmicro/stmmac/common.h      |  2 +
 drivers/net/ethernet/stmicro/stmmac/dwmac4.h      |  1 +
 drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c |  7 +++
 drivers/net/ethernet/stmicro/stmmac/dwmac5.c      |  8 ++++
 drivers/net/ethernet/stmicro/stmmac/dwmac5.h      |  4 ++
 drivers/net/ethernet/stmicro/stmmac/hwif.c        |  9 ++++
 drivers/net/ethernet/stmicro/stmmac/hwif.h        |  4 ++
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c |  5 ++
 drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c | 57 +++++++++++++++++++----
 9 files changed, 89 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 49aa56ca09cc..775a1c114b1a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -448,6 +448,8 @@ struct mac_device_info {
 	unsigned int pcs;
 	unsigned int pmt;
 	unsigned int ps;
+	bool mdio_intr_en;
+	wait_queue_head_t mdio_busy_wait;
 };
 
 struct stmmac_rx_routing {
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
index 2ed11a581d80..1be6a8a88b8f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h
@@ -106,6 +106,7 @@ enum dwmac4_irq_status {
 	mmc_irq = 0x00000100,
 	lpi_irq = 0x00000020,
 	pmt_irq = 0x00000010,
+	mdio_irq = 0x00040000,
 };
 
 /* MAC PMT bitmap */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
index fc9954e4a772..97fca6d65141 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
@@ -59,6 +59,9 @@ static void dwmac4_core_init(struct mac_device_info *hw,
 	if (hw->pcs)
 		value |= GMAC_PCS_IRQ_DEFAULT;
 
+	if (hw->mdio_intr_en)
+		value |= GMAC_INT_MDIO_EN;
+
 	writel(value, ioaddr + GMAC_INT_EN);
 }
 
@@ -629,6 +632,9 @@ static int dwmac4_irq_status(struct mac_device_info *hw,
 			x->irq_rx_path_exit_lpi_mode_n++;
 	}
 
+	if (intr_status & mdio_irq)
+		wake_up(&hw->mdio_busy_wait);
+
 	dwmac_pcs_isr(ioaddr, GMAC_PCS_BASE, intr_status, x);
 	if (intr_status & PCS_RGSMIIIS_IRQ)
 		dwmac4_phystatus(ioaddr, x);
@@ -836,6 +842,7 @@ static void dwmac4_set_mac_loopback(void __iomem *ioaddr, bool enable)
 	.rxp_config = dwmac5_rxp_config,
 	.flex_pps_config = dwmac5_flex_pps_config,
 	.set_mac_loopback = dwmac4_set_mac_loopback,
+	.mdio_intr_dis = dwmac5_mdio_intr_dis,
 };
 
 int dwmac4_setup(struct stmmac_priv *priv)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac5.c b/drivers/net/ethernet/stmicro/stmmac/dwmac5.c
index 3f4f3132e16b..c58751e1dcb6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac5.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac5.c
@@ -549,3 +549,11 @@ int dwmac5_flex_pps_config(void __iomem *ioaddr, int index,
 	writel(val, ioaddr + MAC_PPS_CONTROL);
 	return 0;
 }
+
+void dwmac5_mdio_intr_dis(void __iomem *ioaddr)
+{
+	u32 val = readl(ioaddr + GMAC_INT_EN);
+
+	val &= ~GMAC_INT_MDIO_EN;
+	writel(val, ioaddr + GMAC_INT_EN);
+}
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac5.h b/drivers/net/ethernet/stmicro/stmmac/dwmac5.h
index 775db776b3cc..a56511a4c97d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac5.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac5.h
@@ -72,6 +72,9 @@
 #define TCEIE				BIT(0)
 #define DMA_ECC_INT_STATUS		0x00001088
 
+/* MDIO interrupt enable in MAC_Interrupt_Enable register */
+#define GMAC_INT_MDIO_EN		BIT(18)
+
 int dwmac5_safety_feat_config(void __iomem *ioaddr, unsigned int asp);
 int dwmac5_safety_feat_irq_status(struct net_device *ndev,
 		void __iomem *ioaddr, unsigned int asp,
@@ -83,5 +86,6 @@ int dwmac5_rxp_config(void __iomem *ioaddr, struct stmmac_tc_entry *entries,
 int dwmac5_flex_pps_config(void __iomem *ioaddr, int index,
 			   struct stmmac_pps_cfg *cfg, bool enable,
 			   u32 sub_second_inc, u32 systime_flags);
+void dwmac5_mdio_intr_dis(void __iomem *ioaddr);
 
 #endif /* __DWMAC5_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.c b/drivers/net/ethernet/stmicro/stmmac/hwif.c
index 3af2e5015245..7127efe652db 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.c
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.c
@@ -73,6 +73,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 	bool gmac;
 	bool gmac4;
 	bool xgmac;
+	bool mdio_intr_en;
 	u32 min_id;
 	const struct stmmac_regs_off regs;
 	const void *desc;
@@ -90,6 +91,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 		.gmac = false,
 		.gmac4 = false,
 		.xgmac = false,
+		.mdio_intr_en = false,
 		.min_id = 0,
 		.regs = {
 			.ptp_off = PTP_GMAC3_X_OFFSET,
@@ -108,6 +110,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 		.gmac = true,
 		.gmac4 = false,
 		.xgmac = false,
+		.mdio_intr_en = false,
 		.min_id = 0,
 		.regs = {
 			.ptp_off = PTP_GMAC3_X_OFFSET,
@@ -126,6 +129,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 		.gmac = false,
 		.gmac4 = true,
 		.xgmac = false,
+		.mdio_intr_en = false,
 		.min_id = 0,
 		.regs = {
 			.ptp_off = PTP_GMAC4_OFFSET,
@@ -144,6 +148,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 		.gmac = false,
 		.gmac4 = true,
 		.xgmac = false,
+		.mdio_intr_en = false,
 		.min_id = DWMAC_CORE_4_00,
 		.regs = {
 			.ptp_off = PTP_GMAC4_OFFSET,
@@ -162,6 +167,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 		.gmac = false,
 		.gmac4 = true,
 		.xgmac = false,
+		.mdio_intr_en = false,
 		.min_id = DWMAC_CORE_4_10,
 		.regs = {
 			.ptp_off = PTP_GMAC4_OFFSET,
@@ -180,6 +186,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 		.gmac = false,
 		.gmac4 = true,
 		.xgmac = false,
+		.mdio_intr_en = true,
 		.min_id = DWMAC_CORE_5_10,
 		.regs = {
 			.ptp_off = PTP_GMAC4_OFFSET,
@@ -198,6 +205,7 @@ static int stmmac_dwmac4_quirks(struct stmmac_priv *priv)
 		.gmac = false,
 		.gmac4 = false,
 		.xgmac = true,
+		.mdio_intr_en = false,
 		.min_id = DWXGMAC_CORE_2_10,
 		.regs = {
 			.ptp_off = PTP_XGMAC_OFFSET,
@@ -276,6 +284,7 @@ int stmmac_hwif_init(struct stmmac_priv *priv)
 		mac->mode = mac->mode ? : entry->mode;
 		mac->tc = mac->tc ? : entry->tc;
 		mac->mmc = mac->mmc ? : entry->mmc;
+		mac->mdio_intr_en = mac->mdio_intr_en ? : entry->mdio_intr_en;
 
 		priv->hw = mac;
 		priv->ptpaddr = priv->ioaddr + entry->regs.ptp_off;
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 9435b312495d..d42885426e78 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -363,6 +363,8 @@ struct stmmac_ops {
 	int (*get_mac_tx_timestamp)(struct mac_device_info *hw, u64 *ts);
 	/* Source Address Insertion / Replacement */
 	void (*sarc_configure)(void __iomem *ioaddr, int val);
+	/* Disable mdio interrupt */
+	void (*mdio_intr_dis)(void __iomem *ioaddr);
 };
 
 #define stmmac_core_init(__priv, __args...) \
@@ -443,6 +445,8 @@ struct stmmac_ops {
 	stmmac_do_callback(__priv, mac, get_mac_tx_timestamp, __args)
 #define stmmac_sarc_configure(__priv, __args...) \
 	stmmac_do_void_callback(__priv, mac, sarc_configure, __args)
+#define stmmac_mdio_intr_dis(__priv, __args...) \
+	stmmac_do_void_callback(__priv, mac, mdio_intr_dis, __args)
 
 /* PTP and HW Timer helpers */
 struct stmmac_hwtimestamp {
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 06ccd216ae90..2557a2beb03d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2768,6 +2768,8 @@ static int stmmac_release(struct net_device *dev)
 	phylink_stop(priv->phylink);
 	phylink_disconnect_phy(priv->phylink);
 
+	stmmac_mdio_intr_dis(priv, priv->ioaddr);
+
 	stmmac_stop_all_queues(priv);
 
 	stmmac_disable_all_queues(priv);
@@ -4463,6 +4465,9 @@ int stmmac_dvr_probe(struct device *device,
 	if (ret)
 		goto error_hw_init;
 
+	/* mdio intr wait queue */
+	init_waitqueue_head(&priv->hw->mdio_busy_wait);
+
 	stmmac_check_ether_addr(priv);
 
 	/* Configure real RX and TX queues */
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
index 40c42637ad75..625e1fde0c86 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c
@@ -19,6 +19,8 @@
 #include <linux/property.h>
 #include <linux/slab.h>
 
+#include "dwmac4.h"
+#include "dwmac5.h"
 #include "dwxgmac2.h"
 #include "stmmac.h"
 
@@ -142,6 +144,18 @@ static int stmmac_xgmac2_mdio_write(struct mii_bus *bus, int phyaddr,
 				  !(tmp & MII_XGMAC_BUSY), 100, 10000);
 }
 
+static bool stmmac_mdio_intr_done(struct mii_bus *bus)
+{
+	struct net_device *ndev = bus->priv;
+	struct stmmac_priv *priv;
+	unsigned int mii_address;
+
+	priv = netdev_priv(ndev);
+	mii_address = priv->hw->mii.addr;
+
+	return !(readl(priv->ioaddr + mii_address) & MII_BUSY);
+}
+
 /**
  * stmmac_mdio_read
  * @bus: points to the mii_bus structure
@@ -159,9 +173,11 @@ static int stmmac_mdio_read(struct mii_bus *bus, int phyaddr, int phyreg)
 	unsigned int mii_address = priv->hw->mii.addr;
 	unsigned int mii_data = priv->hw->mii.data;
 	u32 value = MII_BUSY;
+	u32 mdio_intr_en = 0;
 	int data = 0;
 	u32 v;
 
+	mdio_intr_en = readl(priv->ioaddr + GMAC_INT_EN) & GMAC_INT_MDIO_EN;
 	value |= (phyaddr << priv->hw->mii.addr_shift)
 		& priv->hw->mii.addr_mask;
 	value |= (phyreg << priv->hw->mii.reg_shift) & priv->hw->mii.reg_mask;
@@ -181,16 +197,26 @@ static int stmmac_mdio_read(struct mii_bus *bus, int phyaddr, int phyreg)
 		}
 	}
 
-	if (readl_poll_timeout(priv->ioaddr + mii_address, v, !(v & MII_BUSY),
-			       100, 10000))
+	if (mdio_intr_en) {
+		if (!wait_event_timeout(priv->hw->mdio_busy_wait,
+					stmmac_mdio_intr_done(bus), HZ / 100))
+			return -EBUSY;
+	} else if (readl_poll_timeout(priv->ioaddr + mii_address, v,
+				      !(v & MII_BUSY), 100, 10000)) {
 		return -EBUSY;
+	}
 
 	writel(data, priv->ioaddr + mii_data);
 	writel(value, priv->ioaddr + mii_address);
 
-	if (readl_poll_timeout(priv->ioaddr + mii_address, v, !(v & MII_BUSY),
-			       100, 10000))
+	if (mdio_intr_en) {
+		if (!wait_event_timeout(priv->hw->mdio_busy_wait,
+					stmmac_mdio_intr_done(bus), HZ / 100))
+			return -EBUSY;
+	} else if (readl_poll_timeout(priv->ioaddr + mii_address, v,
+				      !(v & MII_BUSY), 100, 10000)) {
 		return -EBUSY;
+	}
 
 	/* Read the data from the MII data register */
 	data = (int)readl(priv->ioaddr + mii_data) & MII_DATA_MASK;
@@ -214,9 +240,11 @@ static int stmmac_mdio_write(struct mii_bus *bus, int phyaddr, int phyreg,
 	unsigned int mii_address = priv->hw->mii.addr;
 	unsigned int mii_data = priv->hw->mii.data;
 	u32 value = MII_BUSY;
+	u32 mdio_intr_en = 0;
 	int data = phydata;
 	u32 v;
 
+	mdio_intr_en = readl(priv->ioaddr + GMAC_INT_EN) & GMAC_INT_MDIO_EN;
 	value |= (phyaddr << priv->hw->mii.addr_shift)
 		& priv->hw->mii.addr_mask;
 	value |= (phyreg << priv->hw->mii.reg_shift) & priv->hw->mii.reg_mask;
@@ -240,17 +268,30 @@ static int stmmac_mdio_write(struct mii_bus *bus, int phyaddr, int phyreg,
 	}
 
 	/* Wait until any existing MII operation is complete */
-	if (readl_poll_timeout(priv->ioaddr + mii_address, v, !(v & MII_BUSY),
-			       100, 10000))
+	if (mdio_intr_en) {
+		if (!wait_event_timeout(priv->hw->mdio_busy_wait,
+					stmmac_mdio_intr_done(bus), HZ / 100))
+			return -EBUSY;
+	} else if (readl_poll_timeout(priv->ioaddr + mii_address, v,
+				      !(v & MII_BUSY), 100, 10000)) {
 		return -EBUSY;
+	}
 
 	/* Set the MII address register to write */
 	writel(data, priv->ioaddr + mii_data);
 	writel(value, priv->ioaddr + mii_address);
 
 	/* Wait until any existing MII operation is complete */
-	return readl_poll_timeout(priv->ioaddr + mii_address, v, !(v & MII_BUSY),
-				  100, 10000);
+	if (mdio_intr_en) {
+		if (!wait_event_timeout(priv->hw->mdio_busy_wait,
+					stmmac_mdio_intr_done(bus), HZ / 100))
+			return -EBUSY;
+	} else if (readl_poll_timeout(priv->ioaddr + mii_address, v,
+				      !(v & MII_BUSY), 100, 10000)) {
+		return -EBUSY;
+	}
+
+	return 0;
 }
 
 /**
-- 
1.9.1


^ permalink raw reply related

* RE: [PATCH v2 net-next] net: stmmac: Add support for MDIO interrupts
From: Voon, Weifeng @ 2019-09-05 11:58 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David S. Miller, Maxime Coquelin, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Jose Abreu, Giuseppe Cavallaro,
	Alexandre Torgue, Ong, Boon Leong
In-Reply-To: <20190905064002.GB415@lunn.ch>

> > The change log is near the end of the patch:
> > /**
> > --
> > Changelog v2
> > *mdio interrupt mode or polling mode will depends on mdio interrupt
> > enable bit *Disable the mdio interrupt enable bit in stmmac_release
> > *Remove the condition for initialize wait queues *Applied reverse
> > Christmas tree
> > 1.9.1
> 
> At the end, nobody sees it, because everybody else does it at the
> beginning.
> 
> https://www.kernel.org/doc/html/latest/process/submitting-
> patches.html?highlight=submitting#the-canonical-patch-format
> 
> This talks about the ---. David prefers to see the change log before the
> ---. Other maintainers want it after the ---.
> 
> >
> > >
> > > The formatting of this patch also looks a bit odd. Did you use git
> > > format-patch ; git send-email?
> >
> > Yes, I do git format-patch, then ./scripts/checkpatch.pl.
> > Lastly git send-email
> 
> What looked odd is the missing --- marker. git format-patch should of
> create that as part of the patch.
> 
>        Andrew

I found out why as I did a git format-patch with -p which is without the
Stat. I will resent v3 to using correct format.

Weifeng

^ permalink raw reply

* Re: [RESEND PATCH 0/5] Add bluetooth support for Orange Pi 3
From: Maxime Ripard @ 2019-09-05 11:56 UTC (permalink / raw)
  To: Marcel Holtmann
  Cc: megous, Chen-Yu Tsai, Rob Herring, Johan Hedberg, Mark Rutland,
	David S. Miller, netdev, devicetree, linux-kernel,
	linux-arm-kernel, linux-bluetooth
In-Reply-To: <76FD40C7-10C5-4818-8EF9-60326ECA4243@holtmann.org>

[-- Attachment #1: Type: text/plain, Size: 2405 bytes --]

On Wed, Sep 04, 2019 at 04:19:37PM +0200, Marcel Holtmann wrote:
> Hi Maxime,
>
> >>>>> (Resend to add missing lists, sorry for the noise.)
> >>>>>
> >>>>> This series implements bluetooth support for Xunlong Orange Pi 3 board.
> >>>>>
> >>>>> The board uses AP6256 WiFi/BT 5.0 chip.
> >>>>>
> >>>>> Summary of changes:
> >>>>>
> >>>>> - add more delay to let initialize the chip
> >>>>> - let the kernel detect firmware file path
> >>>>> - add new compatible and update dt-bindings
> >>>>> - update Orange Pi 3 / H6 DTS
> >>>>>
> >>>>> Please take a look.
> >>>>>
> >>>>> thank you and regards,
> >>>>> Ondrej Jirman
> >>>>>
> >>>>> Ondrej Jirman (5):
> >>>>> dt-bindings: net: Add compatible for BCM4345C5 bluetooth device
> >>>>> bluetooth: bcm: Add support for loading firmware for BCM4345C5
> >>>>> bluetooth: hci_bcm: Give more time to come out of reset
> >>>>> arm64: dts: allwinner: h6: Add pin configs for uart1
> >>>>> arm64: dts: allwinner: orange-pi-3: Enable UART1 / Bluetooth
> >>>>>
> >>>>> .../bindings/net/broadcom-bluetooth.txt       |  1 +
> >>>>> .../dts/allwinner/sun50i-h6-orangepi-3.dts    | 19 +++++++++++++++++++
> >>>>> arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 10 ++++++++++
> >>>>> drivers/bluetooth/btbcm.c                     |  3 +++
> >>>>> drivers/bluetooth/hci_bcm.c                   |  3 ++-
> >>>>> 5 files changed, 35 insertions(+), 1 deletion(-)
> >>>>
> >>>> all 5 patches have been applied to bluetooth-next tree.
> >>>
> >>> The DTS patches (last 2) should go through the arm-soc tree, can you
> >>> drop them?
> >>
> >> why is that? We have included DTS changes for Bluetooth devices
> >> directly all the time. What is different with this hardware?
> >
> > I guess some maintainers are more relaxed with it than we are then,
> > but for the why, well, it's the usual reasons, the most immediate one
> > being that it reduces to a minimum the conflicts between trees.
> >
> > The other being that it's not really usual to merge patches supposed
> > to be handled by another maintainer without (at least) his
> > consent. I'm pretty sure you would have asked the same request if I
> > would have merged the bluetooth patches through my tree without
> > notice.
>
> I took the two DTS patches out now and let the submitter deal with
> getting these merged.

Thanks!
Maxime

--
Maxime Ripard, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply

* Re: WARNING in hso_free_net_device
From: syzbot @ 2019-09-05 11:47 UTC (permalink / raw)
  To: alexios.zavras, andreyknvl, benquike, davem, gregkh, linux-kernel,
	linux-usb, mathias.payer, netdev, oneukum, rfontana, stephen,
	syzkaller-bugs, tglx
In-Reply-To: <CAAeHK+y3eQ7bXvo1tiAkwLCsFkbSU5B+6hsKbdEzkSXP2_Jyzg@mail.gmail.com>

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger  
crash:

Reported-and-tested-by:  
syzbot+44d53c7255bb1aea22d2@syzkaller.appspotmail.com

Tested on:

commit:         eea39f24 usb-fuzzer: main usb gadget fuzzer driver
git tree:       https://github.com/google/kasan.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=d0c62209eedfd54e
dashboard link: https://syzkaller.appspot.com/bug?extid=44d53c7255bb1aea22d2
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1188fcc6600000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply

* Re: [PATCH iproute2-next] bpf: fix snprintf truncation warning
From: Andrea Claudi @ 2019-09-05 11:44 UTC (permalink / raw)
  To: David Ahern; +Cc: linux-netdev, Stephen Hemminger, David Ahern
In-Reply-To: <83242eb4-6304-0fcf-2d2a-6ef4de464e81@gmail.com>

On Thu, Sep 5, 2019 at 12:15 AM David Ahern <dsahern@gmail.com> wrote:
>
> On 9/4/19 9:50 AM, Andrea Claudi wrote:
> > gcc v9.2.1 produces the following warning compiling iproute2:
> >
> > bpf.c: In function ‘bpf_get_work_dir’:
> > bpf.c:784:49: warning: ‘snprintf’ output may be truncated before the last format character [-Wformat-truncation=]
> >   784 |  snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir), "%s/", mnt);
> >       |                                                 ^
> > bpf.c:784:2: note: ‘snprintf’ output between 2 and 4097 bytes into a destination of size 4096
> >   784 |  snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir), "%s/", mnt);
> >       |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > Fix it extending bpf_wrk_dir size by 1 byte for the extra "/" char.
> >
> > Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
> > ---
> >  lib/bpf.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/bpf.c b/lib/bpf.c
> > index 7d2a322ffbaec..95de7894a93ce 100644
> > --- a/lib/bpf.c
> > +++ b/lib/bpf.c
> > @@ -742,7 +742,7 @@ static int bpf_gen_hierarchy(const char *base)
> >  static const char *bpf_get_work_dir(enum bpf_prog_type type)
> >  {
> >       static char bpf_tmp[PATH_MAX] = BPF_DIR_MNT;
> > -     static char bpf_wrk_dir[PATH_MAX];
> > +     static char bpf_wrk_dir[PATH_MAX + 1];
> >       static const char *mnt;
> >       static bool bpf_mnt_cached;
> >       const char *mnt_env = getenv(BPF_ENV_MNT);
> >
>
> PATH_MAX is meant to be the max length for a filesystem path including
> the null terminator, so I think it would be better to change the
> snprintf to 'sizeof(bpf_wrk_dir) - 1'.

With 'sizeof(bpf_wrk_dir) - 1' snprintf simply truncates at byte 4095
instead of byte 4096.
This means that bpf_wrk_dir can again be truncated before the final
"/", as it is by now.
Am I missing something?

Trying your suggestion I have this slightly different warning message:

bpf.c: In function ‘bpf_get_work_dir’:
bpf.c:784:52: warning: ‘/’ directive output may be truncated writing 1
byte into a region of size between 0 and 4095 [-Wformat-truncation=]
  784 |  snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir) - 1, "%s/", mnt);
      |                                                    ^
bpf.c:784:2: note: ‘snprintf’ output between 2 and 4097 bytes into a
destination of size 4095
  784 |  snprintf(bpf_wrk_dir, sizeof(bpf_wrk_dir) - 1, "%s/", mnt);
      |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

^ permalink raw reply

* [PATCH net-next] net: phy: Do not check Link status when loopback is enabled
From: Jose Abreu @ 2019-09-05 11:43 UTC (permalink / raw)
  To: netdev
  Cc: Joao Pinto, Jose Abreu, Andrew Lunn, Florian Fainelli,
	Heiner Kallweit, David S. Miller, linux-kernel

While running stmmac selftests I found that in my 1G setup some tests
were failling when running with PHY loopback enabled.

It looks like when loopback is enabled the PHY will report that Link is
down even though there is a valid connection.

As in loopback mode the data will not be sent anywhere we can bypass the
logic of checking if Link is valid thus saving unecessary reads.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>

---
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/net/phy/phy.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 35d29a823af8..7c92afd36bbe 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -525,6 +525,12 @@ static int phy_check_link_status(struct phy_device *phydev)
 
 	WARN_ON(!mutex_is_locked(&phydev->lock));
 
+	/* Keep previous state if loopback is enabled because some PHYs
+	 * report that Link is Down when loopback is enabled.
+	 */
+	if (phydev->loopback_enabled)
+		return 0;
+
 	err = phy_read_status(phydev);
 	if (err)
 		return err;
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH] net/skbuff: silence warnings under memory pressure
From: Sergey Senozhatsky @ 2019-09-05 11:32 UTC (permalink / raw)
  To: Qian Cai, Steven Rostedt, Petr Mladek
  Cc: Sergey Senozhatsky, Sergey Senozhatsky, Michal Hocko,
	Eric Dumazet, davem, netdev, linux-mm, linux-kernel
In-Reply-To: <1567629737.5576.87.camel@lca.pw>

On (09/04/19 16:42), Qian Cai wrote:
> > Let me think more.
> 
> To summary, those look to me are all good long-term improvement that would
> reduce the likelihood of this kind of livelock in general especially for other
> unknown allocations that happen while processing softirqs, but it is still up to
> the air if it fixes it 100% in all situations as printk() is going to take more
> time

Well. So. I guess that we don't need irq_work most of the time.

We need to queue irq_work for "safe" wake_up_interruptible(), when we
know that we can deadlock in scheduler. IOW, only when we are invoked
from the scheduler. Scheduler has printk_deferred(), which tells printk()
that it cannot do wake_up_interruptible(). Otherwise we can just use
normal wake_up_process() and don't need that irq_work->wake_up_interruptible()
indirection. The parts of the scheduler, which by mistake call plain printk()
from under pi_lock or rq_lock have chances to deadlock anyway and should
be switched to printk_deferred().

I think we can queue significantly much less irq_work-s from printk().

Petr, Steven, what do you think?

Something like this. Call wake_up_interruptible(), switch to
wake_up_klogd() only when called from sched code.

---
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index cd51aa7d08a9..89cb47882254 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2027,8 +2027,11 @@ asmlinkage int vprintk_emit(int facility, int level,
 	pending_output = (curr_log_seq != log_next_seq);
 	logbuf_unlock_irqrestore(flags);
 
+	if (!pending_output)
+		return printed_len;
+
 	/* If called from the scheduler, we can not call up(). */
-	if (!in_sched && pending_output) {
+	if (!in_sched) {
 		/*
 		 * Disable preemption to avoid being preempted while holding
 		 * console_sem which would prevent anyone from printing to
@@ -2043,10 +2046,11 @@ asmlinkage int vprintk_emit(int facility, int level,
 		if (console_trylock_spinning())
 			console_unlock();
 		preempt_enable();
-	}
 
-	if (pending_output)
+		wake_up_interruptible(&log_wait);
+	} else {
 		wake_up_klogd();
+	}
 	return printed_len;
 }
 EXPORT_SYMBOL(vprintk_emit);
---

> and could deal with console hardware that involve irq_exit() anyway.

printk->console_driver->write() does not involve irq.

> On the other hand, adding __GPF_NOWARN in the build_skb() allocation will fix
> this known NET_TX_SOFTIRQ case which is common when softirqd involved at least
> in short-term. It even have a benefit to reduce the overall warn_alloc() noise
> out there.

That's not up to me to decide.

	-ss

^ permalink raw reply related

* Re: WARNING in hso_free_net_device
From: Andrey Konovalov @ 2019-09-05 11:24 UTC (permalink / raw)
  To: Hui Peng
  Cc: Stephen Hemminger, syzbot+44d53c7255bb1aea22d2, alexios.zavras,
	David S. Miller, Greg Kroah-Hartman, LKML, USB list,
	Mathias Payer, netdev, rfontana, syzkaller-bugs, Thomas Gleixner,
	Oliver Neukum
In-Reply-To: <285edb24-01f9-3f9d-4946-b2f41ccd0774@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1328 bytes --]

On Thu, Sep 5, 2019 at 4:20 AM Hui Peng <benquike@gmail.com> wrote:
>
> Can you guys have  a look at the attached patch?

Let's try it:

#syz test: https://github.com/google/kasan.git eea39f24

FYI: there are two more reports coming from this driver, which might
(or might not) have the same root cause. One of them has a suggested
fix by Oliver.

https://syzkaller.appspot.com/bug?extid=67b2bd0e34f952d0321e
https://syzkaller.appspot.com/bug?extid=93f2f45b19519b289613

>
> On 9/4/19 6:41 PM, Stephen Hemminger wrote:
> > On Wed, 4 Sep 2019 16:27:50 -0400
> > Hui Peng <benquike@gmail.com> wrote:
> >
> >> Hi, all:
> >>
> >> I looked at the bug a little.
> >>
> >> The issue is that in the error handling code, hso_free_net_device
> >> unregisters
> >>
> >> the net_device (hso_net->net)  by calling unregister_netdev. In the
> >> error handling code path,
> >>
> >> hso_net->net has not been registered yet.
> >>
> >> I think there are two ways to solve the issue:
> >>
> >> 1. fix it in drivers/net/usb/hso.c to avoiding unregistering the
> >> net_device when it is still not registered
> >>
> >> 2. fix it in unregister_netdev. We can add a field in net_device to
> >> record whether it is registered, and make unregister_netdev return if
> >> the net_device is not registered yet.
> >>
> >> What do you guys think ?
> > #1

[-- Attachment #2: 0001-Fix-a-wrong-unregistering-bug-in-hso_free_net_device.patch --]
[-- Type: text/x-patch, Size: 2399 bytes --]

From f3fdee8fc03aa2bc982f22da1d29bbf6bca72935 Mon Sep 17 00:00:00 2001
From: Hui Peng <benquike@gmail.com>
Date: Wed, 4 Sep 2019 21:38:35 -0400
Subject: [PATCH] Fix a wrong unregistering bug in hso_free_net_device

As shown below, hso_create_net_device may call hso_free_net_device
before the net_device is registered. hso_free_net_device will
unregister the network device no matter it is registered or not,
unregister_netdev is not able to handle unregistered net_device,
resulting in the bug reported by the syzbot.

```
static struct hso_device *hso_create_net_device(struct usb_interface *interface,
					       int port_spec)
{
	......
	net = alloc_netdev(sizeof(struct hso_net), "hso%d", NET_NAME_UNKNOWN,
      			    hso_net_init);
	......
	if (!hso_net->out_endp) {
   	   	dev_err(&interface->dev, "Can't find BULK OUT endpoint\n");
		goto exit;
	}

	......
	result = register_netdev(net);
	......
exit:
	hso_free_net_device(hso_dev);
	return NULL;
}

static void hso_free_net_device(struct hso_device *hso_dev)
{
	......
	if (hso_net->net)
		unregister_netdev(hso_net->net);
	......
}

```

This patch adds a net_registered field in struct hso_net to record whether
the containing net_device is registered or not, and avoid unregistering it
if it is not registered yet.

Reported-by: syzbot+44d53c7255bb1aea22d2@syzkaller.appspotmail.com
Signed-off-by: Hui Peng <benquike@gmail.com>
---
 drivers/net/usb/hso.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
index ce78714..5b3df33 100644
--- a/drivers/net/usb/hso.c
+++ b/drivers/net/usb/hso.c
@@ -128,6 +128,7 @@ struct hso_shared_int {
 struct hso_net {
 	struct hso_device *parent;
 	struct net_device *net;
+	bool net_registered;
 	struct rfkill *rfkill;
 	char name[24];
 
@@ -2362,7 +2363,7 @@ static void hso_free_net_device(struct hso_device *hso_dev)
 
 	remove_net_device(hso_net->parent);
 
-	if (hso_net->net)
+	if (hso_net->net && hso_net->net_registered)
 		unregister_netdev(hso_net->net);
 
 	/* start freeing */
@@ -2544,6 +2545,7 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface,
 		dev_err(&interface->dev, "Failed to register device\n");
 		goto exit;
 	}
+	hso_net->net_registered = true;
 
 	hso_log_port(hso_dev);
 
-- 
2.7.4


^ permalink raw reply related

* Re: [Bridge] [PATCH v3 1/2] net: bridge: use mac_len in bridge forwarding
From: Toshiaki Makita @ 2019-09-05 11:20 UTC (permalink / raw)
  To: Zahari Doychev
  Cc: netdev, makita.toshiaki, jiri, nikolay, simon.horman, roopa,
	bridge, jhs, dsahern, xiyou.wangcong, johannes,
	alexei.starovoitov
In-Reply-To: <20190904143227.5jpn2gnu3fed55wg@tycho>

On 2019/09/04 23:32, Zahari Doychev wrote:
> On Wed, Sep 04, 2019 at 04:14:28PM +0900, Toshiaki Makita wrote:
>> On 2019/09/03 22:36, Zahari Doychev wrote:
>>> On Tue, Sep 03, 2019 at 08:37:36PM +0900, Toshiaki Makita wrote:
>>>> Hi Zahari,
>>>>
>>>> Sorry for reviewing this late.
>>>>
>>>> On 2019/09/03 3:09, Zahari Doychev wrote:
>>>> ...
>>>>> @@ -466,13 +466,14 @@ static bool __allowed_ingress(const struct net_bridge *br,
>>>>>     		/* Tagged frame */
>>>>>     		if (skb->vlan_proto != br->vlan_proto) {
>>>>>     			/* Protocol-mismatch, empty out vlan_tci for new tag */
>>>>> -			skb_push(skb, ETH_HLEN);
>>>>> +			skb_push(skb, skb->mac_len);
>>>>>     			skb = vlan_insert_tag_set_proto(skb, skb->vlan_proto,
>>>>>     							skb_vlan_tag_get(skb));
>>>>
>>>> I think we should insert vlan at skb->data, i.e. mac_header + mac_len, while this
>>>> function inserts the tag at mac_header + ETH_HLEN which is not always the correct
>>>> offset.
>>>
>>> Maybe I am misunderstanding the concern here but this should make sure that
>>> the VLAN tag from the skb is move back in the payload as the outer most tag.
>>> So it should follow the ethernet header. It looks like this e.g.,:
>>>
>>> VLAN1 in skb:
>>> +------+------+-------+
>>> | DMAC | SMAC | ETYPE |
>>> +------+------+-------+
>>>
>>> VLAN1 moved to payload:
>>> +------+------+-------+-------+
>>> | DMAC | SMAC | VLAN1 | ETYPE |
>>> +------+------+-------+-------+
>>>
>>> VLAN2 in skb:
>>> +------+------+-------+-------+
>>> | DMAC | SMAC | VLAN1 | ETYPE |
>>> +------+------+-------+-------+
>>>
>>> VLAN2 moved to payload:
>>>
>>> +------+------+-------+-------+
>>> | DMAC | SMAC | VLAN2 | VLAN1 | ....
>>> +------+------+-------+-------+
>>>
>>> Doing the skb push with mac_len makes sure that VLAN tag is inserted in the
>>> correct offset. For mac_len == ETH_HLEN this does not change the current
>>> behaviour.
>>
>> Reordering VLAN headers here does not look correct to me. If skb->data points to ETH+VLAN,
>> then we should insert the vlan at the offset.
>> Vlan devices with reorder_hdr disabled produce packets whose mac_len includes ETH+VLAN header,
>> and they expects vlan insertion after the outer vlan header.
> 
> I see so in this case we should handle differently as it seems sometimes
> we have to insert after or before the tag in the packet. I am not quite sure
> if this is possible to be detected here. I was trying to do bridging with VLAN
> devices with reorder_hdr disabled working but somehow I was not able to get
> mac_len longer then ETH_HLEN in all cases that I tried. Can you provide some
> example how can I try this out? It will really help me to understand the
> problem better.

I'm not sure if there is a case where we should insert tags before data pointer.
Your case does not look valid to me because skb is already broken in TC (I think I
explained this in the previous discussion). Bridge should not workaround the broken skb.

>> Also I'm not sure there is standard ethernet header in mac_len, as mac_len is not ETH_HLEN.
>> E.g. tun devices can produce vlan packets without ehternet header.
> 
> How is the bridge forwarding decision done in this case when there are no
> MAC addresses, vlan based only?

Tun is just an example for header shorter than we expect. It's more like an attack vector.
So maybe it's sufficient to make sure we don't crash or write data to unexpected offset
for such packets. Or if such packets cannot make it to this point, that's ok.

> 
>>
>>>
>>>>
>>>>>     			if (unlikely(!skb))
>>>>>     				return false;
>>>>>     			skb_pull(skb, ETH_HLEN);
>>>>
>>>> Now skb->data is mac_header + ETH_HLEN which would be broken when mac_len is not
>>>> ETH_HLEN?
>>>
>>> I thought it would be better to point in this case to the outer tag as otherwise
>>> if mac_len is used the skb->data will point to the next tag which I find somehow
>>> inconsistent or do you see some case where this can cause problems?
>>
>> Vlan devices with reorder_hdr off will break because it relies on skb->data offset
>> as I described in the previous discussion.
> 
> I also see in vlan_do_receive that the VLAN tag is moved to the payload when
> reorder_hdr is off and the vlan_dev is not a bridge port. So it seems that
> I am misunderstanding the reorder_hdr option so if you can give me some more
> details about how it is supposed to be used will be highly appreciated.

No, you don't misunderstand it. I just forgot the condition was added.

Now reorder_hdr does not look like a problem, I lost the reason to handle
mac_len != ETH_HLEN case, as I'm thinking this change should not be a workaround for your problem.
If we fix the broken data pointer in TC, there should not be problems with mac_len in bridge.
Do you have any other possible cases this works for?

Toshiaki Makita

^ permalink raw reply

* Re: [PATCH net-next 4/7] net: hns3: add client node validity judgment
From: tanhuazhong @ 2019-09-05 11:01 UTC (permalink / raw)
  To: Sergei Shtylyov, davem
  Cc: netdev, linux-kernel, salil.mehta, yisen.zhuang, linuxarm,
	jakub.kicinski, Peng Li
In-Reply-To: <b0aa6da6-cd42-dd31-8ff7-ca3f48de58ff@cogentembedded.com>



On 2019/9/5 18:12, Sergei Shtylyov wrote:
> On 04.09.2019 17:06, Huazhong Tan wrote:
> 
>> From: Peng Li <lipeng321@huawei.com>
>>
>> HNS3 driver can only unregister client which included in 
>> hnae3_client_list.
>> This patch adds the client node validity judgment.
>>
>> Signed-off-by: Peng Li <lipeng321@huawei.com>
>> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
>> ---
>>   drivers/net/ethernet/hisilicon/hns3/hnae3.c | 16 ++++++++++++++++
>>   1 file changed, 16 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.c 
>> b/drivers/net/ethernet/hisilicon/hns3/hnae3.c
>> index 528f624..6aa5257 100644
>> --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.c
>> +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.c
>> @@ -138,12 +138,28 @@ EXPORT_SYMBOL(hnae3_register_client);
>>   void hnae3_unregister_client(struct hnae3_client *client)
>>   {
>> +    struct hnae3_client *client_tmp;
>>       struct hnae3_ae_dev *ae_dev;
>> +    bool existed = false;
>>       if (!client)
>>           return;
>>       mutex_lock(&hnae3_common_lock);
>> +
>> +    list_for_each_entry(client_tmp, &hnae3_client_list, node) {
>> +        if (client_tmp->type == client->type) {
>> +            existed = true;
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (!existed) {
>> +        mutex_unlock(&hnae3_common_lock);
>> +        pr_err("client %s not existed!\n", client->name);
> 
>     Did not exist, you mean?
> 

yes

> [...]
> 
> MBR, Sergei
> 
> .
> 


^ permalink raw reply

* Re: [PATCH net-next v2] r8152: adjust the settings of ups flags
From: David Miller @ 2019-09-05 10:41 UTC (permalink / raw)
  To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel
In-Reply-To: <1394712342-15778-328-Taiwan-albertk@realtek.com>

From: Hayes Wang <hayeswang@realtek.com>
Date: Thu, 5 Sep 2019 10:46:20 +0800

> The UPS feature only works for runtime suspend, so UPS flags only
> need to be set before enabling runtime suspend. Therefore, I create
> a struct to record relative information, and use it before runtime
> suspend.
> 
> All chips could record such information, even though not all of
> them support the feature of UPS. Then, some functions could be
> combined.
> 
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>

Applied.

^ permalink raw reply

* Re: [PATCH v2 2/2] PTP: add support for one-shot output
From: Felipe Balbi @ 2019-09-05 10:40 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Christopher S Hall, netdev, linux-kernel, davem
In-Reply-To: <20190831150149.GB1692@localhost>

[-- Attachment #1: Type: text/plain, Size: 8864 bytes --]


Hi,

Richard Cochran <richardcochran@gmail.com> writes:
> On Fri, Aug 30, 2019 at 11:00:20AM +0300, Felipe Balbi wrote:
>> seems like this should be defined together with the other flags? If
>> that's the case, it seems like we would EXTTS and PEROUT masks.
>
> Yes, let's make the meanings of the bit fields clear...
>
> --- ptp_clock.h ---
>
> /*
>  * Bits of the ptp_extts_request.flags field:
>  */
> #define PTP_ENABLE_FEATURE	BIT(0)
> #define PTP_RISING_EDGE		BIT(1)
> #define PTP_FALLING_EDGE	BIT(2)
> #define PTP_EXTTS_VALID_FLAGS	(PTP_ENABLE_FEATURE | \
> 				 PTP_RISING_EDGE | \
> 				 PTP_FALLING_EDGE)
>
> /*
>  * Bits of the ptp_perout_request.flags field:
>  */
> #define PTP_PEROUT_ONE_SHOT	BIT(0)
> #define PTP_PEROUT_VALID_FLAGS	(PTP_PEROUT_ONE_SHOT)
>
> struct ptp_extts_request {
> 	unsigned int flags;  /* Bit field of PTP_EXTTS_VALID_FLAGS. */
> };
>
> struct ptp_perout_request {
> 	unsigned int flags;  /* Bit field of PTP_PEROUT_VALID_FLAGS. */
> };

Below you can find the combined patch. Locally, I have it split into two
patches, but this lets us look at the full picture. I'll send it as v3
series of two patches on Monday if you like the result. Let me know if
you prefer that I convert the flags to BIT() macro calls instead.

8<------------------------------------------------------------------------

From 633a8214c86a43dcf880d7aed33758b576933369 Mon Sep 17 00:00:00 2001
From: Felipe Balbi <felipe.balbi@linux.intel.com>
Date: Wed, 14 Aug 2019 10:31:08 +0300
Subject: [PATCH 1/5] PTP: introduce new versions of IOCTLs

The current version of the IOCTL have a small problem which prevents us
from extending the API by making use of reserved fields. In these new
IOCTLs, we are now making sure that flags and rsv fields are zero which
will allow us to extend the API in the future.

Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
---
 drivers/ptp/ptp_chardev.c      | 63 ++++++++++++++++++++++++++++++++++
 include/uapi/linux/ptp_clock.h | 26 ++++++++++++--
 2 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index 18ffe449efdf..9c18476d8d10 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -126,7 +126,9 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 	switch (cmd) {
 
 	case PTP_CLOCK_GETCAPS:
+	case PTP_CLOCK_GETCAPS2:
 		memset(&caps, 0, sizeof(caps));
+
 		caps.max_adj = ptp->info->max_adj;
 		caps.n_alarm = ptp->info->n_alarm;
 		caps.n_ext_ts = ptp->info->n_ext_ts;
@@ -139,11 +141,24 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_EXTTS_REQUEST:
+	case PTP_EXTTS_REQUEST2:
+		memset(&req, 0, sizeof(req));
+
 		if (copy_from_user(&req.extts, (void __user *)arg,
 				   sizeof(req.extts))) {
 			err = -EFAULT;
 			break;
 		}
+		if (((req.extts.flags & ~PTP_EXTTS_VALID_FLAGS) ||
+			req.extts.rsv[0] || req.extts.rsv[1]) &&
+			cmd == PTP_EXTTS_REQUEST2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_EXTTS_REQUEST) {
+			req.extts.flags &= ~PTP_EXTTS_VALID_FLAGS;
+			req.extts.rsv[0] = 0;
+			req.extts.rsv[1] = 0;
+		}
 		if (req.extts.index >= ops->n_ext_ts) {
 			err = -EINVAL;
 			break;
@@ -154,11 +169,27 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_PEROUT_REQUEST:
+	case PTP_PEROUT_REQUEST2:
+		memset(&req, 0, sizeof(req));
+
 		if (copy_from_user(&req.perout, (void __user *)arg,
 				   sizeof(req.perout))) {
 			err = -EFAULT;
 			break;
 		}
+		if (((req.perout.flags & ~PTP_PEROUT_VALID_FLAGS) ||
+			req.perout.rsv[0] || req.perout.rsv[1] ||
+			req.perout.rsv[2] || req.perout.rsv[3]) &&
+			cmd == PTP_PEROUT_REQUEST2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_PEROUT_REQUEST) {
+			req.perout.flags &= ~PTP_PEROUT_VALID_FLAGS;
+			req.perout.rsv[0] = 0;
+			req.perout.rsv[1] = 0;
+			req.perout.rsv[2] = 0;
+			req.perout.rsv[3] = 0;
+		}
 		if (req.perout.index >= ops->n_per_out) {
 			err = -EINVAL;
 			break;
@@ -169,6 +200,9 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_ENABLE_PPS:
+	case PTP_ENABLE_PPS2:
+		memset(&req, 0, sizeof(req));
+
 		if (!capable(CAP_SYS_TIME))
 			return -EPERM;
 		req.type = PTP_CLK_REQ_PPS;
@@ -177,6 +211,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_SYS_OFFSET_PRECISE:
+	case PTP_SYS_OFFSET_PRECISE2:
 		if (!ptp->info->getcrosststamp) {
 			err = -EOPNOTSUPP;
 			break;
@@ -201,6 +236,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_SYS_OFFSET_EXTENDED:
+	case PTP_SYS_OFFSET_EXTENDED2:
 		if (!ptp->info->gettimex64) {
 			err = -EOPNOTSUPP;
 			break;
@@ -232,6 +268,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_SYS_OFFSET:
+	case PTP_SYS_OFFSET2:
 		sysoff = memdup_user((void __user *)arg, sizeof(*sysoff));
 		if (IS_ERR(sysoff)) {
 			err = PTR_ERR(sysoff);
@@ -266,10 +303,23 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_PIN_GETFUNC:
+	case PTP_PIN_GETFUNC2:
 		if (copy_from_user(&pd, (void __user *)arg, sizeof(pd))) {
 			err = -EFAULT;
 			break;
 		}
+		if ((pd.rsv[0] || pd.rsv[1] || pd.rsv[2]
+				|| pd.rsv[3] || pd.rsv[4])
+			&& cmd == PTP_PIN_GETFUNC2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_PIN_GETFUNC) {
+			pd.rsv[0] = 0;
+			pd.rsv[1] = 0;
+			pd.rsv[2] = 0;
+			pd.rsv[3] = 0;
+			pd.rsv[4] = 0;
+		}
 		pin_index = pd.index;
 		if (pin_index >= ops->n_pins) {
 			err = -EINVAL;
@@ -285,10 +335,23 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		break;
 
 	case PTP_PIN_SETFUNC:
+	case PTP_PIN_SETFUNC2:
 		if (copy_from_user(&pd, (void __user *)arg, sizeof(pd))) {
 			err = -EFAULT;
 			break;
 		}
+		if ((pd.rsv[0] || pd.rsv[1] || pd.rsv[2]
+				|| pd.rsv[3] || pd.rsv[4])
+			&& cmd == PTP_PIN_SETFUNC2) {
+			err = -EINVAL;
+			break;
+		} else if (cmd == PTP_PIN_SETFUNC) {
+			pd.rsv[0] = 0;
+			pd.rsv[1] = 0;
+			pd.rsv[2] = 0;
+			pd.rsv[3] = 0;
+			pd.rsv[4] = 0;
+		}
 		pin_index = pd.index;
 		if (pin_index >= ops->n_pins) {
 			err = -EINVAL;
diff --git a/include/uapi/linux/ptp_clock.h b/include/uapi/linux/ptp_clock.h
index 1bc794ad957a..cbdc0d97b471 100644
--- a/include/uapi/linux/ptp_clock.h
+++ b/include/uapi/linux/ptp_clock.h
@@ -25,11 +25,21 @@
 #include <linux/ioctl.h>
 #include <linux/types.h>
 
-/* PTP_xxx bits, for the flags field within the request structures. */
+/*
+ * Bits of the ptp_extts_request.flags field:
+ */
 #define PTP_ENABLE_FEATURE (1<<0)
 #define PTP_RISING_EDGE    (1<<1)
 #define PTP_FALLING_EDGE   (1<<2)
+#define PTP_EXTTS_VALID_FLAGS	(PTP_ENABLE_FEATURE |	\
+				 PTP_RISING_EDGE |	\
+				 PTP_FALLING_EDGE)
 
+/*
+ * Bits of the ptp_perout_request.flags field:
+ */
+#define PTP_PEROUT_ONE_SHOT (1<<0)
+#define PTP_PEROUT_VALID_FLAGS	(~PTP_PEROUT_ONE_SHOT)
 /*
  * struct ptp_clock_time - represents a time value
  *
@@ -67,7 +77,7 @@ struct ptp_perout_request {
 	struct ptp_clock_time start;  /* Absolute start time. */
 	struct ptp_clock_time period; /* Desired period, zero means disable. */
 	unsigned int index;           /* Which channel to configure. */
-	unsigned int flags;           /* Reserved for future use. */
+	unsigned int flags;
 	unsigned int rsv[4];          /* Reserved for future use. */
 };
 
@@ -149,6 +159,18 @@ struct ptp_pin_desc {
 #define PTP_SYS_OFFSET_EXTENDED \
 	_IOWR(PTP_CLK_MAGIC, 9, struct ptp_sys_offset_extended)
 
+#define PTP_CLOCK_GETCAPS2  _IOR(PTP_CLK_MAGIC, 10, struct ptp_clock_caps)
+#define PTP_EXTTS_REQUEST2  _IOW(PTP_CLK_MAGIC, 11, struct ptp_extts_request)
+#define PTP_PEROUT_REQUEST2 _IOW(PTP_CLK_MAGIC, 12, struct ptp_perout_request)
+#define PTP_ENABLE_PPS2     _IOW(PTP_CLK_MAGIC, 13, int)
+#define PTP_SYS_OFFSET2     _IOW(PTP_CLK_MAGIC, 14, struct ptp_sys_offset)
+#define PTP_PIN_GETFUNC2    _IOWR(PTP_CLK_MAGIC, 15, struct ptp_pin_desc)
+#define PTP_PIN_SETFUNC2    _IOW(PTP_CLK_MAGIC, 16, struct ptp_pin_desc)
+#define PTP_SYS_OFFSET_PRECISE2 \
+	_IOWR(PTP_CLK_MAGIC, 17, struct ptp_sys_offset_precise)
+#define PTP_SYS_OFFSET_EXTENDED2 \
+	_IOWR(PTP_CLK_MAGIC, 18, struct ptp_sys_offset_extended)
+
 struct ptp_extts_event {
 	struct ptp_clock_time t; /* Time event occured. */
 	unsigned int index;      /* Which channel produced the event. */
-- 
2.23.0


-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply related

* Re: [PATCH 0/4] gianfar: some assorted cleanup
From: Vladimir Oltean @ 2019-09-05 10:39 UTC (permalink / raw)
  To: Arseny Solokha
  Cc: Claudiu Manoil, Ioana Ciornei, Russell King, Andrew Lunn,
	Florian Fainelli, netdev
In-Reply-To: <20190904135223.31754-1-asolokha@kb.kras.ru>

On Wed, 4 Sep 2019 at 16:53, Arseny Solokha <asolokha@kb.kras.ru> wrote:
>
> This is a cleanup series for the gianfar Ethernet driver, following up a
> discussion in [1]. It is intended to precede a conversion of gianfar from
> PHYLIB to PHYLINK API, which will be submitted later in its version 2.
> However, it won't make a conversion cleaner, except for the last patch in
> this series. Obviously this series is not intended for -stable.
>
> The first patch looks super controversial to me, as it moves lots of code
> around for the sole purpose of getting rid of static forward declarations
> in two translation units. On the other hand, this change is purely
> mechanical and cannot do any harm other than cluttering git blame output.
> I can prepare an alternative patch for only swapping adjacent functions
> around, if necessary.
>
> The second patch is a trivial follow-up to the first one, making functions
> that are only called from the same translation unit static.
>
> The third patch removes some now unused macro and structure definitions
> from gianfar.h, slipped away from various cleanups in the past.
>
> The fourth patch, also suggested in [1], makes the driver consistently use
> PHY connection type value obtained from a Device Tree node, instead of
> ignoring it and using the one auto-detected by MAC, when connecting to PHY.
> Obviously a value has to be specified correctly in DT source, or omitted
> altogether, in which case the driver will fall back to auto-detection. When
> querying a DT node, the driver will also take both applicable properties
> into account by making a proper API call instead of open-coding the lookup
> half-way correctly.
>
> [1] https://lore.kernel.org/netdev/CA+h21hruqt6nGG5ksDSwrGH_w5GtGF4fjAMCWJne7QJrjusERQ@mail.gmail.com/
>
> Arseny Solokha (4):
>   gianfar: remove forward declarations
>   gianfar: make five functions static
>   gianfar: cleanup gianfar.h
>   gianfar: use DT more consistently when selecting PHY connection type
>
>  drivers/net/ethernet/freescale/gianfar.c      | 4647 ++++++++---------
>  drivers/net/ethernet/freescale/gianfar.h      |   45 -
>  .../net/ethernet/freescale/gianfar_ethtool.c  |   13 -
>  3 files changed, 2303 insertions(+), 2402 deletions(-)
>
> --
> 2.23.0
>

Thanks for the cleanup!

-Vladimir

^ permalink raw reply

* Re: [PATCH] net: hns: Move static keyword to the front of declaration
From: David Miller @ 2019-09-05 10:39 UTC (permalink / raw)
  To: kw
  Cc: yisen.zhuang, salil.mehta, liuyonglong, lipeng321, gregkh,
	colin.king, huang.zijiang, tglx, netdev, linux-kernel
In-Reply-To: <20190904142116.31884-1-kw@linux.com>

From: Krzysztof Wilczynski <kw@linux.com>
Date: Wed,  4 Sep 2019 16:21:16 +0200

> Move the static keyword to the front of declaration of g_dsaf_mode_match,
> and resolve the following compiler warning that can be seen when building
> with warnings enabled (W=1):
> 
> drivers/net/ethernet/hisilicon/hns/hns_dsaf_main.c:27:1: warning:
>   ‘static’ is not at beginning of declaration [-Wold-style-declaration]
> 
> Signed-off-by: Krzysztof Wilczynski <kw@linux.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH] net: qed: Move static keyword to the front of declaration
From: David Miller @ 2019-09-05 10:39 UTC (permalink / raw)
  To: kw; +Cc: aelior, GR-everest-linux-l2, netdev, linux-kernel
In-Reply-To: <20190904141730.31497-1-kw@linux.com>

From: Krzysztof Wilczynski <kw@linux.com>
Date: Wed,  4 Sep 2019 16:17:30 +0200

> Move the static keyword to the front of declaration of iwarp_state_names,
> and resolve the following compiler warning that can be seen when building
> with warnings enabled (W=1):
> 
> drivers/net/ethernet/qlogic/qed/qed_iwarp.c:385:1: warning:
>   ‘static’ is not at beginning of declaration [-Wold-style-declaration]
> 
> Also, resolve checkpatch.pl script warning:
> 
> WARNING: static const char * array should probably be
>   static const char * const
> 
> Signed-off-by: Krzysztof Wilczynski <kw@linux.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH v3 net] net: Properly update v4 routes with v6 nexthop
From: David Miller @ 2019-09-05 10:36 UTC (permalink / raw)
  To: sharpd; +Cc: netdev, dsahern, sworley
In-Reply-To: <20190904141158.17021-1-sharpd@cumulusnetworks.com>

From: Donald Sharp <sharpd@cumulusnetworks.com>
Date: Wed,  4 Sep 2019 10:11:58 -0400

> When creating a v4 route that uses a v6 nexthop from a nexthop group.
> Allow the kernel to properly send the nexthop as v6 via the RTA_VIA
> attribute.
> 
> Broken behavior:
> 
> $ ip nexthop add via fe80::9 dev eth0
> $ ip nexthop show
> id 1 via fe80::9 dev eth0 scope link
> $ ip route add 4.5.6.7/32 nhid 1
> $ ip route show
> default via 10.0.2.2 dev eth0
> 4.5.6.7 nhid 1 via 254.128.0.0 dev eth0
> 10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
> $
> 
> Fixed behavior:
> 
> $ ip nexthop add via fe80::9 dev eth0
> $ ip nexthop show
> id 1 via fe80::9 dev eth0 scope link
> $ ip route add 4.5.6.7/32 nhid 1
> $ ip route show
> default via 10.0.2.2 dev eth0
> 4.5.6.7 nhid 1 via inet6 fe80::9 dev eth0
> 10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
> $
> 
> v2, v3: Addresses code review comments from David Ahern
> 
> Fixes: dcb1ecb50edf (“ipv4: Prepare for fib6_nh from a nexthop object”)
> Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH v1] pppoatm: use %*ph to print small buffer
From: David Miller @ 2019-09-05 10:33 UTC (permalink / raw)
  To: andriy.shevchenko; +Cc: mitch, netdev
In-Reply-To: <20190904174459.77067-1-andriy.shevchenko@linux.intel.com>

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Wed,  4 Sep 2019 20:44:59 +0300

> Use %*ph format to print small buffer as hex string.
> 
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH v2 0/2] Fix GMII2RGMII private field
From: David Miller @ 2019-09-05 10:32 UTC (permalink / raw)
  To: harini.katakam
  Cc: andrew, f.fainelli, hkallweit1, michal.simek, netdev,
	linux-arm-kernel, linux-kernel, harinikatakamlinux,
	radhey.shyam.pandey
In-Reply-To: <1567605621-6818-1-git-send-email-harini.katakam@xilinx.com>

From: Harini Katakam <harini.katakam@xilinx.com>
Date: Wed,  4 Sep 2019 19:30:19 +0530

> Fix the usage of external phy's priv field by gmii2rgmii driver.
> 
> Based on net-next.

Series applied to net-next.

^ permalink raw reply

* Re: [PATCH 0/4] gianfar: some assorted cleanup
From: David Miller @ 2019-09-05 10:28 UTC (permalink / raw)
  To: asolokha
  Cc: claudiu.manoil, ioana.ciornei, linux, andrew, olteanv, f.fainelli,
	netdev
In-Reply-To: <20190904135223.31754-1-asolokha@kb.kras.ru>

From: Arseny Solokha <asolokha@kb.kras.ru>
Date: Wed,  4 Sep 2019 20:52:18 +0700

> This is a cleanup series for the gianfar Ethernet driver, following up a
> discussion in [1]. It is intended to precede a conversion of gianfar from
> PHYLIB to PHYLINK API, which will be submitted later in its version 2.
> However, it won't make a conversion cleaner, except for the last patch in
> this series. Obviously this series is not intended for -stable.
> 
> The first patch looks super controversial to me, as it moves lots of code
> around for the sole purpose of getting rid of static forward declarations
> in two translation units. On the other hand, this change is purely
> mechanical and cannot do any harm other than cluttering git blame output.
> I can prepare an alternative patch for only swapping adjacent functions
> around, if necessary.
> 
> The second patch is a trivial follow-up to the first one, making functions
> that are only called from the same translation unit static.
> 
> The third patch removes some now unused macro and structure definitions
> from gianfar.h, slipped away from various cleanups in the past.
> 
> The fourth patch, also suggested in [1], makes the driver consistently use
> PHY connection type value obtained from a Device Tree node, instead of
> ignoring it and using the one auto-detected by MAC, when connecting to PHY.
> Obviously a value has to be specified correctly in DT source, or omitted
> altogether, in which case the driver will fall back to auto-detection. When
> querying a DT node, the driver will also take both applicable properties
> into account by making a proper API call instead of open-coding the lookup
> half-way correctly.
> 
> [1] https://lore.kernel.org/netdev/CA+h21hruqt6nGG5ksDSwrGH_w5GtGF4fjAMCWJne7QJrjusERQ@mail.gmail.com/

Series applied, thanks.

^ permalink raw reply

* [PATCH 1/5] xfrm interface: avoid corruption on changelink
From: Steffen Klassert @ 2019-09-05 10:21 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20190905102201.1636-1-steffen.klassert@secunet.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

The new parameters must not be stored in the netdev_priv() before
validation, it may corrupt the interface. Note also that if data is NULL,
only a memset() is done.

$ ip link add xfrm1 type xfrm dev lo if_id 1
$ ip link add xfrm2 type xfrm dev lo if_id 2
$ ip link set xfrm1 type xfrm dev lo if_id 2
RTNETLINK answers: File exists
$ ip -d link list dev xfrm1
5: xfrm1@lo: <NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/none 00:00:00:00:00:00 brd 00:00:00:00:00:00 promiscuity 0 minmtu 68 maxmtu 1500
    xfrm if_id 0x2 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

=> "if_id 0x2"

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Julien Floret <julien.floret@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_interface.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 74868f9d81fb..2310dc9e35c2 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -671,12 +671,12 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
 			   struct nlattr *data[],
 			   struct netlink_ext_ack *extack)
 {
-	struct xfrm_if *xi = netdev_priv(dev);
 	struct net *net = dev_net(dev);
+	struct xfrm_if_parms p;
+	struct xfrm_if *xi;
 
-	xfrmi_netlink_parms(data, &xi->p);
-
-	xi = xfrmi_locate(net, &xi->p);
+	xfrmi_netlink_parms(data, &p);
+	xi = xfrmi_locate(net, &p);
 	if (!xi) {
 		xi = netdev_priv(dev);
 	} else {
@@ -684,7 +684,7 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
 			return -EEXIST;
 	}
 
-	return xfrmi_update(xi, &xi->p);
+	return xfrmi_update(xi, &p);
 }
 
 static size_t xfrmi_get_size(const struct net_device *dev)
-- 
2.17.1


^ permalink raw reply related

* pull request (net): ipsec 2019-09-05
From: Steffen Klassert @ 2019-09-05 10:21 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev

1) Several xfrm interface fixes from Nicolas Dichtel:
   - Avoid an interface ID corruption on changelink.
   - Fix wrong intterface names in the logs.
   - Fix a list corruption when changing network namespaces.
   - Fix unregistation of the underying phydev.

2) Fix a potential warning when merging xfrm_plocy nodes.
   From Florian Westphal.

Please pull or let me know if there are problems.

Thanks!

The following changes since commit 114a5c3240155fdb01bf821c9d326d7bb05bd464:

  Merge tag 'mlx5-fixes-2019-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux (2019-07-11 15:06:37 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec.git master

for you to fetch changes up to 769a807d0b41df4201dbeb01c22eaeb3e5905532:

  xfrm: policy: avoid warning splat when merging nodes (2019-08-20 08:09:42 +0200)

----------------------------------------------------------------
Florian Westphal (1):
      xfrm: policy: avoid warning splat when merging nodes

Nicolas Dichtel (4):
      xfrm interface: avoid corruption on changelink
      xfrm interface: ifname may be wrong in logs
      xfrm interface: fix list corruption for x-netns
      xfrm interface: fix management of phydev

 include/net/xfrm.h                         |  2 --
 net/xfrm/xfrm_interface.c                  | 56 +++++++++++++-----------------
 net/xfrm/xfrm_policy.c                     |  6 ++--
 tools/testing/selftests/net/xfrm_policy.sh |  7 ++++
 4 files changed, 36 insertions(+), 35 deletions(-)

^ permalink raw reply

* [PATCH 5/5] xfrm: policy: avoid warning splat when merging nodes
From: Steffen Klassert @ 2019-09-05 10:22 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20190905102201.1636-1-steffen.klassert@secunet.com>

From: Florian Westphal <fw@strlen.de>

syzbot reported a splat:
 xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877
 CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57
 Call Trace:
  xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline]
  xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline]
  xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023
  xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139
  xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182
  xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574
  xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670
  xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676
  netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477
  xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684
  netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
  netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328
  netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917
  sock_sendmsg_nosec net/socket.c:637 [inline]
  sock_sendmsg net/socket.c:657 [inline]

There is no reproducer, however, the warning can be reproduced
by adding rules with ever smaller prefixes.

The sanity check ("does the policy match the node") uses the prefix value
of the node before its updated to the smaller value.

To fix this, update the prefix earlier.  The bug has no impact on tree
correctness, this is only to prevent a false warning.

Reported-by: syzbot+8cc27ace5f6972910b31@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_policy.c                     | 6 ++++--
 tools/testing/selftests/net/xfrm_policy.sh | 7 +++++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 8ca637a72697..0fa7c5ce3b2c 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -912,6 +912,7 @@ static void xfrm_policy_inexact_node_reinsert(struct net *net,
 		} else if (delta > 0) {
 			p = &parent->rb_right;
 		} else {
+			bool same_prefixlen = node->prefixlen == n->prefixlen;
 			struct xfrm_policy *tmp;
 
 			hlist_for_each_entry(tmp, &n->hhead, bydst) {
@@ -919,9 +920,11 @@ static void xfrm_policy_inexact_node_reinsert(struct net *net,
 				hlist_del_rcu(&tmp->bydst);
 			}
 
+			node->prefixlen = prefixlen;
+
 			xfrm_policy_inexact_list_reinsert(net, node, family);
 
-			if (node->prefixlen == n->prefixlen) {
+			if (same_prefixlen) {
 				kfree_rcu(n, rcu);
 				return;
 			}
@@ -929,7 +932,6 @@ static void xfrm_policy_inexact_node_reinsert(struct net *net,
 			rb_erase(*p, new);
 			kfree_rcu(n, rcu);
 			n = node;
-			n->prefixlen = prefixlen;
 			goto restart;
 		}
 	}
diff --git a/tools/testing/selftests/net/xfrm_policy.sh b/tools/testing/selftests/net/xfrm_policy.sh
index 5445943bf07f..7a1bf94c5bd3 100755
--- a/tools/testing/selftests/net/xfrm_policy.sh
+++ b/tools/testing/selftests/net/xfrm_policy.sh
@@ -106,6 +106,13 @@ do_overlap()
     #
     # 10.0.0.0/24 and 10.0.1.0/24 nodes have been merged as 10.0.0.0/23.
     ip -net $ns xfrm policy add src 10.1.0.0/24 dst 10.0.0.0/23 dir fwd priority 200 action block
+
+    # similar to above: add policies (with partially random address), with shrinking prefixes.
+    for p in 29 28 27;do
+      for k in $(seq 1 32); do
+       ip -net $ns xfrm policy add src 10.253.1.$((RANDOM%255))/$p dst 10.254.1.$((RANDOM%255))/$p dir fwd priority $((200+k)) action block 2>/dev/null
+      done
+    done
 }
 
 do_esp_policy_get_check() {
-- 
2.17.1


^ permalink raw reply related

* [PATCH 4/5] xfrm interface: fix management of phydev
From: Steffen Klassert @ 2019-09-05 10:22 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20190905102201.1636-1-steffen.klassert@secunet.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

With the current implementation, phydev cannot be removed:

$ ip link add dummy type dummy
$ ip link add xfrm1 type xfrm dev dummy if_id 1
$ ip l d dummy
 kernel:[77938.465445] unregister_netdevice: waiting for dummy to become free. Usage count = 1

Manage it like in ip tunnels, ie just keep the ifindex. Not that the side
effect, is that the phydev is now optional.

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Julien Floret <julien.floret@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/xfrm.h        |  1 -
 net/xfrm/xfrm_interface.c | 32 +++++++++++++++++---------------
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index ad761ef84797..aa08a7a5f6ac 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -990,7 +990,6 @@ struct xfrm_if_parms {
 struct xfrm_if {
 	struct xfrm_if __rcu *next;	/* next interface in list */
 	struct net_device *dev;		/* virtual device associated with interface */
-	struct net_device *phydev;	/* physical device */
 	struct net *net;		/* netns for packet i/o */
 	struct xfrm_if_parms p;		/* interface parms */
 
diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 53e5e47b2c55..2ab4859df55a 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -175,7 +175,6 @@ static void xfrmi_dev_uninit(struct net_device *dev)
 	struct xfrmi_net *xfrmn = net_generic(xi->net, xfrmi_net_id);
 
 	xfrmi_unlink(xfrmn, xi);
-	dev_put(xi->phydev);
 	dev_put(dev);
 }
 
@@ -362,7 +361,7 @@ static netdev_tx_t xfrmi_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto tx_err;
 	}
 
-	fl.flowi_oif = xi->phydev->ifindex;
+	fl.flowi_oif = xi->p.link;
 
 	ret = xfrmi_xmit2(skb, dev, &fl);
 	if (ret < 0)
@@ -548,7 +547,7 @@ static int xfrmi_get_iflink(const struct net_device *dev)
 {
 	struct xfrm_if *xi = netdev_priv(dev);
 
-	return xi->phydev->ifindex;
+	return xi->p.link;
 }
 
 
@@ -574,12 +573,14 @@ static void xfrmi_dev_setup(struct net_device *dev)
 	dev->needs_free_netdev	= true;
 	dev->priv_destructor	= xfrmi_dev_free;
 	netif_keep_dst(dev);
+
+	eth_broadcast_addr(dev->broadcast);
 }
 
 static int xfrmi_dev_init(struct net_device *dev)
 {
 	struct xfrm_if *xi = netdev_priv(dev);
-	struct net_device *phydev = xi->phydev;
+	struct net_device *phydev = __dev_get_by_index(xi->net, xi->p.link);
 	int err;
 
 	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
@@ -594,13 +595,19 @@ static int xfrmi_dev_init(struct net_device *dev)
 
 	dev->features |= NETIF_F_LLTX;
 
-	dev->needed_headroom = phydev->needed_headroom;
-	dev->needed_tailroom = phydev->needed_tailroom;
+	if (phydev) {
+		dev->needed_headroom = phydev->needed_headroom;
+		dev->needed_tailroom = phydev->needed_tailroom;
 
-	if (is_zero_ether_addr(dev->dev_addr))
-		eth_hw_addr_inherit(dev, phydev);
-	if (is_zero_ether_addr(dev->broadcast))
-		memcpy(dev->broadcast, phydev->broadcast, dev->addr_len);
+		if (is_zero_ether_addr(dev->dev_addr))
+			eth_hw_addr_inherit(dev, phydev);
+		if (is_zero_ether_addr(dev->broadcast))
+			memcpy(dev->broadcast, phydev->broadcast,
+			       dev->addr_len);
+	} else {
+		eth_hw_addr_random(dev);
+		eth_broadcast_addr(dev->broadcast);
+	}
 
 	return 0;
 }
@@ -644,13 +651,8 @@ static int xfrmi_newlink(struct net *src_net, struct net_device *dev,
 	xi->p = p;
 	xi->net = net;
 	xi->dev = dev;
-	xi->phydev = dev_get_by_index(net, p.link);
-	if (!xi->phydev)
-		return -ENODEV;
 
 	err = xfrmi_create(dev);
-	if (err < 0)
-		dev_put(xi->phydev);
 	return err;
 }
 
-- 
2.17.1


^ permalink raw reply related

* [PATCH 3/5] xfrm interface: fix list corruption for x-netns
From: Steffen Klassert @ 2019-09-05 10:21 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20190905102201.1636-1-steffen.klassert@secunet.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>

dev_net(dev) is the netns of the device and xi->net is the link netns,
where the device has been linked.
changelink() must operate in the link netns to avoid a corruption of
the xfrm lists.

Note that xi->net and dev_net(xi->physdev) are always the same.

Before the patch, the xfrmi lists may be corrupted and can later trigger a
kernel panic.

Fixes: f203b76d7809 ("xfrm: Add virtual xfrm interfaces")
Reported-by: Julien Floret <julien.floret@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Julien Floret <julien.floret@6wind.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_interface.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 68336ee00d72..53e5e47b2c55 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -503,7 +503,7 @@ static int xfrmi_change(struct xfrm_if *xi, const struct xfrm_if_parms *p)
 
 static int xfrmi_update(struct xfrm_if *xi, struct xfrm_if_parms *p)
 {
-	struct net *net = dev_net(xi->dev);
+	struct net *net = xi->net;
 	struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
 	int err;
 
@@ -663,9 +663,9 @@ static int xfrmi_changelink(struct net_device *dev, struct nlattr *tb[],
 			   struct nlattr *data[],
 			   struct netlink_ext_ack *extack)
 {
-	struct net *net = dev_net(dev);
+	struct xfrm_if *xi = netdev_priv(dev);
+	struct net *net = xi->net;
 	struct xfrm_if_parms p;
-	struct xfrm_if *xi;
 
 	xfrmi_netlink_parms(data, &p);
 	xi = xfrmi_locate(net, &p);
@@ -707,7 +707,7 @@ static struct net *xfrmi_get_link_net(const struct net_device *dev)
 {
 	struct xfrm_if *xi = netdev_priv(dev);
 
-	return dev_net(xi->phydev);
+	return xi->net;
 }
 
 static const struct nla_policy xfrmi_policy[IFLA_XFRM_MAX + 1] = {
-- 
2.17.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox