Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v5] net: caif: fix stack out-of-bounds write in cfctrl_link_setup()
From: Simon Horman @ 2026-04-12 13:57 UTC (permalink / raw)
  To: Kangzheng Gu
  Cc: pabeni, davem, edumazet, kuba, kees, thorsten.blum, arnd,
	sjur.brandeland, netdev, linux-kernel, stable
In-Reply-To: <20260408125333.38489-1-xiaoguai0992@gmail.com>

On Wed, Apr 08, 2026 at 12:53:33PM +0000, Kangzheng Gu wrote:
> cfctrl_link_setup() copies the RFM volume name from a received control
> packet into linkparam.u.rfm.volume until a '\0' is found. A malformed
> packet can omit the terminator and make the copy run past the 20-byte
> stack buffer.
> 
> Stop copying once the buffer is full and mark the frame as failed by
> setting CFCTRL_ERR_BIT so the link setup is rejected.
> 
> Fixes: b482cd2053e3 ("net-caif: add CAIF core protocol stack")
> Cc: stable@vger.kernel.org
> Signed-off-by: Kangzheng Gu <xiaoguai0992@gmail.com>
> ---
>  v5:
>  - remove the Reported-by.
>  - print a warn message and reject link setup by setting CFCTRL_ERR_BIT.
>  - using %zu to adapt the compilation of 32-bit kernel.
>  - add rate limit to error message
> 
>  net/caif/cfctrl.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/net/caif/cfctrl.c b/net/caif/cfctrl.c
> index c6cc2bfed65d..373ab1dc67a7 100644
> --- a/net/caif/cfctrl.c
> +++ b/net/caif/cfctrl.c
> @@ -416,8 +416,16 @@ static int cfctrl_link_setup(struct cfctrl *cfctrl, struct cfpkt *pkt, u8 cmdrsp
>  		cp = (u8 *) linkparam.u.rfm.volume;
>  		for (tmp = cfpkt_extr_head_u8(pkt);
>  		     cfpkt_more(pkt) && tmp != '\0';
> -		     tmp = cfpkt_extr_head_u8(pkt))
> +		     tmp = cfpkt_extr_head_u8(pkt)) {
> +			if (cp >= (u8 *)linkparam.u.rfm.volume +
> +			    sizeof(linkparam.u.rfm.volume) - 1) {
> +				pr_warn_ratelimited("Request reject, volume name length exceeds %zu\n",
> +						    sizeof(linkparam.u.rfm.volume));
> +				cmdrsp |= CFCTRL_ERR_BIT;
> +				break;
> +			}
>  			*cp++ = tmp;
> +		}
>  		*cp = '\0';
>  
>  		if (CFCTRL_ERR_BIT & cmdrsp)

I am wondering if it would be best to follow the pattern for
writing linkparam.u.utility.name elsewhere in this function.
That:
1. Uses a somewhat more succinct loop control structure
2. Silently truncates input without updating cmdrsp if overrun would occur

Something like this (compile tested only!):

diff --git a/net/caif/cfctrl.c b/net/caif/cfctrl.c
index c6cc2bfed65d..ba184c11386e 100644
--- a/net/caif/cfctrl.c
+++ b/net/caif/cfctrl.c
@@ -15,6 +15,7 @@
 #include <net/caif/cfctrl.h>
 
 #define container_obj(layr) container_of(layr, struct cfctrl, serv.layer)
+#define RFM_VOLUME_LEN 20
 #define UTILITY_NAME_LENGTH 16
 #define CFPKT_CTRL_PKT_LEN 20
 
@@ -414,10 +415,11 @@ static int cfctrl_link_setup(struct cfctrl *cfctrl, struct cfpkt *pkt, u8 cmdrsp
 		 */
 		linkparam.u.rfm.connid = cfpkt_extr_head_u32(pkt);
 		cp = (u8 *) linkparam.u.rfm.volume;
-		for (tmp = cfpkt_extr_head_u8(pkt);
-		     cfpkt_more(pkt) && tmp != '\0';
-		     tmp = cfpkt_extr_head_u8(pkt))
+		caif_assert(sizeof(linkparam.u.rfm.volume) >= RFM_VOLUME_LEN);
+		for(i = 0; i < RFM_VOLUME_LEN - 1 && cfpkt_more(pkt); i++) {
+			tmp = cfpkt_extr_head_u8(pkt);
 			*cp++ = tmp;
+		}
 		*cp = '\0';
 
 		if (CFCTRL_ERR_BIT & cmdrsp)

Also, it seems that writing linkparam.u.utility.paramlen elsewhere
in this function also has a potential buffer overrun (by one byte).

^ permalink raw reply related

* [PATCH net-next v3 0/5] net: phy: Fix phy_init_hw() placement and update locking
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc

From: Biju Das <biju.das.jz@bp.renesas.com>

This series fixes two related issues in the PHY subsystem: incorrect
placement of phy_init_hw() in the resume path, and drop/update locking
in several PHY drivers.

Patch 1 identifies that when mac_managed_pm is set, mdio_bus_phy_resume()
is skipped entirely, meaning phy_init_hw() which performs soft reset and
config_init is never called on resume for that path. To make both paths
consistent, phy_init_hw() is moved into phy_resume() so it runs
unconditionally. As a consequence, the separate phy_init_hw() +
phy_resume() call sequence in phy_attach_direct() is collapsed
into a single phy_resume() call.

Patch 2 removes the now-redundant explicit phy_init_hw() call in
rtl8169_up(), since phy_resume() already handles it.

Patch 3 removes manual mutex_lock/unlock(&phydev->lock) from four
functions in the MSCC PHY driver. In vsc85xx_edge_rate_cntl_set(),
the lock wraps a single phy_modify_paged() call, which is already a
fully locked atomic operation that acquires the MDIO bus lock
internally, so the additional phydev->lock is unnecessary. The
remaining three functions — vsc85xx_mac_if_set(),
vsc8531_pre_init_seq_set(), and vsc85xx_eee_init_seq_set() — use
phy_read(), phy_write(), phy_select_page(), and phy_restore_page(),
all of which operate under the MDIO bus lock, so taking phydev->lock
around them provides no additional serialisation. Error-path labels
are updated accordingly.

Patch 4 fixes lan937x_dsp_workaround() in the Microchip T1 driver,
which was incorrectly taking phydev->lock. The function performs raw
MDIO bus accesses and must instead hold mdio_lock. The phy_read() and
phy_write() calls are also switched to their unlocked __phy_read() and
__phy_write() variants since mdio_lock is now held explicitly.

Patch 5 refines the placement introduced in patch 1 by moving
phy_init_hw() from phy_resume() down into __phy_resume(), so that it
is called with phydev->lock already held. This is necessary because
many MAC drivers and phylink reach the resume path via phy_start() ->
__phy_resume() directly, bypassing phy_resume().

v2->v3:
 * Moved all the patches into series as the order they get merged
   matters, otherwise a git bisect could land on a deadlock.
 * Updated commit description for patch#2 and #3.
v1->v2:
 * Updated commit description.
 * phy_init_hw() is moved from __phy_resume() -> phy_resume() to make it
   lock-free.
 * Dropped redundant phy_init_hw() call from mdio_bus_phy_resume() and
   phy_attach_direct().

Biju Das (5):
  net: phy: call phy_init_hw() in phy resume path
  r8169: Drop redundant phy_init_hw() call in rtl8169_up()
  net: phy: mscc: Drop unnecessary phydev->lock
  net: phy: microchip_t1: Replace phydev->lock with mdio_lock in
  net: phy: Move phy_init_hw() from phy_resume() to __phy_resume()

 drivers/net/ethernet/realtek/r8169_main.c |  1 -
 drivers/net/phy/microchip_t1.c            |  8 ++---
 drivers/net/phy/mscc/mscc_main.c          | 41 +++++++----------------
 drivers/net/phy/phy_device.c              | 14 ++++----
 4 files changed, 22 insertions(+), 42 deletions(-)

-- 
2.43.0


^ permalink raw reply

* [PATCH net-next v3 1/5] net: phy: call phy_init_hw() in phy resume path
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc, Ovidiu Panait
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

When mac_managed_pm flag is set, mdio_bus_phy_resume() is skipped, so
phy_init_hw(), which performs soft_reset and config_init, is not called
during resume.

This is inconsistent with the non-mac_managed_pm path, where
mdio_bus_phy_resume() calls phy_init_hw() before phy_resume() on every
resume.

To align both paths, move the phy_init_hw() call into phy_resume() itself,
before invoking the driver's resume callback. This ensures PHY soft reset
and re-initialization happen unconditionally, regardless of whether PM is
managed by the MAC or the MDIO bus. As a result, drop the redundant
phy_init_hw() call in mdio_bus_phy_resume().

Additionally, in phy_attach_direct(), replace the separate phy_init_hw()
and phy_resume() calls with a single phy_resume() call, since
phy_init_hw() is now handled inside phy_resume().

Signed-off-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v2->v3:
 * No change,
v1->v2:
 * Updated commit description.
 * phy_init_hw() is moved from __phy_resume() -> phy_resume() to make it
   lock-free.
 * Dropped redundant phy_init_hw() call from mdio_bus_phy_resume() and
   phy_attach_direct().
---
 drivers/net/phy/phy_device.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 0edff47478c2..4a2b19d39373 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -396,10 +396,6 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
 	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
 		phydev->state != PHY_UP);
 
-	ret = phy_init_hw(phydev);
-	if (ret < 0)
-		return ret;
-
 	ret = phy_resume(phydev);
 	if (ret < 0)
 		return ret;
@@ -1857,16 +1853,14 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	if (dev)
 		netif_carrier_off(phydev->attached_dev);
 
-	/* Do initial configuration here, now that
+	/* Do initial configuration inside phy_init_hw(), now that
 	 * we have certain key parameters
 	 * (dev_flags and interface)
 	 */
-	err = phy_init_hw(phydev);
+	err = phy_resume(phydev);
 	if (err)
 		goto error;
 
-	phy_resume(phydev);
-
 	/**
 	 * If the external phy used by current mac interface is managed by
 	 * another mac interface, so we should create a device link between
@@ -2020,6 +2014,10 @@ int phy_resume(struct phy_device *phydev)
 {
 	int ret;
 
+	ret = phy_init_hw(phydev);
+	if (ret)
+		return ret;
+
 	mutex_lock(&phydev->lock);
 	ret = __phy_resume(phydev);
 	mutex_unlock(&phydev->lock);
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 2/5] r8169: Drop redundant phy_init_hw() call in rtl8169_up()
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Heiner Kallweit, nic_swsd, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

Since phy_resume() now calls phy_init_hw() internally as part of the
resume sequence, the explicit phy_init_hw() call immediately before
phy_resume() in rtl8169_up() is redundant. Remove it.

No functional change intended.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v2->v3:
 * Added the patch into series
 * Updated commit description.
v2:
 * New patch
---
 drivers/net/ethernet/realtek/r8169_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 791277e750ba..cb22105f323f 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -5032,7 +5032,6 @@ static void rtl8169_up(struct rtl8169_private *tp)
 		rtl8168_driver_start(tp);
 
 	pci_set_master(tp->pci_dev);
-	phy_init_hw(tp->phydev);
 	phy_resume(tp->phydev);
 	rtl8169_init_phy(tp);
 	napi_enable(&tp->napi);
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 3/5] net: phy: mscc: Drop unnecessary phydev->lock
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, Lad Prabhakar, Horatiu Vultur,
	Vladimir Oltean, netdev, linux-kernel, Geert Uytterhoeven,
	Biju Das, linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

Remove manual mutex_lock/unlock(&phydev->lock) calls from several
functions in the MSCC PHY driver.

In vsc85xx_edge_rate_cntl_set(), phydev->lock is taken around a single
phy_modify_paged() call. phy_modify_paged() is already a fully locked
atomic operation that acquires the MDIO bus lock internally, so the
additional phydev->lock is unnecessary.

The remaining three functions — vsc85xx_mac_if_set(),
vsc8531_pre_init_seq_set(), and vsc85xx_eee_init_seq_set() — use
phy_read(), phy_write(), phy_select_page(), and phy_restore_page(),
all of which operate under the MDIO bus lock. Taking phydev->lock
around them provides no additional serialisation.

Along with dropping the locks, error-path labels are renamed from
out_unlock to err or restore_oldpage to better reflect their purpose.
In vsc8531_pre_init_seq_set() and vsc85xx_eee_init_seq_set(), the
redundant intermediate assignment of oldpage before returning is also
eliminated.

No functional change intended.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v2->v3:
 * Added the patch into series
 * Updated commit description.
v2:
 * New patch
---
 drivers/net/phy/mscc/mscc_main.c | 41 ++++++++++----------------------
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/drivers/net/phy/mscc/mscc_main.c b/drivers/net/phy/mscc/mscc_main.c
index 2b9fb8a675a6..75430f55acfd 100644
--- a/drivers/net/phy/mscc/mscc_main.c
+++ b/drivers/net/phy/mscc/mscc_main.c
@@ -486,15 +486,9 @@ static int vsc85xx_dt_led_modes_get(struct phy_device *phydev,
 
 static int vsc85xx_edge_rate_cntl_set(struct phy_device *phydev, u8 edge_rate)
 {
-	int rc;
-
-	mutex_lock(&phydev->lock);
-	rc = phy_modify_paged(phydev, MSCC_PHY_PAGE_EXTENDED_2,
-			      MSCC_PHY_WOL_MAC_CONTROL, EDGE_RATE_CNTL_MASK,
-			      edge_rate << EDGE_RATE_CNTL_POS);
-	mutex_unlock(&phydev->lock);
-
-	return rc;
+	return phy_modify_paged(phydev, MSCC_PHY_PAGE_EXTENDED_2,
+				MSCC_PHY_WOL_MAC_CONTROL, EDGE_RATE_CNTL_MASK,
+				edge_rate << EDGE_RATE_CNTL_POS);
 }
 
 static int vsc85xx_mac_if_set(struct phy_device *phydev,
@@ -503,7 +497,6 @@ static int vsc85xx_mac_if_set(struct phy_device *phydev,
 	int rc;
 	u16 reg_val;
 
-	mutex_lock(&phydev->lock);
 	reg_val = phy_read(phydev, MSCC_PHY_EXT_PHY_CNTL_1);
 	reg_val &= ~(MAC_IF_SELECTION_MASK);
 	switch (interface) {
@@ -522,17 +515,15 @@ static int vsc85xx_mac_if_set(struct phy_device *phydev,
 		break;
 	default:
 		rc = -EINVAL;
-		goto out_unlock;
+		goto err;
 	}
 	rc = phy_write(phydev, MSCC_PHY_EXT_PHY_CNTL_1, reg_val);
 	if (rc)
-		goto out_unlock;
+		goto err;
 
 	rc = genphy_soft_reset(phydev);
 
-out_unlock:
-	mutex_unlock(&phydev->lock);
-
+err:
 	return rc;
 }
 
@@ -668,19 +659,15 @@ static int vsc8531_pre_init_seq_set(struct phy_device *phydev)
 	if (rc < 0)
 		return rc;
 
-	mutex_lock(&phydev->lock);
 	oldpage = phy_select_page(phydev, MSCC_PHY_PAGE_TR);
 	if (oldpage < 0)
-		goto out_unlock;
+		goto restore_oldpage;
 
 	for (i = 0; i < ARRAY_SIZE(init_seq); i++)
 		vsc85xx_tr_write(phydev, init_seq[i].reg, init_seq[i].val);
 
-out_unlock:
-	oldpage = phy_restore_page(phydev, oldpage, oldpage);
-	mutex_unlock(&phydev->lock);
-
-	return oldpage;
+restore_oldpage:
+	return phy_restore_page(phydev, oldpage, oldpage);
 }
 
 static int vsc85xx_eee_init_seq_set(struct phy_device *phydev)
@@ -708,19 +695,15 @@ static int vsc85xx_eee_init_seq_set(struct phy_device *phydev)
 	unsigned int i;
 	int oldpage;
 
-	mutex_lock(&phydev->lock);
 	oldpage = phy_select_page(phydev, MSCC_PHY_PAGE_TR);
 	if (oldpage < 0)
-		goto out_unlock;
+		goto restore_oldpage;
 
 	for (i = 0; i < ARRAY_SIZE(init_eee); i++)
 		vsc85xx_tr_write(phydev, init_eee[i].reg, init_eee[i].val);
 
-out_unlock:
-	oldpage = phy_restore_page(phydev, oldpage, oldpage);
-	mutex_unlock(&phydev->lock);
-
-	return oldpage;
+restore_oldpage:
+	return phy_restore_page(phydev, oldpage, oldpage);
 }
 
 /* phydev->bus->mdio_lock should be locked when using this function */
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 4/5] net: phy: microchip_t1: Replace phydev->lock with mdio_lock in lan937x_dsp_workaround()
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Arun Ramadoss, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, UNGLinuxDriver, Russell King, netdev, linux-kernel,
	Geert Uytterhoeven, Prabhakar Mahadev Lad, Biju Das,
	linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

lan937x_dsp_workaround() performs raw MDIO bus accesses and must
therefore hold mdio_lock rather than phydev->lock. Using phydev->lock
here is incorrect as it does not serialise access to the MDIO bus.

Replace phydev->lock with bus->mdio_lock, and switch the phy_read()/
phy_write() calls to their unlocked __phy_read()/__phy_write()
variants since mdio_lock is now held explicitly.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v3:
 * New patch.
---
 drivers/net/phy/microchip_t1.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/microchip_t1.c b/drivers/net/phy/microchip_t1.c
index 62b36a318100..afb4e8908b78 100644
--- a/drivers/net/phy/microchip_t1.c
+++ b/drivers/net/phy/microchip_t1.c
@@ -337,9 +337,9 @@ static int lan937x_dsp_workaround(struct phy_device *phydev, u16 ereg, u8 bank)
 	int rc = 0;
 	u16 val;
 
-	mutex_lock(&phydev->lock);
+	mutex_lock(&phydev->mdio.bus->mdio_lock);
 	/* Read previous selected bank */
-	rc = phy_read(phydev, LAN87XX_EXT_REG_CTL);
+	rc = __phy_read(phydev, LAN87XX_EXT_REG_CTL);
 	if (rc < 0)
 		goto out_unlock;
 
@@ -353,11 +353,11 @@ static int lan937x_dsp_workaround(struct phy_device *phydev, u16 ereg, u8 bank)
 		val |= LAN87XX_EXT_REG_CTL_RD_CTL;
 
 		/* access twice for DSP bank change,dummy access */
-		rc = phy_write(phydev, LAN87XX_EXT_REG_CTL, val);
+		rc = __phy_write(phydev, LAN87XX_EXT_REG_CTL, val);
 	}
 
 out_unlock:
-	mutex_unlock(&phydev->lock);
+	mutex_unlock(&phydev->mdio.bus->mdio_lock);
 
 	return rc;
 }
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 5/5] net: phy: Move phy_init_hw() from phy_resume() to __phy_resume()
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

Now that redundant locking has been removed from PHY driver callbacks,
phy_init_hw() can be called with phydev->lock held.

Many MAC drivers and the phylink framework resume the PHY via
phy_start(), which invokes __phy_resume() directly without going
through phy_resume(). Keeping phy_init_hw() in phy_resume() means it
is not called in this path.

Move phy_init_hw() into __phy_resume() so that PHY soft reset and
re-initialisation happen unconditionally on every resume, regardless
of which code path triggers it.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v3:
 * New patch.
---
 drivers/net/phy/phy_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 4a2b19d39373..16fc2fc63c50 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1999,6 +1999,10 @@ int __phy_resume(struct phy_device *phydev)
 
 	lockdep_assert_held(&phydev->lock);
 
+	ret = phy_init_hw(phydev);
+	if (ret)
+		return ret;
+
 	if (!phydrv || !phydrv->resume)
 		return 0;
 
@@ -2014,10 +2018,6 @@ int phy_resume(struct phy_device *phydev)
 {
 	int ret;
 
-	ret = phy_init_hw(phydev);
-	if (ret)
-		return ret;
-
 	mutex_lock(&phydev->lock);
 	ret = __phy_resume(phydev);
 	mutex_unlock(&phydev->lock);
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Maxime Chevallier @ 2026-04-12 14:01 UTC (permalink / raw)
  To: Russell King (Oracle), Andrew Lunn
  Cc: Alexandre Torgue, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, linux-arm-kernel, linux-stm32, netdev,
	Paolo Abeni, Sam Edwards
In-Reply-To: <E1wBBaR-0000000GZHR-1dbM@rmk-PC.armlinux.org.uk>

Hi Russell,

On 10/04/2026 15:07, Russell King (Oracle) wrote:
> Enable receive process stopped and receive buffer unavailable
> interrupts, so that the statistic counters can be updated.
> 
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> Since we are seeing receive buffer exhaustion on several platforms,
> let's enable the interrupts so the statistics we publish via ethtool -S
> actually work to aid diagnosis. I've been in two minds about whether
> to send this patch, but given the problems with stmmac at the moment,
> I think it should be merged.

Looks like my reply to your original RFC was lost in limbo as the review/test tags are missing.

Here's my original answer :


It works, I can indeed see the stats get properly updated on imx8mp 🙂

There's one downside to it though, which is that as soon as we hit a situation
where we don't have RX bufs available, this patchs has a tendancy to make things
worse as we'll trigger interrupts for each packet we receive and that we can't
process, making it even longer for queues to be refilled.

It shows on iperf3 with small packets :

---- Before patch, 17% packet loss on UDP 56 bytes packets -----------------

# iperf3 -u -b 0 -l 56 -c 192.168.2.1 -R
Connecting to host 192.168.2.1, port 5201
Reverse mode, remote host 192.168.2.1 is sending
[  5] local 192.168.2.18 port 47851 connected to 192.168.2.1 port 5201
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  10.7 MBytes  90.0 Mbits/sec  0.003 ms  48550/249650 (19%)  
[  5]   1.00-2.00   sec  11.3 MBytes  95.0 Mbits/sec  0.003 ms  41881/253832 (16%)  
[  5]   2.00-3.00   sec  11.3 MBytes  94.9 Mbits/sec  0.002 ms  42060/253913 (17%)  
[  5]   3.00-4.00   sec  11.3 MBytes  95.1 Mbits/sec  0.003 ms  41499/253785 (16%)  
[  5]   4.00-5.00   sec  11.3 MBytes  94.6 Mbits/sec  0.003 ms  42663/253787 (17%)  
[  5]   5.00-6.00   sec  11.3 MBytes  94.9 Mbits/sec  0.006 ms  41976/253719 (17%)  
[  5]   6.00-7.00   sec  11.3 MBytes  94.5 Mbits/sec  0.003 ms  43133/253999 (17%)  
[  5]   7.00-8.00   sec  11.3 MBytes  95.0 Mbits/sec  0.004 ms  41442/253579 (16%)  
[  5]   8.00-9.00   sec  11.4 MBytes  95.2 Mbits/sec  0.004 ms  41518/254131 (16%)  
[  5]   9.00-10.00  sec  11.2 MBytes  94.3 Mbits/sec  0.006 ms  43580/254143 (17%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   135 MBytes   114 Mbits/sec  0.000 ms  0/0 (0%)  sender
[  5]   0.00-10.00  sec   112 MBytes  94.3 Mbits/sec  0.006 ms  428302/2534538 (17%)  receiver

iperf Done.
# ethtool -S eth1 | grep rx_buf_unav_irq
     rx_buf_unav_irq: 0

---- After patch, 22% packet loss on UDP 56 bytes packets ----------------------

# iperf3 -u -b 0 -l 56 -c 192.168.2.1 -R
Connecting to host 192.168.2.1, port 5201
Reverse mode, remote host 192.168.2.1 is sending
[  5] local 192.168.2.18 port 42121 connected to 192.168.2.1 port 5201
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  10.3 MBytes  85.8 Mbits/sec  0.004 ms  55146/247172 (22%)  
[  5]   1.00-2.00   sec  10.6 MBytes  89.1 Mbits/sec  0.003 ms  54699/253355 (22%)  
[  5]   2.00-3.00   sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  55231/253887 (22%)  
[  5]   3.00-4.00   sec  10.6 MBytes  88.9 Mbits/sec  0.003 ms  55138/253602 (22%)  
[  5]   4.00-5.00   sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  54938/253722 (22%)  
[  5]   5.00-6.00   sec  10.6 MBytes  88.9 Mbits/sec  0.003 ms  55273/253580 (22%)  
[  5]   6.00-7.00   sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  55202/253986 (22%)  
[  5]   7.00-8.00   sec  10.6 MBytes  89.1 Mbits/sec  0.003 ms  55047/253958 (22%)  
[  5]   8.00-9.00   sec  10.6 MBytes  88.9 Mbits/sec  0.003 ms  55612/254140 (22%)  
[  5]   9.00-10.00  sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  55683/254403 (22%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   135 MBytes   113 Mbits/sec  0.000 ms  0/0 (0%)  sender
[  5]   0.00-10.00  sec   106 MBytes  88.7 Mbits/sec  0.003 ms  551969/2531805 (22%)  receiver

iperf Done.
# ethtool -S eth1 | grep rx_buf_unav_irq
     rx_buf_unav_irq: 30624


So clearly there are pros and cons with this, but I don't want to fall into the
"let's not break microbenchmarks" pitfall.

I personnaly find the stat useful, and that having the stat visible to user
but stuck at 0 is misleading so,

Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>


Maxime

> 
>  drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> index af6580332d49..43b036d4e95b 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> @@ -99,6 +99,8 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
>  #define DMA_CHAN_INTR_ENA_NIE_4_10	BIT(15)
>  #define DMA_CHAN_INTR_ENA_AIE_4_10	BIT(14)
>  #define DMA_CHAN_INTR_ENA_FBE		BIT(12)
> +#define DMA_CHAN_INTR_ENA_RPS		BIT(8)
> +#define DMA_CHAN_INTR_ENA_RBU		BIT(7)
>  #define DMA_CHAN_INTR_ENA_RIE		BIT(6)
>  #define DMA_CHAN_INTR_ENA_TIE		BIT(0)
>  
> @@ -107,6 +109,8 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
>  					 DMA_CHAN_INTR_ENA_TIE)
>  
>  #define DMA_CHAN_INTR_ABNORMAL		(DMA_CHAN_INTR_ENA_AIE | \
> +					 DMA_CHAN_INTR_ENA_RPS | \
> +					 DMA_CHAN_INTR_ENA_RBU | \
>  					 DMA_CHAN_INTR_ENA_FBE)
>  /* DMA default interrupt mask for 4.00 */
>  #define DMA_CHAN_INTR_DEFAULT_MASK	(DMA_CHAN_INTR_NORMAL | \
> @@ -117,6 +121,8 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
>  					 DMA_CHAN_INTR_ENA_TIE)
>  
>  #define DMA_CHAN_INTR_ABNORMAL_4_10	(DMA_CHAN_INTR_ENA_AIE_4_10 | \
> +					 DMA_CHAN_INTR_ENA_RPS | \
> +					 DMA_CHAN_INTR_ENA_RBU | \
>  					 DMA_CHAN_INTR_ENA_FBE)
>  /* DMA default interrupt mask for 4.10a */
>  #define DMA_CHAN_INTR_DEFAULT_MASK_4_10	(DMA_CHAN_INTR_NORMAL_4_10 | \


^ permalink raw reply

* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Russell King (Oracle) @ 2026-04-12 14:23 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: Andrew Lunn, Alexandre Torgue, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
	netdev, Paolo Abeni, Sam Edwards
In-Reply-To: <266998d8-7e38-4bae-a4df-2f889538fe88@bootlin.com>

On Sun, Apr 12, 2026 at 04:01:59PM +0200, Maxime Chevallier wrote:
> Hi Russell,
> 
> On 10/04/2026 15:07, Russell King (Oracle) wrote:
> > Enable receive process stopped and receive buffer unavailable
> > interrupts, so that the statistic counters can be updated.
> > 
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> > Since we are seeing receive buffer exhaustion on several platforms,
> > let's enable the interrupts so the statistics we publish via ethtool -S
> > actually work to aid diagnosis. I've been in two minds about whether
> > to send this patch, but given the problems with stmmac at the moment,
> > I think it should be merged.
> 
> Looks like my reply to your original RFC was lost in limbo as the review/test tags are missing.

Thanks. Unfortunately, I can't run iperf3 against stmmac on the Jetson
NX because stmmac just totally screws itself (at the first RBU, the
receive side irrevocably collapses.)

Against i.MX6 (which is limited to around 480Mbps,) it's recoverable by
taking the interface down and back up a couple of times.

Against x86 (which will saturate the link) its pretty much
irrecoverable without entire system reboot - if one tries the down+up,
we then get arm-smmu errors because it seems that, despite stmmac being
reset, it still attempts to access a previous receive buffer from
before the down/up sometime after the up. Moreover, transmit stops
working - packets get queued but they are never processed by the
hardware. This is a scenario that I can only rarely test myself (as
it depends on my physical location.)

As the dwmac 5.0 core receive path seems to lock up after the first
RBU, I never see more than one of those at a time.

Right now, I consider this pretty much unsolvable - I've spent quite
some time looking at it and trying various approaches, nothing seems
to fix it. However, adding dma_rmb() in the descriptor cleanup/refill
paths does seem to improve the situation a little with the 480Mbps
case, because I think it means that we're reading the descriptors in
a more timely manner after the hardware has updated them.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply

* Re: [PATCH v4] net/mlx5: Fix OOB access and stack information leak in PTP event handling
From: Carolina Jubran @ 2026-04-12 14:25 UTC (permalink / raw)
  To: Prathamesh Deshpande, Saeed Mahameed, Leon Romanovsky
  Cc: Richard Cochran, Tariq Toukan, Mark Bloch, netdev, linux-rdma,
	linux-kernel
In-Reply-To: <20260412000418.8415-1-prathameshdeshpande7@gmail.com>


On 12/04/2026 3:04, Prathamesh Deshpande wrote:
> In mlx5_pps_event(), several critical issues were identified:
>
> 1. The 'pin' index from the hardware event was used without bounds
>     checking to index 'pin_config' and 'pps_info->start'. Check against
>     MAX_PIN_NUM to prevent out-of-bounds access.
> 2. 'ptp_event' was not zero-initialized, potentially leaking stack
>     memory through the union.
> 3. A NULL 'pin_config' could be dereferenced if initialization failed.
> 4. 'clock->ptp' could be NULL if ptp_clock_register() failed.
>
> Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section")
> Suggested-by: Carolina Jubran <cjubran@nvidia.com>
> Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>

^ permalink raw reply

* Re: [PATCH iwl-next 2/2] idpf: implement pci error handlers
From: Tantilov, Emil S @ 2026-04-12 14:38 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: intel-wired-lan, netdev, przemyslaw.kitszel, jay.bhat,
	ivan.d.barrera, aleksandr.loktionov, larysa.zaremba,
	anthony.l.nguyen, andrew+netdev, davem, edumazet, kuba, pabeni,
	aleksander.lobakin, linux-pci, madhu.chittim, decot, willemb,
	sheenamo
In-Reply-To: <adnfeAJHoFoaGYH7@wunner.de>



On 4/10/2026 10:43 PM, Lukas Wunner wrote:
> On Fri, Apr 10, 2026 at 05:39:59PM -0700, Emil Tantilov wrote:
>> +static pci_ers_result_t
>> +idpf_pci_err_slot_reset(struct pci_dev *pdev)
>> +{
>> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
>> +
>> +	pci_restore_state(pdev);
>> +	pci_set_master(pdev);
>> +	pci_wake_from_d3(pdev, false);
>> +	if (readl(adapter->reset_reg.rstat) != 0xFFFFFFFF) {
>> +		pci_save_state(pdev);
>> +		return PCI_ERS_RESULT_RECOVERED;
>> +	}
> 
> The pci_save_state() is no longer necessary here, please drop it.
> See commits a2f1e22390ac and 383d89699c50 for details.

Ah, the state_saved check was still there when I last checked and I
missed this change ... I will remove it in v2.

Thanks!
Emil

> 
> Thanks,
> 
> Lukas


^ permalink raw reply

* Re: [PATCH net] netrom: do some basic forms of validation on incoming frames
From: Jakub Kicinski @ 2026-04-12 14:41 UTC (permalink / raw)
  To: Craig
  Cc: hugh, Kuniyuki Iwashima, davem, edumazet, gregkh, horms,
	linux-hams, linux-kernel, netdev, pabeni, stable, workflows,
	yizhe
In-Reply-To: <761f83cc-58eb-4b4a-ba91-d11412e7b2a6@gmail.com>

On Fri, 10 Apr 2026 15:51:39 -0700 Craig wrote:
> If you're sure the Internet will never fail, I guess it makes 
> sense removing all of this since it's inconvenient to maintain.

I did a sweep thru the fixes on Friday, there are 4 more patches
for ham radio pending review. Go ahead and review those.

^ permalink raw reply

* Re: [PATCH v5 net-next 0/8] dpll/ice: Add TXC DPLL type and full TX reference clock control for E825
From: Jakub Kicinski @ 2026-04-12 14:50 UTC (permalink / raw)
  To: Nitka, Grzegorz
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org, Oros, Petr,
	richardcochran@gmail.com, andrew+netdev@lunn.ch,
	Kitszel, Przemyslaw, Nguyen, Anthony L,
	Prathosh.Satish@microchip.com, Vecera, Ivan, jiri@resnulli.us,
	Kubalewski, Arkadiusz, vadim.fedorenko@linux.dev,
	donald.hunter@gmail.com, horms@kernel.org, pabeni@redhat.com,
	davem@davemloft.net, edumazet@google.com
In-Reply-To: <IA1PR11MB62195E2AF6FE5B102B2EECD392272@IA1PR11MB6219.namprd11.prod.outlook.com>

On Sun, 12 Apr 2026 13:50:23 +0000 Nitka, Grzegorz wrote:
> > > Proposed solution just seems to be clean and fully reflects current
> > > connection topology.  
> > 
> > Do you have the link to the old proposal that was adding stuff to
> > rtnetlink? I remember some discussion long-ish ago, maybe I was wrong.
>
> This is the patch from the discussion I put the link in the cover letter:
> https://lore.kernel.org/netdev/20250828164345.116097-1-arkadiusz.kubalewski@intel.com/

Let's go back to something like that. But leave the OC info out of 
the XO, just ext-ref, dpll, xo? We can add the xo types later if really
needed. Sorry for the flip flop.

^ permalink raw reply

* Re: [PATCH 2/4] net: ionic: Add PHC state page for user space access
From: Jakub Kicinski @ 2026-04-12 14:52 UTC (permalink / raw)
  To: Allen Hubbe
  Cc: Abhijit Gangurde, jgg, leon, brett.creeley, andrew+netdev, davem,
	edumazet, pabeni, nikhil.agarwal, linux-rdma, netdev,
	linux-kernel
In-Reply-To: <69c189b7-9896-41cf-b28f-5ed4c7b80d6a@amd.com>

On Fri, 10 Apr 2026 19:44:18 -0400 Allen Hubbe wrote:
> On 4/10/2026 4:43 PM, Jakub Kicinski wrote:
> > On Fri, 10 Apr 2026 09:10:09 -0400 Allen Hubbe wrote:  
> >> The simple answer is just following the same approach as an existing
> >> implementation.  See struct mlx5_ib_clock_info and
> >> mlx5_update_clock_info_page().
> >>
> >> Making this common might risk presuming that other implementations will
> >> be a similar design.  Compare these to the sfc driver.  The clock is
> >> quite different from ionic and mlx5, not using timecounter, because
> >> instead of a free-running cycle counter the hardware itself provides an
> >> adjustable clock for timestamping.  
> > 
> > So your augment is basically that drivers which don't use sw timecounter
> > exist so we shouldn't bother creating common definitions for drivers
> > that do? Why do we have common implementation of timecounter in the
> > kernel at all then?
> > 
> > These are rhetorical questions.  
> 
> There is no suggestion to get rid of timecounter in the kernel.
> 
> Maybe I've been overthinking this and misunderstood your first reply. 
> Did you mean, just, why not move this to ib_user_verbs.h, struct 
> ib_uverbs_phc_state, and use it from the vendor driver?

I think so, just drop the ionic from the names and pop it into a header
that won't be awkward to reuse by other vendors.

^ permalink raw reply

* Re: [RFC net-next v5 0/3] Add RSS and LRO support
From: Jakub Kicinski @ 2026-04-12 14:54 UTC (permalink / raw)
  To: Frank Wunderlich
  Cc: linux, nbd, sean.wang, lorenzo, andrew+netdev, davem, edumazet,
	pabeni, matthias.bgg, angelogioacchino.delregno, linux, daniel,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek
In-Reply-To: <trinity-1304ecb5-9a3f-4e37-b2c6-bcd7f35c2921-1775995067670@trinity-msg-rest-gmx-gmx-live-579dcf886d-nkpds>

On Sun, 12 Apr 2026 11:57:47 +0000 Frank Wunderlich wrote:
> some time has passed without a single comment, so i just send a friendly reminder ;)

You have a lot of people in the To:
Could you clarify who you expect to action these patches?
Patches are an RFC and I suppose ain't nobody got much comments?

^ permalink raw reply

* Re: [PATCH v2 net-next 00/15] ip6mr: No RTNL for RTNL_FAMILY_IP6MR rtnetlink.
From: Jakub Kicinski @ 2026-04-12 14:58 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S . Miller, David Ahern, Eric Dumazet, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, netdev
In-Reply-To: <20260410211726.1668756-1-kuniyu@google.com>

On Fri, 10 Apr 2026 21:16:56 +0000 Kuniyuki Iwashima wrote:
> This series is the IPv6 version of
> 
>   https://lore.kernel.org/netdev/20260228221800.1082070-1-kuniyu@google.com/
> 
> and removes RTNL from ip6mr rtnetlink handlers.
> 
> After this series, there are a few RTNL left in net/ipv6/ipmr.c
> and such users will be converted to per-netns RTNL in another
> series.
> 
> Patch 1 extends the ipmr selftest to exercise most of the RTNL
>  paths in net/ipv6/ipmr.c
> 
> Patch 2 - 6 converts RTM_GETROUTE handlers to RCU.
> 
> Patch 7 removes struct fib_dump_filter.rtnl_held.
> 
> Patch 8 - 9 use RCU for mr_table for CONFIG_IP_MROUTE_MULTIPLE_TABLES=n
>  and CONFIG_IPV6_MROUTE_MULTIPLE_TABLES=n for ->exit_rtnl().
> 
> Patch 10 - 12 converts ->exit_batch() to ->exit_rtnl() to
>  save one RTNL in cleanup_net().
> 
> Patch 13 - 14 removes unnecessary RTNL during setup_net()
>  failure.
> 
> Patch 15 drops RTNL for MRT6_(ADD|DEL)_MFC(_PROXY)?.

Hitting a bunch of:

  SKIP      no netlink MFC interface

on the new test here. Do we need to add something to .../config ?
Current effective config for the net target is here:
https://netdev-ctrl.bots.linux.dev/logs/vmksft/forwarding/results/597761/config

^ permalink raw reply

* Re: [PATCH v2 0/6] bus: mhi: host: mhi_phc: Add support for PHC over MHI
From: Jakub Kicinski @ 2026-04-12 15:09 UTC (permalink / raw)
  To: Krishna Chaitanya Chundru
  Cc: Manivannan Sadhasivam, Richard Cochran, mhi, linux-arm-msm,
	linux-kernel, netdev, Vivek Pernamitta, Sivareddy Surasani,
	Vivek Pernamitta, Imran Shaik, Taniya Das
In-Reply-To: <20260411-tsc_timesync-v2-0-6f25f72987b3@oss.qualcomm.com>

On Sat, 11 Apr 2026 13:42:00 +0530 Krishna Chaitanya Chundru wrote:
> - User space applications use the standard Linux PTP interface.
> - The PTP subsystem routes IOCTLs to the MHI PHC driver.
> - The MHI PHC driver communicates with the MHI core to fetch timestamps.
> - The MHI core interacts with the device to retrieve accurate time data.

Nack, stop adding functionality under the mhi "bus".
Bus is supposed to be an abstraction into which real drivers plug in.

^ permalink raw reply

* [PATTCH net v6 0/5] netem bug fixes
From: Stephen Hemminger @ 2026-04-12 15:23 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger

These bugs were found when doing AI assisted  review of sch_netem.c
during investigation of the packet duplication recursion problem
addressed in Jamal's series.

The fixes cover:

 - probability gaps in the 4-state Markov loss model
 - queue limit not accounting for reordered packets
 - PRNG reseeded on every tc change, breaking reproducibility
 - slot delay configuration not validated for inverted ranges
 - slot delay arithmetic overflow for ranges above ~2.1 seconds

v6 - drop 3 unnecessary patches; focus on real bugs.

Stephen Hemminger (5):
  net/sched: netem: fix probability gaps in 4-state loss model
  net/sched: netem: fix queue limit check to include reordered packets
  net/sched: netem: only reseed PRNG when seed is explicitly provided
  net/sched: netem: check for invalid slot range
  net/sched: netem: fix slot delay calculation overflow

 net/sched/sch_netem.c | 43 +++++++++++++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 12 deletions(-)

-- 
2.53.0


^ permalink raw reply

* [PATTCH net v6 1/5] net/sched: netem: fix probability gaps in 4-state loss model
From: Stephen Hemminger @ 2026-04-12 15:23 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list
In-Reply-To: <20260412152429.143474-1-stephen@networkplumber.org>

The 4-state Markov chain in loss_4state() has gaps at the boundaries
between transition probability ranges. The comparisons use:

  if (rnd < a4)
  else if (a4 < rnd && rnd < a1 + a4)

When rnd equals a boundary value exactly, neither branch matches and
no state transition occurs. The redundant lower-bound check (a4 < rnd)
is already implied by being in the else branch.

Remove the unnecessary lower-bound comparisons so the ranges are
contiguous and every random value produces a transition, matching
the GI (General and Intuitive) loss model specification.

This bug goes back to original implementation of this model.

Fixes: 661b79725fea ("netem: revised correlated loss generator")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 20df1c08b1e9..8ee72cac1faf 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -227,10 +227,10 @@ static bool loss_4state(struct netem_sched_data *q)
 		if (rnd < clg->a4) {
 			clg->state = LOST_IN_GAP_PERIOD;
 			return true;
-		} else if (clg->a4 < rnd && rnd < clg->a1 + clg->a4) {
+		} else if (rnd < clg->a1 + clg->a4) {
 			clg->state = LOST_IN_BURST_PERIOD;
 			return true;
-		} else if (clg->a1 + clg->a4 < rnd) {
+		} else {
 			clg->state = TX_IN_GAP_PERIOD;
 		}
 
@@ -247,9 +247,9 @@ static bool loss_4state(struct netem_sched_data *q)
 	case LOST_IN_BURST_PERIOD:
 		if (rnd < clg->a3)
 			clg->state = TX_IN_BURST_PERIOD;
-		else if (clg->a3 < rnd && rnd < clg->a2 + clg->a3) {
+		else if (rnd < clg->a2 + clg->a3) {
 			clg->state = TX_IN_GAP_PERIOD;
-		} else if (clg->a2 + clg->a3 < rnd) {
+		} else {
 			clg->state = LOST_IN_BURST_PERIOD;
 			return true;
 		}
-- 
2.53.0


^ permalink raw reply related

* [PATTCH net v6 2/5] net/sched: netem: fix queue limit check to include reordered packets
From: Stephen Hemminger @ 2026-04-12 15:23 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	open list
In-Reply-To: <20260412152429.143474-1-stephen@networkplumber.org>

The queue limit check in netem_enqueue() uses q->t_len which only
counts packets in the internal tfifo. Packets placed in sch->q by
the reorder path (__qdisc_enqueue_head) are not counted, allowing
the total queue occupancy to exceed sch->limit under reordering.

Include sch->q.qlen in the limit check.

Fixes: 50612537e9ab ("netem: fix classful handling")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 8ee72cac1faf..d400a730eadd 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -524,7 +524,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 				1 << get_random_u32_below(8);
 	}
 
-	if (unlikely(q->t_len >= sch->limit)) {
+	if (unlikely(sch->q.qlen >= sch->limit)) {
 		/* re-link segs, so that qdisc_drop_all() frees them all */
 		skb->next = segs;
 		qdisc_drop_all(skb, sch, to_free);
-- 
2.53.0


^ permalink raw reply related

* [PATTCH net v6 3/5] net/sched: netem: only reseed PRNG when seed is explicitly provided
From: Stephen Hemminger @ 2026-04-12 15:23 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	François Michel, open list
In-Reply-To: <20260412152429.143474-1-stephen@networkplumber.org>

netem_change() unconditionally reseeds the PRNG on every tc change
command. If TCA_NETEM_PRNG_SEED is not specified, a new random seed
is generated, destroying reproducibility for users who set a
deterministic seed on a previous change.

Move the initial random seed generation to netem_init() and only
reseed in netem_change() when TCA_NETEM_PRNG_SEED is explicitly
provided by the user.

Fixes: 4072d97ddc44 ("netem: add prng attribute to netem_sched_data")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index d400a730eadd..556f9747f0e7 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -1112,11 +1112,10 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
 	/* capping jitter to the range acceptable by tabledist() */
 	q->jitter = min_t(s64, abs(q->jitter), INT_MAX);
 
-	if (tb[TCA_NETEM_PRNG_SEED])
+	if (tb[TCA_NETEM_PRNG_SEED]) {
 		q->prng.seed = nla_get_u64(tb[TCA_NETEM_PRNG_SEED]);
-	else
-		q->prng.seed = get_random_u64();
-	prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+		prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+	}
 
 unlock:
 	sch_tree_unlock(sch);
@@ -1139,6 +1138,9 @@ static int netem_init(struct Qdisc *sch, struct nlattr *opt,
 		return -EINVAL;
 
 	q->loss_model = CLG_RANDOM;
+	q->prng.seed = get_random_u64();
+	prandom_seed_state(&q->prng.prng_state, q->prng.seed);
+
 	ret = netem_change(sch, opt, extack);
 	if (ret)
 		pr_info("netem: change failed\n");
-- 
2.53.0


^ permalink raw reply related

* [PATTCH net v6 4/5] net/sched: netem: check for invalid slot range
From: Stephen Hemminger @ 2026-04-12 15:23 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Neal Cardwell, Yousuk Seung, open list
In-Reply-To: <20260412152429.143474-1-stephen@networkplumber.org>

Reject slot configuration where min_delay exceeds max_delay.
The delay range computation in get_slot_next() underflows in
this case, producing bogus results.

Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 556f9747f0e7..16368948bde9 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -827,6 +827,18 @@ static int get_dist_table(struct disttable **tbl, const struct nlattr *attr)
 	return 0;
 }
 
+static int validate_slot(const struct nlattr *attr,
+			 struct netlink_ext_ack *extack)
+{
+	const struct tc_netem_slot *c = nla_data(attr);
+
+	if (c->min_delay > c->max_delay) {
+		NL_SET_ERR_MSG(extack, "slot min delay greater than max delay");
+		return -EINVAL;
+	}
+	return 0;
+}
+
 static void get_slot(struct netem_sched_data *q, const struct nlattr *attr)
 {
 	const struct tc_netem_slot *c = nla_data(attr);
@@ -1040,6 +1052,12 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
 			goto table_free;
 	}
 
+	if (tb[TCA_NETEM_SLOT]) {
+		ret = validate_slot(tb[TCA_NETEM_SLOT], extack);
+		if (ret)
+			goto table_free;
+	}
+
 	sch_tree_lock(sch);
 	/* backup q->clg and q->loss_model */
 	old_clg = q->clg;
-- 
2.53.0


^ permalink raw reply related

* [PATTCH net v6 5/5] net/sched: netem: fix slot delay calculation overflow
From: Stephen Hemminger @ 2026-04-12 15:23 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, Simon Horman, Jamal Hadi Salim, Jiri Pirko,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Yousuk Seung, Neal Cardwell, open list
In-Reply-To: <20260412152429.143474-1-stephen@networkplumber.org>

get_slot_next() computes a random delay between min_delay and
max_delay using:

  get_random_u32() * (max_delay - min_delay) >> 32

This overflows signed 64-bit arithmetic when the delay range exceeds
approximately 2.1 seconds (2^31 nanoseconds), producing a negative
result that effectively disables slot-based pacing. This is a
realistic configuration for WAN emulation (e.g., slot 1s 5s).

Use mul_u64_u32_shr() which handles the widening multiply without
overflow.

Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
---
 net/sched/sch_netem.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 16368948bde9..452ad2b483a5 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -659,9 +659,8 @@ static void get_slot_next(struct netem_sched_data *q, u64 now)
 
 	if (!q->slot_dist)
 		next_delay = q->slot_config.min_delay +
-				(get_random_u32() *
-				 (q->slot_config.max_delay -
-				  q->slot_config.min_delay) >> 32);
+			mul_u64_u32_shr(q->slot_config.max_delay - q->slot_config.min_delay,
+					get_random_u32(), 32);
 	else
 		next_delay = tabledist(q->slot_config.dist_delay,
 				       (s32)(q->slot_config.dist_jitter),
-- 
2.53.0


^ permalink raw reply related

* Re: [PATCH net v2] NFC: digital: bound SENSF response copy into nfc_target
From: Jakub Kicinski @ 2026-04-12 15:35 UTC (permalink / raw)
  To: Pengpeng Hou
  Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
	Kees Cook, linux-kernel
In-Reply-To: <20260407120004.4-nfc-sensf-v2-pengpeng@iscas.ac.cn>

On Tue, 7 Apr 2026 09:57:36 +0800 Pengpeng Hou wrote:
> digital_in_recv_sensf_res() copies the received SENSF response into
> struct nfc_target without bounding the copy to target.sensf_res. A full
> on-wire digital_sensf_res is 19 bytes long, while nfc_target stores 18
> bytes, so oversized or full-length frames can overwrite adjacent stack
> fields before digital_target_found() sees the target.
> 
> Reject payloads larger than struct digital_sensf_res and clamp the copy
> into target.sensf_res so valid 19-byte responses keep working while the
> fixed destination buffer stays bounded.

You need to solve the riddle why this driver thinks the response is 19
bytes but the core wants to store only 18...

> Fixes: 8c0695e4998dd268ff2a05951961247b7e015651 ("NFC Digital: Add NFC-F technology support")

nit: the hash in the Fixes tag should be only 12 chars
-- 
pw-bot: cr

^ permalink raw reply

* [PATCH net] slip: reject VJ frames when no receive slots are allocated
From: Weiming Shi @ 2026-04-12 15:42 UTC (permalink / raw)
  To: Andrew Lunn, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Kees Cook, netdev, Xiang Mei, Weiming Shi

slhc_init() allows rslots == 0 and in that case skips the allocation
of comp->rstate, leaving it NULL. Because the struct is zero-initialized
by kzalloc, comp->rslot_limit is also 0.

The receive-side entry points slhc_uncompress() and slhc_remember()
only compare a packet's slot index against rslot_limit, so slot 0
passes the bounds check even though no receive state array exists.
Any VJ-compressed or VJ-uncompressed packet that selects slot 0 then
dereferences the NULL rstate pointer.

This can be reached through PPP by issuing PPPIOCSMAXCID with a value
whose upper 16 bits, after arithmetic right shift, yield -1, making
val2 + 1 == 0 and thus rslots == 0.

 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
 KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
 RIP: 0010:slhc_uncompress (drivers/net/slip/slhc.c:519)
 Call Trace:
  <TASK>
  ppp_receive_nonmp_frame (drivers/net/ppp/ppp_generic.c:2466)
  ppp_input (drivers/net/ppp/ppp_generic.c:2359)
  ppp_async_process (drivers/net/ppp/ppp_async.c:492)
  tasklet_action_common (kernel/softirq.c:926)
  handle_softirqs (kernel/softirq.c:623)
  run_ksoftirqd (kernel/softirq.c:1055)
  smpboot_thread_fn (kernel/smpboot.c:160)
  kthread (kernel/kthread.c:436)
  ret_from_fork (arch/x86/kernel/process.c:164)
  </TASK>

Add a NULL check on comp->rstate at the entry of slhc_uncompress() and
slhc_remember() so that frames are rejected when no receive slots exist.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
---
 drivers/net/slip/slhc.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/slip/slhc.c b/drivers/net/slip/slhc.c
index e3c785da3eef..e67052bcab57 100644
--- a/drivers/net/slip/slhc.c
+++ b/drivers/net/slip/slhc.c
@@ -502,6 +502,10 @@ slhc_uncompress(struct slcompress *comp, unsigned char *icp, int isize)
 
 	/* We've got a compressed packet; read the change byte */
 	comp->sls_i_compressed++;
+	if (!comp->rstate) {
+		comp->sls_i_error++;
+		return 0;
+	}
 	if(isize < 3){
 		comp->sls_i_error++;
 		return 0;
@@ -651,8 +655,9 @@ slhc_remember(struct slcompress *comp, unsigned char *icp, int isize)
 
 	/* The packet is shorter than a legal IP header.
 	 * Also make sure isize is positive.
+	 * Reject if no receive slots are configured (rstate is NULL).
 	 */
-	if (isize < (int)sizeof(struct iphdr)) {
+	if (!comp->rstate || isize < (int)sizeof(struct iphdr)) {
 runt:
 		comp->sls_i_runt++;
 		return slhc_toss(comp);
-- 
2.43.0


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox