Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH iwl-next 2/2] idpf: implement pci error handlers
From: Tantilov, Emil S @ 2026-04-12 14:38 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: intel-wired-lan, netdev, przemyslaw.kitszel, jay.bhat,
	ivan.d.barrera, aleksandr.loktionov, larysa.zaremba,
	anthony.l.nguyen, andrew+netdev, davem, edumazet, kuba, pabeni,
	aleksander.lobakin, linux-pci, madhu.chittim, decot, willemb,
	sheenamo
In-Reply-To: <adnfeAJHoFoaGYH7@wunner.de>



On 4/10/2026 10:43 PM, Lukas Wunner wrote:
> On Fri, Apr 10, 2026 at 05:39:59PM -0700, Emil Tantilov wrote:
>> +static pci_ers_result_t
>> +idpf_pci_err_slot_reset(struct pci_dev *pdev)
>> +{
>> +	struct idpf_adapter *adapter = pci_get_drvdata(pdev);
>> +
>> +	pci_restore_state(pdev);
>> +	pci_set_master(pdev);
>> +	pci_wake_from_d3(pdev, false);
>> +	if (readl(adapter->reset_reg.rstat) != 0xFFFFFFFF) {
>> +		pci_save_state(pdev);
>> +		return PCI_ERS_RESULT_RECOVERED;
>> +	}
> 
> The pci_save_state() is no longer necessary here, please drop it.
> See commits a2f1e22390ac and 383d89699c50 for details.

Ah, the state_saved check was still there when I last checked and I
missed this change ... I will remove it in v2.

Thanks!
Emil

> 
> Thanks,
> 
> Lukas


^ permalink raw reply

* Re: [PATCH v4] net/mlx5: Fix OOB access and stack information leak in PTP event handling
From: Carolina Jubran @ 2026-04-12 14:25 UTC (permalink / raw)
  To: Prathamesh Deshpande, Saeed Mahameed, Leon Romanovsky
  Cc: Richard Cochran, Tariq Toukan, Mark Bloch, netdev, linux-rdma,
	linux-kernel
In-Reply-To: <20260412000418.8415-1-prathameshdeshpande7@gmail.com>


On 12/04/2026 3:04, Prathamesh Deshpande wrote:
> In mlx5_pps_event(), several critical issues were identified:
>
> 1. The 'pin' index from the hardware event was used without bounds
>     checking to index 'pin_config' and 'pps_info->start'. Check against
>     MAX_PIN_NUM to prevent out-of-bounds access.
> 2. 'ptp_event' was not zero-initialized, potentially leaking stack
>     memory through the union.
> 3. A NULL 'pin_config' could be dereferenced if initialization failed.
> 4. 'clock->ptp' could be NULL if ptp_clock_register() failed.
>
> Fixes: 7c39afb394c7 ("net/mlx5: PTP code migration to driver core section")
> Suggested-by: Carolina Jubran <cjubran@nvidia.com>
> Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>

^ permalink raw reply

* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Russell King (Oracle) @ 2026-04-12 14:23 UTC (permalink / raw)
  To: Maxime Chevallier
  Cc: Andrew Lunn, Alexandre Torgue, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
	netdev, Paolo Abeni, Sam Edwards
In-Reply-To: <266998d8-7e38-4bae-a4df-2f889538fe88@bootlin.com>

On Sun, Apr 12, 2026 at 04:01:59PM +0200, Maxime Chevallier wrote:
> Hi Russell,
> 
> On 10/04/2026 15:07, Russell King (Oracle) wrote:
> > Enable receive process stopped and receive buffer unavailable
> > interrupts, so that the statistic counters can be updated.
> > 
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> > Since we are seeing receive buffer exhaustion on several platforms,
> > let's enable the interrupts so the statistics we publish via ethtool -S
> > actually work to aid diagnosis. I've been in two minds about whether
> > to send this patch, but given the problems with stmmac at the moment,
> > I think it should be merged.
> 
> Looks like my reply to your original RFC was lost in limbo as the review/test tags are missing.

Thanks. Unfortunately, I can't run iperf3 against stmmac on the Jetson
NX because stmmac just totally screws itself (at the first RBU, the
receive side irrevocably collapses.)

Against i.MX6 (which is limited to around 480Mbps,) it's recoverable by
taking the interface down and back up a couple of times.

Against x86 (which will saturate the link) its pretty much
irrecoverable without entire system reboot - if one tries the down+up,
we then get arm-smmu errors because it seems that, despite stmmac being
reset, it still attempts to access a previous receive buffer from
before the down/up sometime after the up. Moreover, transmit stops
working - packets get queued but they are never processed by the
hardware. This is a scenario that I can only rarely test myself (as
it depends on my physical location.)

As the dwmac 5.0 core receive path seems to lock up after the first
RBU, I never see more than one of those at a time.

Right now, I consider this pretty much unsolvable - I've spent quite
some time looking at it and trying various approaches, nothing seems
to fix it. However, adding dma_rmb() in the descriptor cleanup/refill
paths does seem to improve the situation a little with the 480Mbps
case, because I think it means that we're reading the descriptors in
a more timely manner after the hardware has updated them.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply

* Re: [PATCH net-next] net: stmmac: enable RPS and RBU interrupts
From: Maxime Chevallier @ 2026-04-12 14:01 UTC (permalink / raw)
  To: Russell King (Oracle), Andrew Lunn
  Cc: Alexandre Torgue, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, linux-arm-kernel, linux-stm32, netdev,
	Paolo Abeni, Sam Edwards
In-Reply-To: <E1wBBaR-0000000GZHR-1dbM@rmk-PC.armlinux.org.uk>

Hi Russell,

On 10/04/2026 15:07, Russell King (Oracle) wrote:
> Enable receive process stopped and receive buffer unavailable
> interrupts, so that the statistic counters can be updated.
> 
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> Since we are seeing receive buffer exhaustion on several platforms,
> let's enable the interrupts so the statistics we publish via ethtool -S
> actually work to aid diagnosis. I've been in two minds about whether
> to send this patch, but given the problems with stmmac at the moment,
> I think it should be merged.

Looks like my reply to your original RFC was lost in limbo as the review/test tags are missing.

Here's my original answer :


It works, I can indeed see the stats get properly updated on imx8mp 🙂

There's one downside to it though, which is that as soon as we hit a situation
where we don't have RX bufs available, this patchs has a tendancy to make things
worse as we'll trigger interrupts for each packet we receive and that we can't
process, making it even longer for queues to be refilled.

It shows on iperf3 with small packets :

---- Before patch, 17% packet loss on UDP 56 bytes packets -----------------

# iperf3 -u -b 0 -l 56 -c 192.168.2.1 -R
Connecting to host 192.168.2.1, port 5201
Reverse mode, remote host 192.168.2.1 is sending
[  5] local 192.168.2.18 port 47851 connected to 192.168.2.1 port 5201
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  10.7 MBytes  90.0 Mbits/sec  0.003 ms  48550/249650 (19%)  
[  5]   1.00-2.00   sec  11.3 MBytes  95.0 Mbits/sec  0.003 ms  41881/253832 (16%)  
[  5]   2.00-3.00   sec  11.3 MBytes  94.9 Mbits/sec  0.002 ms  42060/253913 (17%)  
[  5]   3.00-4.00   sec  11.3 MBytes  95.1 Mbits/sec  0.003 ms  41499/253785 (16%)  
[  5]   4.00-5.00   sec  11.3 MBytes  94.6 Mbits/sec  0.003 ms  42663/253787 (17%)  
[  5]   5.00-6.00   sec  11.3 MBytes  94.9 Mbits/sec  0.006 ms  41976/253719 (17%)  
[  5]   6.00-7.00   sec  11.3 MBytes  94.5 Mbits/sec  0.003 ms  43133/253999 (17%)  
[  5]   7.00-8.00   sec  11.3 MBytes  95.0 Mbits/sec  0.004 ms  41442/253579 (16%)  
[  5]   8.00-9.00   sec  11.4 MBytes  95.2 Mbits/sec  0.004 ms  41518/254131 (16%)  
[  5]   9.00-10.00  sec  11.2 MBytes  94.3 Mbits/sec  0.006 ms  43580/254143 (17%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   135 MBytes   114 Mbits/sec  0.000 ms  0/0 (0%)  sender
[  5]   0.00-10.00  sec   112 MBytes  94.3 Mbits/sec  0.006 ms  428302/2534538 (17%)  receiver

iperf Done.
# ethtool -S eth1 | grep rx_buf_unav_irq
     rx_buf_unav_irq: 0

---- After patch, 22% packet loss on UDP 56 bytes packets ----------------------

# iperf3 -u -b 0 -l 56 -c 192.168.2.1 -R
Connecting to host 192.168.2.1, port 5201
Reverse mode, remote host 192.168.2.1 is sending
[  5] local 192.168.2.18 port 42121 connected to 192.168.2.1 port 5201
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  10.3 MBytes  85.8 Mbits/sec  0.004 ms  55146/247172 (22%)  
[  5]   1.00-2.00   sec  10.6 MBytes  89.1 Mbits/sec  0.003 ms  54699/253355 (22%)  
[  5]   2.00-3.00   sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  55231/253887 (22%)  
[  5]   3.00-4.00   sec  10.6 MBytes  88.9 Mbits/sec  0.003 ms  55138/253602 (22%)  
[  5]   4.00-5.00   sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  54938/253722 (22%)  
[  5]   5.00-6.00   sec  10.6 MBytes  88.9 Mbits/sec  0.003 ms  55273/253580 (22%)  
[  5]   6.00-7.00   sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  55202/253986 (22%)  
[  5]   7.00-8.00   sec  10.6 MBytes  89.1 Mbits/sec  0.003 ms  55047/253958 (22%)  
[  5]   8.00-9.00   sec  10.6 MBytes  88.9 Mbits/sec  0.003 ms  55612/254140 (22%)  
[  5]   9.00-10.00  sec  10.6 MBytes  89.0 Mbits/sec  0.003 ms  55683/254403 (22%)  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   135 MBytes   113 Mbits/sec  0.000 ms  0/0 (0%)  sender
[  5]   0.00-10.00  sec   106 MBytes  88.7 Mbits/sec  0.003 ms  551969/2531805 (22%)  receiver

iperf Done.
# ethtool -S eth1 | grep rx_buf_unav_irq
     rx_buf_unav_irq: 30624


So clearly there are pros and cons with this, but I don't want to fall into the
"let's not break microbenchmarks" pitfall.

I personnaly find the stat useful, and that having the stat visible to user
but stuck at 0 is misleading so,

Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>


Maxime

> 
>  drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> index af6580332d49..43b036d4e95b 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
> @@ -99,6 +99,8 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
>  #define DMA_CHAN_INTR_ENA_NIE_4_10	BIT(15)
>  #define DMA_CHAN_INTR_ENA_AIE_4_10	BIT(14)
>  #define DMA_CHAN_INTR_ENA_FBE		BIT(12)
> +#define DMA_CHAN_INTR_ENA_RPS		BIT(8)
> +#define DMA_CHAN_INTR_ENA_RBU		BIT(7)
>  #define DMA_CHAN_INTR_ENA_RIE		BIT(6)
>  #define DMA_CHAN_INTR_ENA_TIE		BIT(0)
>  
> @@ -107,6 +109,8 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
>  					 DMA_CHAN_INTR_ENA_TIE)
>  
>  #define DMA_CHAN_INTR_ABNORMAL		(DMA_CHAN_INTR_ENA_AIE | \
> +					 DMA_CHAN_INTR_ENA_RPS | \
> +					 DMA_CHAN_INTR_ENA_RBU | \
>  					 DMA_CHAN_INTR_ENA_FBE)
>  /* DMA default interrupt mask for 4.00 */
>  #define DMA_CHAN_INTR_DEFAULT_MASK	(DMA_CHAN_INTR_NORMAL | \
> @@ -117,6 +121,8 @@ static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
>  					 DMA_CHAN_INTR_ENA_TIE)
>  
>  #define DMA_CHAN_INTR_ABNORMAL_4_10	(DMA_CHAN_INTR_ENA_AIE_4_10 | \
> +					 DMA_CHAN_INTR_ENA_RPS | \
> +					 DMA_CHAN_INTR_ENA_RBU | \
>  					 DMA_CHAN_INTR_ENA_FBE)
>  /* DMA default interrupt mask for 4.10a */
>  #define DMA_CHAN_INTR_DEFAULT_MASK_4_10	(DMA_CHAN_INTR_NORMAL_4_10 | \


^ permalink raw reply

* [PATCH net-next v3 5/5] net: phy: Move phy_init_hw() from phy_resume() to __phy_resume()
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

Now that redundant locking has been removed from PHY driver callbacks,
phy_init_hw() can be called with phydev->lock held.

Many MAC drivers and the phylink framework resume the PHY via
phy_start(), which invokes __phy_resume() directly without going
through phy_resume(). Keeping phy_init_hw() in phy_resume() means it
is not called in this path.

Move phy_init_hw() into __phy_resume() so that PHY soft reset and
re-initialisation happen unconditionally on every resume, regardless
of which code path triggers it.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v3:
 * New patch.
---
 drivers/net/phy/phy_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 4a2b19d39373..16fc2fc63c50 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1999,6 +1999,10 @@ int __phy_resume(struct phy_device *phydev)
 
 	lockdep_assert_held(&phydev->lock);
 
+	ret = phy_init_hw(phydev);
+	if (ret)
+		return ret;
+
 	if (!phydrv || !phydrv->resume)
 		return 0;
 
@@ -2014,10 +2018,6 @@ int phy_resume(struct phy_device *phydev)
 {
 	int ret;
 
-	ret = phy_init_hw(phydev);
-	if (ret)
-		return ret;
-
 	mutex_lock(&phydev->lock);
 	ret = __phy_resume(phydev);
 	mutex_unlock(&phydev->lock);
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 4/5] net: phy: microchip_t1: Replace phydev->lock with mdio_lock in lan937x_dsp_workaround()
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Arun Ramadoss, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, UNGLinuxDriver, Russell King, netdev, linux-kernel,
	Geert Uytterhoeven, Prabhakar Mahadev Lad, Biju Das,
	linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

lan937x_dsp_workaround() performs raw MDIO bus accesses and must
therefore hold mdio_lock rather than phydev->lock. Using phydev->lock
here is incorrect as it does not serialise access to the MDIO bus.

Replace phydev->lock with bus->mdio_lock, and switch the phy_read()/
phy_write() calls to their unlocked __phy_read()/__phy_write()
variants since mdio_lock is now held explicitly.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v3:
 * New patch.
---
 drivers/net/phy/microchip_t1.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/phy/microchip_t1.c b/drivers/net/phy/microchip_t1.c
index 62b36a318100..afb4e8908b78 100644
--- a/drivers/net/phy/microchip_t1.c
+++ b/drivers/net/phy/microchip_t1.c
@@ -337,9 +337,9 @@ static int lan937x_dsp_workaround(struct phy_device *phydev, u16 ereg, u8 bank)
 	int rc = 0;
 	u16 val;
 
-	mutex_lock(&phydev->lock);
+	mutex_lock(&phydev->mdio.bus->mdio_lock);
 	/* Read previous selected bank */
-	rc = phy_read(phydev, LAN87XX_EXT_REG_CTL);
+	rc = __phy_read(phydev, LAN87XX_EXT_REG_CTL);
 	if (rc < 0)
 		goto out_unlock;
 
@@ -353,11 +353,11 @@ static int lan937x_dsp_workaround(struct phy_device *phydev, u16 ereg, u8 bank)
 		val |= LAN87XX_EXT_REG_CTL_RD_CTL;
 
 		/* access twice for DSP bank change,dummy access */
-		rc = phy_write(phydev, LAN87XX_EXT_REG_CTL, val);
+		rc = __phy_write(phydev, LAN87XX_EXT_REG_CTL, val);
 	}
 
 out_unlock:
-	mutex_unlock(&phydev->lock);
+	mutex_unlock(&phydev->mdio.bus->mdio_lock);
 
 	return rc;
 }
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 3/5] net: phy: mscc: Drop unnecessary phydev->lock
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, Lad Prabhakar, Horatiu Vultur,
	Vladimir Oltean, netdev, linux-kernel, Geert Uytterhoeven,
	Biju Das, linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

Remove manual mutex_lock/unlock(&phydev->lock) calls from several
functions in the MSCC PHY driver.

In vsc85xx_edge_rate_cntl_set(), phydev->lock is taken around a single
phy_modify_paged() call. phy_modify_paged() is already a fully locked
atomic operation that acquires the MDIO bus lock internally, so the
additional phydev->lock is unnecessary.

The remaining three functions — vsc85xx_mac_if_set(),
vsc8531_pre_init_seq_set(), and vsc85xx_eee_init_seq_set() — use
phy_read(), phy_write(), phy_select_page(), and phy_restore_page(),
all of which operate under the MDIO bus lock. Taking phydev->lock
around them provides no additional serialisation.

Along with dropping the locks, error-path labels are renamed from
out_unlock to err or restore_oldpage to better reflect their purpose.
In vsc8531_pre_init_seq_set() and vsc85xx_eee_init_seq_set(), the
redundant intermediate assignment of oldpage before returning is also
eliminated.

No functional change intended.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v2->v3:
 * Added the patch into series
 * Updated commit description.
v2:
 * New patch
---
 drivers/net/phy/mscc/mscc_main.c | 41 ++++++++++----------------------
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/drivers/net/phy/mscc/mscc_main.c b/drivers/net/phy/mscc/mscc_main.c
index 2b9fb8a675a6..75430f55acfd 100644
--- a/drivers/net/phy/mscc/mscc_main.c
+++ b/drivers/net/phy/mscc/mscc_main.c
@@ -486,15 +486,9 @@ static int vsc85xx_dt_led_modes_get(struct phy_device *phydev,
 
 static int vsc85xx_edge_rate_cntl_set(struct phy_device *phydev, u8 edge_rate)
 {
-	int rc;
-
-	mutex_lock(&phydev->lock);
-	rc = phy_modify_paged(phydev, MSCC_PHY_PAGE_EXTENDED_2,
-			      MSCC_PHY_WOL_MAC_CONTROL, EDGE_RATE_CNTL_MASK,
-			      edge_rate << EDGE_RATE_CNTL_POS);
-	mutex_unlock(&phydev->lock);
-
-	return rc;
+	return phy_modify_paged(phydev, MSCC_PHY_PAGE_EXTENDED_2,
+				MSCC_PHY_WOL_MAC_CONTROL, EDGE_RATE_CNTL_MASK,
+				edge_rate << EDGE_RATE_CNTL_POS);
 }
 
 static int vsc85xx_mac_if_set(struct phy_device *phydev,
@@ -503,7 +497,6 @@ static int vsc85xx_mac_if_set(struct phy_device *phydev,
 	int rc;
 	u16 reg_val;
 
-	mutex_lock(&phydev->lock);
 	reg_val = phy_read(phydev, MSCC_PHY_EXT_PHY_CNTL_1);
 	reg_val &= ~(MAC_IF_SELECTION_MASK);
 	switch (interface) {
@@ -522,17 +515,15 @@ static int vsc85xx_mac_if_set(struct phy_device *phydev,
 		break;
 	default:
 		rc = -EINVAL;
-		goto out_unlock;
+		goto err;
 	}
 	rc = phy_write(phydev, MSCC_PHY_EXT_PHY_CNTL_1, reg_val);
 	if (rc)
-		goto out_unlock;
+		goto err;
 
 	rc = genphy_soft_reset(phydev);
 
-out_unlock:
-	mutex_unlock(&phydev->lock);
-
+err:
 	return rc;
 }
 
@@ -668,19 +659,15 @@ static int vsc8531_pre_init_seq_set(struct phy_device *phydev)
 	if (rc < 0)
 		return rc;
 
-	mutex_lock(&phydev->lock);
 	oldpage = phy_select_page(phydev, MSCC_PHY_PAGE_TR);
 	if (oldpage < 0)
-		goto out_unlock;
+		goto restore_oldpage;
 
 	for (i = 0; i < ARRAY_SIZE(init_seq); i++)
 		vsc85xx_tr_write(phydev, init_seq[i].reg, init_seq[i].val);
 
-out_unlock:
-	oldpage = phy_restore_page(phydev, oldpage, oldpage);
-	mutex_unlock(&phydev->lock);
-
-	return oldpage;
+restore_oldpage:
+	return phy_restore_page(phydev, oldpage, oldpage);
 }
 
 static int vsc85xx_eee_init_seq_set(struct phy_device *phydev)
@@ -708,19 +695,15 @@ static int vsc85xx_eee_init_seq_set(struct phy_device *phydev)
 	unsigned int i;
 	int oldpage;
 
-	mutex_lock(&phydev->lock);
 	oldpage = phy_select_page(phydev, MSCC_PHY_PAGE_TR);
 	if (oldpage < 0)
-		goto out_unlock;
+		goto restore_oldpage;
 
 	for (i = 0; i < ARRAY_SIZE(init_eee); i++)
 		vsc85xx_tr_write(phydev, init_eee[i].reg, init_eee[i].val);
 
-out_unlock:
-	oldpage = phy_restore_page(phydev, oldpage, oldpage);
-	mutex_unlock(&phydev->lock);
-
-	return oldpage;
+restore_oldpage:
+	return phy_restore_page(phydev, oldpage, oldpage);
 }
 
 /* phydev->bus->mdio_lock should be locked when using this function */
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 2/5] r8169: Drop redundant phy_init_hw() call in rtl8169_up()
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Heiner Kallweit, nic_swsd, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

Since phy_resume() now calls phy_init_hw() internally as part of the
resume sequence, the explicit phy_init_hw() call immediately before
phy_resume() in rtl8169_up() is redundant. Remove it.

No functional change intended.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v2->v3:
 * Added the patch into series
 * Updated commit description.
v2:
 * New patch
---
 drivers/net/ethernet/realtek/r8169_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 791277e750ba..cb22105f323f 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -5032,7 +5032,6 @@ static void rtl8169_up(struct rtl8169_private *tp)
 		rtl8168_driver_start(tp);
 
 	pci_set_master(tp->pci_dev);
-	phy_init_hw(tp->phydev);
 	phy_resume(tp->phydev);
 	rtl8169_init_phy(tp);
 	napi_enable(&tp->napi);
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 1/5] net: phy: call phy_init_hw() in phy resume path
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc, Ovidiu Panait
In-Reply-To: <20260412140032.122841-1-biju.das.jz@bp.renesas.com>

From: Biju Das <biju.das.jz@bp.renesas.com>

When mac_managed_pm flag is set, mdio_bus_phy_resume() is skipped, so
phy_init_hw(), which performs soft_reset and config_init, is not called
during resume.

This is inconsistent with the non-mac_managed_pm path, where
mdio_bus_phy_resume() calls phy_init_hw() before phy_resume() on every
resume.

To align both paths, move the phy_init_hw() call into phy_resume() itself,
before invoking the driver's resume callback. This ensures PHY soft reset
and re-initialization happen unconditionally, regardless of whether PM is
managed by the MAC or the MDIO bus. As a result, drop the redundant
phy_init_hw() call in mdio_bus_phy_resume().

Additionally, in phy_attach_direct(), replace the separate phy_init_hw()
and phy_resume() calls with a single phy_resume() call, since
phy_init_hw() is now handled inside phy_resume().

Signed-off-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
---
v2->v3:
 * No change,
v1->v2:
 * Updated commit description.
 * phy_init_hw() is moved from __phy_resume() -> phy_resume() to make it
   lock-free.
 * Dropped redundant phy_init_hw() call from mdio_bus_phy_resume() and
   phy_attach_direct().
---
 drivers/net/phy/phy_device.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 0edff47478c2..4a2b19d39373 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -396,10 +396,6 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
 	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
 		phydev->state != PHY_UP);
 
-	ret = phy_init_hw(phydev);
-	if (ret < 0)
-		return ret;
-
 	ret = phy_resume(phydev);
 	if (ret < 0)
 		return ret;
@@ -1857,16 +1853,14 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	if (dev)
 		netif_carrier_off(phydev->attached_dev);
 
-	/* Do initial configuration here, now that
+	/* Do initial configuration inside phy_init_hw(), now that
 	 * we have certain key parameters
 	 * (dev_flags and interface)
 	 */
-	err = phy_init_hw(phydev);
+	err = phy_resume(phydev);
 	if (err)
 		goto error;
 
-	phy_resume(phydev);
-
 	/**
 	 * If the external phy used by current mac interface is managed by
 	 * another mac interface, so we should create a device link between
@@ -2020,6 +2014,10 @@ int phy_resume(struct phy_device *phydev)
 {
 	int ret;
 
+	ret = phy_init_hw(phydev);
+	if (ret)
+		return ret;
+
 	mutex_lock(&phydev->lock);
 	ret = __phy_resume(phydev);
 	mutex_unlock(&phydev->lock);
-- 
2.43.0


^ permalink raw reply related

* [PATCH net-next v3 0/5] net: phy: Fix phy_init_hw() placement and update locking
From: Biju @ 2026-04-12 14:00 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Biju Das, Russell King, netdev, linux-kernel, Geert Uytterhoeven,
	Prabhakar Mahadev Lad, Biju Das, linux-renesas-soc

From: Biju Das <biju.das.jz@bp.renesas.com>

This series fixes two related issues in the PHY subsystem: incorrect
placement of phy_init_hw() in the resume path, and drop/update locking
in several PHY drivers.

Patch 1 identifies that when mac_managed_pm is set, mdio_bus_phy_resume()
is skipped entirely, meaning phy_init_hw() which performs soft reset and
config_init is never called on resume for that path. To make both paths
consistent, phy_init_hw() is moved into phy_resume() so it runs
unconditionally. As a consequence, the separate phy_init_hw() +
phy_resume() call sequence in phy_attach_direct() is collapsed
into a single phy_resume() call.

Patch 2 removes the now-redundant explicit phy_init_hw() call in
rtl8169_up(), since phy_resume() already handles it.

Patch 3 removes manual mutex_lock/unlock(&phydev->lock) from four
functions in the MSCC PHY driver. In vsc85xx_edge_rate_cntl_set(),
the lock wraps a single phy_modify_paged() call, which is already a
fully locked atomic operation that acquires the MDIO bus lock
internally, so the additional phydev->lock is unnecessary. The
remaining three functions — vsc85xx_mac_if_set(),
vsc8531_pre_init_seq_set(), and vsc85xx_eee_init_seq_set() — use
phy_read(), phy_write(), phy_select_page(), and phy_restore_page(),
all of which operate under the MDIO bus lock, so taking phydev->lock
around them provides no additional serialisation. Error-path labels
are updated accordingly.

Patch 4 fixes lan937x_dsp_workaround() in the Microchip T1 driver,
which was incorrectly taking phydev->lock. The function performs raw
MDIO bus accesses and must instead hold mdio_lock. The phy_read() and
phy_write() calls are also switched to their unlocked __phy_read() and
__phy_write() variants since mdio_lock is now held explicitly.

Patch 5 refines the placement introduced in patch 1 by moving
phy_init_hw() from phy_resume() down into __phy_resume(), so that it
is called with phydev->lock already held. This is necessary because
many MAC drivers and phylink reach the resume path via phy_start() ->
__phy_resume() directly, bypassing phy_resume().

v2->v3:
 * Moved all the patches into series as the order they get merged
   matters, otherwise a git bisect could land on a deadlock.
 * Updated commit description for patch#2 and #3.
v1->v2:
 * Updated commit description.
 * phy_init_hw() is moved from __phy_resume() -> phy_resume() to make it
   lock-free.
 * Dropped redundant phy_init_hw() call from mdio_bus_phy_resume() and
   phy_attach_direct().

Biju Das (5):
  net: phy: call phy_init_hw() in phy resume path
  r8169: Drop redundant phy_init_hw() call in rtl8169_up()
  net: phy: mscc: Drop unnecessary phydev->lock
  net: phy: microchip_t1: Replace phydev->lock with mdio_lock in
  net: phy: Move phy_init_hw() from phy_resume() to __phy_resume()

 drivers/net/ethernet/realtek/r8169_main.c |  1 -
 drivers/net/phy/microchip_t1.c            |  8 ++---
 drivers/net/phy/mscc/mscc_main.c          | 41 +++++++----------------
 drivers/net/phy/phy_device.c              | 14 ++++----
 4 files changed, 22 insertions(+), 42 deletions(-)

-- 
2.43.0


^ permalink raw reply

* Re: [PATCH v5] net: caif: fix stack out-of-bounds write in cfctrl_link_setup()
From: Simon Horman @ 2026-04-12 13:57 UTC (permalink / raw)
  To: Kangzheng Gu
  Cc: pabeni, davem, edumazet, kuba, kees, thorsten.blum, arnd,
	sjur.brandeland, netdev, linux-kernel, stable
In-Reply-To: <20260408125333.38489-1-xiaoguai0992@gmail.com>

On Wed, Apr 08, 2026 at 12:53:33PM +0000, Kangzheng Gu wrote:
> cfctrl_link_setup() copies the RFM volume name from a received control
> packet into linkparam.u.rfm.volume until a '\0' is found. A malformed
> packet can omit the terminator and make the copy run past the 20-byte
> stack buffer.
> 
> Stop copying once the buffer is full and mark the frame as failed by
> setting CFCTRL_ERR_BIT so the link setup is rejected.
> 
> Fixes: b482cd2053e3 ("net-caif: add CAIF core protocol stack")
> Cc: stable@vger.kernel.org
> Signed-off-by: Kangzheng Gu <xiaoguai0992@gmail.com>
> ---
>  v5:
>  - remove the Reported-by.
>  - print a warn message and reject link setup by setting CFCTRL_ERR_BIT.
>  - using %zu to adapt the compilation of 32-bit kernel.
>  - add rate limit to error message
> 
>  net/caif/cfctrl.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/net/caif/cfctrl.c b/net/caif/cfctrl.c
> index c6cc2bfed65d..373ab1dc67a7 100644
> --- a/net/caif/cfctrl.c
> +++ b/net/caif/cfctrl.c
> @@ -416,8 +416,16 @@ static int cfctrl_link_setup(struct cfctrl *cfctrl, struct cfpkt *pkt, u8 cmdrsp
>  		cp = (u8 *) linkparam.u.rfm.volume;
>  		for (tmp = cfpkt_extr_head_u8(pkt);
>  		     cfpkt_more(pkt) && tmp != '\0';
> -		     tmp = cfpkt_extr_head_u8(pkt))
> +		     tmp = cfpkt_extr_head_u8(pkt)) {
> +			if (cp >= (u8 *)linkparam.u.rfm.volume +
> +			    sizeof(linkparam.u.rfm.volume) - 1) {
> +				pr_warn_ratelimited("Request reject, volume name length exceeds %zu\n",
> +						    sizeof(linkparam.u.rfm.volume));
> +				cmdrsp |= CFCTRL_ERR_BIT;
> +				break;
> +			}
>  			*cp++ = tmp;
> +		}
>  		*cp = '\0';
>  
>  		if (CFCTRL_ERR_BIT & cmdrsp)

I am wondering if it would be best to follow the pattern for
writing linkparam.u.utility.name elsewhere in this function.
That:
1. Uses a somewhat more succinct loop control structure
2. Silently truncates input without updating cmdrsp if overrun would occur

Something like this (compile tested only!):

diff --git a/net/caif/cfctrl.c b/net/caif/cfctrl.c
index c6cc2bfed65d..ba184c11386e 100644
--- a/net/caif/cfctrl.c
+++ b/net/caif/cfctrl.c
@@ -15,6 +15,7 @@
 #include <net/caif/cfctrl.h>
 
 #define container_obj(layr) container_of(layr, struct cfctrl, serv.layer)
+#define RFM_VOLUME_LEN 20
 #define UTILITY_NAME_LENGTH 16
 #define CFPKT_CTRL_PKT_LEN 20
 
@@ -414,10 +415,11 @@ static int cfctrl_link_setup(struct cfctrl *cfctrl, struct cfpkt *pkt, u8 cmdrsp
 		 */
 		linkparam.u.rfm.connid = cfpkt_extr_head_u32(pkt);
 		cp = (u8 *) linkparam.u.rfm.volume;
-		for (tmp = cfpkt_extr_head_u8(pkt);
-		     cfpkt_more(pkt) && tmp != '\0';
-		     tmp = cfpkt_extr_head_u8(pkt))
+		caif_assert(sizeof(linkparam.u.rfm.volume) >= RFM_VOLUME_LEN);
+		for(i = 0; i < RFM_VOLUME_LEN - 1 && cfpkt_more(pkt); i++) {
+			tmp = cfpkt_extr_head_u8(pkt);
 			*cp++ = tmp;
+		}
 		*cp = '\0';
 
 		if (CFCTRL_ERR_BIT & cmdrsp)

Also, it seems that writing linkparam.u.utility.paramlen elsewhere
in this function also has a potential buffer overrun (by one byte).

^ permalink raw reply related

* Re: [PATCH net-next v2] r8169: Use napi_schedule_irqoff()
From: Matt Vollrath @ 2026-04-12 13:51 UTC (permalink / raw)
  To: Heiner Kallweit, netdev; +Cc: edumazet, pabeni, kuba, andrew+netdev, nic_swsd
In-Reply-To: <d0e1bab5-3476-4b58-834c-6e742e89539d@gmail.com>

On 4/12/26 07:30, Heiner Kallweit wrote:
> On 12.04.2026 03:40, Matt Vollrath wrote:
>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>> index 791277e750ba..4c0ad0de3410 100644
>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>> @@ -4873,7 +4873,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>>   		phy_mac_interrupt(tp->phydev);
>>   
>>   	rtl_irq_disable(tp);
>> -	napi_schedule(&tp->napi);
>> +	napi_schedule_irqoff(&tp->napi);
>>   out:
>>   	rtl_ack_events(tp, status);
>>   
> 
> Not using napi_schedule_irqoff() here is intentional,
> see 2734a24e6e5d18522fbf599135c59b82ec9b2c9e.

It looks like forced threading was fixed after your fix
to mitigate the issue of forced threading not masking
interrupts.

see 81e2073c175b887398e5bca6c004efa89983f58d

If I understand correctly, this should make
napi_schedule_irqoff() safe in any interrupt handler.


^ permalink raw reply

* RE: [PATCH v5 net-next 0/8] dpll/ice: Add TXC DPLL type and full TX reference clock control for E825
From: Nitka, Grzegorz @ 2026-04-12 13:50 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	intel-wired-lan@lists.osuosl.org, Oros, Petr,
	richardcochran@gmail.com, andrew+netdev@lunn.ch,
	Kitszel, Przemyslaw, Nguyen, Anthony L,
	Prathosh.Satish@microchip.com, Vecera, Ivan, jiri@resnulli.us,
	Kubalewski, Arkadiusz, vadim.fedorenko@linux.dev,
	donald.hunter@gmail.com, horms@kernel.org, pabeni@redhat.com,
	davem@davemloft.net, edumazet@google.com
In-Reply-To: <20260410133812.4cf9b090@kernel.org>



> -----Original Message-----
> From: Jakub Kicinski <kuba@kernel.org>
> Sent: Friday, April 10, 2026 10:38 PM
> To: Nitka, Grzegorz <grzegorz.nitka@intel.com>
> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org; intel-wired-
> lan@lists.osuosl.org; Oros, Petr <poros@redhat.com>;
> richardcochran@gmail.com; andrew+netdev@lunn.ch; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Prathosh.Satish@microchip.com; Vecera,
> Ivan <ivecera@redhat.com>; jiri@resnulli.us; Kubalewski, Arkadiusz
> <arkadiusz.kubalewski@intel.com>; vadim.fedorenko@linux.dev;
> donald.hunter@gmail.com; horms@kernel.org; pabeni@redhat.com;
> davem@davemloft.net; edumazet@google.com
> Subject: Re: [PATCH v5 net-next 0/8] dpll/ice: Add TXC DPLL type and full TX
> reference clock control for E825
> 
> On Fri, 10 Apr 2026 14:23:58 +0000 Nitka, Grzegorz wrote:
> > Here is the high-level connection diagram for E825 device. I hope you find it
> helpful:
> > [..]
> 
> It does thanks a lot.
> 
> > Before this series, we tried different approaches.
> > One of them was to create MUX pin associated with netdev interface.
> > EXT_REF and SYNCE pins were registered with this MUX pin.
> > However I recall there were at least two issues with this solution:
> > - when using DPLL subsystem not all the connections/relations were visible
> >   from DPLL pin-get perspective. RT netlink was required
> > - due to mixing pins from different modules (like fwnode based pin from zl
> driver
> >   and the pins from ice), we were not able to safely clean the references
> between
> >   pins and dpll (basicaly .. we observed crashes)
> >
> > Proposed solution just seems to be clean and fully reflects current
> > connection topology.
> 
> Do you have the link to the old proposal that was adding stuff to
> rtnetlink? I remember some discussion long-ish ago, maybe I was wrong.
> 

Hello Jakub,

This is the patch from the discussion I put the link in the cover letter:
https://lore.kernel.org/netdev/20250828164345.116097-1-arkadiusz.kubalewski@intel.com/

Regards

Grzegorz

> > What's actually your biggest concern?
> > The fact we introduce a new DPLL type? Or multiply DPLL instances? Or
> both?
> > Do you prefer to see "one big" DPLL with 16 pins in our case (8 ports x 2 tx-
> clk pins)?
> > Each pin with the name like, for example, PF0-SyncE/PF0-eRef etc.?
> 
> My concern is that I think this is a pretty run of the mill SyncE
> design. If we need to pretend we have two DPLLs here if we really
> only have one and a mux - then our APIs are mis-designed :(

^ permalink raw reply

* Re: [patch 23/38] alpha: Select ARCH_HAS_RANDOM_ENTROPY
From: Magnus Lindholm @ 2026-04-12 13:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Richard Henderson, linux-alpha, Arnd Bergmann, x86,
	Lu Baolu, iommu, Michael Grzeschik, netdev, linux-wireless,
	Herbert Xu, linux-crypto, Vlastimil Babka, linux-mm,
	David Woodhouse, Bernie Thompson, linux-fbdev, Theodore Tso,
	linux-ext4, Andrew Morton, Uladzislau Rezki, Marco Elver,
	Dmitry Vyukov, kasan-dev, Andrey Ryabinin, Thomas Sailer,
	linux-hams, Jason A. Donenfeld, Russell King, linux-arm-kernel,
	Catalin Marinas, Huacai Chen, loongarch, Geert Uytterhoeven,
	linux-m68k, Dinh Nguyen, Jonas Bonn, linux-openrisc, Helge Deller,
	linux-parisc, Michael Ellerman, linuxppc-dev, Paul Walmsley,
	linux-riscv, Heiko Carstens, linux-s390, David S. Miller,
	sparclinux
In-Reply-To: <20260410120319.131582521@kernel.org>

On Fri, Apr 10, 2026 at 2:36 PM Thomas Gleixner <tglx@kernel.org> wrote:
>
> The only remaining usage of get_cycles() is to provide
> random_get_entropy().
>
> Switch alpha over to the new scheme of selecting ARCH_HAS_RANDOM_ENTROPY
> and providing random_get_entropy() in asm/random.h.
>
> Remove asm/timex.h as it has no functionality anymore.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Richard Henderson <richard.henderson@linaro.org>
> Cc: linux-alpha@vger.kernel.org
> ---
>  arch/alpha/Kconfig              |    1 +
>  arch/alpha/include/asm/random.h |   14 ++++++++++++++
>  arch/alpha/include/asm/timex.h  |   26 --------------------------
>  3 files changed, 15 insertions(+), 26 deletions(-)

Hi,

The Alpha side looks fine to me.

I've applied this patch on top of v7.0-rc7, built a kernel successfully,
boot-tested it on an Alpha UP2000+ (SMP) without issues.

Acked-by: Magnus Lindholm <linmag7@gmail.com>
Tested-by: Magnus Lindholm <linmag7@gmail.com>

^ permalink raw reply

* Re: [PATCH net-next v2] r8169: Use napi_schedule_irqoff()
From: Matt Vollrath @ 2026-04-12 13:11 UTC (permalink / raw)
  To: Heiner Kallweit, netdev; +Cc: edumazet, pabeni, kuba, andrew+netdev, nic_swsd
In-Reply-To: <d0e1bab5-3476-4b58-834c-6e742e89539d@gmail.com>

On 4/12/26 07:30, Heiner Kallweit wrote:
> On 12.04.2026 03:40, Matt Vollrath wrote:
>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>> index 791277e750ba..4c0ad0de3410 100644
>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>> @@ -4873,7 +4873,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
>>   		phy_mac_interrupt(tp->phydev);
>>   
>>   	rtl_irq_disable(tp);
>> -	napi_schedule(&tp->napi);
>> +	napi_schedule_irqoff(&tp->napi);
>>   out:
>>   	rtl_ack_events(tp, status);
>>   
> 
> Not using napi_schedule_irqoff() here is intentional,
> see 2734a24e6e5d18522fbf599135c59b82ec9b2c9e.

I see, thank you Heiner. Seems like it could work if
legacy interrupts went to a separate handler, but not
worth the complexity for this small optimization.
mv


^ permalink raw reply

* Re: [PATCH v11 net-next 09/11] net/nebula-matrix: add Dispatch layer implementation
From: Simon Horman @ 2026-04-12 12:59 UTC (permalink / raw)
  To: illusion.wang
  Cc: Simon Horman, alvin.wang, sam.chen, netdev, skhan, lorenzo,
	open list, edumazet, linux-doc, enelsonmoore, andrew+netdev,
	pabeni, lukas.bulwahn, kuba, hkallweit1, dimon.zhao, corbet,
	vadim.fedorenko
In-Reply-To: <20260408093739.56001-10-illusion.wang@nebula-matrix.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/nebula-matrix: add Dispatch layer implementation

This patch introduces a control-level routing mechanism for the Dispatch
layer with two routing paths: a direct path (Dispatch -> Resource -> HW)
and a channel path (Dispatch -> Channel). The implementation uses
interface-declared control levels and upper-layer configured control
levels to determine the routing.

> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dispatch.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dispatch.c
> index 8b031f6e963f..eee41d10ad63 100644
> --- a/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dispatch.c
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_core/nbl_dispatch.c
> @@ -6,6 +6,396 @@
>  #include <linux/pci.h>
>  #include "nbl_dispatch.h"
>
> +static u16 nbl_disp_chan_get_vsi_id_req(struct nbl_dispatch_mgt *disp_mgt,
> +					u16 func_id, u16 type)
> +{
> +	struct nbl_channel_ops *chan_ops = disp_mgt->chan_ops_tbl->ops;
> +	struct nbl_common_info *common = disp_mgt->common;
> +	struct nbl_chan_param_get_vsi_id result = { 0 };
> +	struct nbl_chan_param_get_vsi_id param = { 0 };
> +	struct nbl_chan_send_info chan_send;
> +
> +	param.type = type;
> +
> +	NBL_CHAN_SEND(chan_send, common->mgt_pf, NBL_CHAN_MSG_GET_VSI_ID,
> +		      &param, sizeof(param), &result, sizeof(result), 1);
> +	chan_ops->send_msg(disp_mgt->chan_ops_tbl->priv, &chan_send);
> +
> +	return result.vsi_id;
> +}

Does nbl_disp_chan_get_vsi_id_req() ignore the func_id parameter?

The function accepts func_id but only sends param.type in the channel
message. The nbl_chan_param_get_vsi_id struct appears to have only
{vsi_id, type} fields, with no func_id field.

On the responder side:

> +static void nbl_disp_chan_get_vsi_id_resp(void *priv, u16 src_id, u16 msg_id,
> +					  void *data, u32 data_len)
> +{
> +	struct nbl_dispatch_mgt *disp_mgt = (struct nbl_dispatch_mgt *)priv;
> +	struct nbl_channel_ops *chan_ops = disp_mgt->chan_ops_tbl->ops;
> +	struct nbl_resource_ops *res_ops = disp_mgt->res_ops_tbl->ops;
> +	struct nbl_resource_mgt *p = disp_mgt->res_ops_tbl->priv;
> +	struct device *dev = disp_mgt->common->dev;
> +	struct nbl_chan_param_get_vsi_id *param;
> +	struct nbl_chan_param_get_vsi_id result = { 0 };
> +	struct nbl_chan_ack_info chan_ack;
> +	int err = NBL_CHAN_RESP_OK;
> +	int ret;
> +
> +	param = (struct nbl_chan_param_get_vsi_id *)data;
> +
> +	result.vsi_id =
> +		NBL_OPS_CALL_RET(res_ops->get_vsi_id, (p, src_id, param->type));

nbl_disp_chan_get_vsi_id_resp() uses src_id (the channel source) instead
of func_id. This creates a behavioral inconsistency with the direct path:

> +static u16 nbl_disp_get_vsi_id(struct nbl_dispatch_mgt *disp_mgt, u16 func_id,
> +			       u16 type)
> +{
> +	struct nbl_resource_ops *res_ops = disp_mgt->res_ops_tbl->ops;
> +	struct nbl_resource_mgt *p = disp_mgt->res_ops_tbl->priv;
> +
> +	return NBL_OPS_CALL_RET(res_ops->get_vsi_id, (p, func_id, type));
> +}

The direct path correctly passes func_id to res_ops->get_vsi_id(p, func_id,
type).

When the channel path is active, callers cannot query VSI IDs for
functions other than the sender. Is the func_id parameter intended to be
used in nbl_disp_chan_get_vsi_id_req()?

[ ... ]

^ permalink raw reply

* Re: [PATCH v11 net-next 05/11] net/nebula-matrix: add channel layer
From: Simon Horman @ 2026-04-12 12:59 UTC (permalink / raw)
  To: illusion.wang
  Cc: Simon Horman, dimon.zhao, andrew+netdev, alvin.wang, corbet,
	enelsonmoore, netdev, kuba, sam.chen, vadim.fedorenko, edumazet,
	skhan, hkallweit1, lukas.bulwahn, lorenzo, open list, linux-doc,
	pabeni
In-Reply-To: <20260408093739.56001-6-illusion.wang@nebula-matrix.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/nebula-matrix: add channel layer

This commit adds a channel management layer for the nebula-matrix
network driver, providing mailbox-based communication between PF0
and other PFs with message send/receive, ACK handling, and
hash-based message handler lookup.

> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
> --- a/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_channel/nbl_channel.c

[ ... ]

> +static int nbl_chan_send_msg(struct nbl_channel_mgt *chan_mgt,
> +			     struct nbl_chan_send_info *chan_send)
> +{

[ ... ]

> +	if (test_bit(NBL_CHAN_INTERRUPT_READY, chan_info->state)) {
> +		ret = wait_event_timeout(wait_head->wait_queue,
> +					 wait_head->acked,
> +					 NBL_CHAN_ACK_WAIT_TIME);
> +		if (!ret) {
> +			wait_head->status = NBL_MBX_STATUS_TIMEOUT;
> +			dev_err(dev,
> +				"Channel waiting ack failed, message type: %d, msg id: %u\n",
> +				chan_send->msg_type, msgid.id);
> +			return -ETIMEDOUT;
> +		}

[ ... ]

> +	/*polling wait mailbox ack*/
> +	while (i--) {
> +		nbl_chan_clean_queue(chan_mgt, chan_info);
> +
> +		if (wait_head->acked) {
> +			chan_send->ack_len = wait_head->ack_data_len;
> +			wait_head->status = NBL_MBX_STATUS_IDLE;
> +			return wait_head->ack_err;
> +		}
> +		usleep_range(NBL_CHAN_TX_WAIT_ACK_US_MIN,
> +			     NBL_CHAN_TX_WAIT_ACK_US_MAX);
> +	}
> +
> +	wait_head->status = NBL_MBX_STATUS_TIMEOUT;
> +	dev_err(dev,
> +		"Channel polling ack failed, message type: %d msg id: %u\n",
> +		chan_send->msg_type, msgid.id);
> +
> +	return -EFAULT;
          ^^^^^^

The interrupt-based ACK wait path above returns -ETIMEDOUT on timeout,
but this polling-based ACK wait path returns -EFAULT. -EFAULT ("bad
address") is not the right error code for a timeout condition.

Should this also return -ETIMEDOUT for consistency? Any caller that
checks specifically for -ETIMEDOUT to handle timeouts would miss
this polling timeout case.

> +}

[ ... ]

> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c
> --- /dev/null
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.c

[ ... ]

> +int nbl_common_create_wq(void)
> +{
> +	wq_mgt = kzalloc_obj(*wq_mgt);
> +	if (!wq_mgt)
> +		return -ENOMEM;
> +
> +	wq_mgt->ctrl_dev_wq = alloc_workqueue("%s", WQ_MEM_RECLAIM | WQ_UNBOUND,
> +					      0, "nbl_ctrldev_wq");
> +	if (!wq_mgt->ctrl_dev_wq) {
> +		pr_err("Failed to create workqueue nbl_ctrldev_wq\n");
> +		goto alloc_ctrl_dev_wq_failed;
> +	}
> +
> +	return 0;
> +alloc_ctrl_dev_wq_failed:
> +	kfree(wq_mgt);
> +	return -ENOMEM;
> +}

When alloc_workqueue() fails, wq_mgt is freed but the static pointer
is not set to NULL. After this error path, wq_mgt becomes a dangling
pointer.

If nbl_common_queue_work() or nbl_common_destroy_wq() is called
afterwards, they dereference wq_mgt unconditionally:

    nbl_common_queue_work():
        queue_work(wq_mgt->ctrl_dev_wq, task);

    nbl_common_destroy_wq():
        destroy_workqueue(wq_mgt->ctrl_dev_wq);
        kfree(wq_mgt);

Would adding "wq_mgt = NULL;" after kfree(wq_mgt) on the error path
prevent this?

[ ... ]

> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.h b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.h
> --- /dev/null
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_common/nbl_common.h

[ ... ]

> +/* it used for y_axis no necessay */
> +struct nbl_hash_entry_node {
> +	struct hlist_node node;
> +	void *key;
> +	void *data;
> +};

This isn't a bug, but the comment "it used for y_axis no necessay"
appears garbled and "necessay" is misspelled. Could the comment be
rewritten to explain the purpose of this struct, or removed if it
is no longer relevant?

^ permalink raw reply

* Re: [PATCH v11 net-next 02/11] net/nebula-matrix: add our driver architecture
From: Simon Horman @ 2026-04-12 12:57 UTC (permalink / raw)
  To: illusion.wang
  Cc: Simon Horman, enelsonmoore, alvin.wang, linux-doc, hkallweit1,
	edumazet, lukas.bulwahn, kuba, open list, pabeni, netdev,
	sam.chen, corbet, vadim.fedorenko, lorenzo, dimon.zhao, skhan,
	andrew+netdev
In-Reply-To: <20260408093739.56001-3-illusion.wang@nebula-matrix.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net/nebula-matrix: add our driver architecture

This commit introduces a layered architecture for the nebula-matrix
driver, consisting of Device, Dispatch, Resource, Hardware, Channel, and
Common layers. It adds initialization and cleanup functions for each layer
with struct-based operation registration.

Signed-off-by: illusion.wang <illusion.wang@nebula-matrix.com>

> diff --git a/drivers/net/ethernet/nebula-matrix/nbl/nbl_main.c b/drivers/net/ethernet/nebula-matrix/nbl/nbl_main.c
> index 10c3536b327b..db035136e09c 100644
> --- a/drivers/net/ethernet/nebula-matrix/nbl/nbl_main.c
> +++ b/drivers/net/ethernet/nebula-matrix/nbl/nbl_main.c

[ ... ]

> +struct nbl_adapter *nbl_core_init(struct pci_dev *pdev,
> +				  struct nbl_init_param *param)
> +{
> +	struct nbl_product_base_ops *product_base_ops;
> +	struct nbl_common_info *common;
> +	struct nbl_adapter *adapter;
> +	int ret;
> +
> +	adapter = devm_kzalloc(&pdev->dev, sizeof(*adapter), GFP_KERNEL);
> +	if (!adapter)
> +		return NULL;
> +
> +	adapter->pdev = pdev;
> +	common = &adapter->common;
> +
> +	common->pdev = pdev;
> +	common->dev = &pdev->dev;
> +	common->has_ctrl = param->caps.has_ctrl;
> +	common->has_net = param->caps.has_net;
> +	common->pci_using_dac = param->pci_using_dac;
> +	common->function = PCI_FUNC(pdev->devfn);
> +	common->devid = PCI_SLOT(pdev->devfn);
> +	common->bus = pdev->bus->number;
> +	common->product_type = param->product_type;
> +
> +	product_base_ops = nbl_core_setup_product_ops(adapter, param);
> +
> +	/*
> +	 *every product's hw/chan/res layer has a great difference,
> +	 *so call their own init ops
> +	 */
> +	ret = product_base_ops->hw_init(adapter);
> +	if (ret)
> +		goto hw_init_fail;
> +
> +	ret = product_base_ops->chan_init(adapter);
> +	if (ret)
> +		goto chan_init_fail;
> +
> +	ret = product_base_ops->res_init(adapter);
> +	if (ret)
> +		goto res_init_fail;
> +
> +	ret = nbl_disp_init(adapter);
> +	if (ret)
> +		goto res_init_fail;
> +
> +	ret = nbl_dev_init(adapter);
> +	if (ret)
> +		goto res_init_fail;

Does the error path correctly clean up all initialized layers?

The initialization order is:
  1. hw_init
  2. chan_init
  3. res_init
  4. nbl_disp_init
  5. nbl_dev_init

When nbl_disp_init() or nbl_dev_init() fails, control jumps to the
'res_init_fail' label, which only tears down the channel and hardware
layers:

> +res_init_fail:
> +	product_base_ops->chan_remove(adapter);
> +chan_init_fail:
> +	product_base_ops->hw_remove(adapter);

However, res_init has already succeeded at this point and is never cleaned
up. Looking at nbl_product_base_ops, there is no res_remove callback
defined, and nbl_core_remove() also shows the same asymmetry:

> +void nbl_core_remove(struct nbl_adapter *adapter)
> +{
> +	struct nbl_product_base_ops *product_base_ops;
> +
> +	product_base_ops = adapter->product_base_ops;
> +	nbl_dev_remove(adapter);
> +	product_base_ops->chan_remove(adapter);
> +	product_base_ops->hw_remove(adapter);
> +}

This function calls nbl_dev_remove(), chan_remove(), and hw_remove(), but
never tears down the resource layer or the dispatch layer.

Currently, all allocations in res_init (nbl_res_init_leonis) and
nbl_disp_init use devm_kzalloc, so resources are auto-freed.  However, the
error path structure is concretely wrong: if res_init or disp_init ever
needs non-devm resources (as nbl_hw_init_leonis already uses ioremap and
pci_request_selected_regions), the missing cleanup will cause resource
leaks.

Should the init/remove paths be symmetric, with every successful init
having a corresponding remove in both the error cleanup path and
nbl_core_remove()?

^ permalink raw reply

* Re: [PATCH net] netrom: do some basic forms of validation on incoming frames
From: Chris Maness @ 2026-04-12 12:56 UTC (permalink / raw)
  To: hugh
  Cc: Greg KH, Kuniyuki Iwashima, kuba, davem, edumazet, horms,
	linux-hams, linux-kernel, netdev, pabeni, stable, workflows,
	yizhe
In-Reply-To: <3cd91fbc-d3a9-431e-b915-58e851c7df9f@blemings.org>

Thanks for your work, Hugh.

-73 de Chris KQ6UP

On Sat, Apr 11, 2026 at 7:33 PM Hugh Blemings <hugh@blemings.org> wrote:
>
>
> On 11/4/2026 18:58, Greg KH wrote:
> > On Sat, Apr 11, 2026 at 05:24:17PM +1000, Hugh Blemings wrote:
> >> On 11/4/2026 15:50, Greg KH wrote:
> >>> On Sat, Apr 11, 2026 at 08:25:19AM +1000, Hugh Blemings wrote:
> >>>> On 11/4/2026 08:11, Kuniyuki Iwashima wrote:
> >>>>> From: Jakub Kicinski <kuba@kernel.org>
> >>>>> Date: Fri, 10 Apr 2026 14:54:48 -0700
> >>>>>> On Fri, 10 Apr 2026 14:30:42 -0700 Jakub Kicinski wrote:
> >>>>>>> On Fri, 10 Apr 2026 07:24:36 +0200 Greg Kroah-Hartman wrote:
> >>>>>>>> On Thu, Apr 09, 2026 at 08:32:35PM -0700, Jakub Kicinski wrote:
> >>>>>>>>> Or for simplicity we could also be testing against skb_headlen()
> >>>>>>>>> since we don't expect any legit non-linear frames here? Dunno.
> >>>>>>>> I'll be glad to change this either way, your call.  Given that this is
> >>>>>>>> an obsolete protocol that seems to only be a target for drive-by fuzzers
> >>>>>>>> to attack, whatever the simplest thing to do to quiet them up I'll be
> >>>>>>>> glad to implement.
> >>>>>>>>
> >>>>>>>> Or can we just delete this stuff entirely?  :)
> >>>>>>> Yes.
> >>>>>>>
> >>>>>>> My thinking is to delete hamradio, nfc, atm, caif.. [more to come]
> >>>>>>> Create GH repos which provide them as OOT modules.
> >>>>>>> Hopefully we can convince any existing users to switch to that.
> >>>>>>>
> >>>>>>> The only thing stopping me is the concern that this is just the softest
> >>>>>>> target and the LLMs will find something else to focus on which we can't
> >>>>>>> delete. I suspect any PCIe driver can be flooded with "aren't you
> >>>>>>> trusting the HW to provide valid responses here?" bullshit.
> >>>>>>>
> >>>>>>> But hey, let's try. I'll post a patch nuking all of hamradio later
> >>>>>>> today.
> >>>>>> Well, either we "expunge" this code to OOT repos, or we mark it
> >>>>>> as broken and tell everyone that we don't take security fixes
> >>>>>> for anything that depends on BROKEN. I'd personally rather expunge.
> >>>>> +1 for "expunge" to prevent LLM-based patch flood.
> >>>>>
> >>>>> IIRC, we did that recently for one driver only used by OpenWRT ?
> >>>>>
> >>>>>
> >>>> If the main concern here is ongoing maintenance of these Ham Radio related
> >>>> protocols/drivers, can we pause for a moment on anything as dramatic as
> >>>> removing from the tree entirely ?
> >>> Sure, but:
> >>>
> >>>> There is a good cohort of capable kernel folks that either are or were ham
> >>>> radio operators who I believe, upon realising that things have got to this
> >>>> point, will be happy to redouble efforts to ensure this code maintained and
> >>>> tested to a satisfactory standard.
> >>> We need this code to be maintained, because as is being shown, there are
> >>> reported problems with it that will affect these devices/networks that
> >>> you all are using.  So all we need is a maintainer for this to be able
> >>> to take reports that we get and fix things up as needed.  I know you
> >>> have that experience, want to come back to kernel development, we've
> >>> missed you :)
> >> That's most kind Greg, thank you, have missed all you cool kids too :)
> >>
> >> More seriously though - I'd be up for doing it, but I think there may be
> >> others better placed than I who haven't yet realised we have this conundrum.
> >> I'm nudging a few folks offline on this front.
> > The main "conundrum" is, is that this protocol completly trusts the
> > hardware to give the kernel the "correct" data.  So if you trust the
> > hardware to work properly, it will be fine, but as the fuzzing tools are
> > finding, if the data from the hardware modems is a bit out-of-spec,
> > "bad" things can happen.
> >
> > I don't know how well controlled the data is from these devices, if it's
> > just a "pass through" from what they get off the "wire" or if the
> > devices always ensure the protocol packets are sane before passing them
> > off to the kernel.  That's going to be something you all with the
> > hardware is going to have to determine in order to keep this a working
> > system over time.  Especially given that this is a wireless protcol
> > where you "have" to trust the remote end.
>
> Thanks for the thoughts Greg - and ya, I guess on balance I come back to
> being generally skeptical of both hardware and software to Do The Right
> Thing (TM)
>
> So bounds checking and the like seems prudent irrespective of whether
> the kernel is getting the data from real hardware, software modems etc.
>
> I've done some initial digging around that confirms my suspicion that
> this in kernel code remains quite widely used, if somewhat out of view.
> Accordingly I lean then towards working to get these various mitigations
> in place with some revised patches etc. as needed and into the main tree.
>
> Once this done I think that'll give me a good sense of whether I or
> someone else is well positioned to keep the code maintained longer term
> and thus justify it remaining in tree or not.
>
> More to follow once I finish remembering this kernel thing!
>
> Cheers,
> Hugh
>
>
>
>


-- 
Thanks,
Chris Maness

^ permalink raw reply

* Re: [PATCH net-next] net: phy: call phy_init_hw() in phy resume path
From: Russell King (Oracle) @ 2026-04-12 12:55 UTC (permalink / raw)
  To: Biju Das
  Cc: Andrew Lunn, biju.das.au, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Ovidiu Panait,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Geert Uytterhoeven, Prabhakar Mahadev Lad,
	linux-renesas-soc@vger.kernel.org
In-Reply-To: <TY3PR01MB11346F78B929EA7F377BB5B4A86272@TY3PR01MB11346.jpnprd01.prod.outlook.com>

On Sun, Apr 12, 2026 at 12:05:06PM +0000, Biju Das wrote:
> Hi Russell King,
> 
> > -----Original Message-----
> > From: Russell King <linux@armlinux.org.uk>
> > Sent: 11 April 2026 17:47
> > Subject: Re: [PATCH net-next] net: phy: call phy_init_hw() in phy resume path
> > 
> > On Sat, Apr 11, 2026 at 03:50:13PM +0200, Andrew Lunn wrote:
> > > > So, I question whether any of the functions in this driver actually
> > > > have a valid reason to take phydev->lock - looks to me like a not
> > > > very well written driver.
> > > >
> > > > In cases like this, I don't think we should make things more
> > > > difficult in the core just because we have a lockdep splat when that
> > > > can be avoided by killing off unnecessary locking.
> > >
> > > Agreed. This patchset should cleanup these locks.
> > >
> > > We also need to look at lan937x_dsp_workaround(). I also don't see
> > > what that mutex lock/unlock is protecting. Accessing bank registers
> > > need to be protected, so doing one additional access within that
> > > should not need additional protection.
> > 
> > Looking at access_ereg(), shouldn't it be taking the MDIO bus lock and using the __phy_* accessors
> > anyway because it's writing various registers which determine what is being read via the
> > LAN87XX_EXT_REG_RD_DATA register or the value written via the LAN87XX_EXT_REG_WR_DATA register.
> > 
> > Also, as it has access_ereg_modify_changed(), that entire sequence needs to take the MDIO bus lock to
> > safely do the read-modify-write.
> > 
> > Then there's lan87xx_config_rgmii_delay() which is a large open coded read-modify-write for the
> > PHYACC_ATTR_BANK_MISC, LAN87XX_CTRL_1 register.
> > 
> > To me, this looks like a racy driver, and it also looks like it's using the wrong lock to try and
> > protect hardware accesses.
> 
> OK, will replace it with MDIO bus lock.

Remember that the phy_* accessors will take the MDIO bus lock, so will
need to be changed to their __phy_* counterparts.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply

* Re: [PATCH net 2/2] net: mana: Move current_speed debugfs file to mana_init_port()
From: Simon Horman @ 2026-04-12 12:52 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shradhagupta, kees, kotaranov, yury.norov, linux-hyperv, netdev,
	linux-kernel
In-Reply-To: <20260408081224.302308-3-ernis@linux.microsoft.com>

On Wed, Apr 08, 2026 at 01:12:20AM -0700, Erni Sri Satya Vennela wrote:
> Move the current_speed debugfs file creation from mana_probe_port() to
> mana_init_port(). The file was previously created only during initial
> probe, but mana_cleanup_port_context() removes the entire vPort debugfs
> directory during detach/attach cycles. Since mana_init_port() recreates
> the directory on re-attach, moving current_speed here ensures it survives
> these cycles.
> 
> Fixes: 75cabb46935b ("net: mana: Add support for net_shaper_ops")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH net 1/2] net: mana: Use pci_name() for debugfs directory naming
From: Simon Horman @ 2026-04-12 12:52 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shradhagupta, kees, kotaranov, yury.norov, linux-hyperv, netdev,
	linux-kernel
In-Reply-To: <20260408081224.302308-2-ernis@linux.microsoft.com>

On Wed, Apr 08, 2026 at 01:12:19AM -0700, Erni Sri Satya Vennela wrote:
> Use pci_name(pdev) for the per-device debugfs directory instead of
> hardcoded "0" for PFs and pci_slot_name(pdev->slot) for VFs. The
> previous approach had two issues:
> 
> 1. pci_slot_name() dereferences pdev->slot, which can be NULL for VFs
>    in environments like generic VFIO passthrough or nested KVM,
>    causing a NULL pointer dereference.
> 
> 2. Multiple PFs would all use "0", and VFs across different PCI
>    domains or buses could share the same slot name, leading to
>    -EEXIST errors from debugfs_create_dir().
> 
> pci_name(pdev) returns the unique BDF address, is always valid, and is
> unique across the system.
> 
> Fixes: 6607c17c6c5e ("net: mana: Enable debugfs files for MANA device")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH net-next v6] net: mana: Expose hardware diagnostic info via debugfs
From: Simon Horman @ 2026-04-12 12:49 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, kotaranov, shradhagupta, shirazsaleem,
	yury.norov, kees, ssengar, dipayanroy, gargaditya, linux-hyperv,
	netdev, linux-kernel, linux-rdma
In-Reply-To: <20260408081555.302620-1-ernis@linux.microsoft.com>

On Wed, Apr 08, 2026 at 01:15:46AM -0700, Erni Sri Satya Vennela wrote:
> Add debugfs entries to expose hardware configuration and diagnostic
> information that aids in debugging driver initialization and runtime
> operations without adding noise to dmesg.
> 
> The debugfs directory for each PCI device is named using pci_name()
> (the unique BDF address), and its creation and removal is integrated
> into mana_gd_setup() and mana_gd_cleanup_device() respectively, so
> that all callers (probe, remove, suspend, resume, shutdown) share a
> single code path.
> 
> Device-level entries (under /sys/kernel/debug/mana/<BDF>/):
>   - num_msix_usable, max_num_queues: Max resources from hardware
>   - gdma_protocol_ver, pf_cap_flags1: VF version negotiation results
>   - num_vports, bm_hostmode: Device configuration
> 
> Per-vPort entries (under /sys/kernel/debug/mana/<BDF>/vportN/):
>   - port_handle: Hardware vPort handle
>   - max_sq, max_rq: Max queues from vPort config
>   - indir_table_sz: Indirection table size
>   - steer_rx, steer_rss, steer_update_tab, steer_cqe_coalescing:
>     Last applied steering configuration parameters
> 
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
> This patch depends on the following fixes submitted to net:
>   - "net: mana: Use pci_name() for debugfs directory naming"
>   - "net: mana: Move current_speed debugfs file to mana_init_port()"
> Conflict resolution may be needed when net merges into net-next.

Unfortunately this patch doesn't apply to net-next,
which is a requirement for our workflow.

-- 
pw-bot: changes-requested

^ permalink raw reply

* Re: [PATCH nf] netfilter: arp_tables: fix IEEE1394 ARP payload parsing in arp_packet_match()
From: Simon Horman @ 2026-04-12 12:47 UTC (permalink / raw)
  To: Weiming Shi
  Cc: Pablo Neira Ayuso, Florian Westphal, David S . Miller,
	David Ahern, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Phil Sutter, netfilter-devel, coreteam, netdev, Xiang Mei
In-Reply-To: <20260408073515.79296-2-bestswngs@gmail.com>

On Wed, Apr 08, 2026 at 03:35:16PM +0800, Weiming Shi wrote:
> arp_packet_match() unconditionally parses the ARP payload assuming two
> hardware addresses are present (source and target). However,
> IPv4-over-IEEE1394 ARP (RFC 2734) omits the target hardware address
> field, and arp_hdr_len() already accounts for this by returning a
> shorter length for ARPHRD_IEEE1394 devices.
> 
> As a result, on IEEE1394 interfaces arp_packet_match() advances past a
> nonexistent target hardware address and reads the wrong bytes for both
> the target device address comparison and the target IP address. This
> causes arptables rules to match against garbage data, leading to
> incorrect filtering decisions: packets that should be accepted may be
> dropped and vice versa.
> 
> The ARP stack in net/ipv4/arp.c (arp_create and arp_process) already
> handles this correctly by skipping the target hardware address for
> ARPHRD_IEEE1394. Apply the same pattern to arp_packet_match().
> 
> Fixes: 6752c8db8e0c ("firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection.")
> Reported-by: Xiang Mei <xmei5@asu.edu>
> Signed-off-by: Weiming Shi <bestswngs@gmail.com>
> ---
>  net/ipv4/netfilter/arp_tables.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
> index 1cdd9c28ab2da..4b2392bdcd0a6 100644
> --- a/net/ipv4/netfilter/arp_tables.c
> +++ b/net/ipv4/netfilter/arp_tables.c
> @@ -86,7 +86,7 @@ static inline int arp_packet_match(const struct arphdr *arphdr,
>  				   const struct arpt_arp *arpinfo)
>  {
>  	const char *arpptr = (char *)(arphdr + 1);
> -	const char *src_devaddr, *tgt_devaddr;
> +	const char *src_devaddr, *tgt_devaddr = NULL;

I think that it's more in keeping with Kernel code practices
to set tgt_devaddr conditionally.

>  	__be32 src_ipaddr, tgt_ipaddr;
>  	long ret;
>  
> @@ -110,13 +110,23 @@ static inline int arp_packet_match(const struct arphdr *arphdr,
>  	arpptr += dev->addr_len;
>  	memcpy(&src_ipaddr, arpptr, sizeof(u32));
>  	arpptr += sizeof(u32);
> -	tgt_devaddr = arpptr;
> -	arpptr += dev->addr_len;
> +	switch (dev->type) {
> +#if IS_ENABLED(CONFIG_FIREWIRE_NET)
> +	case ARPHRD_IEEE1394:
> +		break;
> +#endif
> +	default:
> +		tgt_devaddr = arpptr;
> +		arpptr += dev->addr_len;
> +		break;
> +	}

While I acknowledge this isn't the approach taken in arp_hdr_len()
I think it would be nicer to use the following construction
which will give build coverage to all paths regardless of if
CONFIG_FIREWIRE_NET is set or not.

	if (IS_ENABLED(CONFIG_FIREWIRE_NET) && dev->type == ARPHRD_IEEE1394) {
		tgt_devaddr = NULL;
	} else {
		tgt_devaddr = arpptr;
		arpptr += dev->addr_len;
	}

Also, I would include a blank line before the if condition.

>  	memcpy(&tgt_ipaddr, arpptr, sizeof(u32));
>  
>  	if (NF_INVF(arpinfo, ARPT_INV_SRCDEVADDR,
>  		    arp_devaddr_compare(&arpinfo->src_devaddr, src_devaddr,
> -					dev->addr_len)) ||
> +					dev->addr_len)))
> +		return 0;
> +	if (tgt_devaddr &&
>  	    NF_INVF(arpinfo, ARPT_INV_TGTDEVADDR,
>  		    arp_devaddr_compare(&arpinfo->tgt_devaddr, tgt_devaddr,
>  					dev->addr_len)))
> -- 
> 2.43.0
> 

^ permalink raw reply

* [PATCH] RDS: Fix memory leak in rds_rdma_extra_size()
From: Xiaobo Liu @ 2026-04-12 12:44 UTC (permalink / raw)
  To: Allison Henderson, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, netdev, linux-rdma, rds-devel, linux-kernel,
	Xiaobo Liu

Free iov->iov when copy_from_user() or page count validation fails in rds_rdma_extra_size().

This preserves the existing success path and avoids leaking the allocated iovec array on error.
---
 net/rds/rdma.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/net/rds/rdma.c b/net/rds/rdma.c
index aa6465dc7..91a20c1e2 100644
--- a/net/rds/rdma.c
+++ b/net/rds/rdma.c
@@ -560,6 +560,7 @@ int rds_rdma_extra_size(struct rds_rdma_args *args,
 	struct rds_iovec *vec;
 	struct rds_iovec __user *local_vec;
 	int tot_pages = 0;
+	int ret = 0;
 	unsigned int nr_pages;
 	unsigned int i;
 
@@ -578,16 +579,20 @@ int rds_rdma_extra_size(struct rds_rdma_args *args,
 	vec = &iov->iov[0];
 
 	if (copy_from_user(vec, local_vec, args->nr_local *
-			   sizeof(struct rds_iovec)))
-		return -EFAULT;
+			   sizeof(struct rds_iovec))) {
+		ret = -EFAULT;
+		goto out;
+	}
 	iov->len = args->nr_local;
 
 	/* figure out the number of pages in the vector */
 	for (i = 0; i < args->nr_local; i++, vec++) {
 
 		nr_pages = rds_pages_in_vec(vec);
-		if (nr_pages == 0)
-			return -EINVAL;
+		if (nr_pages == 0) {
+			ret = -EINVAL;
+			goto out;
+		}
 
 		tot_pages += nr_pages;
 
@@ -595,11 +600,20 @@ int rds_rdma_extra_size(struct rds_rdma_args *args,
 		 * nr_pages for one entry is limited to (UINT_MAX>>PAGE_SHIFT)+1,
 		 * so tot_pages cannot overflow without first going negative.
 		 */
-		if (tot_pages < 0)
-			return -EINVAL;
+		if (tot_pages < 0) {
+			ret = -EINVAL;
+			goto out;
+		}
 	}
 
-	return tot_pages * sizeof(struct scatterlist);
+	ret = tot_pages * sizeof(struct scatterlist);
+
+out:
+	if (ret < 0) {
+		kfree(iov->iov);
+		iov->iov = NULL;
+	}
+	return ret;
 }
 
 /*
-- 
2.34.1


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox