Netdev List

* Re: [PATCH net] net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
From: patchwork-bot+netdevbpf @ 2026-04-14 13:20 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, linux-arm-kernel,
	linux-mediatek, netdev
In-Reply-To: <20260412-airoha_ppe_hw_init-missing-bits-v1-1-06ac670819e3@kernel.org>

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Sun, 12 Apr 2026 10:43:26 +0200 you wrote:
> Add the following PPE configuration in airoha_ppe_hw_init routine:
> - 6RD hw offloading is currently not supported by Netfilter flowtable.
>   Disable explicitly PPE 6RD offloading in order to prevent PPE to learn
>   6RD flows and eventually interrupt the traffic.
> - Add missing PPE bind rate configuration for L3 and L2 traffic.
>   PPE bind rate configuration specifies the pps threshold to move a PPE
>   entry state from UNBIND to BIND. Without this configuration this value
>   is random.
> - Set ageing thresholds to the values used in the vendor SDK in order to
>   improve connection stability under load and avoid packet loss caused by
>   fast aging.
> 
> [...]

Here is the summary with links:
  - [net] net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
    https://git.kernel.org/netdev/net/c/b9d8b856689d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

* Re: [PATCH] net: wwan: t7xx: validate port_count against message length in t7xx_port_enum_msg_handler
From: Willy Tarreau @ 2026-04-14 13:17 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Pavitra Jha, chandrashekar.devegowda, linux-wwan, netdev, stable
In-Reply-To: <3b67dedb-3472-4322-9a30-32bf8e3cef99@redhat.com>

On Tue, Apr 14, 2026 at 11:41:54AM +0200, Paolo Abeni wrote:
> On 4/11/26 10:39 AM, Pavitra Jha wrote:
> > t7xx_port_enum_msg_handler() uses the modem-supplied port_count field as
> > a loop bound over port_msg->data[] without checking that the message buffer
> > contains sufficient data. A modem sending port_count=65535 in a 12-byte
> > buffer triggers a slab-out-of-bounds read of up to 262140 bytes.
> > 
> > Add a struct_size() check after extracting port_count and before the loop.
> > Pass msg_len from both call sites: skb->len at the DPMAIF path after
> > skb_pull(), and the captured rt_feature->data_len at the handshake path.
> > 
> > Fixes: 1e3e8eb9b6e3 ("net: wwan: t7xx: Add control DMA interface")
> 
> Wrong fixes tag:
> 
> fatal: ambiguous argument '1e3e8eb9b6e3': unknown revision or path not
> in the working tree.

Interesting, there isn't a single digit correct here! The matching one
I'm finding based on the subject is:

  39d439047f1d ("net: wwan: t7xx: Add control DMA interface")

Willy

> > diff --git a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
> > index ae632ef96..d984a688d 100644
> > --- a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
> > +++ b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
> > @@ -124,7 +124,7 @@ static int fsm_ee_message_handler(struct t7xx_port *port, struct t7xx_fsm_ctl *c
> >   * * 0		- Success.
> >   * * -EFAULT	- Message check failure.
> >   */
> > -int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg)
> > +int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg, size_t msg_len)
> 
> Undocumented new argument
> 
> /P
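
For reference, a minimal userspace sketch of the length check the patch
description asks for. The layout below (a 12-byte header followed by 4-byte
entries, with a 16-bit port_count) is an assumption inferred from the numbers
in the commit message (12-byte buffer, 65535 * 4 = 262140 bytes);
struct port_msg_sketch and port_msg_len_ok() are hypothetical names, not the
driver's. In the kernel, struct_size() would do the same arithmetic with
overflow saturation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the modem's port enumeration message. */
struct port_msg_sketch {
	uint32_t head_pattern;
	uint32_t info;		/* assumed to carry the 16-bit port_count */
	uint32_t tail_pattern;
	uint32_t data[];	/* one 32-bit entry per port (assumed) */
};

/*
 * Return true only if a buffer of msg_len bytes can hold the fixed header
 * plus port_count entries. port_count is 16 bits wide here, so the
 * multiplication cannot overflow size_t.
 */
static bool port_msg_len_ok(size_t msg_len, uint16_t port_count)
{
	size_t need = sizeof(struct port_msg_sketch) +
		      (size_t)port_count * sizeof(uint32_t);

	return msg_len >= need;
}
```

With this check in front of the loop, the 12-byte message claiming 65535
ports is rejected before any entry is read.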


* Re: [PATCH net] net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()
From: Lorenzo Bianconi @ 2026-04-14 13:15 UTC (permalink / raw)
  To: Simon Horman
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, linux-arm-kernel,
	linux-mediatek, netdev
In-Reply-To: <20260414124641.754059-1-horms@kernel.org>

> From: 'Simon Horman' <horms@kernel.org>
> 
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
>     net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()
> 
> This commit adds a poll loop in airoha_ppe_offload_setup() to wait for the NPU
> PPE configuration to complete before running airoha_ppe_hw_init().
> 
> > diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
> > index c2c32b6833df9..52199f6b39f9b 100644
> > --- a/drivers/net/ethernet/airoha/airoha_ppe.c
> > +++ b/drivers/net/ethernet/airoha/airoha_ppe.c
> [ ... ]
> > @@ -1340,6 +1341,13 @@ static int airoha_ppe_offload_setup(struct airoha_eth *eth)
> >  	if (err)
> >  		goto error_npu_put;
> >  
> > +	/* Wait for NPU PPE configuration to complete */
> > +	err = read_poll_timeout(airoha_fe_rr, val, val, USEC_PER_MSEC,
> > +				100 * USEC_PER_MSEC, false, eth,
> > +				REG_PPE_PPE_FLOW_CFG(0));
> 
> If the SoC has multiple PPE instances, is it sufficient to only wait for
> REG_PPE_PPE_FLOW_CFG(0)?
> 
> Later in airoha_ppe_hw_init(), the code iterates up to eth->soc->num_ppe,
> initializing all available instances:
> 
> drivers/net/ethernet/airoha/airoha_ppe.c:airoha_ppe_hw_init() {
>     ...
> 	for (i = 0; i < eth->soc->num_ppe; i++) {
> 		...
> 		airoha_fe_wr(eth, REG_PPE_TB_BASE(i),
> 			     ppe->foe_dma + sram_tb_size);
>     ...
> }

ack, let me look into it, I will post a v2.

> 
> If we only poll PPE 0, could the driver race with the NPU firmware's
> initialization of subsequent PPEs?
> 
> Additionally, does this poll condition properly ensure the hardware is ready?
> Since the break condition is simply `val`, it exits immediately if the register
> contains any non-zero value.
> 
> If the register retains a non-zero value from a previous session (like across
> driver reloads), or if an unclocked domain returns 0xFFFFFFFF, this poll might
> succeed immediately and bypass the wait.
> 
> Could we clear the register before polling, or wait for a specific bit instead?

I guess checking val != 0 is enough, since we reset the FE block and the
default register value is 0 for both REG_PPE_PPE_FLOW_CFG(0) and
REG_PPE_PPE_FLOW_CFG(1).

Regards,
Lorenzo

> 
> > +	if (err)
> > +		goto error_npu_put;
> > +
> >  	ppe_num_stats_entries = airoha_ppe_get_total_num_stats_entries(ppe);
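
A rough userspace sketch of polling every PPE instance rather than only
instance 0, using the "register reads non-zero" condition discussed above.
The register accessor, the fake register bank, and the retry budget are all
hypothetical stand-ins; in the driver the equivalent would presumably be
read_poll_timeout() on REG_PPE_PPE_FLOW_CFG(i) inside a loop over
eth->soc->num_ppe.

```c
#include <stdint.h>

/* Hypothetical register accessor: returns the flow-config value of PPE i. */
typedef uint32_t (*ppe_read_fn)(void *ctx, unsigned int ppe_index);

/*
 * Poll each PPE instance in turn until its register reads non-zero.
 * At most max_tries reads are issued per instance.
 * Returns 0 when all instances are ready, -1 on timeout.
 */
static int wait_all_ppe_ready(ppe_read_fn read_reg, void *ctx,
			      unsigned int num_ppe, unsigned int max_tries)
{
	for (unsigned int i = 0; i < num_ppe; i++) {
		unsigned int tries = 0;

		while (read_reg(ctx, i) == 0) {
			if (++tries >= max_tries)
				return -1;
		}
	}
	return 0;
}

/* Fake two-instance register bank used only to exercise the loop. */
struct fake_ppe_bank {
	unsigned int ready_after[2];	/* zero reads before PPE i is ready */
	unsigned int reads[2];		/* reads issued so far per PPE */
};

static uint32_t fake_ppe_read(void *ctx, unsigned int i)
{
	struct fake_ppe_bank *bank = ctx;

	bank->reads[i]++;
	return bank->reads[i] > bank->ready_after[i] ? 1 : 0;
}
```

The sketch only covers the multi-instance question; it does not address a
stale non-zero value surviving a reload, which relies on the FE reset
mentioned in the reply.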


* RE: [Intel-wired-lan] [PATCH iwl-next] ice: call netif_keep_dst() once when entering switchdev mode
From: Holda, Patryk @ 2026-04-14 13:13 UTC (permalink / raw)
  To: Simon Horman, Loktionov, Aleksandr
  Cc: intel-wired-lan@lists.osuosl.org, Nguyen, Anthony L,
	netdev@vger.kernel.org, Szycik, Marcin
In-Reply-To: <20260403124133.GA94926@horms.kernel.org>

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Simon Horman
> Sent: Friday, April 3, 2026 2:42 PM
> To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; netdev@vger.kernel.org; Szycik, Marcin
> <marcin.szycik@intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH iwl-next] ice: call netif_keep_dst() once
> when entering switchdev mode
> 
> On Fri, Mar 27, 2026 at 08:22:36AM +0100, Aleksandr Loktionov wrote:
> > From: Marcin Szycik <marcin.szycik@intel.com>
> >
> > netif_keep_dst() only needs to be called once for the uplink VSI, not
> > once for each port representor.  Move it from ice_eswitch_setup_repr()
> > to ice_eswitch_enable_switchdev().
> >
> > Fixes: defd52455aee ("ice: do Tx through PF netdev in slow-path")
> 
> This problem seems to predate the cited commit.
> 
> > Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
> > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Tested-by: Patryk Holda <patryk.holda@intel.com>

* Re: [PATCH RFC bpf-next 1/8] kasan: expose generic kasan helpers
From: Alexis Lothoré @ 2026-04-14 13:12 UTC (permalink / raw)
  To: Andrey Konovalov, Alexis Lothoré (eBPF Foundation)
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, ebpf, Bastien Curutchet, Thomas Petazzoni,
	Xu Kuohai, bpf, linux-kernel, netdev, linux-kselftest,
	linux-stm32, linux-arm-kernel, kasan-dev, linux-mm
In-Reply-To: <CA+fCnZfubV6LgRjO3NQvhrG2Q5o0ftkFFupLWVYS50XDnmCaog@mail.gmail.com>

Hi Andrey, thanks for the prompt review!

On Tue Apr 14, 2026 at 12:19 AM CEST, Andrey Konovalov wrote:
> On Mon, Apr 13, 2026 at 8:29 PM Alexis Lothoré (eBPF Foundation)
> <alexis.lothore@bootlin.com> wrote:
>>

[...]

>> +#ifdef CONFIG_KASAN_GENERIC
>> +void __asan_load1(void *p);
>> +void __asan_store1(void *p);
>> +void __asan_load2(void *p);
>> +void __asan_store2(void *p);
>> +void __asan_load4(void *p);
>> +void __asan_store4(void *p);
>> +void __asan_load8(void *p);
>> +void __asan_store8(void *p);
>> +void __asan_load16(void *p);
>> +void __asan_store16(void *p);
>> +#endif /* CONFIG_KASAN_GENERIC */
>
> This looks ugly, let's not do this unless it's really required.
>
> You can just use kasan_check_read/write() instead - these are public
> wrappers around the same shadow memory checking functions. And they
> also work with the SW_TAGS mode, in case the BPF would want to use
> that mode at some point. (For HW_TAGS, we only have kasan_check_byte()
> that checks a single byte, but it can be extended in the future if
> required to be used by BPF.)

ACK, I'll try to use those kasan_check_read and kasan_check_write rather
than __asan_{load,store}X.

Alexis

-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

* RE: [Intel-wired-lan] [PATCH iwl-next v1 1/3] i40e: prepare for XDP metadata ops support
From: Holda, Patryk @ 2026-04-14 13:12 UTC (permalink / raw)
  To: Loktionov, Aleksandr, Kohei Enju,
	intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org
  Cc: Nguyen, Anthony L, Kitszel, Przemyslaw, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	kohei.enju@gmail.com
In-Reply-To: <IA3PR11MB89861AD556C1C4D863DD4F3EE54CA@IA3PR11MB8986.namprd11.prod.outlook.com>

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Loktionov, Aleksandr
> Sent: Friday, March 20, 2026 7:57 AM
> To: Kohei Enju <kohei@enjuk.jp>; intel-wired-lan@lists.osuosl.org;
> netdev@vger.kernel.org
> Cc: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Andrew Lunn <andrew+netdev@lunn.ch>;
> David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; kohei.enju@gmail.com
> Subject: Re: [Intel-wired-lan] [PATCH iwl-next v1 1/3] i40e: prepare for XDP
> metadata ops support
> 
> 
> 
> > -----Original Message-----
> > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> > Of Kohei Enju
> > Sent: Thursday, March 19, 2026 6:17 PM
> > To: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org
> > Cc: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
> > Przemyslaw <przemyslaw.kitszel@intel.com>; Andrew Lunn
> > <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric
> > Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>;
> Paolo
> > Abeni <pabeni@redhat.com>; kohei.enju@gmail.com; Kohei Enju
> > <kohei@enjuk.jp>
> > Subject: [Intel-wired-lan] [PATCH iwl-next v1 1/3] i40e: prepare for
> > XDP metadata ops support
> >
> > Prepare 'struct i40e_xdp_buff' that contains an xdp_buff and a pointer
> > to i40e_rx_desc in order to pass the RX descriptor to the XDP kfuncs.
> > Also in ZC path, use XSK_CHECK_PRIV_TYPE() to ensure i40e_xdp_buff
> > doesn't exceed the offset of cb in xdp_buff_xsk.
> >
> > No functional changes.
> >
> > Signed-off-by: Kohei Enju <kohei@enjuk.jp>
> > ---
> >  drivers/net/ethernet/intel/i40e/i40e_main.c |  2 +-
> > drivers/net/ethernet/intel/i40e/i40e_txrx.c |  5 ++++-
> > drivers/net/ethernet/intel/i40e/i40e_txrx.h |  7 ++++++-
> > drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 12 ++++++++++++
> >  4 files changed, 23 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c
> > b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > index 31a42ee18aa0..7966d9cb8009 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > @@ -3619,7 +3619,7 @@ static int i40e_configure_rx_ring(struct
> > i40e_ring *ring)
> >  	}
> >
> >  skip:
> > -	xdp_init_buff(&ring->xdp, xdp_frame_sz, &ring->xdp_rxq);
> > +	xdp_init_buff(&ring->xdp_ctx.xdp, xdp_frame_sz, &ring-
> > >xdp_rxq);
> >
> >  	rx_ctx.dbuff = DIV_ROUND_UP(ring->rx_buf_len,
> >  				    BIT_ULL(I40E_RXQ_CTX_DBUFF_SHIFT));
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > index 4ffdb007c41a..cfaf724ee7ff 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > @@ -2438,10 +2438,11 @@ static int i40e_clean_rx_irq(struct i40e_ring
> > *rx_ring, int budget,
> >  			     unsigned int *rx_cleaned)
> >  {
> >  	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
> 
> ...
> 
> >  		xdp_res = i40e_run_xdp_zc(rx_ring, first, xdp_prog);
> >  		i40e_handle_xdp_result_zc(rx_ring, first, rx_desc,
> &rx_packets,
> >  					  &rx_bytes, xdp_res, &failure);
> > --
> > 2.51.0
> 
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


Tested-by: Patryk Holda <patryk.holda@intel.com>

* Re: [PATCH net] net: airoha: Fix max TX packet length configuration
From: Paolo Abeni @ 2026-04-14 13:04 UTC (permalink / raw)
  To: Lorenzo Bianconi, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Simon Horman
  Cc: linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260412-airoha-fix-max-mtu-v1-1-333030d0a564@kernel.org>

On 4/12/26 10:09 AM, Lorenzo Bianconi wrote:
> According to the Airoha documentation, REG_GDM_LEN_CFG() register does not
> include FCS length. Fix MTU configuration removing ETH_FCS_LEN from
> maximum TX packet length configuration.
> 
> Fixes: 54d989d58d2ac ("net: airoha: Move min/max packet len configuration in airoha_dev_open()")
> Fixes: 03b1b69f0662c ("net: airoha: Introduce airoha_dev_change_mtu callback")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/airoha/airoha_eth.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index c14cdce588a7..a81ffda72b39 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -1629,9 +1629,9 @@ static void airoha_update_hw_stats(struct airoha_gdm_port *port)
>  
>  static int airoha_dev_open(struct net_device *dev)
>  {
> -	int err, len = ETH_HLEN + dev->mtu + ETH_FCS_LEN;
>  	struct airoha_gdm_port *port = netdev_priv(dev);
>  	struct airoha_qdma *qdma = port->qdma;
> +	int err, len = ETH_HLEN + dev->mtu;

Sashiko noted that the above may cause a regression: VLAN-tagged frames
carrying a max-MTU payload could be dropped.

/P

* Re: [PATCH net] net: airoha: Fix max TX packet length configuration
From: Lorenzo Bianconi @ 2026-04-14 13:04 UTC (permalink / raw)
  To: Simon Horman
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, linux-arm-kernel,
	linux-mediatek, netdev
In-Reply-To: <20260414124011.751922-1-horms@kernel.org>

> From: 'Simon Horman' <horms@kernel.org>
> 
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> net: airoha: Fix max TX packet length configuration
> 
> This commit fixes the MTU configuration for the Airoha driver by removing
> ETH_FCS_LEN from the maximum TX packet length, matching the hardware
> documentation which states the register does not include the FCS length.
> 
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index c14cdce588a7c..a81ffda72b39f 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> > @@ -1629,9 +1629,9 @@ static void airoha_update_hw_stats(struct airoha_gdm_port *port)
> >  
> >  static int airoha_dev_open(struct net_device *dev)
> >  {
> > -	int err, len = ETH_HLEN + dev->mtu + ETH_FCS_LEN;
> >  	struct airoha_gdm_port *port = netdev_priv(dev);
> >  	struct airoha_qdma *qdma = port->qdma;
> > +	int err, len = ETH_HLEN + dev->mtu;
> 
> Does this regression cause MTU-sized VLAN-tagged frames to be dropped by the
> hardware?
> 
> By limiting the frame size to ETH_HLEN + dev->mtu, the maximum allowed length
> becomes 1514 bytes for a standard 1500 MTU. An Ethernet frame containing a
> standard 802.1Q VLAN tag and a maximum-sized 1500-byte IP payload requires an
> 18-byte L2 header, making the total frame size 1518 bytes (excluding FCS).
> 
> Previously, the inclusion of ETH_FCS_LEN inadvertently provided the 4-byte
> headroom needed to accommodate a single VLAN tag. 
> 
> Could the configuration explicitly account for VLAN headers using VLAN_ETH_HLEN
> instead?

ack, thx for pointing this out. I will fix it in v2.

Regards,
Lorenzo

> 
> [ ... ]
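
The arithmetic in the review can be checked with a short sketch. The
constants match the kernel's if_ether.h and if_vlan.h values; the helper
names are hypothetical, and the VLAN_ETH_HLEN variant is only one possible
shape for v2, not the actual fix.

```c
#define ETH_HLEN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_HLEN	4	/* one 802.1Q tag */
#define VLAN_ETH_HLEN	(ETH_HLEN + VLAN_HLEN)
#define ETH_FCS_LEN	4	/* frame check sequence */

/* Limit before the patch: the FCS bytes doubled as VLAN headroom. */
static unsigned int max_len_before(unsigned int mtu)
{
	return ETH_HLEN + mtu + ETH_FCS_LEN;
}

/* Limit with the patch as posted: no room for a VLAN tag. */
static unsigned int max_len_posted(unsigned int mtu)
{
	return ETH_HLEN + mtu;
}

/* Possible alternative: account for one VLAN tag explicitly. */
static unsigned int max_len_vlan(unsigned int mtu)
{
	return VLAN_ETH_HLEN + mtu;
}

/* On-wire length (excluding FCS) of a single-tagged frame. */
static unsigned int vlan_frame_len(unsigned int payload)
{
	return VLAN_ETH_HLEN + payload;
}
```

For a 1500-byte MTU, a single-tagged full-size frame is 1518 bytes, which
fits the old limit (1518) and the VLAN-aware limit (1518) but exceeds the
posted limit (1514).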


* Re: [net,PATCH v3 2/2] net: ks8851: Avoid excess softirq scheduling
From: Sebastian Andrzej Siewior @ 2026-04-14 13:02 UTC (permalink / raw)
  To: Marek Vasut
  Cc: netdev, stable, David S. Miller, Andrew Lunn, Eric Dumazet,
	Jakub Kicinski, Nicolai Buchwitz, Paolo Abeni, Ronald Wahl,
	Yicong Hui, linux-kernel
In-Reply-To: <20260414103327.113500-2-marex@nabladev.com>

On 2026-04-14 12:32:53 [+0200], Marek Vasut wrote:
> The code injects a packet into netif_rx() repeatedly, which will add
> it to its internal NAPI and schedule a softirq, and process it. It is
> more efficient to queue multiple packets and process them all at the
> local_bh_enable() time.
> 
> Fixes: e0863634bf9f ("net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs")
> Cc: stable@vger.kernel.org
> Signed-off-by: Marek Vasut <marex@nabladev.com>

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian


* Re: [PATCH iwl-next v2 1/8] ixgbe: lower IXGBE_ITR_ADAPTIVE_MAX_USECS to prevent RX starvation
From: Simon Horman @ 2026-04-14 12:58 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: 'Simon Horman', intel-wired-lan, anthony.l.nguyen, netdev
In-Reply-To: <20260408131216.2662245-2-aleksandr.loktionov@intel.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ixgbe: lower IXGBE_ITR_ADAPTIVE_MAX_USECS to prevent RX starvation

The patch lowers IXGBE_ITR_ADAPTIVE_MAX_USECS from 126 to 84 to prevent
descriptor ring starvation under sustained full-line-rate bulk RX traffic.
It states that this provides enough headroom to drain the ring before it wraps.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 59a1cee40b436..eedee3b2e72c8 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -475,7 +475,7 @@ static inline unsigned int ixgbe_rx_pg_order(struct ixgbe_ring *ring)
>  
>  #define IXGBE_ITR_ADAPTIVE_MIN_INC	2
>  #define IXGBE_ITR_ADAPTIVE_MIN_USECS	10
> -#define IXGBE_ITR_ADAPTIVE_MAX_USECS	126
> +#define IXGBE_ITR_ADAPTIVE_MAX_USECS	84

Will the hardware still be configured with a 126 us delay during bulk traffic?

Looking at ixgbe_update_itr(), when traffic is categorized as bulk
(packets >= 256), itr is initialized to IXGBE_ITR_ADAPTIVE_BULK (0).
For a 10Gbps link, avg_wire_size plateaus at 32256:

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:ixgbe_update_itr() {
    ...
	} else {
		/* plateau at a limit of 8K ints/sec */
		avg_wire_size = 32256;
	}
    ...
}

The itr is then calculated and updated directly:

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:ixgbe_update_itr() {
    ...
	switch (q_vector->adapter->link_speed) {
	case IXGBE_LINK_SPEED_10GB_FULL:
	case IXGBE_LINK_SPEED_100_FULL:
	default:
		itr += DIV_ROUND_UP(avg_wire_size,
				    IXGBE_ITR_ADAPTIVE_MIN_INC * 256) *
		       IXGBE_ITR_ADAPTIVE_MIN_INC;
		break;
    ...
}

With IXGBE_ITR_ADAPTIVE_MIN_INC being 2, the calculation is
DIV_ROUND_UP(32256, 512) * 2, which equals exactly 126.

There is no clamping against IXGBE_ITR_ADAPTIVE_MAX_USECS applied after this
bulk calculation, so ring_container->itr becomes 126.

Later, in ixgbe_set_itr(), the final ITR is determined by taking the minimum
between the rx and tx values:

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:ixgbe_set_itr() {
    ...
	new_itr = min(q_vector->rx.itr, q_vector->tx.itr);
    ...
}

If the TX ring is idle or operating in latency mode, it will have the
IXGBE_ITR_ADAPTIVE_LATENCY flag (0x80) set.  Even though its base value is
clamped to 84, the latency flag makes it numerically 212.

Does this mean min(126, 212) will return the RX bulk value of 126, leaving
the interrupt delay at 126 us and descriptor ring starvation unresolved?

>  #define IXGBE_ITR_ADAPTIVE_LATENCY	0x80
>  #define IXGBE_ITR_ADAPTIVE_BULK		0x00
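
The numbers above can be reproduced with a small standalone sketch. The
macros mirror the values quoted in this review; bulk_itr(), latency_itr()
and min_u() are hypothetical helpers that model only the arithmetic, not the
full ixgbe_update_itr() / ixgbe_set_itr() logic.

```c
#define DIV_ROUND_UP(n, d)		(((n) + (d) - 1) / (d))

#define IXGBE_ITR_ADAPTIVE_MIN_INC	2
#define IXGBE_ITR_ADAPTIVE_MAX_USECS	84	/* value proposed by the patch */
#define IXGBE_ITR_ADAPTIVE_LATENCY	0x80
#define IXGBE_ITR_ADAPTIVE_BULK		0x00

/* Bulk-mode ITR for a given avg_wire_size; no clamp is applied afterwards. */
static unsigned int bulk_itr(unsigned int avg_wire_size)
{
	unsigned int itr = IXGBE_ITR_ADAPTIVE_BULK;

	itr += DIV_ROUND_UP(avg_wire_size,
			    IXGBE_ITR_ADAPTIVE_MIN_INC * 256) *
	       IXGBE_ITR_ADAPTIVE_MIN_INC;
	return itr;
}

/* Latency-mode ITR: the base value with the latency flag set. */
static unsigned int latency_itr(unsigned int usecs)
{
	return usecs | IXGBE_ITR_ADAPTIVE_LATENCY;
}

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}
```

bulk_itr(32256) evaluates to 126 and latency_itr(84) to 212, so the minimum
of the two is still 126, which is the concern raised above.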


* Re: [net,PATCH v3 1/2] net: ks8851: Reinstate disabling of BHs around IRQ handler
From: Sebastian Andrzej Siewior @ 2026-04-14 12:57 UTC (permalink / raw)
  To: Marek Vasut
  Cc: netdev, stable, David S. Miller, Andrew Lunn, Eric Dumazet,
	Jakub Kicinski, Nicolai Buchwitz, Paolo Abeni, Ronald Wahl,
	Yicong Hui, linux-kernel
In-Reply-To: <20260414103327.113500-1-marex@nabladev.com>

On 2026-04-14 12:32:52 [+0200], Marek Vasut wrote:
> If CONFIG_PREEMPT_RT=y is set AND the driver executes ks8851_irq() AND
> KSZ_ISR register bit IRQ_RXI is set AND ks8851_rx_pkts() detects that
> there are packets in the RX FIFO, then netdev_alloc_skb_ip_align() is
> called to allocate SKBs. If netdev_alloc_skb_ip_align() is called with
> BH enabled, local_bh_enable() at the end of netdev_alloc_skb_ip_align()
> will call __local_bh_enable_ip(), which will call __do_softirq(), which
> may trigger net_tx_action() softirq, which may ultimately call the xmit
> callback ks8851_start_xmit_par(). The ks8851_start_xmit_par() will try
> to lock struct ks8851_net_par .lock spinlock, which is already locked
> by ks8851_irq() from which ks8851_start_xmit_par() was called. This
> leads to a deadlock, which is reported by the kernel, including a trace
> listed below.

#1 [received RX packet and a] TX packet has been sent
#2 Driver enables TX queue via netif_wake_queue() which schedules TX
   softirq to queue packets for this device.
#3 After spin_unlock_bh(&ks->statelock) the pending softirqs will be
   processed
#4 This deadlocks because of recursive locking via ks8851_net::lock in
   ks8851_irq() and ks8851_start_xmit_par().

This is what happens since commit 0913ec336a6c0 ("net: ks8851: Fix
deadlock with the SPI chip variant"). Before that commit, the softirq
execution was picked up by netdev_alloc_skb_ip_align(), which required
PREEMPT_RT and an RX packet in #1 to trigger the deadlock.

> Fix the problem by disabling BH around critical sections, including the
> IRQ handler, thus preventing the net_tx_action() softirq from triggering
> during these critical sections. The net_tx_action() softirq is triggered
> at the end of the IRQ handler, once all the other IRQ handler actions have
> been completed.
> 
>  __schedule from schedule_rtlock+0x1c/0x34
>  schedule_rtlock from rtlock_slowlock_locked+0x548/0x904
>  rtlock_slowlock_locked from rt_spin_lock+0x60/0x9c
>  rt_spin_lock from ks8851_start_xmit_par+0x74/0x1a8
>  ks8851_start_xmit_par from netdev_start_xmit+0x20/0x44
>  netdev_start_xmit from dev_hard_start_xmit+0xd0/0x188
>  dev_hard_start_xmit from sch_direct_xmit+0xb8/0x25c
>  sch_direct_xmit from __qdisc_run+0x1f8/0x4ec
>  __qdisc_run from qdisc_run+0x1c/0x28
>  qdisc_run from net_tx_action+0x1f0/0x268
>  net_tx_action from handle_softirqs+0x1a4/0x270
>  handle_softirqs from __local_bh_enable_ip+0xcc/0xe0
>  __local_bh_enable_ip from __alloc_skb+0xd8/0x128
>  __alloc_skb from __netdev_alloc_skb+0x3c/0x19c
>  __netdev_alloc_skb from ks8851_irq+0x388/0x4d4
>  ks8851_irq from irq_thread_fn+0x24/0x64
>  irq_thread_fn from irq_thread+0x178/0x28c
>  irq_thread from kthread+0x12c/0x138
>  kthread from ret_from_fork+0x14/0x28

The backtrace and the description here are based on an older kernel.
However:

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> Fixes: e0863634bf9f ("net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs")
> Cc: stable@vger.kernel.org
> Signed-off-by: Marek Vasut <marex@nabladev.com>

Sebastian


* Re: [PATCH net] net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
From: Simon Horman @ 2026-04-14 12:48 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260412-airoha_ppe_hw_init-missing-bits-v1-1-06ac670819e3@kernel.org>

On Sun, Apr 12, 2026 at 10:43:26AM +0200, Lorenzo Bianconi wrote:
> Add the following PPE configuration in airoha_ppe_hw_init routine:
> - 6RD hw offloading is currently not supported by Netfilter flowtable.
>   Disable explicitly PPE 6RD offloading in order to prevent PPE to learn
>   6RD flows and eventually interrupt the traffic.
> - Add missing PPE bind rate configuration for L3 and L2 traffic.
>   PPE bind rate configuration specifies the pps threshold to move a PPE
>   entry state from UNBIND to BIND. Without this configuration this value
>   is random.
> - Set ageing thresholds to the values used in the vendor SDK in order to
>   improve connection stability under load and avoid packet loss caused by
>   fast aging.
> 
> Fixes: 00a7678310fe3 ("net: airoha: Introduce flowtable offload support")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Reviewed-by: Simon Horman <horms@kernel.org>

* Re: [PATCH net] net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()
From: Simon Horman @ 2026-04-14 12:46 UTC (permalink / raw)
  To: lorenzo
  Cc: 'Simon Horman', andrew+netdev, davem, edumazet, kuba,
	pabeni, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260412-airoha-wait-for-npu-config-offload-setup-v1-1-f4e0aa2a5d85@kernel.org>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
    net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()

This commit adds a poll loop in airoha_ppe_offload_setup() to wait for the NPU
PPE configuration to complete before running airoha_ppe_hw_init().

> diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
> index c2c32b6833df9..52199f6b39f9b 100644
> --- a/drivers/net/ethernet/airoha/airoha_ppe.c
> +++ b/drivers/net/ethernet/airoha/airoha_ppe.c
[ ... ]
> @@ -1340,6 +1341,13 @@ static int airoha_ppe_offload_setup(struct airoha_eth *eth)
>  	if (err)
>  		goto error_npu_put;
>  
> +	/* Wait for NPU PPE configuration to complete */
> +	err = read_poll_timeout(airoha_fe_rr, val, val, USEC_PER_MSEC,
> +				100 * USEC_PER_MSEC, false, eth,
> +				REG_PPE_PPE_FLOW_CFG(0));

If the SoC has multiple PPE instances, is it sufficient to only wait for
REG_PPE_PPE_FLOW_CFG(0)?

Later in airoha_ppe_hw_init(), the code iterates up to eth->soc->num_ppe,
initializing all available instances:

drivers/net/ethernet/airoha/airoha_ppe.c:airoha_ppe_hw_init() {
    ...
	for (i = 0; i < eth->soc->num_ppe; i++) {
		...
		airoha_fe_wr(eth, REG_PPE_TB_BASE(i),
			     ppe->foe_dma + sram_tb_size);
    ...
}

If we only poll PPE 0, could the driver race with the NPU firmware's
initialization of subsequent PPEs?

Additionally, does this poll condition properly ensure the hardware is ready?
Since the break condition is simply `val`, it exits immediately if the register
contains any non-zero value.

If the register retains a non-zero value from a previous session (like across
driver reloads), or if an unclocked domain returns 0xFFFFFFFF, this poll might
succeed immediately and bypass the wait.

Could we clear the register before polling, or wait for a specific bit instead?

> +	if (err)
> +		goto error_npu_put;
> +
>  	ppe_num_stats_entries = airoha_ppe_get_total_num_stats_entries(ppe);


* Re: [PATCH net 1/1] ipv6: xfrm6: release dst on error in xfrm6_rcv_encap()
From: Simon Horman @ 2026-04-14 12:43 UTC (permalink / raw)
  To: n05ec
  Cc: 'Simon Horman', netdev, steffen.klassert, herbert, davem,
	dsahern, edumazet, kuba, pabeni, sd, yifanwucs, tomapufckgml,
	yuantan098, bird, caoruide123, zylzyl2333
In-Reply-To: <fc06a81e1c1cbbb3e918825a718046cb18204f3c.1775886483.git.zylzyl2333@gmail.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ipv6: xfrm6: release dst on error in xfrm6_rcv_encap()

This commit fixes a dst entry leak in xfrm6_rcv_encap() by ensuring
the dst is released when ip6_route_input_lookup() returns an error route.

> diff --git a/net/ipv6/xfrm6_protocol.c b/net/ipv6/xfrm6_protocol.c
> index ea2f805d3b014..9b586fcec4850 100644
> --- a/net/ipv6/xfrm6_protocol.c
> +++ b/net/ipv6/xfrm6_protocol.c
> @@ -88,8 +88,10 @@ int xfrm6_rcv_encap(struct sk_buff *skb, int nexthdr, __be32 spi,
>  
>  		dst = ip6_route_input_lookup(dev_net(skb->dev), skb->dev, &fl6,
>  					     skb, flags);

This isn't a bug introduced by this patch, but is it possible for skb->dev
to be NULL here?

When ESP packets are delivered over TCP, they are processed asynchronously
via handle_esp() in net/xfrm/espintcp.c. This dynamically resolves skb->dev
using dev_get_by_index_rcu(). If the interface was removed before this runs,
it returns NULL.

Because TCP receive paths drop the skb's dst entry, skb_dst(skb) is NULL.
When handle_esp() passes the skb to xfrm6_rcv_encap(), it enters the
!skb_dst(skb) block and unconditionally accesses skb->dev->ifindex and
calls dev_net(skb->dev).

Could this result in a NULL pointer dereference if the interface is torn
down while packets are being processed asynchronously?

> -		if (dst->error)
> +		if (dst->error) {
> +			dst_release(dst);
>  			goto drop;
> +		}
>  		skb_dst_set(skb, dst);
>  	}
>
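
The leak pattern the patch fixes can be modelled in a few lines of userspace
C. This is only a sketch: fake_dst, fake_route_lookup() and fake_rcv_encap()
are hypothetical names, and the one assumption carried over from the commit
message is that the lookup hands back a referenced entry even when it is an
error route.

```c
/* Hypothetical stand-in for a refcounted route entry. */
struct fake_dst {
	int refcnt;
	int error;	/* non-zero marks an error route */
};

/* The lookup always returns the entry with an extra reference held. */
static struct fake_dst *fake_route_lookup(struct fake_dst *route)
{
	route->refcnt++;
	return route;
}

static void fake_dst_release(struct fake_dst *dst)
{
	dst->refcnt--;
}

/*
 * Mirrors the shape of the fixed error path: on an error route the
 * reference taken by the lookup is dropped before bailing out. On success
 * the reference is kept (in the kernel it would be handed to the skb).
 */
static int fake_rcv_encap(struct fake_dst *route)
{
	struct fake_dst *dst = fake_route_lookup(route);

	if (dst->error) {
		fake_dst_release(dst);
		return -1;
	}
	return 0;
}
```

Without the release call, each packet hitting an error route would leave the
reference count one higher than before.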


* Re: [patch 31/38] parisc: Select ARCH_HAS_RANDOM_ENTROPY
From: Helge Deller @ 2026-04-14 12:41 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: linux-parisc, Arnd Bergmann, x86, Lu Baolu, iommu,
	Michael Grzeschik, netdev, linux-wireless, Herbert Xu,
	linux-crypto, Vlastimil Babka, linux-mm, David Woodhouse,
	Bernie Thompson, linux-fbdev, Theodore Tso, linux-ext4,
	Andrew Morton, Uladzislau Rezki, Marco Elver, Dmitry Vyukov,
	kasan-dev, Andrey Ryabinin, Thomas Sailer, linux-hams,
	Jason A. Donenfeld, Richard Henderson, linux-alpha, Russell King,
	linux-arm-kernel, Catalin Marinas, Huacai Chen, loongarch,
	Geert Uytterhoeven, linux-m68k, Dinh Nguyen, Jonas Bonn,
	linux-openrisc, Michael Ellerman, linuxppc-dev, Paul Walmsley,
	linux-riscv, Heiko Carstens, linux-s390, David S. Miller,
	sparclinux
In-Reply-To: <20260410120319.658485572@kernel.org>

On 4/10/26 14:21, Thomas Gleixner wrote:
> The only remaining non-architecture usage of get_cycles() is to provide
> random_get_entropy().
> 
> Switch parisc over to the new scheme of selecting ARCH_HAS_RANDOM_ENTROPY
> and providing random_get_entropy() in asm/random.h.
> 
> Add 'asm/timex.h' includes to the relevant files, so the global include can
> be removed once all architectures are converted over.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Helge Deller <deller@gmx.de>
> Cc: linux-parisc@vger.kernel.org
> ---
>   arch/parisc/Kconfig              |    1 +
>   arch/parisc/include/asm/random.h |   12 ++++++++++++
>   arch/parisc/include/asm/timex.h  |    6 ------
>   arch/parisc/kernel/processor.c   |    1 +
>   arch/parisc/kernel/time.c        |    1 +
>   5 files changed, 15 insertions(+), 6 deletions(-)

I tested this series on parisc.
Works as expected.

Tested-by: Helge Deller <deller@gmx.de>

Thanks!
Helge

^ permalink raw reply

* Re: [PATCH v3 1/3] net: dsa: microchip: implement KSZ87xx Module 3 low-loss cable errata
From: Andrew Lunn @ 2026-04-14 12:40 UTC (permalink / raw)
  To: Marek Vasut
  Cc: Fidelio Lawson, Woojung Huh, UNGLinuxDriver, Vladimir Oltean,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Marek Vasut, Maxime Chevallier, Simon Horman, Heiner Kallweit,
	Russell King, netdev, linux-kernel, Fidelio Lawson
In-Reply-To: <ea90a671-70be-4d89-b842-1e54d687336f@nabladev.com>

On Tue, Apr 14, 2026 at 01:05:49PM +0200, Marek Vasut wrote:
> On 4/14/26 11:12 AM, Fidelio Lawson wrote:
> > Implement the "Module 3: Equalizer fix for short cables" erratum from
> > Microchip document DS80000687C for KSZ87xx switches.
> > 
> > The issue affects short or low-loss cable links (e.g. CAT5e/CAT6),
> > where the PHY receiver equalizer may amplify high-amplitude signals
> > excessively, resulting in internal distortion and link establishment
> > failures.
> > 
> > KSZ87xx devices require a workaround for the Module 3 low-loss cable
> > condition, controlled through the switch TABLE_LINK_MD_V indirect
> > registers.
> > 
> > The affected registers are part of the switch address space and are not
> > directly accessible from the PHY driver. To keep the PHY-facing API
> > clean and avoid leaking switch-specific details, model this errata
> > control as vendor-specific Clause 22 PHY registers.
> > 
> > A vendor-specific Clause 22 PHY register is introduced as a mode
> > selector in PHY_REG_LOW_LOSS_CTRL, and ksz8_r_phy() / ksz8_w_phy()
> > translate accesses to these bits into the appropriate indirect
> > TABLE_LINK_MD_V accesses.
> > 
> > The control register defines the following modes:
> > 0: disabled (default behavior)
> > 1: EQ training workaround
> > 2: LPF 90 MHz
> > 3: LPF 62 MHz
> > 4: LPF 55 MHz
> > 5: LPF 44 MHz
> I may not fully understand this, but aren't the EQ and LPF settings
> orthogonal ?

What is the real life experience using this feature? Is it needed for
1cm cables, but most > 1m cables are O.K with the defaults? Do we need
all these configuration options? How is a user supposed to discover
the different options? Can we simplify it down to a Boolean?

Ethernet is just supposed to work with any valid length of cable,
KISS. So maybe we should try to keep this feature KISS. Just tell the
driver it is a short cable, pick different defaults which should work
with any short cable?

A boolean should also help with making this tunable reusable with
other devices. It is unlikely any other devices have these same
configuration options, unless it is from the same vendor.

     Andrew

^ permalink raw reply

* Re: [PATCH net] net: airoha: Fix max TX packet length configuration
From: Simon Horman @ 2026-04-14 12:40 UTC (permalink / raw)
  To: lorenzo
  Cc: 'Simon Horman', andrew+netdev, davem, edumazet, kuba,
	pabeni, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260412-airoha-fix-max-mtu-v1-1-333030d0a564@kernel.org>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
net: airoha: Fix max TX packet length configuration

This commit fixes the MTU configuration for the Airoha driver by removing
ETH_FCS_LEN from the maximum TX packet length, matching the hardware
documentation which states the register does not include the FCS length.

> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index c14cdce588a7c..a81ffda72b39f 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -1629,9 +1629,9 @@ static void airoha_update_hw_stats(struct airoha_gdm_port *port)
>  
>  static int airoha_dev_open(struct net_device *dev)
>  {
> -	int err, len = ETH_HLEN + dev->mtu + ETH_FCS_LEN;
>  	struct airoha_gdm_port *port = netdev_priv(dev);
>  	struct airoha_qdma *qdma = port->qdma;
> +	int err, len = ETH_HLEN + dev->mtu;

Could this change cause MTU-sized VLAN-tagged frames to be dropped by the
hardware?

By limiting the frame size to ETH_HLEN + dev->mtu, the maximum allowed length
becomes 1514 bytes for a standard 1500 MTU. An Ethernet frame containing a
standard 802.1Q VLAN tag and a maximum-sized 1500-byte IP payload requires an
18-byte L2 header, making the total frame size 1518 bytes (excluding FCS).

Previously, the inclusion of ETH_FCS_LEN inadvertently provided the 4-byte
headroom needed to accommodate a single VLAN tag.

Could the configuration explicitly account for VLAN headers using VLAN_ETH_HLEN
instead?
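
The arithmetic behind the question can be checked with a small user-space
sketch. The constants repeat the values from the kernel headers so that the
file builds on its own; max_tx_len() and vlan_frame_fits() are hypothetical
helpers for illustration, not functions from the airoha driver, and whether
the hardware limit should cover the VLAN tag is still the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Same values as the kernel headers, redefined for a user-space build. */
#define ETH_HLEN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_HLEN	4	/* one 802.1Q tag */
#define VLAN_ETH_HLEN	(ETH_HLEN + VLAN_HLEN)
#define ETH_FCS_LEN	4

/* Hypothetical helper: frame length limit (FCS excluded) for a given
 * MTU and L2 header length.
 */
static unsigned int max_tx_len(unsigned int mtu, unsigned int l2_hlen)
{
	assert(l2_hlen >= ETH_HLEN);
	return l2_hlen + mtu;
}

/* Hypothetical helper: does a full-MTU frame with one VLAN tag fit? */
static bool vlan_frame_fits(unsigned int mtu, unsigned int limit)
{
	return VLAN_ETH_HLEN + mtu <= limit;
}
```

For a 1500-byte MTU, max_tx_len(1500, ETH_HLEN) is 1514, while a
single-tagged frame needs 1518, which is also what the old
ETH_HLEN + mtu + ETH_FCS_LEN expression happened to yield.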

[ ... ]

^ permalink raw reply

* Re: [PATCH net 1/1] ipv6: xfrm6: release dst on error in xfrm6_rcv_encap()
From: Simon Horman @ 2026-04-14 12:33 UTC (permalink / raw)
  To: Ren Wei
  Cc: netdev, steffen.klassert, herbert, davem, dsahern, edumazet, kuba,
	pabeni, sd, yifanwucs, tomapufckgml, yuantan098, bird,
	caoruide123, zylzyl2333
In-Reply-To: <fc06a81e1c1cbbb3e918825a718046cb18204f3c.1775886483.git.zylzyl2333@gmail.com>

On Sun, Apr 12, 2026 at 01:07:54PM +0800, Ren Wei wrote:
> From: Yilin Zhu <zylzyl2333@gmail.com>
> 
> xfrm6_rcv_encap() performs an IPv6 route lookup when the skb does not
> already have a dst attached. ip6_route_input_lookup() returns a
> referenced dst entry even when the lookup resolves to an error route.
> 
> If dst->error is set, xfrm6_rcv_encap() drops the skb without attaching
> the dst to the skb and without releasing the reference returned by the
> lookup. Repeated packets hitting this path therefore leak dst entries.
> 
> Release the dst before jumping to the drop path.
> 
> Fixes: 0146dca70b87 ("xfrm: add support for UDPv6 encapsulation of ESP")
> Cc: stable@kernel.org
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Co-developed-by: Yuan Tan <yuantan098@gmail.com>
> Signed-off-by: Yuan Tan <yuantan098@gmail.com>
> Suggested-by: Xin Liu <bird@lzu.edu.cn>
> Tested-by: Ruide Cao <caoruide123@gmail.com>
> Signed-off-by: Yilin Zhu <zylzyl2333@gmail.com>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
>  net/ipv6/xfrm6_protocol.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* linux-next: manual merge of the bpf-next tree with the origin tree
From: Mark Brown @ 2026-04-14 12:18 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko, bpf,
	Networking
  Cc: Joel Fernandes, Kumar Kartikeya Dwivedi,
	Linux Kernel Mailing List, Linux Next Mailing List,
	Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 1538 bytes --]

Hi all,

Today's linux-next merge of the bpf-next tree got a conflict in:

  include/linux/rcupdate.h

between commit:

  ad6ef775cbeff ("rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods")

from the origin tree and commit:

  57b23c0f612dc ("bpf: Retire rcu_trace_implies_rcu_gp()")

from the bpf-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

diff --combined include/linux/rcupdate.h
index 18a85c30fd4f3,bfa765132de85..0000000000000
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@@ -205,15 -205,6 +205,6 @@@ static inline void exit_tasks_rcu_start
  static inline void exit_tasks_rcu_finish(void) { }
  #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
  
- /**
-  * rcu_trace_implies_rcu_gp - does an RCU Tasks Trace grace period imply an RCU grace period?
-  *
-  * Now that RCU Tasks Trace is implemented in terms of SRCU-fast, a
-  * call to synchronize_rcu_tasks_trace() is guaranteed to imply at least
-  * one call to synchronize_rcu().
-  */
- static inline bool rcu_trace_implies_rcu_gp(void) { return true; }
- 
  /**
   * cond_resched_tasks_rcu_qs - Report potential quiescent states to RCU
   *

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* [PATCH iwl-net] i40e: keep q_vectors array in sync with channel count changes
From: Maciej Fijalkowski @ 2026-04-14 12:14 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, magnus.karlsson, kuba, pabeni, horms, przemyslaw.kitszel,
	jacob.e.keller, Maciej Fijalkowski

For the main VSI, i40e_set_num_rings_in_vsi() always derives
num_q_vectors from pf->num_lan_msix. At the same time, ethtool -L stores
the user requested channel count in vsi->req_queue_pairs and the queue
setup path uses that value for the effective number of queue pairs.

This leaves queue and vector counts out of sync after shrinking channel
count via ethtool -L. The active queue configuration is reduced, but the
VSI still keeps the full PF-sized q_vector topology.

That mismatch breaks reconfiguration flows which rely on vector/NAPI
state matching the effective channel configuration. In particular,
toggling /sys/class/net/<dev>/threaded after reducing the channel count
can hang, and later channel-count changes can fail because VSI reinit
does not rebuild q_vectors to match the new vector count.

Fix this by making the main VSI num_q_vectors follow the effective
requested channel count, capped by the available MSI-X vectors. Update
i40e_vsi_reinit_setup() to rebuild q_vectors during VSI reinit so the
vector topology is refreshed together with the ring arrays when channel
count changes.

Keep alloc_queue_pairs unchanged and based on pf->num_lan_qps so the VSI
retains its full queue capacity.
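
As a rough user-space illustration of the clamping described above (not
driver code), the main VSI vector count can be modelled as below. The
function name main_vsi_num_q_vectors() and the flat parameter list are
assumptions made for this sketch; the kernel code reads these values from
struct i40e_vsi and struct i40e_pf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint16_t min_u16(uint16_t a, uint16_t b)
{
	return a < b ? a : b;
}

/* Hypothetical model of the main VSI vector count after the change. */
static int main_vsi_num_q_vectors(uint16_t req_queue_pairs,
				  uint16_t num_lan_qps,
				  int num_lan_msix, bool msix_enabled)
{
	uint16_t qps;
	int vecs;

	assert(num_lan_qps > 0);

	if (!msix_enabled)
		return 1;

	/* No explicit request (0) means "use the full PF queue count". */
	qps = req_queue_pairs ? min_u16(req_queue_pairs, num_lan_qps)
			      : num_lan_qps;

	/* Cap by the available MSI-X vectors, but never go below one. */
	vecs = qps < num_lan_msix ? qps : num_lan_msix;
	return vecs < 1 ? 1 : vecs;
}
```

With 64 LAN queue pairs and 64 MSI-X vectors, a request for 4 channels
yields 4 vectors instead of the previous 64.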

Selftest napi_threaded.py was originally used when Jakub reported the
hang on /sys/class/net/<dev>/threaded toggle. In order to make it pass on
i40e, use persistent NAPI configuration for q_vector NAPIs so NAPI
identity and threaded settings survive q_vector reallocation across
channel-count changes. This is achieved by using netif_napi_add_config()
when configuring q_vectors.

$ export NETIF=ens259f1np1
$ sudo -E env PATH="$PATH" ./tools/testing/selftests/drivers/net/napi_threaded.py
TAP version 13
1..3
ok 1 napi_threaded.napi_init
ok 2 napi_threaded.change_num_queues
ok 3 napi_threaded.enable_dev_threaded_disable_napi_threaded
Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/intel-wired-lan/20260316133100.6054a11f@kernel.org/
Fixes: d2a69fefd756 ("i40e: Fix changing previously set num_queue_pairs for PFs")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 34 +++++++++++++++++----
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 926d001b2150..5636ad71f940 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11403,10 +11403,14 @@ static void i40e_service_timer(struct timer_list *t)
 static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
 {
 	struct i40e_pf *pf = vsi->back;
+	u16 qps;
 
 	switch (vsi->type) {
 	case I40E_VSI_MAIN:
 		vsi->alloc_queue_pairs = pf->num_lan_qps;
+		qps = vsi->req_queue_pairs ?
+		      min_t(u16, vsi->req_queue_pairs, pf->num_lan_qps) :
+		      pf->num_lan_qps;
 		if (!vsi->num_tx_desc)
 			vsi->num_tx_desc = ALIGN(I40E_DEFAULT_NUM_DESCRIPTORS,
 						 I40E_REQ_DESCRIPTOR_MULTIPLE);
@@ -11414,7 +11418,8 @@ static int i40e_set_num_rings_in_vsi(struct i40e_vsi *vsi)
 			vsi->num_rx_desc = ALIGN(I40E_DEFAULT_NUM_DESCRIPTORS,
 						 I40E_REQ_DESCRIPTOR_MULTIPLE);
 		if (test_bit(I40E_FLAG_MSIX_ENA, pf->flags))
-			vsi->num_q_vectors = pf->num_lan_msix;
+			vsi->num_q_vectors = max_t(int, 1,
+						   min_t(int, qps, pf->num_lan_msix));
 		else
 			vsi->num_q_vectors = 1;
 
@@ -12043,7 +12048,8 @@ static int i40e_vsi_alloc_q_vector(struct i40e_vsi *vsi, int v_idx)
 	cpumask_copy(&q_vector->affinity_mask, cpu_possible_mask);
 
 	if (vsi->netdev)
-		netif_napi_add(vsi->netdev, &q_vector->napi, i40e_napi_poll);
+		netif_napi_add_config(vsi->netdev, &q_vector->napi,
+				      i40e_napi_poll, v_idx);
 
 	/* tie q_vector and vsi together */
 	vsi->q_vectors[v_idx] = q_vector;
@@ -14265,12 +14271,27 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 
 	pf = vsi->back;
 
+	if (test_bit(I40E_FLAG_MSIX_ENA, pf->flags)) {
+		i40e_put_lump(pf->irq_pile, vsi->base_vector, vsi->idx);
+		vsi->base_vector = 0;
+	}
+
 	i40e_put_lump(pf->qp_pile, vsi->base_queue, vsi->idx);
 	i40e_vsi_clear_rings(vsi);
 
-	i40e_vsi_free_arrays(vsi, false);
+	i40e_vsi_free_q_vectors(vsi);
+	i40e_vsi_free_arrays(vsi, true);
 	i40e_set_num_rings_in_vsi(vsi);
-	ret = i40e_vsi_alloc_arrays(vsi, false);
+
+	ret = i40e_vsi_alloc_arrays(vsi, true);
+	if (ret)
+		goto err_vsi;
+
+	/* Rebuild q_vectors during VSI reinit because the effective channel
+	 * count may change num_q_vectors. Keep vector topology aligned with the
+	 * queue configuration after ethtool's .set_channels() callback.
+	 */
+	ret = i40e_vsi_setup_vectors(vsi);
 	if (ret)
 		goto err_vsi;
 
@@ -14282,7 +14303,7 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 		dev_info(&pf->pdev->dev,
 			 "failed to get tracking for %d queues for VSI %d err %d\n",
 			 alloc_queue_pairs, vsi->seid, ret);
-		goto err_vsi;
+		goto err_lump;
 	}
 	vsi->base_queue = ret;
 
@@ -14306,7 +14327,6 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 	return vsi;
 
 err_rings:
-	i40e_vsi_free_q_vectors(vsi);
 	if (vsi->netdev_registered) {
 		vsi->netdev_registered = false;
 		unregister_netdev(vsi->netdev);
@@ -14316,6 +14336,8 @@ static struct i40e_vsi *i40e_vsi_reinit_setup(struct i40e_vsi *vsi)
 	if (vsi->type == I40E_VSI_MAIN)
 		i40e_devlink_destroy_port(pf);
 	i40e_aq_delete_element(&pf->hw, vsi->seid, NULL);
+err_lump:
+	i40e_vsi_free_q_vectors(vsi);
 err_vsi:
 	i40e_vsi_clear(vsi);
 	return NULL;
-- 
2.43.0


^ permalink raw reply related

* Re: [syzbot] [lvs?] BUG: sleeping function called from invalid context in ip_vs_conn_expire
From: Jiayuan Chen @ 2026-04-14 12:09 UTC (permalink / raw)
  To: syzbot, coreteam, davem, edumazet, fw, horms, ja, kuba,
	linux-kernel, lvs-devel, netdev, netfilter-devel, pabeni, pablo,
	phil, syzkaller-bugs
In-Reply-To: <69de1743.a00a0220.475f0.0040.GAE@google.com>


On 4/14/26 6:30 PM, syzbot wrote:

[...]

> if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+504e778ddaecd36fdd17@syzkaller.appspotmail.com
>
> BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48



The problem occurs under PREEMPT_RT, when spin_lock() is taken while
conn_tab_lock is held:

     conn_tab_lock(...) -> hlist_bl_lock -> preempt_disable()
         ==> disables preemption
     spin_lock(&cp->lock) -> rt_mutex
         ==> sleepable under RT, but preemption is already disabled
             by conn_tab_lock
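
To make the nesting explicit, here is a toy user-space model of the check
that fires in the report. It only illustrates why the splat triggers and is
not a proposed fix; struct cpu_ctx and the model_*() functions are
hypothetical names, and the real kernel primitives do far more than bump a
counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state: just what the sketch needs. */
struct cpu_ctx {
	int preempt_count;	/* > 0 means preemption is disabled */
	int violations;		/* "sleeping function called" reports */
};

/* bit_spin_lock() (used by hlist_bl_lock) disables preemption. */
static void model_bit_spin_lock(struct cpu_ctx *c)
{
	c->preempt_count++;
}

static void model_bit_spin_unlock(struct cpu_ctx *c)
{
	assert(c->preempt_count > 0);
	c->preempt_count--;
}

/* On PREEMPT_RT, spin_lock() maps to a sleeping lock, so taking it
 * with preemption disabled is reported as a violation.
 */
static bool model_rt_spin_lock(struct cpu_ctx *c)
{
	if (c->preempt_count > 0) {
		c->violations++;
		return false;
	}
	return true;
}
```

Calling model_bit_spin_lock() followed by model_rt_spin_lock() on the same
context records one violation, which mirrors the order seen in
ip_vs_conn_unlink() in the trace below.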


> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 16, name: ktimers/0
> preempt_count: 2, expected: 0
> RCU nest depth: 3, expected: 3
> 8 locks held by ktimers/0/16:
>   #0: ffffffff8de5f260 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x3c/0x420 kernel/softirq.c:163
>   #1: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0x3c/0x420 kernel/softirq.c:163
>   #2: ffff8880b8826360 (&base->expiry_lock){+...}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:45 [inline]
>   #2: ffff8880b8826360 (&base->expiry_lock){+...}-{3:3}, at: timer_base_lock_expiry kernel/time/timer.c:1502 [inline]
>   #2: ffff8880b8826360 (&base->expiry_lock){+...}-{3:3}, at: __run_timer_base+0x120/0x9f0 kernel/time/timer.c:2384
>   #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
>   #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
>   #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: __rt_spin_lock kernel/locking/spinlock_rt.c:50 [inline]
>   #3: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0x1e0/0x400 kernel/locking/spinlock_rt.c:57
>   #4: ffffc90000157a80 ((&cp->timer)){+...}-{0:0}, at: call_timer_fn+0xd4/0x5e0 kernel/time/timer.c:1745
>   #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:300 [inline]
>   #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
>   #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: ip_vs_conn_unlink net/netfilter/ipvs/ip_vs_conn.c:315 [inline]
>   #5: ffffffff8dfc80c0 (rcu_read_lock){....}-{1:3}, at: ip_vs_conn_expire+0x257/0x2390 net/netfilter/ipvs/ip_vs_conn.c:1260
>   #6: ffffffff8de5f260 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x3c/0x420 kernel/softirq.c:163
>   #7: ffff888068d4c3f0 (&cp->lock#2){+...}-{3:3}, at: spin_lock include/linux/spinlock_rt.h:45 [inline]
>   #7: ffff888068d4c3f0 (&cp->lock#2){+...}-{3:3}, at: ip_vs_conn_unlink net/netfilter/ipvs/ip_vs_conn.c:324 [inline]
>   #7: ffff888068d4c3f0 (&cp->lock#2){+...}-{3:3}, at: ip_vs_conn_expire+0xd4a/0x2390 net/netfilter/ipvs/ip_vs_conn.c:1260
> Preemption disabled at:
> [<ffffffff898a6358>] bit_spin_lock include/linux/bit_spinlock.h:38 [inline]
> [<ffffffff898a6358>] hlist_bl_lock+0x18/0x110 include/linux/list_bl.h:149
> CPU: 0 UID: 0 PID: 16 Comm: ktimers/0 Tainted: G        W    L      syzkaller #0 PREEMPT_{RT,(full)}
> Tainted: [W]=WARN, [L]=SOFTLOCKUP
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026
> Call Trace:
>   <TASK>
>   dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
>   __might_resched+0x329/0x480 kernel/sched/core.c:9162
>   __rt_spin_lock kernel/locking/spinlock_rt.c:48 [inline]
>   rt_spin_lock+0xc2/0x400 kernel/locking/spinlock_rt.c:57
>   spin_lock include/linux/spinlock_rt.h:45 [inline]
>   ip_vs_conn_unlink net/netfilter/ipvs/ip_vs_conn.c:324 [inline]
>   ip_vs_conn_expire+0xd4a/0x2390 net/netfilter/ipvs/ip_vs_conn.c:1260
>   call_timer_fn+0x192/0x5e0 kernel/time/timer.c:1748
>   expire_timers kernel/time/timer.c:1799 [inline]
>   __run_timers kernel/time/timer.c:2374 [inline]
>   __run_timer_base+0x6a3/0x9f0 kernel/time/timer.c:2386
>   run_timer_base kernel/time/timer.c:2395 [inline]
>   run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2405
>   handle_softirqs+0x1de/0x6d0 kernel/softirq.c:622
>   __do_softirq kernel/softirq.c:656 [inline]
>   run_ktimerd+0x69/0x100 kernel/softirq.c:1151
>   smpboot_thread_fn+0x541/0xa50 kernel/smpboot.c:160
>   kthread+0x388/0x470 kernel/kthread.c:436
>   ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
>   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>   </TASK>
>

^ permalink raw reply

* Re: [PATCH 2/4] tools: ynl-gen-c: optionally emit structs and helpers
From: Christoph Böhmwalder @ 2026-04-14 12:08 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jens Axboe, drbd-dev, linux-kernel, Lars Ellenberg,
	Philipp Reisner, linux-block, Donald Hunter, Eric Dumazet, netdev
In-Reply-To: <20260413104939.5ef4d9dc@kernel.org>

On Mon, Apr 13, 2026 at 10:49:39AM -0700, Jakub Kicinski wrote:
>On Mon, 13 Apr 2026 13:48:32 +0200 Christoph Böhmwalder wrote:
>> >Can we just commit the code they output and leave the YNL itself be?
>> >Every single legacy family has some weird quirks; the point of YNL
>> >is to get rid of them, not support them all..
>>
>> Fair enough, we could also do that. Though the question then becomes
>> whether we want to keep the YAML spec for the "drbd" family (patch 3 of
>> this series) in Documentation/.
>>
>> I would argue it makes sense to keep it around somewhere so that the old
>> family is somehow documented, but obviously that yaml file won't work
>> with the unmodified generator.
>
>To be clear (correct me if I misunderstood) it looked like we would be
>missing out on "automating" things, so extra work would still need to
>be done in the C code / manually written headers. But pure YNL (eg
>Python or Rust) client _would_ work? They could generate correct
>requests and parse responses, right?

I haven't tested this, but yes, a regular YNL client should work with
this spec. The new flags only influence kernel codegen, so a client
that doesn't know about them could still construct valid messages and
parse responses.

However, if we drop patch 2 completely, the new flags won't be in the
genetlink-legacy schema either, so schema validation would fail when
trying to generate.

>If yes, keeping it makes sense. FWIW all the specs we have for "old"
>networking families (routing etc) also don't replace any kernel code.
>They are purely to enable user space libraries in various languages.
>Whether you need broad language support for drbd or just have one
>well-known user space stack - I dunno.

Well, one of the main motivations for porting the current "drbd" family
to YNL is to get rid of the genl_magic infrastructure. We intend to add
a new modernized "drbd2" family, which will be fully YNL-based from the
start.
But we still need to support the current family via a compat path, and
I would much rather have two YNL-based families than one genl_magic and
one YNL-based. Carrying both sounds like a nightmare.

So the spec proposed in this series would never actually be used to
generate a userspace client, if that's what you're asking. We would
continue to use the current libgenl-based approach, with some userspace
compat shims to make it work with YNL. Then, when "drbd2" comes along,
we could "do things properly".

It might also be worth mentioning that we are also experimenting with
Rust-based userspace utilities at the moment, so once we have "drbd2",
there will be a real benefit to having multi-language support.

So I'm fine with whichever route you want to take here, as long as
it enables us to move away from genl_magic.

If we decide to carry the "drbd" spec in-tree, that would then pretty
much only be for documentation purposes. Otherwise there would be
generated code whose source spec does not exist in-tree, which may be
surprising.

>
>> Maybe keep it, but with a comment at the top that notes that
>> - this family is deprecated and "frozen",
>> - the spec is only for documentation purposes, and
>> - the spec doesn't work with the upstream parser?
>
>The last point needs a clarification, I guess..

^ permalink raw reply

* Re: [PATCH v2 nf] netfilter: nf_flow_table_ip: Introduce nf_flow_vlan_push()
From: Eric Woudstra @ 2026-04-14 12:00 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Florian Westphal, Phil Sutter, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, netfilter-devel,
	netdev
In-Reply-To: <ad4nQzsbeF1S53zt@chamomile>



On 4/14/26 1:38 PM, Pablo Neira Ayuso wrote:
> On Tue, Apr 14, 2026 at 01:21:20PM +0200, Eric Woudstra wrote:
>> Calling skb_reset_mac_header() before calling skb_vlan_push() does
>> remove the error:
>>
>> "skb_vlan_push got skb with skb->data not at mac header (offset 18)"
>>
>> But the inner vlan tag is still not inserted correctly.
>>
>> skb_vlan_push() uses __vlan_insert_inner_tag() to insert the tag
>> at offset ETH_HLEN. But the inner tag should only be pushed, without
>> offset, similar to nf_flow_pppoe_push().
> 
> It is double-tagged VLAN that is broken, right? I observed this once,
> but I have been busy with a few other things.

That is correct, both q-in-q and q-in-ad (that may not be the correct
terms, but I think it is clear).
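
For readers following along, the two branches of nf_flow_vlan_push() in the
quoted patch can be modelled on a plain byte buffer. This is only a sketch:
struct pkt, put_be16() and model_vlan_push() are hypothetical stand-ins for
sk_buff and the kernel helpers, and the headroom check stands in for
skb_cow_head().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_HLEN 4

/* Hypothetical stand-in for the few sk_buff fields used here. */
struct pkt {
	uint8_t buf[64];
	size_t data;		/* offset of first byte; headroom == data */
	size_t len;		/* bytes from data to end of packet */
	uint16_t protocol;	/* EtherType of what starts at data */
	bool tag_present;	/* out-of-band ("hwaccel") tag is set */
	uint16_t tag_proto;
	uint16_t tag_tci;
};

/* Store a 16-bit value in network byte order. */
static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static int model_vlan_push(struct pkt *p, uint16_t proto, uint16_t id)
{
	assert(p->data + p->len <= sizeof(p->buf));

	if (!p->tag_present) {
		/* First tag: kept out of band, packet bytes untouched. */
		p->tag_present = true;
		p->tag_proto = proto;
		p->tag_tci = id;
		return 0;
	}

	/* Further tag: pushed directly in front of the current data,
	 * with no ETH_HLEN offset.
	 */
	if (p->data < VLAN_HLEN)
		return -1;

	p->data -= VLAN_HLEN;
	p->len += VLAN_HLEN;
	put_be16(&p->buf[p->data], id);			/* TCI */
	put_be16(&p->buf[p->data + 2], p->protocol);	/* inner proto */
	p->protocol = proto;
	return 0;
}
```

The first call leaves the buffer alone and records the tag out of band; the
second call writes a 4-byte VLAN header at the new start of data and
updates the protocol field.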

>> Fixes: c653d5a78f34 ("netfilter: flowtable: inline vlan encapsulation in xmit path")
>> Fixes: a3aca98aec9a ("netfilter: nf_flow_table_ip: reset mac header before vlan push")
>> Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
>>
>> ---
>>
>>  net/netfilter/nf_flow_table_ip.c | 25 ++++++++++++++++++++++---
>>  1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c
>> index fd56d663cb5b..0086f8a1a0d6 100644
>> --- a/net/netfilter/nf_flow_table_ip.c
>> +++ b/net/netfilter/nf_flow_table_ip.c
>> @@ -544,6 +544,26 @@ static int nf_flow_offload_forward(struct nf_flowtable_ctx *ctx,
>>  	return 1;
>>  }
>>  
>> +static int nf_flow_vlan_push(struct sk_buff *skb, __be16 proto, u16 id)
>> +{
>> +	if (skb_vlan_tag_present(skb)) {
>> +		struct vlan_hdr *vhdr;
>> +
>> +		if (skb_cow_head(skb, VLAN_HLEN))
>> +			return -1;
>> +
>> +		__skb_push(skb, VLAN_HLEN);
>> +		skb_reset_network_header(skb);
>> +		vhdr = (struct vlan_hdr *)(skb->data);
>> +		vhdr->h_vlan_TCI = htons(id);
>> +		vhdr->h_vlan_encapsulated_proto = skb->protocol;
>> +		skb->protocol = proto;
>> +	} else {
>> +		__vlan_hwaccel_put_tag(skb, proto, id);
>> +	}
>> +	return 0;
>> +}
>> +
>>  static int nf_flow_pppoe_push(struct sk_buff *skb, u16 id)
>>  {
>>  	int data_len = skb->len + sizeof(__be16);
>> @@ -738,9 +758,8 @@ static int nf_flow_encap_push(struct sk_buff *skb,
>>  		switch (tuple->encap[i].proto) {
>>  		case htons(ETH_P_8021Q):
>>  		case htons(ETH_P_8021AD):
>> -			skb_reset_mac_header(skb);
>> -			if (skb_vlan_push(skb, tuple->encap[i].proto,
>> -					  tuple->encap[i].id) < 0)
>> +			if (nf_flow_vlan_push(skb, tuple->encap[i].proto,
>> +					      tuple->encap[i].id) < 0)
>>  				return -1;
>>  			break;
>>  		case htons(ETH_P_PPP_SES):
>> -- 
>> 2.53.0
>>


^ permalink raw reply

* Re: [PATCH v3 1/3] net: dsa: microchip: implement KSZ87xx Module 3 low-loss cable errata
From: Fidelio LAWSON @ 2026-04-14 11:59 UTC (permalink / raw)
  To: Marek Vasut, Woojung Huh, UNGLinuxDriver, Andrew Lunn,
	Vladimir Oltean, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Marek Vasut, Maxime Chevallier, Simon Horman,
	Heiner Kallweit, Russell King
  Cc: netdev, linux-kernel, Fidelio Lawson
In-Reply-To: <ea90a671-70be-4d89-b842-1e54d687336f@nabladev.com>

On 4/14/26 13:05, Marek Vasut wrote:
> On 4/14/26 11:12 AM, Fidelio Lawson wrote:
>> Implement the "Module 3: Equalizer fix for short cables" erratum from
>> Microchip document DS80000687C for KSZ87xx switches.
>>
>> The issue affects short or low-loss cable links (e.g. CAT5e/CAT6),
>> where the PHY receiver equalizer may amplify high-amplitude signals
>> excessively, resulting in internal distortion and link establishment
>> failures.
>>
>> KSZ87xx devices require a workaround for the Module 3 low-loss cable
>> condition, controlled through the switch TABLE_LINK_MD_V indirect
>> registers.
>>
>> The affected registers are part of the switch address space and are not
>> directly accessible from the PHY driver. To keep the PHY-facing API
>> clean and avoid leaking switch-specific details, model this errata
>> control as vendor-specific Clause 22 PHY registers.
>>
>> A vendor-specific Clause 22 PHY register is introduced as a mode
>> selector in PHY_REG_LOW_LOSS_CTRL, and ksz8_r_phy() / ksz8_w_phy()
>> translate accesses to these bits into the appropriate indirect
>> TABLE_LINK_MD_V accesses.
>>
>> The control register defines the following modes:
>> 0: disabled (default behavior)
>> 1: EQ training workaround
>> 2: LPF 90 MHz
>> 3: LPF 62 MHz
>> 4: LPF 55 MHz
>> 5: LPF 44 MHz
> I may not fully understand this, but aren't the EQ and LPF settings 
> orthogonal ?
You are right that EQ training and LPF bandwidth control
are orthogonal from a hardware point of view.

In this case, the interface is intentionally modeled after the erratum
guidance rather than exposing all possible combinations. Microchip
documents the workarounds as alternative solutions:

"If work around 1 does not solve the short cable issue in a CAT-5E or 
CAT-6 application, change the work around 1 register (0x3C) to its 
default value (0x0A), and use the following settings"
from: 
https://ww1.microchip.com/downloads/aemDocuments/documents/OTH/ProductDocuments/Errata/KSZ87xx-Errata-DS80000687C.pdf

If you’d prefer exposing these as orthogonal controls, I can revise the
interface in the next iteration.
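
To illustrate the "alternative solutions" model, the six values of the
control register could be viewed as a single selector, roughly as sketched
below. The enum and helper names are hypothetical and are not taken from
the patch; only the mode numbers and LPF bandwidths come from the list
quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the modes listed in the commit message. */
enum low_loss_mode {
	LOW_LOSS_DISABLED = 0,		/* default behavior */
	LOW_LOSS_EQ_TRAINING = 1,	/* EQ training workaround */
	LOW_LOSS_LPF_90MHZ = 2,
	LOW_LOSS_LPF_62MHZ = 3,
	LOW_LOSS_LPF_55MHZ = 4,
	LOW_LOSS_LPF_44MHZ = 5,
};

static bool low_loss_mode_valid(unsigned int val)
{
	return val <= LOW_LOSS_LPF_44MHZ;
}

/* Only one workaround is active at a time in this model. */
static bool low_loss_uses_eq_workaround(enum low_loss_mode mode)
{
	return mode == LOW_LOSS_EQ_TRAINING;
}

/* LPF bandwidth in MHz, or 0 when the mode does not select an LPF. */
static unsigned int low_loss_lpf_mhz(enum low_loss_mode mode)
{
	assert(low_loss_mode_valid((unsigned int)mode));

	switch (mode) {
	case LOW_LOSS_LPF_90MHZ:
		return 90;
	case LOW_LOSS_LPF_62MHZ:
		return 62;
	case LOW_LOSS_LPF_55MHZ:
		return 55;
	case LOW_LOSS_LPF_44MHZ:
		return 44;
	default:
		return 0;
	}
}
```

An orthogonal interface would instead need two independent fields, one for
the EQ workaround and one for the LPF bandwidth.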



^ permalink raw reply

* [syzbot] [bridge?] KASAN: use-after-free Read in qdisc_pkt_len_segs_init
From: syzbot @ 2026-04-14 11:58 UTC (permalink / raw)
  To: bridge, davem, edumazet, horms, idosch, kuba, linux-kernel,
	netdev, pabeni, razor, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    17ad4759a082 Merge branch 'wangxun-improvement'
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1505dcd2580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=229411a0a13ccb7d
dashboard link: https://syzkaller.appspot.com/bug?extid=83181a31faf9455499c5
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/b67be09d914c/disk-17ad4759.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/20a2548795c3/vmlinux-17ad4759.xz
kernel image: https://storage.googleapis.com/syzbot-assets/29e723395cef/bzImage-17ad4759.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+83181a31faf9455499c5@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: use-after-free in __tcp_hdrlen include/linux/tcp.h:31 [inline]
BUG: KASAN: use-after-free in qdisc_pkt_len_segs_init+0x7f8/0xa30 net/core/dev.c:4146
Read of size 2 at addr ffff88815ace2434 by task syz.2.24/6033

CPU: 0 UID: 0 PID: 6033 Comm: syz.2.24 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026
Call Trace:
 <IRQ>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:378 [inline]
 print_report+0xba/0x230 mm/kasan/report.c:482
 kasan_report+0x117/0x150 mm/kasan/report.c:595
 __tcp_hdrlen include/linux/tcp.h:31 [inline]
 qdisc_pkt_len_segs_init+0x7f8/0xa30 net/core/dev.c:4146
 sch_handle_ingress net/core/dev.c:4483 [inline]
 __netif_receive_skb_core+0x13bd/0x31a0 net/core/dev.c:6065
 __netif_receive_skb_list_core+0x24d/0x810 net/core/dev.c:6289
 __netif_receive_skb_list net/core/dev.c:6356 [inline]
 netif_receive_skb_list_internal+0x995/0xcf0 net/core/dev.c:6447
 gro_normal_list include/net/gro.h:523 [inline]
 gro_flush_normal include/net/gro.h:531 [inline]
 napi_complete_done+0x299/0x730 net/core/dev.c:6815
 gro_cell_poll+0x5a9/0x5d0 net/core/gro_cells.c:74
 __napi_poll+0xae/0x340 net/core/dev.c:7742
 napi_poll net/core/dev.c:7805 [inline]
 net_rx_action+0x627/0xf70 net/core/dev.c:7962
 handle_softirqs+0x22a/0x870 kernel/softirq.c:622
 do_softirq+0x76/0xd0 kernel/softirq.c:523
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0xf8/0x130 kernel/softirq.c:450
 local_bh_enable include/linux/bottom_half.h:33 [inline]
 tun_rx_batched+0x617/0x790 drivers/net/tun.c:-1
 tun_get_user+0x2aeb/0x3ed0 drivers/net/tun.c:1953
 tun_chr_write_iter+0x113/0x200 drivers/net/tun.c:1999
 new_sync_write fs/read_write.c:595 [inline]
 vfs_write+0x61d/0xb90 fs/read_write.c:688
 ksys_write+0x150/0x270 fs/read_write.c:740
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f177e39c819
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f177f1e3028 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f177e616180 RCX: 00007f177e39c819
RDX: 000000000000fdef RSI: 00002000000002c0 RDI: 0000000000000003
RBP: 00007f177e432c91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f177e616218 R14: 00007f177e616180 R15: 00007ffee7d40588
 </TASK>

The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x15ace2
flags: 0x57ff00000000000(node=1|zone=2|lastcpupid=0x7ff)
raw: 057ff00000000000 ffffea00056b3888 ffffea00056b3888 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner info is not present (never set?)

Memory state around the buggy address:
 ffff88815ace2300: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88815ace2380: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff88815ace2400: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                     ^
 ffff88815ace2480: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff88815ace2500: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply
