Netdev List

* Re: [RFC] Proposal: Add sysfs interface for PCIe TPH Steering Tag retrieval and configuration
From: Leon Romanovsky @ 2026-04-14  8:57 UTC (permalink / raw)
  To: fengchengwen
  Cc: Jason Gunthorpe, Bjorn Helgaas, linux-rdma, linux-pci, netdev,
	dri-devel, Keith Busch, Yochai Cohen, Yishai Hadas, Zhiping Zhang
In-Reply-To: <b95ced54-339f-4859-b3eb-8bf261393ffc@huawei.com>

On Tue, Apr 14, 2026 at 09:07:23AM +0800, fengchengwen wrote:
> On 4/14/2026 3:19 AM, Leon Romanovsky wrote:
> > On Mon, Apr 13, 2026 at 08:04:10PM +0800, fengchengwen wrote:
> >> On 4/13/2026 6:01 PM, Leon Romanovsky wrote:
> >>> On Fri, Apr 10, 2026 at 10:30:52PM +0800, fengchengwen wrote:
> >>>> Hi all,
> >>>>
> >>>> I'm writing to propose adding a sysfs interface to expose and configure
> >>>> the PCIe TPH Steering Tag for PCIe devices, which is retrieved inside
> >>>> the kernel.
> >>>>
> >>>>
> >>>> Background: The TPH Steering Tag is tightly coupled with both a PCIe
> >>>> device (identified by its BDF) and a CPU core. It can only be obtained
> >>>> in kernel mode. To allow user-space applications to fetch and set this
> >>>> value securely and conveniently, we need a standard kernel-to-user
> >>>> interface.
> >>>>
> >>>>
> >>>> Proposed Solution: Add several sysfs attributes under each PCIe device's
> >>>> sysfs directory:
> >>>> 1. /sys/bus/pci/devices/<BDF>/tph_mode to query the TPH mode (interrupt
> >>>>    or device specific)
> >>>> 2. /sys/bus/pci/devices/<BDF>/tph_enable to control the TPH feature
> >>>> 3. /sys/bus/pci/devices/<BDF>/tph_st to support both read and write
> >>>>    operations, e.g.:
> >>>>    Read operation:
> >>>>      echo "cpu=3" > /sys/bus/pci/devices/0000:01:00.0/tph_st
> >>>>      cat /sys/bus/pci/devices/0000:01:00.0/tph_st
> >>>>    Write operation:
> >>>>      echo "index=10 st=123" > /sys/bus/pci/devices/0000:01:00.0/tph_st
> >>>>
> >>>>
> >>>> The design strictly follows PCI subsystem sysfs standards and has the
> >>>> following key properties:
> >>>>
> >>>> 1. Dynamic Visibility: The sysfs attributes will only be present for
> >>>>    PCIe devices that support TPH Steering Tag. Devices without TPH
> >>>>    capability will not show these nodes, avoiding unnecessary user
> >>>>    confusion.
> >>>>
> >>>> 2. Permission Control: The attributes will use 0600 file permissions,
> >>>>    ensuring only privileged root users can read or write them, which
> >>>>    satisfies security requirements for hardware configuration
> >>>>    interfaces.
> >>>>
> >>>> 3. Standard Implementation Location: The interface will be implemented
> >>>>    in drivers/pci/pci-sysfs.c, the canonical location for all PCI device
> >>>>    sysfs attributes, ensuring consistency and maintainability within the
> >>>>    PCI subsystem.
> >>>>
> >>>>
> >>>> Why sysfs instead of alternatives like VFIO-PCI ioctl:
> >>>>
> >>>> - Universality: sysfs does not require binding the device to a special
> >>>>   driver such as vfio-pci. It is available to any privileged user-space
> >>>>   component, including system utilities, daemons, and monitoring tools.
> >>>>
> >>>> - Simplicity: Both user-space usage (cat/echo) and kernel implementation
> >>>>   are straightforward, reducing code complexity and long-term maintenance
> >>>>   cost.
> >>>>
> >>>> - Design Alignment: TPH Steering Tag is a generic PCIe device feature,
> >>>>   not specific to user-space drivers like DPDK or VFIO. Exposing it via
> >>>>   sysfs matches the kernel's standard pattern for hardware capabilities.
> >>>>
> >>>>
> >>>> I look forward to your comments about this design before submitting the
> >>>> final patch.
> >>>
> >>> You need to explain more clearly why this write functionality is useful
> >>> and necessary outside the VFIO/RDMA context:
> >>> https://lore.kernel.org/all/20260324234615.3731237-1-zhipingz@meta.com/
> >>>
> >>> AFAIK, for non-VFIO TPH callers, the kernel has enough knowledge to set
> >>> the right ST values.
> >>>
> >>> There are several comments regarding the implementation, but those can wait
> >>> until the rationale behind the proposal is fully clarified.
> >>
> >> Thanks for your review and comments.
> >>
> >> Let me clarify the rationale behind this user-space sysfs interface:
> >>
> >> 1. VFIO is just one of the user-space device access frameworks.
> >>    There are many other in-kernel frameworks that expose devices
> >>    to user space, such as UIO, UACCE, etc., which may also require
> >>    TPH Steering Tag support.
> >>
> >> 2. The kernel can automatically program Steering Tags only when
> >>    the device provides a standard ST table in MSI-X or config space.
> >>    However, many devices implement vendor-specific or platform-specific
> >>    Steering Tag programming methods that cannot be fully handled
> >>    by the generic kernel code.
> >>
> >> 3. For such devices, user-space applications or framework drivers
> >>    need to retrieve and configure TPH Steering Tags directly.
> >>    A unified sysfs interface allows all user-space frameworks
> >>    (not just VFIO) to use a common, standard way to manage
> >>    TPH Steering Tags, rather than implementing duplicated logic
> >>    in each subsystem.
> >>
> >> This interface provides a uniform method for any user-space
> >> device access solution to work with TPH, which is why I believe
> >> it is useful and necessary beyond the VFIO/RDMA case.
> > 
> > I understand the rationale for providing a read interface, for example for
> > debugging, but I do not see any justification for a write interface.
> 
> Thank you for the comment!
> 
> As I explained, the read interface is not only for debugging. It is also needed
> for devices that do not declare an ST location in MSI-X or config space. The
> following is the Intel X710 NIC's lspci output (TPH part only):
> 
> 	Capabilities: [1a0 v1] Transaction Processing Hints
> 		Device specific mode supported
> 		No steering table available
> 
> So the kernel cannot configure the ST for this device because it is vendor-specific.
> But the ST can be configured by its vendor user-space driver; in that case, we
> need to pass the ST from the kernel to user space.

Vendor-specific, in the context of the PCI specification, does not mean the
kernel cannot configure it. It simply means that the ST values are not
stored in the ST table.

Thanks


* Re: [net-next] net: airoha: Wait for TX to complete in airoha_dev_stop()
From: Lorenzo Bianconi @ 2026-04-14  8:57 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: edumazet, kuba, davem, netdev, andrew+netdev, linux-arm-kernel,
	linux-mediatek
In-Reply-To: <20260414082949.104611-1-pabeni@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3065 bytes --]

> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> 
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net: airoha: Wait for TX to complete in airoha_dev_stop()
> 
> Wait for TX to complete in airoha_dev_stop routine before stopping the
> TX DMA and run airoha_qdma_cleanup_tx_queue routine. Moreover,
> start/stop TX/RX NAPIs in ndo_open()/ndo_stop() callbacks in order to be
> sure the TX NAPIs have completed before stopping the TX DMA engine in
> airoha_dev_stop routine.
> 
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index 8e4b043af4bc..9e40c8f375c1 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> 
> [ ... ]
> 
> > @@ -1684,18 +1686,26 @@ static int airoha_dev_stop(struct net_device *dev)
> >  	struct airoha_qdma *qdma = port->qdma;
> >  	int i, err;
> >
> > -	netif_tx_disable(dev);
> >  	err = airoha_set_vip_for_gdm_port(port, false);
> >  	if (err)
> >  		return err;
> >
> > -	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
> > -		netdev_tx_reset_subqueue(dev, i);
> > -
> >  	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
> >  				    FE_PSE_PORT_DROP);
> >
> > +	netif_tx_disable(dev);
> >  	if (atomic_dec_and_test(&qdma->users)) {
> > +		u32 val;
> > +
> > +		/* Wait for TX to complete */
> > +		err = read_poll_timeout(airoha_qdma_rr, val,
> > +					!(val & GLOBAL_CFG_TX_DMA_BUSY_MASK),
> > +					USEC_PER_MSEC, 100 * USEC_PER_MSEC,
> > +					false, qdma, REG_QDMA_GLOBAL_CFG);
> > +		if (err)
> > +			return err;
> >                   ^^^^^^^^^^
> 
> Can this leave the device in an inconsistent state? When read_poll_timeout()
> fails (TX DMA busy for more than 100ms), airoha_dev_stop() returns an error
> immediately. However, atomic_dec_and_test(&qdma->users) has already
> decremented qdma->users to 0, but airoha_qdma_stop_napi() was never called,
> so all NAPIs remain enabled.
> 
> The networking core (__dev_close_many in net/core/dev.c) ignores the
> ndo_stop return value and marks the device as down regardless. When the
> device is later brought up again, airoha_dev_open() sees atomic_fetch_inc()
> return 0 (old value) and calls airoha_qdma_start_napi() -> napi_enable(),
> which hits BUG_ON(!test_bit(NAPI_STATE_SCHED, &val)) in napi_enable_locked()
> because the NAPI was never disabled.
> 
> The early return also skips the netdev_tx_reset_subqueue() calls at the end
> of the function.
> 
> Should the code either continue with cleanup on timeout (don't return), or
> ensure NAPI and DMA state are properly cleaned up before returning?

ack, I will fix it in v2.

Regards,
Lorenzo

> 
> > +
> > +		airoha_qdma_stop_napi(qdma);
> >  		airoha_qdma_clear(qdma, REG_QDMA_GLOBAL_CFG,
> >  				  GLOBAL_CFG_TX_DMA_EN_MASK |
> >  				  GLOBAL_CFG_RX_DMA_EN_MASK);
> 
> [ ... ]
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]
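
The concern raised in the review above (an early return on poll timeout
skips the NAPI and DMA teardown, so a later open finds NAPI still enabled)
can be modeled in a few lines of userspace C. This is only a sketch under
assumptions: struct fake_qdma, poll_tx_idle() and fake_dev_stop() are
hypothetical stand-ins, not airoha driver functions, and the actual v2 fix
may take a different shape.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by a stop path. */
struct fake_qdma {
	int users;		/* open count, similar in role to qdma->users */
	bool napi_enabled;
	bool dma_enabled;
	int busy_polls_left;	/* number of polls for which TX DMA reports busy */
};

/* Poll up to max_polls times; returns 0 once idle, -ETIMEDOUT otherwise. */
static int poll_tx_idle(struct fake_qdma *q, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (q->busy_polls_left == 0)
			return 0;
		q->busy_polls_left--;
	}
	return q->busy_polls_left == 0 ? 0 : -ETIMEDOUT;
}

/*
 * Stop path that records a timeout but still finishes the teardown, so a
 * later open starts from a consistent state (NAPI disabled, DMA off).
 */
static int fake_dev_stop(struct fake_qdma *q, int max_polls)
{
	int err = 0;

	assert(q->users > 0);
	if (--q->users == 0) {
		err = poll_tx_idle(q, max_polls);
		/* No early return here: tear down even if the poll timed out. */
		q->napi_enabled = false;
		q->dma_enabled = false;
	}
	return err;
}
```

The design choice illustrated is simply "report the error, but do not let
it skip cleanup"; whether the timeout should be returned at all is a
separate question, since the core ignores the ndo_stop return value.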


* Re: [PATCH v2] Bluetooth: Add Broadcom channel priority commands
From: Neal Gompa @ 2026-04-14  8:59 UTC (permalink / raw)
  To: fnkl.kernel
  Cc: Sven Peter, Janne Grunau, Marcel Holtmann, Luiz Augusto von Dentz,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-kernel, asahi, linux-arm-kernel,
	linux-bluetooth, netdev
In-Reply-To: <20260407-brcm-prio-v2-1-3f745edf49af@gmail.com>

On Tue, Apr 7, 2026 at 1:46 PM Sasha Finkelstein via B4 Relay
<devnull+fnkl.kernel.gmail.com@kernel.org> wrote:
>
> From: Sasha Finkelstein <fnkl.kernel@gmail.com>
>
> Certain Broadcom bluetooth chips (bcm4377/bcm4378/bcm438) need ACL
> streams carrying audio to be set as "high priority" using a vendor
> specific command to prevent 10-ish second-long dropouts whenever
> something does a device scan. This patch sends the command when the
> socket priority is set to TC_PRIO_INTERACTIVE, as BlueZ does for audio.
>
> Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com>
> ---
> Changes in v2:
> - new ioctl got nack-ed, so let's use sk_priority as the trigger
> - Link to v1: https://lore.kernel.org/r/20260407-brcm-prio-v1-1-f38b17376640@gmail.com
> ---

Thank you so much for this!

Reviewed-by: Neal Gompa <neal@gompa.dev>


-- 
真実はいつも一つ！/ Always, there's only one truth!


* Re: [PATCH net v7 0/2] net,bpf: fix null-ptr-deref in xdp_master_redirect() for bonding and add selftest
From: patchwork-bot+netdevbpf @ 2026-04-14  9:00 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: netdev, jiayuan.chen, ast, daniel, andrii, martin.lau, eddyz87,
	song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo, jolsa,
	davem, edumazet, kuba, pabeni, horms, hawk, shuah, joamaki, bpf,
	linux-kernel, linux-kselftest
In-Reply-To: <20260411005524.201200-1-jiayuan.chen@linux.dev>

Hello:

This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Sat, 11 Apr 2026 08:55:18 +0800 you wrote:
> From: Jiayuan Chen <jiayuan.chen@shopee.com>
> 
> This series has gone through several rounds of discussion and the
> maintainers hold different views on where the fix should live (in the
> generic xdp_master_redirect() path vs. inside bonding). I respect all
> of the suggestions, but I would like to get the crash fixed first, so
> this version takes the approach of checking whether the master device
> is up in xdp_master_redirect(), as suggested by Daniel Borkmann. If a
> different shape is preferred later it can be done as a follow-up, but
> the null-ptr-deref should not linger.
> 
> [...]

Here is the summary with links:
  - [net,v7,1/2] net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
    https://git.kernel.org/netdev/net/c/1921f91298d1
  - [net,v7,2/2] selftests/bpf: add test for xdp_master_redirect with bond not up
    https://git.kernel.org/netdev/net/c/8dd1bdde38af

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net-next 2/5] selftests: ovpn: fail notification check on mismatch
From: Antonio Quartulli @ 2026-04-14  9:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, ralf, Sabrina Dubroca, Paolo Abeni, Andrew Lunn,
	David S. Miller, Eric Dumazet
In-Reply-To: <20260413170014.12316e9b@kernel.org>

Hi,

On 14/04/2026 02:00, Jakub Kicinski wrote:
> On Mon, 13 Apr 2026 00:11:18 +0200 Antonio Quartulli wrote:
>> compare_ntfs doesn't fail when expected and received notification
>> streams diverge.
>>
>> Fix this bug by tracking the diff exit status explicitly and returning it
>> to the caller so notification mismatches propagate as test failures.
> 
> Hm, this series nicely cleans up test_mark.sh failures
> but test_tcp.sh now always fails on debug (slow) kernel
> builds with:
> 
> # TAP version 13
> # 1..12
> # ok 1 setup network topology
> # ok 2 run baseline data traffic
> # ok 3 run LAN traffic behind peer1
> # ok 4 run iperf throughput
> # ok 5 run key rollout
> # ok 6 query peers
> # ok 7 query missing peer fails
> # ok 8 peer lifecycle and key queries
> # ok 9 delete peer while traffic
> # ok 10 delete stale keys
> # ok 11 check timeout behavior
> # Checking notifications for peer 3... failed
> # 1,9d0
> # < {
> # <   "name": "peer-del-ntf",
> # <   "msg": {
> # <     "peer": {
> # <       "del-reason": "expired",
> # <       "id": 12
> # <     }
> # <   }
> # < }
> # validate listener output for peer 3: command failed with rc=1: ovpn_compare_ntfs 3
> # not ok 12 validate notification output
> # # Totals: pass:11 fail:1 xfail:0 xpass:0 skip:0 error:0
> 
> Similar failure in test_symmetric_id_tcp.sh
> 
> Only the debug kernels tho, non-debug kernels seem to pass.
> So probably some race / slowness.

We have to extend the internal timeout a bit, because it triggers before 
the notification is delivered.

Will get this fixed.

Thanks,


-- 
Antonio Quartulli
OpenVPN Inc.



* Re: [PATCH net-next 5/5] selftests: ovpn: align command flow with TAP
From: Antonio Quartulli @ 2026-04-14  9:02 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, ralf, sd, pabeni, andrew+netdev, davem, edumazet
In-Reply-To: <20260413235608.3699773-2-kuba@kernel.org>

Hi,

On 14/04/2026 01:56, Jakub Kicinski wrote:
> This is an AI-generated review of your patch. The human sending this
> email says: "The second one looks legit, would you prefer to follow
> up or respin?"

Will respin, since we also have to fix the timeout in 2/5.

Thanks,

-- 
Antonio Quartulli
OpenVPN Inc.



* Re: [PATCH iwl-next 2/10] ice: allow setting advertised speed and duplex for all media types
From: Simon Horman @ 2026-04-14  9:04 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: 'Simon Horman', intel-wired-lan, anthony.l.nguyen, netdev,
	jacob.e.keller
In-Reply-To: <20260410074921.1254213-3-aleksandr.loktionov@intel.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ice: allow setting advertised speed and duplex for all media types

This commit allows setting advertised speed and duplex for all media types by
only refusing media types NONE or UNKNOWN in strict mode. It adds AUI and NONE
to the ice_media_type enum.

> diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> index e6a20af6f63de..62d214c5582ad 100644
> --- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
> +++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
> @@ -2634,10 +2635,14 @@ ice_set_link_ksettings(struct net_device *netdev,
>  	if (!pi)
>  		return -EIO;
>  
> -	if (pi->phy.media_type != ICE_MEDIA_BASET &&
> -	    pi->phy.media_type != ICE_MEDIA_FIBER &&
> -	    pi->phy.media_type != ICE_MEDIA_BACKPLANE &&
> -	    pi->phy.media_type != ICE_MEDIA_DA &&
> +	lenient_mode = test_bit(ICE_FLAG_LINK_LENIENT_MODE_ENA, pf->flags);
> +
> +	/* Setting the speed and duplex advertised by autonegotiation is
> +	 * supported for all media types, so only return unsupported for media
> +	 * type none or unknown in strict mode.
> +	 */
> +	if ((pi->phy.media_type == ICE_MEDIA_NONE ||
> +	     (pi->phy.media_type == ICE_MEDIA_UNKNOWN && !lenient_mode)) &&
>  	    pi->phy.link_info.link_info & ICE_AQ_LINK_UP)
>  		return -EOPNOTSUPP;

Will AUI media types still be rejected here in strict mode?

The patch adds ICE_MEDIA_AUI and ICE_MEDIA_NONE to enum ice_media_type, but
doesn't appear to update ice_get_media_type() to actually return these new
values. 

If AUI media continues to fall through to ICE_MEDIA_UNKNOWN, wouldn't the new
logic still reject it when lenient mode is disabled, defeating the intent of
the patch?

[ ... ]
> @@ -2667,7 +2672,7 @@ ice_set_link_ksettings(struct net_device *netdev,
>  	if (!bitmap_subset(copy_ks.link_modes.advertising,
>  			   safe_ks.link_modes.supported,
>  			   __ETHTOOL_LINK_MODE_MASK_NBITS)) {
> -		if (!test_bit(ICE_FLAG_LINK_LENIENT_MODE_ENA, pf->flags))
> +		if (!lenient_mode)
>  			netdev_info(netdev, "The selected speed is not supported by the current media. Please select a link speed that is supported by the current media.\n");
>  		err = -EOPNOTSUPP;
>  		goto done;

This isn't a bug introduced by this patch, but does taking this error path
accidentally release another thread's lock?

The ICE_CFG_BUSY flag is used as an ad-hoc lock later in this function:

	while (test_and_set_bit(ICE_CFG_BUSY, pf->state)) {
		timeout--;
		if (!timeout) {
			err = -EBUSY;
			goto done;
		}
		usleep_range(TEST_SET_BITS_SLEEP_MIN, TEST_SET_BITS_SLEEP_MAX);
	}

If the bitmap_subset() check fails, or if the while loop times out, it jumps to
the done label:

done:
	kfree(phy_caps);
	clear_bit(ICE_CFG_BUSY, pf->state);

Since the bitmap_subset() check happens before test_and_set_bit() has acquired
the lock, wouldn't this unconditionally clear the ICE_CFG_BUSY flag even if
another thread currently holds it? Could this synchronization be replaced with a
standard lock to avoid these issues?
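
One way to picture the concern: a bit flag used as a lock should only be
cleared by the path that actually set it. Below is a userspace sketch with
C11 atomics. The names cfg_busy, try_acquire_cfg() and set_link_settings()
are hypothetical and are not ice driver code; the sketch only shows the
"remember whether we took the flag" pattern, not a proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a busy bit such as ICE_CFG_BUSY. */
static atomic_flag cfg_busy = ATOMIC_FLAG_INIT;

/* Try to take the flag, retrying a bounded number of times. */
static bool try_acquire_cfg(int retries)
{
	while (retries-- > 0) {
		if (!atomic_flag_test_and_set(&cfg_busy))
			return true;
	}
	return false;
}

static int set_link_settings(bool request_valid, int retries)
{
	bool locked = false;
	int err = 0;

	/* Validation failure happens before the flag is taken. */
	if (!request_valid) {
		err = -EOPNOTSUPP;
		goto done;
	}

	locked = try_acquire_cfg(retries);
	if (!locked) {
		err = -EBUSY;
		goto done;
	}

	/* ... apply the configuration while holding the flag ... */

done:
	/* Only the path that set the flag may clear it. */
	if (locked)
		atomic_flag_clear(&cfg_busy);
	return err;
}
```

With this shape, neither the validation failure nor the acquisition timeout
can release a flag held by another caller.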


* Re: [PATCH net-next v2 2/2] selftests/bpf: verify syncookie statistics in tcp_custom_syncookie
From: Paolo Abeni @ 2026-04-14  9:08 UTC (permalink / raw)
  To: Kuniyuki Iwashima, Jiayuan Chen
  Cc: netdev, Eric Dumazet, Neal Cardwell, David S. Miller,
	Jakub Kicinski, Simon Horman, David Ahern, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
	Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan,
	linux-kernel, bpf, linux-kselftest
In-Reply-To: <CAAVpQUCCZVogDUUTfr3k-JxV52kpVvdYzqFC68-J3n6_ugd4Uw@mail.gmail.com>

On 4/14/26 7:50 AM, Kuniyuki Iwashima wrote:
> On Fri, Apr 10, 2026 at 6:32 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>>
>> Add read_tcpext_snmp() helper to network_helpers which reads a
>> TcpExt SNMP counter via nstat, and use it in the tcp_custom_syncookie
>> test to verify that LINUX_MIB_SYNCOOKIESRECV is incremented and
>> LINUX_MIB_SYNCOOKIESFAILED stays unchanged across a successful
>> BPF custom syncookie validation.
>>
>> The delta is captured between start_server() and accept(), which
>> covers the full SYN/ACK/cookie-check path for one connection.
>>
>> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
>> ---
>>  tools/testing/selftests/bpf/network_helpers.c | 22 +++++++++++++++++++
>>  tools/testing/selftests/bpf/network_helpers.h |  1 +
>>  .../bpf/prog_tests/tcp_custom_syncookie.c     | 20 +++++++++++++++++
> 
> As you touch bpf selftest helper files, please rebase on bpf-next
> to avoid possible conflicts and tag bpf-next in the Subject.

To hopefully minimize conflict handling, I'm going to apply patch
1/2 to net-next. Please resubmit patch 2/2 to bpf-next after the
relevant net core changes land there.

/P



* Re: [PATCH net-next v11 05/10] bng_en: add support for link async events
From: Bhargava Chenna Marreddy @ 2026-04-14  9:11 UTC (permalink / raw)
  To: Vadim Fedorenko
  Cc: davem, edumazet, kuba, pabeni, andrew+netdev, horms, netdev,
	linux-kernel, michael.chan, pavan.chebbi, vsrama-krishna.nemani,
	vikas.gupta, Rajashekar Hudumula, Ajit Kumar Khaparde
In-Reply-To: <3596a43d-8c8e-47ac-ae73-ee282f3be945@linux.dev>

[-- Attachment #1: Type: text/plain, Size: 511 bytes --]

On Wed, Apr 8, 2026 at 6:22 PM Vadim Fedorenko
<vadim.fedorenko@linux.dev> wrote:
> > @@ -190,6 +199,14 @@ int bnge_hwrm_func_drv_rgtr(struct bnge_dev *bd)
> >       req->ver_min = cpu_to_le16(DRV_VER_MIN);
> >       req->ver_upd = cpu_to_le16(DRV_VER_UPD);
> >
> > +     memset(async_events_bmap, 0, sizeof(async_events_bmap));
>
> bitmap API has bitmap_zero()

Thanks, Vadim.

Since the subsequent version is already merged, I've noted this for a
future update.

Thanks,
Bhargava Marreddy.

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5496 bytes --]


* [PATCH v3 0/3] ksz87xx: add support for low-loss cable equalizer errata
From: Fidelio Lawson @ 2026-04-14  9:12 UTC (permalink / raw)
  To: Woojung Huh, UNGLinuxDriver, Andrew Lunn, Vladimir Oltean,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Marek Vasut, Maxime Chevallier, Simon Horman, Heiner Kallweit,
	Russell King
  Cc: Woojung Huh, netdev, linux-kernel, Fidelio Lawson

Hello,

This patch implements the “Module 3: Equalizer fix for short cables” erratum
described in Microchip document DS80000687C for KSZ87xx switches.

According to the erratum, the embedded PHY receiver in KSZ87xx switches is
tuned by default for long, high-loss Ethernet cables. When operating with
short or low-loss cables (for example CAT5e or CAT6), the PHY equalizer may
over-amplify the incoming signal, leading to internal distortion and link
establishment failures.

Microchip provides two workarounds, each requiring a write to a different
indirect PHY register access mechanism.

The workaround requires programming internal PHY/DSP registers located in the
LinkMD table, accessed through the KSZ8 indirect register mechanism. Since these
registers belong to the switch address space and are not directly accessible
from a standalone PHY driver, the erratum control is modeled as a vendor-specific
Clause 22 PHY register, virtualized by the KSZ8 DSA driver.

Reads and writes to this register are intercepted by ksz8_r_phy() /
ksz8_w_phy() and translated into the required TABLE_LINK_MD_V indirect accesses.
The erratum affects the shared PHY analog front-end and therefore applies
globally to the switch.

The control register defines the following modes:
0: disabled (default behavior)
1: EQ training workaround
2: LPF 90 MHz
3: LPF 62 MHz
4: LPF 55 MHz
5: LPF 44 MHz

The register can be read and written from userspace via a phy tunable.
Note that current ethtool userspace only supports a fixed set of PHY tunables;
vendor-specific tunables may require either phytool or a newer userspace extension.

This series is based on Linux v7.0-rc1.

Signed-off-by: Fidelio Lawson <fidelio.lawson@exotec.com>
---
Changes in v3:
- Exposed all LPF bandwidth values supported by the hardware.
- Added phy tunable.
- Link to v2: https://patch.msgid.link/20260408-ksz87xx_errata_low_loss_connections-v2-1-9cfe38691713@exotec.com

Changes in v2:
- Dropped the device tree approach based on review feedback
- Modeled the errata control as a vendor-specific Clause 22 PHY register
- Added KSZ87xx-specific guards and replaced magic values with named macros
- Rebased on Linux v7.0-rc1
- Link to v1: https://patch.msgid.link/20260326-ksz87xx_errata_low_loss_connections-v1-0-79a698f43626@exotec.com

---
Fidelio Lawson (3):
      net: dsa: microchip: implement KSZ87xx Module 3 low-loss cable errata
      net: ethtool: add KSZ87xx low-loss PHY tunable
      net: phy: micrel: expose KSZ87xx low-loss erratum via PHY tunable

 drivers/net/dsa/microchip/ksz8.c       | 45 ++++++++++++++++++++++++++++++++++
 drivers/net/dsa/microchip/ksz8_reg.h   | 36 ++++++++++++++++++++++++++-
 drivers/net/dsa/microchip/ksz_common.h |  3 +++
 drivers/net/phy/micrel.c               | 39 +++++++++++++++++++++++++++++
 include/uapi/linux/ethtool.h           |  1 +
 net/ethtool/common.c                   |  1 +
 net/ethtool/ioctl.c                    |  1 +
 7 files changed, 125 insertions(+), 1 deletion(-)
---
base-commit: 2d1373e4246da3b58e1df058374ed6b101804e07
change-id: 20260323-ksz87xx_errata_low_loss_connections-b65e76e2b403

Best regards,
--  
Fidelio Lawson <fidelio.lawson@exotec.com>


* [PATCH v3 1/3] net: dsa: microchip: implement KSZ87xx Module 3 low-loss cable errata
From: Fidelio Lawson @ 2026-04-14  9:12 UTC (permalink / raw)
  To: Woojung Huh, UNGLinuxDriver, Andrew Lunn, Vladimir Oltean,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Marek Vasut, Maxime Chevallier, Simon Horman, Heiner Kallweit,
	Russell King
  Cc: Woojung Huh, netdev, linux-kernel, Fidelio Lawson
In-Reply-To: <20260414-ksz87xx_errata_low_loss_connections-v3-0-0e3838ca98c9@exotec.com>

Implement the "Module 3: Equalizer fix for short cables" erratum from
Microchip document DS80000687C for KSZ87xx switches.

The issue affects short or low-loss cable links (e.g. CAT5e/CAT6),
where the PHY receiver equalizer may amplify high-amplitude signals
excessively, resulting in internal distortion and link establishment
failures.

KSZ87xx devices require a workaround for the Module 3 low-loss cable
condition, controlled through the switch TABLE_LINK_MD_V indirect
registers.

The affected registers are part of the switch address space and are not
directly accessible from the PHY driver. To keep the PHY-facing API
clean and avoid leaking switch-specific details, model this errata
control as vendor-specific Clause 22 PHY registers.

A vendor-specific Clause 22 PHY register, PHY_REG_KSZ87XX_LOW_LOSS, is
introduced as a mode selector, and ksz8_r_phy() / ksz8_w_phy()
translate accesses to its mode bits into the appropriate indirect
TABLE_LINK_MD_V accesses.

The control register defines the following modes:
0: disabled (default behavior)
1: EQ training workaround
2: LPF 90 MHz
3: LPF 62 MHz
4: LPF 55 MHz
5: LPF 44 MHz

Workaround 1: Adjusts the DSP EQ training behavior via LinkMD register
0x3C. It widens and optimizes the DSP EQ compensation range,
and is expected to solve most short/low-loss cable issues.

Workaround 2: For cases where Workaround 1 is not sufficient,
adjusts the receiver low-pass filter bandwidth, effectively
reducing the high-frequency component of the received signal.

The register is accessible through standard PHY read/write operations
(e.g. phytool), without requiring any switch-specific userspace
interface. This allows robust link establishment on short or
low-loss cabling without requiring DTS properties and without
constraining hardware design choices.

The erratum affects the shared PHY analog front-end and therefore
applies globally to the switch.

Signed-off-by: Fidelio Lawson <fidelio.lawson@exotec.com>
---
 drivers/net/dsa/microchip/ksz8.c       | 45 ++++++++++++++++++++++++++++++++++
 drivers/net/dsa/microchip/ksz8_reg.h   | 36 ++++++++++++++++++++++++++-
 drivers/net/dsa/microchip/ksz_common.h |  3 +++
 3 files changed, 83 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/microchip/ksz8.c b/drivers/net/dsa/microchip/ksz8.c
index c354abdafc1b..596c85654f24 100644
--- a/drivers/net/dsa/microchip/ksz8.c
+++ b/drivers/net/dsa/microchip/ksz8.c
@@ -1058,6 +1058,11 @@ int ksz8_r_phy(struct ksz_device *dev, u16 phy, u16 reg, u16 *val)
 		if (ret)
 			return ret;
 
+		break;
+	case PHY_REG_KSZ87XX_LOW_LOSS:
+		if (!ksz_is_ksz87xx(dev))
+			return -EOPNOTSUPP;
+		data = dev->low_loss_wa_mode;
 		break;
 	default:
 		processed = false;
@@ -1271,6 +1276,46 @@ int ksz8_w_phy(struct ksz_device *dev, u16 phy, u16 reg, u16 val)
 		if (ret)
 			return ret;
 		break;
+	case PHY_REG_KSZ87XX_LOW_LOSS:
+		if (!ksz_is_ksz87xx(dev))
+			return -EOPNOTSUPP;
+
+		switch (val & PHY_KSZ87XX_LOW_LOSS_MASK) {
+		case PHY_LOW_LOSS_ERRATA_DISABLED:
+			ret = ksz8_ind_write8(dev, TABLE_LINK_MD, KSZ87XX_REG_EQ_TRAIN,
+					      KSZ87XX_EQ_TRAIN_DEFAULT);
+			if (!ret)
+				ret = ksz8_ind_write8(dev, TABLE_LINK_MD,
+						      KSZ87XX_REG_PHY_LPF,
+						      KSZ87XX_PHY_LPF_90MHZ);
+			break;
+		case KSZ87XX_LOW_LOSS_EQ_TRAIN:
+			ret = ksz8_ind_write8(dev, TABLE_LINK_MD, KSZ87XX_REG_EQ_TRAIN,
+					      KSZ87XX_EQ_TRAIN_LOW_LOSS);
+			break;
+		case KSZ87XX_LOW_LOSS_LPF_90MHZ:
+			ret = ksz8_ind_write8(dev, TABLE_LINK_MD, KSZ87XX_REG_PHY_LPF,
+					      KSZ87XX_PHY_LPF_90MHZ);
+			break;
+		case KSZ87XX_LOW_LOSS_LPF_62MHZ:
+			ret = ksz8_ind_write8(dev, TABLE_LINK_MD, KSZ87XX_REG_PHY_LPF,
+					      KSZ87XX_PHY_LPF_62MHZ);
+			break;
+		case KSZ87XX_LOW_LOSS_LPF_55MHZ:
+			ret = ksz8_ind_write8(dev, TABLE_LINK_MD, KSZ87XX_REG_PHY_LPF,
+					      KSZ87XX_PHY_LPF_55MHZ);
+			break;
+		case KSZ87XX_LOW_LOSS_LPF_44MHZ:
+			ret = ksz8_ind_write8(dev, TABLE_LINK_MD, KSZ87XX_REG_PHY_LPF,
+					      KSZ87XX_PHY_LPF_44MHZ);
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		if (!ret)
+			dev->low_loss_wa_mode = val & PHY_KSZ87XX_LOW_LOSS_MASK;
+		return ret;
 	default:
 		break;
 	}
diff --git a/drivers/net/dsa/microchip/ksz8_reg.h b/drivers/net/dsa/microchip/ksz8_reg.h
index 332408567b47..4e02e044339c 100644
--- a/drivers/net/dsa/microchip/ksz8_reg.h
+++ b/drivers/net/dsa/microchip/ksz8_reg.h
@@ -202,6 +202,10 @@
 #define REG_PORT_3_STATUS_0		0x38
 #define REG_PORT_4_STATUS_0		0x48
 
+/* KSZ87xx LinkMD registers (TABLE_LINK_MD_V) */
+#define KSZ87XX_REG_EQ_TRAIN		0x3C
+#define KSZ87XX_REG_PHY_LPF			0x4C
+
 /* For KSZ8765. */
 #define PORT_REMOTE_ASYM_PAUSE		BIT(5)
 #define PORT_REMOTE_SYM_PAUSE		BIT(4)
@@ -342,7 +346,7 @@
 #define TABLE_EEE			(TABLE_EEE_V << TABLE_EXT_SELECT_S)
 #define TABLE_ACL			(TABLE_ACL_V << TABLE_EXT_SELECT_S)
 #define TABLE_PME			(TABLE_PME_V << TABLE_EXT_SELECT_S)
-#define TABLE_LINK_MD			(TABLE_LINK_MD << TABLE_EXT_SELECT_S)
+#define TABLE_LINK_MD			(TABLE_LINK_MD_V << TABLE_EXT_SELECT_S)
 #define TABLE_READ			BIT(4)
 #define TABLE_SELECT_S			2
 #define TABLE_STATIC_MAC_V		0
@@ -729,6 +733,36 @@
 #define PHY_POWER_SAVING_ENABLE		BIT(2)
 #define PHY_REMOTE_LOOPBACK		BIT(1)
 
+/* Equalizer low-loss workaround */
+#define PHY_REG_KSZ87XX_LOW_LOSS       0x1C
+#define PHY_KSZ87XX_LOW_LOSS_MASK      GENMASK(2, 0)
+
+/* KSZ87xx low-loss EQ mode selector (vendor-specific PHY reg 0x1c)
+ *
+ * Values:
+ *  0: disabled (default behavior)
+ *  1: EQ training workaround
+ *  2: LPF 90 MHz
+ *  3: LPF 62 MHz
+ *  4: LPF 55 MHz
+ *  5: LPF 44 MHz
+ */
+#define PHY_LOW_LOSS_ERRATA_DISABLED		0
+#define KSZ87XX_LOW_LOSS_EQ_TRAIN			1
+#define KSZ87XX_LOW_LOSS_LPF_90MHZ			2
+#define KSZ87XX_LOW_LOSS_LPF_62MHZ			3
+#define KSZ87XX_LOW_LOSS_LPF_55MHZ			4
+#define KSZ87XX_LOW_LOSS_LPF_44MHZ			5
+
+#define KSZ87XX_EQ_TRAIN_DEFAULT       0x0A
+#define KSZ87XX_EQ_TRAIN_LOW_LOSS      0x15
+
+/* LPF bandwidth bits [7:6]: 00 = 90MHz, 01 = 62MHz, 10 = 55MHz, 11 = 44MHz  */
+#define KSZ87XX_PHY_LPF_90MHZ          0x00
+#define KSZ87XX_PHY_LPF_62MHZ          0x40
+#define KSZ87XX_PHY_LPF_55MHZ          0x80
+#define KSZ87XX_PHY_LPF_44MHZ          0xC0
+
 /* KSZ8463 specific registers. */
 #define P1MBCR				0x4C
 #define P1MBSR				0x4E
diff --git a/drivers/net/dsa/microchip/ksz_common.h b/drivers/net/dsa/microchip/ksz_common.h
index 929aff4c55de..16a6074ea4b4 100644
--- a/drivers/net/dsa/microchip/ksz_common.h
+++ b/drivers/net/dsa/microchip/ksz_common.h
@@ -219,6 +219,9 @@ struct ksz_device {
 	 * the switch’s internal PHYs, bypassing the main SPI interface.
 	 */
 	struct mii_bus *parent_mdio_bus;
+
+	/* Equalizer low-loss workaround tunable */
+	u8 low_loss_wa_mode; /* KSZ87xx low-loss EQ/LPF mode selector (0-5) */
 };
 
 /* List of supported models */

-- 
2.53.0


^ permalink raw reply related

* [PATCH v3 2/3] net: ethtool: add KSZ87xx low-loss PHY tunable
From: Fidelio Lawson @ 2026-04-14  9:12 UTC (permalink / raw)
  To: Woojung Huh, UNGLinuxDriver, Andrew Lunn, Vladimir Oltean,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Marek Vasut, Maxime Chevallier, Simon Horman, Heiner Kallweit,
	Russell King
  Cc: Woojung Huh, netdev, linux-kernel, Fidelio Lawson
In-Reply-To: <20260414-ksz87xx_errata_low_loss_connections-v3-0-0e3838ca98c9@exotec.com>

Introduce a new PHY tunable identifier,
ETHTOOL_PHY_KSZ87XX_LOW_LOSS, to allow userspace to control the
KSZ87xx low-loss cable erratum through the ethtool PHY tunable
interface.

KSZ87xx switches integrate embedded PHYs whose receiver behavior may
require specific equalizer or low-pass filter adjustments when used
with short or low-loss Ethernet cables, as described in Microchip
errata DS80000687C (Module 3). The new tunable provides a userspace
interface for selecting the desired operating mode.

The tunable uses a u8 value and is vendor-specific by design. The
actual handling is implemented by the corresponding PHY driver.

Signed-off-by: Fidelio Lawson <fidelio.lawson@exotec.com>
---
 include/uapi/linux/ethtool.h | 1 +
 net/ethtool/common.c         | 1 +
 net/ethtool/ioctl.c          | 1 +
 3 files changed, 3 insertions(+)

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index b74b80508553..5c539e1bca4b 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -295,6 +295,7 @@ enum phy_tunable_id {
 	 * Add your fresh new phy tunable attribute above and remember to update
 	 * phy_tunable_strings[] in net/ethtool/common.c
 	 */
+	ETHTOOL_PHY_KSZ87XX_LOW_LOSS,
 	__ETHTOOL_PHY_TUNABLE_COUNT,
 };
 
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index e252cf20c22f..e1c98ce66093 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -101,6 +101,7 @@ phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN] = {
 	[ETHTOOL_PHY_DOWNSHIFT]	= "phy-downshift",
 	[ETHTOOL_PHY_FAST_LINK_DOWN] = "phy-fast-link-down",
 	[ETHTOOL_PHY_EDPD]	= "phy-energy-detect-power-down",
+	[ETHTOOL_PHY_KSZ87XX_LOW_LOSS] = "phy-ksz87xx-low-loss",
 };
 
 #define __LINK_MODE_NAME(speed, type, duplex) \
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index ff4b4780d6af..9e7bd887acf5 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -3109,6 +3109,7 @@ static int ethtool_phy_tunable_valid(const struct ethtool_tunable *tuna)
 	switch (tuna->id) {
 	case ETHTOOL_PHY_DOWNSHIFT:
 	case ETHTOOL_PHY_FAST_LINK_DOWN:
+	case ETHTOOL_PHY_KSZ87XX_LOW_LOSS:
 		if (tuna->len != sizeof(u8) ||
 		    tuna->type_id != ETHTOOL_TUNABLE_U8)
 			return -EINVAL;

-- 
2.53.0


^ permalink raw reply related

* [PATCH v3 3/3] net: phy: micrel: expose KSZ87xx low-loss erratum via PHY tunable
From: Fidelio Lawson @ 2026-04-14  9:12 UTC (permalink / raw)
  To: Woojung Huh, UNGLinuxDriver, Andrew Lunn, Vladimir Oltean,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Marek Vasut, Maxime Chevallier, Simon Horman, Heiner Kallweit,
	Russell King
  Cc: Woojung Huh, netdev, linux-kernel, Fidelio Lawson
In-Reply-To: <20260414-ksz87xx_errata_low_loss_connections-v3-0-0e3838ca98c9@exotec.com>

Expose the KSZ87xx low-loss cable erratum control through the PHY
tunable interface.

KSZ87xx switches integrate embedded PHYs whose receiver analog front-end
may require specific equalizer or low-pass filter adjustments when used
with short or low-loss Ethernet cables, as described in Microchip errata
DS80000687C (Module 3).

Implement get_tunable / set_tunable callbacks in the Micrel PHY driver
for KSZ87xx devices, mapping the ETHTOOL_PHY_KSZ87XX_LOW_LOSS tunable
to a vendor-specific Clause 22 PHY register. Accesses are routed through
standard phy_read() / phy_write() operations and translated by the KSZ8
DSA driver into the appropriate internal LinkMD table updates.

The tunable uses a u8 mode selector, allowing userspace to select
between the documented equalizer and LPF bandwidth configurations.
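
As a rough sketch of what the mode selector encodes, the LPF modes can be
mapped to the LPF bandwidth bits [7:6] listed in the ksz8_reg.h additions
of this series. The function name is hypothetical, and how the DSA driver
sequences the actual register writes is not shown here.

```c
#include <stdint.h>

/* Hypothetical helper: translate a low-loss mode selector (0-5) into the
 * LPF bandwidth bits [7:6]. Returns -1 for modes that do not select an
 * LPF bandwidth (0: disabled, 1: EQ training) and for out-of-range values.
 */
int example_mode_to_lpf_bits(uint8_t mode)
{
	switch (mode) {
	case 2:
		return 0x00;	/* 90 MHz */
	case 3:
		return 0x40;	/* 62 MHz */
	case 4:
		return 0x80;	/* 55 MHz */
	case 5:
		return 0xC0;	/* 44 MHz */
	default:
		return -1;
	}
}
```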

Signed-off-by: Fidelio Lawson <fidelio.lawson@exotec.com>
---
 drivers/net/phy/micrel.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index c6b011a9d636..7cbca6a7ed84 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -287,6 +287,11 @@
 /* PHY Control 2 / PHY Control (if no PHY Control 1) */
 #define MII_KSZPHY_CTRL_2			0x1f
 #define MII_KSZPHY_CTRL				MII_KSZPHY_CTRL_2
+
+/* Vendor-specific Clause 22 register, virtualized by the KSZ8 DSA driver for KSZ87xx embedded PHYs */
+#define MII_KSZ87XX_LOW_LOSS			0x1c
+#define KSZ87XX_LOW_LOSS_MAX			5
+
 /* bitmap of PHY register to set interrupt mode */
 #define KSZ8081_CTRL2_HP_MDIX			BIT(15)
 #define KSZ8081_CTRL2_MDI_MDI_X_SELECT		BIT(14)
@@ -940,6 +945,38 @@ static int ksz8795_match_phy_device(struct phy_device *phydev,
 	return ksz8051_ksz8795_match_phy_device(phydev, false);
 }
 
+static int ksz87xx_get_tunable(struct phy_device *phydev,
+			       struct ethtool_tunable *tuna, void *data)
+{
+	int ret;
+
+	switch (tuna->id) {
+	case ETHTOOL_PHY_KSZ87XX_LOW_LOSS:
+		ret = phy_read(phydev, MII_KSZ87XX_LOW_LOSS);
+		if (ret < 0)
+			return ret;
+		*(u8 *)data = ret;
+		return 0;
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int ksz87xx_set_tunable(struct phy_device *phydev,
+			       struct ethtool_tunable *tuna, const void *data)
+{
+	u8 val = *(const u8 *)data;
+
+	switch (tuna->id) {
+	case ETHTOOL_PHY_KSZ87XX_LOW_LOSS:
+		if (val > KSZ87XX_LOW_LOSS_MAX)
+			return -EINVAL;
+		return phy_write(phydev, MII_KSZ87XX_LOW_LOSS, val);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static int ksz9021_load_values_from_of(struct phy_device *phydev,
 				       const struct device_node *of_node,
 				       u16 reg,
@@ -6809,6 +6846,8 @@ static struct phy_driver ksphy_driver[] = {
 	/* PHY_BASIC_FEATURES */
 	.config_init	= kszphy_config_init,
 	.match_phy_device = ksz8795_match_phy_device,
+	.get_tunable	= ksz87xx_get_tunable,
+	.set_tunable	= ksz87xx_set_tunable,
 	.suspend	= genphy_suspend,
 	.resume		= genphy_resume,
 }, {

-- 
2.53.0


^ permalink raw reply related

* [syzbot] [ppp?] KMSAN: uninit-value in ppp_sync_receive (4)
From: syzbot @ 2026-04-14  9:14 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, linux-kernel, linux-ppp,
	netdev, pabeni, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    9a9c8ce300cd Merge tag 'kbuild-fixes-7.0-4' of git://git.k..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=112a874e580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=963de479f54c6dbb
dashboard link: https://syzkaller.appspot.com/bug?extid=88679c919eb801bd16f8
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/3e2ad23d804a/disk-9a9c8ce3.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a3dde4990296/vmlinux-9a9c8ce3.xz
kernel image: https://storage.googleapis.com/syzbot-assets/256e0a89d22c/bzImage-9a9c8ce3.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+88679c919eb801bd16f8@syzkaller.appspotmail.com

=====================================================
BUG: KMSAN: uninit-value in ppp_sync_input drivers/net/ppp/ppp_synctty.c:684 [inline]
BUG: KMSAN: uninit-value in ppp_sync_receive+0x626/0xfa0 drivers/net/ppp/ppp_synctty.c:334
 ppp_sync_input drivers/net/ppp/ppp_synctty.c:684 [inline]
 ppp_sync_receive+0x626/0xfa0 drivers/net/ppp/ppp_synctty.c:334
 tty_ldisc_receive_buf+0x1f7/0x2c0 drivers/tty/tty_buffer.c:391
 tty_port_default_receive_buf+0xd7/0x1a0 drivers/tty/tty_port.c:37
 receive_buf drivers/tty/tty_buffer.c:445 [inline]
 flush_to_ldisc+0x43e/0xe40 drivers/tty/tty_buffer.c:495
 process_one_work kernel/workqueue.c:3276 [inline]
 process_scheduled_works+0xb82/0x1e80 kernel/workqueue.c:3359
 worker_thread+0xee4/0x1590 kernel/workqueue.c:3440
 kthread+0x53f/0x600 kernel/kthread.c:436
 ret_from_fork+0x20f/0x910 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245

Uninit was created at:
 slab_post_alloc_hook mm/slub.c:4545 [inline]
 slab_alloc_node mm/slub.c:4866 [inline]
 __do_kmalloc_node mm/slub.c:5259 [inline]
 __kmalloc_noprof+0x486/0x1680 mm/slub.c:5272
 kmalloc_noprof include/linux/slab.h:954 [inline]
 tty_buffer_alloc drivers/tty/tty_buffer.c:180 [inline]
 __tty_buffer_request_room+0x3d4/0x7a0 drivers/tty/tty_buffer.c:273
 __tty_insert_flip_string_flags+0x157/0x6e0 drivers/tty/tty_buffer.c:309
 tty_insert_flip_char include/linux/tty_flip.h:77 [inline]
 uart_insert_char+0x368/0x930 drivers/tty/serial/serial_core.c:3431
 serial8250_read_char+0x1ba/0x670 drivers/tty/serial/8250/8250_port.c:1643
 serial8250_rx_chars drivers/tty/serial/8250/8250_port.c:1660 [inline]
 serial8250_handle_irq_locked+0x6d4/0xa40 drivers/tty/serial/8250/8250_port.c:1820
 serial8250_handle_irq+0x187/0x730 drivers/tty/serial/8250/8250_port.c:1841
 serial8250_default_handle_irq+0x116/0x370 drivers/tty/serial/8250/8250_port.c:1855
 serial8250_interrupt+0xcb/0x420 drivers/tty/serial/8250/8250_core.c:86
 __handle_irq_event_percpu+0x13c/0xf90 kernel/irq/handle.c:209
 handle_irq_event_percpu kernel/irq/handle.c:246 [inline]
 handle_irq_event+0xe0/0x2a0 kernel/irq/handle.c:263
 handle_edge_irq+0x2a9/0xb30 kernel/irq/chip.c:855
 generic_handle_irq_desc include/linux/irqdesc.h:186 [inline]
 handle_irq arch/x86/kernel/irq.c:262 [inline]
 call_irq_handler arch/x86/kernel/irq.c:-1 [inline]
 __common_interrupt+0x9d/0x180 arch/x86/kernel/irq.c:333
 common_interrupt+0x4c/0xb0 arch/x86/kernel/irq.c:326
 asm_common_interrupt+0x2b/0x40 arch/x86/include/asm/idtentry.h:688

CPU: 0 UID: 0 PID: 35 Comm: kworker/u8:2 Tainted: G        W    L      syzkaller #0 PREEMPT(full) 
Tainted: [W]=WARN, [L]=SOFTLOCKUP
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/18/2026
Workqueue: events_unbound flush_to_ldisc
=====================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

^ permalink raw reply

* Re: [PATCH iwl-next 4/10] ice: reorder ice_flash_info fields to eliminate padding
From: Simon Horman @ 2026-04-14  9:16 UTC (permalink / raw)
  To: Aleksandr Loktionov
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, Jacob Keller
In-Reply-To: <20260410074921.1254213-5-aleksandr.loktionov@intel.com>

On Fri, Apr 10, 2026 at 09:49:15AM +0200, Aleksandr Loktionov wrote:
> From: Jacob Keller <jacob.e.keller@intel.com>
> 
> The ice_flash_info structure has a u16 sr_words field before a u32
> flash_size value. This creates a 2-byte hole as well as 3 bytes of
> padding at the end of the structure due to the blank_nvm_mode bitfield.
> 
> Re-order the structure to place flash_size first, which gives a better
> layout and reduces padding.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>
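
The effect described in the commit message can be reproduced with a
reduced, hypothetical stand-in for the structure (the real ice_flash_info
has more members, which are left out here). On common ABIs where uint32_t
is 4-byte aligned, the first layout typically has a 2-byte hole plus
3 bytes of tail padding, and the second does not have the hole.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced layout before the reorder: u16, then u32. */
struct example_before {
	uint16_t sr_words;
	uint32_t flash_size;		/* forces a hole after sr_words */
	uint8_t blank_nvm_mode:1;	/* followed by tail padding */
};

/* Same members with the u32 placed first. */
struct example_after {
	uint32_t flash_size;
	uint16_t sr_words;
	uint8_t blank_nvm_mode:1;
};

/* Byte offset of flash_size in the old layout; 4 when a hole exists. */
size_t example_old_flash_size_offset(void)
{
	return offsetof(struct example_before, flash_size);
}

/* Bytes saved by the reorder for these reduced structs. */
size_t example_bytes_saved(void)
{
	return sizeof(struct example_before) - sizeof(struct example_after);
}
```

Tools such as pahole report the same holes directly on the real structure.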

^ permalink raw reply

* Re: [PATCH net-next v2 2/2] selftests/bpf: verify syncookie statistics in tcp_custom_syncookie
From: Paolo Abeni @ 2026-04-14  9:17 UTC (permalink / raw)
  To: Kuniyuki Iwashima, Jiayuan Chen, Eric Dumazet, Daniel Borkmann
  Cc: netdev, Neal Cardwell, David S. Miller, Jakub Kicinski,
	Simon Horman, David Ahern, Alexei Starovoitov, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	Shuah Khan, linux-kernel, bpf, linux-kselftest
In-Reply-To: <6b22a595-cd65-42ec-b332-c835273bc174@redhat.com>

On 4/14/26 11:08 AM, Paolo Abeni wrote:
> On 4/14/26 7:50 AM, Kuniyuki Iwashima wrote:
>> On Fri, Apr 10, 2026 at 6:32 PM Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
>>>
>>> Add read_tcpext_snmp() helper to network_helpers which reads a
>>> TcpExt SNMP counter via nstat, and use it in the tcp_custom_syncookie
>>> test to verify that LINUX_MIB_SYNCOOKIESRECV is incremented and
>>> LINUX_MIB_SYNCOOKIESFAILED stays unchanged across a successful
>>> BPF custom syncookie validation.
>>>
>>> The delta is captured between start_server() and accept(), which
>>> covers the full SYN/ACK/cookie-check path for one connection.
>>>
>>> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
>>> ---
>>>  tools/testing/selftests/bpf/network_helpers.c | 22 +++++++++++++++++++
>>>  tools/testing/selftests/bpf/network_helpers.h |  1 +
>>>  .../bpf/prog_tests/tcp_custom_syncookie.c     | 20 +++++++++++++++++
>>
>> As you touch bpf selftest helper files, please rebase on bpf-next
>> to avoid possible conflicts and tag bpf-next in the Subject.
> 
> To hopefully minimize conflict handling, I'm going to apply patch
> 1/2 to net-next. Please resubmit patch 2/2 to bpf-next after the
> relevant net core changes reach there.

Uhmm... the original feature went through the bpf tree, so I guess both
patches could/should go via bpf-next. Hopefully conflicts in the TCP code
will be minimal.

@Eric, @Daniel: please LMK if you prefer otherwise.
/P


^ permalink raw reply

* Re: [PATCH] netfilter: nfnetlink_osf: fix null-ptr-deref in nf_osf_ttl
From: Pablo Neira Ayuso @ 2026-04-14  9:19 UTC (permalink / raw)
  To: Kito Xu (veritas501)
  Cc: Florian Westphal, Phil Sutter, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman,
	Fernando Fernandez Mancera, netfilter-devel, coreteam, netdev,
	linux-kernel
In-Reply-To: <20260414074556.2512750-1-hxzene@gmail.com>

On Tue, Apr 14, 2026 at 03:45:56PM +0800, Kito Xu (veritas501) wrote:
> diff --git a/net/netfilter/nfnetlink_osf.c b/net/netfilter/nfnetlink_osf.c
> index d64ce21c7b55..85dbd47dbbd4 100644
> --- a/net/netfilter/nfnetlink_osf.c
> +++ b/net/netfilter/nfnetlink_osf.c
> @@ -43,6 +43,9 @@ static inline int nf_osf_ttl(const struct sk_buff *skb,
>  	else if (ip->ttl <= f_ttl)
>  		return 1;
>  
> +	if (!in_dev)
> +		return 0;
> +

I suggest you add this check a bit earlier:

diff --git a/net/netfilter/nfnetlink_osf.c b/net/netfilter/nfnetlink_osf.c
index 5d15651c74f0..e8069f4e139b 100644
--- a/net/netfilter/nfnetlink_osf.c
+++ b/net/netfilter/nfnetlink_osf.c
@@ -36,6 +36,9 @@ static inline int nf_osf_ttl(const struct sk_buff *skb,
        const struct in_ifaddr *ifa;
        int ret = 0;
 
+       if (!in_dev)
+               return 0;
+
        if (ttl_check == NF_OSF_TTL_TRUE)
                return ip->ttl == f_ttl;
        if (ttl_check == NF_OSF_TTL_NOCHECK)
@@ -43,9 +46,6 @@ static inline int nf_osf_ttl(const struct sk_buff *skb,
        else if (ip->ttl <= f_ttl)
                return 1;
 
-       if (!in_dev)
-               return 0;
-
        in_dev_for_each_ifa_rcu(ifa, in_dev) {
                if (inet_ifa_match(ip->saddr, ifa)) {
                        ret = (ip->ttl == f_ttl);

Thanks!

^ permalink raw reply related

* Re: [RFC] Proposal: Add sysfs interface for PCIe TPH Steering Tag retrieval and configuration
From: fengchengwen @ 2026-04-14  9:30 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jason Gunthorpe, Bjorn Helgaas, linux-rdma, linux-pci, netdev,
	dri-devel, Keith Busch, Yochai Cohen, Yishai Hadas, Zhiping Zhang
In-Reply-To: <20260414085723.GR21470@unreal>

On 4/14/2026 4:57 PM, Leon Romanovsky wrote:
> On Tue, Apr 14, 2026 at 09:07:23AM +0800, fengchengwen wrote:
>> On 4/14/2026 3:19 AM, Leon Romanovsky wrote:
>>> On Mon, Apr 13, 2026 at 08:04:10PM +0800, fengchengwen wrote:
>>>> On 4/13/2026 6:01 PM, Leon Romanovsky wrote:
>>>>> On Fri, Apr 10, 2026 at 10:30:52PM +0800, fengchengwen wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> I'm writing to propose adding a sysfs interface to expose and configure the
>>>>>> PCIe TPH
>>>>>> Steering Tag for PCIe devices, which is retrieved inside the kernel.
>>>>>>
>>>>>>
>>>>>> Background: The TPH Steering Tag is tightly coupled with both a PCIe device
>>>>>> (identified
>>>>>> by its BDF) and a CPU core. It can only be obtained in kernel mode. To allow
>>>>>> user-space
>>>>>> applications to fetch and set this value securely and conveniently, we need
>>>>>> a standard
>>>>>> kernel-to-user interface.
>>>>>>
>>>>>>
>>>>>> Proposed Solution: Add several sysfs attributes under each PCIe device's
>>>>>> sysfs directory:
>>>>>> 1. /sys/bus/pci/devices/<BDF>/tph_mode to query the TPH mode (interrupt or
>>>>>> device specific)
>>>>>> 2. /sys/bus/pci/devices/<BDF>/tph_enable to control the TPH feature
>>>>>> 3. /sys/bus/pci/devices/<BDF>/tph_st to support both read and write
>>>>>> operations, e.g.:
>>>>>>    Read operation:
>>>>>>      echo "cpu=3" > /sys/bus/pci/devices/0000:01:00.0/tph_st
>>>>>>      cat /sys/bus/pci/devices/0000:01:00.0/tph_st
>>>>>>    Write operation:
>>>>>>      echo "index=10 st=123" > /sys/bus/pci/devices/0000:01:00.0/tph_st
>>>>>>
>>>>>>
>>>>>> The design strictly follows PCI subsystem sysfs standards and has the
>>>>>> following key properties:
>>>>>>
>>>>>> 1. Dynamic Visibility: The sysfs attributes will only be present for PCIe
>>>>>> devices that
>>>>>>    support TPH Steering Tag. Devices without TPH capability will not show
>>>>>> these nodes,
>>>>>>    avoiding unnecessary user confusion.
>>>>>>
>>>>>> 2. Permission Control: The attributes will use 0600 file permissions,
>>>>>> ensuring only
>>>>>>    privileged root users can read or write them, which satisfies security
>>>>>> requirements
>>>>>>    for hardware configuration interfaces.
>>>>>>
>>>>>> 3. Standard Implementation Location: The interface will be implemented in
>>>>>>    drivers/pci/pci-sysfs.c, the canonical location for all PCI device sysfs
>>>>>> attributes,
>>>>>>    ensuring consistency and maintainability within the PCI subsystem.
>>>>>>
>>>>>>
>>>>>> Why sysfs instead of alternatives like VFIO-PCI ioctl:
>>>>>>
>>>>>> - Universality: sysfs does not require binding the device to a special
>>>>>> driver such as
>>>>>>   vfio-pci. It is available to any privileged user-space component,
>>>>>> including system
>>>>>>   utilities, daemons, and monitoring tools.
>>>>>>
>>>>>> - Simplicity: Both user-space usage (cat/echo) and kernel implementation are
>>>>>>   straightforward, reducing code complexity and long-term maintenance cost.
>>>>>>
>>>>>> - Design Alignment: TPH Steering Tag is a generic PCIe device feature, not
>>>>>> specific to
>>>>>>   user-space drivers like DPDK or VFIO. Exposing it via sysfs matches the
>>>>>> kernel's
>>>>>>   standard pattern for hardware capabilities.
>>>>>>
>>>>>>
>>>>>> I look forward to your comments about this design before submitting the
>>>>>> final patch.
>>>>>
>>>>> You need to explain more clearly why this write functionality is useful
>>>>> and necessary outside the VFIO/RDMA context:
>>>>> https://lore.kernel.org/all/20260324234615.3731237-1-zhipingz@meta.com/
>>>>>
>>>>> AFAIK, for non-VFIO TPH callers, kernel has enough knowledge to set
>>>>> right ST values.
>>>>>
>>>>> There are several comments regarding the implementation, but those can wait
>>>>> until the rationale behind the proposal is fully clarified.
>>>>
>>>> Thanks for your review and comments.
>>>>
>>>> Let me clarify the rationale behind this user-space sysfs interface:
>>>>
>>>> 1. VFIO is just one of the user-space device access frameworks.
>>>>    There are many other in-kernel frameworks that expose devices
>>>>    to user space, such as UIO, UACCE, etc., which may also require
>>>>    TPH Steering Tag support.
>>>>
>>>> 2. The kernel can automatically program Steering Tags only when
>>>>    the device provides a standard ST table in MSI-X or config space.
>>>>    However, many devices implement vendor-specific or platform-specific
>>>>    Steering Tag programming methods that cannot be fully handled
>>>>    by the generic kernel code.
>>>>
>>>> 3. For such devices, user-space applications or framework drivers
>>>>    need to retrieve and configure TPH Steering Tags directly.
>>>>    A unified sysfs interface allows all user-space frameworks
>>>>    (not just VFIO) to use a common, standard way to manage
>>>>    TPH Steering Tags, rather than implementing duplicated logic
>>>>    in each subsystem.
>>>>
>>>> This interface provides a uniform method for any user-space
>>>> device access solution to work with TPH, which is why I believe
>>>> it is useful and necessary beyond the VFIO/RDMA case.
>>>
>>> I understand the rationale for providing a read interface, for example for
>>> debugging, but I do not see any justification for a write interface.
>>
>> Thank you for the comment!
>>
>> As I explained, the read interface is not only for debugging. It is also needed
>> for devices that don't declare an ST location in MSI-X or config space. The
>> following is the lspci output of an Intel X710 NIC (TPH part only):
>>
>> 	Capabilities: [1a0 v1] Transaction Processing Hints
>> 		Device specific mode supported
>> 		No steering table available
>>
>> So the kernel cannot configure the ST for this device, because the method is
>> vendor specific. But the ST can be configured by the vendor's user-space
>> driver; in that case, the ST must be passed from the kernel to user space.
> 
> Vendor-specific, in the context of the PCI specification, does not mean the
> kernel cannot configure it. It simply means that the ST values are not
> stored in the ST table.

Thank you for the clarification!

I agree with your interpretation of "vendor-specific" in PCI spec terms: it
does not prevent the kernel from handling TPH in principle. However, the
real problem is that the kernel has no standardized way to know where or
how to program those vendor-specific ST values.

When a device reports "No steering table available" and operates in
device-specific mode, the method used to set ST values is entirely
device-specific and not covered by the PCI specification. If the device
is handed over to user space through a user-space I/O framework (e.g. VFIO
or IGB_UIO), the generic kernel code cannot infer the proper programming
sequence or registers for each vendor-specific implementation.

In these cases, the configuration must be done by the vendor’s
user-space driver, which is aware of the device’s private programming
model. But such a user-space driver still needs to obtain valid,
platform-provided ST values (from ACPI _DSM), which it cannot do
without a kernel interface.

This is why a read-only interface to retrieve ST values is still
needed: the kernel holds the valid platform tags, while the user-space
driver handles the device-specific programming.

Thanks

> 
> Thanks


^ permalink raw reply

* Re: [PATCH iwl-next 5/10] ice: improve Add/Update VSI error messages in ice_vsi_init()
From: Simon Horman @ 2026-04-14  9:30 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: 'Simon Horman', intel-wired-lan, anthony.l.nguyen, netdev,
	jacob.e.keller
In-Reply-To: <20260410074921.1254213-6-aleksandr.loktionov@intel.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ice: improve Add/Update VSI error messages in ice_vsi_init()

The error messages emitted when Add VSI or Update VSI Admin
Queue commands fail are missing the VSI index and the last AQ
error code. Add both to match the pattern used elsewhere in
the driver for AQ-command failures, which helps narrow down
firmware issues without requiring debug logs.

> diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c
> index 689c6025ea82e..0e79d66959d4f 100644
> --- a/drivers/net/ethernet/intel/ice/ice_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_lib.c
> @@ -1320,14 +1320,18 @@ static int ice_vsi_init(struct ice_vsi *vsi, u32 vsi_flags)
>  	if (vsi_flags & ICE_VSI_FLAG_INIT) {
>  		ret = ice_add_vsi(hw, vsi->idx, ctxt, NULL);
>  		if (ret) {
> -			dev_err(dev, "Add VSI failed, err %d\n", ret);
> +			dev_err(dev, "Add VSI %d failed, err %d aq_err %s\n",
> +				vsi->idx, ret,
> +				libie_aq_str(hw->adminq.sq_last_status));

Will this print a misleading "aq_err OK" if ice_add_vsi() fails for software
reasons?

For instance, if devm_kzalloc() fails inside ice_add_vsi(), it calls
ice_aq_free_vsi() to clean up before returning -ENOMEM:

int ice_add_vsi(...)
{
	...
	tmp_vsi_ctx = devm_kzalloc(ice_hw_to_dev(hw), ...
	if (!tmp_vsi_ctx) {
		ice_aq_free_vsi(hw, vsi_ctx, false, cd);
		return -ENOMEM;
	}
	...
}

The successful ice_aq_free_vsi() call resets hw->adminq.sq_last_status to OK.
Since the log unconditionally prints sq_last_status, it will display the
-ENOMEM error alongside "aq_err OK".

Additionally, could reading hw->adminq.sq_last_status outside the Admin Queue
lock be racy? Could a concurrent AQ command overwrite the status before it
is printed?

>  			ret = -EIO;
>  			goto out;
>  		}
>  	} else {
>  		ret = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
>  		if (ret) {
> -			dev_err(dev, "Update VSI failed, err %d\n", ret);
> +			dev_err(dev, "Update VSI %d failed, err %d aq_err %s\n",
> +				vsi->idx, ret,
> +				libie_aq_str(hw->adminq.sq_last_status));

A similar situation applies here. If ice_update_vsi() returns early due to an
invalid VSI handle, it returns -EINVAL without executing an AQ command. This
would result in printing whatever AQ status was left behind from a previous
command.
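
Both concerns can be illustrated with a small userspace mock. Everything
below is hypothetical (the example_* names, the status values and the
control flow are simplified stand-ins, not the ice driver code); it only
shows how a "last status" field can disagree with the returned error.

```c
#include <errno.h>

/* Hypothetical AQ status codes for this mock. */
enum example_aq_rc { EXAMPLE_AQ_RC_OK = 0, EXAMPLE_AQ_RC_ERR = 1 };

struct example_hw {
	enum example_aq_rc sq_last_status;
};

/* Every command that reaches "firmware" overwrites the last status. */
static int example_aq_cmd(struct example_hw *hw, enum example_aq_rc rc)
{
	hw->sq_last_status = rc;
	return rc == EXAMPLE_AQ_RC_OK ? 0 : -EIO;
}

/* Case 1: the AQ add succeeds, a later allocation fails, and the cleanup
 * command succeeds, so the last status reads OK next to -ENOMEM.
 */
static int example_add_with_alloc_failure(struct example_hw *hw)
{
	example_aq_cmd(hw, EXAMPLE_AQ_RC_OK);	/* add */
	/* simulated allocation failure */
	example_aq_cmd(hw, EXAMPLE_AQ_RC_OK);	/* cleanup */
	return -ENOMEM;
}

/* Case 2: validation fails before any command is sent, so the last
 * status still belongs to some earlier, unrelated command.
 */
static int example_update_invalid_handle(struct example_hw *hw)
{
	(void)hw;
	return -EINVAL;
}
```

In both cases a log line that prints the return value together with
sq_last_status would pair an error with a status that did not cause it.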

>  			ret = -EIO;
>  			goto out;
>  		}

^ permalink raw reply

* Re: [PATCH iwl-next 6/10] ice: increase OICR interrupt moderation rate to 20K interrupts/sec
From: Simon Horman @ 2026-04-14  9:33 UTC (permalink / raw)
  To: Aleksandr Loktionov
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, Jacob Keller
In-Reply-To: <20260410074921.1254213-7-aleksandr.loktionov@intel.com>

On Fri, Apr 10, 2026 at 09:49:17AM +0200, Aleksandr Loktionov wrote:
> The miscellaneous interrupt cause (OICR) is throttled to 8K
> interrupts per second (124 us minimum spacing). This interrupt
> handles VF mailbox messages and Tx timestamps, so the low rate
> imposes a minimum latency floor on both use-cases.
> 
> Raise the rate to 20K interrupts per second (50 us minimum
> spacing) to allow lower latency handling for Tx timestamp
> bursts and high VF message rates.
> 
> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>
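
The spacing figures quoted above follow from 1,000,000 us divided by the
target rate. The sketch below additionally rounds down to a register
granularity; the 2 us granularity is an assumption of this example, used
to explain why 8K interrupts/sec is quoted as 124 us rather than 125 us.

```c
/* Minimum interrupt spacing in microseconds for a target rate, rounded
 * down to a multiple of gran_us (assumed register granularity).
 */
unsigned int example_itr_interval_us(unsigned int ints_per_sec,
				     unsigned int gran_us)
{
	unsigned int us = 1000000u / ints_per_sec;

	return us - us % gran_us;
}
```

With gran_us = 2 this gives 124 for 8000 interrupts/sec and 50 for 20000
interrupts/sec, matching the values in the commit message.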


^ permalink raw reply

* Re: [PATCH net 1/1] net: bridge: use a stable FDB dst snapshot in RCU readers
From: Nikolay Aleksandrov @ 2026-04-14  9:33 UTC (permalink / raw)
  To: Ren Wei, bridge, netdev
  Cc: idosch, davem, edumazet, kuba, pabeni, horms, makita.toshiaki,
	vyasevic, yifanwucs, tomapufckgml, yuantan098, bird, enjou1224z,
	zcliangcn
In-Reply-To: <6570fabb85ecadb8baaf019efe856f407711c7b9.1776043229.git.zcliangcn@gmail.com>

On 13/04/2026 12:08, Ren Wei wrote:
> From: Zhengchuan Liang <zcliangcn@gmail.com>
> 
> Local FDB entries can be rewritten in place by `fdb_delete_local()`, which
> updates `f->dst` to another port or to `NULL` while keeping the entry
> alive. Several bridge RCU readers inspect `f->dst`, including
> `br_fdb_fillbuf()` through the `brforward_read()` sysfs path.
> 
> These readers currently load `f->dst` multiple times and can therefore
> observe inconsistent values across the check and later dereference.
> In `br_fdb_fillbuf()`, this means a concurrent local-FDB update can change
> `f->dst` after the NULL check and before the `port_no` dereference,
> leading to a NULL-ptr-deref.
> 
> Fix this by taking a single `READ_ONCE()` snapshot of `f->dst` in each
> affected RCU reader and using that snapshot for the rest of the access
> sequence. Also publish the in-place `f->dst` updates in `fdb_delete_local()`
> with `WRITE_ONCE()` so the readers and writer use matching access patterns.
> 
> Fixes: 960b589f86c7 ("bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address")
> Cc: stable@kernel.org
> Reported-by: Yifan Wu <yifanwucs@gmail.com>
> Reported-by: Juefei Pu <tomapufckgml@gmail.com>
> Co-developed-by: Yuan Tan <yuantan098@gmail.com>
> Signed-off-by: Yuan Tan <yuantan098@gmail.com>
> Suggested-by: Xin Liu <bird@lzu.edu.cn>
> Tested-by: Ren Wei <enjou1224z@gmail.com>
> Signed-off-by: Zhengchuan Liang <zcliangcn@gmail.com>
> Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
> ---
>   net/bridge/br_arp_nd_proxy.c |  8 +++++---
>   net/bridge/br_fdb.c          | 28 ++++++++++++++++++----------
>   2 files changed, 23 insertions(+), 13 deletions(-)
> 

Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
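
For readers skimming the thread, the snapshot pattern described in the
commit message looks roughly like the userspace sketch below. The types,
the helper name and the EXAMPLE_READ_ONCE() macro are stand-ins invented
for this example (the macro relies on the GCC/Clang __typeof__ extension);
this is not the bridge code itself.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's READ_ONCE(): forces a single load. */
#define EXAMPLE_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

struct example_port {
	int port_no;
};

struct example_fdb {
	struct example_port *dst;	/* may be rewritten concurrently */
};

/* Returns the port number, or 0 when the entry has no destination port.
 * The pointer is loaded once; the NULL check and the dereference both use
 * that snapshot, so a concurrent update cannot slip in between them.
 */
int example_fdb_port_no(struct example_fdb *f)
{
	struct example_port *dst = EXAMPLE_READ_ONCE(f->dst);

	if (!dst)
		return 0;
	return dst->port_no;
}
```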

^ permalink raw reply

* Re: [PATCH iwl-next 7/10] ice: emit user-visible info message for non-contiguous ETS TC config
From: Simon Horman @ 2026-04-14  9:35 UTC (permalink / raw)
  To: Aleksandr Loktionov
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, Karen Ostrowska
In-Reply-To: <20260410074921.1254213-8-aleksandr.loktionov@intel.com>

On Fri, Apr 10, 2026 at 09:49:18AM +0200, Aleksandr Loktionov wrote:
> When the remote LLDP peer advertises a non-contiguous TC
> mapping the driver silently falls back to a default single-TC
> configuration. This leaves the user without any indication of
> why their DCB configuration was not honoured.
> 
> Print an informational message at the entry of
> ice_dcb_noncontig_cfg() so the user knows ETS with
> non-contiguous TCs is not supported and that the driver
> has fallen back to defaults.
> 
> Suggested-by: Karen Ostrowska <karen.ostrowska@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_dcb_lib.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> index bd77f1c..1c53b09 100644
> --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
> @@ -712,6 +712,8 @@ static int ice_dcb_noncontig_cfg(struct ice_pf *pf)
>  	struct device *dev = ice_pf_to_dev(pf);
>  	int ret;
>  
> +	dev_info(dev, "Non-contiguous ETS TC config not supported, falling back to default single TC\n");

Sashiko points out that this seems to be controlled by user input.
If so, it should probably be rate limited.
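
In the kernel this would normally be done with an existing helper such as
dev_info_ratelimited(). Purely to illustrate the idea, a minimal userspace
sketch of an interval/burst limiter follows; the struct and function names
are hypothetical and this is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

struct example_ratelimit {
	uint64_t interval;	/* window length, in caller-defined ticks */
	unsigned int burst;	/* messages allowed per window */
	uint64_t begin;		/* start of the current window */
	unsigned int printed;	/* messages emitted in the current window */
};

/* Returns true if a message may be emitted at time 'now'. */
bool example_ratelimit_ok(struct example_ratelimit *rs, uint64_t now)
{
	/* Start a new window once the current one has expired. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}
```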

> +
>  	/* Configure SW DCB default with ETS non-willing */
>  	ret = ice_dcb_sw_dflt_cfg(pf, false, true);
>  	if (ret) {
> -- 
> 2.52.0
> 
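
For reference, the kernel already provides dev_info_ratelimited() for
this case. The idea behind it can be sketched in userspace as below.
This is only an illustration of interval/burst limiting, not the
kernel's implementation; struct ratelimit and ratelimit_allow() are
hypothetical names, and the time units are whatever the caller passes in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace sketch of interval/burst rate limiting. */
struct ratelimit {
	uint64_t interval;	/* window length, in caller-defined time units */
	unsigned int burst;	/* messages allowed per window */
	uint64_t begin;		/* start of the current window */
	unsigned int printed;	/* messages emitted in the current window */
};

/* Return true if a message may be emitted at time 'now'. */
static inline bool ratelimit_allow(struct ratelimit *rl, uint64_t now)
{
	/* Start a fresh window once the previous one has expired. */
	if (now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->printed = 0;
	}

	/* Suppress the message once the burst budget is used up. */
	if (rl->printed >= rl->burst)
		return false;

	rl->printed++;
	return true;
}
```

With a limiter like this, a peer that keeps re-advertising the same
mapping produces at most 'burst' log lines per window.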

^ permalink raw reply

* Re: [PATCH] net: Optimize flush calculation in inet_gro_receive()
From: David Laight @ 2026-04-14  9:36 UTC (permalink / raw)
  To: Helge Deller
  Cc: Kuniyuki Iwashima, deller, davem, dsahern, linux-kernel,
	linux-parisc, netdev, edumazet
In-Reply-To: <49c05cd8-5ad0-4015-8f55-fed3416784bf@gmx.de>

On Tue, 14 Apr 2026 09:46:55 +0200
Helge Deller <deller@gmx.de> wrote:

> Hi Kikuyu and David,
...
> >>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> >>> index c7731e300a44..58cad2687c2c 100644
> >>> --- a/net/ipv4/af_inet.c
> >>> +++ b/net/ipv4/af_inet.c
> >>> @@ -1479,7 +1479,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >>>   	struct sk_buff *p;
> >>>   	unsigned int hlen;
> >>>   	unsigned int off;
> >>> -	int flush = 1;
> >>> +	u16 flush = 1;
> >>>   	int proto;
> >>>   
> >>>   	off = skb_gro_offset(skb);
> >>> @@ -1504,7 +1504,8 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
> >>>   		goto out;
> >>>   
> >>>   	NAPI_GRO_CB(skb)->proto = proto;
> >>> -	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb)) | (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> >>> +	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb)) |
> >>> +	        (get_unaligned_be16(&iph->frag_off) & ~IP_DF);  
> >>
> >> I think here we intentionally use 32-bit loads:
> >>
> >> commit 
> >> Author: Herbert Xu <herbert@gondor.apana.org.au>
> >> Date:   Tue May 26 18:50:29 2009
> >>
> >>      ipv4: Use 32-bit loads for ID and length in GRO  
> 
> I see, this patch is exactly the opposite of mine.
> 
> >> Before your patch, 32-bit load + bswap are used while
> >> 16-bit load + rol 8 after the change.
> >>
> >> I feel the 4-byte aligned load + bswap is faster than
> >> misaligned access + 8 times shift (Is this internally
> >> optimised like xchg for a single word size ?)
> >>
> >> Do you have some numbers ?  
> 
> No, I don't have.
> In the end it's very platform specific anyway.
>   
> > Check on some architecture that doesn't support misaligned loads.
> > Actually, aren't the accesses aligned??  
> 
> The reason why I touched this code at all, is because I got unaligned
> accesses in that function on parisc.
> But those unaligned accesses were triggered by parisc-specific
> inline assembly, and not by this code here.

The network stack is supposed to ensure that all received packets are
aligned so that the IP header is on a 4-byte boundary.
This typically requires the Ethernet receive buffer to be 4n+2 aligned.
Unfortunately, some Ethernet hardware requires 4n-aligned buffers
(often on SoC devices whose CPUs fault on misaligned accesses).
(Just writing two bytes of garbage before the frame solves the issue.)
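
The 4n+2 figure falls out of the 14-byte Ethernet header. A minimal
sketch of the arithmetic, assuming an untagged frame (no VLAN tag);
ip_header_misalignment() is a hypothetical helper and ETH_HLEN is
defined locally here rather than taken from kernel headers:

```c
#include <stddef.h>

#define ETH_HLEN 14	/* 6 bytes dst MAC + 6 bytes src MAC + 2 bytes ethertype */

/*
 * Given the offset at which the Ethernet frame starts in the buffer,
 * return how far the IP header that follows it sits past a 4-byte
 * boundary (0 means the IP header is 4-byte aligned).
 */
static inline unsigned int ip_header_misalignment(size_t frame_start)
{
	return (unsigned int)((frame_start + ETH_HLEN) & 3);
}
```

A frame starting at offset 2 puts the IP header at offset 16 (aligned),
while a frame starting at offset 0 puts it at offset 14 (off by two).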

> So, I believe those accesses here are aligned, and the get_unaligned_XX()
> helpers make the code more readable, but are NOT necessary.
> 
> That said, I suggest to drop my patch.
> It makes the code more readable, but probably will not improve speed.

I think the purpose of the 2009 change was to use the hardware's 32-bit
byte-swapping memory loads rather than software swapping of the
16-bit items.
That shaves off a few instructions - and they can be measurable
in some of the network paths with specific workloads.
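
Both forms compute the same 16-bit result. The userspace sketch below
shows this under stated assumptions: the header is a plain byte array,
load_be32()/load_be16() and the flush_via_*() names are hypothetical,
and the loads are written as portable byte shifts, so it demonstrates
the arithmetic only and says nothing about speed.

```c
#include <stdint.h>

#define IP_DF 0x4000	/* "don't fragment" flag within frag_off */

/* Portable big-endian loads (stand-ins for ntohl()/ntohs() on raw bytes). */
static inline uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static inline uint16_t load_be16(const uint8_t *p)
{
	return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* 32-bit form: load whole words, let the final cast drop the upper halves. */
static inline uint16_t flush_via_32bit(const uint8_t *iph, uint32_t gro_len)
{
	return (uint16_t)((load_be32(iph) ^ gro_len) |
			  (load_be32(iph + 4) & ~(uint32_t)IP_DF));
}

/* 16-bit form: load tot_len (offset 2) and frag_off (offset 6) directly. */
static inline uint16_t flush_via_16bit(const uint8_t *iph, uint32_t gro_len)
{
	return (uint16_t)((load_be16(iph + 2) ^ gro_len) |
			  (load_be16(iph + 6) & ~IP_DF));
}
```

In the 32-bit form the version/IHL/TOS bytes and the IP ID land in the
upper 16 bits of each word, so truncating to 16 bits discards them.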

Remember, save 0.1% 100 times and the code runs 10% faster.
Every little bit can make a difference.

	David

> 
> Thanks for your help!
> Helge
> 
> > Also on ones without 32bit byteswap (some do have byteswapping
> > memory reads).
> > 
> > Also you may not want to change 'flush' to u16.
> > On non-x86 it may force the compiler add extra masking instructions.
> > 
> > 	David
> >     
> >>
> >>
> >> Before:
> >> 	flush = (u16)((ntohl(*(__be32 *)iph) ^ skb_gro_len(skb))
> >> mov    edx,DWORD PTR [rcx]
> >> bswap  edx
> >> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
> >> mov    r8d,DWORD PTR [rsi+0x38]
> >> mov    r9d,DWORD PTR [rsi+0x70]
> >> sub    r9d,r8d
> >> xor    r9d,edx
> >> 	| (ntohl(*(__be32 *)&iph->id) & ~IP_DF));
> >> mov    ebp,0xffbfffff
> >> and    ebp,DWORD PTR [rcx+0x4]
> >> bswap  ebp
> >> or     ebp,r9d
> >>
> >>
> >> After:
> >> 	flush = (get_unaligned_be16(&iph->tot_len) ^ skb_gro_len(skb))
> >> movzx  edx,WORD PTR [rcx+0x2]
> >> rol    dx,0x8
> >> 	return skb->len - NAPI_GRO_CB(skb)->data_offset;
> >> mov    r8d,DWORD PTR [rsi+0x38]
> >> mov    r9d,DWORD PTR [rsi+0x70]
> >> sub    r9d,r8d
> >> xor    r9d,edx
> >> 	| (get_unaligned_be16(&iph->frag_off) & ~IP_DF);
> >> movzx  ebp,WORD PTR [rcx+0x6]
> >> and    ebp,0xffffffbf
> >> rol    bp,0x8
> >> or     ebp,r9d
> >>  
> >   
> 


^ permalink raw reply

* Re: [PATCH iwl-next 8/10] ice: move ice_phy_get_speed_eth56g() from ice_ptp_hw.c to ice_common.c
From: Simon Horman @ 2026-04-14  9:38 UTC (permalink / raw)
  To: Aleksandr Loktionov; +Cc: intel-wired-lan, anthony.l.nguyen, netdev
In-Reply-To: <20260410074921.1254213-9-aleksandr.loktionov@intel.com>

On Fri, Apr 10, 2026 at 09:49:19AM +0200, Aleksandr Loktionov wrote:
> ice_phy_get_speed_eth56g() is currently a file-local (static)
> helper in ice_ptp_hw.c. Future users outside that compilation
> unit require access to it.

FWIW, I think it would be slightly better if this patch were accompanied by
such a user.

> 
> Move the function to ice_common.c, add a declaration in
> ice_common.h, and relocate the enum ice_eth56g_link_spd from
> ice_ptp_hw.h to ice_type.h so it is visible to callers of the
> new exported function.
> 
> Suggested-by: Karol Kolacinski <karol.kolacinski@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

* Re: [PATCH iwl-next 9/10] ice: use inline helpers instead of memcmp() for IPv6 mask checks in ice_ethtool_fdir
From: Simon Horman @ 2026-04-14  9:39 UTC (permalink / raw)
  To: Aleksandr Loktionov
  Cc: intel-wired-lan, anthony.l.nguyen, netdev, Larysa Zaremba
In-Reply-To: <20260410074921.1254213-10-aleksandr.loktionov@intel.com>

On Fri, Apr 10, 2026 at 09:49:20AM +0200, Aleksandr Loktionov wrote:
> Replace static full_ipv6_addr_mask / zero_ipv6_addr_mask structs
> and the associated memcmp() calls in ice_ethtool_fdir.c with the
> kernel-provided ipv6_addr_any() helper and a new ice_ipv6_mask_full()
> inline, reducing boilerplate and making intent clearer.
> 
> Suggested-by: Larysa Zaremba <larysa.zaremba@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply
