Re: [PATCH net-next v4 13/16] onsemi: s2500: Add driver support for TS2500 MAC-PHY

Devicetree
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Selvamani Rajagopal via B4 Relay"
	<devnull+Selvamani.Rajagopal.onsemi.com@kernel.org>
Cc: robh@kernel.org, devicetree@vger.kernel.org, conor+dt@kernel.org
Subject: Re: [PATCH net-next v4 13/16] onsemi: s2500: Add driver support for TS2500 MAC-PHY
Date: Sun, 07 Jun 2026 05:43:02 +0000	[thread overview]
Message-ID: <20260607054302.DDCC71F00893@smtp.kernel.org> (raw)
In-Reply-To: <20260605-s2500-mac-phy-support-v4-13-de0fbc13c6d8@onsemi.com>

Thank you for your contribution! Sashiko AI review found 14 potential issue(s) to consider:
- [High] The PTP adjtime ioctl callback actively blocks the calling user-space thread for up to 3 minutes.
- [High] Hardware multicast filtering is inadvertently permanently disabled when 1 to 3 multicast addresses are configured.
- [High] Missing cleanup in the s2500_open() error path leaks the PHY and leaves a dangling ERR_PTR causing kernel panics.
- [High] A transient SPI error permanently terminates the background kthread and leads to a Use-After-Free.
- [High] Missing cleanup in device probe error paths leaks the OA TC6 framework and PTP resources.
- [High] Unprotected read-modify-write operations on the MAC_CTRL register lead to race conditions and corrupted device state.
- [High] The `target` timespec64 variable is used uninitialized if `ptp->gettimex64()` fails.
- [High] Missing kthread_should_stop() check after setting TASK_INTERRUPTIBLE causes up to a 5-second shutdown delay.
- [High] The driver iterates over the multicast address list locklessly in a background thread, risking list corruption and use-after-free.
- [Medium] The driver duplicates standard network statistics in its private ethtool -S interface and fails to implement ndo_get_stats64.
- [Medium] The driver advertises a max_adj of 100,000,000 ppb, but its adjfine implementation rejects values exceeding ~33.5 million ppb.
- [Medium] Software MAC address is updated even if the hardware update fails.
- [Medium] Concurrent adjtime and adjfine calls clobber the hardware frequency adjustments.
- [Medium] Multicast MAC filter registers are written in the wrong order.
--

commit 7da83aa832497d5fef58b67e53a49c036aa7948e
Author: Selvamani Rajagopal <Selvamani.Rajagopal@onsemi.com>
Subject: onsemi: s2500: Add driver support for TS2500 MAC-PHY

This commit introduces the driver for the onsemi S2500 10BASE-T1S MAC-PHY
Ethernet transceiver. It utilizes the Open Alliance TC6 framework to handle
SPI transactions, and includes standard network device operations, ethtool
capabilities, and PTP hardware timestamping with a custom phase adjustment
fallback loop.

> diff --git a/drivers/net/ethernet/onsemi/s2500/s2500_ethtool.c b/drivers/net/ethernet/onsemi/s2500/s2500_ethtool.c
> new file mode 100644
> index 0000000000000..85bd8ea50dd8c
> --- /dev/null
> +++ b/drivers/net/ethernet/onsemi/s2500/s2500_ethtool.c
[ ... ]
> +static const char s2500_mac_stat_strings[][ETH_GSTRING_LEN] = {
> +	"tx_bytes",
> +	"tx_frames",
> +	"tx_broadcast_frames",
> +	"tx_multicast_frames",

[Severity: Medium]
Should standard network UAPI stats like tx_bytes and tx_frames be duplicated
here in the private ethtool strings?

According to the standard kernel networking guidelines, UAPI stats should be
implemented via the standard ndo_get_stats64 interface instead of adding
private UAPI strings in ethtool -S.

> diff --git a/drivers/net/ethernet/onsemi/s2500/s2500_main.c b/drivers/net/ethernet/onsemi/s2500/s2500_main.c
> new file mode 100644
> index 0000000000000..7aa3267db39b5
> --- /dev/null
> +++ b/drivers/net/ethernet/onsemi/s2500/s2500_main.c
[ ... ]
> +static int s2500_mac_ctrl_modify_bits(struct s2500_info *priv,
> +				      u32 in_bits, bool clr)
> +{
> +	u32 reg = S2500_REG_MAC_CTRL;
> +	u32 rval = 0;
> +	int ret;
> +
> +	ret = oa_tc6_read_register_mms(priv->tc6, reg,
> +				       OA_TC6_PHY_C45_MAC_MMS1, &rval);
> +	if (!ret) {
> +		u32 wval = 0;
> +
> +		if (clr)
> +			wval = rval & ~in_bits;
> +		else
> +			wval = rval | in_bits;
> +		if (rval != wval)
> +			ret = oa_tc6_write_register_mms(priv->tc6, reg,
> +							OA_TC6_PHY_C45_MAC_MMS1, wval);
> +	}
> +	return ret;
> +}

[Severity: High]
Does this read-modify-write operation need mutual exclusion?

s2500_mac_ctrl_modify_bits() performs a non-atomic read-modify-write on
the MAC_CTRL register over the SPI bus. Because SPI operations can sleep,
calling this concurrently from the kthread (e.g. s2500_rx_mode_update())
and ndo_stop() could overwrite the device state, potentially leaving the
hardware incorrectly enabled or disabled.

[ ... ]
> +static int s2500_set_multicast_mode(struct s2500_info *priv,
> +				    unsigned int rx_flags)
> +{
[ ... ]
> +	} else {
> +		struct netdev_hw_addr *ha;
> +		u32 addrh, addrl;
> +
> +		/* Disable multicast filter */
> +		ret = s2500_mac_ctrl_modify_bits(priv,
> +						 S2500_MAC_CTRL_MCSF_BIT,
> +						 true);
> +		if (ret)
> +			return ret;
> +
> +		/* Disable filters */
> +		for (i = 1; i <= S2500_N_MCAST_FILTERS; i++) {
> +			ret = oa_tc6_write_register_mms(priv->tc6,
> +							S2500_REG_MAC_ADDRFILTH(i),
> +							OA_TC6_PHY_C45_MAC_MMS1, 0);
> +			if (ret)
> +				return ret;
> +		}
> +
> +		i = 1;
> +		netdev_for_each_mc_addr(ha, priv->ndev) {

[Severity: High]
Is the multicast filter accidentally left disabled in this execution path?

When 1 to 3 multicast addresses are configured, s2500_mac_ctrl_modify_bits()
is called to disable the filter, but there appears to be no subsequent call
to clear this bit and re-enable the filter after the addresses are written.
This would leave the MAC unconditionally accepting all multicast traffic.

[Severity: High]
Can this list traversal race with user-space configuration?

Since this executes from a background kthread, traversing priv->ndev->mc.list
without holding netif_addr_lock_bh() risks list corruption or a
use-after-free if a user-space daemon simultaneously modifies the addresses.

> +			if (i > S2500_N_MCAST_FILTERS)
> +				break;
> +
> +			addrh = S2500_MAC_ADDRFILT_EN_BIT |
> +				 get_unaligned_be16(ha->addr);
> +			addrl = get_unaligned_be32(&ha->addr[2]);
> +
> +			ret = oa_tc6_write_register_mms(priv->tc6,
> +							S2500_REG_MAC_ADDRFILTH(i),
> +							OA_TC6_PHY_C45_MAC_MMS1, addrh);
> +			if (ret)
> +				return ret;
> +
> +			ret = oa_tc6_write_register_mms(priv->tc6,
> +							S2500_REG_MAC_ADDRFILTL(i),
> +							OA_TC6_PHY_C45_MAC_MMS1, addrl);

[Severity: Medium]
Should the lower register (FILTL) be written before the high register
(FILTH)?

Because FILTH contains the S2500_MAC_ADDRFILT_EN_BIT, writing it first
enables the filter using the new high bytes with the old, uninitialized
low bytes. This could cause a brief window of incorrect hardware filtering.

[ ... ]
> +static int s2500_set_mac_address(struct net_device *ndev, void *p)
> +{
> +	struct sockaddr *addr = p;
> +
> +	if (!is_valid_ether_addr(addr->sa_data))
> +		return -EADDRNOTAVAIL;
> +
> +	eth_hw_addr_set(ndev, addr->sa_data);
> +	return s2500_set_mac_filter(ndev, addr->sa_data);
> +}

[Severity: Medium]
Should the software MAC address only be updated after the hardware update
succeeds?

Updating the address via eth_hw_addr_set() before s2500_set_mac_filter()
completes means a transient SPI failure will leave the software and hardware
MAC addresses permanently out of sync.

[ ... ]
> +static int s2500_thread_fun(void *data)
> +{
> +	struct s2500_info *priv = data;
> +	bool update_rx_mode = false;
> +	unsigned long flags;
> +	signed long tout;
> +	int ret = 0;
> +
> +	tout = priv->poll_jiff;
> +
> +	do {
> +		if (update_rx_mode) {
> +			ret = s2500_rx_mode_update(priv);
> +			if (unlikely(ret)) {
> +				dev_err(&priv->spi->dev, "Failed to set new RX mode");
> +				break;
> +			}
> +		}

[Severity: High]
Does this break statement leave priv->thread dangling?

If a transient SPI error occurs during s2500_rx_mode_update(), the kthread
exits permanently, but priv->thread is never cleared. Later calls to
s2500_stop() or s2500_set_rx_mode() might attempt to dereference the
freed task_struct, causing a use-after-free.

[ ... ]
> +		spin_lock_irqsave(&priv->lock, flags);
> +		__set_current_state(TASK_INTERRUPTIBLE);
> +
> +		update_rx_mode = priv->rx_flags_upd;
> +		ret = update_rx_mode;
> +
> +		spin_unlock_irqrestore(&priv->lock, flags);
> +
> +		if (!ret)
> +			tout = schedule_timeout(tout);
> +		else
> +			set_current_state(TASK_RUNNING);
> +	} while (!kthread_should_stop());
> +	return 0;
> +}

[Severity: High]
Can this sequence cause a deadlock or a delay when stopping the interface?

The thread sets TASK_INTERRUPTIBLE and then calls schedule_timeout() without
first checking kthread_should_stop(). If kthread_stop() is executed before
the state change, the wakeup is consumed early, and the thread will sleep
unconditionally for up to 5 seconds while the caller blocks on kthread_stop.

[ ... ]
> +static int s2500_open(struct net_device *ndev)
> +{
> +	struct s2500_info *priv = netdev_priv(ndev);
> +	int ret = 0;
> +	u32 val;
> +
> +	dev_dbg(&ndev->dev, "%s", "s2500_open");
> +	phy_start(priv->ndev->phydev);
> +
> +	priv->thread = kthread_run(s2500_thread_fun, priv, DRV_NAME "/%s:%d",
> +				   dev_name(&priv->spi->dev),
> +				   spi_get_chipselect(priv->spi, 0));
> +
> +	if (IS_ERR(priv->thread)) {
> +		ret = PTR_ERR(priv->thread);
> +	} else {
> +		val = S2500_MAC_CTRL_TXEN_BIT | S2500_MAC_CTRL_RXEN_BIT;
> +		ret = s2500_mac_ctrl_modify_bits(priv, val, false);
> +
> +		netif_start_queue(priv->ndev);
> +	}
> +	return ret;
> +}

[Severity: High]
Does this error path properly handle a kthread creation failure?

If kthread_run() fails, priv->thread is left containing an ERR_PTR and the
PHY remains running. If s2500_set_rx_mode() is later called, it checks if
priv->thread exists and attempts a wakeup, which would trigger a kernel
panic when dereferencing the ERR_PTR.

[ ... ]
> +static int s2500_probe(struct spi_device *spi)
> +{
[ ... ]
> +	ret = s2500_init(priv);
> +	if (unlikely(ret)) {
> +		dev_err(&spi->dev, "failed to s2500_init the device");
> +		goto err_reg_read;
> +	}
> +
> +	/* Configure PTP if the model supports it */
> +	ret = s2500_ptp_register(priv);
> +	if (unlikely(ret))
> +		goto err_reg_read;
> +
> +	ret = register_netdev(ndev);
> +	if (ret) {
> +		dev_err(&spi->dev, "failed to register the S2500 device\n");
> +		ret = -ENODEV;
> +
> +		goto err_reg_read;
> +	}
> +	return 0;
> +
> +err_reg_read:
> +	dev_err(&spi->dev, "could not initialize macphy");
> +	return ret;
> +}

[Severity: High]
Are the OA TC6 and PTP resources being leaked in this error path?

If initialization steps like s2500_init() or register_netdev() fail, the
code branches to err_reg_read and returns without invoking oa_tc6_exit()
or oa_tc6_ptp_unregister(). Because these are not managed by devres, they
could leak permanently.

> diff --git a/drivers/net/ethernet/onsemi/s2500/s2500_ptp.c b/drivers/net/ethernet/onsemi/s2500/s2500_ptp.c
> new file mode 100644
> index 0000000000000..fd6617c7ac798
> --- /dev/null
> +++ b/drivers/net/ethernet/onsemi/s2500/s2500_ptp.c
[ ... ]
> +static int s2500_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
> +{
> +	struct s2500_info *priv = container_of(ptp, struct s2500_info,
> +					       ptp_clock_info);
> +	u32 sign_bit = 0;
> +	long adj;
> +	u32 val;
> +	u64 ppm;
> +
> +	if (scaled_ppm < 0) {
> +		/* split sign / mod */
> +		sign_bit = 1U << 31;
> +		scaled_ppm = ~scaled_ppm + 1;
> +	}
[ ... ]
> +	ppm = (u64)scaled_ppm * 125;
> +	do_div(ppm, 1024);
> +	adj = (long)ppm;
> +
> +	/* check overflow */
> +	if (adj >= (1L << 28))
> +		return -ERANGE;

[Severity: Medium]
Does this boundary check reject valid frequency adjustments?

In s2500_ptp_register(), the max_adj is advertised as 100,000,000 ppb (10%).
However, restricting adj to (1L << 28) mathematically restricts the maximum
scaled_ppm equivalent to roughly 33.5 million ppb. This mismatch causes
valid user-space adjustments to fail with -ERANGE.

> +
> +	val = (u32)adj | sign_bit;
> +	return oa_tc6_write_register_mms(priv->tc6, S2500_REG_VS_PTP_ADJ,
> +					 OA_TC6_PHY_C45_VS_MMS12, val);
> +}

[Severity: Medium]
Is a lock required here to prevent clobbering concurrent frequency updates?

This function writes directly to the hardware without acquiring
priv->ptp_adj_lock. A concurrent user-space request to adjfine could conflict
with the driver's own internal slewing loop in s2500_ptp_adjtime().

[ ... ]
> +static int s2500_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
> +{
[ ... ]
> +	while (remaining != 0 && max_iters--) {
> +		s64 abs_delta = remaining > 0 ? remaining : -remaining;
> +
> +		/* If the adjustment is very large, more than 1 second,
> +		 * use settime to avoid very long slewing periods or
> +		 * excessive frequency offsets.
> +		 */
> +		if (abs_delta > 1000000000LL) {
> +			memset(&sts, 0, sizeof(sts));
> +			ret = ptp->gettimex64(ptp, &now, &sts);
> +			if (!ret) {
> +				struct timespec64 delta_ts;
> +
> +				if (remaining >= 0) {
> +					delta_ts = ns_to_timespec64(remaining);
> +					target = timespec64_add(now, delta_ts);
> +				} else {
> +					delta_ts = ns_to_timespec64(-remaining);
> +					target = timespec64_sub(now, delta_ts);
> +				}
> +			}
> +
> +			if (target.tv_sec < 0 || target.tv_sec >= (1ULL << 32))
> +				ret = -ERANGE;
> +			else
> +				ret = ptp->settime64(ptp, &target);

[Severity: High]
Can target be evaluated uninitialized if ptp->gettimex64() fails?

If gettimex64() encounters an SPI error and returns a non-zero status, the
initialization block is bypassed, but execution falls through to check
target.tv_sec. This may pass garbage data to ptp->settime64(), potentially
corrupting the hardware clock.

[ ... ]
> +		/* Apply frequency adjustment */
> +		ret = ptp->adjfine(ptp, (long)scaled_ppm);
> +		if (ret)
> +			break;
> +
> +		/* Sleep for the slew period (interruptible). If interrupted, clear
> +		 * the adjfine and return with -EINTR.
> +		 */
> +		if (msleep_interruptible(period_ms)) {

[Severity: High]
Is it safe to block the caller for up to 3 minutes during phase adjustment?

Because adjtime is invoked synchronously by the clock_adjtime syscall,
sleeping here with msleep_interruptible() for up to 60 seconds per iteration
will cause the user-space daemon (e.g., ptp4l) to block entirely, preventing
it from processing other network events.

> +			/* Clear adjfine */
> +			ptp->adjfine(ptp, 0);
> +			ret = -EINTR;
> +			break;
> +		}
> +
> +		/* Clear adjfine and measure remaining offset */
> +		ptp->adjfine(ptp, 0);

[Severity: Medium]
Does clearing adjfine here wipe out concurrent user-space frequency changes?

Unconditionally calling ptp->adjfine(ptp, 0) at the end of the sleep cycle
will overwrite any concurrent clock adjustments made by other user-space
tasks, disrupting synchronization.

> +
> +		memset(&sts, 0, sizeof(sts));
> +		ret = ptp->gettimex64(ptp, &now, &sts);
> +		if (ret)
> +			break;
[ ... ]

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605-s2500-mac-phy-support-v4-0-de0fbc13c6d8@onsemi.com?part=13

next prev parent reply	other threads:[~2026-06-07  5:43 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-06  5:42 [PATCH net-next v4 00/16] Support for onsemi's S2500 10Base-T1S MAC-PHY Selvamani Rajagopal via B4 Relay
2026-06-06  5:42 ` [PATCH net-next v4 01/16] net: phy: Helper to read and write through C45 without lock Selvamani Rajagopal via B4 Relay
2026-06-06  5:42 ` [PATCH net-next v4 02/16] net: phy: Helper to modify PHY loopback mode only Selvamani Rajagopal via B4 Relay
2026-06-06  5:42 ` [PATCH net-next v4 03/16] net: ethernet: oa_tc6: Move oa_tc6.c to its own directory Selvamani Rajagopal via B4 Relay
2026-06-07  5:42   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 04/16] net: phy: microchip_t1s: Use generic APIs for C45 read and write Selvamani Rajagopal via B4 Relay
2026-06-06  5:42 ` [PATCH net-next v4 05/16] net: ethernet: oa_tc6: Move constant definitions to header file Selvamani Rajagopal via B4 Relay
2026-06-06  5:42 ` [PATCH net-next v4 06/16] net: ethernet: oa_tc6: Support for hardware timestamp Selvamani Rajagopal via B4 Relay
2026-06-07  5:42   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 07/16] net: ethernet: oa_tc6: Support for vendor specific MMS Selvamani Rajagopal via B4 Relay
2026-06-07  5:42   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 08/16] net: ethernet: oa_tc6: Remove FCS size in RX frame Selvamani Rajagopal via B4 Relay
2026-06-07  5:42   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 09/16] net: ethernet: oa_tc6: read, write interface with MMS option Selvamani Rajagopal via B4 Relay
2026-06-07  5:43   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 10/16] net: phy: ncn26000: Support for onsemi's S2500 internal phy Selvamani Rajagopal via B4 Relay
2026-06-06  5:42 ` [PATCH net-next v4 11/16] net: phy: ncn26000: Enable enhanced noise immunity Selvamani Rajagopal via B4 Relay
2026-06-06  5:42 ` [PATCH net-next v4 12/16] net: phy: ncn26000: Support for loopback support Selvamani Rajagopal via B4 Relay
2026-06-07  5:43   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 13/16] onsemi: s2500: Add driver support for TS2500 MAC-PHY Selvamani Rajagopal via B4 Relay
2026-06-07  5:43   ` sashiko-bot [this message]
2026-06-07  5:56   ` Randy Dunlap
2026-06-06  5:42 ` [PATCH net-next v4 14/16] onsemi: s2500: Added selftest support to onsemi's S2500 driver Selvamani Rajagopal via B4 Relay
2026-06-07  5:43   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 15/16] dt-bindings: net: add onsemi's S2500 Selvamani Rajagopal via B4 Relay
2026-06-07  5:43   ` sashiko-bot
2026-06-06  5:42 ` [PATCH net-next v4 16/16] Documentation: networking: Add timestamp related APIs to OA TC6 framework Selvamani Rajagopal via B4 Relay
2026-06-07  5:43   ` sashiko-bot
2026-06-07  5:49   ` Randy Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260607054302.DDCC71F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=devnull+Selvamani.Rajagopal.onsemi.com@kernel.org \
    --cc=robh@kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox