Re: [net-next v24 4/7] net: mtip: Add net_device_ops functions to the L2 switch driver

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jakub Kicinski <kuba@kernel.org>
To: lukasz.majewski@mailbox.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	pabeni@redhat.com, richardcochran@gmail.com, lgirdwood@gmail.com,
	nathan@kernel.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org, krzk+dt@kernel.org, imx@lists.linux.dev,
	linux-arm-kernel@lists.infradead.org, robh@kernel.org
Subject: Re: [net-next v24 4/7] net: mtip: Add net_device_ops functions to the L2 switch driver
Date: Thu,  4 Jun 2026 19:18:55 -0700	[thread overview]
Message-ID: <20260605021855.3594587-1-kuba@kernel.org> (raw)
In-Reply-To: <20260601112437.2216043-5-lukasz.majewski@mailbox.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: mtip: Add net_device_ops functions to the L2 switch driver

This patch adds the ndo_start_xmit, ndo_set_rx_mode, ndo_tx_timeout,
ndo_set_mac_address, and ndo_get_port_parent_id callbacks to the MTIP
L2 switch driver, along with helpers and a tx_timeout work item.

> diff --git a/drivers/net/ethernet/freescale/mtipsw/mtipl2sw.c b/drivers/net/ethernet/freescale/mtipsw/mtipl2sw.c
> index 5b5229ee411d..7dfeda230e3f 100644
> --- a/drivers/net/ethernet/freescale/mtipsw/mtipl2sw.c
> +++ b/drivers/net/ethernet/freescale/mtipsw/mtipl2sw.c

[ ... ]

> @@ -617,6 +740,76 @@ static void mtip_switch_restart(struct net_device *dev, int duplex0,
>  	mtip_config_switch(fep);
>  }
>  
> +static void mtip_print_hw_state(struct net_device *dev)
> +{

[ ... ]

> +static void mtip_timeout_work(struct work_struct *work)
> +{
> +	struct mtip_ndev_priv *priv =
> +		container_of(work, struct mtip_ndev_priv, tx_timeout_work);
> +	struct switch_enet_private *fep = priv->fep;
> +	struct net_device *dev = priv->dev;
> +
> +	rtnl_lock();
> +	if (netif_device_present(dev) || netif_running(dev)) {
> +		napi_disable(&fep->napi);
> +		netif_tx_lock_bh(dev);
> +		mtip_switch_restart(dev, fep->full_duplex[0],
> +				    fep->full_duplex[1]);
> +		netif_tx_wake_all_queues(dev);
> +		netif_tx_unlock_bh(dev);
> +		napi_enable(&fep->napi);
> +	}
> +	rtnl_unlock();
> +}

[High]
Should the guard here be a logical AND rather than OR?

	if (netif_device_present(dev) || netif_running(dev)) {

netif_device_present() returns true unless netif_device_detach() has been
called, and this driver never calls netif_device_detach().  So the OR
collapses to "always true" once the device has been registered, and the
work proceeds even after mtip_close().

mtip_close() does this on the last close:

	if (fep->usage_count == 1) {
		napi_disable(&fep->napi);
		netif_napi_del(&fep->napi);
		mtip_free_buffers(dev);
		clk_disable(fep->clk_ipg);
	}

After that, can mtip_timeout_work() running due to a previously scheduled
watchdog still call napi_disable(&fep->napi) / napi_enable(&fep->napi) on
a netif_napi_del()'d napi, and call mtip_switch_restart() which writes
hardware registers after clk_ipg has been disabled?

Would gating the work on fep->usage_count (and/or cancel_work_sync() in
mtip_close()) be more correct than the present netif_device_present() ||
netif_running() check?

[ ... ]

> @@ -1095,6 +1288,98 @@ static int mtip_close(struct net_device *dev)
>  	return 0;
>  }
>  
> +#define FEC_HASH_BITS	6		/* #bits in hash */

[ ... ]

> +static int mtip_set_mac_address(struct net_device *dev, void *p)
> +{
> +	struct mtip_ndev_priv *priv = netdev_priv(dev);
> +	struct switch_enet_private *fep = priv->fep;
> +	void __iomem *enet_addr = fep->enet_addr;
> +	struct sockaddr *addr = p;
> +	int ret;
> +
> +	if (!is_valid_ether_addr(addr->sa_data))
> +		return -EADDRNOTAVAIL;
> +
> +	ret = mtip_update_atable_static((unsigned char *)addr->sa_data,
> +					7, 7, fep);
> +	if (ret)
> +		return ret;

[Medium]
Is the previously-programmed MAC ever removed from the switch's static
address table?  mtip_update_atable_static() either updates a matching
entry in place or fills the first empty slot in an 8-entry block:

	for (entry = block_index; entry < index_end; entry++) {
		mtip_read_atable(fep, entry, &read_lo, &read_hi);
		if (read_lo == write_lo &&
		    ((read_hi & 0x0000FFFF) ==
		     (write_hi & 0x0000FFFF))) {
			mtip_write_atable(fep, entry, write_lo, write_hi);
			...
			return 0;
		} else if (!(read_hi & (1 << 16))) {
			mtip_write_atable(fep, entry, write_lo, write_hi);
			...
			return 0;
		}
	}
	...
	return -ENOSPC;

After 8 distinct MAC changes whose CRC8 maps to the same block, will
further mtip_set_mac_address() calls return -ENOSPC, and will the stale
MACs continue to be honored as static entries with port mask 7 and
priority 7?

> +
> +	eth_hw_addr_set(dev, addr->sa_data);
> +
> +	if (priv->portnum == 2)
> +		enet_addr += MCF_ESW_ENET_PORT_OFFSET;
> +
> +	writel(dev->dev_addr[3] | (dev->dev_addr[2] << 8) |
> +	       (dev->dev_addr[1] << 16) | (dev->dev_addr[0] << 24),
> +	       enet_addr + MCF_FEC_PALR);
> +	writel((dev->dev_addr[5] << 16) | (dev->dev_addr[4] << 24),
> +	       enet_addr + MCF_FEC_PAUR);
> +
> +	return 0;
> +}

[Medium]
Should fep->mac[priv->portnum - 1] be updated here as well?  The driver
caches the MAC in fep->mac[] and mtip_get_port_parent_id() reads
fep->mac[0] as the parent identifier.  After a userspace MAC change
fep->mac[] stays at its boot/DT value, so ndo_get_port_parent_id keeps
returning the old value while dev->dev_addr and the hardware registers
have moved on.

[Medium]
Does this sequence need to disable the controller before reprogramming
PALR/PAUR?  mtip_configure_enet_mii() programs ECR to MAGIC_ENA only
(no ETHER_EN), then writes PALR/PAUR, then enables ETHER_EN at the end:

	writel(MCF_FEC_ECR_MAGIC_ENA, enet_addr + MCF_FEC_ECR);
	...
	writel(... , enet_addr + MCF_FEC_PALR);
	writel(... , enet_addr + MCF_FEC_PAUR);
	...
	writel(readl(enet_addr + MCF_FEC_ECR) | MCF_FEC_ECR_ETHER_EN,
	       enet_addr + MCF_FEC_ECR);

mtip_set_mac_address() writes PALR and PAUR while ETHER_EN can be set
and traffic is being matched, and it does not take fep->hw_lock either,
so it can race with mtip_configure_enet_mii() / mtip_switch_restart()
that touch the same register block.  Is that intended?

> +
> +static int mtip_get_port_parent_id(struct net_device *ndev,
> +				   struct netdev_phys_item_id *ppid)
> +{
> +	struct mtip_ndev_priv *priv = netdev_priv(ndev);
> +	struct switch_enet_private *fep = priv->fep;
> +
> +	ppid->id_len = sizeof(fep->mac[0]);
> +	memcpy(&ppid->id, &fep->mac[0], ppid->id_len);
> +
> +	return 0;
> +}

[Medium]
Is fep->mac[0] a suitable choice for the switch parent id?  fep->mac[]
is populated only from DT via of_get_mac_address() in mtip_parse_of();
when DT does not supply a mac-address for port 1, fep->mac[0] stays
zeroed (it is kzalloc'd in probe), since mtip_setup_mac()'s bootloader
and random-MAC fallbacks update dev->dev_addr only and never write back
into fep->mac[].

If two MTIP switches both lack a DT MAC for port 1, both will report
ppid = 00:00:00:00:00:00, which switchdev/bridge code will treat as the
same switch (offload_fwd_mark, bridge offload decisions).

Would a switch-wide identifier (a register-derived id, the platform
of_node phandle, etc.) be more appropriate than a port MAC?

[ ... ]

> @@ -1158,10 +1448,14 @@ static int mtip_switch_dma_init(struct switch_enet_private *fep)
>  
>  static void mtip_ndev_cleanup(struct switch_enet_private *fep)
>  {
> +	struct mtip_ndev_priv *priv;
>  	int i;
>  
>  	for (i = 0; i < SWITCH_EPORT_NUMBER; i++) {
>  		if (fep->ndev[i]) {
> +			priv = netdev_priv(fep->ndev[i]);
> +			cancel_work_sync(&priv->tx_timeout_work);
> +
>  			unregister_netdev(fep->ndev[i]);
>  			free_netdev(fep->ndev[i]);
>  			fep->ndev[i] = NULL;

[High]
Is the order here safe against the netdev TX watchdog?  The watchdog
timer is only stopped inside unregister_netdev() (via dev_deactivate),
so between cancel_work_sync() returning and unregister_netdev() actually
disarming the watchdog the timer can still fire mtip_timeout(), which
calls schedule_work(&priv->tx_timeout_work) and re-queues the work that
was just drained.

After unregister_netdev() and free_netdev() complete, priv (which embeds
tx_timeout_work) is freed, and when the worker thread later picks up
the queued work it will dereference a freed priv/fep.

Would the sequence

	unregister_netdev(fep->ndev[i]);
	cancel_work_sync(&priv->tx_timeout_work);
	free_netdev(fep->ndev[i]);

be safer, since unregister_netdev() quiesces all sources of new work
before cancel_work_sync() drains what remains?  The same concern applies
to the rollback path in mtip_ndev_init() for ports whose watchdog is
already active.

next prev parent reply	other threads:[~2026-06-05  2:19 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-01 11:24 [net-next v24 0/7] net: mtip: Add support for MTIP imx287 L2 switch driver Lukasz Majewski
2026-06-01 11:24 ` [net-next v24 1/7] dt-bindings: net: Add MTIP L2 switch description Lukasz Majewski
2026-06-02 11:32   ` sashiko-bot
2026-06-01 11:24 ` [net-next v24 2/7] net: mtip: The L2 switch driver for imx287 Lukasz Majewski
2026-06-02 11:32   ` sashiko-bot
2026-06-05  2:18   ` Jakub Kicinski
2026-06-01 11:24 ` [net-next v24 3/7] net: mtip: Add buffers management functions to the L2 switch driver Lukasz Majewski
2026-06-02 11:32   ` sashiko-bot
2026-06-05  2:18   ` Jakub Kicinski
2026-06-09  8:37     ` Łukasz Majewski
2026-06-01 11:24 ` [net-next v24 4/7] net: mtip: Add net_device_ops " Lukasz Majewski
2026-06-02 11:32   ` sashiko-bot
2026-06-05  2:18   ` Jakub Kicinski [this message]
2026-06-01 11:24 ` [net-next v24 5/7] net: mtip: Add mtip_switch_{rx|tx} " Lukasz Majewski
2026-06-02 11:32   ` sashiko-bot
2026-06-05  2:18   ` Jakub Kicinski
2026-06-01 11:24 ` [net-next v24 6/7] net: mtip: Extend the L2 switch driver with management operations Lukasz Majewski
2026-06-02 11:32   ` sashiko-bot
2026-06-05  2:18   ` Jakub Kicinski
2026-06-01 11:24 ` [net-next v24 7/7] net: mtip: Extend the L2 switch driver for imx287 with bridge operations Lukasz Majewski
2026-06-02 11:32   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260605021855.3594587-1-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=imx@lists.linux.dev \
    --cc=krzk+dt@kernel.org \
    --cc=lgirdwood@gmail.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lukasz.majewski@mailbox.org \
    --cc=nathan@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=richardcochran@gmail.com \
    --cc=robh@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.