Netdev List


* Re: [PATCH net-deletions v2] net: remove unused ATM protocols and legacy ATM device drivers
From: Andy Shevchenko @ 2026-04-23  7:53 UTC
  To: Philip Prindeville
  Cc: David Woodhouse, Jakub Kicinski, davem, openwrt-devel, Guy Ellis,
	netdev, edumazet, pabeni, andrew+netdev, horms, corbet, skhan,
	linux, tsbogend, maddy, mpe, npiggin, chleroy, 3chas3, razor,
	idosch, jani.nikula, mchehab+huawei, tytso, herbert, geert,
	ebiggers, johannes.berg, jonathan.cameron, kees, kuniyu,
	fourier.thomas, rdunlap, akpm, linux-doc, linux-mips,
	linuxppc-dev, bridge
In-Reply-To: <68316F0B-2442-4492-A041-E57EFC58AC08@redfish-solutions.com>

On Wed, Apr 22, 2026 at 08:41:27PM -0600, Philip Prindeville wrote:
> > On Apr 22, 2026, at 7:05 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> > On Tue, 2026-04-21 at 21:18 -0700, Jakub Kicinski wrote:

...

> >>    I'm still deleting the solos driver, chances are nobody uses it.
> >>    Easy enough to revert back in since core is still around.
> >>    The guiding principle is to keep USB modems and delete
> >>    the rest as USB ADSL2+ CPEs were most popular historically.
> > 
> > Still not entirely convinced; I worked on both USB ATM modems and on
> > Solos, and the Solos is both the most modern and the only one I still
> > actually have. And the only one we have native support for that could
> > ever do full 24Mb/s ADSL2+, I believe.
> > 
> > If we drop it, OpenWrt will need to drop support for these, which I
> > think were quite popular at the time; there were a few UK resellers:
> > https://openwrt.org/toh/traverse/geos1_1
> > 
> > I still don't actually care *enough* to try to find an ADSL line I
> > could plug one into for testing though... :)
> 
> I have 3 boards lying around if anyone wants them.

The problem, as I understand it, is finding someone willing to maintain
and support that driver while doing regular testing...

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH net] net: airoha: Do not wake all netdev TX queues in airoha_qdma_wake_netdev_txqs()
From: Lorenzo Bianconi @ 2026-04-23  7:51 UTC
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260421-airoha-wake_netdev_txqs-optmization-v1-1-e0be95115d53@kernel.org>

> Do not wake every netdev TX queue across all ports sharing the QDMA by
> running the netif_tx_wake_all_queues() routine in
> airoha_qdma_wake_netdev_txqs(); wake only the ones that are mapped to
> the specific stopped QDMA hw TX queue.
> This patch can potentially avoid waking already stopped netdev TX queues
> that are mapped to a different QDMA hw TX queue.
> Introduce airoha_qdma_get_txq utility routine.
> 
> Fixes: b94769eb2f30 ("net: airoha: Fix possible TX queue stall in airoha_qdma_tx_napi_poll()")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/airoha/airoha_eth.c | 19 +++++++++++++++----
>  drivers/net/ethernet/airoha/airoha_eth.h |  5 +++++
>  2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index 19f67c7dd8e1..2ca569501045 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -847,13 +847,24 @@ static void airoha_qdma_wake_netdev_txqs(struct airoha_queue *q)
>  {
>  	struct airoha_qdma *qdma = q->qdma;
>  	struct airoha_eth *eth = qdma->eth;
> -	int i;
> +	int i, qid = q - &qdma->q_tx[0];
>  
>  	for (i = 0; i < ARRAY_SIZE(eth->ports); i++) {
>  		struct airoha_gdm_port *port = eth->ports[i];
> +		int j;
> +
> +		if (!port)
> +			continue;
>  
> -		if (port && port->qdma == qdma)
> -			netif_tx_wake_all_queues(port->dev);
> +		if (port->qdma != qdma)
> +			continue;
> +
> +		for (j = 0; j < port->dev->num_tx_queues; j++) {
> +			if (airoha_qdma_get_txq(qdma, j) != qid)
> +				continue;
> +
> +			netif_wake_subqueue(port->dev, j);
> +		}
>  	}
>  	q->txq_stopped = false;
>  }
> @@ -1965,7 +1976,7 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
>  	u16 index;
>  	u8 fport;
>  
> -	qid = skb_get_queue_mapping(skb) % ARRAY_SIZE(qdma->q_tx);
> +	qid = airoha_qdma_get_txq(qdma, skb_get_queue_mapping(skb));
>  	tag = airoha_get_dsa_tag(skb, dev);
>  
>  	msg0 = FIELD_PREP(QDMA_ETH_TXMSG_CHAN_MASK,
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.h b/drivers/net/ethernet/airoha/airoha_eth.h
> index 87b328cfefb0..c3ea7aadbd82 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.h
> +++ b/drivers/net/ethernet/airoha/airoha_eth.h
> @@ -631,6 +631,11 @@ u32 airoha_rmw(void __iomem *base, u32 offset, u32 mask, u32 val);
>  #define airoha_qdma_clear(qdma, offset, val)			\
>  	airoha_rmw((qdma)->regs, (offset), (val), 0)
>  
> +static inline u16 airoha_qdma_get_txq(struct airoha_qdma *qdma, u16 qid)
> +{
> +	return qid % ARRAY_SIZE(qdma->q_tx);
> +}
> +
>  static inline bool airoha_is_lan_gdm_port(struct airoha_gdm_port *port)
>  {
>  	/* GDM1 port on EN7581 SoC is connected to the lan dsa switch.
> 
> ---
> base-commit: a663bac71a2f0b3ac6c373168ca57b2a6e6381aa
> change-id: 20260421-airoha-wake_netdev_txqs-optmization-65171ce4ebad
> 
> Best regards,
> -- 
> Lorenzo Bianconi <lorenzo@kernel.org>
> 

Commenting on the issues reported by Sashiko:
https://sashiko.dev/#/patchset/20260421-airoha-wake_netdev_txqs-optmization-v1-1-e0be95115d53%40kernel.org

- Can this cause an infinite NETDEV_TX_BUSY livelock when a QDMA hardware queue is full?
  The issue is already fixed in the following patch:
  https://patchwork.kernel.org/project/netdevbpf/patch/20260421-airoha-fix-bql-v1-1-f135afe4275b@kernel.org/

Regards,
Lorenzo
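
The queue selection in the quoted patch can be sketched as a small userspace
model, not driver code. It assumes a hypothetical ring count NUM_HW_TXQ
(the driver uses ARRAY_SIZE(qdma->q_tx)); hw_txq_for() and
wake_mapped_txqs() are illustrative names that do not exist in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of QDMA hw TX rings (assumption for this sketch). */
#define NUM_HW_TXQ 32

/* Same modulo mapping as airoha_qdma_get_txq() in the quoted patch. */
static uint16_t hw_txq_for(uint16_t netdev_txq)
{
	return netdev_txq % NUM_HW_TXQ;
}

/*
 * Mark in woken[] every netdev TX queue of one port that maps to hw ring
 * qid, and return how many queues were marked. Queues mapped to other
 * rings are left untouched, which is the point of the patch.
 */
static size_t wake_mapped_txqs(uint16_t qid, size_t num_tx_queues,
			       uint8_t *woken)
{
	size_t n = 0;

	for (size_t j = 0; j < num_tx_queues; j++) {
		if (hw_txq_for((uint16_t)j) != qid)
			continue;

		woken[j] = 1;	/* stands in for netif_wake_subqueue() */
		n++;
	}
	return n;
}
```

With 64 netdev queues and 32 rings, only two netdev queues share each ring,
so a stopped ring wakes two queues instead of all 64.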



* Re: [PATCH net v8 4/6] net/sched: netem: validate slot configuration
From: Paolo Abeni @ 2026-04-23  7:50 UTC
  To: jhs
  Cc: netdev, jiri, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Dave Taht, open list, Stephen Hemminger, Simon Horman
In-Reply-To: <20260421131039.GA651125@horms.kernel.org>

On 4/21/26 3:10 PM, Simon Horman wrote:
> On Fri, Apr 17, 2026 at 08:19:42PM -0700, Stephen Hemminger wrote:
>> Reject slot configurations that have no defensible meaning:
>>
>>   - negative min_delay or max_delay
>>   - min_delay greater than max_delay
>>   - negative dist_delay or dist_jitter
>>   - negative max_packets or max_bytes
>>
>> Negative or out-of-order delays underflow in get_slot_next(),
>> producing garbage intervals. Negative limits trip the per-slot
>> accounting (packets_left/bytes_left <= 0) on the first packet of
>> every slot, defeating the rate-limiting half of the slot feature.
>>
>> Note that dist_jitter has been silently coerced to its absolute
>> value by get_slot() since the feature was introduced; rejecting
>> negatives here converts that silent coercion into -EINVAL. The
>> abs() can be removed in a follow-up.
>>
>> Fixes: 836af83b54e3 ("netem: support delivering packets in delayed time slots")
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> 
> Reviewed-by: Simon Horman <horms@kernel.org>
> 
> I acknowledge that Sashiko has provided feedback on this patch.
> 
> 1. "Does rejecting negative dist_jitter values with -EINVAL cause a
>     regression in userspace ABI backward compatibility?  Since the kernel
>     previously accepted these values and silently coerced them using abs(),
>     existing userspace tools or scripts that happen to pass negative values
>     might start failing to configure the qdisc."
> 
> This is intended and explicitly explained in the cover letter.

Jamal, given the uAPI implication, could you please double-check that
the change is fine?

Thanks,

Paolo
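
For reference, the four rejection rules listed in the quoted commit message
can be sketched as below. This is a userspace illustration only, not the
code from the patch: struct slot_conf is a local stand-in for the fields
named in the commit message, and slot_conf_validate() is a hypothetical
name.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-in for the slot parameters (assumption for this sketch). */
struct slot_conf {
	int64_t min_delay;
	int64_t max_delay;
	int32_t max_packets;
	int32_t max_bytes;
	int64_t dist_delay;
	int64_t dist_jitter;
};

/* Return 0 if the configuration is acceptable, -EINVAL otherwise. */
static int slot_conf_validate(const struct slot_conf *c)
{
	/* negative min_delay or max_delay */
	if (c->min_delay < 0 || c->max_delay < 0)
		return -EINVAL;

	/* min_delay greater than max_delay */
	if (c->min_delay > c->max_delay)
		return -EINVAL;

	/* negative dist_delay or dist_jitter (the uAPI-visible change) */
	if (c->dist_delay < 0 || c->dist_jitter < 0)
		return -EINVAL;

	/* negative max_packets or max_bytes */
	if (c->max_packets < 0 || c->max_bytes < 0)
		return -EINVAL;

	return 0;
}
```

The third check is the one with the uAPI implication: a negative
dist_jitter that used to be coerced with abs() would now fail.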



* [PATCH net v2] net: mctp i2c: check length before marking flow active
From: William A. Kennington III @ 2026-04-23  7:46 UTC
  To: Jeremy Kerr, Matt Johnston, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Wolfram Sang
  Cc: William A. Kennington III, netdev, linux-kernel
In-Reply-To: <20260423001517.79219-1-william@wkennington.com>

Currently, mctp_i2c_get_tx_flow_state() is called before the packet length
sanity check. This function marks a new flow as active in the MCTP core.

If the sanity check fails, mctp_i2c_xmit() returns early without calling
mctp_i2c_lock_nest(). This results in a mismatched locking state: the
flow is active, but the I2C bus lock was never acquired for it.

When the flow is later released, mctp_i2c_release_flow() will see the
active state and queue an unlock marker. The TX thread will then
decrement midev->i2c_lock_count from 0, causing it to underflow to -1.

This underflow permanently breaks the driver's locking logic, allowing
future transmissions to occur without holding the I2C bus lock, leading
to bus collisions and potential hardware hangs.

Move the mctp_i2c_get_tx_flow_state() call to after the length sanity
check to ensure we only transition the flow state if we are actually
going to proceed with the transmission and locking.

Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
Signed-off-by: William A. Kennington III <william@wkennington.com>
---
 drivers/net/mctp/mctp-i2c.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mctp/mctp-i2c.c b/drivers/net/mctp/mctp-i2c.c
index 15fe4d1163c1..ee2913758e54 100644
--- a/drivers/net/mctp/mctp-i2c.c
+++ b/drivers/net/mctp/mctp-i2c.c
@@ -496,8 +496,6 @@ static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb)
 	u8 *pecp;
 	int rc;

-	fs = mctp_i2c_get_tx_flow_state(midev, skb);
-
 	hdr = (void *)skb_mac_header(skb);
 	/* Sanity check that packet contents matches skb length,
 	 * and can't exceed MCTP_I2C_BUFSZ
@@ -509,6 +507,8 @@ static void mctp_i2c_xmit(struct mctp_i2c_dev *midev, struct sk_buff *skb)
 		return;
 	}

+	fs = mctp_i2c_get_tx_flow_state(midev, skb);
+
 	if (skb_tailroom(skb) >= 1) {
 		/* Linear case with space, we can just append the PEC */
 		skb_put(skb, 1);
-- 
2.54.0.545.g6539524ca2-goog
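
The underflow described in the commit message can be reproduced with a
heavily simplified userspace model. It is only a sketch: struct bus_model,
struct flow_model and the model_*() helpers are hypothetical, and the real
driver's flow states and unlock markers are reduced to one flag and one
counter.

```c
#include <stdbool.h>

/* One counter standing in for midev->i2c_lock_count. */
struct bus_model {
	int lock_count;
};

/* One flag standing in for the "flow is active" state. */
struct flow_model {
	bool active;
};

/*
 * check_first == false models the old order (flow marked active before
 * the length check); check_first == true models the order after the patch.
 */
static void model_xmit(struct bus_model *bus, struct flow_model *flow,
		       bool len_ok, bool check_first)
{
	if (!check_first)
		flow->active = true;

	if (!len_ok)
		return;		/* early return: the lock is never taken */

	if (check_first)
		flow->active = true;

	bus->lock_count++;	/* stands in for mctp_i2c_lock_nest() */
}

/* Releasing an active flow drops one lock reference. */
static void model_release_flow(struct bus_model *bus, struct flow_model *flow)
{
	if (!flow->active)
		return;

	flow->active = false;
	bus->lock_count--;
}
```

In this model, a rejected packet followed by a flow release leaves the
counter at -1 with the old order and at 0 with the new one.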


* [RFC PATCH] net: skb: on zero-copy formatted output to skb
From: Dmitry Antipov @ 2026-04-23  7:36 UTC
  To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-bluetooth, Dmitry Antipov

Some code, most notably the Bluetooth drivers, uses something like
the following:

char buf[80];
snprintf(buf, sizeof(buf), "Driver: %s\n", driver_name);
skb_put_data(skb, buf, strlen(buf));

This looks suboptimal, at least because:

1) It ends in BUG() if the developer underestimates the size of
   the skb being used;
2) It requires an extra data copy from an external buffer;
3) It calls 'strlen()' redundantly, because the actual data length
   is already calculated by 'snprintf()' itself.

So introduce 'skb_printf()' which aims to address all of these
issues. As usual, thoughts and comments are highly appreciated.

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
---
 include/linux/skbuff.h |  1 +
 net/core/skbuff.c      | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2bcf78a4de7b..fb4ef55a8f86 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4292,6 +4292,7 @@ int skb_mpls_update_lse(struct sk_buff *skb, __be32 mpls_lse);
 int skb_mpls_dec_ttl(struct sk_buff *skb);
 struct sk_buff *pskb_extract(struct sk_buff *skb, int off, int to_copy,
 			     gfp_t gfp);
+int skb_printf(struct sk_buff *skb, const char *fmt, ...);

 static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7dad68e3b518..051ab4f28c75 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6992,6 +6992,24 @@ struct sk_buff *pskb_extract(struct sk_buff *skb, int off,
 }
 EXPORT_SYMBOL(pskb_extract);

+int skb_printf(struct sk_buff *skb, const char *fmt, ...)
+{
+	int len, size = skb_availroom(skb);
+	va_list args;
+
+	va_start(args, fmt);
+	len = vsnprintf(skb_tail_pointer(skb), size, fmt, args);
+	va_end(args);
+
+	if (unlikely(len >= size))
+		return -ENOSPC;
+
+	skb->tail += len;
+	skb->len += len;
+	return len;
+}
+EXPORT_SYMBOL(skb_printf);
+
 /**
  * skb_condense - try to get rid of fragments/frag_list if possible
  * @skb: buffer
-- 
2.53.0
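
The same "format directly into the tail room" idea can be tried in
userspace with the sketch below. It is an analogue under stated
assumptions, not kernel code: struct tailbuf and tailbuf_printf() are
hypothetical names, and a fixed 64-byte array stands in for the skb data
area.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Fixed-size buffer with a fill level, standing in for an skb. */
struct tailbuf {
	char data[64];
	size_t len;
};

/*
 * Format straight into the unused tail of the buffer. On success the
 * fill level advances and the number of bytes added is returned; if the
 * output does not fit, -ENOSPC is returned and the fill level is kept.
 */
static int tailbuf_printf(struct tailbuf *b, const char *fmt, ...)
{
	size_t room = sizeof(b->data) - b->len;
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, args);
	va_end(args);

	if (n < 0)
		return -EINVAL;		/* encoding error in userspace libc */
	if ((size_t)n >= room)
		return -ENOSPC;		/* truncated: do not account the data */

	b->len += (size_t)n;
	return n;
}
```

As in the RFC, the comparison is "greater or equal", so output that fills
the remaining room exactly is rejected too, because the terminating NUL
written by vsnprintf() needs one byte of its own.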


* Re: [PATCH net] net: airoha: stop net_device TX queue before updating CPU index
From: Lorenzo Bianconi @ 2026-04-23  7:42 UTC
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: Simon Horman, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260421-airoha-xmit-stop-condition-v1-1-e670d6a48467@kernel.org>

> Currently, the airoha_eth driver updates the CPU index register before
> verifying whether the number of free descriptors has fallen below the
> threshold.
> Move the net_device TX queue length check before the TX CPU index
> update, so that the TX CPU index is updated even if there are more
> packets to be transmitted but the net_device TX queue is going to be
> stopped once the in-flight packets are accounted for.
> 
> Fixes: 1d304174106c ("net: airoha: Implement BQL support")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/airoha/airoha_eth.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index 19f67c7dd8e1..5d327237e274 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -2058,17 +2058,16 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
>  
>  	skb_tx_timestamp(skb);
>  	netdev_tx_sent_queue(txq, skb->len);
> +	if (q->ndesc - q->queued < q->free_thr) {
> +		netif_tx_stop_queue(txq);
> +		q->txq_stopped = true;
> +	}
>  
>  	if (netif_xmit_stopped(txq) || !netdev_xmit_more())
>  		airoha_qdma_rmw(qdma, REG_TX_CPU_IDX(qid),
>  				TX_RING_CPU_IDX_MASK,
>  				FIELD_PREP(TX_RING_CPU_IDX_MASK, index));
>  
> -	if (q->ndesc - q->queued < q->free_thr) {
> -		netif_tx_stop_queue(txq);
> -		q->txq_stopped = true;
> -	}
> -
>  	spin_unlock_bh(&q->lock);
>  
>  	return NETDEV_TX_OK;
> 
> ---
> base-commit: a663bac71a2f0b3ac6c373168ca57b2a6e6381aa
> change-id: 20260421-airoha-xmit-stop-condition-344dc0292a19
> 
> Best regards,
> -- 
> Lorenzo Bianconi <lorenzo@kernel.org>
> 

Commenting on the issues reported by Sashiko:
https://sashiko.dev/#/patchset/20260421-airoha-xmit-stop-condition-v1-1-e670d6a48467%40kernel.org

- Could this cause a deadlock if exactly q->free_thr descriptors are free?
  This does not seem to be a problem to me: even if the netdev TX queue is
  stopped as described in the report, airoha_qdma_tx_napi_poll() will free
  space in the queue and subsequent packets will update the REG_TX_CPU_IDX
  register.

- Is it possible for this loop to read past the end of the frags array?
  As pointed out by Sashiko, this issue is not introduced by this patch and I
  will fix it in a dedicated patch.

- Might this lead to memory corruption if the TCP header is not in the linear area?
  This issue is not introduced by this patch and I will fix it in a dedicated patch.

- If an error occurs during transmission, the driver jumps to the error label,
  frees the skb, and returns NETDEV_TX_OK without ringing the QDMA CPU index doorbell.
  Similar to the first issue, this does not seem to be a problem to me since
  subsequent packets will update the REG_TX_CPU_IDX register.

Regards,
Lorenzo
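
The effect of the reordering in the quoted patch can be shown with a small
userspace model. It is a sketch under simplifying assumptions: struct
txq_model and model_xmit() are hypothetical, one call accounts exactly one
descriptor, and a boolean stands in for netif_xmit_stopped().

```c
#include <stdbool.h>

/* Minimal stand-in for the per-ring state used by the stop check. */
struct txq_model {
	unsigned int ndesc;	/* ring size */
	unsigned int queued;	/* descriptors in flight */
	unsigned int free_thr;	/* stop when fewer than this are free */
	bool stopped;		/* stands in for netif_xmit_stopped() */
};

/*
 * Queue one descriptor and report whether the CPU index register would
 * be written for it. stop_first == false models the old order (doorbell
 * decision before the stop check), true models the patched order.
 */
static bool model_xmit(struct txq_model *q, bool xmit_more, bool stop_first)
{
	bool ring;

	q->queued++;

	if (stop_first && q->ndesc - q->queued < q->free_thr)
		q->stopped = true;

	/* Same condition as in the driver: stopped queue or no more packets. */
	ring = q->stopped || !xmit_more;

	if (!stop_first && q->ndesc - q->queued < q->free_thr)
		q->stopped = true;

	return ring;
}
```

When a packet arrives with xmit_more set and crosses the threshold, the old
order stops the queue without writing the index, while the patched order
writes it for that same packet.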



* [PATCH RFC net-next 2/4] net: pse-pd: add notifier chain for controller lifecycle events
From: Corey Leavitt via B4 Relay @ 2026-04-23  7:42 UTC
  To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
	Russell King
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info>

From: Corey Leavitt <corey@leavitt.info>

Introduce a blocking notifier chain that allows other subsystems to be
informed when a PSE controller is registered or unregistered, and
provide pse_register_notifier() / pse_unregister_notifier() as the
subscriber interface.

Subsequent patches will use this to let the phy subsystem own the
phydev->psec lifecycle directly, decoupling PSE lookup from
fwnode_mdiobus_register_phy() and removing the probe-time
-EPROBE_DEFER coupling that currently exists between mdio, phy and
pse-pd when the PSE controller driver is modular.

A blocking chain (rather than atomic) is used because callbacks will
take rtnl_lock and call back into pse_core via of_pse_control_get().

The enum pse_controller_event is placed outside the
IS_ENABLED(CONFIG_PSE_CONTROLLER) guard so that subscribers compiled
into a kernel without PSE support can still reference the event
values in dead-code paths without breaking the build.

This patch is pure infrastructure: nothing fires events yet, and
nothing subscribes. No observable behavior change.

Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/pse-pd/pse_core.c | 34 ++++++++++++++++++++++++++++++++++
 include/linux/pse-pd/pse.h    | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 893ec2185947..80c5c6c1758c 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -8,6 +8,7 @@
 #include <linux/device.h>
 #include <linux/ethtool.h>
 #include <linux/ethtool_netlink.h>
+#include <linux/notifier.h>
 #include <linux/of.h>
 #include <linux/phy.h>
 #include <linux/pse-pd/pse.h>
@@ -23,6 +24,39 @@ static LIST_HEAD(pse_controller_list);
 static DEFINE_XARRAY_ALLOC(pse_pw_d_map);
 static DEFINE_MUTEX(pse_pw_d_mutex);
 
+static BLOCKING_NOTIFIER_HEAD(pse_controller_notifier);
+
+/**
+ * pse_register_notifier - register a callback for PSE controller events
+ * @nb: notifier block to register
+ *
+ * See enum pse_controller_event for events fired and their subscriber
+ * contract. Callbacks run in process context; they may sleep, take
+ * rtnl, and call of_pse_control_get(). The chain fires synchronously,
+ * so a PSE controller driver's probe/unbind path must not hold any
+ * such lock when calling pse_controller_register() or
+ * pse_controller_unregister().
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int pse_register_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&pse_controller_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(pse_register_notifier);
+
+/**
+ * pse_unregister_notifier - unregister a previously registered callback
+ * @nb: notifier block previously passed to pse_register_notifier()
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int pse_unregister_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_unregister(&pse_controller_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(pse_unregister_notifier);
+
 /**
  * struct pse_control - a PSE control
  * @pcdev: a pointer to the PSE controller device
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 4e5696cfade7..78fe3a2b1ea8 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -21,6 +21,7 @@ struct net_device;
 struct phy_device;
 struct pse_controller_dev;
 struct netlink_ext_ack;
+struct notifier_block;
 
 /* C33 PSE extended state and substate. */
 struct ethtool_c33_pse_ext_state_info {
@@ -337,6 +338,24 @@ enum pse_budget_eval_strategies {
 	PSE_BUDGET_EVAL_STRAT_DYNAMIC	= 1 << 2,
 };
 
+/**
+ * enum pse_controller_event - PSE controller lifecycle events
+ *
+ * Event data in callbacks is always a pointer to the struct
+ * pse_controller_dev firing the event.
+ *
+ * @PSE_REGISTERED: controller added to pse_controller_list and
+ *	resolvable by of_pse_control_get().
+ * @PSE_UNREGISTERED: controller about to be removed from
+ *	pse_controller_list. Subscribers holding pse_control references
+ *	targeting it must drop them before returning and must not
+ *	acquire new references for it.
+ */
+enum pse_controller_event {
+	PSE_REGISTERED,
+	PSE_UNREGISTERED,
+};
+
 #if IS_ENABLED(CONFIG_PSE_CONTROLLER)
 int pse_controller_register(struct pse_controller_dev *pcdev);
 void pse_controller_unregister(struct pse_controller_dev *pcdev);
@@ -366,6 +385,9 @@ int pse_ethtool_set_prio(struct pse_control *psec,
 bool pse_has_podl(struct pse_control *psec);
 bool pse_has_c33(struct pse_control *psec);
 
+int pse_register_notifier(struct notifier_block *nb);
+int pse_unregister_notifier(struct notifier_block *nb);
+
 #else
 
 static inline struct pse_control *of_pse_control_get(struct device_node *node,
@@ -416,6 +438,16 @@ static inline bool pse_has_c33(struct pse_control *psec)
 	return false;
 }
 
+static inline int pse_register_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
+static inline int pse_unregister_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
 #endif
 
 #endif

-- 
2.53.0
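
For readers unfamiliar with the pattern, the subscriber interface added by
this patch behaves roughly like the userspace model below. It is a sketch
only: every model_* name is hypothetical, there is no locking (the kernel
chain is protected by a rwsem), and the error codes returned for duplicate
or unknown subscribers are assumptions of the model.

```c
#include <errno.h>
#include <stddef.h>

/* Mirrors the two lifecycle events introduced by the patch. */
enum model_event { MODEL_REGISTERED, MODEL_UNREGISTERED };

struct model_nb {
	int (*call)(struct model_nb *nb, unsigned long event, void *data);
	struct model_nb *next;
	unsigned int seen[2];	/* per-event counters, for the demo only */
};

static struct model_nb *model_chain;

/* Append nb to the chain; refuse to add the same block twice. */
static int model_register(struct model_nb *nb)
{
	struct model_nb **p;

	for (p = &model_chain; *p; p = &(*p)->next)
		if (*p == nb)
			return -EEXIST;

	nb->next = NULL;
	*p = nb;
	return 0;
}

/* Unlink nb from the chain if it is present. */
static int model_unregister(struct model_nb *nb)
{
	struct model_nb **p;

	for (p = &model_chain; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			return 0;
		}
	}
	return -ENOENT;
}

/* Call every subscriber synchronously, in registration order. */
static void model_fire(unsigned long event, void *data)
{
	struct model_nb *nb, *next;

	for (nb = model_chain; nb; nb = next) {
		next = nb->next;
		nb->call(nb, event, data);
	}
}

/* Example subscriber: count how often each event was seen. */
static int model_count_cb(struct model_nb *nb, unsigned long event, void *data)
{
	(void)data;	/* would be the controller firing the event */
	if (event <= MODEL_UNREGISTERED)
		nb->seen[event]++;
	return 0;
}
```

Because model_fire() runs callbacks synchronously in the caller's context,
the caller must not hold a lock that a callback takes, which is the same
constraint the kernel-doc above places on PSE controller drivers.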




* [PATCH RFC net-next 1/4] net: pse-pd: scope pse_control regulator handle to kref lifetime
From: Corey Leavitt via B4 Relay @ 2026-04-23  7:42 UTC
  To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
	Russell King
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info>

From: Corey Leavitt <corey@leavitt.info>

__pse_control_release() drops psec->ps via devm_regulator_put(), which
only succeeds if the devres entry added by the matching
devm_regulator_get_exclusive() is still present on pcdev->dev at the
time the pse_control's kref hits zero.

In practice that assumption does not hold when the controller is
unbound while any pse_control still has consumers: pcdev->dev's
devres list is released LIFO, so every per-attach regulator-GET
devres runs (and regulator_put()s the underlying regulator) before
pse_controller_unregister() itself is invoked. Any later
pse_control_put() from that unbind path then reads psec->ps as a
dangling pointer inside devm_regulator_put() and WARNs at
drivers/regulator/devres.c:232 (devres_release() fails to find the
already-released match).

The pse_control's consumer handle is logically scoped to the
pse_control's refcount, not to pcdev->dev's devres lifetime. Switch
to the plain regulator_get_exclusive() / regulator_put() pair so
__pse_control_release() does the right put regardless of whether
the controller's devres has already been unwound.

No change to the regulator-framework-visible refcount or lifetime of
the underlying regulator: a single get paired with a single put. The
existing devm_regulator_register() for the per-PI rails is unchanged
(those ARE correctly scoped to the controller's lifetime).

Fixes: d83e13761d5b ("net: pse-pd: Use regulator framework within PSE framework")
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/pse-pd/pse_core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index f6b94ac7a68a..893ec2185947 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -1362,7 +1362,7 @@ static void __pse_control_release(struct kref *kref)

 	if (psec->pcdev->pi[psec->id].admin_state_enabled)
 		regulator_disable(psec->ps);
-	devm_regulator_put(psec->ps);
+	regulator_put(psec->ps);

 	module_put(psec->pcdev->owner);

@@ -1431,8 +1431,8 @@ pse_control_get_internal(struct pse_controller_dev *pcdev, unsigned int index,
 		goto free_psec;

 	pcdev->pi[index].admin_state_enabled = ret;
-	psec->ps = devm_regulator_get_exclusive(pcdev->dev,
-						rdev_get_name(pcdev->pi[index].rdev));
+	psec->ps = regulator_get_exclusive(pcdev->dev,
+					   rdev_get_name(pcdev->pi[index].rdev));
 	if (IS_ERR(psec->ps)) {
 		ret = PTR_ERR(psec->ps);
 		goto put_module;

-- 
2.53.0


* [PATCH RFC net-next 4/4] net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook
From: Corey Leavitt via B4 Relay @ 2026-04-23  7:42 UTC
  To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
	Russell King
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info>

From: Corey Leavitt <corey@leavitt.info>

Transfer ownership of phydev->psec from fwnode_mdio to the phy
subsystem itself. The phy subsystem now subscribes to the pse-pd
notifier chain and manages psec attach/detach in response to PSE
controller lifecycle events, while fwnode_mdio loses its PSE
awareness entirely.

Split phy_device_register() into a public entry point that takes
rtnl_lock() and a phy_device_register_locked() variant that assumes
rtnl is already held. Callers that already hold rtnl (the SFP
module state machine via __sfp_sm_event) use the _locked form to
avoid deadlock; all other callers use the unchanged public API.
This pair mirrors the register_netdevice() / register_netdev()
split convention already established in the core networking stack.
rtnl must span the full registration sequence through device_add(),
not just phy_try_attach_pse(): a PSE_REGISTERED event firing between
a narrow attach lock and device_add() would walk mdio_bus_type, find
the phy not yet on the bus, and leave it permanently unattached.

With rtnl held across the full registration sequence:

  - At phy_device_register_locked(), phy_try_attach_pse() attempts
    an of_pse_control_get() for phys whose DT pses phandle resolves
    now. If the controller is already registered, psec is attached
    before device_add() makes the phy visible on mdio_bus_type.
    If the controller is not yet registered, the swallow-error path
    leaves psec NULL and relies on the subsequent notifier event.

  - On PSE_REGISTERED: an rtnl-guarded bus walk retries the attach
    for every registered phy whose psec is still NULL. This is the
    "phy was enumerated before the PSE controller loaded" case,
    which is the root cause of the boot-time probe-retry storm on
    systems with a modular PSE controller driver. Because the
    pse_controller_notifier is fired synchronously, a concurrent
    pse_controller_register() either (a) completes list_add and
    releases pse_list_mutex before this function takes rtnl, in
    which case phy_try_attach_pse() finds the controller in the
    list and attaches; or (b) fires its notifier during this
    function, in which case the callback blocks on rtnl until this
    function returns, then walks the bus and finds the phy fully
    registered (attaching if psec is still NULL).

  - On PSE_UNREGISTERED: an rtnl-guarded bus walk releases every
    phydev->psec that targets the departing controller before
    pse_release_pis() frees pcdev->pi. Without this, a phy still
    holding a pse_control reference would cause a use-after-free
    in __pse_control_release's pcdev->pi[psec->id] access, and the
    PSE driver module could not finish unloading while any phy
    still held a reference via module_put().

Introduce phy_try_attach_pse() as the rtnl-guarded helper used by
both the register path and the notifier walk. Holding rtnl across
of_pse_control_get() is safe because pse_list_mutex is never held
in the opposite order.

Expose pse_control_matches_pcdev() as a predicate so subscribers
can identify which of their held pse_control references target a
given controller, without leaking the struct pse_controller_dev *
out of pse_control opacity.

Move the final pse_control_put() of phydev->psec from
phy_device_remove() to phy_device_release(). The kobject release
callback runs only after every reference on the device has been
dropped, including the bus iterator references taken by
bus_for_each_dev() in the notifier walk, which means by the time
release fires no concurrent reader or writer of phydev->psec can
exist. The mdio_bus_type klist is set up in bus_register() with
klist_devices_get() / klist_devices_put() (drivers/base/bus.c),
which bracket each iteration step with get_device() / put_device()
on the underlying struct device; that reference defers the release
callback from firing until the walk has advanced past this phy.
Keeping phy_device_remove() unchanged avoids introducing a new
locking contract on its many callers (sfp, fixed_phy, xgbe, hns,
netsec, bcm_sf2, mdiobus_unregister).

Finally, delete fwnode_find_pse_control() and its call site in
fwnode_mdiobus_register_phy(), and drop the PSE header from
fwnode_mdio.c. This removes the probe-time -EPROBE_DEFER coupling
between mdio and pse-pd that caused the boot-hang regression on
systems with a modular PSE controller driver and a DT phy with a
pses phandle: the MDIO/DSA probe no longer sees any PSE-originated
-EPROBE_DEFER, so the probe-retry storm is gone. fwnode_mdio is
now PSE-agnostic.

Fixes: fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse control")
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/mdio/fwnode_mdio.c |  34 ----------
 drivers/net/phy/phy_device.c   | 144 ++++++++++++++++++++++++++++++++++++++---
 drivers/net/phy/sfp.c          |   2 +-
 drivers/net/pse-pd/pse_core.c  |  14 ++++
 include/linux/phy.h            |   2 +
 include/linux/pse-pd/pse.h     |   9 +++
 6 files changed, 161 insertions(+), 44 deletions(-)

diff --git a/drivers/net/mdio/fwnode_mdio.c b/drivers/net/mdio/fwnode_mdio.c
index ba7091518265..7bd979b59f49 100644
--- a/drivers/net/mdio/fwnode_mdio.c
+++ b/drivers/net/mdio/fwnode_mdio.c
@@ -11,33 +11,11 @@
 #include <linux/fwnode_mdio.h>
 #include <linux/of.h>
 #include <linux/phy.h>
-#include <linux/pse-pd/pse.h>
 
 MODULE_AUTHOR("Calvin Johnson <calvin.johnson@oss.nxp.com>");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("FWNODE MDIO bus (Ethernet PHY) accessors");
 
-static struct pse_control *
-fwnode_find_pse_control(struct fwnode_handle *fwnode,
-			struct phy_device *phydev)
-{
-	struct pse_control *psec;
-	struct device_node *np;
-
-	if (!IS_ENABLED(CONFIG_PSE_CONTROLLER))
-		return NULL;
-
-	np = to_of_node(fwnode);
-	if (!np)
-		return NULL;
-
-	psec = of_pse_control_get(np, phydev);
-	if (PTR_ERR(psec) == -ENOENT)
-		return NULL;
-
-	return psec;
-}
-
 static struct mii_timestamper *
 fwnode_find_mii_timestamper(struct fwnode_handle *fwnode)
 {
@@ -118,7 +96,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
 				struct fwnode_handle *child, u32 addr)
 {
 	struct mii_timestamper *mii_ts = NULL;
-	struct pse_control *psec = NULL;
 	struct phy_device *phy;
 	bool is_c45;
 	u32 phy_id;
@@ -159,14 +136,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
 			goto clean_phy;
 	}
 
-	psec = fwnode_find_pse_control(child, phy);
-	if (IS_ERR(psec)) {
-		rc = PTR_ERR(psec);
-		goto unregister_phy;
-	}
-
-	phy->psec = psec;
-
 	/* phy->mii_ts may already be defined by the PHY driver. A
 	 * mii_timestamper probed via the device tree will still have
 	 * precedence.
@@ -176,9 +145,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
 
 	return 0;
 
-unregister_phy:
-	if (is_acpi_node(child) || is_of_node(child))
-		phy_device_remove(phy);
 clean_phy:
 	phy_device_free(phy);
 clean_mii_ts:
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index c2cdf1ae3542..7948800e6e49 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -223,8 +223,19 @@ static void phy_mdio_device_free(struct mdio_device *mdiodev)
 
 static void phy_device_release(struct device *dev)
 {
+	struct phy_device *phydev = to_phy_device(dev);
+
+	/* bus_for_each_dev() holds get_device() across each iteration
+	 * step, deferring this release callback until any in-flight PSE
+	 * notifier walk has advanced past this phy. pse_control_put()
+	 * takes pse_list_mutex, so this path must run in sleepable
+	 * context.
+	 */
+	might_sleep();
+	pse_control_put(phydev->psec);
+
 	fwnode_handle_put(dev->fwnode);
-	kfree(to_phy_device(dev));
+	kfree(phydev);
 }
 
 static void phy_mdio_device_remove(struct mdio_device *mdiodev)
@@ -1102,14 +1113,102 @@ struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45)
 }
 EXPORT_SYMBOL(get_phy_device);
 
-/**
- * phy_device_register - Register the phy device on the MDIO bus
- * @phydev: phy_device structure to be added to the MDIO bus
+/* Best-effort attach of phydev->psec from a DT `pses = <&...>` phandle.
+ * Caller must hold rtnl. Errors are swallowed; the notifier retries
+ * at PSE_REGISTERED time.
  */
-int phy_device_register(struct phy_device *phydev)
+static void phy_try_attach_pse(struct phy_device *phydev)
+{
+	struct pse_control *psec;
+	struct device_node *np;
+
+	ASSERT_RTNL();
+
+	np = phydev->mdio.dev.of_node;
+	if (!np)
+		return;
+
+	if (phydev->psec)
+		return;
+
+	psec = of_pse_control_get(np, phydev);
+	if (IS_ERR(psec))
+		return;
+
+	phydev->psec = psec;
+}
+
+static int phy_pse_attach_one(struct device *dev, void *data __maybe_unused)
+{
+	ASSERT_RTNL();
+
+	if (dev->type != &mdio_bus_phy_type)
+		return 0;
+
+	phy_try_attach_pse(to_phy_device(dev));
+	return 0;
+}
+
+static int phy_pse_detach_one(struct device *dev, void *data)
+{
+	struct pse_controller_dev *pcdev = data;
+	struct phy_device *phydev;
+	struct pse_control *psec;
+
+	ASSERT_RTNL();
+
+	if (dev->type != &mdio_bus_phy_type)
+		return 0;
+
+	phydev = to_phy_device(dev);
+	psec = phydev->psec;
+	if (!psec || !pse_control_matches_pcdev(psec, pcdev))
+		return 0;
+
+	phydev->psec = NULL;
+	pse_control_put(psec);
+	return 0;
+}
+
+static int phy_pse_notifier_event(struct notifier_block *nb,
+				  unsigned long event, void *data)
+{
+	switch (event) {
+	case PSE_REGISTERED:
+		rtnl_lock();
+		bus_for_each_dev(&mdio_bus_type, NULL, NULL,
+				 phy_pse_attach_one);
+		rtnl_unlock();
+		return NOTIFY_OK;
+	case PSE_UNREGISTERED:
+		rtnl_lock();
+		bus_for_each_dev(&mdio_bus_type, NULL, data,
+				 phy_pse_detach_one);
+		rtnl_unlock();
+		return NOTIFY_OK;
+	default:
+		return NOTIFY_DONE;
+	}
+}
+
+static struct notifier_block phy_pse_notifier __read_mostly = {
+	.notifier_call = phy_pse_notifier_event,
+};
+
+/**
+ * phy_device_register_locked - Register the phy device on the MDIO bus
+ * @phydev: phy_device structure to be added to the MDIO bus
+ *
+ * Same as phy_device_register() but caller must already hold rtnl_lock().
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int phy_device_register_locked(struct phy_device *phydev)
 {
 	int err;
 
+	ASSERT_RTNL();
+
 	err = mdiobus_register_device(&phydev->mdio);
 	if (err)
 		return err;
@@ -1124,6 +1223,8 @@ int phy_device_register(struct phy_device *phydev)
 		goto out;
 	}
 
+	phy_try_attach_pse(phydev);
+
 	err = device_add(&phydev->mdio.dev);
 	if (err) {
 		phydev_err(phydev, "failed to add\n");
@@ -1133,12 +1234,32 @@ int phy_device_register(struct phy_device *phydev)
 	return 0;
 
  out:
-	/* Assert the reset signal */
+	/* If phy_try_attach_pse() set phydev->psec before device_add()
+	 * failed, the caller's phy_device_free() -> phy_device_release()
+	 * chain will drop it.
+	 */
 	phy_device_reset(phydev, 1);
-
 	mdiobus_unregister_device(&phydev->mdio);
 	return err;
 }
+EXPORT_SYMBOL(phy_device_register_locked);
+
+/**
+ * phy_device_register - Register the phy device on the MDIO bus
+ * @phydev: phy_device structure to be added to the MDIO bus
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int phy_device_register(struct phy_device *phydev)
+{
+	int err;
+
+	rtnl_lock();
+	err = phy_device_register_locked(phydev);
+	rtnl_unlock();
+
+	return err;
+}
 EXPORT_SYMBOL(phy_device_register);
 
 /**
@@ -1152,8 +1273,6 @@ EXPORT_SYMBOL(phy_device_register);
 void phy_device_remove(struct phy_device *phydev)
 {
 	unregister_mii_timestamper(phydev->mii_ts);
-	pse_control_put(phydev->psec);
-
 	device_del(&phydev->mdio.dev);
 
 	/* Assert the reset signal */
@@ -3962,8 +4081,14 @@ static int __init phy_init(void)
 	if (rc)
 		goto err_c45;
 
+	rc = pse_register_notifier(&phy_pse_notifier);
+	if (rc)
+		goto err_genphy;
+
 	return 0;
 
+err_genphy:
+	phy_driver_unregister(&genphy_driver);
 err_c45:
 	phy_driver_unregister(&genphy_c45_driver);
 err_ethtool_phy_ops:
@@ -3980,6 +4105,7 @@ static int __init phy_init(void)
 
 static void __exit phy_exit(void)
 {
+	pse_unregister_notifier(&phy_pse_notifier);
 	phy_driver_unregister(&genphy_c45_driver);
 	phy_driver_unregister(&genphy_driver);
 	rtnl_lock();
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index bd970f753beb..d19fe0f30c5d 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1932,7 +1932,7 @@ static int sfp_sm_probe_phy(struct sfp *sfp, int addr, bool is_c45)
 	/* Mark this PHY as being on a SFP module */
 	phy->is_on_sfp_module = true;
 
-	err = phy_device_register(phy);
+	err = phy_device_register_locked(phy);
 	if (err) {
 		phy_device_free(phy);
 		dev_err(sfp->dev, "phy_device_register failed: %pe\n",
diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 82125502a8e3..a0667324a029 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -2016,3 +2016,17 @@ bool pse_has_c33(struct pse_control *psec)
 	return psec->pcdev->types & ETHTOOL_PSE_C33;
 }
 EXPORT_SYMBOL_GPL(pse_has_c33);
+
+/**
+ * pse_control_matches_pcdev - Test whether a pse_control targets a controller
+ * @psec: pse_control obtained from of_pse_control_get()
+ * @pcdev: PSE controller to compare against
+ *
+ * Return: %true if @psec was obtained from @pcdev, %false otherwise.
+ */
+bool pse_control_matches_pcdev(struct pse_control *psec,
+			       struct pse_controller_dev *pcdev)
+{
+	return psec->pcdev == pcdev;
+}
+EXPORT_SYMBOL_GPL(pse_control_matches_pcdev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 199a7aaa341b..865b9baddb85 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -2158,6 +2158,8 @@ struct phy_device *fwnode_phy_find_device(struct fwnode_handle *phy_fwnode);
 struct fwnode_handle *fwnode_get_phy_node(const struct fwnode_handle *fwnode);
 struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45);
 int phy_device_register(struct phy_device *phy);
+/* Caller must hold rtnl_lock(); see phy_device_register() for the public form. */
+int phy_device_register_locked(struct phy_device *phy);
 void phy_device_free(struct phy_device *phydev);
 void phy_device_remove(struct phy_device *phydev);
 int phy_get_c45_ids(struct phy_device *phydev);
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 78fe3a2b1ea8..d4310ca71a3e 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -385,6 +385,9 @@ int pse_ethtool_set_prio(struct pse_control *psec,
 bool pse_has_podl(struct pse_control *psec);
 bool pse_has_c33(struct pse_control *psec);
 
+bool pse_control_matches_pcdev(struct pse_control *psec,
+			       struct pse_controller_dev *pcdev);
+
 int pse_register_notifier(struct notifier_block *nb);
 int pse_unregister_notifier(struct notifier_block *nb);
 
@@ -438,6 +441,12 @@ static inline bool pse_has_c33(struct pse_control *psec)
 	return false;
 }
 
+static inline bool pse_control_matches_pcdev(struct pse_control *psec,
+					     struct pse_controller_dev *pcdev)
+{
+	return false;
+}
+
 static inline int pse_register_notifier(struct notifier_block *nb)
 {
 	return 0;

-- 
2.53.0




* [PATCH RFC net-next 3/4] net: pse-pd: fire lifecycle events on controller register/unregister
From: Corey Leavitt via B4 Relay @ 2026-04-23  7:42 UTC (permalink / raw)
  To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
	Russell King
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info>

From: Corey Leavitt <corey@leavitt.info>

Hook the newly-introduced pse_controller_notifier chain so that
pse_controller_register() fires PSE_REGISTERED after the controller
has been added to pse_controller_list (i.e. is now resolvable by
of_pse_control_get()), and pse_controller_unregister() fires
PSE_UNREGISTERED before the controller is removed from the list
(while it is still valid to dereference from a subscriber's
pse_control pointer targeting it).

With no subscribers yet, this is observably a no-op. A later change
wires the phy subsystem in as the first subscriber.
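
For illustration only, the ordering described above can be modelled in a
few lines of userspace C. This is a rough sketch, not the pse_core code:
ctrl_list, ctrl_register(), ctrl_unregister(), ctrl_lookup() and
record_event() are hypothetical stand-ins. The point it shows is that a
subscriber can still resolve the controller inside both callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum pse_event { PSE_REGISTERED, PSE_UNREGISTERED };

#define MAX_CTRL 4
static int ctrl_list[MAX_CTRL];
static size_t ctrl_count;

/* Single optional subscriber, standing in for a notifier chain. */
static void (*subscriber)(enum pse_event ev, int id);

static bool ctrl_lookup(int id)
{
	for (size_t i = 0; i < ctrl_count; i++)
		if (ctrl_list[i] == id)
			return true;
	return false;
}

static int ctrl_register(int id)
{
	if (ctrl_count == MAX_CTRL)
		return -1;
	ctrl_list[ctrl_count++] = id;           /* 1. make it resolvable */
	if (subscriber)
		subscriber(PSE_REGISTERED, id); /* 2. then announce it */
	return 0;
}

static void ctrl_unregister(int id)
{
	if (subscriber)
		subscriber(PSE_UNREGISTERED, id); /* 1. announce while still listed */
	for (size_t i = 0; i < ctrl_count; i++) {
		if (ctrl_list[i] == id) {         /* 2. then drop it from the list */
			ctrl_list[i] = ctrl_list[--ctrl_count];
			break;
		}
	}
}

/* Example subscriber: records whether the controller was resolvable
 * at the moment each event was delivered.
 */
static bool seen_in_list[2];

static void record_event(enum pse_event ev, int id)
{
	assert(ev == PSE_REGISTERED || ev == PSE_UNREGISTERED);
	seen_in_list[ev] = ctrl_lookup(id);
}
```

Swapping the two steps in either function would make the lookup inside
the callback fail, which is the situation the ordering is meant to avoid.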

Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/pse-pd/pse_core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 80c5c6c1758c..82125502a8e3 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -1138,6 +1138,9 @@ int pse_controller_register(struct pse_controller_dev *pcdev)
 	list_add(&pcdev->list, &pse_controller_list);
 	mutex_unlock(&pse_list_mutex);
 
+	blocking_notifier_call_chain(&pse_controller_notifier,
+				     PSE_REGISTERED, pcdev);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(pse_controller_register);
@@ -1148,6 +1151,9 @@ EXPORT_SYMBOL_GPL(pse_controller_register);
  */
 void pse_controller_unregister(struct pse_controller_dev *pcdev)
 {
+	blocking_notifier_call_chain(&pse_controller_notifier,
+				     PSE_UNREGISTERED, pcdev);
+
 	pse_flush_pw_ds(pcdev);
 	pse_release_pis(pcdev);
 	if (pcdev->irq)

-- 
2.53.0




* [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
From: Corey Leavitt via B4 Relay @ 2026-04-23  7:42 UTC (permalink / raw)
  To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
	Russell King
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt

On systems where a PSE controller driver loads as a module and a
device-tree PHY node carries a `pses = <&pse_pi>` reference,
fwnode_mdiobus_register_phy() tries to resolve the PSE handle before
the controller driver has probed. of_pse_control_get() returns
-EPROBE_DEFER, the enclosing MDIO/DSA probe fails, and driver-core
re-queues the work. The retry loop spins until the PSE driver module
loads and its controller registers.

Commit fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse
control") made each retry expensive. It reordered
fwnode_mdiobus_register_phy() so the PHY is registered before the
PSE lookup. Every deferral now performs a full
phy_device_register() / phy_device_remove() cycle. On a board with a
sufficiently tight watchdog the retry loop can starve the watchdog
kthread. On the reporting hardware (MT7621 + gpio-wdt, 1-second
margin) the retry loop converts a slow probe phase into a reset
before userspace loads.

The affected population today looks small. OpenWrt, where PSE
actually ships, is still on 6.12 (pre-regression), and most
environments with CONFIG_PSE_*=m do not have boards whose DT
references a PSE controller from a PHY. Still, the mechanism is
general. Any modular PSE driver combined with the documented
`pses = <&...>` binding reproduces the retry loop. Whether it
reaches brick-grade or merely slow/flaky boot depends on local
watchdog timing. More exposure is expected as distribution and
embedded kernels move to 6.13 and later.

The narrow fix would be to partially revert the ordering in
fa2f0454174c so each defer is cheap again. That keeps the same
architecture (fwnode_mdio holding PSE knowledge, -EPROBE_DEFER
flowing across the subsystem boundary), and any future reorder
reintroduces the same class of bug. This series takes the larger
fix: decouple PSE controller lookup from MDIO registration entirely.
pse_core now publishes a BLOCKING_NOTIFIER chain with REGISTERED
and UNREGISTERED events. phy_device subscribes, owns phydev->psec
lifetime, and attaches PSE handles in response to controller
lifecycle rather than during probe. fwnode_mdio loses its PSE
awareness, and -EPROBE_DEFER no longer flows out of fwnode_mdio.
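
As a minimal sketch of the subscriber side of that design, assuming a
simplified userspace model rather than the real blocking notifier API:
struct phy, struct controller, notifier_register(), notifier_call() and
phy_on_pse_event() below are hypothetical names. The phy table attaches a
handle when a matching controller appears and drops it when that
controller goes away, with no probe-time failure involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel notifier API. */
enum pse_event { PSE_REGISTERED, PSE_UNREGISTERED };

struct controller { int id; };

struct phy {
	int wanted_id;           /* controller this phy's DT node refers to */
	struct controller *psec; /* NULL until a controller is attached */
};

typedef void (*notifier_fn)(enum pse_event ev, struct controller *c, void *ctx);

struct notifier {
	notifier_fn fn;
	void *ctx;
	struct notifier *next;
};

static struct notifier *chain;

static void notifier_register(struct notifier *n)
{
	assert(n && n->fn);
	n->next = chain;
	chain = n;
}

/* Deliver one event to every subscriber, synchronously. */
static void notifier_call(enum pse_event ev, struct controller *c)
{
	for (struct notifier *n = chain; n; n = n->next)
		n->fn(ev, c, n->ctx);
}

struct phy_table { struct phy *phys; size_t n; };

/* Subscriber: attach on REGISTERED, detach on UNREGISTERED. */
static void phy_on_pse_event(enum pse_event ev, struct controller *c, void *ctx)
{
	struct phy_table *t = ctx;

	for (size_t i = 0; i < t->n; i++) {
		struct phy *p = &t->phys[i];

		if (ev == PSE_REGISTERED && !p->psec && p->wanted_id == c->id)
			p->psec = c;
		else if (ev == PSE_UNREGISTERED && p->psec == c)
			p->psec = NULL;
	}
}
```

A phy whose controller never shows up simply keeps psec == NULL in this
model, which corresponds to the "no -EPROBE_DEFER out of fwnode_mdio"
property described above.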

Patch breakdown:

  1. Scope the pse_control regulator handle to kref lifetime
     (Fixes: d83e13761d5b). A latent bug that patch 4 makes
     reachable.
  2. Add the notifier chain (enum, head, register/unregister
     helpers). Pure infrastructure. No subscribers yet, no
     observable change.
  3. Fire REGISTERED and UNREGISTERED events from the controller
     register/unregister paths. Still no subscribers, still no
     observable change.
  4. Subscribe from the PHY layer, take ownership of phydev->psec
     via the notifier, and remove fwnode_find_pse_control() from
     fwnode_mdio.

Patch 1 is bundled here per stable-kernel-rules section 4
reachability guidance. On mainline today, with no notifier
subscriber, no caller drives the dangling regulator-handle sequence.
Patches 2 and 3 are deliberately split to separate "add
infrastructure" from "wire it up". Happy to fold them if maintainers
prefer the combined form.

Validated on a Cudy C200P (MT7621 + IP804AR) running an OpenWrt
build of 6.18.21 with the series applied. A lockdep build
(CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP) shows no splats
from the series' code paths during boot, PHY attach, PHY detach, or
a full controller unbind/rebind cycle. ethtool --set-pse drives all
four PoE-capable LAN ports, and a Ruckus H510 class-4 PD plugged
into lan3 negotiates and receives 48 V.

The C200P has no SFP cage, so the SFP path change in sfp.c
(phy_device_register -> phy_device_register_locked) isn't exercised
on the bench. Verified by call-graph audit: every path reaching
sfp_sm_probe_phy() holds rtnl at entry, via sfp_timeout,
sfp_check_state, sfp_probe, sfp_remove, or
sfp_bus_{add,del}_upstream.

Not addressed by this series: ethtool --show-pse returns "No data
available" on DSA netdevs in 6.18, because dev->phydev is NULL for
DSA-frontend netdevs and ethnl_req_get_phydev() therefore returns
NULL. That's a DSA / ethtool integration quirk that predates this
work.

Sending as RFC because this is my first net-next series. I'd
appreciate maintainer guidance on whether patch 1 should go to net
rather than net-next, and whether the patch 2/3 split is preferred
to the combined form.

Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
Corey Leavitt (4):
      net: pse-pd: scope pse_control regulator handle to kref lifetime
      net: pse-pd: add notifier chain for controller lifecycle events
      net: pse-pd: fire lifecycle events on controller register/unregister
      net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook

 drivers/net/mdio/fwnode_mdio.c |  34 ----------
 drivers/net/phy/phy_device.c   | 144 ++++++++++++++++++++++++++++++++++++++---
 drivers/net/phy/sfp.c          |   2 +-
 drivers/net/pse-pd/pse_core.c  |  60 ++++++++++++++++-
 include/linux/phy.h            |   2 +
 include/linux/pse-pd/pse.h     |  41 ++++++++++++
 6 files changed, 236 insertions(+), 47 deletions(-)
---
base-commit: 1f5ffc672165ff851063a5fd044b727ab2517ae3
change-id: 20260422-pse-notifier-decouple-efa80d77f4be

Best regards,
--  
Corey Leavitt <corey@leavitt.info>


* Re: [PATCH v2] net/intel: Replace manual array size calculation with ARRAY_SIZE
From: Przemek Kitszel @ 2026-04-23  7:41 UTC (permalink / raw)
  To: Jakub Raczynski, error27; +Cc: netdev, kuba, anthony.l.nguyen, kernel-janitors
In-Reply-To: <20260422105710.268003-1-j.raczynski@samsung.com>

On 4/22/26 12:57, Jakub Raczynski wrote:
> There are still places in the code where the array size is calculated
> manually, but it is good to enforce use of a single macro throughout the
> code as it makes the code a bit more readable.
> While at it, beautify the surrounding condition by reversing the check and
> remove unnecessary casting.
> 

thank you for the submission, please find some process-related feedback
from me

for future submissions for intel networking please target IWL (Intel 
Wired Lan mailing list)

patches should be split into per-driver changes most of the time

please don't set "In-reply-to: v1" to v2 - just send as a standalone new
series (but link to v1 in changelog)

this is also only the smallest bit above "too trivial to merge" IMO

finally this is not -net material, but -next, and -next is closed now
for PRs, and this is the only reason that warrants "v3" from you
(to:iwl, cc:netdev, after submission window reopens, ~Apr 27th)

(please collect Dan's Reviewed-by tag)


> Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_adminq.h | 2 +-
>   drivers/net/ethernet/intel/iavf/iavf_adminq.h | 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.h b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
> index 1be97a3a86ce..dcf3baec7b73 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_adminq.h
> +++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
> @@ -109,7 +109,7 @@ static inline int i40e_aq_rc_to_posix(int aq_ret, int aq_rc)
>   		-EFBIG,      /* I40E_AQ_RC_EFBIG */
>   	};
>   
> -	if (!((u32)aq_rc < (sizeof(aq_to_posix) / sizeof((aq_to_posix)[0]))))
> +	if (aq_rc >= ARRAY_SIZE(aq_to_posix))
>   		return -ERANGE;
>   
>   	return aq_to_posix[aq_rc];
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_adminq.h b/drivers/net/ethernet/intel/iavf/iavf_adminq.h
> index bbf5c4b3a2ae..dd2f61172157 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_adminq.h
> +++ b/drivers/net/ethernet/intel/iavf/iavf_adminq.h
> @@ -113,7 +113,7 @@ static inline int iavf_aq_rc_to_posix(int aq_ret, int aq_rc)
>   	if (aq_ret == IAVF_ERR_ADMIN_QUEUE_TIMEOUT)
>   		return -EAGAIN;
>   
> -	if (!((u32)aq_rc < (sizeof(aq_to_posix) / sizeof((aq_to_posix)[0]))))
> +	if (aq_rc >= ARRAY_SIZE(aq_to_posix))
>   		return -ERANGE;
>   
>   	return aq_to_posix[aq_rc];



* Re: Re: [PATCH] ipv6: udp: fix memory leak in udpv6_sendmsg error path
From: 王明煜 @ 2026-04-23  7:36 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: willemdebruijn.kernel, davem, dsahern, edumazet, kuba, pabeni,
	horms, netdev, linux-kernel
In-Reply-To: <aei3QDpiToAcYfR1@krikkit>

Hi Sabrina and Jakub,

Thank you for the detailed review and pointing out the crash risk in v1. 

You are absolutely correct. My previous patch would lead to a Double Free/UAF because functions like ip6_setup_cork() and __ip6_flush_pending_frames() already handle the dst release in their error paths.

After a deeper investigation, I found the actual source of the memory leak reported by the fuzzer. It occurs inside `__ip6_make_skb()` in `net/ipv6/ip6_output.c`.

In specific cases (such as fault injection during packet construction), `__ip6_append_data()` can succeed while leaving the queue empty. When `__ip6_make_skb()` is then called, `__skb_dequeue(queue)` returns NULL. 

Currently, the code handles a NULL skb with a `goto out;`, which completely bypasses the call to `ip6_cork_release(cork);`. Since the `cork` structure still holds the `dst_entry` reference at this point, skipping the release causes the memory leak.

I am abandoning the v1 approach and will submit a v2 patch that fixes the logic in `__ip6_make_skb()` to ensure `ip6_cork_release()` is always called.
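
To make the pattern concrete, here is a minimal userspace sketch of the
early-exit leak as I understand it. It is not the kernel code: struct dst,
struct cork, struct queue, cork_release(), make_skb_leaky() and
make_skb_fixed() are hypothetical stand-ins that only model "the cork owns
one dst reference and must drop it on every path".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; not the kernel structures. */
struct dst   { int refcnt; };
struct cork  { struct dst *dst; };
struct queue { int len; };

/* Drop the reference the cork holds, if any. */
static void cork_release(struct cork *cork)
{
	if (cork->dst) {
		assert(cork->dst->refcnt > 0);
		cork->dst->refcnt--;
		cork->dst = NULL;
	}
}

/* Returns 1 if a packet was built, 0 if the queue was empty. */
static int make_skb_leaky(struct queue *q, struct cork *cork)
{
	if (q->len == 0)
		goto out;       /* skips cork_release(): the reference leaks */
	q->len--;
	cork_release(cork);
	return 1;
out:
	return 0;
}

/* Same contract, but the cork is released on every path. */
static int make_skb_fixed(struct queue *q, struct cork *cork)
{
	int built = 0;

	if (q->len > 0) {
		q->len--;
		built = 1;
	}
	cork_release(cork);
	return built;
}
```

With an empty queue, make_skb_leaky() leaves the refcount untouched while
make_skb_fixed() drops it, which is the behaviour the v2 patch aims for.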

Best regards,
Mingyu Wang


> -----Original Message-----
> From: "Sabrina Dubroca" <sd@queasysnail.net>
> Sent: 2026-04-22 19:55:44 (Wednesday)
> To: "Mingyu Wang" <25181214217@stu.xidian.edu.cn>
> Cc: willemdebruijn.kernel@gmail.com, davem@davemloft.net, dsahern@kernel.org, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, horms@kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ipv6: udp: fix memory leak in udpv6_sendmsg error path
> 
> 2026-04-22, 18:58:02 +0800, Mingyu Wang wrote:
> > During fuzzing with failslab enabled, a memory leak was observed in the
> > IPv6 UDP send path.
> > 
> > When sending via the lockless fast path (!corkreq), udpv6_sendmsg()
> > calls ip6_make_skb() and assumes that the routing entry (dst_entry)
> > reference has been stolen by the callee. However, if ip6_make_skb()
> > fails early (e.g., due to an ENOMEM from memory allocation failure),
> > it returns an error pointer without consuming the dst reference.
> 
> Not in all cases? If ip6_setup_cork() fails, we call
> ip6_cork_release() which will release the dst. The MSG_PROBE path also
> releases the dst. __ip6_flush_pending_frames() also looks like it does
> that.
> 
> > Since udpv6_sendmsg() unconditionally jumps to the 'out_no_dst' label,
> > the unconsumed dst_entry is never released, resulting in a memory leak.
> > 
> > Fix this by explicitly calling dst_release(dst) when ip6_make_skb()
> > returns an error.
> > 
> > Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> 
> And this is missing a Fixes tag.
> 
> >  net/ipv6/udp.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
> > index 15e032194ecc..b83ecfd729af 100644
> > --- a/net/ipv6/udp.c
> > +++ b/net/ipv6/udp.c
> > @@ -1706,8 +1706,11 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> >  				   dst_rt6_info(dst),
> >  				   msg->msg_flags, &cork);
> >  		err = PTR_ERR(skb);
> > -		if (!IS_ERR_OR_NULL(skb))
> > +		if (!IS_ERR_OR_NULL(skb)) {
> >  			err = udp_v6_send_skb(skb, fl6, &cork.base);
> > +		} else {
> > +			dst_release(dst);
> > +		}
> >  		/* ip6_make_skb steals dst reference */
> 
> This comment becomes really confusing after your patch.
> 
> >  		goto out_no_dst;
> >  	}
> > -- 
> > 2.34.1
> > 
> > 
> 
> -- 
> Sabrina


* RE: [Intel-wired-lan] [PATCH iwl-next] ice: use ice_fill_eth_hdr() in ice_fill_sw_rule()
From: Rinitha, SX @ 2026-04-23  7:32 UTC (permalink / raw)
  To: Loktionov, Aleksandr, intel-wired-lan@lists.osuosl.org,
	Nguyen, Anthony L, Loktionov, Aleksandr
  Cc: netdev@vger.kernel.org, Szycik, Marcin, Szapar-Mudlaw, Martyna
In-Reply-To: <20260320050556.422762-1-aleksandr.loktionov@intel.com>

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Aleksandr Loktionov
> Sent: 20 March 2026 10:36
> To: intel-wired-lan@lists.osuosl.org; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Cc: netdev@vger.kernel.org; Szycik, Marcin <marcin.szycik@intel.com>; Szapar-Mudlaw, Martyna <martyna.szapar-mudlaw@intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-next] ice: use ice_fill_eth_hdr() in ice_fill_sw_rule()
>
> From: Marcin Szycik <marcin.szycik@intel.com>
>
> Use the already existing helper function to fill Ethernet header. Also replace sizeof with a (also existing) macro to reduce the number of variables.
>
> Suggested-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@intel.com>
> Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> drivers/net/ethernet/intel/ice/ice_switch.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>

Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)


* Re: [PATCH v4 net 2/3] net: mlx5e: fix CWR handling in drivers to preserve ACE signal
From: Paolo Abeni @ 2026-04-23  7:30 UTC (permalink / raw)
  To: chia-yu.chang, linyunsheng, andrew+netdev, parav, jasowang, mst,
	shenjian15, salil.mehta, shaojijie, saeedm, tariqt, mbloch,
	leonro, linux-rdma, netdev, davem, edumazet, kuba, horms, ij,
	ncardwell, koen.de_schepper, g.white, ingemar.s.johansson,
	mirja.kuehlewind, cheshire, rs.ietf, Jason_Livingood, vidhi_goel
In-Reply-To: <20260417152642.71674-3-chia-yu.chang@nokia-bell-labs.com>

On 4/17/26 5:26 PM, chia-yu.chang@nokia-bell-labs.com wrote:
> From: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
> 
> Currently, mlx5 Rx paths set the SKB_GSO_TCP_ECN flag when a TCP segment
> has the CWR flag set. This is wrong because SKB_GSO_TCP_ECN is only
> valid for RFC3168 ECN on Tx, and using it on Rx allows RFC3168 ECN
> offload to clear the CWR flag. As a result, incoming TCP segments
> may lose their ACE signal integrity required for AccECN (RFC9768),
> especially when the packet is forwarded and later re-segmented by GSO.
> 
> Fix this by setting SKB_GSO_TCP_ACCECN for any Rx segment with the CWR
> flag set. SKB_GSO_TCP_ACCECN ensures that RFC3168 ECN offload will
> not clear the CWR flag, therefore preserving the ACE signal.
> 
> Fixes: 92552d3abd329 ("net/mlx5e: HW_GRO cqe handler implementation")
> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> index 5b60aa47c75b..9b1c80079532 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> @@ -1180,7 +1180,7 @@ static void mlx5e_shampo_update_ipv4_tcp_hdr(struct mlx5e_rq *rq, struct iphdr *
>  	skb->csum_offset = offsetof(struct tcphdr, check);
>  
>  	if (tcp->cwr)
> -		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ECN;
> +		skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ACCECN;

Here there is an open question for nVidia:

Is the above enough or will later segmentation lead to the wrong
results? I think/guess the firmware is (still) aggregating the wire
frames using the ECN schema, i.e. the first wire packet has CWR == 1,
the later CWR==0.

If so, later segmentation of this GSO packet will emit CWR == 1 on all
the packets, making the egress stream different from ingress.
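
A toy model of that concern, purely as a sketch: segment_cwr() and
enum gso_mode are hypothetical names, and the two modes only encode the
behaviour described in this thread (RFC3168 ECN offload clears CWR after
the first segment, the AccECN type leaves it alone). Nothing here is
verified against the firmware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two GSO types discussed above. */
enum gso_mode { GSO_ECN, GSO_ACCECN };

/* Fill cwr_out[0..nsegs) with the CWR bit each emitted segment would
 * carry when a super-packet with CWR == super_cwr is re-segmented.
 */
static void segment_cwr(bool super_cwr, enum gso_mode mode,
			bool *cwr_out, size_t nsegs)
{
	assert(cwr_out || nsegs == 0);

	for (size_t i = 0; i < nsegs; i++) {
		if (mode == GSO_ECN)
			cwr_out[i] = super_cwr && i == 0; /* CWR on first only */
		else
			cwr_out[i] = super_cwr;           /* CWR copied to all */
	}
}
```

If the ingress frames were 1,0,0 and the aggregated packet is tagged with
the AccECN type, this model emits 1,1,1 on egress, which is the mismatch
being asked about.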

@Saeed, Leon, Tariq: could you please have a look here?

I guess that, with a more conservative approach, the driver updates should
be omitted, and the updated documentation should be less forceful (i.e.
"TCP_ECN should not be used in RX")

/P

Thanks,

Paolo



* Re: [PATCH v1 net] udp: Use READ_ONCE()/WRITE_ONCE() for hslot->count.
From: Eric Dumazet @ 2026-04-23  7:29 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Willem de Bruijn,
	Simon Horman, David Held, Kuniyuki Iwashima, netdev,
	syzbot+04905b8b3523856476c0
In-Reply-To: <20260423010933.3899132-1-kuniyu@google.com>

On Wed, Apr 22, 2026 at 6:09 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>
> __udp4_lib_mcast_demux_lookup() and __udp4_lib_mcast_deliver()
> read the number of sockets in the 1-tuple hash table chain
> locklessly. [0]
>
> Let's use READ_ONCE() and WRITE_ONCE() for hslot->count.
>
> [0]:
> BUG: KCSAN: data-race in __udp4_lib_mcast_deliver / udp_lib_unhash

...

>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 UID: 0 PID: 15111 Comm: syz.6.4060 Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
>
> Fixes: 2dc41cff7545 ("udp: Use hash2 for long hash1 chains in __udp*_lib_mcast_deliver.")
> Fixes: 63c6f81cdde5 ("udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup")
> Reported-by: syzbot+04905b8b3523856476c0@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/69e97093.050a0220.24bfd3.0048.GAE@google.com/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---

You forgot to change net/ipv6/udp.c

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 15e032194eccc3c29b3e5f3a09214cad60623329..1f4dabe4c350118441f61c1cf0398e698581542c 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -953,7 +953,7 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
        hash2_any = 0;
        hash2 = 0;
        hslot = udp_hashslot(udptable, net, hnum);
-       use_hash2 = hslot->count > 10;
+       use_hash2 = READ_ONCE(hslot->count) > 10;
        offset = offsetof(typeof(*sk), sk_node);

        if (use_hash2) {


Also one part is missing in udp_lib_get_port()

@@ -289,7 +289,7 @@ int udp_lib_get_port(struct sock *sk, unsigned short snum,
                        hash2_nulladdr &= udptable->mask;

                        hslot2 = udp_hashslot2(udptable, slot2);
-                       if (hslot->count < hslot2->count)
+                       if (hslot->count < READ_ONCE(hslot2->count))
                                goto scan_primary_hash;

                        exist = udp_lib_lport_inuse2(net, snum, hslot2, sk);


* [bug report] Potential atomicity/refcounting issues in 'drivers/net/ethernet/chelsio/cxgb4/clip_tbl.c', between 'cxgb4_clip_get()' and 'cxgb4_clip_release()'
From: Ginger @ 2026-04-23  7:27 UTC (permalink / raw)
  To: bharat; +Cc: netdev, linux-kernel

Dear Linux kernel maintainers,

My research-based static analyzer found a potential
refcounting/atomicity bug within the
'drivers/net/ethernet/chelsio/cxgb4' subsystem, more specifically, in
'drivers/net/ethernet/chelsio/cxgb4/clip_tbl.c'.

Kernel version: long-term kernel v6.18.9

Potential concurrent triggering executions:
T0:
cxgb4_clip_get
     --> read_unlock_bh(&ctbl->lock);
     --> refcount_inc(&ce->refcnt);
or
cxgb4_clip_get
     --> write_unlock_bh(&ctbl->lock);
     --> refcount_set(&ce->refcnt, 1);

T1:
cxgb4_clip_release
    --> write_lock_bh(&ctbl->lock);
    --> spin_lock_bh(&ce->lock);
    --> refcount_dec_and_test(&ce->refcnt);
    --> spin_unlock_bh(&ce->lock);
    --> write_unlock_bh(&ctbl->lock);

In T0, the increment of 'ce->refcnt' is performed after 'ctbl->lock'
has been dropped and/or does not check whether the refcount has
already reached zero, i.e., it is not synchronized with
'cxgb4_clip_release'.
Although 'ctbl->lock' does not seem to be intended to protect
'ce->refcnt', this is potentially problematic because T1 decrements
'ce->refcnt' while holding both 'ctbl->lock' and 'ce->lock'.
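
For reference, the "check for zero before incrementing" pattern can be
sketched in userspace C11 as follows. This is only an illustration of the
general technique (the kernel offers refcount_inc_not_zero() for it);
ref_get_unless_zero() and ref_put() are hypothetical names, and the sketch
does not confirm that the reported interleaving is reachable in practice.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still live (count != 0).
 * Returns false when the count already dropped to zero, in which case
 * the caller must not touch the object.
 */
static bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when this was the last one. */
static bool ref_put(atomic_int *ref)
{
	int old = atomic_fetch_sub(ref, 1);

	assert(old > 0);
	return old == 1;
}
```

A plain unconditional increment, by contrast, would turn a count of 0
back into 1 and hand out a reference to an entry that the release path
may already be tearing down.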

Thank you for your time and consideration.

Best regards,
Ginger


* [PATCH RFC net-next 4/4] net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook
From: Corey Leavitt @ 2026-04-23  7:23 UTC (permalink / raw)
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-7d4856f686f6@leavitt.info>

Transfer ownership of phydev->psec from fwnode_mdio to the phy
subsystem itself. The phy subsystem now subscribes to the pse-pd
notifier chain and manages psec attach/detach in response to PSE
controller lifecycle events, while fwnode_mdio loses its PSE
awareness entirely.

Split phy_device_register() into a public entry point that takes
rtnl_lock() and a phy_device_register_locked() variant that assumes
rtnl is already held. Callers that already hold rtnl (the SFP
module state machine via __sfp_sm_event) use the _locked form to
avoid deadlock; all other callers use the unchanged public API.
This pair mirrors the register_netdevice() / register_netdev()
split convention already established in the core networking stack.
rtnl must span the full registration sequence through device_add(),
not just phy_try_attach_pse(): a PSE_REGISTERED event firing between
a narrow attach lock and device_add() would walk mdio_bus_type, find
the phy not yet on the bus, and leave it permanently unattached.
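
The shape of that split, reduced to a single-threaded userspace sketch
(hypothetical names throughout: big_lock stands in for rtnl, struct dev
for the phy, and a boolean flag replaces a real mutex so the lock-held
assertion is visible):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for rtnl; a flag is enough to show the calling convention. */
static bool big_lock_held;

static void big_lock(void)
{
	assert(!big_lock_held); /* would deadlock if already held */
	big_lock_held = true;
}

static void big_unlock(void)
{
	assert(big_lock_held);
	big_lock_held = false;
}

struct dev { bool attached; bool on_bus; };

/* Caller must already hold the lock; covers attach AND bus add so no
 * event can slip in between the two steps.
 */
static int dev_register_locked(struct dev *d)
{
	assert(big_lock_held);
	d->attached = true; /* attach attempt */
	d->on_bus = true;   /* becomes visible to bus walkers */
	return 0;
}

/* Public form: takes the lock around the whole sequence. */
static int dev_register(struct dev *d)
{
	int err;

	big_lock();
	err = dev_register_locked(d);
	big_unlock();
	return err;
}
```

A caller that already holds the lock and calls dev_register() trips the
assertion in big_lock(), which is the userspace analogue of the deadlock
the _locked variant exists to avoid.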

With rtnl held across the full registration sequence:

  - At phy_device_register_locked(), phy_try_attach_pse() attempts
    an of_pse_control_get() for phys whose DT pses phandle resolves
    now. If the controller is already registered, psec is attached
    before device_add() makes the phy visible on mdio_bus_type.
    If the controller is not yet registered, the swallow-error path
    leaves psec NULL and relies on the subsequent notifier event.

  - On PSE_REGISTERED: an rtnl-guarded bus walk retries the attach
    for every registered phy whose psec is still NULL. This is the
    "phy was enumerated before the PSE controller loaded" case,
    which is the root cause of the boot-time probe-retry storm on
    systems with a modular PSE controller driver. Because the
    pse_controller_notifier is fired synchronously, a concurrent
    pse_controller_register() either (a) completes list_add and
    releases pse_list_mutex before this function takes rtnl, in
    which case phy_try_attach_pse() finds the controller in the
    list and attaches; or (b) fires its notifier during this
    function, in which case the callback blocks on rtnl until this
    function returns, then walks the bus and finds the phy fully
    registered (attaching if psec is still NULL).

  - On PSE_UNREGISTERED: an rtnl-guarded bus walk releases every
    phydev->psec that targets the departing controller before
    pse_release_pis() frees pcdev->pi. Without this, a phy still
    holding a pse_control reference would cause a use-after-free
    in __pse_control_release's pcdev->pi[psec->id] access, and the
    PSE driver module could not finish unloading while any phy
    still held a reference via module_put().
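
The PSE_UNREGISTERED walk in the last item can be illustrated with a small userspace model. It is a sketch, not the patch's code: the `model_*` types are hypothetical, a plain array stands in for the mdio_bus_type device list, and a `users` counter stands in for the references a controller sees from its pse_control handles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical controller: counts the handles that still target it. */
struct model_ctrl {
	int users;
};

/* Hypothetical handle, playing the role of struct pse_control. */
struct model_psec {
	struct model_ctrl *ctrl;
};

/* Hypothetical phy, holding at most one handle. */
struct model_phy {
	struct model_psec *psec;
};

/* Predicate in the spirit of pse_control_matches_pcdev(). */
static int model_matches(const struct model_psec *psec,
			 const struct model_ctrl *ctrl)
{
	return psec->ctrl == ctrl;
}

/* Drop one handle's reference on its controller. */
static void model_psec_put(struct model_psec *psec)
{
	assert(psec->ctrl->users > 0);
	psec->ctrl->users--;
}

/* Walk all phys and release every handle that targets @ctrl.
 * Returns the number of handles dropped.
 */
static size_t model_detach_all(struct model_phy *phys, size_t n,
			       struct model_ctrl *ctrl)
{
	size_t dropped = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_psec *psec = phys[i].psec;

		if (!psec || !model_matches(psec, ctrl))
			continue;

		phys[i].psec = NULL;	/* clear the pointer, then drop the ref */
		model_psec_put(psec);
		dropped++;
	}
	return dropped;
}
```

After the walk the departing controller has no remaining users, which is the condition the commit message relies on before the controller's per-PI state is freed.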

Introduce phy_try_attach_pse() as the rtnl-guarded helper used by
both the register path and the notifier walk. Holding rtnl across
of_pse_control_get() is safe because pse_list_mutex is never held
in the opposite order.

Expose pse_control_matches_pcdev() as a predicate so subscribers
can identify which of their held pse_control references target a
given controller, without leaking the struct pse_controller_dev *
out of pse_control opacity.

Move the final pse_control_put() of phydev->psec from
phy_device_remove() to phy_device_release(). The kobject release
callback runs only after every reference on the device has been
dropped, including the bus iterator references taken by
bus_for_each_dev() in the notifier walk, which means by the time
release fires no concurrent reader or writer of phydev->psec can
exist. The mdio_bus_type klist is set up in bus_register() with
klist_devices_get() / klist_devices_put() (drivers/base/bus.c),
which bracket each iteration step with get_device() / put_device()
on the underlying struct device; that reference defers the release
callback from firing until the walk has advanced past this phy.
Keeping phy_device_remove() unchanged avoids introducing a new
locking contract on its many callers (sfp, fixed_phy, xgbe, hns,
netsec, bcm_sf2, mdiobus_unregister).

Finally, delete fwnode_find_pse_control() and its call site in
fwnode_mdiobus_register_phy(), and drop the PSE header from
fwnode_mdio.c. This removes the probe-time -EPROBE_DEFER coupling
between mdio and pse-pd that caused the boot-hang regression on
systems with a modular PSE controller driver and a DT phy with a
pses phandle: the MDIO/DSA probe no longer sees any PSE-originated
-EPROBE_DEFER, so the probe-retry storm is gone. fwnode_mdio is
now PSE-agnostic.

Fixes: fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse control")
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/mdio/fwnode_mdio.c |  34 ----------
 drivers/net/phy/phy_device.c   | 144 ++++++++++++++++++++++++++++++++++++++---
 drivers/net/phy/sfp.c          |   2 +-
 drivers/net/pse-pd/pse_core.c  |  14 ++++
 include/linux/phy.h            |   2 +
 include/linux/pse-pd/pse.h     |   9 +++
 6 files changed, 161 insertions(+), 44 deletions(-)

diff --git a/drivers/net/mdio/fwnode_mdio.c b/drivers/net/mdio/fwnode_mdio.c
index ba7091518265..7bd979b59f49 100644
--- a/drivers/net/mdio/fwnode_mdio.c
+++ b/drivers/net/mdio/fwnode_mdio.c
@@ -11,33 +11,11 @@
 #include <linux/fwnode_mdio.h>
 #include <linux/of.h>
 #include <linux/phy.h>
-#include <linux/pse-pd/pse.h>
 
 MODULE_AUTHOR("Calvin Johnson <calvin.johnson@oss.nxp.com>");
 MODULE_LICENSE("GPL");
 MODULE_DESCRIPTION("FWNODE MDIO bus (Ethernet PHY) accessors");
 
-static struct pse_control *
-fwnode_find_pse_control(struct fwnode_handle *fwnode,
-			struct phy_device *phydev)
-{
-	struct pse_control *psec;
-	struct device_node *np;
-
-	if (!IS_ENABLED(CONFIG_PSE_CONTROLLER))
-		return NULL;
-
-	np = to_of_node(fwnode);
-	if (!np)
-		return NULL;
-
-	psec = of_pse_control_get(np, phydev);
-	if (PTR_ERR(psec) == -ENOENT)
-		return NULL;
-
-	return psec;
-}
-
 static struct mii_timestamper *
 fwnode_find_mii_timestamper(struct fwnode_handle *fwnode)
 {
@@ -118,7 +96,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
 				struct fwnode_handle *child, u32 addr)
 {
 	struct mii_timestamper *mii_ts = NULL;
-	struct pse_control *psec = NULL;
 	struct phy_device *phy;
 	bool is_c45;
 	u32 phy_id;
@@ -159,14 +136,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
 			goto clean_phy;
 	}
 
-	psec = fwnode_find_pse_control(child, phy);
-	if (IS_ERR(psec)) {
-		rc = PTR_ERR(psec);
-		goto unregister_phy;
-	}
-
-	phy->psec = psec;
-
 	/* phy->mii_ts may already be defined by the PHY driver. A
 	 * mii_timestamper probed via the device tree will still have
 	 * precedence.
@@ -176,9 +145,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
 
 	return 0;
 
-unregister_phy:
-	if (is_acpi_node(child) || is_of_node(child))
-		phy_device_remove(phy);
 clean_phy:
 	phy_device_free(phy);
 clean_mii_ts:
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index c2cdf1ae3542..7948800e6e49 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -223,8 +223,19 @@ static void phy_mdio_device_free(struct mdio_device *mdiodev)
 
 static void phy_device_release(struct device *dev)
 {
+	struct phy_device *phydev = to_phy_device(dev);
+
+	/* bus_for_each_dev() holds get_device() across each iteration
+	 * step, deferring this release callback until any in-flight PSE
+	 * notifier walk has advanced past this phy. pse_control_put()
+	 * takes pse_list_mutex, so this path must run in sleepable
+	 * context.
+	 */
+	might_sleep();
+	pse_control_put(phydev->psec);
+
 	fwnode_handle_put(dev->fwnode);
-	kfree(to_phy_device(dev));
+	kfree(phydev);
 }
 
 static void phy_mdio_device_remove(struct mdio_device *mdiodev)
@@ -1102,14 +1113,102 @@ struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45)
 }
 EXPORT_SYMBOL(get_phy_device);
 
-/**
- * phy_device_register - Register the phy device on the MDIO bus
- * @phydev: phy_device structure to be added to the MDIO bus
+/* Best-effort attach of phydev->psec from a DT `pses = <&...>` phandle.
+ * Caller must hold rtnl. Errors are swallowed; the notifier retries
+ * at PSE_REGISTERED time.
  */
-int phy_device_register(struct phy_device *phydev)
+static void phy_try_attach_pse(struct phy_device *phydev)
+{
+	struct pse_control *psec;
+	struct device_node *np;
+
+	ASSERT_RTNL();
+
+	np = phydev->mdio.dev.of_node;
+	if (!np)
+		return;
+
+	if (phydev->psec)
+		return;
+
+	psec = of_pse_control_get(np, phydev);
+	if (IS_ERR(psec))
+		return;
+
+	phydev->psec = psec;
+}
+
+static int phy_pse_attach_one(struct device *dev, void *data __maybe_unused)
+{
+	ASSERT_RTNL();
+
+	if (dev->type != &mdio_bus_phy_type)
+		return 0;
+
+	phy_try_attach_pse(to_phy_device(dev));
+	return 0;
+}
+
+static int phy_pse_detach_one(struct device *dev, void *data)
+{
+	struct pse_controller_dev *pcdev = data;
+	struct phy_device *phydev;
+	struct pse_control *psec;
+
+	ASSERT_RTNL();
+
+	if (dev->type != &mdio_bus_phy_type)
+		return 0;
+
+	phydev = to_phy_device(dev);
+	psec = phydev->psec;
+	if (!psec || !pse_control_matches_pcdev(psec, pcdev))
+		return 0;
+
+	phydev->psec = NULL;
+	pse_control_put(psec);
+	return 0;
+}
+
+static int phy_pse_notifier_event(struct notifier_block *nb,
+				  unsigned long event, void *data)
+{
+	switch (event) {
+	case PSE_REGISTERED:
+		rtnl_lock();
+		bus_for_each_dev(&mdio_bus_type, NULL, NULL,
+				 phy_pse_attach_one);
+		rtnl_unlock();
+		return NOTIFY_OK;
+	case PSE_UNREGISTERED:
+		rtnl_lock();
+		bus_for_each_dev(&mdio_bus_type, NULL, data,
+				 phy_pse_detach_one);
+		rtnl_unlock();
+		return NOTIFY_OK;
+	default:
+		return NOTIFY_DONE;
+	}
+}
+
+static struct notifier_block phy_pse_notifier __read_mostly = {
+	.notifier_call = phy_pse_notifier_event,
+};
+
+/**
+ * phy_device_register_locked - Register the phy device on the MDIO bus
+ * @phydev: phy_device structure to be added to the MDIO bus
+ *
+ * Same as phy_device_register() but caller must already hold rtnl_lock().
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int phy_device_register_locked(struct phy_device *phydev)
 {
 	int err;
 
+	ASSERT_RTNL();
+
 	err = mdiobus_register_device(&phydev->mdio);
 	if (err)
 		return err;
@@ -1124,6 +1223,8 @@ int phy_device_register(struct phy_device *phydev)
 		goto out;
 	}
 
+	phy_try_attach_pse(phydev);
+
 	err = device_add(&phydev->mdio.dev);
 	if (err) {
 		phydev_err(phydev, "failed to add\n");
@@ -1133,12 +1234,32 @@ int phy_device_register(struct phy_device *phydev)
 	return 0;
 
  out:
-	/* Assert the reset signal */
+	/* If phy_try_attach_pse() set phydev->psec before device_add()
+	 * failed, the caller's phy_device_free() -> phy_device_release()
+	 * chain will drop it.
+	 */
 	phy_device_reset(phydev, 1);
-
 	mdiobus_unregister_device(&phydev->mdio);
 	return err;
 }
+EXPORT_SYMBOL(phy_device_register_locked);
+
+/**
+ * phy_device_register - Register the phy device on the MDIO bus
+ * @phydev: phy_device structure to be added to the MDIO bus
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int phy_device_register(struct phy_device *phydev)
+{
+	int err;
+
+	rtnl_lock();
+	err = phy_device_register_locked(phydev);
+	rtnl_unlock();
+
+	return err;
+}
 EXPORT_SYMBOL(phy_device_register);
 
 /**
@@ -1152,8 +1273,6 @@ EXPORT_SYMBOL(phy_device_register);
 void phy_device_remove(struct phy_device *phydev)
 {
 	unregister_mii_timestamper(phydev->mii_ts);
-	pse_control_put(phydev->psec);
-
 	device_del(&phydev->mdio.dev);
 
 	/* Assert the reset signal */
@@ -3962,8 +4081,14 @@ static int __init phy_init(void)
 	if (rc)
 		goto err_c45;
 
+	rc = pse_register_notifier(&phy_pse_notifier);
+	if (rc)
+		goto err_genphy;
+
 	return 0;
 
+err_genphy:
+	phy_driver_unregister(&genphy_driver);
 err_c45:
 	phy_driver_unregister(&genphy_c45_driver);
 err_ethtool_phy_ops:
@@ -3980,6 +4105,7 @@ static int __init phy_init(void)
 
 static void __exit phy_exit(void)
 {
+	pse_unregister_notifier(&phy_pse_notifier);
 	phy_driver_unregister(&genphy_c45_driver);
 	phy_driver_unregister(&genphy_driver);
 	rtnl_lock();
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index bd970f753beb..d19fe0f30c5d 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1932,7 +1932,7 @@ static int sfp_sm_probe_phy(struct sfp *sfp, int addr, bool is_c45)
 	/* Mark this PHY as being on a SFP module */
 	phy->is_on_sfp_module = true;
 
-	err = phy_device_register(phy);
+	err = phy_device_register_locked(phy);
 	if (err) {
 		phy_device_free(phy);
 		dev_err(sfp->dev, "phy_device_register failed: %pe\n",
diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 82125502a8e3..a0667324a029 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -2016,3 +2016,17 @@ bool pse_has_c33(struct pse_control *psec)
 	return psec->pcdev->types & ETHTOOL_PSE_C33;
 }
 EXPORT_SYMBOL_GPL(pse_has_c33);
+
+/**
+ * pse_control_matches_pcdev - Test whether a pse_control targets a controller
+ * @psec: pse_control obtained from of_pse_control_get()
+ * @pcdev: PSE controller to compare against
+ *
+ * Return: %true if @psec was obtained from @pcdev, %false otherwise.
+ */
+bool pse_control_matches_pcdev(struct pse_control *psec,
+			       struct pse_controller_dev *pcdev)
+{
+	return psec->pcdev == pcdev;
+}
+EXPORT_SYMBOL_GPL(pse_control_matches_pcdev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 199a7aaa341b..865b9baddb85 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -2158,6 +2158,8 @@ struct phy_device *fwnode_phy_find_device(struct fwnode_handle *phy_fwnode);
 struct fwnode_handle *fwnode_get_phy_node(const struct fwnode_handle *fwnode);
 struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45);
 int phy_device_register(struct phy_device *phy);
+/* Caller must hold rtnl_lock(); see phy_device_register() for the public form. */
+int phy_device_register_locked(struct phy_device *phy);
 void phy_device_free(struct phy_device *phydev);
 void phy_device_remove(struct phy_device *phydev);
 int phy_get_c45_ids(struct phy_device *phydev);
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 78fe3a2b1ea8..d4310ca71a3e 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -385,6 +385,9 @@ int pse_ethtool_set_prio(struct pse_control *psec,
 bool pse_has_podl(struct pse_control *psec);
 bool pse_has_c33(struct pse_control *psec);
 
+bool pse_control_matches_pcdev(struct pse_control *psec,
+			       struct pse_controller_dev *pcdev);
+
 int pse_register_notifier(struct notifier_block *nb);
 int pse_unregister_notifier(struct notifier_block *nb);
 
@@ -438,6 +441,12 @@ static inline bool pse_has_c33(struct pse_control *psec)
 	return false;
 }
 
+static inline bool pse_control_matches_pcdev(struct pse_control *psec,
+					     struct pse_controller_dev *pcdev)
+{
+	return false;
+}
+
 static inline int pse_register_notifier(struct notifier_block *nb)
 {
 	return 0;

-- 
2.53.0




* [PATCH RFC net-next 2/4] net: pse-pd: add notifier chain for controller lifecycle events
From: Corey Leavitt @ 2026-04-23  7:22 UTC (permalink / raw)
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-7d4856f686f6@leavitt.info>

Introduce a blocking notifier chain that allows other subsystems to be
informed when a PSE controller is registered or unregistered, and
provide pse_register_notifier() / pse_unregister_notifier() as the
subscriber interface.

Subsequent patches will use this to let the phy subsystem own the
phydev->psec lifecycle directly, decoupling PSE lookup from
fwnode_mdiobus_register_phy() and removing the probe-time
-EPROBE_DEFER coupling that currently exists between mdio, phy and
pse-pd when the PSE controller driver is modular.

A blocking chain (rather than atomic) is used because callbacks will
take rtnl_lock and call back into pse_core via of_pse_control_get().

The enum pse_controller_event is placed outside the
IS_ENABLED(CONFIG_PSE_CONTROLLER) guard so that subscribers compiled
into a kernel without PSE support can still reference the event
values in dead-code paths without breaking the build.

This patch is pure infrastructure: nothing fires events yet, and
nothing subscribes. No observable behavior change.
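
For readers unfamiliar with notifier chains, the register/unregister/call pattern can be sketched in userspace C. This is a simplified model, not the kernel's blocking notifier implementation: the `model_*` names and the -1 error returns are assumptions, and the rwsem that makes the real chain "blocking" is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event values, loosely mirroring enum pse_controller_event. */
enum model_event {
	MODEL_REGISTERED,
	MODEL_UNREGISTERED,
};

struct model_nb {
	int (*call)(struct model_nb *nb, enum model_event ev, void *data);
	struct model_nb *next;
};

/* Head of the singly linked subscriber chain. */
static struct model_nb *model_chain;

/* Append @nb to the chain; returns -1 if it is already registered. */
static int model_register_notifier(struct model_nb *nb)
{
	struct model_nb **p;

	assert(nb && nb->call);
	for (p = &model_chain; *p; p = &(*p)->next)
		if (*p == nb)
			return -1;
	nb->next = NULL;
	*p = nb;
	return 0;
}

/* Unlink @nb from the chain; returns -1 if it was not registered. */
static int model_unregister_notifier(struct model_nb *nb)
{
	struct model_nb **p;

	for (p = &model_chain; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			nb->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Call every subscriber synchronously; returns how many were called. */
static int model_call_chain(enum model_event ev, void *data)
{
	struct model_nb *nb;
	int calls = 0;

	for (nb = model_chain; nb; nb = nb->next) {
		nb->call(nb, ev, data);
		calls++;
	}
	return calls;
}

/* Example subscriber: counts events and remembers the last payload. */
struct model_counter {
	struct model_nb nb;	/* must stay first: the callback casts back */
	int registered_seen;
	int unregistered_seen;
	void *last_data;
};

static int model_count_event(struct model_nb *nb, enum model_event ev,
			     void *data)
{
	struct model_counter *c = (struct model_counter *)nb;

	if (ev == MODEL_REGISTERED)
		c->registered_seen++;
	else
		c->unregistered_seen++;
	c->last_data = data;
	return 0;
}
```

Because the callbacks run synchronously in the caller's context, whatever the caller holds while firing the chain is also held while every subscriber runs, which is why the kernel-doc in the patch warns about locks held across register/unregister.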

Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/pse-pd/pse_core.c | 34 ++++++++++++++++++++++++++++++++++
 include/linux/pse-pd/pse.h    | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 893ec2185947..80c5c6c1758c 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -8,6 +8,7 @@
 #include <linux/device.h>
 #include <linux/ethtool.h>
 #include <linux/ethtool_netlink.h>
+#include <linux/notifier.h>
 #include <linux/of.h>
 #include <linux/phy.h>
 #include <linux/pse-pd/pse.h>
@@ -23,6 +24,39 @@ static LIST_HEAD(pse_controller_list);
 static DEFINE_XARRAY_ALLOC(pse_pw_d_map);
 static DEFINE_MUTEX(pse_pw_d_mutex);
 
+static BLOCKING_NOTIFIER_HEAD(pse_controller_notifier);
+
+/**
+ * pse_register_notifier - register a callback for PSE controller events
+ * @nb: notifier block to register
+ *
+ * See enum pse_controller_event for events fired and their subscriber
+ * contract. Callbacks run in process context; they may sleep, take
+ * rtnl, and call of_pse_control_get(). The chain fires synchronously,
+ * so a PSE controller driver's probe/unbind path must not hold any
+ * such lock when calling pse_controller_register() or
+ * pse_controller_unregister().
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int pse_register_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&pse_controller_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(pse_register_notifier);
+
+/**
+ * pse_unregister_notifier - unregister a previously registered callback
+ * @nb: notifier block previously passed to pse_register_notifier()
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int pse_unregister_notifier(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_unregister(&pse_controller_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(pse_unregister_notifier);
+
 /**
  * struct pse_control - a PSE control
  * @pcdev: a pointer to the PSE controller device
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 4e5696cfade7..78fe3a2b1ea8 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -21,6 +21,7 @@ struct net_device;
 struct phy_device;
 struct pse_controller_dev;
 struct netlink_ext_ack;
+struct notifier_block;
 
 /* C33 PSE extended state and substate. */
 struct ethtool_c33_pse_ext_state_info {
@@ -337,6 +338,24 @@ enum pse_budget_eval_strategies {
 	PSE_BUDGET_EVAL_STRAT_DYNAMIC	= 1 << 2,
 };
 
+/**
+ * enum pse_controller_event - PSE controller lifecycle events
+ *
+ * Event data in callbacks is always a pointer to the struct
+ * pse_controller_dev firing the event.
+ *
+ * @PSE_REGISTERED: controller added to pse_controller_list and
+ *	resolvable by of_pse_control_get().
+ * @PSE_UNREGISTERED: controller about to be removed from
+ *	pse_controller_list. Subscribers holding pse_control references
+ *	targeting it must drop them before returning and must not
+ *	acquire new references for it.
+ */
+enum pse_controller_event {
+	PSE_REGISTERED,
+	PSE_UNREGISTERED,
+};
+
 #if IS_ENABLED(CONFIG_PSE_CONTROLLER)
 int pse_controller_register(struct pse_controller_dev *pcdev);
 void pse_controller_unregister(struct pse_controller_dev *pcdev);
@@ -366,6 +385,9 @@ int pse_ethtool_set_prio(struct pse_control *psec,
 bool pse_has_podl(struct pse_control *psec);
 bool pse_has_c33(struct pse_control *psec);
 
+int pse_register_notifier(struct notifier_block *nb);
+int pse_unregister_notifier(struct notifier_block *nb);
+
 #else
 
 static inline struct pse_control *of_pse_control_get(struct device_node *node,
@@ -416,6 +438,16 @@ static inline bool pse_has_c33(struct pse_control *psec)
 	return false;
 }
 
+static inline int pse_register_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
+static inline int pse_unregister_notifier(struct notifier_block *nb)
+{
+	return 0;
+}
+
 #endif
 
 #endif

-- 
2.53.0




* [PATCH RFC net-next 3/4] net: pse-pd: fire lifecycle events on controller register/unregister
From: Corey Leavitt @ 2026-04-23  7:23 UTC (permalink / raw)
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-7d4856f686f6@leavitt.info>

Hook the newly-introduced pse_controller_notifier chain so that
pse_controller_register() fires PSE_REGISTERED after the controller
has been added to pse_controller_list (i.e. is now resolvable by
of_pse_control_get()), and pse_controller_unregister() fires
PSE_UNREGISTERED before the controller is removed from the list
(while it is still valid to dereference from a subscriber's
pse_control pointer targeting it).

With no subscribers yet, this is observably a no-op. A later change
wires the phy subsystem in as the first subscriber.
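
The ordering described above (fire REGISTERED after the list add, fire UNREGISTERED before the removal) can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: the `model_*` names are hypothetical, a fixed array replaces pse_controller_list, and a single hard-wired subscriber records whether the controller was resolvable when each event fired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CTRL 4

struct model_ctrl {
	int id;
};

/* Stand-in for pse_controller_list. */
static struct model_ctrl *model_list[MODEL_MAX_CTRL];

/* What the single model subscriber observed at event time. */
static bool model_seen_at_register;
static bool model_seen_at_unregister;

/* Stand-in for a lookup such as of_pse_control_get() resolving @ctrl. */
static bool model_lookup(const struct model_ctrl *ctrl)
{
	size_t i;

	for (i = 0; i < MODEL_MAX_CTRL; i++)
		if (model_list[i] == ctrl)
			return true;
	return false;
}

static void model_fire_registered(const struct model_ctrl *ctrl)
{
	model_seen_at_register = model_lookup(ctrl);
}

static void model_fire_unregistered(const struct model_ctrl *ctrl)
{
	model_seen_at_unregister = model_lookup(ctrl);
}

/* Add to the list first, then fire the event. */
static int model_ctrl_register(struct model_ctrl *ctrl)
{
	size_t i;

	assert(ctrl);
	for (i = 0; i < MODEL_MAX_CTRL; i++) {
		if (!model_list[i]) {
			model_list[i] = ctrl;
			model_fire_registered(ctrl);
			return 0;
		}
	}
	return -1;	/* list full */
}

/* Fire the event first, then remove from the list. */
static void model_ctrl_unregister(struct model_ctrl *ctrl)
{
	size_t i;

	assert(ctrl);
	model_fire_unregistered(ctrl);
	for (i = 0; i < MODEL_MAX_CTRL; i++)
		if (model_list[i] == ctrl)
			model_list[i] = NULL;
}
```

In both directions the subscriber sees a controller that is still resolvable, so it can attach on registration and release its handles on unregistration while the controller is still valid.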

Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/pse-pd/pse_core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 80c5c6c1758c..82125502a8e3 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -1138,6 +1138,9 @@ int pse_controller_register(struct pse_controller_dev *pcdev)
 	list_add(&pcdev->list, &pse_controller_list);
 	mutex_unlock(&pse_list_mutex);
 
+	blocking_notifier_call_chain(&pse_controller_notifier,
+				     PSE_REGISTERED, pcdev);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(pse_controller_register);
@@ -1148,6 +1151,9 @@ EXPORT_SYMBOL_GPL(pse_controller_register);
  */
 void pse_controller_unregister(struct pse_controller_dev *pcdev)
 {
+	blocking_notifier_call_chain(&pse_controller_notifier,
+				     PSE_UNREGISTERED, pcdev);
+
 	pse_flush_pw_ds(pcdev);
 	pse_release_pis(pcdev);
 	if (pcdev->irq)

-- 
2.53.0




* [PATCH RFC net-next 1/4] net: pse-pd: scope pse_control regulator handle to kref lifetime
From: Corey Leavitt @ 2026-04-23  7:22 UTC (permalink / raw)
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-7d4856f686f6@leavitt.info>

__pse_control_release() drops psec->ps via devm_regulator_put(), which
only succeeds if the devres entry added by the matching
devm_regulator_get_exclusive() is still present on pcdev->dev at the
time the pse_control's kref hits zero.

In practice that assumption does not hold when the controller is
unbound while any pse_control still has consumers: pcdev->dev's
devres list is released LIFO, so every per-attach regulator-GET
devres runs (and regulator_put()s the underlying regulator) before
pse_controller_unregister() itself is invoked. Any later
pse_control_put() from that unbind path then reads psec->ps as a
dangling pointer inside devm_regulator_put() and WARNs at
drivers/regulator/devres.c:232 (devres_release() fails to find the
already-released match).

The pse_control's consumer handle is logically scoped to the
pse_control's refcount, not to pcdev->dev's devres lifetime. Switch
to the plain regulator_get_exclusive() / regulator_put() pair so
__pse_control_release() does the right put regardless of whether
the controller's devres has already been unwound.

No change to the regulator-framework-visible refcount or lifetime of
the underlying regulator: a single get paired with a single put. The
existing devm_regulator_register() for the per-PI rails is unchanged
(those ARE correctly scoped to the controller's lifetime).
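
The "single get paired with a single put, tied to the refcount" idea can be sketched in userspace C. This is a model, not the regulator API: the `model_*` names are hypothetical, `handles` merely counts outstanding consumer handles, and no devres behaviour is simulated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical supply: counts outstanding consumer handles. */
struct model_supply {
	int handles;
};

/* Hypothetical refcounted control object that owns one consumer handle. */
struct model_ctl {
	int refs;
	struct model_supply *supply;
	bool released;
};

/* Create the control with one reference and take the single handle. */
static void model_ctl_init(struct model_ctl *ctl, struct model_supply *s)
{
	ctl->refs = 1;
	ctl->supply = s;
	ctl->released = false;
	s->handles++;		/* the single get */
}

static void model_ctl_get(struct model_ctl *ctl)
{
	assert(ctl->refs > 0);
	ctl->refs++;
}

/* Drop one reference; returns true if this was the final put. */
static bool model_ctl_put(struct model_ctl *ctl)
{
	assert(ctl->refs > 0);
	if (--ctl->refs)
		return false;

	ctl->supply->handles--;	/* the single put, at refcount zero */
	ctl->released = true;
	return true;
}
```

The handle is released exactly once, when the last reference goes away, independent of any teardown order elsewhere; that is the property the switch away from the devm_ variants is meant to give.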

Fixes: d83e13761d5b ("net: pse-pd: Use regulator framework within PSE framework")
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
 drivers/net/pse-pd/pse_core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index f6b94ac7a68a..893ec2185947 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -1362,7 +1362,7 @@ static void __pse_control_release(struct kref *kref)

 	if (psec->pcdev->pi[psec->id].admin_state_enabled)
 		regulator_disable(psec->ps);
-	devm_regulator_put(psec->ps);
+	regulator_put(psec->ps);

 	module_put(psec->pcdev->owner);

@@ -1431,8 +1431,8 @@ pse_control_get_internal(struct pse_controller_dev *pcdev, unsigned int index,
 		goto free_psec;

 	pcdev->pi[index].admin_state_enabled = ret;
-	psec->ps = devm_regulator_get_exclusive(pcdev->dev,
-						rdev_get_name(pcdev->pi[index].rdev));
+	psec->ps = regulator_get_exclusive(pcdev->dev,
+					   rdev_get_name(pcdev->pi[index].rdev));
 	if (IS_ERR(psec->ps)) {
 		ret = PTR_ERR(psec->ps);
 		goto put_module;

-- 
2.53.0


* [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
From: Corey Leavitt @ 2026-04-23  7:22 UTC (permalink / raw)
  Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt

On systems where a PSE controller driver loads as a module and a
device-tree PHY node carries a `pses = <&pse_pi>` reference,
fwnode_mdiobus_register_phy() tries to resolve the PSE handle before
the controller driver has probed. of_pse_control_get() returns
-EPROBE_DEFER, the enclosing MDIO/DSA probe fails, and driver-core
re-queues the work. The retry loop spins until the PSE driver module
loads and its controller registers.

Commit fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse
control") made each retry expensive. It reordered
fwnode_mdiobus_register_phy() so the PHY is registered before the
PSE lookup. Every deferral now performs a full
phy_device_register() / phy_device_remove() cycle. On a board with a
sufficiently tight watchdog the retry loop can starve the watchdog
kthread. On the reporting hardware (MT7621 + gpio-wdt, 1-second
margin) the retry loop converts a slow probe phase into a reset
before userspace loads.

The affected population today looks small. OpenWrt, where PSE
actually ships, is still on 6.12 (pre-regression), and most
environments with CONFIG_PSE_*=m do not have boards whose DT
references a PSE controller from a PHY. Still, the mechanism is
general. Any modular PSE driver combined with the documented
`pses = <&...>` binding reproduces the retry loop. Whether it
reaches brick-grade or merely slow/flaky boot depends on local
watchdog timing. More exposure is expected as distribution and
embedded kernels move to 6.13 and later.

The narrow fix would be to partially revert the ordering in
fa2f0454174c so each defer is cheap again. That keeps the same
architecture (fwnode_mdio holding PSE knowledge, -EPROBE_DEFER
flowing across the subsystem boundary), and any future reorder
reintroduces the same class of bug. This series takes the larger
fix: decouple PSE controller lookup from MDIO registration entirely.
pse_core now publishes a BLOCKING_NOTIFIER chain with REGISTERED
and UNREGISTERED events. phy_device subscribes, owns phydev->psec
lifetime, and attaches PSE handles in response to controller
lifecycle rather than during probe. fwnode_mdio loses its PSE
awareness, and -EPROBE_DEFER no longer flows out of fwnode_mdio.

Patch breakdown:

  1. Scope the pse_control regulator handle to kref lifetime
     (Fixes: d83e13761d5b). A latent bug that patch 4 makes
     reachable.
  2. Add the notifier chain (enum, head, register/unregister
     helpers). Pure infrastructure. No subscribers yet, no
     observable change.
  3. Fire REGISTERED and UNREGISTERED events from the controller
     register/unregister paths. Still no subscribers, still no
     observable change.
  4. Subscribe from the PHY layer, take ownership of phydev->psec
     via the notifier, and remove fwnode_find_pse_control() from
     fwnode_mdio.

Patch 1 is bundled here per stable-kernel-rules section 4
reachability guidance. On mainline today, with no notifier
subscriber, no caller drives the dangling regulator-handle sequence.
Patches 2 and 3 are deliberately split to separate "add
infrastructure" from "wire it up". Happy to fold them if maintainers
prefer the combined form.

Validated on a Cudy C200P (MT7621 + IP804AR) running an OpenWrt
build of 6.18.21 with the series applied. A lockdep build
(CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP) shows no splats
from the series' code paths during boot, PHY attach, PHY detach, or
a full controller unbind/rebind cycle. ethtool --set-pse drives all
four PoE-capable LAN ports, and a Ruckus H510 class-4 PD plugged
into lan3 negotiates and receives 48 V.

The C200P has no SFP cage, so the SFP path change in sfp.c
(phy_device_register -> phy_device_register_locked) isn't exercised
on the bench. Verified by call-graph audit: every path reaching
sfp_sm_probe_phy() holds rtnl at entry, via sfp_timeout,
sfp_check_state, sfp_probe, sfp_remove, or
sfp_bus_{add,del}_upstream.

Not addressed by this series: ethtool --show-pse returns "No data
available" on DSA netdevs in 6.18, because dev->phydev is NULL for
DSA-frontend netdevs and ethnl_req_get_phydev() therefore returns
NULL. That's a DSA / ethtool integration quirk that predates this
work.

Sending as RFC because this is my first net-next series. I'd
appreciate maintainer guidance on whether patch 1 should go to net
rather than net-next, and whether the patch 2/3 split is preferred
to the combined form.

Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
Corey Leavitt (4):
      net: pse-pd: scope pse_control regulator handle to kref lifetime
      net: pse-pd: add notifier chain for controller lifecycle events
      net: pse-pd: fire lifecycle events on controller register/unregister
      net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook

 drivers/net/mdio/fwnode_mdio.c |  34 ----------
 drivers/net/phy/phy_device.c   | 144 ++++++++++++++++++++++++++++++++++++++---
 drivers/net/phy/sfp.c          |   2 +-
 drivers/net/pse-pd/pse_core.c  |  60 ++++++++++++++++-
 include/linux/phy.h            |   2 +
 include/linux/pse-pd/pse.h     |  41 ++++++++++++
 6 files changed, 236 insertions(+), 47 deletions(-)
---
base-commit: 1f5ffc672165ff851063a5fd044b727ab2517ae3
change-id: 20260422-pse-notifier-decouple-efa80d77f4be

Best regards,
--  
Corey Leavitt <corey@leavitt.info>


* Re: [PATCH net v4 0/2] net: airoha: Fix airoha_qdma_cleanup_tx_queue() processing
From: patchwork-bot+netdevbpf @ 2026-04-23  7:20 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, horms,
	linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260417-airoha_qdma_cleanup_tx_queue-fix-net-v4-0-e04bcc2c9642@kernel.org>

Hello:

This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Fri, 17 Apr 2026 08:36:30 +0200 you wrote:
> Add missing bits in airoha_qdma_cleanup_tx_queue routine.
> Fix airoha_qdma_cleanup_tx_queue processing errors introduced in commit
> '3f47e67dff1f7 ("net: airoha: Add the capability to consume out-of-order
> DMA tx descriptors")'.
> 
> ---
> Changes in v4:
> - Drop patch 2/3 to move entries to queue head in case of DMA mapping
>   failure in airoha_dev_xmit().
> - Link to v3: https://lore.kernel.org/r/20260416-airoha_qdma_cleanup_tx_queue-fix-net-v3-0-2b69f5788580@kernel.org
> 
> [...]

Here is the summary with links:
  - [net,v4,1/2] net: airoha: Move ndesc initialization at end of airoha_qdma_init_tx()
    https://git.kernel.org/netdev/net/c/f329924bb494
  - [net,v4,2/2] net: airoha: Add missing bits in airoha_qdma_cleanup_tx_queue()
    https://git.kernel.org/netdev/net/c/3309965fe44c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net v2 10/15] drivers: net: cirrus: mac89x0: Remove this driver
From: John Paul Adrian Glaubitz @ 2026-04-23  7:10 UTC (permalink / raw)
  To: Geert Uytterhoeven, Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jonathan Corbet, Shuah Khan,
	Michael Fritscher, Byron Stanoszek, Daniel Palmer, linux-kernel,
	netdev, linux-doc, linux-m68k
In-Reply-To: <CAMuHMdV-vF6sTvAi8kKzxGwZ9YUSBO1Qta5PDCRbA0zr-LEp_w@mail.gmail.com>

Hi Geert,

On Thu, 2026-04-23 at 09:07 +0200, Geert Uytterhoeven wrote:
> CC linux-m68k
> 
> On Thu, 23 Apr 2026 at 02:29, Andrew Lunn <andrew@lunn.ch> wrote:
> > The mac89x0 was written by Russell Nelson in 1996. It is a Mac
> 
> It is based on the ISA cs89x0 driver, written by Russell Nelson.
> 
> > device, so unlikely to be used with modern kernels.
> 
> Macs do run modern kernels.

Retrocomputing still is not well regarded by some maintainers, it seems :-(.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913


* Re: [PATCH net v2 10/15] drivers: net: cirrus: mac89x0: Remove this driver
From: Geert Uytterhoeven @ 2026-04-23  7:07 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jonathan Corbet, Shuah Khan,
	Michael Fritscher, Byron Stanoszek, Daniel Palmer, linux-kernel,
	netdev, linux-doc, linux-m68k
In-Reply-To: <20260422-v7-0-0-net-next-driver-removal-v1-v2-10-08a5b59784d5@lunn.ch>

CC linux-m68k

On Thu, 23 Apr 2026 at 02:29, Andrew Lunn <andrew@lunn.ch> wrote:
> The mac89x0 was written by Russell Nelson in 1996. It is an MAC

It is based on the ISA cs89x0 driver, written by Russell Nelson.

> device, so unlikely to be used with modern kernels.

Macs do run modern kernels.

>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> ---
>  drivers/net/ethernet/cirrus/Kconfig   |  10 -
>  drivers/net/ethernet/cirrus/Makefile  |   1 -
>  drivers/net/ethernet/cirrus/cs89x0.h  | 461 ---------------------------
>  drivers/net/ethernet/cirrus/mac89x0.c | 577 ----------------------------------
>  4 files changed, 1049 deletions(-)
>
> diff --git a/drivers/net/ethernet/cirrus/Kconfig b/drivers/net/ethernet/cirrus/Kconfig
> index 1a0c7b3bfcd6..786d379e79fe 100644
> --- a/drivers/net/ethernet/cirrus/Kconfig
> +++ b/drivers/net/ethernet/cirrus/Kconfig
> @@ -25,14 +25,4 @@ config EP93XX_ETH
>           This is a driver for the ethernet hardware included in EP93xx CPUs.
>           Say Y if you are building a kernel for EP93xx based devices.
>
> -config MAC89x0
> -       tristate "Macintosh CS89x0 based ethernet cards"
> -       depends on MAC
> -       help
> -         Support for CS89x0 chipset based Ethernet cards.  If you have a
> -         Nubus or LC-PDS network (Ethernet) card of this type, say Y here.
> -
> -         To compile this driver as a module, choose M here. This module will
> -         be called mac89x0.
> -
>  endif # NET_VENDOR_CIRRUS
> diff --git a/drivers/net/ethernet/cirrus/Makefile b/drivers/net/ethernet/cirrus/Makefile
> index cb740939d976..03800af0f0e1 100644
> --- a/drivers/net/ethernet/cirrus/Makefile
> +++ b/drivers/net/ethernet/cirrus/Makefile
> @@ -4,4 +4,3 @@
>  #
>
>  obj-$(CONFIG_EP93XX_ETH) += ep93xx_eth.o
> -obj-$(CONFIG_MAC89x0) += mac89x0.o
> diff --git a/drivers/net/ethernet/cirrus/cs89x0.h b/drivers/net/ethernet/cirrus/cs89x0.h
> deleted file mode 100644
> index 210f9ec9af4b..000000000000
> --- a/drivers/net/ethernet/cirrus/cs89x0.h
> +++ /dev/null
> @@ -1,461 +0,0 @@
> -/*  Copyright, 1988-1992, Russell Nelson, Crynwr Software
> -
> -   This program is free software; you can redistribute it and/or modify
> -   it under the terms of the GNU General Public License as published by
> -   the Free Software Foundation, version 1.
> -
> -   This program is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> -   GNU General Public License for more details.
> -
> -   You should have received a copy of the GNU General Public License
> -   along with this program; if not, write to the Free Software
> -   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
> -   */
> -
> -
> -#define PP_ChipID 0x0000       /* offset   0h -> Corp -ID              */
> -                               /* offset   2h -> Model/Product Number  */
> -                               /* offset   3h -> Chip Revision Number  */
> -
> -#define PP_ISAIOB 0x0020       /*  IO base address */
> -#define PP_CS8900_ISAINT 0x0022        /*  ISA interrupt select */
> -#define PP_CS8920_ISAINT 0x0370        /*  ISA interrupt select */
> -#define PP_CS8900_ISADMA 0x0024        /*  ISA Rec DMA channel */
> -#define PP_CS8920_ISADMA 0x0374        /*  ISA Rec DMA channel */
> -#define PP_ISASOF 0x0026       /*  ISA DMA offset */
> -#define PP_DmaFrameCnt 0x0028  /*  ISA DMA Frame count */
> -#define PP_DmaByteCnt 0x002A   /*  ISA DMA Byte count */
> -#define PP_CS8900_ISAMemB 0x002C       /*  Memory base */
> -#define PP_CS8920_ISAMemB 0x0348 /*  */
> -
> -#define PP_ISABootBase 0x0030  /*  Boot Prom base  */
> -#define PP_ISABootMask 0x0034  /*  Boot Prom Mask */
> -
> -/* EEPROM data and command registers */
> -#define PP_EECMD 0x0040                /*  NVR Interface Command register */
> -#define PP_EEData 0x0042       /*  NVR Interface Data Register */
> -#define PP_DebugReg 0x0044     /*  Debug Register */
> -
> -#define PP_RxCFG 0x0102                /*  Rx Bus config */
> -#define PP_RxCTL 0x0104                /*  Receive Control Register */
> -#define PP_TxCFG 0x0106                /*  Transmit Config Register */
> -#define PP_TxCMD 0x0108                /*  Transmit Command Register */
> -#define PP_BufCFG 0x010A       /*  Bus configuration Register */
> -#define PP_LineCTL 0x0112      /*  Line Config Register */
> -#define PP_SelfCTL 0x0114      /*  Self Command Register */
> -#define PP_BusCTL 0x0116       /*  ISA bus control Register */
> -#define PP_TestCTL 0x0118      /*  Test Register */
> -#define PP_AutoNegCTL 0x011C   /*  Auto Negotiation Ctrl */
> -
> -#define PP_ISQ 0x0120          /*  Interrupt Status */
> -#define PP_RxEvent 0x0124      /*  Rx Event Register */
> -#define PP_TxEvent 0x0128      /*  Tx Event Register */
> -#define PP_BufEvent 0x012C     /*  Bus Event Register */
> -#define PP_RxMiss 0x0130       /*  Receive Miss Count */
> -#define PP_TxCol 0x0132                /*  Transmit Collision Count */
> -#define PP_LineST 0x0134       /*  Line State Register */
> -#define PP_SelfST 0x0136       /*  Self State register */
> -#define PP_BusST 0x0138                /*  Bus Status */
> -#define PP_TDR 0x013C          /*  Time Domain Reflectometry */
> -#define PP_AutoNegST 0x013E    /*  Auto Neg Status */
> -#define PP_TxCommand 0x0144    /*  Tx Command */
> -#define PP_TxLength 0x0146     /*  Tx Length */
> -#define PP_LAF 0x0150          /*  Hash Table */
> -#define PP_IA 0x0158           /*  Physical Address Register */
> -
> -#define PP_RxStatus 0x0400     /*  Receive start of frame */
> -#define PP_RxLength 0x0402     /*  Receive Length of frame */
> -#define PP_RxFrame 0x0404      /*  Receive frame pointer */
> -#define PP_TxFrame 0x0A00      /*  Transmit frame pointer */
> -
> -/*  Primary I/O Base Address. If no I/O base is supplied by the user, then this */
> -/*  can be used as the default I/O base to access the PacketPage Area. */
> -#define DEFAULTIOBASE 0x0300
> -#define FIRST_IO 0x020C                /*  First I/O port to check */
> -#define LAST_IO 0x037C         /*  Last I/O port to check (+10h) */
> -#define ADD_MASK 0x3000                /*  Mask it use of the ADD_PORT register */
> -#define ADD_SIG 0x3000         /*  Expected ID signature */
> -
> -/* On Macs, we only need use the ISA I/O stuff until we do MEMORY_ON */
> -#ifdef CONFIG_MAC
> -#define LCSLOTBASE 0xfee00000
> -#define MMIOBASE 0x40000
> -#endif
> -
> -#define CHIP_EISA_ID_SIG 0x630E   /*  Product ID Code for Crystal Chip (CS8900 spec 4.3) */
> -#define CHIP_EISA_ID_SIG_STR "0x630E"
> -
> -#ifdef IBMEIPKT
> -#define EISA_ID_SIG 0x4D24     /*  IBM */
> -#define PART_NO_SIG 0x1010     /*  IBM */
> -#define MONGOOSE_BIT 0x0000    /*  IBM */
> -#else
> -#define EISA_ID_SIG 0x630E     /*  PnP Vendor ID (same as chip id for Crystal board) */
> -#define PART_NO_SIG 0x4000     /*  ID code CS8920 board (PnP Vendor Product code) */
> -#define MONGOOSE_BIT 0x2000    /*  PART_NO_SIG + MONGOOSE_BUT => ID of mongoose */
> -#endif
> -
> -#define PRODUCT_ID_ADD 0x0002   /*  Address of product ID */
> -
> -/*  Mask to find out the types of  registers */
> -#define REG_TYPE_MASK 0x001F
> -
> -/*  Eeprom Commands */
> -#define ERSE_WR_ENBL 0x00F0
> -#define ERSE_WR_DISABLE 0x0000
> -
> -/*  Defines Control/Config register quintuplet numbers */
> -#define RX_BUF_CFG 0x0003
> -#define RX_CONTROL 0x0005
> -#define TX_CFG 0x0007
> -#define TX_COMMAND 0x0009
> -#define BUF_CFG 0x000B
> -#define LINE_CONTROL 0x0013
> -#define SELF_CONTROL 0x0015
> -#define BUS_CONTROL 0x0017
> -#define TEST_CONTROL 0x0019
> -
> -/*  Defines Status/Count registers quintuplet numbers */
> -#define RX_EVENT 0x0004
> -#define TX_EVENT 0x0008
> -#define BUF_EVENT 0x000C
> -#define RX_MISS_COUNT 0x0010
> -#define TX_COL_COUNT 0x0012
> -#define LINE_STATUS 0x0014
> -#define SELF_STATUS 0x0016
> -#define BUS_STATUS 0x0018
> -#define TDR 0x001C
> -
> -/* PP_RxCFG - Receive  Configuration and Interrupt Mask bit definition -  Read/write */
> -#define SKIP_1 0x0040
> -#define RX_STREAM_ENBL 0x0080
> -#define RX_OK_ENBL 0x0100
> -#define RX_DMA_ONLY 0x0200
> -#define AUTO_RX_DMA 0x0400
> -#define BUFFER_CRC 0x0800
> -#define RX_CRC_ERROR_ENBL 0x1000
> -#define RX_RUNT_ENBL 0x2000
> -#define RX_EXTRA_DATA_ENBL 0x4000
> -
> -/* PP_RxCTL - Receive Control bit definition - Read/write */
> -#define RX_IA_HASH_ACCEPT 0x0040
> -#define RX_PROM_ACCEPT 0x0080
> -#define RX_OK_ACCEPT 0x0100
> -#define RX_MULTCAST_ACCEPT 0x0200
> -#define RX_IA_ACCEPT 0x0400
> -#define RX_BROADCAST_ACCEPT 0x0800
> -#define RX_BAD_CRC_ACCEPT 0x1000
> -#define RX_RUNT_ACCEPT 0x2000
> -#define RX_EXTRA_DATA_ACCEPT 0x4000
> -#define RX_ALL_ACCEPT (RX_PROM_ACCEPT|RX_BAD_CRC_ACCEPT|RX_RUNT_ACCEPT|RX_EXTRA_DATA_ACCEPT)
> -/*  Default receive mode - individually addressed, broadcast, and error free */
> -#define DEF_RX_ACCEPT (RX_IA_ACCEPT | RX_BROADCAST_ACCEPT | RX_OK_ACCEPT)
> -
> -/* PP_TxCFG - Transmit Configuration Interrupt Mask bit definition - Read/write */
> -#define TX_LOST_CRS_ENBL 0x0040
> -#define TX_SQE_ERROR_ENBL 0x0080
> -#define TX_OK_ENBL 0x0100
> -#define TX_LATE_COL_ENBL 0x0200
> -#define TX_JBR_ENBL 0x0400
> -#define TX_ANY_COL_ENBL 0x0800
> -#define TX_16_COL_ENBL 0x8000
> -
> -/* PP_TxCMD - Transmit Command bit definition - Read-only */
> -#define TX_START_4_BYTES 0x0000
> -#define TX_START_64_BYTES 0x0040
> -#define TX_START_128_BYTES 0x0080
> -#define TX_START_ALL_BYTES 0x00C0
> -#define TX_FORCE 0x0100
> -#define TX_ONE_COL 0x0200
> -#define TX_TWO_PART_DEFF_DISABLE 0x0400
> -#define TX_NO_CRC 0x1000
> -#define TX_RUNT 0x2000
> -
> -/* PP_BufCFG - Buffer Configuration Interrupt Mask bit definition - Read/write */
> -#define GENERATE_SW_INTERRUPT 0x0040
> -#define RX_DMA_ENBL 0x0080
> -#define READY_FOR_TX_ENBL 0x0100
> -#define TX_UNDERRUN_ENBL 0x0200
> -#define RX_MISS_ENBL 0x0400
> -#define RX_128_BYTE_ENBL 0x0800
> -#define TX_COL_COUNT_OVRFLOW_ENBL 0x1000
> -#define RX_MISS_COUNT_OVRFLOW_ENBL 0x2000
> -#define RX_DEST_MATCH_ENBL 0x8000
> -
> -/* PP_LineCTL - Line Control bit definition - Read/write */
> -#define SERIAL_RX_ON 0x0040
> -#define SERIAL_TX_ON 0x0080
> -#define AUI_ONLY 0x0100
> -#define AUTO_AUI_10BASET 0x0200
> -#define MODIFIED_BACKOFF 0x0800
> -#define NO_AUTO_POLARITY 0x1000
> -#define TWO_PART_DEFDIS 0x2000
> -#define LOW_RX_SQUELCH 0x4000
> -
> -/* PP_SelfCTL - Software Self Control bit definition - Read/write */
> -#define POWER_ON_RESET 0x0040
> -#define SW_STOP 0x0100
> -#define SLEEP_ON 0x0200
> -#define AUTO_WAKEUP 0x0400
> -#define HCB0_ENBL 0x1000
> -#define HCB1_ENBL 0x2000
> -#define HCB0 0x4000
> -#define HCB1 0x8000
> -
> -/* PP_BusCTL - ISA Bus Control bit definition - Read/write */
> -#define RESET_RX_DMA 0x0040
> -#define MEMORY_ON 0x0400
> -#define DMA_BURST_MODE 0x0800
> -#define IO_CHANNEL_READY_ON 0x1000
> -#define RX_DMA_SIZE_64K 0x2000
> -#define ENABLE_IRQ 0x8000
> -
> -/* PP_TestCTL - Test Control bit definition - Read/write */
> -#define LINK_OFF 0x0080
> -#define ENDEC_LOOPBACK 0x0200
> -#define AUI_LOOPBACK 0x0400
> -#define BACKOFF_OFF 0x0800
> -#define FDX_8900 0x4000
> -#define FAST_TEST 0x8000
> -
> -/* PP_RxEvent - Receive Event Bit definition - Read-only */
> -#define RX_IA_HASHED 0x0040
> -#define RX_DRIBBLE 0x0080
> -#define RX_OK 0x0100
> -#define RX_HASHED 0x0200
> -#define RX_IA 0x0400
> -#define RX_BROADCAST 0x0800
> -#define RX_CRC_ERROR 0x1000
> -#define RX_RUNT 0x2000
> -#define RX_EXTRA_DATA 0x4000
> -
> -#define HASH_INDEX_MASK 0x0FC00
> -
> -/* PP_TxEvent - Transmit Event Bit definition - Read-only */
> -#define TX_LOST_CRS 0x0040
> -#define TX_SQE_ERROR 0x0080
> -#define TX_OK 0x0100
> -#define TX_LATE_COL 0x0200
> -#define TX_JBR 0x0400
> -#define TX_16_COL 0x8000
> -#define TX_SEND_OK_BITS (TX_OK|TX_LOST_CRS)
> -#define TX_COL_COUNT_MASK 0x7800
> -
> -/* PP_BufEvent - Buffer Event Bit definition - Read-only */
> -#define SW_INTERRUPT 0x0040
> -#define RX_DMA 0x0080
> -#define READY_FOR_TX 0x0100
> -#define TX_UNDERRUN 0x0200
> -#define RX_MISS 0x0400
> -#define RX_128_BYTE 0x0800
> -#define TX_COL_OVRFLW 0x1000
> -#define RX_MISS_OVRFLW 0x2000
> -#define RX_DEST_MATCH 0x8000
> -
> -/* PP_LineST - Ethernet Line Status bit definition - Read-only */
> -#define LINK_OK 0x0080
> -#define AUI_ON 0x0100
> -#define TENBASET_ON 0x0200
> -#define POLARITY_OK 0x1000
> -#define CRS_OK 0x4000
> -
> -/* PP_SelfST - Chip Software Status bit definition */
> -#define ACTIVE_33V 0x0040
> -#define INIT_DONE 0x0080
> -#define SI_BUSY 0x0100
> -#define EEPROM_PRESENT 0x0200
> -#define EEPROM_OK 0x0400
> -#define EL_PRESENT 0x0800
> -#define EE_SIZE_64 0x1000
> -
> -/* PP_BusST - ISA Bus Status bit definition */
> -#define TX_BID_ERROR 0x0080
> -#define READY_FOR_TX_NOW 0x0100
> -
> -/* PP_AutoNegCTL - Auto Negotiation Control bit definition */
> -#define RE_NEG_NOW 0x0040
> -#define ALLOW_FDX 0x0080
> -#define AUTO_NEG_ENABLE 0x0100
> -#define NLP_ENABLE 0x0200
> -#define FORCE_FDX 0x8000
> -#define AUTO_NEG_BITS (FORCE_FDX|NLP_ENABLE|AUTO_NEG_ENABLE)
> -#define AUTO_NEG_MASK (FORCE_FDX|NLP_ENABLE|AUTO_NEG_ENABLE|ALLOW_FDX|RE_NEG_NOW)
> -
> -/* PP_AutoNegST - Auto Negotiation Status bit definition */
> -#define AUTO_NEG_BUSY 0x0080
> -#define FLP_LINK 0x0100
> -#define FLP_LINK_GOOD 0x0800
> -#define LINK_FAULT 0x1000
> -#define HDX_ACTIVE 0x4000
> -#define FDX_ACTIVE 0x8000
> -
> -/*  The following block defines the ISQ event types */
> -#define ISQ_RECEIVER_EVENT 0x04
> -#define ISQ_TRANSMITTER_EVENT 0x08
> -#define ISQ_BUFFER_EVENT 0x0c
> -#define ISQ_RX_MISS_EVENT 0x10
> -#define ISQ_TX_COL_EVENT 0x12
> -
> -#define ISQ_EVENT_MASK 0x003F   /*  ISQ mask to find out type of event */
> -#define ISQ_HIST 16            /*  small history buffer */
> -#define AUTOINCREMENT 0x8000   /*  Bit mask to set bit-15 for autoincrement */
> -
> -#define TXRXBUFSIZE 0x0600
> -#define RXDMABUFSIZE 0x8000
> -#define RXDMASIZE 0x4000
> -#define TXRX_LENGTH_MASK 0x07FF
> -
> -/*  rx options bits */
> -#define RCV_WITH_RXON  1       /*  Set SerRx ON */
> -#define RCV_COUNTS     2       /*  Use Framecnt1 */
> -#define RCV_PONG       4       /*  Pong respondent */
> -#define RCV_DONG       8       /*  Dong operation */
> -#define RCV_POLLING    0x10    /*  Poll RxEvent */
> -#define RCV_ISQ                0x20    /*  Use ISQ, int */
> -#define RCV_AUTO_DMA   0x100   /*  Set AutoRxDMAE */
> -#define RCV_DMA                0x200   /*  Set RxDMA only */
> -#define RCV_DMA_ALL    0x400   /*  Copy all DMA'ed */
> -#define RCV_FIXED_DATA 0x800   /*  Every frame same */
> -#define RCV_IO         0x1000  /*  Use ISA IO only */
> -#define RCV_MEMORY     0x2000  /*  Use ISA Memory */
> -
> -#define RAM_SIZE       0x1000       /*  The card has 4k bytes or RAM */
> -#define PKT_START PP_TxFrame  /*  Start of packet RAM */
> -
> -#define RX_FRAME_PORT  0x0000
> -#define TX_FRAME_PORT RX_FRAME_PORT
> -#define TX_CMD_PORT    0x0004
> -#define TX_NOW         0x0000       /*  Tx packet after   5 bytes copied */
> -#define TX_AFTER_381   0x0040       /*  Tx packet after 381 bytes copied */
> -#define TX_AFTER_ALL   0x00c0       /*  Tx packet after all bytes copied */
> -#define TX_LEN_PORT    0x0006
> -#define ISQ_PORT       0x0008
> -#define ADD_PORT       0x000A
> -#define DATA_PORT      0x000C
> -
> -#define EEPROM_WRITE_EN                0x00F0
> -#define EEPROM_WRITE_DIS       0x0000
> -#define EEPROM_WRITE_CMD       0x0100
> -#define EEPROM_READ_CMD                0x0200
> -
> -/*  Receive Header */
> -/*  Description of header of each packet in receive area of memory */
> -#define RBUF_EVENT_LOW 0   /*  Low byte of RxEvent - status of received frame */
> -#define RBUF_EVENT_HIGH        1   /*  High byte of RxEvent - status of received frame */
> -#define RBUF_LEN_LOW   2   /*  Length of received data - low byte */
> -#define RBUF_LEN_HI    3   /*  Length of received data - high byte */
> -#define RBUF_HEAD_LEN  4   /*  Length of this header */
> -
> -#define CHIP_READ 0x1   /*  Used to mark state of the repins code (chip or dma) */
> -#define DMA_READ 0x2   /*  Used to mark state of the repins code (chip or dma) */
> -
> -/*  for bios scan */
> -/*  */
> -#ifdef CSDEBUG
> -/*  use these values for debugging bios scan */
> -#define BIOS_START_SEG 0x00000
> -#define BIOS_OFFSET_INC 0x0010
> -#else
> -#define BIOS_START_SEG 0x0c000
> -#define BIOS_OFFSET_INC 0x0200
> -#endif
> -
> -#define BIOS_LAST_OFFSET 0x0fc00
> -
> -/*  Byte offsets into the EEPROM configuration buffer */
> -#define ISA_CNF_OFFSET 0x6
> -#define TX_CTL_OFFSET (ISA_CNF_OFFSET + 8)                     /*  8900 eeprom */
> -#define AUTO_NEG_CNF_OFFSET (ISA_CNF_OFFSET + 8)               /*  8920 eeprom */
> -
> -  /*  the assumption here is that the bits in the eeprom are generally  */
> -  /*  in the same position as those in the autonegctl register. */
> -  /*  Of course the IMM bit is not in that register so it must be  */
> -  /*  masked out */
> -#define EE_FORCE_FDX  0x8000
> -#define EE_NLP_ENABLE 0x0200
> -#define EE_AUTO_NEG_ENABLE 0x0100
> -#define EE_ALLOW_FDX 0x0080
> -#define EE_AUTO_NEG_CNF_MASK (EE_FORCE_FDX|EE_NLP_ENABLE|EE_AUTO_NEG_ENABLE|EE_ALLOW_FDX)
> -
> -#define IMM_BIT 0x0040         /*  ignore missing media         */
> -
> -#define ADAPTER_CNF_OFFSET (AUTO_NEG_CNF_OFFSET + 2)
> -#define A_CNF_10B_T 0x0001
> -#define A_CNF_AUI 0x0002
> -#define A_CNF_10B_2 0x0004
> -#define A_CNF_MEDIA_TYPE 0x0070
> -#define A_CNF_MEDIA_AUTO 0x0070
> -#define A_CNF_MEDIA_10B_T 0x0020
> -#define A_CNF_MEDIA_AUI 0x0040
> -#define A_CNF_MEDIA_10B_2 0x0010
> -#define A_CNF_DC_DC_POLARITY 0x0080
> -#define A_CNF_NO_AUTO_POLARITY 0x2000
> -#define A_CNF_LOW_RX_SQUELCH 0x4000
> -#define A_CNF_EXTND_10B_2 0x8000
> -
> -#define PACKET_PAGE_OFFSET 0x8
> -
> -/*  Bit definitions for the ISA configuration word from the EEPROM */
> -#define INT_NO_MASK 0x000F
> -#define DMA_NO_MASK 0x0070
> -#define ISA_DMA_SIZE 0x0200
> -#define ISA_AUTO_RxDMA 0x0400
> -#define ISA_RxDMA 0x0800
> -#define DMA_BURST 0x1000
> -#define STREAM_TRANSFER 0x2000
> -#define ANY_ISA_DMA (ISA_AUTO_RxDMA | ISA_RxDMA)
> -
> -/*  DMA controller registers */
> -#define DMA_BASE 0x00     /*  DMA controller base */
> -#define DMA_BASE_2 0x0C0    /*  DMA controller base */
> -
> -#define DMA_STAT 0x0D0    /*  DMA controller status register */
> -#define DMA_MASK 0x0D4    /*  DMA controller mask register */
> -#define DMA_MODE 0x0D6    /*  DMA controller mode register */
> -#define DMA_RESETFF 0x0D8    /*  DMA controller first/last flip flop */
> -
> -/*  DMA data */
> -#define DMA_DISABLE 0x04     /*  Disable channel n */
> -#define DMA_ENABLE 0x00     /*  Enable channel n */
> -/*  Demand transfers, incr. address, auto init, writes, ch. n */
> -#define DMA_RX_MODE 0x14
> -/*  Demand transfers, incr. address, auto init, reads, ch. n */
> -#define DMA_TX_MODE 0x18
> -
> -#define DMA_SIZE (16*1024) /*  Size of dma buffer - 16k */
> -
> -#define CS8900 0x0000
> -#define CS8920 0x4000
> -#define CS8920M 0x6000
> -#define REVISON_BITS 0x1F00
> -#define EEVER_NUMBER 0x12
> -#define CHKSUM_LEN 0x14
> -#define CHKSUM_VAL 0x0000
> -#define START_EEPROM_DATA 0x001c /*  Offset into eeprom for start of data */
> -#define IRQ_MAP_EEPROM_DATA 0x0046 /*  Offset into eeprom for the IRQ map */
> -#define IRQ_MAP_LEN 0x0004 /*  No of bytes to read for the IRQ map */
> -#define PNP_IRQ_FRMT 0x0022 /*  PNP small item IRQ format */
> -#define CS8900_IRQ_MAP 0x1c20 /*  This IRQ map is fixed */
> -
> -#define CS8920_NO_INTS 0x0F   /*  Max CS8920 interrupt select # */
> -
> -#define PNP_ADD_PORT 0x0279
> -#define PNP_WRITE_PORT 0x0A79
> -
> -#define GET_PNP_ISA_STRUCT 0x40
> -#define PNP_ISA_STRUCT_LEN 0x06
> -#define PNP_CSN_CNT_OFF 0x01
> -#define PNP_RD_PORT_OFF 0x02
> -#define PNP_FUNCTION_OK 0x00
> -#define PNP_WAKE 0x03
> -#define PNP_RSRC_DATA 0x04
> -#define PNP_RSRC_READY 0x01
> -#define PNP_STATUS 0x05
> -#define PNP_ACTIVATE 0x30
> -#define PNP_CNF_IO_H 0x60
> -#define PNP_CNF_IO_L 0x61
> -#define PNP_CNF_INT 0x70
> -#define PNP_CNF_DMA 0x74
> -#define PNP_CNF_MEM 0x48
> diff --git a/drivers/net/ethernet/cirrus/mac89x0.c b/drivers/net/ethernet/cirrus/mac89x0.c
> deleted file mode 100644
> index 6723df9b65d9..000000000000
> --- a/drivers/net/ethernet/cirrus/mac89x0.c
> +++ /dev/null
> @@ -1,577 +0,0 @@
> -/* mac89x0.c: A Crystal Semiconductor CS89[02]0 driver for linux. */
> -/*
> -       Written 1996 by Russell Nelson, with reference to skeleton.c
> -       written 1993-1994 by Donald Becker.
> -
> -       This software may be used and distributed according to the terms
> -       of the GNU General Public License, incorporated herein by reference.
> -
> -       The author may be reached at nelson@crynwr.com, Crynwr
> -       Software, 11 Grant St., Potsdam, NY 13676
> -
> -  Changelog:
> -
> -  Mike Cruse        : mcruse@cti-ltd.com
> -                    : Changes for Linux 2.0 compatibility.
> -                    : Added dev_id parameter in net_interrupt(),
> -                    : request_irq() and free_irq(). Just NULL for now.
> -
> -  Mike Cruse        : Added MOD_INC_USE_COUNT and MOD_DEC_USE_COUNT macros
> -                    : in net_open() and net_close() so kerneld would know
> -                    : that the module is in use and wouldn't eject the
> -                    : driver prematurely.
> -
> -  Mike Cruse        : Rewrote init_module() and cleanup_module using 8390.c
> -                    : as an example. Disabled autoprobing in init_module(),
> -                    : not a good thing to do to other devices while Linux
> -                    : is running from all accounts.
> -
> -  Alan Cox          : Removed 1.2 support, added 2.1 extra counters.
> -
> -  David Huggins-Daines <dhd@debian.org>
> -
> -  Split this off into mac89x0.c, and gutted it of all parts which are
> -  not relevant to the existing CS8900 cards on the Macintosh
> -  (i.e. basically the Daynaport CS and LC cards).  To be precise:
> -
> -    * Removed all the media-detection stuff, because these cards are
> -    TP-only.
> -
> -    * Lobotomized the ISA interrupt bogosity, because these cards use
> -    a hardwired NuBus interrupt and a magic ISAIRQ value in the card.
> -
> -    * Basically eliminated everything not relevant to getting the
> -    cards minimally functioning on the Macintosh.
> -
> -  I might add that these cards are badly designed even from the Mac
> -  standpoint, in that Dayna, in their infinite wisdom, used NuBus slot
> -  I/O space and NuBus interrupts for these cards, but neglected to
> -  provide anything even remotely resembling a NuBus ROM.  Therefore we
> -  have to probe for them in a brain-damaged ISA-like fashion.
> -
> -  Arnaldo Carvalho de Melo <acme@conectiva.com.br> - 11/01/2001
> -  check kmalloc and release the allocated memory on failure in
> -  mac89x0_probe and in init_module
> -  use local_irq_{save,restore}(flags) in net_get_stat, not just
> -  local_irq_{dis,en}able()
> -*/
> -
> -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> -
> -static const char version[] =
> -"cs89x0.c:v1.02 11/26/96 Russell Nelson <nelson@crynwr.com>\n";
> -
> -#include <linux/module.h>
> -
> -/*
> -  Sources:
> -
> -       Crynwr packet driver epktisa.
> -
> -       Crystal Semiconductor data sheets.
> -
> -*/
> -
> -#include <linux/kernel.h>
> -#include <linux/types.h>
> -#include <linux/fcntl.h>
> -#include <linux/interrupt.h>
> -#include <linux/ioport.h>
> -#include <linux/in.h>
> -#include <linux/string.h>
> -#include <linux/nubus.h>
> -#include <linux/errno.h>
> -#include <linux/init.h>
> -#include <linux/netdevice.h>
> -#include <linux/platform_device.h>
> -#include <linux/etherdevice.h>
> -#include <linux/skbuff.h>
> -#include <linux/delay.h>
> -#include <linux/bitops.h>
> -#include <linux/gfp.h>
> -
> -#include <asm/io.h>
> -#include <asm/hwtest.h>
> -#include <asm/macints.h>
> -
> -#include "cs89x0.h"
> -
> -static int debug = -1;
> -module_param(debug, int, 0);
> -MODULE_PARM_DESC(debug, "debug message level");
> -
> -/* Information that need to be kept for each board. */
> -struct net_local {
> -       int msg_enable;
> -       int chip_type;          /* one of: CS8900, CS8920, CS8920M */
> -       char chip_revision;     /* revision letter of the chip ('A'...) */
> -       int send_cmd;           /* the propercommand used to send a packet. */
> -       int rx_mode;
> -       int curr_rx_cfg;
> -        int send_underrun;      /* keep track of how many underruns in a row we get */
> -};
> -
> -/* Index to functions, as function prototypes. */
> -static int net_open(struct net_device *dev);
> -static netdev_tx_t net_send_packet(struct sk_buff *skb, struct net_device *dev);
> -static irqreturn_t net_interrupt(int irq, void *dev_id);
> -static void set_multicast_list(struct net_device *dev);
> -static void net_rx(struct net_device *dev);
> -static int net_close(struct net_device *dev);
> -static struct net_device_stats *net_get_stats(struct net_device *dev);
> -static int set_mac_address(struct net_device *dev, void *addr);
> -
> -/* For reading/writing registers ISA-style */
> -static inline int
> -readreg_io(struct net_device *dev, int portno)
> -{
> -       nubus_writew(swab16(portno), dev->base_addr + ADD_PORT);
> -       return swab16(nubus_readw(dev->base_addr + DATA_PORT));
> -}
> -
> -static inline void
> -writereg_io(struct net_device *dev, int portno, int value)
> -{
> -       nubus_writew(swab16(portno), dev->base_addr + ADD_PORT);
> -       nubus_writew(swab16(value), dev->base_addr + DATA_PORT);
> -}
> -
> -/* These are for reading/writing registers in shared memory */
> -static inline int
> -readreg(struct net_device *dev, int portno)
> -{
> -       return swab16(nubus_readw(dev->mem_start + portno));
> -}
> -
> -static inline void
> -writereg(struct net_device *dev, int portno, int value)
> -{
> -       nubus_writew(swab16(value), dev->mem_start + portno);
> -}
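
For anyone reading along without the data sheet: the helpers quoted above
reach a register either through the ADD_PORT/DATA_PORT pair (write the
register offset to ADD_PORT, then move the value through DATA_PORT) or
directly in shared memory, and wrap every 16-bit access in swab16(). Below
is a minimal user-space sketch of the port-based variant only. The
fake_chip structure and the fake_writew()/fake_readw() helpers are
hypothetical stand-ins for the hardware and for nubus_writew()/
nubus_readw(), and the assumption that every bus access arrives
byte-swapped is mine, inferred from the swab16() calls; it is not taken
from the data sheet.

```c
#include <stdint.h>

/* Offsets and signature copied from the quoted cs89x0.h. */
#define ADD_PORT         0x000A
#define DATA_PORT        0x000C
#define PP_ChipID        0x0000
#define CHIP_EISA_ID_SIG 0x630E

/* Hypothetical stand-in for the chip: an address latch plus a few
 * PacketPage words, both held in the chip's own byte order. */
struct fake_chip {
	uint16_t latch;
	uint16_t packet_page[16];
};

static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Assumed bus model: every 16-bit access arrives byte-swapped. */
static void fake_writew(struct fake_chip *c, uint16_t v, unsigned int port)
{
	if (port == ADD_PORT)
		c->latch = swab16(v);
	else if (port == DATA_PORT)
		c->packet_page[(c->latch / 2) % 16] = swab16(v);
}

static uint16_t fake_readw(const struct fake_chip *c, unsigned int port)
{
	if (port == DATA_PORT)
		return swab16(c->packet_page[(c->latch / 2) % 16]);
	return 0;
}

/* Same shape as readreg_io()/writereg_io() in the quoted driver. */
static uint16_t readreg_io(struct fake_chip *c, uint16_t portno)
{
	fake_writew(c, swab16(portno), ADD_PORT);
	return swab16(fake_readw(c, DATA_PORT));
}

static void writereg_io(struct fake_chip *c, uint16_t portno, uint16_t value)
{
	fake_writew(c, swab16(portno), ADD_PORT);
	fake_writew(c, swab16(value), DATA_PORT);
}
```

Under this model a raw read of DATA_PORT with the latch at 0 yields the
swapped signature, which would explain why the probe code compares against
swab16(CHIP_EISA_ID_SIG) instead of the plain constant.
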
> -
> -static const struct net_device_ops mac89x0_netdev_ops = {
> -       .ndo_open               = net_open,
> -       .ndo_stop               = net_close,
> -       .ndo_start_xmit         = net_send_packet,
> -       .ndo_get_stats          = net_get_stats,
> -       .ndo_set_rx_mode        = set_multicast_list,
> -       .ndo_set_mac_address    = set_mac_address,
> -       .ndo_validate_addr      = eth_validate_addr,
> -};
> -
> -/* Probe for the CS8900 card in slot E.  We won't bother looking
> -   anywhere else until we have a really good reason to do so. */
> -static int mac89x0_device_probe(struct platform_device *pdev)
> -{
> -       struct net_device *dev;
> -       struct net_local *lp;
> -       int i, slot;
> -       unsigned rev_type = 0;
> -       unsigned long ioaddr;
> -       unsigned short sig;
> -       int err = -ENODEV;
> -       struct nubus_rsrc *fres;
> -
> -       dev = alloc_etherdev(sizeof(struct net_local));
> -       if (!dev)
> -               return -ENOMEM;
> -
> -       /* We might have to parameterize this later */
> -       slot = 0xE;
> -       /* Get out now if there's a real NuBus card in slot E */
> -       for_each_func_rsrc(fres)
> -               if (fres->board->slot == slot)
> -                       goto out;
> -
> -       /* The pseudo-ISA bits always live at offset 0x300 (gee,
> -           wonder why...) */
> -       ioaddr = (unsigned long)
> -               nubus_slot_addr(slot) | (((slot&0xf) << 20) + DEFAULTIOBASE);
> -       {
> -               int card_present;
> -
> -               card_present = (hwreg_present((void *)ioaddr + 4) &&
> -                               hwreg_present((void *)ioaddr + DATA_PORT));
> -               if (!card_present)
> -                       goto out;
> -       }
> -
> -       nubus_writew(0, ioaddr + ADD_PORT);
> -       sig = nubus_readw(ioaddr + DATA_PORT);
> -       if (sig != swab16(CHIP_EISA_ID_SIG))
> -               goto out;
> -
> -       SET_NETDEV_DEV(dev, &pdev->dev);
> -
> -       /* Initialize the net_device structure. */
> -       lp = netdev_priv(dev);
> -
> -       lp->msg_enable = netif_msg_init(debug, 0);
> -
> -       /* Fill in the 'dev' fields. */
> -       dev->base_addr = ioaddr;
> -       dev->mem_start = (unsigned long)
> -               nubus_slot_addr(slot) | (((slot&0xf) << 20) + MMIOBASE);
> -       dev->mem_end = dev->mem_start + 0x1000;
> -
> -       /* Turn on shared memory */
> -       writereg_io(dev, PP_BusCTL, MEMORY_ON);
> -
> -       /* get the chip type */
> -       rev_type = readreg(dev, PRODUCT_ID_ADD);
> -       lp->chip_type = rev_type &~ REVISON_BITS;
> -       lp->chip_revision = ((rev_type & REVISON_BITS) >> 8) + 'A';
> -
> -       /* Check the chip type and revision in order to set the correct send command
> -       CS8920 revision C and CS8900 revision F can use the faster send. */
> -       lp->send_cmd = TX_AFTER_381;
> -       if (lp->chip_type == CS8900 && lp->chip_revision >= 'F')
> -               lp->send_cmd = TX_NOW;
> -       if (lp->chip_type != CS8900 && lp->chip_revision >= 'C')
> -               lp->send_cmd = TX_NOW;
> -
> -       netif_dbg(lp, drv, dev, "%s", version);
> -
> -       pr_info("cs89%c0%s rev %c found at %#8lx\n",
> -               lp->chip_type == CS8900 ? '0' : '2',
> -               lp->chip_type == CS8920M ? "M" : "",
> -               lp->chip_revision, dev->base_addr);
> -
> -       /* Try to read the MAC address */
> -       if ((readreg(dev, PP_SelfST) & (EEPROM_PRESENT | EEPROM_OK)) == 0) {
> -               pr_info("No EEPROM, giving up now.\n");
> -               goto out1;
> -        } else {
> -               u8 addr[ETH_ALEN];
> -
> -                for (i = 0; i < ETH_ALEN; i += 2) {
> -                       /* Big-endian (why??!) */
> -                       unsigned short s = readreg(dev, PP_IA + i);
> -                       addr[i] = s >> 8;
> -                       addr[i+1] = s & 0xff;
> -                }
> -               eth_hw_addr_set(dev, addr);
> -        }
> -
> -       dev->irq = SLOT2IRQ(slot);
> -
> -       /* print the IRQ and ethernet address. */
> -
> -       pr_info("MAC %pM, IRQ %d\n", dev->dev_addr, dev->irq);
> -
> -       dev->netdev_ops         = &mac89x0_netdev_ops;
> -
> -       err = register_netdev(dev);
> -       if (err)
> -               goto out1;
> -
> -       platform_set_drvdata(pdev, dev);
> -       return 0;
> -out1:
> -       nubus_writew(0, dev->base_addr + ADD_PORT);
> -out:
> -       free_netdev(dev);
> -       return err;
> -}
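
The chip-type and revision handling in the probe routine above is easy to
misread, so here is a small stand-alone sketch of just that step. The
constants are copied from the quoted header (including the REVISON_BITS
spelling). The chip_info structure and decode_product_id() are
hypothetical names introduced for illustration, and any input words fed to
it are made up to exercise the branches; they are not values read from
real hardware.

```c
#include <stdint.h>

/* Constants copied from the quoted cs89x0.h. */
#define CS8900       0x0000
#define CS8920       0x4000
#define CS8920M      0x6000
#define REVISON_BITS 0x1F00
#define TX_NOW       0x0000
#define TX_AFTER_381 0x0040

/* Hypothetical container for the three fields the probe code fills in. */
struct chip_info {
	int chip_type;
	char chip_revision;
	int send_cmd;
};

/* Mirrors the decode in mac89x0_device_probe(): the revision sits in
 * bits 8..12 and is turned into a letter starting at 'A'; the remaining
 * bits identify the chip. */
static struct chip_info decode_product_id(unsigned int rev_type)
{
	struct chip_info ci;

	ci.chip_type = (int)(rev_type & ~(unsigned int)REVISON_BITS);
	ci.chip_revision = (char)(((rev_type & REVISON_BITS) >> 8) + 'A');

	/* Default to the conservative command; newer silicon may start
	 * transmitting right away. */
	ci.send_cmd = TX_AFTER_381;
	if (ci.chip_type == CS8900 && ci.chip_revision >= 'F')
		ci.send_cmd = TX_NOW;
	if (ci.chip_type != CS8900 && ci.chip_revision >= 'C')
		ci.send_cmd = TX_NOW;

	return ci;
}
```

With this sketch, an input of 0x0500 decodes as a CS8900 revision 'F' and
selects TX_NOW, while 0x0400 (revision 'E') stays on TX_AFTER_381.
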
> -
> -/* Open/initialize the board.  This is called (in the current kernel)
> -   sometime after booting when the 'ifconfig' program is run.
> -
> -   This routine should set everything up anew at each open, even
> -   registers that "should" only need to be set once at boot, so that
> -   there is non-reboot way to recover if something goes wrong.
> -   */
> -static int
> -net_open(struct net_device *dev)
> -{
> -       struct net_local *lp = netdev_priv(dev);
> -       int i;
> -
> -       /* Disable the interrupt for now */
> -       writereg(dev, PP_BusCTL, readreg(dev, PP_BusCTL) & ~ENABLE_IRQ);
> -
> -       /* Grab the interrupt */
> -       if (request_irq(dev->irq, net_interrupt, 0, "cs89x0", dev))
> -               return -EAGAIN;
> -
> -       /* Set up the IRQ - Apparently magic */
> -       if (lp->chip_type == CS8900)
> -               writereg(dev, PP_CS8900_ISAINT, 0);
> -       else
> -               writereg(dev, PP_CS8920_ISAINT, 0);
> -
> -       /* set the Ethernet address */
> -       for (i=0; i < ETH_ALEN/2; i++)
> -               writereg(dev, PP_IA+i*2, dev->dev_addr[i*2] | (dev->dev_addr[i*2+1] << 8));
> -
> -       /* Turn on both receive and transmit operations */
> -       writereg(dev, PP_LineCTL, readreg(dev, PP_LineCTL) | SERIAL_RX_ON | SERIAL_TX_ON);
> -
> -       /* Receive only error free packets addressed to this card */
> -       lp->rx_mode = 0;
> -       writereg(dev, PP_RxCTL, DEF_RX_ACCEPT);
> -
> -       lp->curr_rx_cfg = RX_OK_ENBL | RX_CRC_ERROR_ENBL;
> -
> -       writereg(dev, PP_RxCFG, lp->curr_rx_cfg);
> -
> -       writereg(dev, PP_TxCFG, TX_LOST_CRS_ENBL | TX_SQE_ERROR_ENBL | TX_OK_ENBL |
> -              TX_LATE_COL_ENBL | TX_JBR_ENBL | TX_ANY_COL_ENBL | TX_16_COL_ENBL);
> -
> -       writereg(dev, PP_BufCFG, READY_FOR_TX_ENBL | RX_MISS_COUNT_OVRFLOW_ENBL |
> -                TX_COL_COUNT_OVRFLOW_ENBL | TX_UNDERRUN_ENBL);
> -
> -       /* now that we've got our act together, enable everything */
> -       writereg(dev, PP_BusCTL, readreg(dev, PP_BusCTL) | ENABLE_IRQ);
> -       netif_start_queue(dev);
> -       return 0;
> -}
> -
> -static netdev_tx_t
> -net_send_packet(struct sk_buff *skb, struct net_device *dev)
> -{
> -       struct net_local *lp = netdev_priv(dev);
> -       unsigned long flags;
> -
> -       netif_dbg(lp, tx_queued, dev, "sent %d byte packet of type %x\n",
> -                 skb->len, skb->data[ETH_ALEN + ETH_ALEN] << 8 |
> -                 skb->data[ETH_ALEN + ETH_ALEN + 1]);
> -
> -       /* keep the upload from being interrupted, since we
> -          ask the chip to start transmitting before the
> -          whole packet has been completely uploaded. */
> -       local_irq_save(flags);
> -       netif_stop_queue(dev);
> -
> -       /* initiate a transmit sequence */
> -       writereg(dev, PP_TxCMD, lp->send_cmd);
> -       writereg(dev, PP_TxLength, skb->len);
> -
> -       /* Test to see if the chip has allocated memory for the packet */
> -       if ((readreg(dev, PP_BusST) & READY_FOR_TX_NOW) == 0) {
> -               /* Gasp!  It hasn't.  But that shouldn't happen since
> -                  we're waiting for TxOk, so return 1 and requeue this packet. */
> -               local_irq_restore(flags);
> -               return NETDEV_TX_BUSY;
> -       }
> -
> -       /* Write the contents of the packet */
> -       skb_copy_from_linear_data(skb, (void *)(dev->mem_start + PP_TxFrame),
> -                                 skb->len+1);
> -
> -       local_irq_restore(flags);
> -       dev_kfree_skb (skb);
> -
> -       return NETDEV_TX_OK;
> -}
> -
> -/* The typical workload of the driver:
> -   Handle the network interface interrupts. */
> -static irqreturn_t net_interrupt(int irq, void *dev_id)
> -{
> -       struct net_device *dev = dev_id;
> -       struct net_local *lp;
> -       int ioaddr, status;
> -
> -       ioaddr = dev->base_addr;
> -       lp = netdev_priv(dev);
> -
> -       /* we MUST read all the events out of the ISQ, otherwise we'll never
> -           get interrupted again.  As a consequence, we can't have any limit
> -           on the number of times we loop in the interrupt handler.  The
> -           hardware guarantees that eventually we'll run out of events.  Of
> -           course, if you're on a slow machine, and packets are arriving
> -           faster than you can read them off, you're screwed.  Hasta la
> -           vista, baby!  */
> -       while ((status = swab16(nubus_readw(dev->base_addr + ISQ_PORT)))) {
> -               netif_dbg(lp, intr, dev, "status=%04x\n", status);
> -               switch(status & ISQ_EVENT_MASK) {
> -               case ISQ_RECEIVER_EVENT:
> -                       /* Got a packet(s). */
> -                       net_rx(dev);
> -                       break;
> -               case ISQ_TRANSMITTER_EVENT:
> -                       dev->stats.tx_packets++;
> -                       netif_wake_queue(dev);
> -                       if ((status & TX_OK) == 0)
> -                               dev->stats.tx_errors++;
> -                       if (status & TX_LOST_CRS)
> -                               dev->stats.tx_carrier_errors++;
> -                       if (status & TX_SQE_ERROR)
> -                               dev->stats.tx_heartbeat_errors++;
> -                       if (status & TX_LATE_COL)
> -                               dev->stats.tx_window_errors++;
> -                       if (status & TX_16_COL)
> -                               dev->stats.tx_aborted_errors++;
> -                       break;
> -               case ISQ_BUFFER_EVENT:
> -                       if (status & READY_FOR_TX) {
> -                               /* we tried to transmit a packet earlier,
> -                                   but inexplicably ran out of buffers.
> -                                   That shouldn't happen since we only ever
> -                                   load one packet.  Shrug.  Do the right
> -                                   thing anyway. */
> -                               netif_wake_queue(dev);
> -                       }
> -                       if (status & TX_UNDERRUN) {
> -                               netif_dbg(lp, tx_err, dev, "transmit underrun\n");
> -                                lp->send_underrun++;
> -                                if (lp->send_underrun == 3) lp->send_cmd = TX_AFTER_381;
> -                                else if (lp->send_underrun == 6) lp->send_cmd = TX_AFTER_ALL;
> -                        }
> -                       break;
> -               case ISQ_RX_MISS_EVENT:
> -                       dev->stats.rx_missed_errors += (status >> 6);
> -                       break;
> -               case ISQ_TX_COL_EVENT:
> -                       dev->stats.collisions += (status >> 6);
> -                       break;
> -               }
> -       }
> -       return IRQ_HANDLED;
> -}
> -
> -/* We have a good packet(s), get it/them out of the buffers. */
> -static void
> -net_rx(struct net_device *dev)
> -{
> -       struct net_local *lp = netdev_priv(dev);
> -       struct sk_buff *skb;
> -       int status, length;
> -
> -       status = readreg(dev, PP_RxStatus);
> -       if ((status & RX_OK) == 0) {
> -               dev->stats.rx_errors++;
> -               if (status & RX_RUNT)
> -                               dev->stats.rx_length_errors++;
> -               if (status & RX_EXTRA_DATA)
> -                               dev->stats.rx_length_errors++;
> -               if ((status & RX_CRC_ERROR) &&
> -                   !(status & (RX_EXTRA_DATA|RX_RUNT)))
> -                       /* per str 172 */
> -                       dev->stats.rx_crc_errors++;
> -               if (status & RX_DRIBBLE)
> -                               dev->stats.rx_frame_errors++;
> -               return;
> -       }
> -
> -       length = readreg(dev, PP_RxLength);
> -       /* Malloc up new buffer. */
> -       skb = alloc_skb(length, GFP_ATOMIC);
> -       if (skb == NULL) {
> -               dev->stats.rx_dropped++;
> -               return;
> -       }
> -       skb_put(skb, length);
> -
> -       skb_copy_to_linear_data(skb, (void *)(dev->mem_start + PP_RxFrame),
> -                               length);
> -
> -       netif_dbg(lp, rx_status, dev, "received %d byte packet of type %x\n",
> -                 length, skb->data[ETH_ALEN + ETH_ALEN] << 8 |
> -                 skb->data[ETH_ALEN + ETH_ALEN + 1]);
> -
> -        skb->protocol=eth_type_trans(skb,dev);
> -       netif_rx(skb);
> -       dev->stats.rx_packets++;
> -       dev->stats.rx_bytes += length;
> -}
> -
> -/* The inverse routine to net_open(). */
> -static int
> -net_close(struct net_device *dev)
> -{
> -
> -       writereg(dev, PP_RxCFG, 0);
> -       writereg(dev, PP_TxCFG, 0);
> -       writereg(dev, PP_BufCFG, 0);
> -       writereg(dev, PP_BusCTL, 0);
> -
> -       netif_stop_queue(dev);
> -
> -       free_irq(dev->irq, dev);
> -
> -       /* Update the statistics here. */
> -
> -       return 0;
> -
> -}
> -
> -/* Get the current statistics. This may be called with the card open or
> -   closed. */
> -static struct net_device_stats *
> -net_get_stats(struct net_device *dev)
> -{
> -       unsigned long flags;
> -
> -       local_irq_save(flags);
> -       /* Update the statistics from the device registers. */
> -       dev->stats.rx_missed_errors += (readreg(dev, PP_RxMiss) >> 6);
> -       dev->stats.collisions += (readreg(dev, PP_TxCol) >> 6);
> -       local_irq_restore(flags);
> -
> -       return &dev->stats;
> -}
> -
> -static void set_multicast_list(struct net_device *dev)
> -{
> -       struct net_local *lp = netdev_priv(dev);
> -
> -       if(dev->flags&IFF_PROMISC)
> -       {
> -               lp->rx_mode = RX_ALL_ACCEPT;
> -       } else if ((dev->flags & IFF_ALLMULTI) || !netdev_mc_empty(dev)) {
> -               /* The multicast-accept list is initialized to accept-all, and we
> -                  rely on higher-level filtering for now. */
> -               lp->rx_mode = RX_MULTCAST_ACCEPT;
> -       }
> -       else
> -               lp->rx_mode = 0;
> -
> -       writereg(dev, PP_RxCTL, DEF_RX_ACCEPT | lp->rx_mode);
> -
> -       /* in promiscuous mode, we accept errored packets, so we have to enable interrupts on them also */
> -       writereg(dev, PP_RxCFG, lp->curr_rx_cfg |
> -            (lp->rx_mode == RX_ALL_ACCEPT? (RX_CRC_ERROR_ENBL|RX_RUNT_ENBL|RX_EXTRA_DATA_ENBL) : 0));
> -}
> -
> -
> -static int set_mac_address(struct net_device *dev, void *addr)
> -{
> -       struct sockaddr *saddr = addr;
> -       int i;
> -
> -       if (!is_valid_ether_addr(saddr->sa_data))
> -               return -EADDRNOTAVAIL;
> -
> -       eth_hw_addr_set(dev, saddr->sa_data);
> -       netdev_info(dev, "Setting MAC address to %pM\n", dev->dev_addr);
> -
> -       /* set the Ethernet address */
> -       for (i=0; i < ETH_ALEN/2; i++)
> -               writereg(dev, PP_IA+i*2, dev->dev_addr[i*2] | (dev->dev_addr[i*2+1] << 8));
> -
> -       return 0;
> -}
> -
> -MODULE_DESCRIPTION("Macintosh CS89x0-based Ethernet driver");
> -MODULE_LICENSE("GPL");
> -
> -static void mac89x0_device_remove(struct platform_device *pdev)
> -{
> -       struct net_device *dev = platform_get_drvdata(pdev);
> -
> -       unregister_netdev(dev);
> -       nubus_writew(0, dev->base_addr + ADD_PORT);
> -       free_netdev(dev);
> -}
> -
> -static struct platform_driver mac89x0_platform_driver = {
> -       .probe = mac89x0_device_probe,
> -       .remove = mac89x0_device_remove,
> -       .driver = {
> -               .name = "mac89x0",
> -       },
> -};
> -
> -module_platform_driver(mac89x0_platform_driver);
>
> --
> 2.53.0
