Netdev List


* Re: [PATCH net-deletions v2] net: remove unused ATM protocols and legacy ATM device drivers
From: Andy Shevchenko @ 2026-04-23  7:53 UTC (permalink / raw)
  To: Philip Prindeville
  Cc: David Woodhouse, Jakub Kicinski, davem, openwrt-devel, Guy Ellis,
	netdev, edumazet, pabeni, andrew+netdev, horms, corbet, skhan,
	linux, tsbogend, maddy, mpe, npiggin, chleroy, 3chas3, razor,
	idosch, jani.nikula, mchehab+huawei, tytso, herbert, geert,
	ebiggers, johannes.berg, jonathan.cameron, kees, kuniyu,
	fourier.thomas, rdunlap, akpm, linux-doc, linux-mips,
	linuxppc-dev, bridge
In-Reply-To: <68316F0B-2442-4492-A041-E57EFC58AC08@redfish-solutions.com>

On Wed, Apr 22, 2026 at 08:41:27PM -0600, Philip Prindeville wrote:
> > On Apr 22, 2026, at 7:05 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> > On Tue, 2026-04-21 at 21:18 -0700, Jakub Kicinski wrote:

...

> >>    I'm still deleting the solos driver, chances are nobody uses it.
> >>    Easy enough to revert back in since core is still around.
> >>    The guiding principle is to keep USB modems and delete
> >>    the rest as USB ADSL2+ CPEs were most popular historically.
> > 
> > Still not entirely convinced; I worked on both USB ATM modems and on
> > Solos, and the Solos is both the most modern and the only one I still
> > actually have. And the only one we have native support for that could
> > ever do full 24Mb/s ADSL2+, I believe.
> > 
> > If we drop it, OpenWrt will need to drop support for these, which I
> > think were quite popular at the time; there were a few UK resellers:
> > https://openwrt.org/toh/traverse/geos1_1
> > 
> > I still don't actually care *enough* to try to find an ADSL line I
> > could plug one into for testing though... :)
> 
> I have 3 boards lying around if anyone wants them.

The problem, as I understand it, is finding someone willing to maintain
and support that driver while doing regular testing...

-- 
With Best Regards,
Andy Shevchenko




* Re: [PATCH net v2] hv_sock: Return -EIO for malformed/short packets
From: Stefano Garzarella @ 2026-04-23  7:55 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: kys, haiyangz, wei.liu, longli, davem, edumazet, kuba, pabeni,
	horms, niuxuewei.nxw, linux-hyperv, virtualization, netdev,
	linux-kernel, stable
In-Reply-To: <20260423064811.1371749-1-decui@microsoft.com>

On Wed, Apr 22, 2026 at 11:48:11PM -0700, Dexuan Cui wrote:
>Commit f63152958994 fixes a regression, however it fails to report an
>error for malformed/short packets -- normally we should never see such
>packets, but let's report an error for them just in case.
>
>Fixes: f63152958994 ("hv_sock: Report EOF instead of -EIO for FIN")
>Cc: stable@vger.kernel.org
>Signed-off-by: Dexuan Cui <decui@microsoft.com>
>---
>
>Commit f63152958994 is currently only in net.git's master branch.
>
>Changes since v1:
>    Integrated comments from Stefano Garzarella:
>
>        1) access 'vsk' directly:
>           s/hvs->vsk->peer_shutdown/vsk->peer_shutdown/
>
>        2) test the error condition first and return -EIO for that.
>
>    NO other changes.

Thanks, LGTM!

Acked-by: Stefano Garzarella <sgarzare@redhat.com>



* [PATCH] net: net_failover: Fix the deadlock in slave register
From: Faicker Mo @ 2026-04-23  7:59 UTC (permalink / raw)
  To: sridhar.samudrala@intel.com, andrew+netdev@lunn.ch,
	David S. Miller, Eric Dumazet, Paolo Abeni, sdf@fomichev.me
  Cc: open list:NETWORKING DRIVERS, open list

There is netdev_lock_ops() before the NETDEV_REGISTER notifier
in register_netdevice(), so use the non-locking functions
in net_failover_slave_register().

Call Trace:
 <TASK>
 __schedule+0x30d/0x7a0
 schedule+0x27/0x90
 schedule_preempt_disabled+0x15/0x30
 __mutex_lock.constprop.0+0x538/0x9e0
 __mutex_lock_slowpath+0x13/0x20
 mutex_lock+0x3b/0x50
 dev_set_mtu+0x40/0xe0
 net_failover_slave_register+0x24/0x280
 failover_slave_register+0x103/0x1b0
 failover_event+0x15e/0x210
 ? dropmon_net_event+0xac/0xe0
 notifier_call_chain+0x5e/0xe0
 raw_notifier_call_chain+0x16/0x30
 call_netdevice_notifiers_info+0x52/0xa0
 register_netdevice+0x5f4/0x7c0
 register_netdev+0x1e/0x40
 _mlx5e_probe+0xe2/0x370 [mlx5_core]
 mlx5e_probe+0x59/0x70 [mlx5_core]
 ? __pfx_mlx5e_probe+0x10/0x10 [mlx5_core]

Fixes: 4c975fd70002 ("net: hold instance lock during NETDEV_REGISTER/UP")
Signed-off-by: Faicker Mo <faicker.mo@zenlayer.com>
---
 drivers/net/net_failover.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c
index d0361aaf25ef..6702490eb4ed 100644
--- a/drivers/net/net_failover.c
+++ b/drivers/net/net_failover.c
@@ -502,7 +502,7 @@ static int net_failover_slave_register(struct net_device *slave_dev,

        /* Align MTU of slave with failover dev */
        orig_mtu = slave_dev->mtu;
-       err = dev_set_mtu(slave_dev, failover_dev->mtu);
+       err = netif_set_mtu(slave_dev, failover_dev->mtu);
        if (err) {
                netdev_err(failover_dev, "unable to change mtu of %s to %u register failed\n",
                           slave_dev->name, failover_dev->mtu);
@@ -512,7 +512,7 @@ static int net_failover_slave_register(struct net_device *slave_dev,
        dev_hold(slave_dev);

        if (netif_running(failover_dev)) {
-               err = dev_open(slave_dev, NULL);
+               err = netif_open(slave_dev, NULL);
                if (err && (err != -EBUSY)) {
                        netdev_err(failover_dev, "Opening slave %s failed err:%d\n",
                                   slave_dev->name, err);
@@ -565,7 +565,7 @@ static int net_failover_slave_register(struct net_device *slave_dev,
        dev_close(slave_dev);
 err_dev_open:
        dev_put(slave_dev);
-       dev_set_mtu(slave_dev, orig_mtu);
+       netif_set_mtu(slave_dev, orig_mtu);
 done:
        return err;
 }
--
2.34.1
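
The deadlock in the trace above can be sketched in userspace C. This is a toy model only, not the kernel code: `toy_dev`, `toy_dev_set_mtu()`, `toy_netif_set_mtu()` and `toy_register_notifier()` are hypothetical names, and a boolean flag stands in for the non-recursive instance lock. Where a real mutex would block forever, the toy returns -EDEADLK so the effect is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy device; lock_held stands in for a non-recursive instance lock. */
struct toy_dev {
	bool lock_held;
	unsigned int mtu;
};

/* Non-locking variant: the caller must already hold the lock. */
static int toy_netif_set_mtu(struct toy_dev *dev, unsigned int mtu)
{
	assert(dev->lock_held);
	dev->mtu = mtu;
	return 0;
}

/* Locking variant: takes the lock itself. A real mutex would hang if the
 * caller already held it; the toy reports -EDEADLK instead of hanging.
 */
static int toy_dev_set_mtu(struct toy_dev *dev, unsigned int mtu)
{
	int err;

	if (dev->lock_held)
		return -EDEADLK;
	dev->lock_held = true;
	err = toy_netif_set_mtu(dev, mtu);
	dev->lock_held = false;
	return err;
}

/* Models a notifier that is invoked with the lock already held. */
static int toy_register_notifier(struct toy_dev *dev, bool use_locking_helper)
{
	int err;

	dev->lock_held = true;	/* taken by the registration path */
	err = use_locking_helper ? toy_dev_set_mtu(dev, 9000)
				 : toy_netif_set_mtu(dev, 9000);
	dev->lock_held = false;
	return err;
}
```

Calling `toy_register_notifier(&dev, true)` fails with -EDEADLK and leaves the MTU untouched, while passing `false` succeeds, which is the same reasoning the patch applies when it switches to the non-locking helpers.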


* Re: [PATCH net] net: ipv6: fix NOREF dst use in seg6 and rpl lwtunnels
From: Sebastian Andrzej Siewior @ 2026-04-23  8:00 UTC (permalink / raw)
  To: Andrea Mayer
  Cc: davem, dsahern, edumazet, kuba, pabeni, horms, clrkwllms, rostedt,
	david.lebrun, alex.aring, stefano.salsano, netdev, linux-rt-devel,
	linux-kernel, stable
In-Reply-To: <20260421094735.20997-1-andrea.mayer@uniroma2.it>

On 2026-04-21 11:47:35 [+0200], Andrea Mayer wrote:
> seg6_input_core() and rpl_input() call ip6_route_input() which sets a
> NOREF dst on the skb, then pass it to dst_cache_set_ip6() invoking
> dst_hold() unconditionally.
> On PREEMPT_RT, ksoftirqd is preemptible and a higher-priority task can
> release the underlying pcpu_rt between the lookup and the caching
> through a concurrent FIB lookup on a shared nexthop.
> Simplified race sequence:
> 
>   ksoftirqd/X                       higher-prio task (same CPU X)
>   -----------                       --------------------------------
>   seg6_input_core(,skb)/rpl_input(skb)
>     dst_cache_get()
>       -> miss
>     ip6_route_input(skb)
>       -> ip6_pol_route(,skb,flags)
>          [RT6_LOOKUP_F_DST_NOREF in flags]
>         -> FIB lookup resolves fib6_nh
>            [nhid=N route]
>         -> rt6_make_pcpu_route()
>            [creates pcpu_rt, refcount=1]
>              pcpu_rt->sernum = fib6_sernum
>              [fib6_sernum=W]
>            -> cmpxchg(fib6_nh.rt6i_pcpu,
>                       NULL, pcpu_rt)
>               [slot was empty, store succeeds]
>       -> skb_dst_set_noref(skb, dst)
>          [dst is pcpu_rt, refcount still 1]
> 
>                                     rt_genid_bump_ipv6()
>                                       -> bumps fib6_sernum
>                                          [fib6_sernum from W to Z]
>                                     ip6_route_output()
>                                       -> ip6_pol_route()
>                                         -> FIB lookup resolves fib6_nh
>                                            [nhid=N]
>                                         -> rt6_get_pcpu_route()
>                                              pcpu_rt->sernum != fib6_sernum
>                                              [W <> Z, stale]
>                                           -> prev = xchg(rt6i_pcpu, NULL)
>                                           -> dst_release(prev)
>                                              [prev is pcpu_rt,
>                                               refcount 1->0, dead]
> 
>     dst = skb_dst(skb)
>     [dst is the dead pcpu_rt]
>     dst_cache_set_ip6(dst)
>       -> dst_hold() on dead dst
>       -> WARN / use-after-free

So the dst passed to skb_dst_set_noref() has no reference count. The fix
is to use skb_dst_force() to increment the refcount on it. But this
requires that we are in the same RCU section. And I guess we are since
none of the warnings are visible.
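
The borrowed-versus-owned distinction can be sketched with a small userspace model. It is an illustration under simplifying assumptions (plain integers instead of atomics, no RCU), and every `toy_*` name is hypothetical; `toy_skb_dst_force()` only imitates the general shape of forcing a NOREF dst into a refcounted one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dst { int refcnt; };		/* 0 means the object is dead */
struct toy_skb { struct toy_dst *dst; bool dst_is_noref; };

/* Take a reference only if the object is still alive. */
static bool toy_dst_hold_safe(struct toy_dst *dst)
{
	if (dst->refcnt == 0)
		return false;
	dst->refcnt++;
	return true;
}

static void toy_dst_release(struct toy_dst *dst)
{
	assert(dst->refcnt > 0);
	dst->refcnt--;
}

/* Borrow the dst without taking a reference (models the NOREF lookup). */
static void toy_skb_dst_set_noref(struct toy_skb *skb, struct toy_dst *dst)
{
	skb->dst = dst;
	skb->dst_is_noref = true;
}

/* Turn a borrowed dst into an owned one; drop it if it has already died. */
static bool toy_skb_dst_force(struct toy_skb *skb)
{
	if (skb->dst_is_noref) {
		if (!toy_dst_hold_safe(skb->dst))
			skb->dst = NULL;
		skb->dst_is_noref = false;
	}
	return skb->dst != NULL;
}
```

If the force happens before the concurrent release, the count goes 1 -> 2 -> 1 and the object survives; if the release wins, the force returns false and the caller sees a NULL dst rather than caching a dead one.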

Doesn't this make ip6_route_input() on RT fragile in general due to the
RT6_LOOKUP_F_DST_NOREF usage, or is there something special about the two
files that are patched?
Based on your explanation it all makes sense, I am just not sure if this
race is limited to those two or if there is more to it.

> For the race to occur, ksoftirqd must be preemptible (PREEMPT_RT without
> PREEMPT_RT_NEEDS_BH_LOCK) and a concurrent task must be able to release
> the pcpu_rt. Shared nexthop objects provide such a path, as two routes
> pointing to the same nhid share the same fib6_nh and its rt6i_pcpu
> entry.
> 
> Fix seg6_input_core() and rpl_input() by calling skb_dst_force() after
> ip6_route_input() to force the NOREF dst into a refcounted one before
> caching.
> The output path is not affected as ip6_route_output() already returns a
> refcounted dst.
> 
> Fixes: af4a2209b134 ("ipv6: sr: use dst_cache in seg6_input")
> Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel")

If having PREEMPT_RT_NEEDS_BH_LOCK unset is the requirement then the
right Fixes: tag would be
Fixes: 3253cb49cbad4 ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT")

as prior to this commit the race is not possible, right?

Does this mean that rpl_input() does a local_bh_disable() while
obtaining the dst but it never runs outside of a bh-disabled section?
Because if it can run in preemptible context then it would not be limited
to PREEMPT_RT, at which point the Fixes: tags from above would make sense
again.

> Cc: stable@vger.kernel.org
> Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>

Sebastian


* [bug report] Potential refcounting issues in 'drivers/net/ethernet/mellanox/mlx4/srq.c', between 'mlx4_srq_event()' and 'mlx4_srq_free()'
From: Ginger @ 2026-04-23  8:00 UTC (permalink / raw)
  To: tariqt; +Cc: netdev, linux-kernel, linux-rdma

Dear Linux kernel maintainers,

My research-based static analyzer found a potential atomicity bug
within the 'drivers/net/ethernet/mellanox/mlx4' subsystem, more
specifically, in 'drivers/net/ethernet/mellanox/mlx4/srq.c'.

Kernel version: long-term kernel v6.18.9

Potential concurrent triggering executions:
T0:
mlx4_srq_free
     --> spin_lock_irq(&srq_table->lock);
     --> radix_tree_delete(&srq_table->tree, srq->srqn);
     --> spin_unlock_irq(&srq_table->lock);
     --> if (refcount_dec_and_test(&srq->refcount))

T1:
mlx4_srq_event
    --> rcu_read_lock();
    --> srq = radix_tree_lookup(&srq_table->tree, srqn &
(dev->caps.num_srqs - 1));
    --> rcu_read_unlock();
    --> refcount_inc(&srq->refcount);
    --> if (refcount_dec_and_test(&srq->refcount))

In T1, the increment of 'srq->refcount' does not check whether the
value has already reached zero in T0. If it has, the 'refcount_inc()'
will increment it to one and the subsequent 'if
(refcount_dec_and_test(&srq->refcount))' will evaluate to true,
resulting in an additional call to 'complete(&srq->free)'.
This is potentially problematic for mlx4 NICs.
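
A minimal userspace sketch of the scenario as the report reads it, using plain integers. The `toy_*` names are hypothetical, and the model deliberately ignores whatever checks the kernel's refcount_t applies on its own, so it shows the reported reasoning rather than confirmed kernel behaviour. `toy_event_checked()` is one hypothetical alternative that refuses to touch an object whose count is already zero.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_srq {
	int refcount;
	int completions;	/* how many times the "free" completion fired */
};

/* Drop a reference; fire the completion when the count reaches zero. */
static void toy_put(struct toy_srq *srq)
{
	assert(srq->refcount > 0);
	if (--srq->refcount == 0)
		srq->completions++;
}

/* Event path as described in the report: unconditional increment. */
static void toy_event_unchecked(struct toy_srq *srq)
{
	srq->refcount++;	/* may resurrect an object that is at zero */
	toy_put(srq);
}

/* Hypothetical alternative: only take a reference on a live object. */
static bool toy_event_checked(struct toy_srq *srq)
{
	if (srq->refcount == 0)
		return false;
	srq->refcount++;
	toy_put(srq);
	return true;
}
```

Running the free path first and then `toy_event_unchecked()` fires the completion twice in this model, whereas `toy_event_checked()` leaves it at one.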

Thank you for your time and consideration.

Best regards,
Ginger


* Re: [PATCH net] virtio_net: sync rss_trailer.max_tx_vq on queue_pairs change via VQ_PAIRS_SET
From: Michael S. Tsirkin @ 2026-04-23  8:05 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Brett Creeley, jasowang, andrew+netdev, davem, edumazet, kuba,
	netdev, Xuan Zhuo, Eugenio Pérez
In-Reply-To: <1222eb4d-2b6d-44c6-96a3-03c42f714b4a@redhat.com>

On Thu, Apr 23, 2026 at 09:01:26AM +0200, Paolo Abeni wrote:
> On 4/16/26 11:21 PM, Brett Creeley wrote:
> > When netif_is_rxfh_configured() is true (i.e., the user has explicitly
> > configured the RSS indirection table), virtnet_set_queues() skips the
> > RSS update path and falls through to the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
> > command to change the number of queue pairs. However, it does not update
> > vi->rss_trailer.max_tx_vq to reflect the new queue_pairs value.
> > 
> > This causes a mismatch between vi->curr_queue_pairs and
> > vi->rss_trailer.max_tx_vq. Any subsequent RSS reconfiguration (e.g.,
> > via ethtool -X) calls virtnet_commit_rss_command(), which sends the
> > stale max_tx_vq to the device, silently reverting the queue count.
> > 
> > Reproduction:
> > 1. User configured RSS
> >   ethtool -X eth0 equal 8
> > 2. VQ_PAIRS_SET path; max_tx_vq stays 16
> >   ethtool -L eth0 combined 12
> > 3. RSS commit uses max_tx_vq=16 instead of 12
> >   ethtool -X eth0 equal 4
> > 
> > Fix this by updating vi->rss_trailer.max_tx_vq after a successful
> > VQ_PAIRS_SET command when RSS is enabled, keeping it in sync with
> > curr_queue_pairs.
> > 
> > Fixes: 50bfcaedd78e ("virtio_net: Update rss when set queue")
> > Assisted-by: Claude: claude-opus-4.6
> > Signed-off-by: Brett Creeley <brett.creeley@amd.com>
> 
> The patch LGTM, but waiting a little longer just in case the virtio crew
> has some comments.
> 
> /P

Acked-by: Michael S. Tsirkin <mst@redhat.com>
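
The stale-field problem from the quoted commit message can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: `toy_vi`, `toy_set_queues()` and `toy_commit_rss()` are hypothetical stand-ins, `device_pairs` represents what the device ends up using, and the `sync_trailer` flag switches between the behaviour before and after the fix.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vi {
	bool rxfh_configured;		/* user configured the indirection table */
	unsigned int curr_queue_pairs;
	unsigned int max_tx_vq;		/* stand-in for rss_trailer.max_tx_vq */
	unsigned int device_pairs;	/* what the toy device currently uses */
};

/* An RSS commit sends the cached max_tx_vq to the device. */
static void toy_commit_rss(struct toy_vi *vi)
{
	vi->device_pairs = vi->max_tx_vq;
}

static void toy_set_queues(struct toy_vi *vi, unsigned int pairs,
			   bool sync_trailer)
{
	assert(pairs > 0);
	if (!vi->rxfh_configured) {
		/* RSS update path: the cached value is refreshed and sent. */
		vi->max_tx_vq = pairs;
		toy_commit_rss(vi);
	} else {
		/* VQ_PAIRS_SET-like path: the device is told directly. */
		vi->device_pairs = pairs;
		if (sync_trailer)
			vi->max_tx_vq = pairs;	/* the fix: keep it in sync */
	}
	vi->curr_queue_pairs = pairs;
}
```

Starting from 16 pairs with a user-configured table, `toy_set_queues(&vi, 12, false)` followed by `toy_commit_rss(&vi)` puts the device back at 16, mirroring the reproduction steps; with `sync_trailer` set the device stays at 12.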



* Re: [PATCH net] net: airoha: fix BQL imbalance in TX path
From: Lorenzo Bianconi @ 2026-04-23  8:12 UTC (permalink / raw)
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Hariprasad Kelam
  Cc: Simon Horman, linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260421-airoha-fix-bql-v1-1-f135afe4275b@kernel.org>


> Fix a possible BQL imbalance in airoha_dev_xmit(), where inflight
> packets are accounted only for the AIROHA_NUM_TX_RING netdev TX
> queues. The queue index is computed as:
> 
>     qid = skb_get_queue_mapping(skb) % ARRAY_SIZE(qdma->q_tx)
>     txq = netdev_get_tx_queue(dev, qid);
> 
> However, airoha_qdma_tx_napi_poll() accounts completions across all
> netdev TX queues (num_tx_queues), leading to inconsistent BQL
> accounting.
> 
> Also reset all netdev TX queues in the ndo_stop callback.
> 
> Fixes: 1d304174106c ("net: airoha: Implement BQL support")
> Fixes: c9f947769b77 ("net: airoha: Reset BQL stopping the netdevice")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/airoha/airoha_eth.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index 19f67c7dd8e1..6c7390f0de5d 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -929,10 +929,9 @@ static int airoha_qdma_tx_napi_poll(struct napi_struct *napi, int budget)
>  		q->queued--;
>  
>  		if (skb) {
> -			u16 queue = skb_get_queue_mapping(skb);
>  			struct netdev_queue *txq;
>  
> -			txq = netdev_get_tx_queue(skb->dev, queue);
> +			txq = skb_get_tx_queue(skb->dev, skb);
>  			netdev_tx_completed_queue(txq, 1, skb->len);
>  			dev_kfree_skb_any(skb);
>  		}
> @@ -1711,7 +1710,7 @@ static int airoha_dev_stop(struct net_device *dev)
>  	if (err)
>  		return err;
>  
> -	for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
> +	for (i = 0; i < dev->num_tx_queues; i++)
>  		netdev_tx_reset_subqueue(dev, i);
>  
>  	airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
> @@ -2002,7 +2001,7 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
>  
>  	spin_lock_bh(&q->lock);
>  
> -	txq = netdev_get_tx_queue(dev, qid);
> +	txq = skb_get_tx_queue(dev, skb);
>  	nr_frags = 1 + skb_shinfo(skb)->nr_frags;
>  
>  	if (q->queued + nr_frags >= q->ndesc) {
> 
> ---
> base-commit: a663bac71a2f0b3ac6c373168ca57b2a6e6381aa
> change-id: 20260421-airoha-fix-bql-7fff7cebbc9a
> 
> Best regards,
> -- 
> Lorenzo Bianconi <lorenzo@kernel.org>
> 

Commenting on the issues reported by Sashiko:
https://sashiko.dev/#/patchset/20260421-airoha-fix-bql-v1-1-f135afe4275b%40kernel.org

- This isn't a bug in this patch, but does using 0xff as a sentinel value cause a permanent stall?
  I do not think this is a real issue since, according to my understanding, the NIC
  never writes 0xff in irq_q queue.

- This is another pre-existing issue, but does freeing the SKB here cause a DMA use-after-free
  for multi-fragment packets?
  This issue is not related to this patch, and I will fix it in a dedicated
  patch storing the skb pointer in the last descriptor in airoha_dev_xmit()

- Since the QDMA hardware and NAPI instance are shared among multiple ports (qdma->users),
  could active NAPI polling cause a BUG_ON() in dql_completed()?
  This is not an issue related to this patch since here we are just resetting
  all the netdev tx queues instead of just the first ARRAY_SIZE(qdma->q_tx)
  ones.

- This isn't a bug in this patch, but does failing to wait for the DMA engines to become
  idle before unmapping buffers cause memory corruption?
  This issue is not related to this patch and it will be fixed with a dedicated
  patch.

- This is also pre-existing, but can this mapping cause a kernel panic on highmem systems?
  Can we have fragments in high memory, e.g. on the ARM architecture? Anyway, as
  pointed out by Sashiko, this issue is not related to this patch.

- This isn't introduced here, but does this logic cause a permanent TX stall?
  This issue is already fixed in the following patch:
  https://patchwork.kernel.org/project/netdevbpf/patch/20260421-airoha-xmit-stop-condition-v1-1-e670d6a48467@kernel.org/
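
The BQL imbalance described in the quoted commit message can be sketched as a toy accounting model. The ring and queue counts (4 and 8) are assumed values chosen for illustration, and `toy_txdev`, `toy_xmit_old()`, `toy_xmit_new()` and `toy_complete()` are hypothetical names; a signed per-queue counter stands in for the in-flight accounting.

```c
#include <assert.h>

#define TOY_NUM_TX_RING		4	/* assumed hardware ring count */
#define TOY_NUM_TX_QUEUES	8	/* assumed netdev TX queue count */

struct toy_txdev {
	int inflight[TOY_NUM_TX_QUEUES];	/* per-queue in-flight packets */
};

/* Before the fix: xmit accounts on the ring-derived queue index. */
static void toy_xmit_old(struct toy_txdev *d, unsigned int mapping)
{
	assert(mapping < TOY_NUM_TX_QUEUES);
	d->inflight[mapping % TOY_NUM_TX_RING]++;
}

/* After the fix: xmit accounts on the skb's own queue mapping. */
static void toy_xmit_new(struct toy_txdev *d, unsigned int mapping)
{
	assert(mapping < TOY_NUM_TX_QUEUES);
	d->inflight[mapping]++;
}

/* Completion always accounts on the skb's own queue mapping. */
static void toy_complete(struct toy_txdev *d, unsigned int mapping)
{
	assert(mapping < TOY_NUM_TX_QUEUES);
	d->inflight[mapping]--;
}
```

With a queue mapping of 6, the old xmit path charges queue 2 while the completion credits queue 6, so neither counter returns to zero; the new path charges and credits the same queue.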

Regards,
Lorenzo



* Re: [PATCH net-deletions] net: remove ax25 and amateur radio (hamradio) subsystem
From: Toke Høiland-Jørgensen @ 2026-04-23  8:15 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, Jakub Kicinski,
	corbet, skhan, federico.vaga, carlos.bilbao, avadhut.naik, alexs,
	si.yanteng, dzm91, 2023002089, tsbogend, dsahern, jani.nikula,
	mchehab+huawei, gregkh, jirislaby, tytso, herbert, ebiggers,
	johannes.berg, geert, pablo, tglx, mashiro.chen, mingo, dqfext,
	jreuter, sdf, pkshih, enelsonmoore, mkl, kees, crossd, jlayton,
	wangliang74, aha310510, takamitz, kuniyu, linux-doc, linux-mips
In-Reply-To: <20260421021824.1293976-1-kuba@kernel.org>

Jakub Kicinski <kuba@kernel.org> writes:

> Remove the amateur radio (AX.25, NET/ROM, ROSE) protocol implementation
> and all associated hamradio device drivers from the kernel tree.
> This set of protocols has long been a huge bug/syzbot magnet,
> and since nobody stepped up to help us deal with the influx
> of the AI-generated bug reports we need to move it out of tree
> to protect our sanity.
>
> The code is moved to an out-of-tree repo:
> https://github.com/linux-netdev/mod-orphan
> if it's cleaned up and reworked there we can accept it back.
>
> Minimal stub headers are kept for include/net/ax25.h (AX25_P_IP,
> AX25_ADDR_LEN, ax25_address) and include/net/rose.h (ROSE_ADDR_LEN)
> so that the conditional integration code in arp.c and tun.c continues
> to compile and work when the out-of-tree modules are loaded.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>


* Re: [PATCH net v2] net: iptunnel: fix stale transport header after GRE/TEB decap
From: Paolo Abeni @ 2026-04-23  8:19 UTC (permalink / raw)
  To: Jiayuan Chen, Eric Dumazet
  Cc: netdev, syzbot+83181a31faf9455499c5, David S. Miller, David Ahern,
	Jakub Kicinski, Simon Horman, Pravin B Shelar, Tom Herbert,
	linux-kernel
In-Reply-To: <eb0683d9-617f-41b6-b535-e15ffe081a17@linux.dev>

On 4/19/26 3:01 PM, Jiayuan Chen wrote:
> [...]
>>>   +662,18 @@ static inline int iptunnel_pull_offloads(struct sk_buff *skb)
>>>          return 0;
>>>   }
>>>
>>> +static inline void iptunnel_rebuild_transport_header(struct sk_buff *skb)
>>> +{
>>> +       if (!skb_is_gso(skb))
>>> +               return;
>>> +
>>> +       skb->transport_header = (typeof(skb->transport_header))~0U;
>>> +       skb_probe_transport_header(skb);
>>> +
>>> +       if (!skb_transport_header_was_set(skb))
>>> +               skb_gso_reset(skb);
>> I do not think this makes sense.
>> What is a valid case for this packet being processed further?
>> The buggy packet must be dropped, instead of being mangled like this.
> Hi Eric,
> 
> The reproducer builds a gre frame whose inner Ethernet header is 
> all-zero. Tracing the skb through RX:
> 
> 1. At GRE decap exit, skb_transport_offset(skb) < 0 is the rule, not the 
> exception.
> 
> It is negative for every packet leaving the tunnel, including perfectly 
> well-formed inner IPv4 traffic
> because the tunnel leaves skb->transport_header at the outer L4 offset while
> pskb_pull() has already advanced skb->data past it. 

Is it? The transport header is an offset on top of skb->head; pskb_pull
changes head only if the header is not in the linear part (and the
transport offset is already invalid).
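
The two views in this exchange can be put side by side in a small model of the linear case. It is a sketch with hypothetical names (`toy_pkt`, `toy_pull()`, `toy_transport_offset()`); offsets are stored relative to the buffer head, and the 20-byte and 38-byte figures in the usage note are made-up header sizes.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pkt {
	size_t data;			/* offset of the data pointer from head */
	size_t len;			/* bytes available from data onwards */
	size_t transport_header;	/* offset from head */
};

/* Advance the data pointer past already-parsed headers (linear case only). */
static void toy_pull(struct toy_pkt *pkt, size_t n)
{
	assert(n <= pkt->len);
	pkt->data += n;
	pkt->len -= n;
}

/* Transport header position relative to the current data pointer. */
static long toy_transport_offset(const struct toy_pkt *pkt)
{
	return (long)pkt->transport_header - (long)pkt->data;
}
```

With `transport_header` at 20 and a pull of 38 bytes, the stored header offset is still 20 (it is relative to head and a linear pull does not move head), yet the offset relative to data is now -18.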

> skb_transport_header_was_set() stays true, so downstream
> code that trusts that flag now trusts a stale, negative offset.
> 
> 2. GRO repairs it — but only for protocols it knows.
> 
> In dev_gro_receive(), skb->protocol is dispatched through the offload 
> table. For ETH_P_IP,
> inet_gro_receive() calls skb_set_transport_header(skb, 
> skb_gro_offset(skb)), and the offset
> becomes valid again. But for malformed skb, dev_gro_receive just bypass it.

So only malformed packets cause trouble, right?

> 3. Both kinds then reach __netif_receive_skb_core().
> 
> So the skb that qdisc/tc/BPF segmenters later see has an
> invariant violation — _was_set == true but offset < 0 — that the core
> layer has no intention of catching for us.
> 
> My reading of this is that the tunnel decap path is producing an skb 
> that doesn't
> honor the contract __netif_receive_skb_core() expects from its 
> producers, and that
> it doesn't really make sense to ask GRE to parse or validate the inner 
> L4 in order
> to fix this.
> 
> I'm thinking at the end of GRE decap, before handing the skb to 
> gro_cells_receive(),
> call skb_reset_transport_header(skb).

My take is that you need to address the issue earlier than the current
patch, dropping the malformed packets.

/P



* Re: [PATCH v2] ipv6: fix memory leak in __ip6_make_skb() when queue is empty
From: syzbot @ 2026-04-23  8:23 UTC (permalink / raw)
  To: 25181214217
  Cc: 25181214217, davem, dsahern, edumazet, horms, kuba, linux-kernel,
	netdev, pabeni, sd, willemdebruijn.kernel
In-Reply-To: <20260423082233.514056-1-25181214217@stu.xidian.edu.cn>

> During fuzzing with failslab enabled, a memory leak was observed in the
> IPv6 UDP send path.
>
> The root cause resides in __ip6_make_skb(). In extremely rare cases
> (such as fault injection or specific empty payload conditions),
> __ip6_append_data() may succeed but leave the socket's write queue
> empty.
>
> When __ip6_make_skb() is subsequently called, __skb_dequeue(queue)
> returns NULL. The previous logic handled this by executing a 'goto out;',
> which completely bypassed the call to ip6_cork_release(cork).
>
> Since the 'cork' structure actively holds a reference to the routing
> entry (dst_entry) and potentially other allocated options, skipping
> the release cleanly leaks these resources.
>
> Fix this by introducing an 'out_cork_release' label and jumping to it
> when skb is NULL, ensuring the cork state is always properly cleaned up.
> The now-unused 'out' label is also removed to prevent compiler warnings.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Reported-by: syzbot+e5d6936b9f4545fd88ab@syzkaller.appspotmail.com
> Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
> ---
>  net/ipv6/ip6_output.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 7e92909ab5be..82210dd5eb96 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1934,7 +1934,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
>  
>  	skb = __skb_dequeue(queue);
>  	if (!skb)
> -		goto out;
> +		goto out_cork_release;
>  	tail_skb = &(skb_shinfo(skb)->frag_list);
>  
>  	/* move skb->data to ip header from ext header */
> @@ -1998,8 +1998,8 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
>  		ICMP6_INC_STATS(net, idev, ICMP6_MIB_OUTMSGS);
>  	}
>  
> +out_cork_release:
>  	ip6_cork_release(cork);
> -out:
>  	return skb;
>  }
>  
> -- 
> 2.34.1
>

I see the command but can't find the corresponding bug.
The email is sent to syzbot+HASH@syzkaller.appspotmail.com address
but the HASH does not correspond to any known bug.
Please double check the address.



* [PATCH v2] ipv6: fix memory leak in __ip6_make_skb() when queue is empty
From: Mingyu Wang @ 2026-04-23  8:22 UTC (permalink / raw)
  To: willemdebruijn.kernel, davem, dsahern, edumazet, kuba, pabeni
  Cc: sd, horms, netdev, linux-kernel, Mingyu Wang,
	syzbot+e5d6936b9f4545fd88ab

During fuzzing with failslab enabled, a memory leak was observed in the
IPv6 UDP send path.

The root cause resides in __ip6_make_skb(). In extremely rare cases
(such as fault injection or specific empty payload conditions),
__ip6_append_data() may succeed but leave the socket's write queue
empty.

When __ip6_make_skb() is subsequently called, __skb_dequeue(queue)
returns NULL. The previous logic handled this by executing a 'goto out;',
which completely bypassed the call to ip6_cork_release(cork).

Since the 'cork' structure actively holds a reference to the routing
entry (dst_entry) and potentially other allocated options, skipping
the release cleanly leaks these resources.

Fix this by introducing an 'out_cork_release' label and jumping to it
when skb is NULL, ensuring the cork state is always properly cleaned up.
The now-unused 'out' label is also removed to prevent compiler warnings.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+e5d6936b9f4545fd88ab@syzkaller.appspotmail.com
Signed-off-by: Mingyu Wang <25181214217@stu.xidian.edu.cn>
---
 net/ipv6/ip6_output.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 7e92909ab5be..82210dd5eb96 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1934,7 +1934,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,

 	skb = __skb_dequeue(queue);
 	if (!skb)
-		goto out;
+		goto out_cork_release;
 	tail_skb = &(skb_shinfo(skb)->frag_list);

 	/* move skb->data to ip header from ext header */
@@ -1998,8 +1998,8 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 		ICMP6_INC_STATS(net, idev, ICMP6_MIB_OUTMSGS);
 	}

+out_cork_release:
 	ip6_cork_release(cork);
-out:
 	return skb;
 }

-- 
2.34.1
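
The label change in the patch above follows a common goto-cleanup pattern, sketched here in userspace C. The names `toy_cork`, `toy_cork_release()` and `toy_make_skb()` are hypothetical, an integer packet id stands in for the dequeued skb, and the `release_on_empty` flag selects between the behaviour before and after the change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_cork { bool holds_dst; };	/* models the route reference */

static void toy_cork_release(struct toy_cork *cork)
{
	cork->holds_dst = false;
}

/* Returns a packet id (> 0), or 0 when the queue was empty. */
static int toy_make_skb(int queued_id, struct toy_cork *cork,
			bool release_on_empty)
{
	int skb = queued_id;	/* stand-in for dequeuing from the write queue */

	assert(cork != NULL);
	if (!skb) {
		if (release_on_empty)
			goto out_cork_release;	/* after the fix */
		goto out;			/* before: skips the release */
	}

	/* ... header construction would happen here ... */

out_cork_release:
	toy_cork_release(cork);
out:
	return skb;
}
```

With an empty queue and `release_on_empty` false, the cork still holds its reference on return, which is the leak; with it true, the early exit and the normal path share the same release.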


* Re: [PATCH net v2] net: dsa: mt7530: fix .get_stats64 sleeping in atomic context
From: Paolo Abeni @ 2026-04-23  8:30 UTC (permalink / raw)
  To: Daniel Golle, Chester A. Unal, Andrew Lunn, Vladimir Oltean,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Matthias Brugger,
	AngeloGioacchino Del Regno, Russell King, Christian Marangi,
	netdev, linux-kernel, linux-arm-kernel, linux-mediatek
  Cc: Frank Wunderlich, John Crispin
In-Reply-To: <58aff8b5b1d691872342a6ffd3315f27854788a6.1776595131.git.daniel@makrotopia.org>

On 4/19/26 12:43 PM, Daniel Golle wrote:
> The .get_stats64 callback runs in atomic context, but on
> MDIO-connected switches every register read acquires the MDIO bus
> mutex, which can sleep:
> [   12.645973] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:609
> [   12.654442] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 759, name: grep
> [   12.663377] preempt_count: 0, expected: 0
> [   12.667410] RCU nest depth: 1, expected: 0
> [   12.671511] INFO: lockdep is turned off.
> [   12.675441] CPU: 0 UID: 0 PID: 759 Comm: grep Tainted: G S      W           7.0.0+ #0 PREEMPT
> [   12.675453] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
> [   12.675456] Hardware name: Bananapi BPI-R64 (DT)
> [   12.675459] Call trace:
> [   12.675462]  show_stack+0x14/0x1c (C)
> [   12.675477]  dump_stack_lvl+0x68/0x8c
> [   12.675487]  dump_stack+0x14/0x1c
> [   12.675495]  __might_resched+0x14c/0x220
> [   12.675504]  __might_sleep+0x44/0x80
> [   12.675511]  __mutex_lock+0x50/0xb10
> [   12.675523]  mutex_lock_nested+0x20/0x30
> [   12.675532]  mt7530_get_stats64+0x40/0x2ac
> [   12.675542]  dsa_user_get_stats64+0x2c/0x40
> [   12.675553]  dev_get_stats+0x44/0x1e0
> [   12.675564]  dev_seq_printf_stats+0x24/0xe0
> [   12.675575]  dev_seq_show+0x14/0x3c
> [   12.675583]  seq_read_iter+0x37c/0x480
> [   12.675595]  seq_read+0xd0/0xec
> [   12.675605]  proc_reg_read+0x94/0xe4
> [   12.675615]  vfs_read+0x98/0x29c
> [   12.675625]  ksys_read+0x54/0xdc
> [   12.675633]  __arm64_sys_read+0x18/0x20
> [   12.675642]  invoke_syscall.constprop.0+0x54/0xec
> [   12.675653]  do_el0_svc+0x3c/0xb4
> [   12.675662]  el0_svc+0x38/0x200
> [   12.675670]  el0t_64_sync_handler+0x98/0xdc
> [   12.675679]  el0t_64_sync+0x158/0x15c
> 
> For MDIO-connected switches, poll MIB counters asynchronously using a
> delayed workqueue every second and let .get_stats64 return the cached
> values under a spinlock. A mod_delayed_work() call on each read
> triggers an immediate refresh so counters stay responsive when queried
> more frequently.
> 
> MMIO-connected switches (MT7988, EN7581, AN7583) are not affected
> because their regmap does not sleep, so they continue to read MIB
> counters directly in .get_stats64.
> 
> Fixes: 88c810f35ed5 ("net: dsa: mt7530: implement .get_stats64")
> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> Acked-by: Chester A. Unal <chester.a.unal@arinc9.com>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
> ---
> v2:
>  * use spin_lock_bh()/spin_unlock_bh() to prevent potential deadlock
>  * rate-limit mod_delayed_work() refresh to at most once per 100ms
>  * move cancel_delayed_work_sync() after dsa_unregister_switch()
>  * add mt753x_teardown() callback to cancel the stats work
>  * fix commit message
> 
>  drivers/net/dsa/mt7530.c | 66 ++++++++++++++++++++++++++++++++++++++--
>  drivers/net/dsa/mt7530.h |  8 +++++
>  2 files changed, 71 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
> index b9423389c2ef0..8c1186ba2279b 100644
> --- a/drivers/net/dsa/mt7530.c
> +++ b/drivers/net/dsa/mt7530.c
> @@ -25,6 +25,9 @@
>  
>  #include "mt7530.h"
>  
> +#define MT7530_STATS_POLL_INTERVAL	(1 * HZ)
> +#define MT7530_STATS_RATE_LIMIT		(HZ / 10)
> +
>  static struct mt753x_pcs *pcs_to_mt753x_pcs(struct phylink_pcs *pcs)
>  {
>  	return container_of(pcs, struct mt753x_pcs, pcs);
> @@ -906,10 +909,9 @@ static void mt7530_get_rmon_stats(struct dsa_switch *ds, int port,
>  	*ranges = mt7530_rmon_ranges;
>  }
>  
> -static void mt7530_get_stats64(struct dsa_switch *ds, int port,
> -			       struct rtnl_link_stats64 *storage)
> +static void mt7530_read_port_stats64(struct mt7530_priv *priv, int port,
> +				     struct rtnl_link_stats64 *storage)
>  {
> -	struct mt7530_priv *priv = ds->priv;
>  	uint64_t data;
>  
>  	/* MIB counter doesn't provide a FramesTransmittedOK but instead
> @@ -951,6 +953,45 @@ static void mt7530_get_stats64(struct dsa_switch *ds, int port,
>  			       &storage->rx_crc_errors);
>  }
>  
> +static void mt7530_stats_poll(struct work_struct *work)
> +{
> +	struct mt7530_priv *priv = container_of(work, struct mt7530_priv,
> +						stats_work.work);
> +	struct rtnl_link_stats64 stats = {};
> +	struct dsa_port *dp;
> +	int port;
> +
> +	dsa_switch_for_each_user_port(dp, priv->ds) {
> +		port = dp->index;
> +
> +		mt7530_read_port_stats64(priv, port, &stats);
> +
> +		spin_lock_bh(&priv->stats_lock);
> +		priv->ports[port].stats = stats;
> +		spin_unlock_bh(&priv->stats_lock);
> +	}
> +
> +	priv->stats_last = jiffies;
> +	schedule_delayed_work(&priv->stats_work,
> +			      MT7530_STATS_POLL_INTERVAL);
> +}
> +
> +static void mt7530_get_stats64(struct dsa_switch *ds, int port,
> +			       struct rtnl_link_stats64 *storage)
> +{
> +	struct mt7530_priv *priv = ds->priv;
> +
> +	if (priv->bus) {
> +		spin_lock_bh(&priv->stats_lock);
> +		*storage = priv->ports[port].stats;
> +		spin_unlock_bh(&priv->stats_lock);
> +		if (time_after(jiffies, priv->stats_last + MT7530_STATS_RATE_LIMIT))

Since both the `stats_last` write and read are lockless, it looks like
they may race, leading to a wrong/unexpected delay. I think it would be
better to move both under the spinlock (yes, the write will then happen
multiple times per stats update, but I don't think it will matter).
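
To illustrate the suggestion, a minimal user-space sketch (a pthread
mutex stands in for the spinlock, and every `model_*` name is
hypothetical, not taken from the driver): the timestamp is written and
read inside the same critical section as the cached counters.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-switch state in the patch. */
struct model_priv {
	pthread_mutex_t lock;	/* models priv->stats_lock */
	uint64_t rx_packets;	/* models the cached counters */
	unsigned long last;	/* models priv->stats_last */
};

/* Poller side: publish counters and timestamp in one critical section. */
static void model_publish(struct model_priv *p, uint64_t rx, unsigned long now)
{
	pthread_mutex_lock(&p->lock);
	p->rx_packets = rx;
	p->last = now;
	pthread_mutex_unlock(&p->lock);
}

/*
 * Reader side: snapshot counters and timestamp under the lock, then decide
 * outside the lock whether a refresh is due. The signed difference mirrors
 * the wrap-safe time_after() idiom.
 */
static bool model_read(struct model_priv *p, unsigned long now,
		       unsigned long rate_limit, uint64_t *rx)
{
	unsigned long last;

	pthread_mutex_lock(&p->lock);
	*rx = p->rx_packets;
	last = p->last;
	pthread_mutex_unlock(&p->lock);

	return (long)(last + rate_limit - now) < 0;
}
```

With this shape the reader never sees a timestamp that is newer than
the counters it just copied.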

/P



* Re: [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
From: Corey Leavitt @ 2026-04-23  8:40 UTC (permalink / raw)
  To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
	Russell King
  Cc: Andrew Lunn, netdev, linux-kernel
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-7d4856f686f6@leavitt.info>

Apologies for the noise -- this series was inadvertently sent twice. The
first send went out through an SMTP path that stripped the patatt
developer signature and re-encoded the bodies as quoted-printable.

Please disregard this thread and use the clean, signed copy as canonical:

  https://lore.kernel.org/netdev/20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info/T/

Any v2 will land as a reply to that thread. Sorry for the extra inbox
traffic.

Thanks,
Corey


* Re: [PATCH net] ipv6: validate extension header length before copying to cmsg
From: Paolo Abeni @ 2026-04-23  8:45 UTC (permalink / raw)
  To: Qi Tang, davem, dsahern, edumazet, kuba, horms; +Cc: netdev, linux-kernel
In-Reply-To: <20260419150344.624673-1-tpluszz77@gmail.com>

On 4/19/26 5:03 PM, Qi Tang wrote:
> ip6_datagram_recv_specific_ctl() builds IPV6_{HOPOPTS,DSTOPTS,RTHDR}
> cmsgs (and their IPV6_2292* legacy counterparts) by trusting the
> on-wire hdrlen byte (ptr[1]) when computing the put_cmsg() length.
> The length was validated only at parse time (ipv6_parse_hopopts(),
> etc.). An nftables payload-write expression can rewrite hdrlen after
> parsing and before the skb reaches recvmsg; the write itself is
> in-bounds but put_cmsg() then reads up to ((hdrlen+1) << 3) = 2040
> bytes from an 8-byte header. nftables is reachable from an unprivi-
> leged user namespace, so this is an unprivileged slab-out-of-bounds
> read:
> 
>   BUG: KASAN: slab-out-of-bounds in put_cmsg+0x3ac/0x540
>    put_cmsg+0x3ac/0x540
>    udpv6_recvmsg+0xca0/0x1250
>    sock_recvmsg+0xdf/0x190
>    ____sys_recvmsg+0x1b1/0x620
> 
> Clamp each cmsg length against skb_tail_pointer(skb) before calling
> put_cmsg(). Extension headers are kept in the linear skb area by
> pskb_may_pull() during input, so skb_tail_pointer() is the correct
> bound. The check is replicated at each call site (one HbH, four
> RFC2292 sites, and four switch cases in the DSTOPTS/RTHDR/AH walk)
> rather than hoisted out of the switch, to keep the fix minimal and
> backportable; a follow-up cleanup can factor it out. In the walk
> loop a failed check also aborts the walk, since subsequent offsets
> depend on the tampered length.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Qi Tang <tpluszz77@gmail.com>
> ---
>  net/ipv6/datagram.c | 35 ++++++++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
> index ca3605acb..a7b9f5a24 100644
> --- a/net/ipv6/datagram.c
> +++ b/net/ipv6/datagram.c
> @@ -643,7 +643,10 @@ void ip6_datagram_recv_specific_ctl(struct sock *sk, struct msghdr *msg,
>  	/* HbH is allowed only once */
>  	if (np->rxopt.bits.hopopts && (opt->flags & IP6SKB_HOPBYHOP)) {
>  		u8 *ptr = nh + sizeof(struct ipv6hdr);
> -		put_cmsg(msg, SOL_IPV6, IPV6_HOPOPTS, (ptr[1]+1)<<3, ptr);
> +		u16 hbhlen = (ptr[1] + 1) << 3;
> +
> +		if (ptr + hbhlen <= skb_tail_pointer(skb))
> +			put_cmsg(msg, SOL_IPV6, IPV6_HOPOPTS, hbhlen, ptr);

The patch looks functionally correct to me, but the above 3 statements
are repeated multiple times. You can put them in a local helper and
avoid a lot of duplicate code.
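
A rough user-space sketch of what such a helper could look like (the
name `ext_hdr_len_ok` and its signature are hypothetical; `tail` plays
the role of skb_tail_pointer(skb) in the patch):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, modelled in user space: compute the extension
 * header length from the on-wire hdrlen byte (ptr[1]) and check that the
 * whole header ends at or before `tail`. The caller must guarantee at
 * least two readable bytes at ptr. *len is only written on success.
 */
static bool ext_hdr_len_ok(const uint8_t *ptr, const uint8_t *tail,
			   size_t *len)
{
	size_t hdrlen = ((size_t)ptr[1] + 1) << 3;

	if (tail < ptr || (size_t)(tail - ptr) < hdrlen)
		return false;

	*len = hdrlen;
	return true;
}
```

Each call site would then reduce to one check followed by put_cmsg()
with the returned length.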

>  	}
>  
>  	if (opt->lastopt &&
> @@ -668,27 +671,37 @@ void ip6_datagram_recv_specific_ctl(struct sock *sk, struct msghdr *msg,
>  			case IPPROTO_DSTOPTS:
>  				nexthdr = ptr[0];
>  				len = (ptr[1] + 1) << 3;
> +				if (ptr + len > skb_tail_pointer(skb))
> +					goto ext_hdr_done;

The packet is corrupted; continuing to process the later rxopt requests
requires the (IMHO not nice) empty label. I think it would be better to
just return from this function.

/P



* Re: [PATCH iproute2] ss: fix vsock port filter
From: Luigi Leonardi @ 2026-04-23  8:49 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Stefano Garzarella, stefanha, netdev, Mathieu Schroeter,
	David Ahern
In-Reply-To: <20260422092100.46744a32@phoenix.local>

On Wed, Apr 22, 2026 at 09:21:00AM -0700, Stephen Hemminger wrote:
>On Wed, 22 Apr 2026 10:03:49 +0200
>Stefano Garzarella <sgarzare@redhat.com> wrote:
>
>> On Wed, 22 Apr 2026 at 01:38, Stephen Hemminger <stephen@networkplumber.org> wrote:
>> >
>> > On Tue, 21 Apr 2026 14:35:12 +0200
>> > Luigi Leonardi <leonardi@redhat.com> wrote:
>> >
>> > > parse_hostcond() uses get_u32() to parse the vsock port into the
>> > > aafilter.port field, which is a long. On 64-bit systems, get_u32()
>> > > only writes the lower 32 bits, leaving the upper 32 bits set from
>> > > the -1 initialization. This causes the port comparison
>> > > "a->port != s->rport" in run_ssfilter() to always fail, since the
>> > > corrupted long value never matches the int rport.
>> > >
>> > > Fix by using get_long() instead, consistent with how AF_PACKET and
>> > > AF_NETLINK handle the same field.
>> > >
>> > > Fixes: c759116a0b2b ("ss: add AF_VSOCK support")
>> > > Signed-off-by: Luigi Leonardi <leonardi@redhat.com>
>> > > ---
>> > >  misc/ss.c | 2 +-
>> > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > >
>> > > diff --git a/misc/ss.c b/misc/ss.c
>> > > index 14e9f27a..6e3321ac 100644
>> > > --- a/misc/ss.c
>> > > +++ b/misc/ss.c
>> > > @@ -2323,7 +2323,7 @@ void *parse_hostcond(char *addr, bool is_port)
>> > >               port = find_port(addr, is_port);
>> > >
>> > >               if (port && strcmp(port, "*") &&
>> > > -                 get_u32((__u32 *)&a.port, port, 0))
>> > > +                 get_long(&a.port, port, 0))
>> > >                       return NULL;
>> >
>> > If you use get_long() then the code could get negative values.
>> > Actually have port in ss as signed value seems like a mistake in original design.
>> >
>> > The port in unix domain socket is inode number.
>> > Originally it was int, but got changed to long back in 6.6
>> >
>> > The port in ss cache is int.
>>
>> Yeah, as I mentioned I think the issue was introduced by commit
>> 012cb515 ("ss: change aafilter port from int to long (inode support)").
>
>What about this which avoids the cast but keeps the same semantics.
>
>diff --git a/misc/ss.c b/misc/ss.c
>index 14e9f27a..e830e146 100644
>--- a/misc/ss.c
>+++ b/misc/ss.c
>@@ -2317,14 +2317,16 @@ void *parse_hostcond(char *addr, bool is_port)
>
> 	if (fam == AF_VSOCK) {
> 		__u32 cid = ~(__u32)0;
>+		__u32 vport = 0;
>
> 		a.addr.family = AF_VSOCK;
>
> 		port = find_port(addr, is_port);
>-
>-		if (port && strcmp(port, "*") &&
>-		    get_u32((__u32 *)&a.port, port, 0))
>-			return NULL;
>+		if (port && strcmp(port, "*")) {
>+			if (get_u32(&vport, port, 0))
>+				return NULL;
>+		}
>+		a.port = vport;
>
> 		if (!is_port && addr[0] && strcmp(addr, "*")) {
> 			a.addr.bitlen = 32;
>
>

With high enough ports it's not working; I expect it to be a problem
with the sign. I'll try updating `struct sockstat` as Stefano suggested.
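
One way the sign could bite, sketched in user space (assumptions:
`int64_t` stands in for `long` on a 64-bit system, `int32_t` for the
`int` rport, and `model_port_matches` is a made-up name, not a function
in ss):

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the filter comparison: the filter port is widened
 * from an unsigned 32-bit value, while the socket's port is kept in a
 * signed 32-bit field holding the same bit pattern.
 */
static bool model_port_matches(uint32_t filter_port, uint32_t sock_port)
{
	int64_t a_port = filter_port;	/* zero-extended, stays positive */
	int32_t rport;

	/* Same 32 bits, reinterpreted as signed. */
	memcpy(&rport, &sock_port, sizeof(rport));

	/* rport is sign-extended to 64 bits for the comparison. */
	return a_port == rport;
}
```

In this model, ports up to 2^31 - 1 compare equal, while anything from
2^31 upwards turns negative on the socket side and never matches.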

Luigi



* Re: [PATCH net v2 10/15] drivers: net: cirrus: mac89x0: Remove this driver
From: Daniel Palmer @ 2026-04-23  8:52 UTC (permalink / raw)
  To: John Paul Adrian Glaubitz
  Cc: Geert Uytterhoeven, Andrew Lunn, Andrew Lunn, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Jonathan Corbet, Shuah Khan, Michael Fritscher, Byron Stanoszek,
	linux-kernel, netdev, linux-doc, linux-m68k
In-Reply-To: <c0c80113af470b265650405fa24deefe2d82ea24.camel@physik.fu-berlin.de>

Hi Adrian,

On Thu, 23 Apr 2026 at 16:10, John Paul Adrian Glaubitz
<glaubitz@physik.fu-berlin.de> wrote:
> > Macs do run modern kernels.
>
> Retrocomputing still is not well regarded by some maintainers, it seems :-(.

I've found bugs in drivers by plugging those things into exotic
hardware like my Amiga 4000 and Ultra5 [0].
So, it's not totally pointless. And having a shader capable Amiga[1]
is pretty cool.
Sad to see fun stuff getting pushed out by basically spam bots. :(

0 - https://lists.freedesktop.org/archives/amd-gfx/2025-October/132283.html
1 - https://gist.github.com/fifteenhex/0e5ce8c1614bcec20ed242045c11d1d9


* [PATCH net-next v6 0/3] net: stmmac: eic7700: fix EIC7700 eth1 RX sampling timing
From: lizhi2 @ 2026-04-23  8:55 UTC (permalink / raw)
  To: devicetree, andrew+netdev, davem, edumazet, kuba, robh, krzk+dt,
	conor+dt, netdev, pabeni, mcoquelin.stm32, alexandre.torgue,
	rmk+kernel, pjw, palmer, aou, alex, linux-riscv, linux-stm32,
	linux-arm-kernel, linux-kernel, maxime.chevallier
  Cc: ningyu, linmin, pinkesh.vaghela, pritesh.patel, weishangjuan,
	horms, Zhi Li

From: Zhi Li <lizhi2@eswincomputing.com>

v5 -> v6:
  - Update DTS/DTSI descriptions to fix invalid phandle references reported by DTC:
    - Add missing GMAC provider nodes required for proper hardware description:
      - HSP power domain: GMAC nodes moved under this domain to reflect
        hardware power hierarchy.
      - Clock nodes: added to provide clk phandles referenced by GMAC.
      - Reset nodes: added to provide reset phandles referenced by GMAC.
      - Pinctrl nodes: define pinctrl settings for GMAC signals
        (pinctrl_gpio106, pinctrl_gpio111).
    - Move GMAC nodes under the correct HSP power domain.
    - Ensure DTS builds without dtc errors and all phandle references
      (clk/reset/pinctrl/power-domain) are valid.
    - This update does not change runtime behavior; it only improves DTS
      consistency and resolves issues reported by dtc.

  - Note:
    - Patch 3/3 (the DTS changes) in this series provides an overview of the
      GMAC integration and its dependencies, as discussed previously:
      https://lore.kernel.org/lkml/64bf6b40-b947-4ffa-8d48-4d6341931327@lunn.ch/

    - It is **not intended for upstream inclusion** in its current form,
      and is provided solely for architecture overview and integration
      context.

    - A fully cleaned and upstream-ready DTS series will be submitted
      separately once all related components (pinctrl, clock, power-domain,
      etc.) are finalized.

  - dtbs_check has been run on top of net-next for reference purposes.
    Remaining warnings are expected due to missing EIC7700 clock binding[1]
    in net-next and do not reflect issues in the DTS design itself.

  - One remaining warning:
    - eswin,eic7700-clock

  - The clock binding has already been applied to upstream and is present
    in mainline, but not yet available in net-next.

  - The syscon binding is extended in this series to include the
    eswin,eic7700-syscfg compatible.

  - Any further refinement of the syscfg binding will be handled in
    separate patches if needed.

  - Dependencies:
    - [1]EIC7700 clock binding:
      https://lore.kernel.org/lkml/20260303080637.2100-1-dongxuyang@eswincomputing.com/
      (already applied to upstream)

  - Link to v5:
    https://lore.kernel.org/lkml/20260324073017.376-1-lizhi2@eswincomputing.com/

v4 -> v5:
  - eswin,eic7700-eth.yaml:
    - Add Acked-by from Conor Dooley
    - No functional changes

  - Update dwmac-eic7700.c:
    - Disable clocks on the error path to fix a clock leak in
      eic7700_dwmac_init() when regmap_set_bits() fails
      (reported by Simon Horman <horms@kernel.org>)

  - Link to v4:
    https://lore.kernel.org/lkml/20260313075234.1567-1-lizhi2@eswincomputing.com/

v3 -> v4:
  - Update eswin,eic7700-eth.yaml:
    - Improve commit message in dt-bindings patch to clarify the
      hardware difference of the eth1 MAC and why a new compatible
      string is required.
    - Move the newly added eswin,hsp-sp-csr item to the end of the list
      to avoid inserting entries in the middle of the binding schema.
    - Simplify the compatible schema by replacing the previous oneOf
      construct with an enum.

  - Update dwmac-eic7700.c:
    - Fix build issues.
    - Adjust code to match the updated binding definition.

  - Update DTS/DTSI descriptions:
    - Move SoC-level descriptions to the .dtsi file.
    - Keep board-specific configuration in the .dts file.

  - Link to v3:
    https://lore.kernel.org/lkml/20260303061525.846-1-lizhi2@eswincomputing.com/

v2 -> v3:
  - Update eswin,eic7700-eth.yaml:
    - Extend rx-internal-delay-ps and tx-internal-delay-ps range
      from 0-2400 to 0-2540 to match the full 7-bit hardware delay
      field (127 * 20 ps).
    - Add "multipleOf: 20" constraint to reflect the 20 ps hardware
      step size.
    - Make rx-internal-delay-ps and tx-internal-delay-ps optional.
      A well-designed board should not require internal delay tuning.
    - Remove rx-internal-delay-ps and tx-internal-delay-ps from the
      example to avoid encouraging blind copy into board DTs.

  - Update dwmac-eic7700.c:
    - Treat rx-internal-delay-ps and tx-internal-delay-ps as optional
      DT properties.
    - Apply delay configuration only when properties are present.
    - Keep TX/RX delay registers cleared by default to ensure a
      deterministic state when no delay is specified.

  - Describe Ethernet configuration for the HiFive Premier P550 board:
    - Add GMAC controller nodes for the HiFive Premier P550 board
      to describe the on-board Ethernet configuration.

      The Ethernet controller depends on clock, reset, pinctrl
      and HSP subsystem providers which are currently under
      upstream review. These dependent nodes will be submitted
      separately once the corresponding drivers are merged.

      Due to these missing dependencies, dt-binding-check may
      report warnings or failures for this series.

  - No functional changes to RX clock inversion logic.

  - Link to v2:
    https://lore.kernel.org/lkml/20260209094628.886-1-lizhi2@eswincomputing.com/

  - This series is based on the EIC7700 clock support series:
    https://lore.kernel.org/all/20260210095008.726-1-dongxuyang@eswincomputing.com/
    The clock series is currently under review.

v1 -> v2:
  - Update eswin,eic7700-eth.yaml:
    - Drop the vendor-specific properties eswin,rx-clk-invert and
      eswin,tx-clk-invert.
    - Introduce a distinct compatible string
      "eswin,eic7700-qos-eth-clk-inversion" to describe MAC instances that
      require internal RGMII clock inversion.
      This models the SoC-specific hardware difference directly via the
      compatible string and avoids per-board configuration properties.
    - Change rx-internal-delay-ps and tx-internal-delay-ps from enum to
      minimum/maximum to reflect the actual delay range (0-2400 ps)
    - Add reference to High-Speed Subsystem documentation in eswin,hsp-sp-csr
      description. The HSP CSR block is described in Chapter 10
      ("High-Speed Interface") of the EIC7700X SoC Technical Reference Manual,
      Part 4 (EIC7700X_SoC_Technical_Reference_Manual_Part4.pdf):
      https://github.com/eswincomputing/EIC7700X-SoC-Technical-Reference-Manual/releases

  - Update dwmac-eic7700.c:
    - Remove handling of eswin,rx-clk-invert and eswin,tx-clk-invert
      properties.
    - Select RX clock inversion based on the new
      "eswin,eic7700-qos-eth-clk-inversion" compatible string, using
      match data to apply the required configuration for affected MAC
      instances (eth1).

  - Link to v1:
    https://lore.kernel.org/lkml/20260109080601.1262-1-lizhi2@eswincomputing.com/

Zhi Li (3):
  dt-bindings: ethernet: eswin: add clock sampling control
  net: stmmac: eic7700: enable clocks before syscon access and correct
    RX sampling timing
  riscv: dts: eswin: eic7700-hifive-premier-p550: enable Ethernet
    controller

 .../devicetree/bindings/mfd/syscon.yaml       |   2 +
 .../bindings/net/eswin,eic7700-eth.yaml       |  69 ++++--
 .../dts/eswin/eic7700-hifive-premier-p550.dts | 232 ++++++++++++++++++
 arch/riscv/boot/dts/eswin/eic7700.dtsi        | 103 ++++++++
 .../ethernet/stmicro/stmmac/dwmac-eic7700.c   | 183 ++++++++++----
 5 files changed, 532 insertions(+), 57 deletions(-)

-- 
2.25.1



* [PATCH net-next v6 1/3] dt-bindings: ethernet: eswin: add clock sampling control
From: lizhi2 @ 2026-04-23  8:56 UTC (permalink / raw)
  To: devicetree, andrew+netdev, davem, edumazet, kuba, robh, krzk+dt,
	conor+dt, netdev, pabeni, mcoquelin.stm32, alexandre.torgue,
	rmk+kernel, pjw, palmer, aou, alex, linux-riscv, linux-stm32,
	linux-arm-kernel, linux-kernel, maxime.chevallier
  Cc: ningyu, linmin, pinkesh.vaghela, pritesh.patel, weishangjuan,
	horms, Zhi Li, Conor Dooley
In-Reply-To: <20260423085501.760-1-lizhi2@eswincomputing.com>

From: Zhi Li <lizhi2@eswincomputing.com>

Due to chip backend reasons, there is already an approximately 4-5 ns
skew between the RX clock and data of the eth1 MAC controller inside
the silicon.

For 1000M, the RX clock must be inverted since it is not possible to
meet the RGMII timing requirements using only rx-internal-delay-ps on
the MAC together with the standard 2 ns delay on the PHY. Therefore,
even on a properly designed board, eth1 still requires RX clock
inversion.

This behaviour effectively breaks the RGMII timing assumptions at the
SoC level.

For the TX path of eth1, there is also a skew between the TX clock
and data on the MAC controller inside the silicon. This skew happens
to be approximately 2 ns. Therefore, it can be considered that the
2 ns delay of TX is provided by the MAC, so the TX is compliant with
the RGMII standard.

For 10/100 operation, the approximately 4-5 ns skew in the chip does
not break the standard. The RGMII timing table (Section 3.3) specifies
that for 10/100 operation the maximum value is unspecified:
https://community.nxp.com/pwmxy87654/attachments/pwmxy87654/imx-processors/20655/1/RGMIIv2_0_final_hp.pdf

Due to the eth1 silicon behavior described above, a new compatible
string "eswin,eic7700-qos-eth-clk-inversion" is added to the device
tree. This allows the driver to handle the differences between eth1
and eth0 through dedicated logic.

The rx-internal-delay-ps and tx-internal-delay-ps properties now use
minimum and maximum constraints to reflect the actual hardware delay
range (0-2540 ps) applied in 20 ps steps. This relaxes the binding
validation compared to the previous enum-based definition and avoids
regressions for existing DTBs while keeping the same hardware limits.
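
As an illustration of these constraints, a minimal user-space sketch of
the picosecond-to-step conversion (the `DELAY_*` macros and
`delay_ps_to_steps` are hypothetical names, not the driver's; only the
20 ps step and the 7-bit field width come from the description above):

```c
#include <stdint.h>

#define DELAY_STEP_PS	20	/* hardware step size */
#define DELAY_MAX_STEPS	0x7F	/* 7-bit field, 127 steps */
#define DELAY_MAX_PS	(DELAY_MAX_STEPS * DELAY_STEP_PS)	/* 2540 ps */

/*
 * Hypothetical helper: convert a delay in picoseconds to a register step
 * count. Returns -1 if the value is not a multiple of the step size or
 * exceeds the maximum.
 */
static int delay_ps_to_steps(uint32_t delay_ps)
{
	if (delay_ps % DELAY_STEP_PS)
		return -1;
	if (delay_ps > DELAY_MAX_PS)
		return -1;
	return (int)(delay_ps / DELAY_STEP_PS);
}
```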

Treat the RX/TX internal delay properties as optional, board-specific
tuning knobs and remove them from the example to avoid encouraging
their use.

In addition, the binding now includes additional background information
about the HSP CSR registers accessed by the MAC. The TXD and RXD delay
control registers are included so the driver can explicitly clear any
residual configuration left by the bootloader.

Background reference for the High-Speed Subsystem and HSP CSR block is
available in Chapter 10 ("High-Speed Interface") of the EIC7700X SoC
Technical Reference Manual, Part 4
(EIC7700X_SoC_Technical_Reference_Manual_Part4.pdf):
https://github.com/eswincomputing/EIC7700X-SoC-Technical-Reference-Manual/releases

There are currently no in-tree users of the EIC7700 Ethernet driver, so
these changes are safe.

Fixes: 888bd0eca93c ("dt-bindings: ethernet: eswin: Document for EIC7700 SoC")
Signed-off-by: Zhi Li <lizhi2@eswincomputing.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
---
 .../bindings/net/eswin,eic7700-eth.yaml       | 69 +++++++++++++++----
 1 file changed, 55 insertions(+), 14 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/eswin,eic7700-eth.yaml b/Documentation/devicetree/bindings/net/eswin,eic7700-eth.yaml
index 91e8cd1db67b..0b27719feb7d 100644
--- a/Documentation/devicetree/bindings/net/eswin,eic7700-eth.yaml
+++ b/Documentation/devicetree/bindings/net/eswin,eic7700-eth.yaml
@@ -20,6 +20,7 @@ select:
       contains:
         enum:
           - eswin,eic7700-qos-eth
+          - eswin,eic7700-qos-eth-clk-inversion
   required:
     - compatible
 
@@ -29,7 +30,9 @@ allOf:
 properties:
   compatible:
     items:
-      - const: eswin,eic7700-qos-eth
+      - enum:
+          - eswin,eic7700-qos-eth
+          - eswin,eic7700-qos-eth-clk-inversion
       - const: snps,dwmac-5.20
 
   reg:
@@ -63,16 +66,29 @@ properties:
       - const: stmmaceth
 
   rx-internal-delay-ps:
-    enum: [0, 200, 600, 1200, 1600, 1800, 2000, 2200, 2400]
+    minimum: 0
+    maximum: 2540
+    multipleOf: 20
 
   tx-internal-delay-ps:
-    enum: [0, 200, 600, 1200, 1600, 1800, 2000, 2200, 2400]
+    minimum: 0
+    maximum: 2540
+    multipleOf: 20
 
   eswin,hsp-sp-csr:
     description:
       HSP CSR is to control and get status of different high-speed peripherals
       (such as Ethernet, USB, SATA, etc.) via register, which can tune
       board-level's parameters of PHY, etc.
+
+      Additional background information about the High-Speed Subsystem
+      and the HSP CSR block is available in Chapter 10 ("High-Speed Interface")
+      of the EIC7700X SoC Technical Reference Manual, Part 4
+      (EIC7700X_SoC_Technical_Reference_Manual_Part4.pdf). The manual is
+      publicly available at
+      https://github.com/eswincomputing/EIC7700X-SoC-Technical-Reference-Manual/releases
+
+      This reference is provided for background information only.
     $ref: /schemas/types.yaml#/definitions/phandle-array
     items:
       - items:
@@ -82,6 +98,8 @@ properties:
           - description: Offset of AXI clock controller Low-Power request
                          register
           - description: Offset of register controlling TX/RX clock delay
+          - description: Offset of register controlling TXD delay
+          - description: Offset of register controlling RXD delay
 
 required:
   - compatible
@@ -93,8 +111,6 @@ required:
   - phy-mode
   - resets
   - reset-names
-  - rx-internal-delay-ps
-  - tx-internal-delay-ps
   - eswin,hsp-sp-csr
 
 unevaluatedProperties: false
@@ -104,24 +120,49 @@ examples:
     ethernet@50400000 {
         compatible = "eswin,eic7700-qos-eth", "snps,dwmac-5.20";
         reg = <0x50400000 0x10000>;
+        interrupt-parent = <&plic>;
+        interrupts = <61>;
+        interrupt-names = "macirq";
         clocks = <&d0_clock 186>, <&d0_clock 171>, <&d0_clock 40>,
                 <&d0_clock 193>;
         clock-names = "axi", "cfg", "stmmaceth", "tx";
+        resets = <&reset 95>;
+        reset-names = "stmmaceth";
+        eswin,hsp-sp-csr = <&hsp_sp_csr 0x100 0x108 0x118 0x114 0x11c>;
+        phy-handle = <&gmac0_phy0>;
+        phy-mode = "rgmii-id";
+        snps,aal;
+        snps,fixed-burst;
+        snps,tso;
+        snps,axi-config = <&stmmac_axi_setup_gmac0>;
+
+        stmmac_axi_setup_gmac0: stmmac-axi-config {
+            snps,blen = <0 0 0 0 16 8 4>;
+            snps,rd_osr_lmt = <2>;
+            snps,wr_osr_lmt = <2>;
+        };
+    };
+
+    ethernet@50410000 {
+        compatible = "eswin,eic7700-qos-eth-clk-inversion", "snps,dwmac-5.20";
+        reg = <0x50410000 0x10000>;
         interrupt-parent = <&plic>;
-        interrupts = <61>;
+        interrupts = <70>;
         interrupt-names = "macirq";
-        phy-mode = "rgmii-id";
-        phy-handle = <&phy0>;
-        resets = <&reset 95>;
+        clocks = <&d0_clock 186>, <&d0_clock 171>, <&d0_clock 40>,
+                <&d0_clock 194>;
+        clock-names = "axi", "cfg", "stmmaceth", "tx";
+        resets = <&reset 94>;
         reset-names = "stmmaceth";
-        rx-internal-delay-ps = <200>;
-        tx-internal-delay-ps = <200>;
-        eswin,hsp-sp-csr = <&hsp_sp_csr 0x100 0x108 0x118>;
-        snps,axi-config = <&stmmac_axi_setup>;
+        eswin,hsp-sp-csr = <&hsp_sp_csr 0x200 0x208 0x218 0x214 0x21c>;
+        phy-handle = <&gmac1_phy0>;
+        phy-mode = "rgmii-id";
         snps,aal;
         snps,fixed-burst;
         snps,tso;
-        stmmac_axi_setup: stmmac-axi-config {
+        snps,axi-config = <&stmmac_axi_setup_gmac1>;
+
+        stmmac_axi_setup_gmac1: stmmac-axi-config {
             snps,blen = <0 0 0 0 16 8 4>;
             snps,rd_osr_lmt = <2>;
             snps,wr_osr_lmt = <2>;
-- 
2.25.1



* [PATCH net-next v6 2/3] net: stmmac: eic7700: enable clocks before syscon access and correct RX sampling timing
From: lizhi2 @ 2026-04-23  8:56 UTC (permalink / raw)
  To: devicetree, andrew+netdev, davem, edumazet, kuba, robh, krzk+dt,
	conor+dt, netdev, pabeni, mcoquelin.stm32, alexandre.torgue,
	rmk+kernel, pjw, palmer, aou, alex, linux-riscv, linux-stm32,
	linux-arm-kernel, linux-kernel, maxime.chevallier
  Cc: ningyu, linmin, pinkesh.vaghela, pritesh.patel, weishangjuan,
	horms, Zhi Li
In-Reply-To: <20260423085501.760-1-lizhi2@eswincomputing.com>

From: Zhi Li <lizhi2@eswincomputing.com>

The second Ethernet controller (eth1) on the Eswin EIC7700 SoC may fail
to sample RX data correctly at Gigabit speed due to EIC7700-specific
receive clock to data skew at the MAC input in the silicon.

The existing internal delay configuration does not provide sufficient
adjustment range to compensate for this condition at 1000Mbps.
Update the EIC7700 DWMAC glue driver to apply EIC7700-specific clock
sampling inversion only during Gigabit operation on MAC instances
that require it.

TXD and RXD delay registers are explicitly cleared during initialization
to override any residual configuration left by the bootloader. All HSP
CSR register accesses are performed only after the required clocks are
enabled.

Fixes: ea77dbbdbc4e ("net: stmmac: add Eswin EIC7700 glue driver")
Signed-off-by: Zhi Li <lizhi2@eswincomputing.com>
---
 .../ethernet/stmicro/stmmac/dwmac-eic7700.c   | 183 ++++++++++++++----
 1 file changed, 140 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c
index bcb8e000e720..33144611da8d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c
@@ -28,20 +28,40 @@
 
 /*
  * TX/RX Clock Delay Bit Masks:
- * - TX Delay: bits [14:8] — TX_CLK delay (unit: 0.1ns per bit)
- * - RX Delay: bits [30:24] — RX_CLK delay (unit: 0.1ns per bit)
+ * - TX Delay: bits [14:8] — TX_CLK delay (unit: 0.02ns per bit)
+ * - TX Invert : bit  [15]
+ * - RX Delay: bits [30:24] — RX_CLK delay (unit: 0.02ns per bit)
+ * - RX Invert : bit  [31]
  */
 #define EIC7700_ETH_TX_ADJ_DELAY	GENMASK(14, 8)
 #define EIC7700_ETH_RX_ADJ_DELAY	GENMASK(30, 24)
+#define EIC7700_ETH_TX_INV_DELAY	BIT(15)
+#define EIC7700_ETH_RX_INV_DELAY	BIT(31)
 
-#define EIC7700_MAX_DELAY_UNIT 0x7F
+#define EIC7700_MAX_DELAY_STEPS		0x7F
+#define EIC7700_DELAY_STEP_PS		20
+#define EIC7700_MAX_DELAY_PS	\
+	(EIC7700_MAX_DELAY_STEPS * EIC7700_DELAY_STEP_PS)
 
 static const char * const eic7700_clk_names[] = {
 	"tx", "axi", "cfg",
 };
 
+struct eic7700_dwmac_data {
+	bool rgmii_rx_clk_invert;
+};
+
 struct eic7700_qos_priv {
+	struct device *dev;
 	struct plat_stmmacenet_data *plat_dat;
+	struct regmap *eic7700_hsp_regmap;
+	u32 eth_axi_lp_ctrl_offset;
+	u32 eth_phy_ctrl_offset;
+	u32 eth_txd_offset;
+	u32 eth_clk_offset;
+	u32 eth_rxd_offset;
+	u32 eth_clk_dly_param;
+	bool eth_rx_clk_inv;
 };
 
 static int eic7700_clks_config(void *priv, bool enabled)
@@ -61,8 +81,28 @@ static int eic7700_clks_config(void *priv, bool enabled)
 static int eic7700_dwmac_init(struct device *dev, void *priv)
 {
 	struct eic7700_qos_priv *dwc = priv;
+	int ret;
+
+	ret = eic7700_clks_config(dwc, true);
+	if (ret)
+		return ret;
+
+	ret = regmap_set_bits(dwc->eic7700_hsp_regmap,
+			      dwc->eth_phy_ctrl_offset,
+			      EIC7700_ETH_TX_CLK_SEL |
+			      EIC7700_ETH_PHY_INTF_SELI);
+	if (ret) {
+		eic7700_clks_config(dwc, false);
+		return ret;
+	}
+
+	regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_axi_lp_ctrl_offset,
+		     EIC7700_ETH_CSYSREQ_VAL);
+
+	regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_txd_offset, 0);
+	regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_rxd_offset, 0);
 
-	return eic7700_clks_config(dwc, true);
+	return 0;
 }
 
 static void eic7700_dwmac_exit(struct device *dev, void *priv)
@@ -88,18 +128,35 @@ static int eic7700_dwmac_resume(struct device *dev, void *priv)
 	return ret;
 }
 
+static void eic7700_dwmac_fix_speed(void *priv, phy_interface_t interface,
+				    int speed, unsigned int mode)
+{
+	struct eic7700_qos_priv *dwc = (struct eic7700_qos_priv *)priv;
+	u32 dly_param = dwc->eth_clk_dly_param;
+
+	switch (speed) {
+	case SPEED_1000:
+		if (dwc->eth_rx_clk_inv)
+			dly_param |= EIC7700_ETH_RX_INV_DELAY;
+		break;
+	case SPEED_100:
+	case SPEED_10:
+		break;
+	default:
+		dev_err(dwc->dev, "invalid speed %u\n", speed);
+		break;
+	}
+
+	regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_clk_offset, dly_param);
+}
+
 static int eic7700_dwmac_probe(struct platform_device *pdev)
 {
+	const struct eic7700_dwmac_data *data;
 	struct plat_stmmacenet_data *plat_dat;
 	struct stmmac_resources stmmac_res;
 	struct eic7700_qos_priv *dwc_priv;
-	struct regmap *eic7700_hsp_regmap;
-	u32 eth_axi_lp_ctrl_offset;
-	u32 eth_phy_ctrl_offset;
-	u32 eth_phy_ctrl_regset;
-	u32 eth_rxd_dly_offset;
-	u32 eth_dly_param = 0;
-	u32 delay_ps;
+	u32 delay_ps, val;
 	int i, ret;
 
 	ret = stmmac_get_platform_resources(pdev, &stmmac_res);
@@ -116,70 +173,95 @@ static int eic7700_dwmac_probe(struct platform_device *pdev)
 	if (!dwc_priv)
 		return -ENOMEM;
 
+	dwc_priv->dev = &pdev->dev;
+
+	data = device_get_match_data(&pdev->dev);
+	if (!data)
+		return dev_err_probe(&pdev->dev,
+				     -EINVAL, "no match data found\n");
+
+	dwc_priv->eth_rx_clk_inv = data->rgmii_rx_clk_invert;
+
 	/* Read rx-internal-delay-ps and update rx_clk delay */
 	if (!of_property_read_u32(pdev->dev.of_node,
 				  "rx-internal-delay-ps", &delay_ps)) {
-		u32 val = min(delay_ps / 100, EIC7700_MAX_DELAY_UNIT);
+		if (delay_ps % EIC7700_DELAY_STEP_PS)
+			return dev_err_probe(&pdev->dev, -EINVAL,
+				"rx delay must be multiple of %dps\n",
+				EIC7700_DELAY_STEP_PS);
+
+		if (delay_ps > EIC7700_MAX_DELAY_PS)
+			return dev_err_probe(&pdev->dev, -EINVAL,
+				"rx delay out of range\n");
 
-		eth_dly_param &= ~EIC7700_ETH_RX_ADJ_DELAY;
-		eth_dly_param |= FIELD_PREP(EIC7700_ETH_RX_ADJ_DELAY, val);
-	} else {
-		return dev_err_probe(&pdev->dev, -EINVAL,
-			"missing required property rx-internal-delay-ps\n");
+		val = delay_ps / EIC7700_DELAY_STEP_PS;
+
+		dwc_priv->eth_clk_dly_param &= ~EIC7700_ETH_RX_ADJ_DELAY;
+		dwc_priv->eth_clk_dly_param |=
+				 FIELD_PREP(EIC7700_ETH_RX_ADJ_DELAY, val);
 	}
 
 	/* Read tx-internal-delay-ps and update tx_clk delay */
 	if (!of_property_read_u32(pdev->dev.of_node,
 				  "tx-internal-delay-ps", &delay_ps)) {
-		u32 val = min(delay_ps / 100, EIC7700_MAX_DELAY_UNIT);
+		if (delay_ps % EIC7700_DELAY_STEP_PS)
+			return dev_err_probe(&pdev->dev, -EINVAL,
+				"tx delay must be multiple of %dps\n",
+				EIC7700_DELAY_STEP_PS);
+
+		if (delay_ps > EIC7700_MAX_DELAY_PS)
+			return dev_err_probe(&pdev->dev, -EINVAL,
+				"tx delay out of range\n");
+
+		val = delay_ps / EIC7700_DELAY_STEP_PS;
 
-		eth_dly_param &= ~EIC7700_ETH_TX_ADJ_DELAY;
-		eth_dly_param |= FIELD_PREP(EIC7700_ETH_TX_ADJ_DELAY, val);
-	} else {
-		return dev_err_probe(&pdev->dev, -EINVAL,
-			"missing required property tx-internal-delay-ps\n");
+		dwc_priv->eth_clk_dly_param &= ~EIC7700_ETH_TX_ADJ_DELAY;
+		dwc_priv->eth_clk_dly_param |=
+				 FIELD_PREP(EIC7700_ETH_TX_ADJ_DELAY, val);
 	}
 
-	eic7700_hsp_regmap = syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
-							     "eswin,hsp-sp-csr");
-	if (IS_ERR(eic7700_hsp_regmap))
+	dwc_priv->eic7700_hsp_regmap =
+			syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
+							"eswin,hsp-sp-csr");
+	if (IS_ERR(dwc_priv->eic7700_hsp_regmap))
 		return dev_err_probe(&pdev->dev,
-				PTR_ERR(eic7700_hsp_regmap),
+				PTR_ERR(dwc_priv->eic7700_hsp_regmap),
 				"Failed to get hsp-sp-csr regmap\n");
 
 	ret = of_property_read_u32_index(pdev->dev.of_node,
 					 "eswin,hsp-sp-csr",
-					 1, &eth_phy_ctrl_offset);
+					 1, &dwc_priv->eth_phy_ctrl_offset);
 	if (ret)
 		return dev_err_probe(&pdev->dev, ret,
 				     "can't get eth_phy_ctrl_offset\n");
 
-	regmap_read(eic7700_hsp_regmap, eth_phy_ctrl_offset,
-		    &eth_phy_ctrl_regset);
-	eth_phy_ctrl_regset |=
-		(EIC7700_ETH_TX_CLK_SEL | EIC7700_ETH_PHY_INTF_SELI);
-	regmap_write(eic7700_hsp_regmap, eth_phy_ctrl_offset,
-		     eth_phy_ctrl_regset);
-
 	ret = of_property_read_u32_index(pdev->dev.of_node,
 					 "eswin,hsp-sp-csr",
-					 2, &eth_axi_lp_ctrl_offset);
+					 2, &dwc_priv->eth_axi_lp_ctrl_offset);
 	if (ret)
 		return dev_err_probe(&pdev->dev, ret,
 				     "can't get eth_axi_lp_ctrl_offset\n");
 
-	regmap_write(eic7700_hsp_regmap, eth_axi_lp_ctrl_offset,
-		     EIC7700_ETH_CSYSREQ_VAL);
+	ret = of_property_read_u32_index(pdev->dev.of_node,
+					 "eswin,hsp-sp-csr",
+					 3, &dwc_priv->eth_clk_offset);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret,
+				     "can't get eth_clk_offset\n");
 
 	ret = of_property_read_u32_index(pdev->dev.of_node,
 					 "eswin,hsp-sp-csr",
-					 3, &eth_rxd_dly_offset);
+					 4, &dwc_priv->eth_txd_offset);
 	if (ret)
 		return dev_err_probe(&pdev->dev, ret,
-				     "can't get eth_rxd_dly_offset\n");
+				     "can't get eth_txd_offset\n");
 
-	regmap_write(eic7700_hsp_regmap, eth_rxd_dly_offset,
-		     eth_dly_param);
+	ret = of_property_read_u32_index(pdev->dev.of_node,
+					 "eswin,hsp-sp-csr",
+					 5, &dwc_priv->eth_rxd_offset);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret,
+				     "can't get eth_rxd_offset\n");
 
 	plat_dat->num_clks = ARRAY_SIZE(eic7700_clk_names);
 	plat_dat->clks = devm_kcalloc(&pdev->dev,
@@ -208,12 +290,27 @@ static int eic7700_dwmac_probe(struct platform_device *pdev)
 	plat_dat->exit = eic7700_dwmac_exit;
 	plat_dat->suspend = eic7700_dwmac_suspend;
 	plat_dat->resume = eic7700_dwmac_resume;
+	plat_dat->fix_mac_speed = eic7700_dwmac_fix_speed;
 
 	return devm_stmmac_pltfr_probe(pdev, plat_dat, &stmmac_res);
 }
 
+static const struct eic7700_dwmac_data eic7700_dwmac_data = {
+	.rgmii_rx_clk_invert = false,
+};
+
+static const struct eic7700_dwmac_data eic7700_dwmac_data_clk_inversion = {
+	.rgmii_rx_clk_invert = true,
+};
+
 static const struct of_device_id eic7700_dwmac_match[] = {
-	{ .compatible = "eswin,eic7700-qos-eth" },
+	{	.compatible = "eswin,eic7700-qos-eth",
+		.data = &eic7700_dwmac_data,
+	},
+	{
+		.compatible = "eswin,eic7700-qos-eth-clk-inversion",
+		.data = &eic7700_dwmac_data_clk_inversion,
+	},
 	{ }
 };
 MODULE_DEVICE_TABLE(of, eic7700_dwmac_match);
-- 
2.25.1
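
For readers following the delay handling in the probe hunk above, the
check-then-convert step can be modelled in userspace roughly as below.
This is only a sketch: DELAY_STEP_PS, MAX_DELAY_PS and the field layout
passed to update_field() are hypothetical stand-ins, since the real
EIC7700_* definitions are not part of this hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values; the driver's EIC7700_* constants are not shown here. */
#define DELAY_STEP_PS	20u
#define MAX_DELAY_PS	2540u

/*
 * Convert a delay in picoseconds to hardware units.
 * Rejects values that are not a multiple of the step or exceed the
 * maximum, instead of silently clamping them.
 */
int delay_ps_to_units(uint32_t delay_ps, uint32_t *units)
{
	assert(units != NULL);

	if (delay_ps % DELAY_STEP_PS)
		return -EINVAL;
	if (delay_ps > MAX_DELAY_PS)
		return -EINVAL;

	*units = delay_ps / DELAY_STEP_PS;
	return 0;
}

/* Clear the field selected by mask, then place val at the given shift. */
uint32_t update_field(uint32_t reg, uint32_t mask, unsigned int shift,
		      uint32_t val)
{
	return (reg & ~mask) | ((val << shift) & mask);
}
```

The point of the model is the behavioural change: an out-of-range or
misaligned property now fails the probe, whereas the removed min() path
would have clamped it.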


^ permalink raw reply related

* [PATCH net-next v6 3/3] riscv: dts: eswin: eic7700-hifive-premier-p550: enable Ethernet controller
From: lizhi2 @ 2026-04-23  8:56 UTC (permalink / raw)
  To: devicetree, andrew+netdev, davem, edumazet, kuba, robh, krzk+dt,
	conor+dt, netdev, pabeni, mcoquelin.stm32, alexandre.torgue,
	rmk+kernel, pjw, palmer, aou, alex, linux-riscv, linux-stm32,
	linux-arm-kernel, linux-kernel, maxime.chevallier
  Cc: ningyu, linmin, pinkesh.vaghela, pritesh.patel, weishangjuan,
	horms, Zhi Li
In-Reply-To: <20260423085501.760-1-lizhi2@eswincomputing.com>

From: Zhi Li <lizhi2@eswincomputing.com>

Enable the on-board Gigabit Ethernet controllers (gmac0 and gmac1) on
the HiFive Premier P550 development board.

Signed-off-by: Zhi Li <lizhi2@eswincomputing.com>
---
 .../devicetree/bindings/mfd/syscon.yaml       |   2 +
 .../dts/eswin/eic7700-hifive-premier-p550.dts | 232 ++++++++++++++++++
 arch/riscv/boot/dts/eswin/eic7700.dtsi        | 103 ++++++++
 3 files changed, 337 insertions(+)

diff --git a/Documentation/devicetree/bindings/mfd/syscon.yaml b/Documentation/devicetree/bindings/mfd/syscon.yaml
index e57add2bacd3..89e90b3f12a9 100644
--- a/Documentation/devicetree/bindings/mfd/syscon.yaml
+++ b/Documentation/devicetree/bindings/mfd/syscon.yaml
@@ -61,6 +61,7 @@ select:
           - cirrus,ep7209-syscon2
           - cirrus,ep7209-syscon3
           - cnxt,cx92755-uc
+          - eswin,eic7700-syscfg
           - freecom,fsg-cs2-system-controller
           - fsl,imx93-aonmix-ns-syscfg
           - fsl,imx93-wakeupmix-syscfg
@@ -173,6 +174,7 @@ properties:
               - cirrus,ep7209-syscon2
               - cirrus,ep7209-syscon3
               - cnxt,cx92755-uc
+              - eswin,eic7700-syscfg
               - freecom,fsg-cs2-system-controller
               - fsl,imx93-aonmix-ns-syscfg
               - fsl,imx93-wakeupmix-syscfg
diff --git a/arch/riscv/boot/dts/eswin/eic7700-hifive-premier-p550.dts b/arch/riscv/boot/dts/eswin/eic7700-hifive-premier-p550.dts
index 131ed1fc6b2e..12e032dbe88d 100644
--- a/arch/riscv/boot/dts/eswin/eic7700-hifive-premier-p550.dts
+++ b/arch/riscv/boot/dts/eswin/eic7700-hifive-premier-p550.dts
@@ -13,11 +13,243 @@ / {
 
 	aliases {
 		serial0 = &uart0;
+		ethernet0 = &gmac0;
+		ethernet1 = &gmac1;
 	};
 
 	chosen {
 		stdout-path = "serial0:115200n8";
 	};
+
+	vcc_1v8: vcc1v8 {
+		compatible = "regulator-fixed";
+		regulator-name = "vcc1v8";
+		regulator-always-on;
+		regulator-boot-on;
+		regulator-min-microvolt = <1800000>;
+		regulator-max-microvolt = <1800000>;
+	};
+};
+
+&xtal24m {
+	clock-frequency = <24000000>;
+	clock-output-names = "xtal24m";
+};
+
+&pinctrl {
+	status = "okay";
+	vrgmii-supply = <&vcc_1v8>;
+
+	pinctrl_gpio0: gpio0-grp {
+		gpio0-pins {
+			pins = "gpio0";
+			function = "gpio";
+			input-enable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio5: gpio5-grp {
+		gpio5-pins {
+			pins = "gpio5";
+			function = "gpio";
+			input-enable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio11: gpio11-grp {
+		gpio11-pins {
+			pins = "gpio11";
+			function = "gpio";
+			input-enable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio14: gpio14-grp {
+		gpio14-pins {
+			pins = "mode_set1";
+			function = "gpio";
+			input-disable;
+			bias-pull-up;
+		};
+	};
+
+	pinctrl_gpio15: gpio15-grp {
+		gpio15-pins {
+			pins = "mode_set2";
+			function = "gpio";
+			input-enable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio28: gpio28-grp {
+		gpio28-pins {
+			pins = "gpio28";
+			function = "gpio";
+			input-enable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio43: gpio43-grp {
+		gpio43-pins {
+			pins = "usb1_pwren";
+			function = "gpio";
+			input-disable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio71: gpio71-grp {
+		gpio71-pins {
+			pins = "mipi_csi0_xhs";
+			function = "gpio";
+			input-disable;
+			bias-pull-up;
+		};
+	};
+
+	pinctrl_gpio74: gpio74-grp {
+		gpio74-pins {
+			pins = "mipi_csi1_xhs";
+			function = "gpio";
+			input-disable;
+			bias-pull-up;
+		};
+	};
+
+	pinctrl_gpio76: gpio76-grp {
+		gpio76-pins {
+			pins = "mipi_csi2_xvs";
+			function = "gpio";
+			input-disable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio77: gpio77-grp {
+		gpio77-pins {
+			pins = "mipi_csi2_xhs";
+			function = "gpio";
+			input-disable;
+			bias-pull-up;
+		};
+	};
+
+	pinctrl_gpio79: gpio79-grp {
+		gpio79-pins {
+			pins = "mipi_csi3_xvs";
+			function = "gpio";
+			input-disable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio80: gpio80-grp {
+		gpio80-pins {
+			pins = "mipi_csi3_xhs";
+			function = "gpio";
+			input-disable;
+			bias-pull-up;
+		};
+	};
+
+	pinctrl_gpio82: gpio82-grp {
+		gpio82-pins {
+			pins = "mipi_csi4_xvs";
+			function = "gpio";
+			input-disable;
+			bias-pull-up;
+		};
+	};
+
+	pinctrl_gpio84: gpio84-grp {
+		gpio84-pins {
+			pins = "mipi_csi4_mclk";
+			function = "gpio";
+			input-disable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio85: gpio85-grp {
+		gpio85-pins {
+			pins = "mipi_csi5_xvs";
+			function = "gpio";
+			input-disable;
+			bias-pull-up;
+		};
+	};
+
+	pinctrl_gpio94: gpio94-grp {
+		gpio94-pins {
+			pins = "s_mode";
+			function = "gpio";
+			input-disable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio106: gpio106-grp {
+		gpio106-pins {
+			pins = "gpio106";
+			function = "gpio";
+			input-disable;
+			bias-disable;
+		};
+	};
+
+	pinctrl_gpio111: gpio111-grp {
+		gpio111-pins {
+			pins = "gpio111";
+			function = "gpio";
+			input-disable;
+			bias-disable;
+		};
+	};
+};
+
+&gmac0 {
+	phy-handle = <&gmac0_phy0>;
+	phy-mode = "rgmii-id";
+	pinctrl-names = "default";
+	pinctrl-0 = <&pinctrl_gpio106>;
+	rx-internal-delay-ps = <20>;
+	tx-internal-delay-ps = <100>;
+	status = "okay";
+};
+
+&gmac0_mdio {
+	gmac0_phy0: ethernet-phy@0 {
+		compatible = "ethernet-phy-id001c.c916";
+		reg = <0>;
+		reset-gpios = <&gpioD 10 GPIO_ACTIVE_LOW>;
+		reset-assert-us = <10000>;
+		reset-deassert-us = <80000>;
+	};
+};
+
+&gmac1 {
+	phy-handle = <&gmac1_phy0>;
+	phy-mode = "rgmii-rxid";
+	pinctrl-names = "default";
+	pinctrl-0 = <&pinctrl_gpio111>;
+	rx-internal-delay-ps = <200>;
+	tx-internal-delay-ps = <200>;
+	status = "okay";
+};
+
+&gmac1_mdio {
+	gmac1_phy0: ethernet-phy@0 {
+		compatible = "ethernet-phy-id001c.c916";
+		reg = <0>;
+		reset-gpios = <&gpioD 15 GPIO_ACTIVE_LOW>;
+		reset-assert-us = <10000>;
+		reset-deassert-us = <80000>;
+	};
 };
 
 &uart0 {
diff --git a/arch/riscv/boot/dts/eswin/eic7700.dtsi b/arch/riscv/boot/dts/eswin/eic7700.dtsi
index c3ed93008bca..5690d4c6981b 100644
--- a/arch/riscv/boot/dts/eswin/eic7700.dtsi
+++ b/arch/riscv/boot/dts/eswin/eic7700.dtsi
@@ -5,6 +5,9 @@
 
 /dts-v1/;
 
+#include <dt-bindings/gpio/gpio.h>
+#include <dt-bindings/reset/eswin,eic7700-reset.h>
+
 / {
 	#address-cells = <2>;
 	#size-cells = <2>;
@@ -202,6 +205,11 @@ pmu {
 				<0x00000000 0x0000000f 0xfffffffc 0x000000ff 0x00000078>;
 	};
 
+	xtal24m: oscillator {
+		compatible = "fixed-clock";
+		#clock-cells = <0>;
+	};
+
 	soc {
 		compatible = "simple-bus";
 		ranges;
@@ -245,6 +253,83 @@ plic: interrupt-controller@c000000 {
 			#interrupt-cells = <1>;
 		};
 
+		hsp_power_domain: bus@50400000 {
+			compatible = "simple-pm-bus";
+			ranges;
+			clocks = <&clk 171>;
+			#address-cells = <2>;
+			#size-cells = <2>;
+
+			hsp_sp_csr: hsp-sp-top-csr@50440000 {
+				compatible = "eswin,eic7700-syscfg", "syscon";
+				reg = <0x0 0x50440000 0x0 0x2000>;
+			};
+
+			gmac0: ethernet@50400000 {
+				compatible = "eswin,eic7700-qos-eth", "snps,dwmac-5.20";
+				reg = <0x0 0x50400000 0x0 0x10000>;
+				interrupts = <61>;
+				interrupt-names = "macirq";
+				clocks = <&clk 186>,
+					 <&clk 171>,
+					 <&clk 40>,
+					 <&clk 193>;
+				clock-names = "axi", "cfg", "stmmaceth", "tx";
+				resets = <&reset EIC7700_RESET_HSP_ETH0_ARST>;
+				reset-names = "stmmaceth";
+				eswin,hsp-sp-csr = <&hsp_sp_csr 0x100 0x108 0x118 0x114 0x11c>;
+				snps,aal;
+				snps,fixed-burst;
+				snps,tso;
+				snps,axi-config = <&stmmac_axi_setup_gmac0>;
+				status = "disabled";
+
+				gmac0_mdio: mdio {
+					compatible = "snps,dwmac-mdio";
+					#address-cells = <1>;
+					#size-cells = <0>;
+				};
+
+				stmmac_axi_setup_gmac0: stmmac-axi-config {
+					snps,blen = <0 0 0 0 16 8 4>;
+					snps,rd_osr_lmt = <2>;
+					snps,wr_osr_lmt = <2>;
+				};
+			};
+
+			gmac1: ethernet@50410000 {
+				compatible = "eswin,eic7700-qos-eth-clk-inversion", "snps,dwmac-5.20";
+				reg = <0x0 0x50410000 0x0 0x10000>;
+				interrupts = <70>;
+				interrupt-names = "macirq";
+				clocks = <&clk 186>,
+					 <&clk 171>,
+					 <&clk 40>,
+					 <&clk 194>;
+				clock-names = "axi", "cfg", "stmmaceth", "tx";
+				resets = <&reset EIC7700_RESET_HSP_ETH1_ARST>;
+				reset-names = "stmmaceth";
+				eswin,hsp-sp-csr = <&hsp_sp_csr 0x200 0x208 0x218 0x214 0x21c>;
+				snps,aal;
+				snps,fixed-burst;
+				snps,tso;
+				snps,axi-config = <&stmmac_axi_setup_gmac1>;
+				status = "disabled";
+
+				gmac1_mdio: mdio {
+					compatible = "snps,dwmac-mdio";
+					#address-cells = <1>;
+					#size-cells = <0>;
+				};
+
+				stmmac_axi_setup_gmac1: stmmac-axi-config {
+					snps,blen = <0 0 0 0 16 8 4>;
+					snps,rd_osr_lmt = <2>;
+					snps,wr_osr_lmt = <2>;
+				};
+			};
+		};
+
 		uart0: serial@50900000 {
 			compatible = "snps,dw-apb-uart";
 			reg = <0x0 0x50900000 0x0 0x10000>;
@@ -341,5 +426,23 @@ gpioD: gpio-port@3 {
 				#gpio-cells = <2>;
 			};
 		};
+
+		pinctrl: pinctrl@51600080 {
+			compatible = "eswin,eic7700-pinctrl";
+			reg = <0x0 0x51600080 0x0 0x1fff80>;
+		};
+
+		clk: clock-controller@51828000 {
+			compatible = "eswin,eic7700-clock";
+			reg = <0x0 0x51828000 0x0 0x300>;
+			clocks = <&xtal24m>;
+			#clock-cells = <1>;
+		};
+
+		reset: reset-controller@51828300 {
+			compatible = "eswin,eic7700-reset";
+			reg = <0x0 0x51828300 0x0 0x200>;
+			#reset-cells = <1>;
+		};
 	};
 };
-- 
2.25.1


^ permalink raw reply related

* Re: [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
From: Kory Maincent @ 2026-04-23  9:05 UTC (permalink / raw)
  To: Corey Leavitt via B4 Relay
  Cc: corey, Oleksij Rempel, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Heiner Kallweit, Russell King,
	Andrew Lunn, netdev, linux-kernel
In-Reply-To: <20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info>

Hello Corey,

On Thu, 23 Apr 2026 01:42:13 -0600
Corey Leavitt via B4 Relay <devnull+corey.leavitt.info@kernel.org> wrote:

> On systems where a PSE controller driver loads as a module and a
> device-tree PHY node carries a `pses = <&pse_pi>` reference,
> fwnode_mdiobus_register_phy() tries to resolve the PSE handle before
> the controller driver has probed. of_pse_control_get() returns
> -EPROBE_DEFER, the enclosing MDIO/DSA probe fails, and driver-core
> re-queues the work. The retry loop spins until the PSE driver module
> loads and its controller registers.

I will take a look at your series, but FYI there was already an RFC series
tackling this issue:
https://lore.kernel.org/lkml/20260330132952.2950531-4-github@szelinsky.de/

It raised a debate and there is currently no final solution.
 
> Commit fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse
> control") made each retry expensive. It reordered
> fwnode_mdiobus_register_phy() so the PHY is registered before the
> PSE lookup. Every deferral now performs a full
> phy_device_register() / phy_device_remove() cycle. On a board with a
> sufficiently tight watchdog the retry loop can starve the watchdog
> kthread. On the reporting hardware (MT7621 + gpio-wdt, 1-second
> margin) the retry loop converts a slow probe phase into a reset
> before userspace loads.
> 
> The affected population today looks small. OpenWrt, where PSE
> actually ships, is still on 6.12 (pre-regression), and most
> environments with CONFIG_PSE_*=m do not have boards whose DT
> references a PSE controller from a PHY. Still, the mechanism is
> general. Any modular PSE driver combined with the documented
> `pses = <&...>` binding reproduces the retry loop. Whether it
> reaches brick-grade or merely slow/flaky boot depends on local
> watchdog timing. More exposure is expected as distribution and
> embedded kernels move to 6.13 and later.
> 
> The narrow fix would be to partially revert the ordering in
> fa2f0454174c so each defer is cheap again. That keeps the same
> architecture (fwnode_mdio holding PSE knowledge, -EPROBE_DEFER
> flowing across the subsystem boundary), and any future reorder
> reintroduces the same class of bug. This series takes the larger
> fix: decouple PSE controller lookup from MDIO registration entirely.
> pse_core now publishes a BLOCKING_NOTIFIER chain with REGISTERED
> and UNREGISTERED events. phy_device subscribes, owns phydev->psec
> lifetime, and attaches PSE handles in response to controller
> lifecycle rather than during probe. fwnode_mdio loses its PSE
> awareness, and -EPROBE_DEFER no longer flows out of fwnode_mdio.
> 
> Patch breakdown:
> 
>   1. Scope the pse_control regulator handle to kref lifetime
>      (Fixes: d83e13761d5b). A latent bug that patch 4 makes
>      reachable.
>   2. Add the notifier chain (enum, head, register/unregister
>      helpers). Pure infrastructure. No subscribers yet, no
>      observable change.
>   3. Fire REGISTERED and UNREGISTERED events from the controller
>      register/unregister paths. Still no subscribers, still no
>      observable change.
>   4. Subscribe from the PHY layer, take ownership of phydev->psec
>      via the notifier, and remove fwnode_find_pse_control() from
>      fwnode_mdio.
> 
> Patch 1 is bundled here per stable-kernel-rules section 4
> reachability guidance. On mainline today, with no notifier
> subscriber, no caller drives the dangling regulator-handle sequence.
> Patches 2 and 3 are deliberately split to separate "add
> infrastructure" from "wire it up". Happy to fold them if maintainers
> prefer the combined form.
> 
> Validated on a Cudy C200P (MT7621 + IP804AR) running an OpenWrt
> build of 6.18.21 with the series applied. A lockdep build
> (CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP) shows no splats
> from the series' code paths during boot, PHY attach, PHY detach, or
> a full controller unbind/rebind cycle. ethtool --set-pse drives all
> four PoE-capable LAN ports, and a Ruckus H510 class-4 PD plugged
> into lan3 negotiates and receives 48 V.
> 
> The C200P has no SFP cage, so the SFP path change in sfp.c
> (phy_device_register -> phy_device_register_locked) isn't exercised
> on the bench. Verified by call-graph audit: every path reaching
> sfp_sm_probe_phy() holds rtnl at entry, via sfp_timeout,
> sfp_check_state, sfp_probe, sfp_remove, or
> sfp_bus_{add,del}_upstream.
> 
> Not addressed by this series: ethtool --show-pse returns "No data
> available" on DSA netdevs in 6.18, because dev->phydev is NULL for
> DSA-frontend netdevs and ethnl_req_get_phydev() therefore returns
> NULL. That's a DSA / ethtool integration quirk that predates this
> work.
> 
> Sending as RFC because this is my first net-next series. I'd
> appreciate maintainer guidance on whether patch 1 should go to net
> rather than net-next, and whether the patch 2/3 split is preferred
> to the combined form.
> 
> Signed-off-by: Corey Leavitt <corey@leavitt.info>
> ---
> Corey Leavitt (4):
>       net: pse-pd: scope pse_control regulator handle to kref lifetime
>       net: pse-pd: add notifier chain for controller lifecycle events
>       net: pse-pd: fire lifecycle events on controller register/unregister
>       net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook
> 
>  drivers/net/mdio/fwnode_mdio.c |  34 ----------
>  drivers/net/phy/phy_device.c   | 144 ++++++++++++++++++++++++++++++++++++++---
>  drivers/net/phy/sfp.c          |   2 +-
>  drivers/net/pse-pd/pse_core.c  |  60 ++++++++++++++++-
>  include/linux/phy.h            |   2 +
>  include/linux/pse-pd/pse.h     |  41 ++++++++++++
>  6 files changed, 236 insertions(+), 47 deletions(-)
> ---
> base-commit: 1f5ffc672165ff851063a5fd044b727ab2517ae3
> change-id: 20260422-pse-notifier-decouple-efa80d77f4be
> 
> Best regards,
> --  
> Corey Leavitt <corey@leavitt.info>
> 
> 
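
For reference while reviewing, the notifier-based decoupling described in
the cover letter can be modelled in plain C roughly as follows. This is a
minimal sketch and not the series' code: every name in it is hypothetical,
there is no locking (a kernel blocking notifier chain is serialized by a
semaphore), and it does not cover a PHY that subscribes after the
controller has already registered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events, modelled on REGISTERED/UNREGISTERED above. */
enum pse_event { PSE_EV_REGISTERED, PSE_EV_UNREGISTERED };

struct notifier {
	void (*call)(struct notifier *nb, enum pse_event ev, int controller_id);
	struct notifier *next;
};

static struct notifier *chain_head;

/* Add a subscriber at the head of the chain. */
void chain_register(struct notifier *nb)
{
	assert(nb != NULL && nb->call != NULL);
	nb->next = chain_head;
	chain_head = nb;
}

/* Unlink a subscriber; a no-op if it is not on the chain. */
void chain_unregister(struct notifier *nb)
{
	struct notifier **pp;

	for (pp = &chain_head; *pp; pp = &(*pp)->next) {
		if (*pp == nb) {
			*pp = nb->next;
			nb->next = NULL;
			return;
		}
	}
}

/* Publish an event to every subscriber. */
void chain_call(enum pse_event ev, int controller_id)
{
	struct notifier *nb;

	for (nb = chain_head; nb; nb = nb->next)
		nb->call(nb, ev, controller_id);
}

/* Model PHY that wants one specific controller. */
struct phy_model {
	struct notifier nb;	/* first member, so the cast below is valid */
	int wanted_id;
	int attached;
};

/* Attach on REGISTERED, detach on UNREGISTERED, ignore other controllers. */
void phy_pse_event(struct notifier *nb, enum pse_event ev, int controller_id)
{
	struct phy_model *phy = (struct phy_model *)nb;

	if (controller_id != phy->wanted_id)
		return;
	phy->attached = (ev == PSE_EV_REGISTERED);
}
```

In this model the PHY never returns an error to its own probe path when
the controller is missing; it simply stays unattached until the event
arrives, which is the property the cover letter is after.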



-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

^ permalink raw reply

* Re: [PATCH net v2 12/15] drivers: net: 8390: AX88190: Remove this driver
From: Bjørn Mork @ 2026-04-23  9:06 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jonathan Corbet, Shuah Khan,
	Geert Uytterhoeven, Michael Fritscher, Byron Stanoszek,
	Daniel Palmer, linux-kernel, netdev, linux-doc
In-Reply-To: <20260422-v7-0-0-net-next-driver-removal-v1-v2-12-08a5b59784d5@lunn.ch>

Andrew Lunn <andrew@lunn.ch> writes:

> The ax88190 was written by David A. Hindsh

Hinds

^ permalink raw reply

* Re: [PATCH] net/stmmac: Fix typos: 'tx_undeflow_irq' -> 'tx_underflow_irq'
From: Jakub Raczynski @ 2026-04-23  9:08 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, linux-kernel, kuba, davem, andrew+netdev, kernel-janitors,
	linux-arm-kernel, linux-stm32
In-Reply-To: <52b06f0a-8283-4903-9d8a-2bbdf637dd5d@lunn.ch>

[-- Attachment #1: Type: text/plain, Size: 2797 bytes --]

On Wed, Apr 22, 2026 at 06:15:20PM +0200, Andrew Lunn wrote:
> On Wed, Apr 22, 2026 at 04:15:37PM +0200, Jakub Raczynski wrote:
> > On Wed, Apr 22, 2026 at 02:47:38PM +0200, Andrew Lunn wrote:
> > > > I don't see anything wrong with it?
> > > > - naming is correct, same as stmmac_extra_stats from common.h, as it
> > > >   wouldn't compile otherwise
> > > > - string length is ok, as max name length is ETH_GSTRING_LEN=32 and it is
> > > >   not close
> > > > - ethtool just polls data from driver and in my tests it is ok
> > > > - all instances of 'undeflow' are changed
> > > > - 'underflow' semantic is ok, 'undeflow' is just not correct
> > > > 
> > > > Please correct me if I am wrong, but imo no issues with this patch.
> > > 
> > > ABI
> > > 
> > > This name is published as part of the kAPI. You are changing its
> > > name. User space could be looking for this name, even though it has a
> > > typo in it.
> > > 
> > >      Andrew
> > >
> > I don't think it is? This is part of the extra stats (struct
> > stmmac_extra_stats) and is not part of the standard ABI from
> > Documentation/ABI/testing/sysfs-class-net-statistics
> > nor is mentioned in
> > Documentation/networking/device_drivers/ethernet/stmicro/stmmac.rst
> > 
> > These extra stats are specific to stmmac driver and most of these are more
> > than standard
> > https://www.kernel.org/doc/html/v7.0/networking/statistics.html#c.rtnl_link_stats64
> > This name does not exist outside stmmac driver, so while some application may
> > expect this (stmmac specific app), question is should this typo stick?
> 
> 47dd7a540b8a0 drivers/net/stmmac/stmmac_ethtool.c                  (Giuseppe Cavallaro      2009-10-14 15:13:45 -0700   81)     STMMAC_STAT(tx_undeflow_irq),
> 
> It has been exposed to user space for 17 years. In that time, there
> could well be stmmac specific apps using it.
> 
> Just because it is not documented as ABI does not make it not ABI.
> 
>      Andrew
>

Sure, it is up to you whether to NAK or ACK this change.

IMO this name is specific to stmmac and should not be part of any app,
as monitoring tools should be more universal. When monitoring an interface,
the condition will show up some other way, e.g. via dropped packets, and
only then would you use driver-specific fields for debugging.

The problem is that a quick search on GitHub shows this typo has propagated
through hundreds of Linux forks and various RTOSes. I found no public app
using it, at least no C app (but well, I didn't browse everything, for
obvious reasons). Funny how a typo can live everywhere and never be fixed.
So this change would make mainline differ from all the forks/RTOSes that
will probably never fix it. That is the downside.

The question is whether this should then remain that way forever.
And was it really part of some ABI if no one noticed?
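
To make the compatibility concern concrete, here is a minimal sketch of how
a userspace tool might look a counter up by its string name. It is
hypothetical: no such tool was found, STAT_NAME_LEN just mirrors the 32-byte
ETH_GSTRING_LEN limit mentioned earlier, and the string tables stand in for
whatever the driver reports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STAT_NAME_LEN 32	/* same value as ETH_GSTRING_LEN */

/* Return the index of name in a table of fixed-width names, or -1. */
int find_stat(const char names[][STAT_NAME_LEN], int n, const char *name)
{
	int i;

	assert(names != NULL && name != NULL);

	for (i = 0; i < n; i++)
		if (strncmp(names[i], name, STAT_NAME_LEN) == 0)
			return i;
	return -1;
}

/* Tolerant lookup: try the corrected spelling, then the historical one. */
int find_tx_underflow(const char names[][STAT_NAME_LEN], int n)
{
	int i = find_stat(names, n, "tx_underflow_irq");

	return i >= 0 ? i : find_stat(names, n, "tx_undeflow_irq");
}
```

A tool written like find_stat() with the old spelling hard-coded stops
finding the counter after the rename; one written like find_tx_underflow()
works on both old and new kernels.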

Regards
Jakub Raczynski



^ permalink raw reply

* Re: [PATCH net v2 13/15] drivers: net: 8390: pcnet: Remove this driver
From: Bjørn Mork @ 2026-04-23  9:09 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Jonathan Corbet, Shuah Khan,
	Geert Uytterhoeven, Michael Fritscher, Byron Stanoszek,
	Daniel Palmer, linux-kernel, netdev, linux-doc
In-Reply-To: <20260422-v7-0-0-net-next-driver-removal-v1-v2-13-08a5b59784d5@lunn.ch>

Andrew Lunn <andrew@lunn.ch> writes:

> The pcnet was written by David A. Hindsh

Hinds again

^ permalink raw reply

* Re: [RFC PATCH bpf-next v6 11/12] selftests/bpf: Add test for memcg_bpf_ops hierarchies
From: XIAO WU @ 2026-04-23  9:15 UTC (permalink / raw)
  To: bot+bpf-ci
  Cc: a.s.protopopov, akpm, ameryhung, andrii, ast, bpf, brauner,
	brgerst, cgroups, chenridong, clm, daniel, davem, eddyz87,
	geliang, hannes, haoluo, hawk, hui.zhu, ihor.solodrai,
	inwardvessel, jeffxu, jiayuan.chen, john.fastabend, jolsa, kees,
	kernel, kerneljasonxing, kpsingh, kuba, lance.yang, linux-kernel,
	linux-kselftest, linux-mm, martin.lau, martin.lau, masahiroy,
	mhocko, mkoutny, muchun.song, nathan, netdev, ojeda,
	paul.chaignon, peterz, rdunlap, roman.gushchin, sdf, shakeel.butt,
	shuah, song, tj, willemb, yonghong.song, zhuhui
In-Reply-To: <958ccd923342ddd02e9122381d51319cb125ec51d601bb6fcad57531a2f5ef57@mail.kernel.org>

Hi,

> +cleanup:
> +	bpf_link__destroy(link1);
> +	bpf_link__destroy(link2);
> +	bpf_link__destroy(link3);
> +	memcg_ops__detach(skel);
> +	memcg_ops__destroy(skel);
>
> Can this crash if skel is NULL?

Yes, this is a valid bug in the selftest cleanup path.

If execution jumps to cleanup before memcg_ops__open_and_load()
succeeds, skel remains NULL. In that case, memcg_ops__detach(skel)
dereferences NULL through obj->skeleton in the generated detach helper,
as you pointed out.

This is also inconsistent with nearby tests in the same file that
already do:

if (skel) {
    memcg_ops__detach(skel);
    memcg_ops__destroy(skel);
}

The C repro, modeling the same control flow:

--8<--
// SPDX-License-Identifier: GPL-2.0
// PoC for cleanup-path NULL dereference in test_memcg_ops_hierarchies().

#include <stdio.h>

struct bpf_object_skeleton {
    int dummy;
};

struct memcg_ops {
    struct bpf_object_skeleton *skeleton;
};

__attribute__((noinline))
static void bpf_object__detach_skeleton(struct bpf_object_skeleton *s)
{
    (void)s;
}

/* Matches generated skeleton helper shape from review mail. */
static inline void memcg_ops__detach(struct memcg_ops *obj)
{
    bpf_object__detach_skeleton(obj->skeleton);
}

static int setup_cgroup_environment_fail(void)
{
    return -1;
}

int main(void)
{
    int ret;
    struct memcg_ops *skel = NULL;

    fprintf(stderr, "[*] trigger cleanup with skel == NULL\n");

    /* Simulate early failure before open_and_load() assigns skel. */
    ret = setup_cgroup_environment_fail();
    if (ret)
        goto cleanup;

cleanup:
    /* Same problematic call pattern as in the test cleanup block. */
    memcg_ops__detach(skel);

    return 0;
}
--8<--
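
For comparison, a guarded variant of the same model could look like the
sketch below. It is an illustration only: model_detach() and
cleanup_guarded() are made-up names for this mail, not the generated
skeleton helpers, and the real fix is simply the if (skel) check shown
earlier.

```c
#include <assert.h>
#include <stddef.h>

struct bpf_object_skeleton {
	int detached;
};

struct memcg_ops {
	struct bpf_object_skeleton *skeleton;
};

/* Stand-in for the generated detach helper: dereferences obj. */
void model_detach(struct memcg_ops *obj)
{
	assert(obj != NULL);
	obj->skeleton->detached = 1;
}

/*
 * Cleanup that tolerates an early failure.
 * Returns 1 if detach ran, 0 if it was skipped because skel is NULL.
 */
int cleanup_guarded(struct memcg_ops *skel)
{
	if (!skel)
		return 0;

	model_detach(skel);
	return 1;
}
```

With the guard in place, jumping to cleanup before open_and_load() has
assigned skel no longer reaches the dereference.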


Signed-off-by: XIAO WU <shawdoxwu@gmail.com>

Thanks,

xiao

^ permalink raw reply
