* [PATCH net-next] net: phy: fix regression with AX88772A PHY driver
@ 2023-09-18 13:25 Russell King (Oracle)
2023-09-18 13:49 ` Andrew Lunn
2023-09-19 15:10 ` patchwork-bot+netdevbpf
0 siblings, 2 replies; 5+ messages in thread
From: Russell King (Oracle) @ 2023-09-18 13:25 UTC
To: Andrew Lunn, Heiner Kallweit
Cc: Marek Szyprowski, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Florian Fainelli, netdev
Marek reports that a deadlock occurs with the AX88772A PHY used with the
ASIX USB network driver:
asix 1-1.4:1.0 (unnamed net_device) (uninitialized): PHY [usb-001:003:10] driver [Asix Electronics AX88772A] (irq=POLL)
Asix Electronics AX88772A usb-001:003:10: attached PHY driver(mii_bus:phy_addr=usb-001:003:10, irq=POLL)
asix 1-1.4:1.0 eth0: register 'asix' at usb-12110000.usb-1.4, ASIX AX88772 USB 2.0 Ethernet, a2:99:b6:cd:11:eb
asix 1-1.4:1.0 eth0: configuring for phy/internal link mode
============================================
WARNING: possible recursive locking detected
6.6.0-rc1-00239-g8da77df649c4-dirty #13949 Not tainted
--------------------------------------------
kworker/3:3/71 is trying to acquire lock:
c6c704cc (&dev->lock){+.+.}-{3:3}, at: phy_start_aneg+0x1c/0x38
but task is already holding lock:
c6c704cc (&dev->lock){+.+.}-{3:3}, at: phy_state_machine+0x100/0x2b8
This is because we now consistently call phy_process_state_change()
while holding phydev->lock, but the AX88772A PHY driver's link change
notifier then goes on to call phy_start_aneg(), which tries to take the
same lock, causing a deadlock.
Fix this by exporting the unlocked version, _phy_start_aneg(), and
using it in the PHY driver instead.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: ef113a60d0a9 ("net: phy: call phy_error_precise() while holding the lock")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
Reviewing the other PHY drivers, no others appear impacted, just this
one.
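As a rough userspace illustration of the locked/unlocked pairing this
patch relies on (not the phylib code itself), the sketch below uses a
pthread error-checking mutex so that a recursive lock attempt returns
EDEADLK rather than hanging. struct fake_phy, the fake_*() functions,
notify_broken(), notify_fixed() and run_demo() are all hypothetical
stand-ins invented for this example.

```c
/* Userspace sketch only: hypothetical names, not the kernel's phylib code. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct fake_phy {
	pthread_mutex_t lock;	/* stands in for phydev->lock */
	int aneg_starts;	/* counts successful "start aneg" calls */
	/* stands in for the driver's link change callback */
	int (*link_change_notify)(struct fake_phy *phy);
};

/* Unlocked variant: the caller must already hold phy->lock. */
static int _fake_start_aneg(struct fake_phy *phy)
{
	phy->aneg_starts++;
	return 0;
}

/* Locked wrapper: takes phy->lock itself, so it must not be called
 * from a context that already holds that lock.
 */
static int fake_start_aneg(struct fake_phy *phy)
{
	int err = pthread_mutex_lock(&phy->lock);

	if (err)
		return -err;	/* -EDEADLK if this thread already holds it */
	err = _fake_start_aneg(phy);
	pthread_mutex_unlock(&phy->lock);
	return err;
}

/* Stands in for the state machine: runs the callback with the lock held. */
static int fake_state_machine(struct fake_phy *phy)
{
	int err;

	pthread_mutex_lock(&phy->lock);
	err = phy->link_change_notify(phy);
	pthread_mutex_unlock(&phy->lock);
	return err;
}

/* Callback as before the fix: calls the locking wrapper under the lock. */
static int notify_broken(struct fake_phy *phy)
{
	return fake_start_aneg(phy);
}

/* Callback as after the fix: calls the unlocked variant instead. */
static int notify_fixed(struct fake_phy *phy)
{
	return _fake_start_aneg(phy);
}

static void fake_phy_init(struct fake_phy *phy,
			  int (*notify)(struct fake_phy *phy))
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	/* Error-checking type: relocking by the owner fails with EDEADLK. */
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&phy->lock, &attr);
	assert(rc == 0);
	(void)rc;
	pthread_mutexattr_destroy(&attr);
	phy->aneg_starts = 0;
	phy->link_change_notify = notify;
}

/* Returns 0 if both scenarios behave as described above. */
int run_demo(void)
{
	struct fake_phy phy;

	fake_phy_init(&phy, notify_broken);
	if (fake_state_machine(&phy) != -EDEADLK || phy.aneg_starts != 0)
		return 1;
	pthread_mutex_destroy(&phy.lock);

	fake_phy_init(&phy, notify_fixed);
	if (fake_state_machine(&phy) != 0 || phy.aneg_starts != 1)
		return 2;
	pthread_mutex_destroy(&phy.lock);
	return 0;
}
```

Calling run_demo() from a small main() and checking for a zero return
exercises both callbacks. The kernel mutex has no such error return, so
there the same mistake shows up as the lockdep warning quoted above.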
drivers/net/phy/ax88796b.c | 2 +-
drivers/net/phy/phy.c | 3 ++-
include/linux/phy.h | 1 +
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/ax88796b.c b/drivers/net/phy/ax88796b.c
index 0f1e617a26c9..eb74a8cf8df1 100644
--- a/drivers/net/phy/ax88796b.c
+++ b/drivers/net/phy/ax88796b.c
@@ -90,7 +90,7 @@ static void asix_ax88772a_link_change_notify(struct phy_device *phydev)
 	 */
 	if (phydev->state == PHY_NOLINK) {
 		phy_init_hw(phydev);
-		phy_start_aneg(phydev);
+		_phy_start_aneg(phydev);
 	}
 }
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 93a8676dd8d8..a5fa077650e8 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -981,7 +981,7 @@ static int phy_check_link_status(struct phy_device *phydev)
  * If the PHYCONTROL Layer is operating, we change the state to
  * reflect the beginning of Auto-negotiation or forcing.
  */
-static int _phy_start_aneg(struct phy_device *phydev)
+int _phy_start_aneg(struct phy_device *phydev)
 {
 	int err;
@@ -1002,6 +1002,7 @@ static int _phy_start_aneg(struct phy_device *phydev)
 	return err;
 }
+EXPORT_SYMBOL(_phy_start_aneg);
 /**
  * phy_start_aneg - start auto-negotiation for this PHY device
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 1351b802ffcf..3cc52826f18e 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1736,6 +1736,7 @@ void phy_detach(struct phy_device *phydev);
 void phy_start(struct phy_device *phydev);
 void phy_stop(struct phy_device *phydev);
 int phy_config_aneg(struct phy_device *phydev);
+int _phy_start_aneg(struct phy_device *phydev);
 int phy_start_aneg(struct phy_device *phydev);
 int phy_aneg_done(struct phy_device *phydev);
 int phy_speed_down(struct phy_device *phydev, bool sync);
--
2.30.2
* Re: [PATCH net-next] net: phy: fix regression with AX88772A PHY driver
From: Andrew Lunn @ 2023-09-18 13:49 UTC
To: Russell King (Oracle)
Cc: Heiner Kallweit, Marek Szyprowski, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Florian Fainelli, netdev
Hi Russell
Yes, this fixes the problem for stable.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
But maybe it would be better to move the hardware workaround into the
PHY driver? It's the PHY which is broken, so why is the MAC working
around it?
Andrew
* Re: [PATCH net-next] net: phy: fix regression with AX88772A PHY driver
From: Russell King (Oracle) @ 2023-09-18 13:57 UTC
To: Andrew Lunn
Cc: Heiner Kallweit, Marek Szyprowski, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Florian Fainelli, netdev
On Mon, Sep 18, 2023 at 03:49:32PM +0200, Andrew Lunn wrote:
> Hi Russell
>
> Yes, this fixes the problem for stable.
>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
>
> But maybe it would be better to move the hardware workaround into the
> PHY driver? It's the PHY which is broken, so why is the MAC working
> around it?
Err? Sorry, but your comment makes little sense given that my patch
only touches the PHY core (to export _phy_start_aneg()) and the PHY
driver (ax88796b.c) which is where the work-around is already located.
I'm not having to touch the MAC driver at all to fix this, because
afaics the MAC driver isn't involved in _this_ particular workaround.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
* Re: [PATCH net-next] net: phy: fix regression with AX88772A PHY driver
From: Andrew Lunn @ 2023-09-18 16:34 UTC
To: Russell King (Oracle)
Cc: Heiner Kallweit, Marek Szyprowski, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Florian Fainelli, netdev
> Err? Sorry, but your comment makes little sense
Sorry, -EMORECOFFEE.
Andrew
* Re: [PATCH net-next] net: phy: fix regression with AX88772A PHY driver
From: patchwork-bot+netdevbpf @ 2023-09-19 15:10 UTC
To: Russell King
Cc: andrew, hkallweit1, m.szyprowski, davem, edumazet, kuba, pabeni,
florian.fainelli, netdev
Hello:
This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Mon, 18 Sep 2023 14:25:36 +0100 you wrote:
> Marek reports that a deadlock occurs with the AX88772A PHY used on the
> ASIX USB network driver:
>
> asix 1-1.4:1.0 (unnamed net_device) (uninitialized): PHY [usb-001:003:10] driver [Asix Electronics AX88772A] (irq=POLL)
> Asix Electronics AX88772A usb-001:003:10: attached PHY driver(mii_bus:phy_addr=usb-001:003:10, irq=POLL)
> asix 1-1.4:1.0 eth0: register 'asix' at usb-12110000.usb-1.4, ASIX AX88772 USB 2.0 Ethernet, a2:99:b6:cd:11:eb
> asix 1-1.4:1.0 eth0: configuring for phy/internal link mode
>
> [...]
Here is the summary with links:
- [net-next] net: phy: fix regression with AX88772A PHY driver
https://git.kernel.org/netdev/net-next/c/6a23c555f7eb
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html