From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Andrew Lunn <andrew@lunn.ch>
Cc: Jijie Shao <shaojijie@huawei.com>,
f.fainelli@gmail.com, davem@davemloft.net, edumazet@google.com,
hkallweit1@gmail.com, kuba@kernel.org, netdev@vger.kernel.org,
pabeni@redhat.com,
"shenjian15@huawei.com" <shenjian15@huawei.com>,
"liuyonglong@huawei.com" <liuyonglong@huawei.com>,
wangjie125@huawei.com, chenhao418@huawei.com,
Hao Lan <lanhao@huawei.com>,
"wangpeiyang1@huawei.com" <wangpeiyang1@huawei.com>
Subject: Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
Date: Tue, 5 Sep 2023 15:00:48 +0100
Message-ID: <ZPc0kHHMrNr0cgp/@shell.armlinux.org.uk>
In-Reply-To: <99eade9d-a580-4519-8399-832e196d335a@lunn.ch>
On Tue, Sep 05, 2023 at 02:09:29PM +0200, Andrew Lunn wrote:
> > When we call phy_stop(), the hardware might be in an error state and we
> > can't access MDIO. Our sequence is: an MDIO read/write fails first, then
> > we call phy_stop(), reset the hardware, and finally call phy_start().
>
> If the hardware/firmware is already dead, you have to expect a stack
> trace, because the once-a-second poll can happen before you notice
> the hardware/firmware is dead and call phy_stop().
>
> You might want to also disconnect the PHY and reconnect it after the
> reset.

Andrew,

I think that's what is being tried here, but there's a race between
phy_stop() and phy_state_machine() which is screwing up phydev->state.

Honestly, the locking in phy_state_machine() is insane, allows this
race to happen, and really needs fixing... and I think that the
phydev->lock usage has become really insane over the years. We have
some driver methods now which are always called with the lock held,
others where the lock may or may not be held, and others where the
lock isn't held - and none of this is documented.

Please can you have a look at the four patches I've just posted,
attached to my previous email. I think we need to start sorting out
some of this craziness and sanitising the locking.

My four patches address most of it, except the call to phy_suspend().
If that can be solved, then we can improve the locking more, and
eliminate the race entirely.

If we held the lock over the entire state machine function, then the
problem that has been reported here would not exist - phy_stop()
would not be able to "nip in" during the middle of the PHY state
machine running, and thus we would not see the PHY_HALTED state
overwritten by PHY_ERROR unexpectedly.

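To make the interleaving above concrete, here is a minimal userspace
sketch in C. It is not kernel code and not taken from the patches:
struct phy_model, model_phy_stop(), run_state_machine() and simulate()
are hypothetical names, the lock is modelled as a plain flag rather
than a real mutex, and the concurrent phy_stop() is injected by hand at
the point where the failing MDIO access would happen.

```c
#include <stdbool.h>

/* Simplified stand-ins for the PHY states; not the kernel's enum. */
enum phy_state { PHY_RUNNING, PHY_HALTED, PHY_ERROR };

struct phy_model {
	bool locked;		/* stand-in for phydev->lock */
	enum phy_state state;	/* stand-in for phydev->state */
};

static bool model_trylock(struct phy_model *phy)
{
	if (phy->locked)
		return false;
	phy->locked = true;
	return true;
}

static void model_unlock(struct phy_model *phy)
{
	phy->locked = false;
}

/* Stand-in for phy_stop(): halts the PHY only if it can take the lock. */
static bool model_phy_stop(struct phy_model *phy)
{
	if (!model_trylock(phy))
		return false;	/* a real caller would sleep on the mutex */
	phy->state = PHY_HALTED;
	model_unlock(phy);
	return true;
}

/*
 * One pass of a state machine whose hardware access fails.
 * hold_lock selects whether the lock is kept over the whole pass or
 * dropped around the (failing) hardware access.
 */
static enum phy_state run_state_machine(struct phy_model *phy, bool hold_lock)
{
	bool stop_pending;

	model_trylock(phy);		/* uncontended at this point */
	if (!hold_lock)
		model_unlock(phy);	/* window in which phy_stop() can run */

	/* The hardware access fails; another context calls phy_stop(). */
	stop_pending = !model_phy_stop(phy);

	if (!hold_lock)
		model_trylock(phy);
	phy->state = PHY_ERROR;		/* error path records the failure */
	model_unlock(phy);

	/* A phy_stop() that had to wait runs once the lock is free. */
	if (stop_pending)
		model_phy_stop(phy);

	return phy->state;
}

/* Run one simulated pass starting from PHY_RUNNING. */
static enum phy_state simulate(bool hold_lock)
{
	struct phy_model phy = { .locked = false, .state = PHY_RUNNING };

	return run_state_machine(&phy, hold_lock);
}
```

In this model, simulate(false) ends in PHY_ERROR because the halt is
overwritten, while simulate(true) ends in PHY_HALTED because the stop
can only run after the state machine pass has finished.
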
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!