From: Heiner Kallweit <hkallweit1@gmail.com>
To: Florian Fainelli <f.fainelli@gmail.com>, Andrew Lunn <andrew@lunn.ch>
Cc: Russell King <rmk+kernel@armlinux.org.uk>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: handling of phy_stop() and phy_stop_machine() in phylib
Date: Wed, 7 Feb 2018 21:56:37 +0100 [thread overview]
Message-ID: <ab8f592f-5444-a5ab-e6e2-16f566fbe681@gmail.com> (raw)
In-Reply-To: <4b67f2ac-3fe0-9567-404b-1e58b0fc5eaf@gmail.com>
Am 04.02.2018 um 03:48 schrieb Florian Fainelli:
>
>
> On 02/03/2018 03:58 PM, Heiner Kallweit wrote:
>> Am 03.02.2018 um 21:17 schrieb Andrew Lunn:
>>> On Sat, Feb 03, 2018 at 05:41:54PM +0100, Heiner Kallweit wrote:
>>>> This commit forces callers of phy_resume() and phy_suspend() to hold
>>>> mutex phydev->lock. This was done for calls to phy_resume() and
>>>> phy_suspend() in phylib, however there are more callers in network
>>>> drivers. I'd assume that these other calls issue a warning now
>>>> because of the lock not being held.
>>>> So is there something I miss or would this have to be fixed?
>>>
>>> Hi Heiner
>>>
>>> This is a good point.
>>>
>>> Yes, it looks like some fixes are needed. But what exactly?
>>>
>>> The phy state machine will suspend and resume the phy is you call
>>> phy_stop() and phy_start() in the MAC suspend and resume functions.
>>>
>> AFAICS phy_stop() doesn't suspend the PHY. It just sets the state
>> to PHY_HALTED and (at least if we're not in polling mode) doesn't
>> call phy_suspend(). Maybe a call to phy_trigger_machine() is
>> needed like in phy_start() ? Then the state machine would call
>> phy_suspend(), provided the link is still up.
>
> Right, phy_stop() merely just moves the state machine to PHY_HALTED and
> this is actually a great source of problems which I tried to address here:
>
> https://www.mail-archive.com/netdev@vger.kernel.org/msg196061.html
>
> because phy_stop() is not a synchronous call, so when it returns the
> state machine might still be running (it can take up to a 1 HZ depending
> on when you called phy_stop()) and so if you took that as a
> synchronization point to e.g: turn of your Ethernet MAC/MDIO bus clocks,
> you will likely see problems. phy_stop_machine() would provide that
> synchronization point, but is not currently exported, despite being a
> global symbol. This patch series above is all well and good, except that
> Geert reported issues with suspend/resume interactions which I have not
> been able to track down.
>
To not confuse readers I changed the subject of the mail to reflect that
the discussion isn't about the original topic any longer.
It seems to me that (at least one) reason for the issues is that pm
callbacks for the phy device and the network device interfere.
phy device pm (mdio_bus_phy_suspend et al) feels responsible for dealing
with suspending/resuming the PHY, and the network driver pm callbacks
as well.
Maybe, if the network driver takes care, it should inform phy device pm
to do nothing. For this we could add a flag to phydev.mdio.flags.
If the network driver sets this flag then mdio_bus_phy_suspend()
and mdio_bus_phy_resume() could turn into no-ops.
Not totally sure yet about mdio_bus_phy_restore() ..
> We should most definitively try to consolidate the different PHY
> suspend/resume within the Ethernet MAC suspend/resume implementation and
> document exactly what the appropriate behavior must be under the
> following circumstances:
>
> - when to call phy_stop() + phy_stop_machine()
> - when to call phy_suspend() (if the network interface does do not WoL)
> - when to call phy_resume() (if needed, actually, it usually is not)
> - when to call phy_start()
>
> I don't unfortunately have the time to code this myself at the moment,
> but I will happily review patches if you have the opportunity to do so.
>
>>
>> However, if the link is down already (due to whatever calls
>> around phy_stop() in the driver) then phy_suspend() wouldn't be
>> called.
>
> Correct, there is an implicit assumption that when the link is down,
> there is an opportunity for the Ethernet MAC driver to put things in low
> power, and the PHY itself, should be in a lower power mode where only
> link/energy detection might be utilizing power. At least this is the theory.
>
>>
>> Heiner
>>
>>> A few examples:
>>>
>>> tc35815_suspend(), ravb_suspend() via ravb_close(), sh_eth_suspend()
>>> via sh_eth_close(), fec_suspend(), mpc52xx_fec_of_suspend() via
>>> mpc52xx_fec_close(), ucc_geth_suspend(), etc...
>>>
>>> So i suspect those drivers which call phy_suspend()/phy_resume()
>>> should really be modified to call phy_stop()/phy_start().
>>>
>>> hns_nic_config_phy_loopback() is just funky, and probably needs the
>>> help of the hns guys to fix.
>>>
>>> dsa_slave_suspend() already does a phy_stop(), so the phy_suspend()
>>> can be removed.
>>>
>>> The comments in lpc_eth_open() suggest the phy_resume() is needed, so
>>> locks should be added. socfpga_dwmac_resume() seems to be the same.
>>>
>>> Andrew
>>>
>>
>
next prev parent reply other threads:[~2018-02-07 20:56 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-02-03 16:41 Potential issue with f5e64032a799 "net: phy: fix resume handling" Heiner Kallweit
2018-02-03 20:17 ` Andrew Lunn
2018-02-03 23:58 ` Heiner Kallweit
2018-02-04 2:48 ` Florian Fainelli
2018-02-05 21:48 ` Heiner Kallweit
2018-02-06 11:00 ` Russell King - ARM Linux
2018-02-06 12:55 ` Andrew Lunn
2018-02-07 20:56 ` Heiner Kallweit [this message]
2018-02-07 21:13 ` handling of phy_stop() and phy_stop_machine() in phylib Russell King - ARM Linux
2018-02-07 23:03 ` Florian Fainelli
2018-02-25 13:00 ` Potential issue with f5e64032a799 "net: phy: fix resume handling" Heiner Kallweit
2018-02-25 16:38 ` Andrew Lunn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ab8f592f-5444-a5ab-e6e2-16f566fbe681@gmail.com \
--to=hkallweit1@gmail.com \
--cc=andrew@lunn.ch \
--cc=f.fainelli@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=rmk+kernel@armlinux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).