netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Heiner Kallweit <hkallweit1@gmail.com>
To: Joakim Zhang <qiangqing.zhang@nxp.com>,
	"christian.melki@t2data.com" <christian.melki@t2data.com>,
	"andrew@lunn.ch" <andrew@lunn.ch>,
	"linux@armlinux.org.uk" <linux@armlinux.org.uk>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kuba@kernel.org" <kuba@kernel.org>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	dl-linux-imx <linux-imx@nxp.com>
Subject: Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume back
Date: Tue, 6 Apr 2021 08:28:56 +0200	[thread overview]
Message-ID: <0e6bd756-f46c-7caf-d45b-a19e7fb80b67@gmail.com> (raw)
In-Reply-To: <DB8PR04MB6795CC9AA84D14BA98FB6598E6769@DB8PR04MB6795.eurprd04.prod.outlook.com>

On 06.04.2021 04:07, Joakim Zhang wrote:
> 
> Hi Heiner,
> 
>> -----Original Message-----
>> From: Heiner Kallweit <hkallweit1@gmail.com>
>> Sent: 2021年4月5日 20:10
>> To: christian.melki@t2data.com; Joakim Zhang <qiangqing.zhang@nxp.com>;
>> andrew@lunn.ch; linux@armlinux.org.uk; davem@davemloft.net;
>> kuba@kernel.org
>> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org; dl-linux-imx
>> <linux-imx@nxp.com>
>> Subject: Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume
>> back
>>
>> On 05.04.2021 10:43, Christian Melki wrote:
>>> On 4/5/21 12:48 AM, Heiner Kallweit wrote:
>>>> On 04.04.2021 16:09, Heiner Kallweit wrote:
>>>>> On 04.04.2021 12:07, Joakim Zhang wrote:
>>>>>> commit 4c0d2e96ba055 ("net: phy: consider that suspend2ram may cut
>>>>>> off PHY power") invokes phy_init_hw() when MDIO bus resume, it will
>>>>>> soft reset PHY if PHY driver implements soft_reset callback.
>>>>>> commit 764d31cacfe4 ("net: phy: micrel: set soft_reset callback to
>>>>>> genphy_soft_reset for KSZ8081") adds soft_reset for KSZ8081. After
>>>>>> these two patches, I found i.MX6UL 14x14 EVK which connected to
>>>>>> KSZ8081RNB doesn't work any more when system resume back, MAC
>> driver is fec_main.c.
>>>>>>
>>>>>> It's obvious that initializing PHY hardware when MDIO bus resume
>>>>>> back would introduce some regression when PHY implements
>>>>>> soft_reset. When I
>>>>>
>>>>> Why is this obvious? Please elaborate on why a soft reset should
>>>>> break something.
>>>>>
>>>>>> am debugging, I found PHY works fine if MAC doesn't support
>>>>>> suspend/resume or phy_stop()/phy_start() doesn't been called during
>>>>>> suspend/resume. This let me realize, PHY state machine
>>>>>> phy_state_machine() could do something breaks the PHY.
>>>>>>
>>>>>> As we known, MAC resume first and then MDIO bus resume when system
>>>>>> resume back from suspend. When MAC resume, usually it will invoke
>>>>>> phy_start() where to change PHY state to PHY_UP, then trigger the
>>>>>> stat> machine to run now. In phy_state_machine(), it will
>>>>>> start/config auto-nego, then change PHY state to PHY_NOLINK, what
>>>>>> to next is periodically check PHY link status. When MDIO bus
>>>>>> resume, it will initialize PHY hardware, including soft_reset, what
>>>>>> would soft_reset affect seems various from different PHYs. For
>>>>>> KSZ8081RNB, when it in PHY_NOLINK state and then perform a soft reset,
>> it will never complete auto-nego.
>>>>>
>>>>> Why? That would need to be checked in detail. Maybe chip errata
>>>>> documentation provides a hint.
>>>>>
>>>>
>>>> The KSZ8081 spec says the following about bit BMCR_PDOWN:
>>>>
>>>> If software reset (Register 0.15) is
>>>> used to exit power-down mode
>>>> (Register 0.11 = 1), two software
>>>> reset writes (Register 0.15 = 1) are
>>>> required. The first write clears
>>>> power-down mode; the second
>>>> write resets the chip and re-latches
>>>> the pin strapping pin values.
>>>>
>>>> Maybe this causes the issue you see and genphy_soft_reset() isn't
>>>> appropriate for this PHY. Please re-test with the KSZ8081 soft reset
>>>> following the spec comment.
>>>>
>>>
>>> Interesting. Never expected that behavior.
>>> Thanks for catching it. Skimmed through the datasheets/erratas.
>>> This is what I found (micrel.c):
>>>
>>> 10/100:
>>> 8001 - Unaffected?
>>> 8021/8031 - Double reset after PDOWN.
>>> 8041 - Errata. PDOWN broken. Recommended do not use. Unclear if reset
>>> solves the issue since errata says no error after reset but is also
>>> claiming that only toggling PDOWN (may) or power will help.
>>> 8051 - Double reset after PDOWN.
>>> 8061 - Double reset after PDOWN.
>>> 8081 - Double reset after PDOWN.
>>> 8091 - Double reset after PDOWN.
>>>
>>> 10/100/1000:
>>> Nothing in gigabit afaics.
>>>
>>> Switches:
>>> 8862 - Not affected?
>>> 8863 - Errata. PDOWN broken. Reset will not help. Workaround exists.
>>> 8864 - Not affected?
>>> 8873 - Errata. PDOWN broken. Reset will not help. Workaround exists.
>>> 9477 - Errata. PDOWN broken. Will randomly cause link failure on
>>> adjacent links. Do not use.
>>>
>>> This certainly explains a lot.
>>>
>>>>>>
>>>>>> This patch changes PHY state to PHY_UP when MDIO bus resume back,
>>>>>> it should be reasonable after PHY hardware re-initialized. Also
>>>>>> give state machine a chance to start/config auto-nego again.
>>>>>>
>>>>>
>>>>> If the MAC driver calls phy_stop() on suspend, then
>>>>> phydev->suspended is true and mdio_bus_phy_may_suspend() returns
>>>>> false. As a consequence
>>>>> phydev->suspended_by_mdio_bus is false and mdio_bus_phy_resume()
>>>>> skips the PHY hw initialization.
>>>>> Please also note that mdio_bus_phy_suspend() calls
>>>>> phy_stop_machine() that sets the state to PHY_UP.
>>>>>
>>>>
>>>> Forgot that MDIO bus suspend is done before MAC driver suspend.
>>>> Therefore disregard this part for now.
>>>>
>>>>> Having said that the current argumentation isn't convincing. I'm not
>>>>> aware of such issues on other systems, therefore it's likely that
>>>>> something is system-dependent.
>>>>>
>>>>> Please check the exact call sequence on your system, maybe it
>>>>> provides a hint.
>>>>>
>>>>>> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
>>>>>> ---
>>>>>>  drivers/net/phy/phy_device.c | 7 +++++++
>>>>>>  1 file changed, 7 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/net/phy/phy_device.c
>>>>>> b/drivers/net/phy/phy_device.c index cc38e326405a..312a6f662481
>>>>>> 100644
>>>>>> --- a/drivers/net/phy/phy_device.c
>>>>>> +++ b/drivers/net/phy/phy_device.c
>>>>>> @@ -306,6 +306,13 @@ static __maybe_unused int
>> mdio_bus_phy_resume(struct device *dev)
>>>>>>  	ret = phy_resume(phydev);
>>>>>>  	if (ret < 0)
>>>>>>  		return ret;
>>>>>> +
>>>>>> +	/* PHY state could be changed to PHY_NOLINK from MAC controller
>> resume
>>>>>> +	 * rounte with phy_start(), here change to PHY_UP after
>> re-initializing
>>>>>> +	 * PHY hardware, let PHY state machine to start/config auto-nego
>> again.
>>>>>> +	 */
>>>>>> +	phydev->state = PHY_UP;
>>>>>> +
>>>>>>  no_resume:
>>>>>>  	if (phydev->attached_dev && phydev->adjust_link)
>>>>>>  		phy_start_machine(phydev);
>>>>>>
>>>>>
>>>>
>>>
>>
>> This is a quick draft of the modified soft reset for KSZ8081.
>> Some tests would be appreciated.
>>
> 
> I test this patch at my side, unfortunately, it still can't work.
> 
> I have not found what soft reset would affect in 8081 spec, is there ang common
> Description for different PHYs? 
> 

You can check the clause 22 spec: 802.3 22.2.4.1.1

Apart from that you can check BMCR value and which modes your PHY advertises
after a soft reset. If the PHY indeed wouldn't restart aneg after a soft reset,
then it would be the only one with this behavior I know. And I'd wonder why
there aren't more such reports.

> Best Regards,
> Joakim Zhang
>> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index
>> a14a00328..4902235a8 100644
>> --- a/drivers/net/phy/micrel.c
>> +++ b/drivers/net/phy/micrel.c
>> @@ -1091,6 +1091,42 @@ static void kszphy_get_stats(struct phy_device
>> *phydev,
>>  		data[i] = kszphy_get_stat(phydev, i);  }
>>
>> +int ksz8081_soft_reset(struct phy_device *phydev) {
>> +	int bmcr, ret, val;
>> +
>> +	phy_lock_mdio_bus(phydev);
>> +
>> +	bmcr = __phy_read(phydev, MII_BMCR);
>> +	if (bmcr < 0)
>> +		return bmcr;
>> +
>> +	bmcr |= BMCR_RESET;
>> +
>> +	if (bmcr & BMCR_PDOWN)
>> +		__phy_write(phydev, MII_BMCR, bmcr);
>> +
>> +	if (phydev->autoneg == AUTONEG_ENABLE)
>> +		bmcr |= BMCR_ANRESTART;
>> +
>> +	__phy_write(phydev, MII_BMCR, bmcr & ~BMCR_ISOLATE);
>> +
>> +	phy_unlock_mdio_bus(phydev);
>> +
>> +	phydev->suspended = 0;
>> +
>> +	ret = phy_read_poll_timeout(phydev, MII_BMCR, val, !(val &
>> BMCR_RESET),
>> +				    50000, 600000, true);
>> +	if (ret)
>> +		return ret;
>> +
>> +	/* BMCR may be reset to defaults */
>> +	if (phydev->autoneg == AUTONEG_DISABLE)
>> +		ret = genphy_setup_forced(phydev);
>> +
>> +	return ret;
>> +}
>> +
>>  static int kszphy_suspend(struct phy_device *phydev)  {
>>  	/* Disable PHY Interrupts */
>> @@ -1303,7 +1339,7 @@ static struct phy_driver ksphy_driver[] = {
>>  	.driver_data	= &ksz8081_type,
>>  	.probe		= kszphy_probe,
>>  	.config_init	= ksz8081_config_init,
>> -	.soft_reset	= genphy_soft_reset,
>> +	.soft_reset	= ksz8081_soft_reset,
>>  	.config_intr	= kszphy_config_intr,
>>  	.handle_interrupt = kszphy_handle_interrupt,
>>  	.get_sset_count = kszphy_get_sset_count,
>> --
>> 2.31.1
> 


  reply	other threads:[~2021-04-06  6:29 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-04 10:07 [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume back Joakim Zhang
2021-04-04 14:09 ` Heiner Kallweit
2021-04-04 22:48   ` Heiner Kallweit
2021-04-05  8:43     ` Christian Melki
2021-04-05 12:09       ` Heiner Kallweit
2021-04-05 13:53         ` Christian Melki
2021-04-05 14:58           ` Heiner Kallweit
2021-04-05 23:18             ` Florian Fainelli
2021-04-06  2:07         ` Joakim Zhang
2021-04-06  6:28           ` Heiner Kallweit [this message]
2021-04-06 10:07             ` Joakim Zhang
2021-04-06 11:42               ` Heiner Kallweit
2021-04-06 18:21                 ` Heiner Kallweit
2021-04-07  1:43                   ` Joakim Zhang
2021-04-07  7:12                     ` Heiner Kallweit
2021-04-07  7:46                       ` Joakim Zhang
2021-04-07 10:05                         ` Joakim Zhang
2021-04-07 10:21                           ` Heiner Kallweit
2021-04-07 10:38                             ` Joakim Zhang
2021-04-06 18:32                 ` Florian Fainelli
2021-04-06 18:43                   ` Heiner Kallweit
2021-04-06 18:47                     ` Florian Fainelli
2021-04-06  2:06       ` Joakim Zhang
2021-04-06  1:39     ` Joakim Zhang
2021-04-06  1:37   ` Joakim Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0e6bd756-f46c-7caf-d45b-a19e7fb80b67@gmail.com \
    --to=hkallweit1@gmail.com \
    --cc=andrew@lunn.ch \
    --cc=christian.melki@t2data.com \
    --cc=davem@davemloft.net \
    --cc=kuba@kernel.org \
    --cc=linux-imx@nxp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=netdev@vger.kernel.org \
    --cc=qiangqing.zhang@nxp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).