stmmac driver timeout issue

netdev.vger.kernel.org archive mirror

* stmmac driver timeout issue
@ 2021-03-04 13:14 Joakim Zhang
  2021-03-04 22:25 ` Andrew Lunn
  2021-03-05  0:28 ` Florian Fainelli
  0 siblings, 2 replies; 8+ messages in thread
From: Joakim Zhang @ 2021-03-04 13:14 UTC (permalink / raw)
  To: Jakub Kicinski, Andrew Lunn; +Cc: netdev@vger.kernel.org


Hello Andrew, Hello Jakub,

Perhaps you can offer some suggestions based on your networking expertise. Thanks in advance!

I found that adding a VLAN ID hardware filter (stmmac_vlan_rx_add_vid) can time out when accessing the VLAN Filter registers during an ifup/ifdown stress test, and that restoring the VLAN ID hardware filter (stmmac_restore_hw_vlan_rx_fltr) always times out when accessing the VLAN Filter registers.

My hardware is an i.MX8MP (drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c, RGMII interface, RTL8211FDI-CG PHY). It needs the MAC speed to be fixed up (imx_dwmac_fix_speed), which happens indirectly via phylink_link_up. From debugging, if phylink_link_up is called after the VLAN ID hardware filter is added, the timeout is reported, so I guess we need to fix the MAC speed before accessing the VLAN Filter registers. The error looks like this:
	[  106.389879] 8021q: adding VLAN 0 to HW filter on device eth1
	[  106.395644] imx-dwmac 30bf0000.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter
	[  108.160734] imx-dwmac 30bf0000.ethernet eth1: Link is Up - 100Mbps/Full - flow control rx/tx   ->->-> i.e. the VLAN Filter registers were accessed before phylink_link_up was called.

The same happens on system resume:
	[ 1763.842294] imx-dwmac 30bf0000.ethernet eth1: configuring for phy/rgmii-id link mode
	[ 1763.853084] imx-dwmac 30bf0000.ethernet eth1: No Safety Features support found
	[ 1763.853186] imx-dwmac 30bf0000.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter
	[ 1763.873465] usb usb1: root hub lost power or was reset
	[ 1763.873469] usb usb2: root hub lost power or was reset
	[ 1764.090321] PM: resume devices took 0.248 seconds
	[ 1764.257381] OOM killer enabled.
	[ 1764.260518] Restarting tasks ... done.
	[ 1764.265229] PM: suspend exit
	===============================
	suspend 12 times
	===============================
	[ 1765.887915] imx-dwmac 30bf0000.ethernet eth1: Link is Up - 100Mbps/Full - flow control rx/tx  ->->-> i.e. the VLAN Filter registers were accessed before phylink_link_up was called.

My question: some MAC controllers need the RXC clock from the RGMII interface to reset DMA or to access some registers. Is there any way to ensure phylink_link_up is invoked synchronously when we need it? I am not sure whether this timeout is caused by the MAC speed needing to be fixed before the VLAN Filter registers are accessed. Are there any hints? Thanks a lot! We have another board, i.MX8DXL, which does not need the MAC speed fix and is attached to an AR8031 PHY; the issue cannot be reproduced there.
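
To make the suspected failure mode concrete, here is a minimal userspace sketch, not the real stmmac code. It assumes a register whose busy bit is cleared by logic clocked from the PHY's RXC; struct fake_mac, FAKE_FILTER_BUSY, fake_read() and fake_filter_write() are all hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical busy bit; this is NOT the real stmmac register layout. */
#define FAKE_FILTER_BUSY (1u << 0)

struct fake_mac {
	uint32_t filter_reg;	/* simulated filter register */
	bool rxc_running;	/* is the PHY currently supplying RXC? */
};

/* Simulated read: the busy bit only clears while the clock is ticking. */
uint32_t fake_read(struct fake_mac *mac)
{
	if (mac->rxc_running)
		mac->filter_reg &= ~FAKE_FILTER_BUSY;
	return mac->filter_reg;
}

/*
 * Start an operation, then poll the busy bit at most max_polls times.
 * Returns 0 on completion, -1 if the bit never cleared (timeout).
 */
int fake_filter_write(struct fake_mac *mac, int max_polls)
{
	assert(max_polls > 0);

	mac->filter_reg |= FAKE_FILTER_BUSY;
	for (int i = 0; i < max_polls; i++) {
		if (!(fake_read(mac) & FAKE_FILTER_BUSY))
			return 0;
	}
	return -1;
}
```

With rxc_running set to false, fake_filter_write() returns -1 no matter how large max_polls is; once rxc_running is true the same call returns 0. That would be consistent with the timeout appearing only when the register access happens before the link (and hence the clock) is up.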

Best Regards,
Joakim Zhang



* Re: stmmac driver timeout issue
  2021-03-04 13:14 stmmac driver timeout issue Joakim Zhang
@ 2021-03-04 22:25 ` Andrew Lunn
  2021-03-05  0:28 ` Florian Fainelli
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Lunn @ 2021-03-04 22:25 UTC (permalink / raw)
  To: Joakim Zhang; +Cc: Jakub Kicinski, netdev@vger.kernel.org

On Thu, Mar 04, 2021 at 01:14:31PM +0000, Joakim Zhang wrote:
> My question: some MAC controllers need the RXC clock from the RGMII interface to reset DMA or to access some registers.

There are some controllers which need the PHY clock. And some PHYs can
give you some control over the clock. e.g. there are DT properties
like "ti,clk-output-sel", "qca,clk-out-frequency". You probably want
to look at the PHY datasheet and see what you can control. It might be
possible to make it tick all the time. It has also been suggested that
the PHY could implement a clk provider, which a MAC driver could
clk_prepare_enable() when it needs it.

     Andrew


* Re: stmmac driver timeout issue
  2021-03-04 13:14 stmmac driver timeout issue Joakim Zhang
  2021-03-04 22:25 ` Andrew Lunn
@ 2021-03-05  0:28 ` Florian Fainelli
  2021-03-08 12:45   ` Joakim Zhang
  1 sibling, 1 reply; 8+ messages in thread
From: Florian Fainelli @ 2021-03-05  0:28 UTC (permalink / raw)
  To: Joakim Zhang, Jakub Kicinski, Andrew Lunn; +Cc: netdev@vger.kernel.org

On 3/4/21 5:14 AM, Joakim Zhang wrote:
> My question: some MAC controllers need the RXC clock from the RGMII interface to reset DMA or to access some registers. Is there any way to ensure phylink_link_up is invoked synchronously when we need it? I am not sure whether this timeout is caused by the MAC speed needing to be fixed before the VLAN Filter registers are accessed. Are there any hints? Thanks a lot! We have another board, i.MX8DXL, which does not need the MAC speed fix and is attached to an AR8031 PHY; the issue cannot be reproduced there.

Every Ethernet controller is different, but you can see that we
struggled to fix a similar problem with GENET, where the MAC needs the
RXC from the PHY to complete its reset. It took several iterations to
get there:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=88f6c8bf1aaed5039923fb4c701cab4d42176275
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=612eb1c3b9e504de24136c947ed7c07bc342f3aa
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6b6d017fccb4693767d2fcae9ef2fd05243748bb
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3a55402c93877d291b0a612d25edb03d1b4b93ac
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f515486275a08a17a2c806b844cca18f7de5b34

This driver uses PHYLIB (hardware is no longer developed and will not
receive updates to support different PCS), but maybe you can glean some
idea on how to solve this?
-- 
Florian


* RE: stmmac driver timeout issue
  2021-03-05  0:28 ` Florian Fainelli
@ 2021-03-08 12:45   ` Joakim Zhang
  2021-03-08 17:56     ` Florian Fainelli
  0 siblings, 1 reply; 8+ messages in thread
From: Joakim Zhang @ 2021-03-08 12:45 UTC (permalink / raw)
  To: Florian Fainelli, Jakub Kicinski, Andrew Lunn; +Cc: netdev@vger.kernel.org


Hi Florian, Andrew,

Thanks for your help. After debugging, it seems related to the PHY (RTL8211FDI): it stops outputting the RXC clock for dozens to hundreds of milliseconds during auto-negotiation. There is no such issue with the AR8031.
During the ifup/ifdown test or the system suspend/resume test, the PHY is suspended and then resumed, i.e. it is powered down and then switched back to normal operation.

There is a note in RTL8211FDI datasheet:
Note 2: When the RTL8211F(I)/RTL8211FD(I) is switched from power to normal operation, a software reset and restart auto-negotiation is performed, even if bits Reset(0.15) and Restart_AN(0.9) are not set by the users.

From the note above, auto-negotiation is triggered during the ifup/ifdown test or system suspend/resume, so we hit the stopped-RXC issue on the RTL8211FDI. My questions are: is this normal behavior, and do all PHYs behave this way? Can the Linux PHY framework handle this case? There is no config_init after resume; will the configuration be reset?
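
For reference, the bits named in the note live in the Clause 22 basic control register (register 0). The sketch below is only a model of the behavior the note describes, under the assumption that leaving power down has the same effect as setting Reset or Restart_AN; the helper names bmcr_leaves_power_down() and aneg_will_restart() are hypothetical and do not exist in phylib.

```c
#include <stdbool.h>
#include <stdint.h>

/* Clause 22 basic control register (register 0) bits. */
#define BMCR_RESET	0x8000u	/* 0.15: software reset */
#define BMCR_PDOWN	0x0800u	/* 0.11: power down */
#define BMCR_ISOLATE	0x0400u	/* 0.10: isolate the PHY from the MII */
#define BMCR_ANRESTART	0x0200u	/* 0.9: restart auto-negotiation */

/* True if going from old_bmcr to new_bmcr clears the power-down bit. */
bool bmcr_leaves_power_down(uint16_t old_bmcr, uint16_t new_bmcr)
{
	return (old_bmcr & BMCR_PDOWN) && !(new_bmcr & BMCR_PDOWN);
}

/*
 * Model of the datasheet note (an assumption, not verified on hardware):
 * auto-negotiation restarts if software asks for a reset or a restart,
 * or implicitly when the PHY leaves power down.
 */
bool aneg_will_restart(uint16_t old_bmcr, uint16_t new_bmcr)
{
	if (new_bmcr & (BMCR_RESET | BMCR_ANRESTART))
		return true;
	return bmcr_leaves_power_down(old_bmcr, new_bmcr);
}
```

In this model, a plain resume that only clears bit 0.11 already counts as a restart of auto-negotiation, which is the case that matters for the RXC gap described above.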

Best Regards,
Joakim Zhang



* Re: stmmac driver timeout issue
  2021-03-08 12:45   ` Joakim Zhang
@ 2021-03-08 17:56     ` Florian Fainelli
  2021-03-11 12:04       ` Joakim Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Fainelli @ 2021-03-08 17:56 UTC (permalink / raw)
  To: Joakim Zhang, Jakub Kicinski, Andrew Lunn; +Cc: netdev@vger.kernel.org

On 3/8/21 4:45 AM, Joakim Zhang wrote:
> 
> Hi Florian, Andrew,
> 
> Thanks for your help. After debugging, it seems related to the PHY (RTL8211FDI): it stops outputting the RXC clock for dozens to hundreds of milliseconds during auto-negotiation. There is no such issue with the AR8031.
> During the ifup/ifdown test or the system suspend/resume test, the PHY is suspended and then resumed, i.e. it is powered down and then switched back to normal operation.
> 
> There is a note in RTL8211FDI datasheet:
> Note 2: When the RTL8211F(I)/RTL8211FD(I) is switched from power to normal operation, a software reset and restart auto-negotiation is performed, even if bits Reset(0.15) and Restart_AN(0.9) are not set by the users.
> 
> From the note above, auto-negotiation is triggered during the ifup/ifdown test or system suspend/resume, so we hit the stopped-RXC issue on the RTL8211FDI. My questions are: is this normal behavior, and do all PHYs behave this way? Can the Linux PHY framework handle this case? There is no config_init after resume; will the configuration be reset?

I do not have experience with Realtek PHYs however what you describe
does not sound completely far off from what Broadcom PHYs would do when
auto-power down is enabled and when the link is dropped either because
the PHY was powered down or auto-negotiation was restarted which then
leads to the RXC/TXC clocks being disabled.

For RGMII that connects to an actual PHY you can probably use the same
technique that Doug had implemented for GENET whereby you put it in
isolate mode and it maintains its RXC while you do the reset. The
problem is that this really only works for an RGMII connection to a PHY;
if you connect to a MAC you could create contention on the pins. I am
afraid there is no foolproof solution, but maybe you can find a way to
configure the STMMAC so as to route another internal clock that it
generates as a valid RXC just for the time you need it?

With respect to your original problem, looks like it may be fixed with:

https://git.kernel.org/netdev/net/c/9a7b3950c7e1

or maybe this only works on the specific Intel platform?
-- 
Florian


* RE: stmmac driver timeout issue
  2021-03-08 17:56     ` Florian Fainelli
@ 2021-03-11 12:04       ` Joakim Zhang
  2021-03-12 18:33         ` Florian Fainelli
  0 siblings, 1 reply; 8+ messages in thread
From: Joakim Zhang @ 2021-03-11 12:04 UTC (permalink / raw)
  To: Florian Fainelli, Jakub Kicinski, Andrew Lunn; +Cc: netdev@vger.kernel.org


> With respect to your original problem, looks like it may be fixed with:
> 
> https://git.kernel.org/netdev/net/c/9a7b3950c7e1
> 
> or maybe this only works on the specific Intel platform?

Thanks Florian, I also noticed that patch, but it should only help with driver removal. The key issue on my side is that RXC is not stable during auto-negotiation.

I have a question about the PHY framework; please correct me if I have misunderstood something.
There are many scenarios in which the PHY framework triggers auto-negotiation, such as switching from power down to normal operation, but it never polls for auto-negotiation completion (phy_poll_aneg_done). Is there any special reason? Is it possible and reasonable for the MAC controller driver to poll for completion? If so, we would at least have a stable RXC at that point.

Best Regards,
Joakim Zhang


* Re: stmmac driver timeout issue
  2021-03-11 12:04       ` Joakim Zhang
@ 2021-03-12 18:33         ` Florian Fainelli
  2021-03-12 19:09           ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Fainelli @ 2021-03-12 18:33 UTC (permalink / raw)
  To: Joakim Zhang, Jakub Kicinski, Andrew Lunn, Heiner Kallweit,
	Russell King
  Cc: netdev@vger.kernel.org

On 3/11/21 4:04 AM, Joakim Zhang wrote:
> Thanks Florian, I also noticed that patch, but it should only help with driver removal. The key issue on my side is that RXC is not stable during auto-negotiation.
> 
> I have a question about the PHY framework; please correct me if I have misunderstood something.
> There are many scenarios in which the PHY framework triggers auto-negotiation, such as switching from power down to normal operation, but it never polls for auto-negotiation completion (phy_poll_aneg_done). Is there any special reason? Is it possible and reasonable for the MAC controller driver to poll for completion? If so, we would at least have a stable RXC at that point.

Adding Heiner and Russell as well. Usually you do not want, or rather
cannot know whether auto-negotiation will ever succeed, so waiting for
it could essentially hog your system for some fairly indefinite amount
of time.
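
As a rough illustration of why any such wait would need an upper bound, here is a userspace sketch with a simulated PHY. It is not the phylib implementation; struct fake_phy, fake_read_bmsr() and wait_aneg_done() are hypothetical names, and a real driver would sleep between polls rather than spin.

```c
#include <stdint.h>

#define BMSR_ANEGCOMPLETE 0x0020u	/* 1.5: auto-negotiation complete */

/* Simulated PHY: completes after N reads, or never if N is negative. */
struct fake_phy {
	int polls_until_done;
};

uint16_t fake_read_bmsr(void *ctx)
{
	struct fake_phy *phy = ctx;

	if (phy->polls_until_done < 0)
		return 0;		/* no link partner: never completes */
	if (phy->polls_until_done > 0) {
		phy->polls_until_done--;
		return 0;		/* still negotiating */
	}
	return BMSR_ANEGCOMPLETE;
}

/*
 * Poll the status register at most max_polls times.
 * Returns the number of polls used on success, or -1 on timeout.
 */
int wait_aneg_done(uint16_t (*read_bmsr)(void *), void *ctx, int max_polls)
{
	for (int i = 1; i <= max_polls; i++) {
		if (read_bmsr(ctx) & BMSR_ANEGCOMPLETE)
			return i;
	}
	return -1;
}
```

A PHY with nothing plugged in corresponds to polls_until_done < 0: without the max_polls bound the loop would never return, so the caller has to be prepared for the -1 case anyway.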

With respect to your Realtek PHY is there no way you can force it to
output the 125MHz RX clock towards the MAC while you perform the MAC
initialization?
-- 
Florian


* Re: stmmac driver timeout issue
  2021-03-12 18:33         ` Florian Fainelli
@ 2021-03-12 19:09           ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 8+ messages in thread
From: Russell King - ARM Linux admin @ 2021-03-12 19:09 UTC (permalink / raw)
  To: Joakim Zhang
  Cc: Florian Fainelli, Jakub Kicinski, Andrew Lunn, Heiner Kallweit,
	netdev@vger.kernel.org

On Fri, Mar 12, 2021 at 10:33:06AM -0800, Florian Fainelli wrote:
> On 3/11/21 4:04 AM, Joakim Zhang wrote:
> > I have a question about the PHY framework; please correct me if I have misunderstood something.
> > There are many scenarios in which the PHY framework triggers auto-negotiation, such as switching from power down to normal operation, but it never polls for auto-negotiation completion (phy_poll_aneg_done). Is there any special reason? Is it possible and reasonable for the MAC controller driver to poll for completion? If so, we would at least have a stable RXC at that point.
> 
> Adding Heiner and Russell as well. Usually you do not want, or rather
> cannot know whether auto-negotiation will ever succeed, so waiting for
> it could essentially hog your system for some fairly indefinite amount
> of time.

I think the question being asked is essentially whether checking the
link status bit (1.2) without checking the aneg complete bit (1.5) is
sufficient.

Reading 802.3, it seems to be defined that if autonegotiation is in
use, the link shall be reported as down until autonegotiation has
completed - which is logical. The link can only be up if a valid
data path capable of transferring data has been established, which
implies that autonegotiation must have completed.

However, note that when coming out of power down, there is no guarantee
that there is anything connected to the other side of the media, and
thus there is no guarantee that autonegotiation will complete. Waiting
for autonegotiation to complete in this case would not be feasible.
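
The relationship between the two status bits can be sketched as below. This is an illustrative helper only, assuming a single raw read of the Clause 22 status register and ignoring that the link status bit latches low; enum link_state and classify_link() are hypothetical names, not phylib API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Clause 22 basic status register (register 1) bits. */
#define BMSR_ANEGCOMPLETE	0x0020u	/* 1.5: auto-negotiation complete */
#define BMSR_LSTATUS		0x0004u	/* 1.2: link status */

enum link_state {
	LINK_DOWN,
	LINK_UP,
	LINK_INCONSISTENT,	/* link up reported before aneg completed */
};

/* Classify one status register value; aneg_enabled says whether
 * auto-negotiation is in use on this link. */
enum link_state classify_link(uint16_t bmsr, bool aneg_enabled)
{
	bool link = bmsr & BMSR_LSTATUS;
	bool done = bmsr & BMSR_ANEGCOMPLETE;

	if (!link)
		return LINK_DOWN;
	if (aneg_enabled && !done)
		return LINK_INCONSISTENT;
	return LINK_UP;
}
```

If the reading of 802.3 above holds, LINK_INCONSISTENT should never be observed on a compliant PHY, so checking bit 1.2 alone would be enough; the extra state is there only to make that assumption visible.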

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
