From: Andrew Lunn <andrew@lunn.ch>
To: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: jeffrey.t.kirsher@intel.com,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
"moderated list:INTEL ETHERNET DRIVERS"
<intel-wired-lan@lists.osuosl.org>,
"open list:NETWORKING DRIVERS" <netdev@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] e1000e: Power cycle phy on PM resume
Date: Wed, 23 Sep 2020 17:37:03 +0200
Message-ID: <20200923153703.GC3764123@lunn.ch>
In-Reply-To: <F6075687-7BC4-4348-86A8-29D83B7E5AAC@canonical.com>

On Wed, Sep 23, 2020 at 10:44:10PM +0800, Kai-Heng Feng wrote:
> Hi Andrew,
>
> > On Sep 23, 2020, at 20:17, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Wed, Sep 23, 2020 at 03:47:51PM +0800, Kai-Heng Feng wrote:
> >> We are seeing the following error after S3 resume:
> >> [ 704.746874] e1000e 0000:00:1f.6 eno1: Setting page 0x6020
> >> [ 704.844232] e1000e 0000:00:1f.6 eno1: MDI Write did not complete
> >> [ 704.902817] e1000e 0000:00:1f.6 eno1: Setting page 0x6020
> >> [ 704.903075] e1000e 0000:00:1f.6 eno1: reading PHY page 769 (or 0x6020 shifted) reg 0x17
> >> [ 704.903281] e1000e 0000:00:1f.6 eno1: Setting page 0x6020
> >> [ 704.903486] e1000e 0000:00:1f.6 eno1: writing PHY page 769 (or 0x6020 shifted) reg 0x17
> >> [ 704.943155] e1000e 0000:00:1f.6 eno1: MDI Error
> >> ...
> >> [ 705.108161] e1000e 0000:00:1f.6 eno1: Hardware Error
> >>
> >> Since we don't know what platform firmware may do to the phy, let's
> >> power cycle the phy upon system resume to resolve the issue.
> >>
> >> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> >> ---
> >> drivers/net/ethernet/intel/e1000e/netdev.c | 2 ++
> >> 1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> >> index 664e8ccc88d2..c2a87a408102 100644
> >> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> >> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> >> @@ -6968,6 +6968,8 @@ static __maybe_unused int e1000e_pm_resume(struct device *dev)
> >>  	    !e1000e_check_me(hw->adapter->pdev->device))
> >>  		e1000e_s0ix_exit_flow(adapter);
> >>
> >> +	e1000_power_down_phy(adapter);
> >> +
> >
> > static void e1000_power_down_phy(struct e1000_adapter *adapter)
> > {
> > 	struct e1000_hw *hw = &adapter->hw;
> >
> > 	/* Power down the PHY so no link is implied when interface is down *
> > 	 * The PHY cannot be powered down if any of the following is true *
> > 	 * (a) WoL is enabled
> > 	 * (b) AMT is active
> > 	 * (c) SoL/IDER session is active
> > 	 */
> > 	if (!adapter->wol && hw->mac_type >= e1000_82540 &&
> > 	    hw->media_type == e1000_media_type_copper) {
>
> Looks like the function comes from e1000, drivers/net/ethernet/intel/e1000/e1000_main.c.
> However, this patch is for e1000e, so the function with the same name is different.

Ah! Sorry, I missed that. It is also not nice that there are two
functions in the kernel with the same name.

> > Could it be coming out of S3 because it just received a WoL?
>
> No, the issue can be reproduced by pressing keyboard or rtcwake.

Not relevant now, since I was looking at the wrong function. What I
meant was that the call is a NOP in the case where WoL caused the wake
up. So if the issue can also happen after WoL, your fix is not going to
fix it.

> > It seems unlikely that it is the MII_CR_POWER_DOWN which is helping,
> > since that is an MDIO write itself. Do you actually know how this call
> > to e1000_power_down_phy() fixes the issues?
>
> I don't know from hardware's perspective, but I think the comment on
> e1000_power_down_phy_copper() can give us some insight:

And there is only one function called e1000_power_down_phy_copper() :-)
>
> /**
>  * e1000_power_down_phy_copper - Restore copper link in case of PHY power down
>  * @hw: pointer to the HW structure
>  *
>  * In the case of a PHY power down to save power, or to turn off link during a
>  * driver unload, or wake on lan is not enabled, restore the link to previous
>  * settings.
>  **/
> void e1000_power_down_phy_copper(struct e1000_hw *hw)
> {
> 	u16 mii_reg = 0;
>
> 	/* The PHY will retain its settings across a power down/up cycle */
> 	e1e_rphy(hw, MII_BMCR, &mii_reg);
> 	mii_reg |= BMCR_PDOWN;
> 	e1e_wphy(hw, MII_BMCR, mii_reg);
> 	usleep_range(1000, 2000);
> }

I don't really see how this explains this:

> >> [ 704.746874] e1000e 0000:00:1f.6 eno1: Setting page 0x6020
> >> [ 704.844232] e1000e 0000:00:1f.6 eno1: MDI Write did not complete

https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/intel/e1000e/phy.c#L181

So first off, the comments are all cut/paste from
e1000e_read_phy_reg_mdic(). It would be nice to s/read/write/g in that
function.

So the function sets up the transaction and starts it. MDIO is a serial
bus with no acknowledgements. You clock out around 64 bits, and hope
the PHY receives them. The time it takes to send those 64 bits is fixed
by the bus speed, typically 2.5MHz.

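To put a number on that, here is a small sketch of the arithmetic. The
helper name mdio_frame_ns() is made up for illustration, and the 64 bits
and 2.5MHz are just the round figures from the paragraph above, not
values read from any particular hardware.

```c
#include <stdint.h>

/*
 * Hypothetical helper: time in nanoseconds needed to clock `bits` bits
 * out on a serial bus running at `clk_hz`. Nothing about the PHY or its
 * state enters into this calculation.
 */
uint64_t mdio_frame_ns(uint32_t bits, uint32_t clk_hz)
{
	return ((uint64_t)bits * 1000000000ULL) / clk_hz;
}
```

With 64 bits at 2500000 Hz this gives 25600 ns, so a single transaction
should be on the wire for roughly 25.6us.
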
So the driver polls, waiting for the hardware to say the bits have been
sent. And this is timing out. How long that takes has nothing to do
with the PHY, or what state it is in. Powering down the PHY has no
effect on the MDIO bus master, or on how long it takes to shift those
bits out. Which is why I don't think this patch is correct. This is
probably an MDIO bus issue, not a PHY issue.

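As a rough user-space model of that polling, not the driver code
itself: the loop below only ever talks to the bus master's register.
MDIC_READY is assumed to be bit 28, which is my recollection of the
e1000e define, and mdic_poll_ready(), struct fake_master and fake_read()
are invented for this sketch. The real driver also delays between reads,
which is left out here.

```c
#include <stdint.h>

#define MDIC_READY 0x10000000u	/* assumed position of the ready bit */

/* Hypothetical accessor standing in for a read of the MDIC register */
typedef uint32_t (*mdic_read_fn)(void *ctx);

/*
 * Poll until the bus master reports the transaction as done.
 * Returns the iteration on which READY was seen, or -1 on timeout.
 * Note that no PHY state appears anywhere in this loop.
 */
int mdic_poll_ready(mdic_read_fn read_mdic, void *ctx, int max_iters,
		    uint32_t *mdic_out)
{
	for (int i = 0; i < max_iters; i++) {
		uint32_t mdic = read_mdic(ctx);

		if (mdic & MDIC_READY) {
			*mdic_out = mdic;
			return i;
		}
	}
	return -1;
}

/* Fake bus master: reports READY once a number of reads have passed */
struct fake_master {
	int reads_until_ready;
};

uint32_t fake_read(void *ctx)
{
	struct fake_master *m = ctx;

	if (m->reads_until_ready > 0) {
		m->reads_until_ready--;
		return 0;
	}
	return MDIC_READY;
}
```

A fake master that needs more reads than max_iters allows makes
mdic_poll_ready() return -1, which is the "did not complete" case. The
only things that change the outcome are the master's behaviour and the
length of the poll.
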
Try dumping the value of MDIC in both the good and the bad case, before
the transaction starts.

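To make such a dump easier to read, something like the decoder below
might help. The field layout (data in bits 15:0, register in 20:16, PHY
address in 25:21, opcode in 27:26, ready in bit 28, error in bit 30) is
an assumption based on my memory of the e1000e MDIC defines, so please
check it against the datasheet. struct mdic_fields, mdic_decode() and
mdic_dump() are hypothetical names for this sketch only.

```c
#include <stdint.h>
#include <stdio.h>

/* Decoded view of an MDIC value; the layout is assumed, see above */
struct mdic_fields {
	uint16_t data;	/* bits 15:0 */
	uint8_t reg;	/* bits 20:16 */
	uint8_t phy;	/* bits 25:21 */
	uint8_t op;	/* bits 27:26, assumed 1 = write, 2 = read */
	int ready;	/* bit 28 */
	int error;	/* bit 30 */
};

struct mdic_fields mdic_decode(uint32_t mdic)
{
	struct mdic_fields f;

	f.data = (uint16_t)(mdic & 0xffff);
	f.reg = (uint8_t)((mdic >> 16) & 0x1f);
	f.phy = (uint8_t)((mdic >> 21) & 0x1f);
	f.op = (uint8_t)((mdic >> 26) & 0x3);
	f.ready = (int)((mdic >> 28) & 0x1);
	f.error = (int)((mdic >> 30) & 0x1);
	return f;
}

/* Print one line per sample so good and bad cases can be compared */
void mdic_dump(const char *tag, uint32_t mdic)
{
	struct mdic_fields f = mdic_decode(mdic);

	printf("%s: MDIC=0x%08x data=0x%04x reg=%u phy=%u op=%u ready=%d error=%d\n",
	       tag, (unsigned int)mdic, (unsigned int)f.data,
	       (unsigned int)f.reg, (unsigned int)f.phy,
	       (unsigned int)f.op, f.ready, f.error);
}
```

Calling mdic_dump("before", value) right before the write is started, in
both the working and the failing resume, would show whether the master
is already in an unexpected state, for example with ready clear or error
set from an earlier transaction.
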
Andrew