netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Russell King - ARM Linux <linux@armlinux.org.uk>
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>, netdev@vger.kernel.org
Subject: Re: [BUG] micrel phy suspend/resume
Date: Mon, 11 Dec 2017 00:06:28 +0000	[thread overview]
Message-ID: <20171211000628.GR10595@n2100.armlinux.org.uk> (raw)
In-Reply-To: <389b78f6-951c-528d-8b2b-fc6c6adfc77f@gmail.com>

On Sun, Dec 10, 2017 at 03:50:28PM -0800, Florian Fainelli wrote:
> Hi Russell,
> 
> On 12/10/2017 08:47 AM, Russell King - ARM Linux wrote:
> > Guys,
> > 
> > I've just tripped over a bug with the Micrel PHY driver, but it
> > really isn't specific to the Micrel PHY driver.
> > 
> > When we suspend, we suspend the PHY and then the MAC driver (eg,
> > on the ZII board):
> > 
> > [  198.822751] 400d0000.ethernet-1:00: bus : mdio_bus_suspend+0x0/0x34
> > [  198.822859] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> > [  198.822878] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3900
> > ...
> > [  198.826235] 400d0000.ethernet: bus : platform_pm_freeze+0x0/0x5c
> > [  198.826354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> > [  198.826374] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> > [  198.826503] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0000
> > [  198.826699] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> > 
> > When we resume, the order is reversed:
> > 
> > [  198.848300] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> > [  198.849024] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> > [  198.849120] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> > [  198.849141] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> > [  198.849243] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> > [  198.849401] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> > [  198.849419] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> > [  198.849637] __mdiobus_read: 61 400d0000.ethernet-1 00 01 => 7849
> > ...
> > [  198.852677] 400d0000.ethernet-1:00: bus : mdio_bus_resume+0x0/0x34
> > [  198.852778] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> > [  198.852797] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> > 
> > Now, the MAC driver calls phy_stop() and phy_start() from within its
> > own suspend/resume methods, and at this is done while the PHY is,
> > as far as the kernel PM code is concerned, suspended.
> > 
> > However, phylib works around that by resuming the PHY itself when
> > phy_start() is called in this situation.  That looks good, but it
> > really isn't in this case.  Given the above sequence, we will be in
> > PHY_HALTED state.
> > 
> > So, when phy_start() is called, the first thing it does is check the
> > state, and finds PHY_HALTED.  It then tries to enable the PHY
> > interrupts.  This cause the Micrel driver to write to the PHY to
> > enable interrupts - it reads 0x1b to ack any pre-existing interrupt,
> > modifies 0x1f to set the interrupt pin level, and then supposedly
> > writes 0x1b to enable interrupts.
> > 
> > However, at this point, the PHY is still powered down, as we can see
> > in the following read of the BMCR - containing 0x3900.  The write to
> > enable interrupts is ignored, and so interrupts remain disabled.
> > 
> > The resume process continues, and the system resumes, but interrupts
> > on the PHY remain disabled, and the phylib state machine never
> > advances.  You can do anything you like with cables etc, as far as
> > phylib is concerned, the link steadfastly remains "down".
> > 
> > That's a bit of a problem if your platform is running root-NFS through
> > that network interface.
> > 
> > It looks like some variants of the Micrel phy code work around this by
> > including in their resume path code to enable PHY interrupts
> > independently of phylib.
> > 
> > phy_resume() can't be called inside the locked region in phy_start(),
> > where we enable interrupts, because genphy_resume() also wants to take
> > phydev->lock - and it wants to do that to safely read-modify-write the
> > BMCR register.  This, I feel, comes back to an abuse of the phy state
> > machine lock to also protect atomic bus operations.
> > 
> > If we move over to using the bus lock to protect bus atomic operations
> > (as I believe we should) then phy_resume() can be called while holding
> > phydev->lock from within bits of the code that are protecting themselves
> > from concurrency with the phylib state machine.  It also means that we
> > can get rid of some of these boolean variables, and most importantly
> > in this particular case, call phy_resume() before we enable interrupts.
> > 
> > With that arrangement, things look a lot better:
> > 
> > [   74.545584] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> > [   74.546010] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> > [   74.546029] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> > [   74.546264] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> > [   74.546354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 8100
> > [   74.546373] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 8100
> > [   74.546651] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> > 
> > The patch for this is slightly larger than it needs to be because I've
> > converted more places that do read-modify-write to the new xxx_modify()
> > accessor, and it also ensures that phy_resume() is consistently called
> > with the phy state machine lock held.  It also means we can get rid of
> > the interrupt enable hack for some micrel PHYs, which I suspect comes
> > from this same root cause.
> 
> Since this is a bug fix, I would really rather have the ability to
> target the "net" tree with a simpler fix that is contained within
> drivers/net/phy/phy.c and which does not require the use of phy_modify().

Not really that easy.

The simple solution would be to open-code the phy_modify() inside
genphy_resume() in phy_device.c, and move the phy_resume() call
in phy.c.

However, that leaves at803x taking phydev->lock in its resume path,
which will immediately deadlock.  We couldn't just delete that lock
without ensuring that phy_resume() is always called with phydev->lock
held to preserve the existing locking.  I wouldn't want to simply
delete that lock without some testing of that driver in case there's
a good reason for it being there.

So, to preserve that would mean making sure the two calls to
phy_resume() in phy_device.c are also under the lock, replacing
the at803x.c read-modify-write sequences with open-coded versions,
or convincing ourselves that the locks in at803x_suspend() and
at803x_resume() are not required.

My feeling is that the safest and minimal change for -rc is to add
the phy_modify() helper, switch both genphy_resume() and
at803x_resume() to it and delete the locking there, and ensure that
phy_resume() is always called with phydev->lock taken for a
consistent locking strategy there.

We can prove the phy_modify() helper works elsewhere, which means
the risk is reduced to the at803x_resume() conversion.

The unfortunate issue that gives us is that the calling convention
of "suspend" vs "resume" is different - resume ends up with
phydev->lock always held but suspend doesn't, and that will be
confusing for phy driver developers, and really doesn't help from
the review point of view.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up

      reply	other threads:[~2017-12-11  0:06 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-12-10 16:47 [BUG] micrel phy suspend/resume Russell King - ARM Linux
2017-12-10 23:50 ` Florian Fainelli
2017-12-11  0:06   ` Russell King - ARM Linux [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171211000628.GR10595@n2100.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=andrew@lunn.ch \
    --cc=f.fainelli@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).