netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Christian Lamparter <chunkeey@googlemail.com>
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: netdev@vger.kernel.org, Christian Lamparter <chunkeey@gmail.com>,
	"David S . Miller" <davem@davemloft.net>,
	Ivan Mikhaylov <ivan@de.ibm.com>,
	vivien.didelot@savoirfairelinux.com, andrew@lunn.ch
Subject: Re: [RFC 2/2] net: emac: add support for device-tree based PHY discovery and setup
Date: Wed, 15 Feb 2017 01:16:30 +0100	[thread overview]
Message-ID: <2409103.Lage0nq55N@debian64> (raw)
In-Reply-To: <2122984.F6mzWerJKz@debian64>

On Tuesday, February 14, 2017 12:38:42 AM CET Christian Lamparter wrote:
> > >>> diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
> > >>> --- a/drivers/net/ethernet/ibm/emac/core.c
> > >>> +++ b/drivers/net/ethernet/ibm/emac/core.c
> > >>> @@ -2420,6 +2421,179 @@ static int emac_read_uint_prop(struct device_node *np, const char *name,
> > >>> [...] 
> > >>> +static int emac_mii_bus_reset(struct mii_bus *bus)
> > >>> +{
> > >>> +	struct emac_instance *dev = netdev_priv(bus->priv);
> > >>> +
> > >>> +	emac_mii_reset_phy(&dev->phy);
> > >>
> > >> This seems wrong, emac_mii_reset_phy() does a BMCR software reset, which
> > >> PHYLIB is already going to do (phy_init_hw), yet you do this here at the
> > >> MDIO bus level towards a specify PHY, whereas this should be affecting
> > >> the MDIO bus itself (and/or *all* PHY child devices for quirks).
> > > Ah, this is a good point. The emac driver has a emac_reset() function
> > > that does disable and enabled the phy clocks. That said, this is already
> > > done by the emac driver during init too. So if I added it, the bus is
> > > reset twice (since it doesn't hurt - I added it back).
> > > 
> > > The emac_mii_phy_reset() was added because of the Meraki MX60(W).
> > > This is because Cisco's bootloader disables the switch port 
> > > (probably to prevent WAN<->LAN leakage during boot)
> > > 
> > > [bootlog from the MX60(W)]
> > > |Disabling port 0
> > > |Disabling port 1
> > > |Disabling port 2
> > > |Disabling port 3
> > > |ENET Speed is 1000 Mbps - FULL duplex connection (EMAC0)
> > > 
> > > Without emac_mii_reset_phy(), the mdiobus_scan() function, which
> > > is called by mdiobus_register will fail with -ENODEV.
> > > | /plb/opb/ethernet@ef600c00: failed to attach dt phy (-19).
> > > This is because get_phy_id() will "mostly read mostly Fs" and abort.
> > 
> > Is the PHY just powered down by chance (BMCR_PWRDN set?) and resetting
> > it implicitly clears the power down that seems to be what is going on.
> 
> Yes, the PHY is just in the BMCR_PDOWN state. I can do the same
> on the WNDR4700, by messing with u-boot:
> 
> | => mii write 0 0 0x0800
> | => mii dump
> | 0.     (ffff)                 -- PHY control register --
> |  (8000:8000) 0.15    =     1    reset
> |  (4000:4000) 0.14    =     1    loopback
> |  (2040:2040) 0. 6,13 =   b11    speed selection = ??? Mbps
> |  (1000:1000) 0.12    =     1    A/N enable
> |  (0800:0800) 0.11    =     1    power-down
> |  (0400:0400) 0.10    =     1    isolate
> |  (0200:0200) 0. 9    =     1    restart A/N
> |  (0100:0100) 0. 8    =     1    duplex = full
> |  (0080:0080) 0. 7    =     1    collision test enable
> |  (003f:003f) 0. 5- 0 =    63    (reserved)
> 
> On the Meraki, the port disabled by the bootloader. 
> The reset is still needed.
> 
> > Keep in mind that MDIO address 16 is the switch's pseudo PHY address
> > here, so if you are telling PHYLIB to probe for that address and you
> > don't get the expected MII_PHYSID1/2 value in return, that usually means
> > that there was a PHY fixup registered to intercept these reads and make
> > us return the switch's unique identifier. Reading from the switch's
> > pseudo PHY at address 16 registers 2/3 (MII_PHYSID1/2) is not guaranteed
> > to return the switch's unique identifer.
> > 
> > With a MDIO device driver this won't happen because you will be probed
> > by address, and you can read any switch register you want to and from
> > there move on with the initialization.
> > 
> > > 
> > > 
> > > With emac_mii_reset_phy() in place, it gets detected:
> > > | switch0: Atheros AR8327 rev. 4 switch registered on emac_mdio
> > > 
> > > Furthermore, this is probably not the only device which need it.
> > > Currently, emac's own phy.c code does call emac_mii_reset_phy() 
> > > as well as part of its probe procedure.
> > > <http://lxr.free-electrons.com/source/drivers/net/ethernet/ibm/emac/phy.c#L522>
> > > 
> > > Ideally, we would like to reset only the ports which are registered in the DT.
> > 
> > Which you would get for free if you did extend qca8k to support the
> > 8327, because qca8k does implicitly tell the DSA layer to register a
> > dsa_slave_mii_bus which will probe and attach to per-port built-in PHYs
> > and that happens only for the ports enabled on your specific board.
> No, the Meraki and the modified WNDR4700 still refuse to work. Just >one<
> of the phys at address 0-4 need to be powered down to get the following
> error:
> [    4.425618] DSA: switch 0 0 parsed
> [    4.429034] DSA: tree 0 parsed
> [    4.435416] qca8k: probe of 4ef600c00.ethern:10 failed with error -5
> 
> I'll report back, what exactly is causing the error in this case.
Ok, the -EIO error is coming from this line:
<http://lxr.free-electrons.com/source/drivers/net/phy/phy_device.c?v=4.9#L496>

get_phy_id() gets called as part of mdiobus_register() in dsa_ds_apply()
<http://lxr.free-electrons.com/source/net/dsa/dsa2.c?v=4.9#L322> which is
called as part of the dsa_switch_register().

So, is there a good place to reset the switch/ports? Sure,
doing it in qca8k has the advantage that we know it's 5 ports.
However, I'm wondering, if the dsa or phy driver the right
place for this?

Note: behind the mdiobus_read() at 
<http://lxr.free-electrons.com/source/drivers/net/phy/phy_device.c?v=4.9#L494>
is emac's emac_mdio_read does returning -EREMOTEIO. I don't see that hiding
the error code behind a return 0xffff will help much:
---
For example I disabled just phy2:
| Generic PHY 4ef600c00.ethern:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=4ef600c00.ethern:00, irq=-1)
| Generic PHY dsa-0.0:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:01, irq=-1)
| qca8k 4ef600c00.ethern:10 lan2: no phy at 2
| qca8k 4ef600c00.ethern:10 lan2: failed to connect to phy2: -19
| emac 4ef600c00.ethernet eth0: error -19 setting up slave phy
| qca8k 4ef600c00.ethern:10: Failed to create slave 3: -19
| Generic PHY dsa-0.0:03: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:03, irq=-1)
| Generic PHY dsa-0.0:04: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:04, irq=-1)

And as a result: the switch "lost" the port.

When I tried to unbind the "faulty" switch in order to reset it
and rebind:

root@LEDE:/sys/bus/mdio_bus/drivers/qca8k# echo 4ef600c00.ethern:10 > unbind 
[ 2435.970104] Unable to handle kernel paging request for data at address 0x00000050
[ 2435.977580] Faulting instruction address: 0xc0359ae8
[ 2435.982532] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2435.987901] WNDR4700 Platform
[ 2436.116263] CPU: 0 PID: 663 Comm: ash Not tainted 4.9.8 #0
[ 2436.121729] task: cf974000 task.stack: ce42a000
[ 2436.126238] NIP: c0359ae8 LR: c036a534 CTR: c029b338
[ 2436.131180] REGS: ce42bd10 TRAP: 0300   Not tainted  (4.9.8)
[ 2436.136811] MSR: 00029000 <CE,EE,ME>[ 2436.140260]   CR: 24002288  XER: 00000000
[ 2436.144251] DEAR: 00000050 ESR: 00000000 
GPR00: c036a534 ce42bdc0 cf974000 cfb02800 c03f5553 00000000 cf9e19e0 00000001 
GPR08: cf9e1b00 00000040 00000000 00000001 00780100 00000000 00000000 00000000 
GPR16: 00000000 00000000 10060000 1005cf90 b79a2495 1005cf70 b7a354c4 00000000 
GPR24: cfaf2508 00000001 cfaf24fc cfffbca8 cfaf228c 00000003 cfaf2400 cfb02800 
NIP [c0359ae8] dsa_slave_destroy+0x28/0x78
[ 2436.180491] LR [c036a534] dsa_dst_unapply.part.7+0xa4/0xb70
[ 2436.186033] Call Trace:
[ 2436.188475] [ce42bdc0] [c0359c48] dsa_port_is_cpu+0x1c/0x50 (unreliable)
[ 2436.195161] [ce42bdd0] [c036a534] dsa_dst_unapply.part.7+0xa4/0xb70
[ 2436.201409] [ce42be00] [c0359d60] dsa_unregister_switch+0x3c/0x60
[ 2436.207485] [ce42be20] [c0238a18] mdio_remove+0x24/0x40
[ 2436.212701] [ce42be30] [c01c6544] __device_release_driver+0xa4/0x13c
[ 2436.219034] [ce42be40] [c01c6604] device_release_driver+0x28/0x40
[ 2436.225107] [ce42be50] [c01c4c94] unbind_store+0x6c/0xac
[ 2436.230417] [ce42be70] [c00f8838] kernfs_fop_write+0x128/0x1ac
[ 2436.236247] [ce42be90] [c00a8ce4] __vfs_write+0x28/0x134
[ 2436.241542] [ce42bef0] [c00a9a00] vfs_write+0xd0/0x17c
[ 2436.246665] [ce42bf10] [c00aa758] SyS_write+0x4c/0xa4
[ 2436.251705] [ce42bf40] [c000a848] ret_from_syscall+0x0/0x3c
[ 2436.257266] --- interrupt: c01 at 0xb7a0edc0
[ 2436.257266]     LR = 0xb7a00df0
[ 2436.264631] Instruction dump:
[ 2436.267597] 38210030 4e800020 7c0802a6 9421fff0 bfc10008 7c7f1b78 90010014 892304e8 
[ 2436.275397] 814304e4 39290004 55292036 7d2a4a14 <83c90010> 4bf46641 807f04ec 2f830000 
[ 2436.283367] ---[ end trace e5787cb2a55a0fbd ]---
[ 2436.289726] 
[ 2437.291295] Kernel panic - not syncing: Fatal exception

The panic is caused by dsa_slave_create() not clearing the dangling
ds->ports[port].netdev pointer. The function dsa_user_port_unapply()
relies on it being NULL in order to skip unregistered entries.
---
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 09fc3e9462c1..0dae29eb95d6 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1460,6 +1460,7 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
 	ret = dsa_slave_phy_setup(p, slave_dev);
 	if (ret) {
 		netdev_err(master, "error %d setting up slave phy\n", ret);
+		ds->ports[port].netdev = NULL;
 		unregister_netdev(slave_dev);
 		free_netdev(slave_dev);
 		return ret;
---

Thanks,
Christian

  reply	other threads:[~2017-02-15  0:18 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-05 22:25 [RFC 1/2] dt: emac: document device-tree based phy discovery and setup Christian Lamparter
     [not found] ` <f57a340f615991ed2771d8af4b1a908dec436a5e.1486333475.git.chunkeey-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
2017-02-05 22:25   ` [RFC 2/2] net: emac: add support for device-tree based PHY " Christian Lamparter
     [not found]     ` <710c7971cb7dcef54058b61dced03b5d27553380.1486333475.git.chunkeey-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org>
2017-02-05 22:44       ` Florian Fainelli
     [not found]         ` <7143c86d-6a3c-5e55-70cd-7424f28e1d78-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2017-02-11 22:45           ` Christian Lamparter
2017-02-11 23:07             ` Florian Fainelli
2017-02-13 23:38               ` Christian Lamparter
2017-02-15  0:16                 ` Christian Lamparter [this message]
2017-02-15  0:24                   ` Florian Fainelli
2017-02-15 14:23                   ` Andrew Lunn
2017-02-19 14:44                     ` Christian Lamparter
2017-02-05 22:33   ` [RFC 1/2] dt: emac: document device-tree based phy " Florian Fainelli
2017-02-19 15:20     ` Christian Lamparter
2017-02-08 23:00 ` Rob Herring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2409103.Lage0nq55N@debian64 \
    --to=chunkeey@googlemail.com \
    --cc=andrew@lunn.ch \
    --cc=chunkeey@gmail.com \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=ivan@de.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=vivien.didelot@savoirfairelinux.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).