* [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue @ 2016-11-28 15:50 Jerome Brunet 2016-11-28 15:50 ` [PATCH net-next v4 2/4] dt-bindings: net: add EEE capability constants Jerome Brunet ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: Jerome Brunet @ 2016-11-28 15:50 UTC (permalink / raw) To: netdev, devicetree, Florian Fainelli Cc: Jerome Brunet, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic, linux-arm-kernel, linux-kernel, Julia Lawall, Yegor Yefremov, Andreas Färber This patchset fixes an issue with the OdroidC2 board (DWMAC + RTL8211F). The platform seems to enter LPI on the Rx path too often while performing relatively high TX transfer. This eventually break the link (both Tx and Rx), and require to bring the interface down and up again to get the Rx path working again. The root cause of this issue is not fully understood yet but disabling EEE advertisement on the PHY prevent this feature to be negotiated. With this change, the link is stable and reliable, with the expected throughput performance. The patchset adds options in the generic phy driver to disable EEE advertisement, through device tree. The way it is done is very similar to the handling of the max-speed property. Patch 4 is provided here for testing purpose only. Please don't merge patch 4, this change will go through the amlogic's tree. Chnages since V3: [3] - Fix signess error reported by kbuild test robot (Thx Julia) Changes since V2: [2] - Rename "eee-advert-disable" to "eee-broken-modes" to make the intended purpose of this option clear (flag broken configuration, not a configuration option) - Add DT bindings constants so the DT configuration is more user friendly - Submit to net-next instead of net. Changes since V1: [1] - Disable the advertisement of EEE in the generic code instead of the realtek driver. [1] : http://lkml.kernel.org/r/1479220154-25851-1-git-send-email-jbrunet@baylibre.com [2] : http://lkml.kernel.org/r/1479742524-30222-1-git-send-email-jbrunet@baylibre.com [3] : http://lkml.kernel.org/r/1480326409-25419-1-git-send-email-jbrunet@baylibre.com Jerome Brunet (4): net: phy: add an option to disable EEE advertisement dt-bindings: net: add EEE capability constants dt: bindings: add ethernet phy eee-broken-modes option documentation ARM64: dts: meson: odroidc2: disable advertisement EEE for GbE. Documentation/devicetree/bindings/net/phy.txt | 2 + .../arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts | 14 ++++ drivers/net/phy/phy.c | 3 + drivers/net/phy/phy_device.c | 80 +++++++++++++++++++--- include/dt-bindings/net/mdio.h | 19 +++++ include/linux/phy.h | 3 + 6 files changed, 112 insertions(+), 9 deletions(-) create mode 100644 include/dt-bindings/net/mdio.h -- 2.7.4 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH net-next v4 2/4] dt-bindings: net: add EEE capability constants 2016-11-28 15:50 [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Jerome Brunet @ 2016-11-28 15:50 ` Jerome Brunet 2016-12-05 14:39 ` Rob Herring 2016-11-28 15:50 ` [PATCH net-next v4 3/4] dt: bindings: add ethernet phy eee-broken-modes option documentation Jerome Brunet [not found] ` <1480348229-25672-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> 2 siblings, 1 reply; 17+ messages in thread From: Jerome Brunet @ 2016-11-28 15:50 UTC (permalink / raw) To: netdev, devicetree, Florian Fainelli Cc: Jerome Brunet, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic, linux-arm-kernel, linux-kernel, Julia Lawall, Yegor Yefremov, Andreas Färber Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Tested-by: Yegor Yefremov <yegorslists@googlemail.com> Tested-by: Andreas Färber <afaerber@suse.de> Tested-by: Neil Armstrong <narmstrong@baylibre.com> --- include/dt-bindings/net/mdio.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) create mode 100644 include/dt-bindings/net/mdio.h diff --git a/include/dt-bindings/net/mdio.h b/include/dt-bindings/net/mdio.h new file mode 100644 index 000000000000..99c6d903d439 --- /dev/null +++ b/include/dt-bindings/net/mdio.h @@ -0,0 +1,19 @@ +/* + * This header provides generic constants for ethernet MDIO bindings + */ + +#ifndef _DT_BINDINGS_NET_MDIO_H +#define _DT_BINDINGS_NET_MDIO_H + +/* + * EEE capability Advertisement + */ + +#define MDIO_EEE_100TX 0x0002 /* 100TX EEE cap */ +#define MDIO_EEE_1000T 0x0004 /* 1000T EEE cap */ +#define MDIO_EEE_10GT 0x0008 /* 10GT EEE cap */ +#define MDIO_EEE_1000KX 0x0010 /* 1000KX EEE cap */ +#define MDIO_EEE_10GKX4 0x0020 /* 10G KX4 EEE cap */ +#define MDIO_EEE_10GKR 0x0040 /* 10G KR EEE cap */ + +#endif -- 2.7.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 2/4] dt-bindings: net: add EEE capability constants 2016-11-28 15:50 ` [PATCH net-next v4 2/4] dt-bindings: net: add EEE capability constants Jerome Brunet @ 2016-12-05 14:39 ` Rob Herring 2016-12-19 15:16 ` Jerome Brunet 0 siblings, 1 reply; 17+ messages in thread From: Rob Herring @ 2016-12-05 14:39 UTC (permalink / raw) To: Jerome Brunet Cc: netdev, devicetree, Florian Fainelli, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic, linux-arm-kernel, linux-kernel, Julia Lawall, Yegor Yefremov, Andreas Färber On Mon, Nov 28, 2016 at 04:50:26PM +0100, Jerome Brunet wrote: > Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> > Tested-by: Yegor Yefremov <yegorslists@googlemail.com> > Tested-by: Andreas Färber <afaerber@suse.de> > Tested-by: Neil Armstrong <narmstrong@baylibre.com> > --- > include/dt-bindings/net/mdio.h | 19 +++++++++++++++++++ > 1 file changed, 19 insertions(+) > create mode 100644 include/dt-bindings/net/mdio.h Seems changes are wanted on this, but patches 2 and 3 should be combined. The header is part of the binding doc. Rob ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 2/4] dt-bindings: net: add EEE capability constants 2016-12-05 14:39 ` Rob Herring @ 2016-12-19 15:16 ` Jerome Brunet 0 siblings, 0 replies; 17+ messages in thread From: Jerome Brunet @ 2016-12-19 15:16 UTC (permalink / raw) To: Rob Herring Cc: devicetree, Florian Fainelli, Alexandre TORGUE, Andrew Lunn, Martin Blumenstingl, netdev, Neil Armstrong, linux-kernel, Yegor Yefremov, Julia Lawall, Andre Roth, Kevin Hilman, Carlo Caione, Giuseppe Cavallaro, linux-amlogic, Andreas Färber, linux-arm-kernel Hi Rob, First, Thx for this information and sorry for this late reply As you may have seen yourself, there was little bit of confusion while discussing this patch series. The point is the v3 was applied before your reply (patches 2 and 3 not combined unfortunately). Because of this confusion, the series needed a few fixes witch removes the previously added bindings [0]. This time, I made sure to modify (remove) the bindings along with the documentation. [0]: http://lkml.kernel.org/r/1482159938-13239-1-git-send-email-jbrunet @baylibre.com Cheers Jerome On Mon, 2016-12-05 at 08:39 -0600, Rob Herring wrote: > On Mon, Nov 28, 2016 at 04:50:26PM +0100, Jerome Brunet wrote: > > > > Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> > > Tested-by: Yegor Yefremov <yegorslists@googlemail.com> > > Tested-by: Andreas Färber <afaerber@suse.de> > > Tested-by: Neil Armstrong <narmstrong@baylibre.com> > > --- > > include/dt-bindings/net/mdio.h | 19 +++++++++++++++++++ > > 1 file changed, 19 insertions(+) > > create mode 100644 include/dt-bindings/net/mdio.h > > Seems changes are wanted on this, but patches 2 and 3 should be > combined. The header is part of the binding doc. > > Rob _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH net-next v4 3/4] dt: bindings: add ethernet phy eee-broken-modes option documentation 2016-11-28 15:50 [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Jerome Brunet 2016-11-28 15:50 ` [PATCH net-next v4 2/4] dt-bindings: net: add EEE capability constants Jerome Brunet @ 2016-11-28 15:50 ` Jerome Brunet [not found] ` <1480348229-25672-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> 2 siblings, 0 replies; 17+ messages in thread From: Jerome Brunet @ 2016-11-28 15:50 UTC (permalink / raw) To: netdev, devicetree, Florian Fainelli Cc: Jerome Brunet, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic, linux-arm-kernel, linux-kernel, Julia Lawall, Yegor Yefremov, Andreas Färber Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Tested-by: Neil Armstrong <narmstrong@baylibre.com> --- Documentation/devicetree/bindings/net/phy.txt | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt index 4627da3d52c4..54749b60a466 100644 --- a/Documentation/devicetree/bindings/net/phy.txt +++ b/Documentation/devicetree/bindings/net/phy.txt @@ -38,6 +38,8 @@ Optional Properties: - enet-phy-lane-swap: If set, indicates the PHY will swap the TX/RX lanes to compensate for the board being designed with the lanes swapped. +- eee-broken-modes: Bits to clear in the MDIO_AN_EEE_ADV register to + disable EEE broken modes. Example: -- 2.7.4 ^ permalink raw reply related [flat|nested] 17+ messages in thread
[parent not found: <1480348229-25672-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>]
* [PATCH net-next v4 1/4] net: phy: add an option to disable EEE advertisement [not found] ` <1480348229-25672-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> @ 2016-11-28 15:50 ` Jerome Brunet 2016-11-28 15:50 ` [PATCH net-next v4 4/4] ARM64: dts: meson: odroidc2: disable advertisement EEE for GbE Jerome Brunet 2016-11-28 17:54 ` [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Florian Fainelli 2 siblings, 0 replies; 17+ messages in thread From: Jerome Brunet @ 2016-11-28 15:50 UTC (permalink / raw) To: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA, Florian Fainelli Cc: Jerome Brunet, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Julia Lawall, Yegor Yefremov, Andreas Färber This patch adds an option to disable EEE advertisement in the generic PHY by providing a mask of prohibited modes corresponding to the value found in the MDIO_AN_EEE_ADV register. On some platforms, PHY Low power idle seems to be causing issues, even breaking the link some cases. The patch provides a convenient way for these platforms to disable EEE advertisement and work around the issue. Signed-off-by: Jerome Brunet <jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> Tested-by: Yegor Yefremov <yegorslists-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> Tested-by: Andreas Färber <afaerber-l3A5Bk7waGM@public.gmane.org> Tested-by: Neil Armstrong <narmstrong-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> --- drivers/net/phy/phy.c | 3 ++ drivers/net/phy/phy_device.c | 80 +++++++++++++++++++++++++++++++++++++++----- include/linux/phy.h | 3 ++ 3 files changed, 77 insertions(+), 9 deletions(-) diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c index 73adbaa9ac86..a3981cc6448a 100644 --- a/drivers/net/phy/phy.c +++ b/drivers/net/phy/phy.c @@ -1396,6 +1396,9 @@ int phy_ethtool_set_eee(struct phy_device *phydev, struct ethtool_eee *data) { int val = ethtool_adv_to_mmd_eee_adv_t(data->advertised); + /* Mask prohibited EEE modes */ + val &= ~phydev->eee_broken_modes; + phy_write_mmd_indirect(phydev, MDIO_AN_EEE_ADV, MDIO_MMD_AN, val); return 0; diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index ba86c191a13e..cb4aca205cf8 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -1121,6 +1121,43 @@ static int genphy_config_advert(struct phy_device *phydev) } /** + * genphy_config_eee_advert - disable unwanted eee mode advertisement + * @phydev: target phy_device struct + * + * Description: Writes MDIO_AN_EEE_ADV after disabling unsupported energy + * efficent ethernet modes. Returns 0 if the PHY's advertisement hasn't + * changed, and 1 if it has changed. + */ +static int genphy_config_eee_advert(struct phy_device *phydev) +{ + int broken = phydev->eee_broken_modes; + int old_adv, adv; + + /* Nothing to disable */ + if (!broken) + return 0; + + /* If the following call fails, we assume that EEE is not + * supported by the phy. If we read 0, EEE is not advertised + * In both case, we don't need to continue + */ + adv = phy_read_mmd_indirect(phydev, MDIO_AN_EEE_ADV, MDIO_MMD_AN); + if (adv <= 0) + return 0; + + old_adv = adv; + adv &= ~broken; + + /* Advertising remains unchanged with the broken mask */ + if (old_adv == adv) + return 0; + + phy_write_mmd_indirect(phydev, MDIO_AN_EEE_ADV, MDIO_MMD_AN, adv); + + return 1; +} + +/** * genphy_setup_forced - configures/forces speed/duplex from @phydev * @phydev: target phy_device struct * @@ -1178,15 +1215,20 @@ EXPORT_SYMBOL(genphy_restart_aneg); */ int genphy_config_aneg(struct phy_device *phydev) { - int result; + int err, changed; + + changed = genphy_config_eee_advert(phydev); if (AUTONEG_ENABLE != phydev->autoneg) return genphy_setup_forced(phydev); - result = genphy_config_advert(phydev); - if (result < 0) /* error */ - return result; - if (result == 0) { + err = genphy_config_advert(phydev); + if (err < 0) /* error */ + return err; + + changed |= err; + + if (changed == 0) { /* Advertisement hasn't changed, but maybe aneg was never on to * begin with? Or maybe phy was isolated? */ @@ -1196,16 +1238,16 @@ int genphy_config_aneg(struct phy_device *phydev) return ctl; if (!(ctl & BMCR_ANENABLE) || (ctl & BMCR_ISOLATE)) - result = 1; /* do restart aneg */ + changed = 1; /* do restart aneg */ } /* Only restart aneg if we are advertising something different * than we were before. */ - if (result > 0) - result = genphy_restart_aneg(phydev); + if (changed > 0) + return genphy_restart_aneg(phydev); - return result; + return 0; } EXPORT_SYMBOL(genphy_config_aneg); @@ -1563,6 +1605,21 @@ static void of_set_phy_supported(struct phy_device *phydev) __set_phy_supported(phydev, max_speed); } +static void of_set_phy_eee_broken(struct phy_device *phydev) +{ + struct device_node *node = phydev->mdio.dev.of_node; + u32 broken; + + if (!IS_ENABLED(CONFIG_OF_MDIO)) + return; + + if (!node) + return; + + if (!of_property_read_u32(node, "eee-broken-modes", &broken)) + phydev->eee_broken_modes = broken; +} + /** * phy_probe - probe and init a PHY device * @dev: device to probe and init @@ -1600,6 +1657,11 @@ static int phy_probe(struct device *dev) of_set_phy_supported(phydev); phydev->advertising = phydev->supported; + /* Get the EEE modes we want to prohibit. We will ask + * the PHY stop advertising these mode later on + */ + of_set_phy_eee_broken(phydev); + /* Set the state to READY by default */ phydev->state = PHY_READY; diff --git a/include/linux/phy.h b/include/linux/phy.h index edde28ce163a..b53177fd38af 100644 --- a/include/linux/phy.h +++ b/include/linux/phy.h @@ -417,6 +417,9 @@ struct phy_device { u32 advertising; u32 lp_advertising; + /* Energy efficient ethernet modes which should be prohibited */ + u32 eee_broken_modes; + int autoneg; int link_timeout; -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH net-next v4 4/4] ARM64: dts: meson: odroidc2: disable advertisement EEE for GbE. [not found] ` <1480348229-25672-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> 2016-11-28 15:50 ` [PATCH net-next v4 1/4] net: phy: add an option to disable EEE advertisement Jerome Brunet @ 2016-11-28 15:50 ` Jerome Brunet 2016-11-28 17:54 ` [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Florian Fainelli 2 siblings, 0 replies; 17+ messages in thread From: Jerome Brunet @ 2016-11-28 15:50 UTC (permalink / raw) To: netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA, Florian Fainelli Cc: Jerome Brunet, Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Julia Lawall, Yegor Yefremov, Andreas Färber Signed-off-by: Jerome Brunet <jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> Tested-by: Neil Armstrong <narmstrong-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> --- arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts b/arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts index e6e3491d48a5..2036582ca0d6 100644 --- a/arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts +++ b/arch/arm64/boot/dts/amlogic/meson-gxbb-odroidc2.dts @@ -46,6 +46,7 @@ #include "meson-gxbb.dtsi" #include <dt-bindings/gpio/gpio.h> +#include <dt-bindings/net/mdio.h> / { compatible = "hardkernel,odroid-c2", "amlogic,meson-gxbb"; @@ -85,6 +86,19 @@ status = "okay"; pinctrl-0 = <ð_pins>; pinctrl-names = "default"; + + phy-handle = <ð_phy0>; + + mdio { + compatible = "snps,dwmac-mdio"; + #address-cells = <1>; + #size-cells = <0>; + + eth_phy0: ethernet-phy@0 { + reg = <0>; + eee-broken-modes = <MDIO_EEE_1000T>; + }; + }; }; &ir { -- 2.7.4 -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue [not found] ` <1480348229-25672-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> 2016-11-28 15:50 ` [PATCH net-next v4 1/4] net: phy: add an option to disable EEE advertisement Jerome Brunet 2016-11-28 15:50 ` [PATCH net-next v4 4/4] ARM64: dts: meson: odroidc2: disable advertisement EEE for GbE Jerome Brunet @ 2016-11-28 17:54 ` Florian Fainelli 2016-11-30 9:47 ` Jerome Brunet 2017-01-05 23:25 ` Russell King - ARM Linux 2 siblings, 2 replies; 17+ messages in thread From: Florian Fainelli @ 2016-11-28 17:54 UTC (permalink / raw) To: Jerome Brunet, netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA Cc: Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Julia Lawall, Yegor Yefremov, Andreas Färber On 11/28/2016 07:50 AM, Jerome Brunet wrote: > This patchset fixes an issue with the OdroidC2 board (DWMAC + RTL8211F). > The platform seems to enter LPI on the Rx path too often while performing > relatively high TX transfer. This eventually break the link (both Tx and > Rx), and require to bring the interface down and up again to get the Rx > path working again. > > The root cause of this issue is not fully understood yet but disabling EEE > advertisement on the PHY prevent this feature to be negotiated. > With this change, the link is stable and reliable, with the expected > throughput performance. > > The patchset adds options in the generic phy driver to disable EEE > advertisement, through device tree. The way it is done is very similar > to the handling of the max-speed property. > > Patch 4 is provided here for testing purpose only. Please don't merge > patch 4, this change will go through the amlogic's tree. Sorry, but I really don't like the route this is going, and I should have made myself clearer before on that, I really think utilizing a PHY fixup is more appropriate here than an extremely generic DT property. The fixup code can be in the affected PHY driver, or it can be somewhere else, your call. There is no shortage of option on how to implement it, and this would be something easy to enable/disable for known good configurations (ala PCI/USB fixups). If we start supporting generic "enable", "disable" type of properties with values that map directly to register definitions of the HW, we leave too much room for these properties to be utilized to implement a specific policy, and this is not acceptable. -- Florian -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue 2016-11-28 17:54 ` [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Florian Fainelli @ 2016-11-30 9:47 ` Jerome Brunet [not found] ` <1480499246.17538.208.camel-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> 2017-01-05 23:25 ` Russell King - ARM Linux 1 sibling, 1 reply; 17+ messages in thread From: Jerome Brunet @ 2016-11-30 9:47 UTC (permalink / raw) To: Florian Fainelli, netdev, devicetree Cc: Andrew Lunn, Alexandre TORGUE, Neil Armstrong, Martin Blumenstingl, Kevin Hilman, linux-kernel, Yegor Yefremov, Julia Lawall, Andre Roth, linux-amlogic, Carlo Caione, Giuseppe Cavallaro, Andreas Färber, linux-arm-kernel On Mon, 2016-11-28 at 09:54 -0800, Florian Fainelli wrote: > On 11/28/2016 07:50 AM, Jerome Brunet wrote: > > > > This patchset fixes an issue with the OdroidC2 board (DWMAC + > > RTL8211F). > > The platform seems to enter LPI on the Rx path too often while > > performing > > relatively high TX transfer. This eventually break the link (both > > Tx and > > Rx), and require to bring the interface down and up again to get > > the Rx > > path working again. > > > > The root cause of this issue is not fully understood yet but > > disabling EEE > > advertisement on the PHY prevent this feature to be negotiated. > > With this change, the link is stable and reliable, with the > > expected > > throughput performance. > > > > The patchset adds options in the generic phy driver to disable EEE > > advertisement, through device tree. The way it is done is very > > similar > > to the handling of the max-speed property. > > > > Patch 4 is provided here for testing purpose only. Please don't > > merge > > patch 4, this change will go through the amlogic's tree. > > Sorry, but I really don't like the route this is going, and I should > have made myself clearer before on that, I really think utilizing a > PHY > fixup is more appropriate here than an extremely generic DT property. > The fixup code can be in the affected PHY driver, or it can be > somewhere > else, your call. There is no shortage of option on how to implement > it, > and this would be something easy to enable/disable for known good > configurations (ala PCI/USB fixups). > > If we start supporting generic "enable", "disable" type of properties > with values that map directly to register definitions of the HW, we > leave too much room for these properties to be utilized to implement > a > specific policy, and this is not acceptable. Florian, I agree that DT should not be used to setup a policy, but to describe what the HW is. I tried to implement it the way you suggested, using phy fixup, too see what it looks like. There is 2 places in the code that seems (remotely) linked to the issue: - meson8b_dwmac driver : if the mac, regardless of the board/platform, could not tolerate to have EEE activated, it would make sense to have the fixup here. It can provide a C callback for such case. - realtek phy driver: philosophy is kind of the same To be clear, it is doable and it works that way, but I don't think embedding this directly in the code is the right way to do it. It seems we are hiding an information specific about the board inside a generic driver. We have several amlogic's design with the same MAC, sometimes with the same PHY, which have no problem with EEE at all. The issue is really about the board design. What I propose is not an enable/disable configuration switch, but to clearly state that a particular mode of operation is broken. Like the "max-speed" property, it setup a restriction. IMO, this is a description of what the HW is and is capable of, and as such it should be part of the DT. Yes the property directly map to a register, but it does let you directly manipulate it (you can't pass the value you want to write in the register). Having it this way just makes the code simple on both ends (user and driver). Yes people could start abusing this to setup policy. In the end, it is our responsibility, as community, to make sure APIs are used in a proper way, and not let it be used that way. I'm open to suggestion on how improve the solution, maybe something which could bring more confidence that property won't be misused. Jerome _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <1480499246.17538.208.camel-rdvid1DuHRBWk0Htik3J/w@public.gmane.org>]
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue [not found] ` <1480499246.17538.208.camel-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> @ 2016-11-30 18:28 ` Florian Fainelli 0 siblings, 0 replies; 17+ messages in thread From: Florian Fainelli @ 2016-11-30 18:28 UTC (permalink / raw) To: Jerome Brunet, netdev-u79uwXL29TY76Z2rM5mHXA, devicetree-u79uwXL29TY76Z2rM5mHXA Cc: Carlo Caione, Kevin Hilman, Giuseppe Cavallaro, Alexandre TORGUE, Martin Blumenstingl, Andre Roth, Andrew Lunn, Neil Armstrong, linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Julia Lawall, Yegor Yefremov, Andreas Färber On 11/30/2016 01:47 AM, Jerome Brunet wrote: >> If we start supporting generic "enable", "disable" type of properties >> with values that map directly to register definitions of the HW, we >> leave too much room for these properties to be utilized to implement >> a >> specific policy, and this is not acceptable. > > Florian, > > I agree that DT should not be used to setup a policy, but to describe > what the HW is. > > I tried to implement it the way you suggested, using phy fixup, too see > what it looks like. > There is 2 places in the code that seems (remotely) linked to the > issue: > - meson8b_dwmac driver : if the mac, regardless of the board/platform, > could not tolerate to have EEE activated, it would make sense to have > the fixup here. It can provide a C callback for such case. > - realtek phy driver: philosophy is kind of the same > > To be clear, it is doable and it works that way, but I don't think > embedding this directly in the code is the right way to do it. It seems > we are hiding an information specific about the board inside a generic > driver. So there are a few things about that: - if we were not on ARM64, there would be possibly a remote chance of having some concept of a board file which would be where such a PHY fixup, or fixup of any kind would reside - having the PHY fixup in the PHY driver gated by both an exact match on the PHY OUI *and* the specific affected board makes it reasonably easy to locate it > > We have several amlogic's design with the same MAC, sometimes with the > same PHY, which have no problem with EEE at all. The issue is really > about the board design. OK, not a problem then: of_machine_is_compatible() should help you here? > > What I propose is not an enable/disable configuration switch, but to > clearly state that a particular mode of operation is broken. Like the > "max-speed" property, it setup a restriction. IMO, this is a > description of what the HW is and is capable of, and as such it should > be part of the DT. Sure, there is a fine line between describing what's broken, and being able to use that to actually configure non-broken hardware the way you want. > > Yes the property directly map to a register, but it does let you > directly manipulate it (you can't pass the value you want to write in > the register). Having it this way just makes the code simple on both > ends (user and driver). That's exactly the part that is giving me the creeps, any property that directly maps to a register value has a chance of a) leading to hard to debug problem if mis-configured, and b) being used as a policy as opposed to purely describing what is going on with the HW. > > Yes people could start abusing this to setup policy. In the end, it is > our responsibility, as community, to make sure APIs are used in a > proper way, and not let it be used that way. > > I'm open to suggestion on how improve the solution, maybe something > which could bring more confidence that property won't be misused. Once the binding lands in the kernel, there is absolutely zero guarantee nor visibility in how people end-up using in e.g: DT aware bootloader, and I am one of these people. Since there is a binding, there is consumer code in the kernel that needs to behave properly with respect to how the binding is defined. This is the same problem as with any kind of ABI, and a diverse range of consumers. -- Florian -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue 2016-11-28 17:54 ` [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Florian Fainelli 2016-11-30 9:47 ` Jerome Brunet @ 2017-01-05 23:25 ` Russell King - ARM Linux 2017-01-06 5:42 ` Yegor Yefremov 2017-01-06 10:11 ` Jerome Brunet 1 sibling, 2 replies; 17+ messages in thread From: Russell King - ARM Linux @ 2017-01-05 23:25 UTC (permalink / raw) To: Florian Fainelli Cc: devicetree, Alexandre TORGUE, Neil Armstrong, Martin Blumenstingl, netdev, Giuseppe Cavallaro, linux-kernel, Yegor Yefremov, Julia Lawall, Andre Roth, Andrew Lunn, Kevin Hilman, Carlo Caione, linux-amlogic, Andreas Färber, linux-arm-kernel, Jerome Brunet On Mon, Nov 28, 2016 at 09:54:28AM -0800, Florian Fainelli wrote: > If we start supporting generic "enable", "disable" type of properties > with values that map directly to register definitions of the HW, we > leave too much room for these properties to be utilized to implement a > specific policy, and this is not acceptable. Another concern with this patch is that the existing phylib "set_eee" code is horribly buggy - it just translates the modes from userspace into the register value and writes them directly to the register with no validation. So it's possible to set modes in the register that the hardware doesn't support, and have them advertised to the link partner. I have a patch which fixes that, restricting (as we do elsewhere) the advert according to the EEE supported capabilities retrieved from the PCS - maybe the problem here is that the PCS doesn't support support EEE in 1000baseT mode? Out of interest, which PHY is used on this platform? On the SolidRun boards, they're using AR8035, and have suffered this occasional link drop problem. What has been found is that it seems to be to do with the timing parameters, and it seemed to only be 1000bT that was affected. I don't remember off hand exactly which or what the change was they made to stabilise it though, but I can probabily find out tomorrow. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue 2017-01-05 23:25 ` Russell King - ARM Linux @ 2017-01-06 5:42 ` Yegor Yefremov [not found] ` <CAGm1_kvZ4dQrJ89qYU5wLGU1NR=j9xyWUm2mgYtq3F1+bo1OCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2017-01-06 10:11 ` Jerome Brunet 1 sibling, 1 reply; 17+ messages in thread From: Yegor Yefremov @ 2017-01-06 5:42 UTC (permalink / raw) To: Russell King - ARM Linux Cc: devicetree, Florian Fainelli, Alexandre TORGUE, Andrew Lunn, Martin Blumenstingl, netdev, Giuseppe Cavallaro, Neil Armstrong, kernel list, Julia Lawall, Andre Roth, Kevin Hilman, Carlo Caione, linux-amlogic, Andreas Färber, linux-arm-kernel, Jerome Brunet Hi Russel, On Fri, Jan 6, 2017 at 12:25 AM, Russell King - ARM Linux <linux@armlinux.org.uk> wrote: > On Mon, Nov 28, 2016 at 09:54:28AM -0800, Florian Fainelli wrote: >> If we start supporting generic "enable", "disable" type of properties >> with values that map directly to register definitions of the HW, we >> leave too much room for these properties to be utilized to implement a >> specific policy, and this is not acceptable. > > Another concern with this patch is that the existing phylib "set_eee" > code is horribly buggy - it just translates the modes from userspace > into the register value and writes them directly to the register with > no validation. So it's possible to set modes in the register that the > hardware doesn't support, and have them advertised to the link partner. > > I have a patch which fixes that, restricting (as we do elsewhere) the > advert according to the EEE supported capabilities retrieved from the > PCS - maybe the problem here is that the PCS doesn't support support > EEE in 1000baseT mode? > > Out of interest, which PHY is used on this platform? > > On the SolidRun boards, they're using AR8035, and have suffered this > occasional link drop problem. What has been found is that it seems to > be to do with the timing parameters, and it seemed to only be 1000bT > that was affected. I don't remember off hand exactly which or what > the change was they made to stabilise it though, but I can probabily > find out tomorrow. I have different boards with am335x and AR8035 and we had occasional link drop with both 100 and 1000 speeds. Yegor ^ permalink raw reply [flat|nested] 17+ messages in thread
[parent not found: <CAGm1_kvZ4dQrJ89qYU5wLGU1NR=j9xyWUm2mgYtq3F1+bo1OCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue [not found] ` <CAGm1_kvZ4dQrJ89qYU5wLGU1NR=j9xyWUm2mgYtq3F1+bo1OCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2017-01-06 11:58 ` Russell King - ARM Linux 0 siblings, 0 replies; 17+ messages in thread From: Russell King - ARM Linux @ 2017-01-06 11:58 UTC (permalink / raw) To: Yegor Yefremov Cc: Florian Fainelli, Jerome Brunet, netdev, devicetree-u79uwXL29TY76Z2rM5mHXA, Andrew Lunn, Alexandre TORGUE, Neil Armstrong, Martin Blumenstingl, Kevin Hilman, kernel list, Julia Lawall, Andre Roth, linux-amlogic-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, Carlo Caione, Giuseppe Cavallaro, Andreas Färber, linux-arm-kernel On Fri, Jan 06, 2017 at 06:42:24AM +0100, Yegor Yefremov wrote: > On Fri, Jan 6, 2017 at 12:25 AM, Russell King - ARM Linux > <linux-I+IVW8TIWO2tmTQ+vhA3Yw@public.gmane.org> wrote: > > Another concern with this patch is that the existing phylib "set_eee" > > code is horribly buggy - it just translates the modes from userspace > > into the register value and writes them directly to the register with > > no validation. So it's possible to set modes in the register that the > > hardware doesn't support, and have them advertised to the link partner. > > > > I have a patch which fixes that, restricting (as we do elsewhere) the > > advert according to the EEE supported capabilities retrieved from the > > PCS - maybe the problem here is that the PCS doesn't support support > > EEE in 1000baseT mode? > > > > Out of interest, which PHY is used on this platform? > > > > On the SolidRun boards, they're using AR8035, and have suffered this > > occasional link drop problem. What has been found is that it seems to > > be to do with the timing parameters, and it seemed to only be 1000bT > > that was affected. I don't remember off hand exactly which or what > > the change was they made to stabilise it though, but I can probabily > > find out tomorrow. > > I have different boards with am335x and AR8035 and we had occasional > link drop with both 100 and 1000 speeds. AR8035 has "Smart EEE", which is a PHY specific thing... it's not entirely 802.3 compliant as it doesn't involve the MAC. The Smart EEE control registers are in the PCS MMD - some of this is from people's memories: - 0x805b is the TX wakeup timer. Lower 8 bits for 100base-Tx and upper 8 bits for 1000base-T. - 0x805c and 0x805d. The LPI timer is 24 bit, with the lower 16 bits in 0x805c and the upper 8 in 0x805d. 0x805d bit 8 appears to be the Smart EEE enable bit. What was found was setting the 1000base-T wakeup timer to the same as the 100base-Tx avoided the problems we were seeing, which was only with 1000base-T. (Whether that's because 100base-Tx hasn't been as well tested, I don't know.) SR ended up with 0x1717 in 0x805b. I'd suggest playing around with that register to see if extending the wakeup time has any beneficial effect. Also, I suspect Smart EEE shouldn't be enabled if you have an EEE capable MAC (in which case 0x805d bit 8 should be clear.) I think, however, it defaults to enabled. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue 2017-01-05 23:25 ` Russell King - ARM Linux 2017-01-06 5:42 ` Yegor Yefremov @ 2017-01-06 10:11 ` Jerome Brunet 2017-01-06 11:42 ` Russell King - ARM Linux 1 sibling, 1 reply; 17+ messages in thread From: Jerome Brunet @ 2017-01-06 10:11 UTC (permalink / raw) To: Russell King - ARM Linux, Florian Fainelli Cc: Andrew Lunn, Alexandre TORGUE, Neil Armstrong, Martin Blumenstingl, netdev, Giuseppe Cavallaro, linux-kernel, Yegor Yefremov, Julia Lawall, devicetree, Andre Roth, Kevin Hilman, Carlo Caione, linux-amlogic, Andreas Färber, linux-arm-kernel On Thu, 2017-01-05 at 23:25 +0000, Russell King - ARM Linux wrote: > On Mon, Nov 28, 2016 at 09:54:28AM -0800, Florian Fainelli wrote: > > > > If we start supporting generic "enable", "disable" type of > > properties > > with values that map directly to register definitions of the HW, we > > leave too much room for these properties to be utilized to > > implement a > > specific policy, and this is not acceptable. > > Another concern with this patch is that the existing phylib "set_eee" > code is horribly buggy - it just translates the modes from userspace > into the register value and writes them directly to the register with > no validation. So it's possible to set modes in the register that > the > hardware doesn't support, and have them advertised to the link > partner. Hi Russell, The purpose of this patch is to provide a way to mark as broken a particular eee mode. At first, it had nothing to do with "set_eee" but, as Florian rightly pointed out, users shouldn't be able to re-enable a broken mode. At first, I was thinking about returning -ENOSUP if a broken mode was requested. Then I noticed the behavior you just described: A user can request anything of "set_eee", he won't necessarily get what he asked but won't get an error either. To avoid mixing different topic in a single patch, I kept the same behavior, not returning an error, just silently discarding broken modes I agree with you, we should probably validate a bit more what we asked of the hardware in set_eee. I wonder if we should return an error though. With the current implementation, user space application could simply ask to activate everything, excepting the kernel to sort it out and return an error only if something horribly wrong happened. If we start returning an error for unsupported modes, we could break things. I guess we should just silently filter the requested modes. > > I have a patch which fixes that, restricting (as we do elsewhere) the > advert according to the EEE supported capabilities retrieved from the > PCS Could be interesting :) > - maybe the problem here is that the PCS doesn't support support > EEE in 1000baseT mode? It does, and that's kind of the problem. EEE in ON for 100Tx and 1000T by default with this PHY. I have several platform with the same MAC-PHY combination. Only the OdroidC2 shows this particular issue with 1000T- EEE As explained in other mails in this thread. The problem does not come from the MAC entering LPI. It actually comes from the link partner entering LPI on the Rx path under significant Tx transfer. For some reason, this completely mess up our PHY. > > Out of interest, which PHY is used on this platform? The PHY is the Realtek RTL8211F > > On the SolidRun boards, they're using AR8035, and have suffered this > occasional link drop problem. What has been found is that it seems > to > be to do with the timing parameters, and it seemed to only be 1000bT > that was affected. I don't remember off hand exactly which or what > the change was they made to stabilise it though, but I can probabily > find out tomorrow. > Since the same combination of MAC-PHY works well on other designs, it is also my feeling that is has something to do with some timing parameter, maybe related to this particular PCB. While debugging this issue, we tried to play with all the parameters we could think of but never found anything worth mentioning. If you have any ideas, I'd be happy to try. Jerome _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue 2017-01-06 10:11 ` Jerome Brunet @ 2017-01-06 11:42 ` Russell King - ARM Linux 2017-01-06 13:50 ` Jerome Brunet 0 siblings, 1 reply; 17+ messages in thread From: Russell King - ARM Linux @ 2017-01-06 11:42 UTC (permalink / raw) To: Jerome Brunet Cc: Florian Fainelli, netdev, devicetree, Andrew Lunn, Alexandre TORGUE, Neil Armstrong, Martin Blumenstingl, Kevin Hilman, linux-kernel, Yegor Yefremov, Julia Lawall, Andre Roth, linux-amlogic, Carlo Caione, Giuseppe Cavallaro, Andreas Färber, linux-arm-kernel On Fri, Jan 06, 2017 at 11:11:36AM +0100, Jerome Brunet wrote: > The purpose of this patch is to provide a way to mark as broken a > particular eee mode. At first, it had nothing to do with "set_eee" but, > as Florian rightly pointed out, users shouldn't be able to re-enable a > broken mode. I think something else has been missed - I don't see much point to telling userspace that (eg) 1000baseT EEE is supported and then ignore attempts to advertise it. If it's broken, then arguably the hardware doesn't support the mode, so we should really be masking those bits from the EEE supported mask as well. > I wonder if we should return an error though. With the current > implementation, user space application could simply ask to activate > everything, excepting the kernel to sort it out and return an error > only if something horribly wrong happened. If we start returning an > error for unsupported modes, we could break things. I guess we should > just silently filter the requested modes. The ethtool behaviour for advertisments is that errors are not returned unless the attempted advert is really wrong. So, for example, when setting an advertisment for link modes, we accept the user's supplied mask, and bitwise AND it with the supported mask, so unsupported link modes are cleared. Only if the result is an empty mask do we then return an error to userspace. It's similar for forcing the link parameters - phylib attempts to find the best phy setting mode which fuzzily matches the users request, but doesn't error out if we can't do exactly what the user requested. In the EEE case, an empty mask is acceptable (it means "EEE is supported in no link modes") so it isn't appropriate to return errors there. > > - maybe the problem here is that the PCS doesn't support support > > EEE in 1000baseT mode? > > > It does, and that's kind of the problem. EEE in ON for 100Tx and 1000T > by default with this PHY. I have several platform with the same MAC-PHY > combination. Only the OdroidC2 shows this particular issue with 1000T- > EEE > > As explained in other mails in this thread. The problem does not come > from the MAC entering LPI. It actually comes from the link partner > entering LPI on the Rx path under significant Tx transfer. For some > reason, this completely mess up our PHY. For a 1000baseT link to enter low power, both ends have to enter LPI (see 802.3 78.1.3.3.1) - the Tx and Rx paths can't independently enter LPI. So, if you have a busy Tx link, the link itself can't be entering LPI. Your link partner may be sending a request to enter LPI due to its own Tx path being idle, which should then be forwarded to your MAC. It's pretty hard to see what could be messed up with that - I'd have expected the problems to occur when both ends were idle and the link had entered low power mode. > > On the SolidRun boards, they're using AR8035, and have suffered this > > occasional link drop problem. What has been found is that it seems > > to > > be to do with the timing parameters, and it seemed to only be 1000bT > > that was affected. I don't remember off hand exactly which or what > > the change was they made to stabilise it though, but I can probabily > > find out tomorrow. > > > > Since the same combination of MAC-PHY works well on other designs, it > is also my feeling that is has something to do with some timing > parameter, maybe related to this particular PCB. Maybe a different PHY interface? Meson seems to use RGMII, maybe others use SGMII - but then I'd expect 100base-Tx to also be broken. So not really sure. I was talking to Florian about that last night, because the mis-named phy_init_eee() tests for various phy interface modes before proceeding, which seems to be fairly rubbish as the list of interface modes is gradually increasing since it was introduced (and I need to add SGMII to it.) The conclusion I've come to there is that the test should never have been part of phylib, because if there are restrictions on which phy interface modes are allowable for EEE, they're likely to be either PHY or MAC specific. The other problem that having the test there causes is that if the existing users can't handle EEE over SGMII, then when I add SGMII to support my hardware, they end up breaking - far from desirable. There's no information on why the test is there, or even which PHYs or MACs it's applicable to, which makes this unnecessarily more difficult to now resolve. My feeling is that the integration of EEE into phylib is fairly poor at the moment, and we need to be a lot smarter about it. BTW, one of the problems (not caused by your patch) is that changing the EEE advertisment does not (on all PHY drivers) cause the link to be renegotiated - there's no call to phy_start_aneg() when the advert changes, and even if there was, there's no guarantee that phy_start_aneg() will even set the AN restart bit in the control register. However, given that you're hooking into the set_eee function, I'm not sure why you placed your EEE advertisment thing into config_aneg() - isn't it more an initialisation thing (so should be in config_init()?) -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue 2017-01-06 11:42 ` Russell King - ARM Linux @ 2017-01-06 13:50 ` Jerome Brunet 2017-01-06 15:05 ` Russell King - ARM Linux 0 siblings, 1 reply; 17+ messages in thread From: Jerome Brunet @ 2017-01-06 13:50 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Andrew Lunn, Florian Fainelli, Alexandre TORGUE, Neil Armstrong, Martin Blumenstingl, netdev, Giuseppe Cavallaro, linux-kernel, Yegor Yefremov, Julia Lawall, devicetree, Andre Roth, Kevin Hilman, Carlo Caione, linux-amlogic, Andreas Färber, linux-arm-kernel On Fri, 2017-01-06 at 11:42 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 06, 2017 at 11:11:36AM +0100, Jerome Brunet wrote: > > > > The purpose of this patch is to provide a way to mark as broken a > > particular eee mode. At first, it had nothing to do with "set_eee" > > but, > > as Florian rightly pointed out, users shouldn't be able to re- > > enable a > > broken mode. > > I think something else has been missed - I don't see much point to > telling userspace that (eg) 1000baseT EEE is supported and then > ignore attempts to advertise it. > > If it's broken, then arguably the hardware doesn't support the mode, > so we should really be masking those bits from the EEE supported mask > as well. indeed. > > > [...] > > > > > > > > > - maybe the problem here is that the PCS doesn't support support > > > EEE in 1000baseT mode? > > > > > > It does, and that's kind of the problem. EEE in ON for 100Tx and > > 1000T > > by default with this PHY. I have several platform with the same > > MAC-PHY > > combination. Only the OdroidC2 shows this particular issue with > > 1000T- > > EEE > > > > As explained in other mails in this thread. The problem does not > > come > > from the MAC entering LPI. It actually comes from the link partner > > entering LPI on the Rx path under significant Tx transfer. For some > > reason, this completely mess up our PHY. > > For a 1000baseT link to enter low power, both ends have to enter LPI > (see 802.3 78.1.3.3.1) - the Tx and Rx paths can't independently > enter > LPI. > > So, if you have a busy Tx link, the link itself can't be entering > LPI. > Your link partner may be sending a request to enter LPI due to its > own > Tx path being idle, which should then be forwarded to your MAC. > > It's pretty hard to see what could be messed up with that - I'd have > expected the problems to occur when both ends were idle and the link > had entered low power mode. Well, maybe I'm not explaining the issue very well. Here the test done which led me to this conclusion: The test are done using iperf. Receiving data works well, with the expected performance. Sending data is the problem, and only under high load: Here are the lpi stats before starting the test: irq_tx_path_in_lpi_mode_n: 6 irq_tx_path_exit_lpi_mode_n: 5 irq_rx_path_in_lpi_mode_n: 76 irq_rx_path_exit_lpi_mode_n: 75 phy_eee_wakeup_error_n: 0 Sending data with iperf usually works for little while (between 0 and 10s) # iperf3 -c 192.168.1.170 -p12345 Connecting to host 192.168.1.170, port 12345 local 192.168.1.30 port 54450 connected to 192.168.1.170 port 12345 Interval Transfer Bandwidth Retr Cwnd 0.00-1.00 sec 112 MBytes 938 Mbits/sec 0 409 KBytes 1.00-2.00 sec 112 MBytes 940 Mbits/sec 0 426 KBytes 2.00-3.00 sec 112 MBytes 939 Mbits/sec 0 426 KBytes 3.00-4.00 sec 112 MBytes 940 Mbits/sec 0 426 KBytes 4.00-5.00 sec 112 MBytes 940 Mbits/sec 0 426 KBytes 5.00-6.00 sec 112 MBytes 939 Mbits/sec 0 426 KBytes 6.00-7.00 sec 9.26 MBytes 77.6 Mbits/sec 2 1.41 KBytes <=Issue 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec 1 1.41 KBytes 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec 0 1.41 KBytes ^C10.00-13.58 sec 0.00 Bytes 0.00 bits/sec 1 1.41 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - Interval Transfer Bandwidth Retr 0.00-13.58 sec 681 MBytes 421 Mbits/sec 4 sender 0.00-13.58 sec 0.00 Bytes 0.00 bits/sec receiver iperf3: interrupt - the client has terminated iperf3 does not exit ant the link seems completely broken. We cannot send or receive until the interface is brought down then up again. Here are the LPI related stats after the test: irq_tx_path_in_lpi_mode_n: 48 irq_tx_path_exit_lpi_mode_n: 48 irq_rx_path_in_lpi_mode_n: 325 irq_rx_path_exit_lpi_mode_n: 325 phy_eee_wakeup_error_n: 0 This happens with : 1) Default configuration: EEE enabled on the MAC, PHY with reset settings (EEE advertised) 2) EEE disabled on the MAC, PHY still with reset settings (EEE advertised). In such case there is no irq_tx_path_*_lpi_mode interrupts at all but still a lot of irq_rx_path_*_lpi_mode interrupts. So even if the mac does not drive anything EEE related, there is still something happening between the PHY and the link partner regarding EEE. 3) Disabling EEE advertisement for 1000t: no irq_*_lpi_mode at all. The feature is not negotiated and the Tx works well. By the way, EEE work well for the 100tx on the same HW. > > > > > > > > > On the SolidRun boards, they're using AR8035, and have suffered > > > this > > > occasional link drop problem. What has been found is that it > > > seems > > > to > > > be to do with the timing parameters, and it seemed to only be > > > 1000bT > > > that was affected. I don't remember off hand exactly which or > > > what > > > the change was they made to stabilise it though, but I can > > > probabily > > > find out tomorrow. > > > > > > > Since the same combination of MAC-PHY works well on other designs, > > it > > is also my feeling that is has something to do with some timing > > parameter, maybe related to this particular PCB. > > Maybe a different PHY interface? Meson seems to use RGMII, maybe > others use SGMII - but then I'd expect 100base-Tx to also be broken. > So not really sure. Nope, same interface (RGMII), same SoC. Only the PCB layout and external components might be different. > > I was talking to Florian about that last night, because the mis-named > phy_init_eee() tests for various phy interface modes before > proceeding, > which seems to be fairly rubbish as the list of interface modes is > gradually increasing since it was introduced (and I need to add SGMII > to it.) The conclusion I've come to there is that the test should > never have been part of phylib, because if there are restrictions on > which phy interface modes are allowable for EEE, they're likely to be > either PHY or MAC specific. > > The other problem that having the test there causes is that if the > existing users can't handle EEE over SGMII, then when I add SGMII to > support my hardware, they end up breaking - far from desirable. > There's no information on why the test is there, or even which PHYs > or MACs it's applicable to, which makes this unnecessarily more > difficult to now resolve. > > My feeling is that the integration of EEE into phylib is fairly poor > at the moment, and we need to be a lot smarter about it. You know a lot more than I do on this topic obviously. I'm just trying to make GbE work (as cleanly as possible) on that board to be honest. So I'm not sure I understand, are you against EEE integration in phylib entirely, or specifically against the test I added in set_eee to filter out broken modes ? Since set_eee directly set the register, I don't see where else I could have put this test to prevent EEE broken modes from being re-enabled. > > BTW, one of the problems (not caused by your patch) is that changing > the EEE advertisment does not (on all PHY drivers) cause the link to > be renegotiated - there's no call to phy_start_aneg() when the advert > changes, and even if there was, there's no guarantee that > phy_start_aneg() will even set the AN restart bit in the control > register. > > However, given that you're hooking into the set_eee function, I'm not > sure why you placed your EEE advertisment thing into config_aneg() - > isn't it more an initialisation thing (so should be in > config_init()?) What I change is what the PHY advertise, so it seems logical to do it where "genphy_config_advert" was called. Just taking the existing code as an example > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue 2017-01-06 13:50 ` Jerome Brunet @ 2017-01-06 15:05 ` Russell King - ARM Linux 0 siblings, 0 replies; 17+ messages in thread From: Russell King - ARM Linux @ 2017-01-06 15:05 UTC (permalink / raw) To: Jerome Brunet Cc: Andrew Lunn, Florian Fainelli, Alexandre TORGUE, Neil Armstrong, Martin Blumenstingl, netdev, linux-kernel, Yegor Yefremov, Julia Lawall, devicetree, Andre Roth, Kevin Hilman, Carlo Caione, Giuseppe Cavallaro, linux-amlogic, Andreas Färber, linux-arm-kernel (quick reply...) On Fri, Jan 06, 2017 at 02:50:21PM +0100, Jerome Brunet wrote: > So I'm not sure I understand, are you against EEE integration in phylib > entirely, or specifically against the test I added in set_eee to filter > out broken modes ? I'm happy to see EEE integrated into phylib, but I think the current implementation is very buggy and needs a rewrite. > > BTW, one of the problems (not caused by your patch) is that changing > > the EEE advertisment does not (on all PHY drivers) cause the link to > > be renegotiated - there's no call to phy_start_aneg() when the advert > > changes, and even if there was, there's no guarantee that > > phy_start_aneg() will even set the AN restart bit in the control > > register. > > > > However, given that you're hooking into the set_eee function, I'm not > > sure why you placed your EEE advertisment thing into config_aneg() - > > isn't it more an initialisation thing (so should be in > > config_init()?) > > What I change is what the PHY advertise, so it seems logical to do it > where "genphy_config_advert" was called. Just taking the existing code > as an example You need to adjust the adverisment in two places: 1. On initialisation, when you need to change the default value. 2. Whenever the user requests a different EEE advertisment. You don't need to do it each time config_aneg() is called - nothing's going to change the EEE advertisment in that path. Hence, to check it each and every time seems like a waste of CPU cycles. However, there's another path that needs to be considered, which the current EEE code fails to do, and that is the resume path. Nothing at present saves and restores the EEE settings, they are completely lost if the PHY is powered down. This is just another symptom of the current poor quality EEE implementation in phylib, and another reason why I say above that the EEE code is in need of a rewrite... which is something I will be looking at. If the EEE settings are properly saved and restored over suspend/ resume, then the previously programmed EEE advertisment would also be restored. -- RMK's Patch system: http://www.armlinux.org.uk/developer/patches/ FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2017-01-06 15:05 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-11-28 15:50 [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Jerome Brunet 2016-11-28 15:50 ` [PATCH net-next v4 2/4] dt-bindings: net: add EEE capability constants Jerome Brunet 2016-12-05 14:39 ` Rob Herring 2016-12-19 15:16 ` Jerome Brunet 2016-11-28 15:50 ` [PATCH net-next v4 3/4] dt: bindings: add ethernet phy eee-broken-modes option documentation Jerome Brunet [not found] ` <1480348229-25672-1-git-send-email-jbrunet-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> 2016-11-28 15:50 ` [PATCH net-next v4 1/4] net: phy: add an option to disable EEE advertisement Jerome Brunet 2016-11-28 15:50 ` [PATCH net-next v4 4/4] ARM64: dts: meson: odroidc2: disable advertisement EEE for GbE Jerome Brunet 2016-11-28 17:54 ` [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Florian Fainelli 2016-11-30 9:47 ` Jerome Brunet [not found] ` <1480499246.17538.208.camel-rdvid1DuHRBWk0Htik3J/w@public.gmane.org> 2016-11-30 18:28 ` Florian Fainelli 2017-01-05 23:25 ` Russell King - ARM Linux 2017-01-06 5:42 ` Yegor Yefremov [not found] ` <CAGm1_kvZ4dQrJ89qYU5wLGU1NR=j9xyWUm2mgYtq3F1+bo1OCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2017-01-06 11:58 ` Russell King - ARM Linux 2017-01-06 10:11 ` Jerome Brunet 2017-01-06 11:42 ` Russell King - ARM Linux 2017-01-06 13:50 ` Jerome Brunet 2017-01-06 15:05 ` Russell King - ARM Linux
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).