From mboxrd@z Thu Jan 1 00:00:00 1970 From: jbrunet@baylibre.com (Jerome Brunet) Date: Fri, 06 Jan 2017 14:50:21 +0100 Subject: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue In-Reply-To: <20170106114226.GX14217@n2100.armlinux.org.uk> References: <1480348229-25672-1-git-send-email-jbrunet@baylibre.com> <049b1efc-3bad-92e0-45ef-0563dc5d81de@gmail.com> <20170105232508.GU14217@n2100.armlinux.org.uk> <1483697496.28003.40.camel@baylibre.com> <20170106114226.GX14217@n2100.armlinux.org.uk> Message-ID: <1483710621.28003.74.camel@baylibre.com> To: linus-amlogic@lists.infradead.org List-Id: linus-amlogic.lists.infradead.org On Fri, 2017-01-06 at 11:42 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 06, 2017 at 11:11:36AM +0100, Jerome Brunet wrote: > > > > The purpose of this patch is to provide a way to mark as broken a > > particular eee mode. At first, it had nothing to do with "set_eee" > > but, > > as Florian rightly pointed out, users shouldn't be able to re- > > enable a > > broken mode. > > I think something else has been missed - I don't see much point to > telling userspace that (eg) 1000baseT EEE is supported and then > ignore attempts to advertise it. > > If it's broken, then arguably the hardware doesn't support the mode, > so we should really be masking those bits from the EEE supported mask > as well. indeed. > > > [...] > > > > > > > > > ?- maybe the problem here is that the PCS doesn't support support > > > EEE in 1000baseT mode? > > > > > > It does, and that's kind of the problem. EEE in ON for 100Tx and > > 1000T > > by default with this PHY. I have several platform with the same > > MAC-PHY? > > combination. Only the OdroidC2 shows this particular issue with > > 1000T- > > EEE > > > > As explained in other mails in this thread. The problem does not > > come > > from the MAC entering LPI. It actually comes from the link partner > > entering LPI on the Rx path under significant Tx transfer. For some > > reason, this completely mess up our PHY. > > For a 1000baseT link to enter low power, both ends have to enter LPI > (see 802.3 78.1.3.3.1) - the Tx and Rx paths can't independently > enter > LPI. > > So, if you have a busy Tx link, the link itself can't be entering > LPI. > Your link partner may be sending a request to enter LPI due to its > own > Tx path being idle, which should then be forwarded to your MAC. > > It's pretty hard to see what could be messed up with that - I'd have > expected the problems to occur when both ends were idle and the link > had entered low power mode. Well, maybe I'm not explaining the issue very well. Here the test done which led me to this conclusion: The?test?are?done?using?iperf. Receiving data works well, with the expected performance. Sending data is the problem, and only under high load: Here are the lpi stats before starting the test: ? ? ?irq_tx_path_in_lpi_mode_n: 6 ?????irq_tx_path_exit_lpi_mode_n: 5 ?????irq_rx_path_in_lpi_mode_n: 76 ?????irq_rx_path_exit_lpi_mode_n: 75 ?????phy_eee_wakeup_error_n: 0 Sending data with iperf usually works for little while (between 0 and 10s) # iperf3 -c 192.168.1.170 -p12345 Connecting to host 192.168.1.170, port 12345 local 192.168.1.30 port 54450 connected to 192.168.1.170 port 12345 Interval???????????Transfer?????Bandwidth???????Retr??Cwnd 0.00-1.00???sec???112 MBytes???938 Mbits/sec????0????409 KBytes??????? 1.00-2.00???sec???112 MBytes???940 Mbits/sec????0????426 KBytes??????? 2.00-3.00???sec???112 MBytes???939 Mbits/sec????0????426 KBytes??????? 3.00-4.00???sec???112 MBytes???940 Mbits/sec????0????426 KBytes??????? 4.00-5.00???sec???112 MBytes???940 Mbits/sec????0????426 KBytes??????? 5.00-6.00???sec???112 MBytes???939 Mbits/sec????0????426 KBytes??????? 6.00-7.00???sec??9.26 MBytes??77.6 Mbits/sec????2???1.41 KBytes <=Issue ? ? 7.00-8.00???sec??0.00 Bytes??0.00 bits/sec????1???1.41 KBytes??????? 8.00-9.00???sec??0.00 Bytes??0.00 bits/sec????0???1.41 KBytes??????? ^C10.00-13.58??sec??0.00 Bytes??0.00 bits/sec????1???1.41 KBytes??????? - - - - - - - - - - - - - - - - - - - - - - - - - Interval???????????Transfer?????Bandwidth???????Retr 0.00-13.58??sec???681 MBytes???421 Mbits/sec????4?????????????sender 0.00-13.58??sec??0.00 Bytes??0.00 bits/sec??????????????????receiver iperf3: interrupt - the client has terminated iperf3 does not exit ant the link seems completely broken. We cannot send or receive until the interface is brought down then up again. Here are the LPI related stats after the test: ?????irq_tx_path_in_lpi_mode_n: 48 ?????irq_tx_path_exit_lpi_mode_n: 48 ?????irq_rx_path_in_lpi_mode_n: 325 ?????irq_rx_path_exit_lpi_mode_n: 325 ?????phy_eee_wakeup_error_n: 0 This happens with : 1) Default configuration: EEE enabled on the MAC, PHY with reset settings (EEE advertised) 2) EEE disabled on the MAC, PHY still with reset settings (EEE advertised). In such case there is no irq_tx_path_*_lpi_mode interrupts at all but still a lot of irq_rx_path_*_lpi_mode interrupts. So even if the mac does not drive anything EEE related, there is still something happening between the PHY and the link partner regarding EEE. 3) Disabling EEE advertisement for 1000t: no irq_*_lpi_mode@all. The feature is not negotiated and the Tx works well. By the way, EEE work well for the 100tx on the same HW. > > > > > > > > > On the SolidRun boards, they're using AR8035, and have suffered > > > this > > > occasional link drop problem.??What has been found is that it > > > seems > > > to > > > be to do with the timing parameters, and it seemed to only be > > > 1000bT > > > that was affected.??I don't remember off hand exactly which or > > > what > > > the change was they made to stabilise it though, but I can > > > probabily > > > find out tomorrow. > > > > > > > Since the same combination of MAC-PHY works well on other designs, > > it > > is also my feeling that is has something to do with some timing > > parameter, maybe related to this particular PCB. > > Maybe a different PHY interface???Meson seems to use RGMII, maybe > others use SGMII - but then I'd expect 100base-Tx to also be broken. > So not really sure. Nope, same interface (RGMII), same SoC. Only the PCB layout and external components might be different. > > I was talking to Florian about that last night, because the mis-named > phy_init_eee() tests for various phy interface modes before > proceeding, > which seems to be fairly rubbish as the list of interface modes is > gradually increasing since it was introduced (and I need to add SGMII > to it.)??The conclusion I've come to there is that the test should > never have been part of phylib, because if there are restrictions on > which phy interface modes are allowable for EEE, they're likely to be > either PHY or MAC specific. > > The other problem that having the test there causes is that if the > existing users can't handle EEE over SGMII, then when I add SGMII to > support my hardware, they end up breaking - far from desirable. > There's no information on why the test is there, or even which PHYs > or MACs it's applicable to, which makes this unnecessarily more > difficult to now resolve. > > My feeling is that the integration of EEE into phylib is fairly poor > at the moment, and we need to be a lot smarter about it. You know a lot more than I do on this topic obviously. I'm just trying to make GbE work (as cleanly as possible) on that board to be honest. So I'm not sure I understand, are you against EEE integration in phylib entirely, or specifically against the test I added in set_eee to filter out broken modes ? Since set_eee directly set the register, I don't see where else I could have put this test to prevent EEE broken modes from being re-enabled. > > BTW, one of the problems (not caused by your patch) is that changing > the EEE advertisment does not (on all PHY drivers) cause the link to > be renegotiated - there's no call to phy_start_aneg() when the advert > changes, and even if there was, there's no guarantee that > phy_start_aneg() will even set the AN restart bit in the control > register. > > However, given that you're hooking into the set_eee function, I'm not > sure why you placed your EEE advertisment thing into config_aneg() - > isn't it more an initialisation thing (so should be in > config_init()?) What I change is what the PHY advertise, so it seems logical to do it where "genphy_config_advert" was called. Just taking the existing code as an example > From mboxrd@z Thu Jan 1 00:00:00 1970 From: jbrunet@baylibre.com (Jerome Brunet) Date: Fri, 06 Jan 2017 14:50:21 +0100 Subject: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue In-Reply-To: <20170106114226.GX14217@n2100.armlinux.org.uk> References: <1480348229-25672-1-git-send-email-jbrunet@baylibre.com> <049b1efc-3bad-92e0-45ef-0563dc5d81de@gmail.com> <20170105232508.GU14217@n2100.armlinux.org.uk> <1483697496.28003.40.camel@baylibre.com> <20170106114226.GX14217@n2100.armlinux.org.uk> Message-ID: <1483710621.28003.74.camel@baylibre.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On Fri, 2017-01-06 at 11:42 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 06, 2017 at 11:11:36AM +0100, Jerome Brunet wrote: > > > > The purpose of this patch is to provide a way to mark as broken a > > particular eee mode. At first, it had nothing to do with "set_eee" > > but, > > as Florian rightly pointed out, users shouldn't be able to re- > > enable a > > broken mode. > > I think something else has been missed - I don't see much point to > telling userspace that (eg) 1000baseT EEE is supported and then > ignore attempts to advertise it. > > If it's broken, then arguably the hardware doesn't support the mode, > so we should really be masking those bits from the EEE supported mask > as well. indeed. > > > [...] > > > > > > > > > ?- maybe the problem here is that the PCS doesn't support support > > > EEE in 1000baseT mode? > > > > > > It does, and that's kind of the problem. EEE in ON for 100Tx and > > 1000T > > by default with this PHY. I have several platform with the same > > MAC-PHY? > > combination. Only the OdroidC2 shows this particular issue with > > 1000T- > > EEE > > > > As explained in other mails in this thread. The problem does not > > come > > from the MAC entering LPI. It actually comes from the link partner > > entering LPI on the Rx path under significant Tx transfer. For some > > reason, this completely mess up our PHY. > > For a 1000baseT link to enter low power, both ends have to enter LPI > (see 802.3 78.1.3.3.1) - the Tx and Rx paths can't independently > enter > LPI. > > So, if you have a busy Tx link, the link itself can't be entering > LPI. > Your link partner may be sending a request to enter LPI due to its > own > Tx path being idle, which should then be forwarded to your MAC. > > It's pretty hard to see what could be messed up with that - I'd have > expected the problems to occur when both ends were idle and the link > had entered low power mode. Well, maybe I'm not explaining the issue very well. Here the test done which led me to this conclusion: The?test?are?done?using?iperf. Receiving data works well, with the expected performance. Sending data is the problem, and only under high load: Here are the lpi stats before starting the test: ? ? ?irq_tx_path_in_lpi_mode_n: 6 ?????irq_tx_path_exit_lpi_mode_n: 5 ?????irq_rx_path_in_lpi_mode_n: 76 ?????irq_rx_path_exit_lpi_mode_n: 75 ?????phy_eee_wakeup_error_n: 0 Sending data with iperf usually works for little while (between 0 and 10s) # iperf3 -c 192.168.1.170 -p12345 Connecting to host 192.168.1.170, port 12345 local 192.168.1.30 port 54450 connected to 192.168.1.170 port 12345 Interval???????????Transfer?????Bandwidth???????Retr??Cwnd 0.00-1.00???sec???112 MBytes???938 Mbits/sec????0????409 KBytes??????? 1.00-2.00???sec???112 MBytes???940 Mbits/sec????0????426 KBytes??????? 2.00-3.00???sec???112 MBytes???939 Mbits/sec????0????426 KBytes??????? 3.00-4.00???sec???112 MBytes???940 Mbits/sec????0????426 KBytes??????? 4.00-5.00???sec???112 MBytes???940 Mbits/sec????0????426 KBytes??????? 5.00-6.00???sec???112 MBytes???939 Mbits/sec????0????426 KBytes??????? 6.00-7.00???sec??9.26 MBytes??77.6 Mbits/sec????2???1.41 KBytes <=Issue ? ? 7.00-8.00???sec??0.00 Bytes??0.00 bits/sec????1???1.41 KBytes??????? 8.00-9.00???sec??0.00 Bytes??0.00 bits/sec????0???1.41 KBytes??????? ^C10.00-13.58??sec??0.00 Bytes??0.00 bits/sec????1???1.41 KBytes??????? - - - - - - - - - - - - - - - - - - - - - - - - - Interval???????????Transfer?????Bandwidth???????Retr 0.00-13.58??sec???681 MBytes???421 Mbits/sec????4?????????????sender 0.00-13.58??sec??0.00 Bytes??0.00 bits/sec??????????????????receiver iperf3: interrupt - the client has terminated iperf3 does not exit ant the link seems completely broken. We cannot send or receive until the interface is brought down then up again. Here are the LPI related stats after the test: ?????irq_tx_path_in_lpi_mode_n: 48 ?????irq_tx_path_exit_lpi_mode_n: 48 ?????irq_rx_path_in_lpi_mode_n: 325 ?????irq_rx_path_exit_lpi_mode_n: 325 ?????phy_eee_wakeup_error_n: 0 This happens with : 1) Default configuration: EEE enabled on the MAC, PHY with reset settings (EEE advertised) 2) EEE disabled on the MAC, PHY still with reset settings (EEE advertised). In such case there is no irq_tx_path_*_lpi_mode interrupts at all but still a lot of irq_rx_path_*_lpi_mode interrupts. So even if the mac does not drive anything EEE related, there is still something happening between the PHY and the link partner regarding EEE. 3) Disabling EEE advertisement for 1000t: no irq_*_lpi_mode@all. The feature is not negotiated and the Tx works well. By the way, EEE work well for the 100tx on the same HW. > > > > > > > > > On the SolidRun boards, they're using AR8035, and have suffered > > > this > > > occasional link drop problem.??What has been found is that it > > > seems > > > to > > > be to do with the timing parameters, and it seemed to only be > > > 1000bT > > > that was affected.??I don't remember off hand exactly which or > > > what > > > the change was they made to stabilise it though, but I can > > > probabily > > > find out tomorrow. > > > > > > > Since the same combination of MAC-PHY works well on other designs, > > it > > is also my feeling that is has something to do with some timing > > parameter, maybe related to this particular PCB. > > Maybe a different PHY interface???Meson seems to use RGMII, maybe > others use SGMII - but then I'd expect 100base-Tx to also be broken. > So not really sure. Nope, same interface (RGMII), same SoC. Only the PCB layout and external components might be different. > > I was talking to Florian about that last night, because the mis-named > phy_init_eee() tests for various phy interface modes before > proceeding, > which seems to be fairly rubbish as the list of interface modes is > gradually increasing since it was introduced (and I need to add SGMII > to it.)??The conclusion I've come to there is that the test should > never have been part of phylib, because if there are restrictions on > which phy interface modes are allowable for EEE, they're likely to be > either PHY or MAC specific. > > The other problem that having the test there causes is that if the > existing users can't handle EEE over SGMII, then when I add SGMII to > support my hardware, they end up breaking - far from desirable. > There's no information on why the test is there, or even which PHYs > or MACs it's applicable to, which makes this unnecessarily more > difficult to now resolve. > > My feeling is that the integration of EEE into phylib is fairly poor > at the moment, and we need to be a lot smarter about it. You know a lot more than I do on this topic obviously. I'm just trying to make GbE work (as cleanly as possible) on that board to be honest. So I'm not sure I understand, are you against EEE integration in phylib entirely, or specifically against the test I added in set_eee to filter out broken modes ? Since set_eee directly set the register, I don't see where else I could have put this test to prevent EEE broken modes from being re-enabled. > > BTW, one of the problems (not caused by your patch) is that changing > the EEE advertisment does not (on all PHY drivers) cause the link to > be renegotiated - there's no call to phy_start_aneg() when the advert > changes, and even if there was, there's no guarantee that > phy_start_aneg() will even set the AN restart bit in the control > register. > > However, given that you're hooking into the set_eee function, I'm not > sure why you placed your EEE advertisment thing into config_aneg() - > isn't it more an initialisation thing (so should be in > config_init()?) What I change is what the PHY advertise, so it seems logical to do it where "genphy_config_advert" was called. Just taking the existing code as an example > From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jerome Brunet Subject: Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue Date: Fri, 06 Jan 2017 14:50:21 +0100 Message-ID: <1483710621.28003.74.camel@baylibre.com> References: <1480348229-25672-1-git-send-email-jbrunet@baylibre.com> <049b1efc-3bad-92e0-45ef-0563dc5d81de@gmail.com> <20170105232508.GU14217@n2100.armlinux.org.uk> <1483697496.28003.40.camel@baylibre.com> <20170106114226.GX14217@n2100.armlinux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: In-Reply-To: <20170106114226.GX14217@n2100.armlinux.org.uk> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=m.gmane.org@lists.infradead.org To: Russell King - ARM Linux Cc: Andrew Lunn , Florian Fainelli , Alexandre TORGUE , Neil Armstrong , Martin Blumenstingl , netdev@vger.kernel.org, Giuseppe Cavallaro , linux-kernel@vger.kernel.org, Yegor Yefremov , Julia Lawall , devicetree@vger.kernel.org, Andre Roth , Kevin Hilman , Carlo Caione , linux-amlogic@lists.infradead.org, Andreas =?ISO-8859-1?Q?F=E4rber?= , linux-arm-kernel@lists.infradead.org List-Id: devicetree@vger.kernel.org T24gRnJpLCAyMDE3LTAxLTA2IGF0IDExOjQyICswMDAwLCBSdXNzZWxsIEtpbmcgLSBBUk0gTGlu dXggd3JvdGU6Cj4gT24gRnJpLCBKYW4gMDYsIDIwMTcgYXQgMTE6MTE6MzZBTSArMDEwMCwgSmVy b21lIEJydW5ldCB3cm90ZToKPiA+IAo+ID4gVGhlIHB1cnBvc2Ugb2YgdGhpcyBwYXRjaCBpcyB0 byBwcm92aWRlIGEgd2F5IHRvIG1hcmsgYXMgYnJva2VuIGEKPiA+IHBhcnRpY3VsYXIgZWVlIG1v ZGUuIEF0IGZpcnN0LCBpdCBoYWQgbm90aGluZyB0byBkbyB3aXRoICJzZXRfZWVlIgo+ID4gYnV0 LAo+ID4gYXMgRmxvcmlhbiByaWdodGx5IHBvaW50ZWQgb3V0LCB1c2VycyBzaG91bGRuJ3QgYmUg YWJsZSB0byByZS0KPiA+IGVuYWJsZSBhCj4gPiBicm9rZW4gbW9kZS4KPiAKPiBJIHRoaW5rIHNv bWV0aGluZyBlbHNlIGhhcyBiZWVuIG1pc3NlZCAtIEkgZG9uJ3Qgc2VlIG11Y2ggcG9pbnQgdG8K PiB0ZWxsaW5nIHVzZXJzcGFjZSB0aGF0IChlZykgMTAwMGJhc2VUIEVFRSBpcyBzdXBwb3J0ZWQg YW5kIHRoZW4KPiBpZ25vcmUgYXR0ZW1wdHMgdG8gYWR2ZXJ0aXNlIGl0Lgo+IAo+IElmIGl0J3Mg YnJva2VuLCB0aGVuIGFyZ3VhYmx5IHRoZSBoYXJkd2FyZSBkb2Vzbid0IHN1cHBvcnQgdGhlIG1v ZGUsCj4gc28gd2Ugc2hvdWxkIHJlYWxseSBiZSBtYXNraW5nIHRob3NlIGJpdHMgZnJvbSB0aGUg RUVFIHN1cHBvcnRlZCBtYXNrCj4gYXMgd2VsbC4KCmluZGVlZC4KCj4gCj4gPiAKWy4uLl0KPiAK PiA+IAo+ID4gPiAKPiA+ID4gwqAtIG1heWJlIHRoZSBwcm9ibGVtIGhlcmUgaXMgdGhhdCB0aGUg UENTIGRvZXNuJ3Qgc3VwcG9ydCBzdXBwb3J0Cj4gPiA+IEVFRSBpbiAxMDAwYmFzZVQgbW9kZT8K PiA+IAo+ID4gCj4gPiBJdCBkb2VzLCBhbmQgdGhhdCdzIGtpbmQgb2YgdGhlIHByb2JsZW0uIEVF RSBpbiBPTiBmb3IgMTAwVHggYW5kCj4gPiAxMDAwVAo+ID4gYnkgZGVmYXVsdCB3aXRoIHRoaXMg UEhZLiBJIGhhdmUgc2V2ZXJhbCBwbGF0Zm9ybSB3aXRoIHRoZSBzYW1lCj4gPiBNQUMtUEhZwqAK PiA+IGNvbWJpbmF0aW9uLiBPbmx5IHRoZSBPZHJvaWRDMiBzaG93cyB0aGlzIHBhcnRpY3VsYXIg aXNzdWUgd2l0aAo+ID4gMTAwMFQtCj4gPiBFRUUKPiA+IAo+ID4gQXMgZXhwbGFpbmVkIGluIG90 aGVyIG1haWxzIGluIHRoaXMgdGhyZWFkLiBUaGUgcHJvYmxlbSBkb2VzIG5vdAo+ID4gY29tZQo+ ID4gZnJvbSB0aGUgTUFDIGVudGVyaW5nIExQSS4gSXQgYWN0dWFsbHkgY29tZXMgZnJvbSB0aGUg bGluayBwYXJ0bmVyCj4gPiBlbnRlcmluZyBMUEkgb24gdGhlIFJ4IHBhdGggdW5kZXIgc2lnbmlm aWNhbnQgVHggdHJhbnNmZXIuIEZvciBzb21lCj4gPiByZWFzb24sIHRoaXMgY29tcGxldGVseSBt ZXNzIHVwIG91ciBQSFkuCj4gCj4gRm9yIGEgMTAwMGJhc2VUIGxpbmsgdG8gZW50ZXIgbG93IHBv d2VyLCBib3RoIGVuZHMgaGF2ZSB0byBlbnRlciBMUEkKPiAoc2VlIDgwMi4zIDc4LjEuMy4zLjEp IC0gdGhlIFR4IGFuZCBSeCBwYXRocyBjYW4ndCBpbmRlcGVuZGVudGx5Cj4gZW50ZXIKPiBMUEku Cj4gCj4gU28sIGlmIHlvdSBoYXZlIGEgYnVzeSBUeCBsaW5rLCB0aGUgbGluayBpdHNlbGYgY2Fu J3QgYmUgZW50ZXJpbmcKPiBMUEkuCj4gWW91ciBsaW5rIHBhcnRuZXIgbWF5IGJlIHNlbmRpbmcg YSByZXF1ZXN0IHRvIGVudGVyIExQSSBkdWUgdG8gaXRzCj4gb3duCj4gVHggcGF0aCBiZWluZyBp ZGxlLCB3aGljaCBzaG91bGQgdGhlbiBiZSBmb3J3YXJkZWQgdG8geW91ciBNQUMuCj4gCj4gSXQn cyBwcmV0dHkgaGFyZCB0byBzZWUgd2hhdCBjb3VsZCBiZSBtZXNzZWQgdXAgd2l0aCB0aGF0IC0g SSdkIGhhdmUKPiBleHBlY3RlZCB0aGUgcHJvYmxlbXMgdG8gb2NjdXIgd2hlbiBib3RoIGVuZHMg d2VyZSBpZGxlIGFuZCB0aGUgbGluawo+IGhhZCBlbnRlcmVkIGxvdyBwb3dlciBtb2RlLgoKV2Vs bCwgbWF5YmUgSSdtIG5vdCBleHBsYWluaW5nIHRoZSBpc3N1ZSB2ZXJ5IHdlbGwuIEhlcmUgdGhl IHRlc3QgZG9uZQp3aGljaCBsZWQgbWUgdG8gdGhpcyBjb25jbHVzaW9uOgoKVGhlwqB0ZXN0wqBh cmXCoGRvbmXCoHVzaW5nwqBpcGVyZi4gUmVjZWl2aW5nIGRhdGEgd29ya3Mgd2VsbCwgd2l0aCB0 aGUKZXhwZWN0ZWQgcGVyZm9ybWFuY2UuIFNlbmRpbmcgZGF0YSBpcyB0aGUgcHJvYmxlbSwgYW5k IG9ubHkgdW5kZXIgaGlnaApsb2FkOgoKSGVyZSBhcmUgdGhlIGxwaSBzdGF0cyBiZWZvcmUgc3Rh cnRpbmcgdGhlIHRlc3Q6CsKgIMKgIMKgaXJxX3R4X3BhdGhfaW5fbHBpX21vZGVfbjogNgrCoMKg wqDCoMKgaXJxX3R4X3BhdGhfZXhpdF9scGlfbW9kZV9uOiA1CsKgwqDCoMKgwqBpcnFfcnhfcGF0 aF9pbl9scGlfbW9kZV9uOiA3NgrCoMKgwqDCoMKgaXJxX3J4X3BhdGhfZXhpdF9scGlfbW9kZV9u OiA3NQrCoMKgwqDCoMKgcGh5X2VlZV93YWtldXBfZXJyb3JfbjogMAoKU2VuZGluZyBkYXRhIHdp dGggaXBlcmYgdXN1YWxseSB3b3JrcyBmb3IgbGl0dGxlIHdoaWxlIChiZXR3ZWVuIDAgYW5kCjEw cykKCiMgaXBlcmYzIC1jIDE5Mi4xNjguMS4xNzAgLXAxMjM0NQpDb25uZWN0aW5nIHRvIGhvc3Qg MTkyLjE2OC4xLjE3MCwgcG9ydCAxMjM0NQpsb2NhbCAxOTIuMTY4LjEuMzAgcG9ydCA1NDQ1MCBj b25uZWN0ZWQgdG8gMTkyLjE2OC4xLjE3MCBwb3J0IDEyMzQ1CkludGVydmFswqDCoMKgwqDCoMKg wqDCoMKgwqDCoFRyYW5zZmVywqDCoMKgwqDCoEJhbmR3aWR0aMKgwqDCoMKgwqDCoMKgUmV0csKg wqBDd25kCjAuMDAtMS4wMMKgwqDCoHNlY8KgwqDCoDExMiBNQnl0ZXPCoMKgwqA5MzggTWJpdHMv c2VjwqDCoMKgwqAwwqDCoMKgwqA0MDkgS0J5dGVzwqDCoMKgwqDCoMKgwqAKMS4wMC0yLjAwwqDC oMKgc2VjwqDCoMKgMTEyIE1CeXRlc8KgwqDCoDk0MCBNYml0cy9zZWPCoMKgwqDCoDDCoMKgwqDC oDQyNiBLQnl0ZXPCoMKgwqDCoMKgwqDCoAoyLjAwLTMuMDDCoMKgwqBzZWPCoMKgwqAxMTIgTUJ5 dGVzwqDCoMKgOTM5IE1iaXRzL3NlY8KgwqDCoMKgMMKgwqDCoMKgNDI2IEtCeXRlc8KgwqDCoMKg wqDCoMKgCjMuMDAtNC4wMMKgwqDCoHNlY8KgwqDCoDExMiBNQnl0ZXPCoMKgwqA5NDAgTWJpdHMv c2VjwqDCoMKgwqAwwqDCoMKgwqA0MjYgS0J5dGVzwqDCoMKgwqDCoMKgwqAKNC4wMC01LjAwwqDC oMKgc2VjwqDCoMKgMTEyIE1CeXRlc8KgwqDCoDk0MCBNYml0cy9zZWPCoMKgwqDCoDDCoMKgwqDC oDQyNiBLQnl0ZXPCoMKgwqDCoMKgwqDCoAo1LjAwLTYuMDDCoMKgwqBzZWPCoMKgwqAxMTIgTUJ5 dGVzwqDCoMKgOTM5IE1iaXRzL3NlY8KgwqDCoMKgMMKgwqDCoMKgNDI2IEtCeXRlc8KgwqDCoMKg wqDCoMKgCjYuMDAtNy4wMMKgwqDCoHNlY8KgwqA5LjI2IE1CeXRlc8KgwqA3Ny42IE1iaXRzL3Nl Y8KgwqDCoMKgMsKgwqDCoDEuNDEgS0J5dGVzIDw9SXNzdWUKwqAgwqAKNy4wMC04LjAwwqDCoMKg c2VjwqDCoDAuMDAgQnl0ZXPCoMKgMC4wMCBiaXRzL3NlY8KgwqDCoMKgMcKgwqDCoDEuNDEgS0J5 dGVzwqDCoMKgwqDCoMKgwqAKOC4wMC05LjAwwqDCoMKgc2VjwqDCoDAuMDAgQnl0ZXPCoMKgMC4w MCBiaXRzL3NlY8KgwqDCoMKgMMKgwqDCoDEuNDEgS0J5dGVzwqDCoMKgwqDCoMKgwqAKXkMxMC4w MC0xMy41OMKgwqBzZWPCoMKgMC4wMCBCeXRlc8KgwqAwLjAwIGJpdHMvc2VjwqDCoMKgwqAxwqDC oMKgMS40MSBLQnl0ZXPCoMKgwqDCoMKgwqDCoAotIC0gLSAtIC0gLSAtIC0gLSAtIC0gLSAtIC0g LSAtIC0gLSAtIC0gLSAtIC0gLSAtCkludGVydmFswqDCoMKgwqDCoMKgwqDCoMKgwqDCoFRyYW5z ZmVywqDCoMKgwqDCoEJhbmR3aWR0aMKgwqDCoMKgwqDCoMKgUmV0cgowLjAwLTEzLjU4wqDCoHNl Y8KgwqDCoDY4MSBNQnl0ZXPCoMKgwqA0MjEgTWJpdHMvc2VjwqDCoMKgwqA0wqDCoMKgwqDCoMKg wqDCoMKgwqDCoMKgwqBzZW5kZXIKMC4wMC0xMy41OMKgwqBzZWPCoMKgMC4wMCBCeXRlc8KgwqAw LjAwIGJpdHMvc2VjwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgwqDCoMKgcmVjZWl2ZXIK aXBlcmYzOiBpbnRlcnJ1cHQgLSB0aGUgY2xpZW50IGhhcyB0ZXJtaW5hdGVkCgppcGVyZjMgZG9l cyBub3QgZXhpdCBhbnQgdGhlIGxpbmsgc2VlbXMgY29tcGxldGVseSBicm9rZW4uIFdlIGNhbm5v dApzZW5kIG9yIHJlY2VpdmUgdW50aWwgdGhlIGludGVyZmFjZSBpcyBicm91Z2h0IGRvd24gdGhl biB1cCBhZ2Fpbi4KCkhlcmUgYXJlIHRoZSBMUEkgcmVsYXRlZCBzdGF0cyBhZnRlciB0aGUgdGVz dDoKwqDCoMKgwqDCoGlycV90eF9wYXRoX2luX2xwaV9tb2RlX246IDQ4CsKgwqDCoMKgwqBpcnFf dHhfcGF0aF9leGl0X2xwaV9tb2RlX246IDQ4CsKgwqDCoMKgwqBpcnFfcnhfcGF0aF9pbl9scGlf bW9kZV9uOiAzMjUKwqDCoMKgwqDCoGlycV9yeF9wYXRoX2V4aXRfbHBpX21vZGVfbjogMzI1CsKg wqDCoMKgwqBwaHlfZWVlX3dha2V1cF9lcnJvcl9uOiAwCgoKVGhpcyBoYXBwZW5zIHdpdGggOgox KSBEZWZhdWx0IGNvbmZpZ3VyYXRpb246IEVFRSBlbmFibGVkIG9uIHRoZSBNQUMsIFBIWSB3aXRo IHJlc2V0CnNldHRpbmdzIChFRUUgYWR2ZXJ0aXNlZCkKMikgRUVFIGRpc2FibGVkIG9uIHRoZSBN QUMsIFBIWSBzdGlsbCB3aXRoIHJlc2V0IHNldHRpbmdzIChFRUUKYWR2ZXJ0aXNlZCkuIEluIHN1 Y2ggY2FzZSB0aGVyZSBpcyBubyBpcnFfdHhfcGF0aF8qX2xwaV9tb2RlIGludGVycnVwdHMKYXQg YWxsIGJ1dCBzdGlsbCBhIGxvdCBvZiBpcnFfcnhfcGF0aF8qX2xwaV9tb2RlIGludGVycnVwdHMu IFNvIGV2ZW4gaWYKdGhlIG1hYyBkb2VzIG5vdCBkcml2ZSBhbnl0aGluZyBFRUUgcmVsYXRlZCwg dGhlcmUgaXMgc3RpbGwgc29tZXRoaW5nCmhhcHBlbmluZyBiZXR3ZWVuIHRoZSBQSFkgYW5kIHRo ZSBsaW5rIHBhcnRuZXIgcmVnYXJkaW5nIEVFRS4KCjMpIERpc2FibGluZyBFRUUgYWR2ZXJ0aXNl bWVudCBmb3IgMTAwMHQ6IG5vIGlycV8qX2xwaV9tb2RlIGF0IGFsbC4gVGhlCmZlYXR1cmUgaXMg bm90IG5lZ290aWF0ZWQgYW5kIHRoZSBUeCB3b3JrcyB3ZWxsLgoKQnkgdGhlIHdheSwgRUVFIHdv cmsgd2VsbCBmb3IgdGhlIDEwMHR4IG9uIHRoZSBzYW1lIEhXLgoKPiAKPiA+IAo+ID4gPiAKPiA+ ID4gT24gdGhlIFNvbGlkUnVuIGJvYXJkcywgdGhleSdyZSB1c2luZyBBUjgwMzUsIGFuZCBoYXZl IHN1ZmZlcmVkCj4gPiA+IHRoaXMKPiA+ID4gb2NjYXNpb25hbCBsaW5rIGRyb3AgcHJvYmxlbS7C oMKgV2hhdCBoYXMgYmVlbiBmb3VuZCBpcyB0aGF0IGl0Cj4gPiA+IHNlZW1zCj4gPiA+IHRvCj4g PiA+IGJlIHRvIGRvIHdpdGggdGhlIHRpbWluZyBwYXJhbWV0ZXJzLCBhbmQgaXQgc2VlbWVkIHRv IG9ubHkgYmUKPiA+ID4gMTAwMGJUCj4gPiA+IHRoYXQgd2FzIGFmZmVjdGVkLsKgwqBJIGRvbid0 IHJlbWVtYmVyIG9mZiBoYW5kIGV4YWN0bHkgd2hpY2ggb3IKPiA+ID4gd2hhdAo+ID4gPiB0aGUg Y2hhbmdlIHdhcyB0aGV5IG1hZGUgdG8gc3RhYmlsaXNlIGl0IHRob3VnaCwgYnV0IEkgY2FuCj4g PiA+IHByb2JhYmlseQo+ID4gPiBmaW5kIG91dCB0b21vcnJvdy4KPiA+ID4gCj4gPiAKPiA+IFNp bmNlIHRoZSBzYW1lIGNvbWJpbmF0aW9uIG9mIE1BQy1QSFkgd29ya3Mgd2VsbCBvbiBvdGhlciBk ZXNpZ25zLAo+ID4gaXQKPiA+IGlzIGFsc28gbXkgZmVlbGluZyB0aGF0IGlzIGhhcyBzb21ldGhp bmcgdG8gZG8gd2l0aCBzb21lIHRpbWluZwo+ID4gcGFyYW1ldGVyLCBtYXliZSByZWxhdGVkIHRv IHRoaXMgcGFydGljdWxhciBQQ0IuCj4gCj4gTWF5YmUgYSBkaWZmZXJlbnQgUEhZIGludGVyZmFj ZT/CoMKgTWVzb24gc2VlbXMgdG8gdXNlIFJHTUlJLCBtYXliZQo+IG90aGVycyB1c2UgU0dNSUkg LSBidXQgdGhlbiBJJ2QgZXhwZWN0IDEwMGJhc2UtVHggdG8gYWxzbyBiZSBicm9rZW4uCj4gU28g bm90IHJlYWxseSBzdXJlLgoKTm9wZSwgc2FtZSBpbnRlcmZhY2UgKFJHTUlJKSwgc2FtZSBTb0Mu IE9ubHkgdGhlIFBDQiBsYXlvdXQgYW5kCmV4dGVybmFsIGNvbXBvbmVudHMgbWlnaHQgYmUgZGlm ZmVyZW50LgoKPiAKPiBJIHdhcyB0YWxraW5nIHRvIEZsb3JpYW4gYWJvdXQgdGhhdCBsYXN0IG5p Z2h0LCBiZWNhdXNlIHRoZSBtaXMtbmFtZWQKPiBwaHlfaW5pdF9lZWUoKSB0ZXN0cyBmb3IgdmFy aW91cyBwaHkgaW50ZXJmYWNlIG1vZGVzIGJlZm9yZQo+IHByb2NlZWRpbmcsCj4gd2hpY2ggc2Vl bXMgdG8gYmUgZmFpcmx5IHJ1YmJpc2ggYXMgdGhlIGxpc3Qgb2YgaW50ZXJmYWNlIG1vZGVzIGlz Cj4gZ3JhZHVhbGx5IGluY3JlYXNpbmcgc2luY2UgaXQgd2FzIGludHJvZHVjZWQgKGFuZCBJIG5l ZWQgdG8gYWRkIFNHTUlJCj4gdG8gaXQuKcKgwqBUaGUgY29uY2x1c2lvbiBJJ3ZlIGNvbWUgdG8g dGhlcmUgaXMgdGhhdCB0aGUgdGVzdCBzaG91bGQKPiBuZXZlciBoYXZlIGJlZW4gcGFydCBvZiBw aHlsaWIsIGJlY2F1c2UgaWYgdGhlcmUgYXJlIHJlc3RyaWN0aW9ucyBvbgo+IHdoaWNoIHBoeSBp bnRlcmZhY2UgbW9kZXMgYXJlIGFsbG93YWJsZSBmb3IgRUVFLCB0aGV5J3JlIGxpa2VseSB0byBi ZQo+IGVpdGhlciBQSFkgb3IgTUFDIHNwZWNpZmljLgo+IAo+IFRoZSBvdGhlciBwcm9ibGVtIHRo YXQgaGF2aW5nIHRoZSB0ZXN0IHRoZXJlIGNhdXNlcyBpcyB0aGF0IGlmIHRoZQo+IGV4aXN0aW5n IHVzZXJzIGNhbid0IGhhbmRsZSBFRUUgb3ZlciBTR01JSSwgdGhlbiB3aGVuIEkgYWRkIFNHTUlJ IHRvCj4gc3VwcG9ydCBteSBoYXJkd2FyZSwgdGhleSBlbmQgdXAgYnJlYWtpbmcgLSBmYXIgZnJv bSBkZXNpcmFibGUuCj4gVGhlcmUncyBubyBpbmZvcm1hdGlvbiBvbiB3aHkgdGhlIHRlc3QgaXMg dGhlcmUsIG9yIGV2ZW4gd2hpY2ggUEhZcwo+IG9yIE1BQ3MgaXQncyBhcHBsaWNhYmxlIHRvLCB3 aGljaCBtYWtlcyB0aGlzIHVubmVjZXNzYXJpbHkgbW9yZQo+IGRpZmZpY3VsdCB0byBub3cgcmVz b2x2ZS4KPiAKPiBNeSBmZWVsaW5nIGlzIHRoYXQgdGhlIGludGVncmF0aW9uIG9mIEVFRSBpbnRv IHBoeWxpYiBpcyBmYWlybHkgcG9vcgo+IGF0IHRoZSBtb21lbnQsIGFuZCB3ZSBuZWVkIHRvIGJl IGEgbG90IHNtYXJ0ZXIgYWJvdXQgaXQuCgpZb3Uga25vdyBhIGxvdCBtb3JlIHRoYW4gSSBkbyBv biB0aGlzIHRvcGljIG9idmlvdXNseS4gSSdtIGp1c3QgdHJ5aW5nCnRvIG1ha2UgR2JFIHdvcmsg KGFzIGNsZWFubHkgYXMgcG9zc2libGUpIG9uIHRoYXQgYm9hcmQgdG8gYmUgaG9uZXN0LgoKU28g SSdtIG5vdCBzdXJlIEkgdW5kZXJzdGFuZCwgYXJlIHlvdSBhZ2FpbnN0IEVFRSBpbnRlZ3JhdGlv biBpbiBwaHlsaWIKZW50aXJlbHksIG9yIHNwZWNpZmljYWxseSBhZ2FpbnN0IHRoZSB0ZXN0IEkg YWRkZWQgaW4gc2V0X2VlZSB0byBmaWx0ZXIKb3V0IGJyb2tlbiBtb2RlcyA/CgpTaW5jZSBzZXRf ZWVlIGRpcmVjdGx5IHNldCB0aGUgcmVnaXN0ZXIsIEkgZG9uJ3Qgc2VlIHdoZXJlIGVsc2UgSSBj b3VsZApoYXZlIHB1dCB0aGlzIHRlc3QgdG8gcHJldmVudCBFRUUgYnJva2VuIG1vZGVzIGZyb20g YmVpbmcgcmUtZW5hYmxlZC4KCj4gCj4gQlRXLCBvbmUgb2YgdGhlIHByb2JsZW1zIChub3QgY2F1 c2VkIGJ5IHlvdXIgcGF0Y2gpIGlzIHRoYXQgY2hhbmdpbmcKPiB0aGUgRUVFIGFkdmVydGlzbWVu dCBkb2VzIG5vdCAob24gYWxsIFBIWSBkcml2ZXJzKSBjYXVzZSB0aGUgbGluayB0bwo+IGJlIHJl bmVnb3RpYXRlZCAtIHRoZXJlJ3Mgbm8gY2FsbCB0byBwaHlfc3RhcnRfYW5lZygpIHdoZW4gdGhl IGFkdmVydAo+IGNoYW5nZXMsIGFuZCBldmVuIGlmIHRoZXJlIHdhcywgdGhlcmUncyBubyBndWFy YW50ZWUgdGhhdAo+IHBoeV9zdGFydF9hbmVnKCkgd2lsbCBldmVuIHNldCB0aGUgQU4gcmVzdGFy dCBiaXQgaW4gdGhlIGNvbnRyb2wKPiByZWdpc3Rlci4KPiAKPiBIb3dldmVyLCBnaXZlbiB0aGF0 IHlvdSdyZSBob29raW5nIGludG8gdGhlIHNldF9lZWUgZnVuY3Rpb24sIEknbSBub3QKPiBzdXJl IHdoeSB5b3UgcGxhY2VkIHlvdXIgRUVFIGFkdmVydGlzbWVudCB0aGluZyBpbnRvIGNvbmZpZ19h bmVnKCkgLQo+IGlzbid0IGl0IG1vcmUgYW4gaW5pdGlhbGlzYXRpb24gdGhpbmcgKHNvIHNob3Vs ZCBiZSBpbgo+IGNvbmZpZ19pbml0KCk/KQoKV2hhdCBJIGNoYW5nZSBpcyB3aGF0IHRoZSBQSFkg YWR2ZXJ0aXNlLCBzbyBpdCBzZWVtcyBsb2dpY2FsIHRvIGRvIGl0CndoZXJlICJnZW5waHlfY29u ZmlnX2FkdmVydCIgd2FzIGNhbGxlZC4gSnVzdCB0YWtpbmcgdGhlIGV4aXN0aW5nIGNvZGUKYXMg YW4gZXhhbXBsZQoKPiAKCl9fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fCmxpbnV4LWFybS1rZXJuZWwgbWFpbGluZyBsaXN0CmxpbnV4LWFybS1rZXJuZWxAbGlz dHMuaW5mcmFkZWFkLm9yZwpodHRwOi8vbGlzdHMuaW5mcmFkZWFkLm9yZy9tYWlsbWFuL2xpc3Rp bmZvL2xpbnV4LWFybS1rZXJuZWwK From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S936331AbdAFNvI (ORCPT ); Fri, 6 Jan 2017 08:51:08 -0500 Received: from mail-wm0-f42.google.com ([74.125.82.42]:34829 "EHLO mail-wm0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754711AbdAFNu4 (ORCPT ); Fri, 6 Jan 2017 08:50:56 -0500 Message-ID: <1483710621.28003.74.camel@baylibre.com> Subject: Re: [PATCH net-next v4 0/4] Fix OdroidC2 Gigabit Tx link issue From: Jerome Brunet To: Russell King - ARM Linux Cc: Florian Fainelli , netdev@vger.kernel.org, devicetree@vger.kernel.org, Andrew Lunn , Alexandre TORGUE , Neil Armstrong , Martin Blumenstingl , Kevin Hilman , linux-kernel@vger.kernel.org, Yegor Yefremov , Julia Lawall , Andre Roth , linux-amlogic@lists.infradead.org, Carlo Caione , Giuseppe Cavallaro , Andreas =?ISO-8859-1?Q?F=E4rber?= , linux-arm-kernel@lists.infradead.org Date: Fri, 06 Jan 2017 14:50:21 +0100 In-Reply-To: <20170106114226.GX14217@n2100.armlinux.org.uk> References: <1480348229-25672-1-git-send-email-jbrunet@baylibre.com> <049b1efc-3bad-92e0-45ef-0563dc5d81de@gmail.com> <20170105232508.GU14217@n2100.armlinux.org.uk> <1483697496.28003.40.camel@baylibre.com> <20170106114226.GX14217@n2100.armlinux.org.uk> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.20.5 (3.20.5-1.fc24) Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, 2017-01-06 at 11:42 +0000, Russell King - ARM Linux wrote: > On Fri, Jan 06, 2017 at 11:11:36AM +0100, Jerome Brunet wrote: > > > > The purpose of this patch is to provide a way to mark as broken a > > particular eee mode. At first, it had nothing to do with "set_eee" > > but, > > as Florian rightly pointed out, users shouldn't be able to re- > > enable a > > broken mode. > > I think something else has been missed - I don't see much point to > telling userspace that (eg) 1000baseT EEE is supported and then > ignore attempts to advertise it. > > If it's broken, then arguably the hardware doesn't support the mode, > so we should really be masking those bits from the EEE supported mask > as well. indeed. > > > [...] > > > > > > > > >  - maybe the problem here is that the PCS doesn't support support > > > EEE in 1000baseT mode? > > > > > > It does, and that's kind of the problem. EEE in ON for 100Tx and > > 1000T > > by default with this PHY. I have several platform with the same > > MAC-PHY  > > combination. Only the OdroidC2 shows this particular issue with > > 1000T- > > EEE > > > > As explained in other mails in this thread. The problem does not > > come > > from the MAC entering LPI. It actually comes from the link partner > > entering LPI on the Rx path under significant Tx transfer. For some > > reason, this completely mess up our PHY. > > For a 1000baseT link to enter low power, both ends have to enter LPI > (see 802.3 78.1.3.3.1) - the Tx and Rx paths can't independently > enter > LPI. > > So, if you have a busy Tx link, the link itself can't be entering > LPI. > Your link partner may be sending a request to enter LPI due to its > own > Tx path being idle, which should then be forwarded to your MAC. > > It's pretty hard to see what could be messed up with that - I'd have > expected the problems to occur when both ends were idle and the link > had entered low power mode. Well, maybe I'm not explaining the issue very well. Here the test done which led me to this conclusion: The test are done using iperf. Receiving data works well, with the expected performance. Sending data is the problem, and only under high load: Here are the lpi stats before starting the test:      irq_tx_path_in_lpi_mode_n: 6      irq_tx_path_exit_lpi_mode_n: 5      irq_rx_path_in_lpi_mode_n: 76      irq_rx_path_exit_lpi_mode_n: 75      phy_eee_wakeup_error_n: 0 Sending data with iperf usually works for little while (between 0 and 10s) # iperf3 -c 192.168.1.170 -p12345 Connecting to host 192.168.1.170, port 12345 local 192.168.1.30 port 54450 connected to 192.168.1.170 port 12345 Interval           Transfer     Bandwidth       Retr  Cwnd 0.00-1.00   sec   112 MBytes   938 Mbits/sec    0    409 KBytes        1.00-2.00   sec   112 MBytes   940 Mbits/sec    0    426 KBytes        2.00-3.00   sec   112 MBytes   939 Mbits/sec    0    426 KBytes        3.00-4.00   sec   112 MBytes   940 Mbits/sec    0    426 KBytes        4.00-5.00   sec   112 MBytes   940 Mbits/sec    0    426 KBytes        5.00-6.00   sec   112 MBytes   939 Mbits/sec    0    426 KBytes        6.00-7.00   sec  9.26 MBytes  77.6 Mbits/sec    2   1.41 KBytes <=Issue     7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes        8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes        ^C10.00-13.58  sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes        - - - - - - - - - - - - - - - - - - - - - - - - - Interval           Transfer     Bandwidth       Retr 0.00-13.58  sec   681 MBytes   421 Mbits/sec    4             sender 0.00-13.58  sec  0.00 Bytes  0.00 bits/sec                  receiver iperf3: interrupt - the client has terminated iperf3 does not exit ant the link seems completely broken. We cannot send or receive until the interface is brought down then up again. Here are the LPI related stats after the test:      irq_tx_path_in_lpi_mode_n: 48      irq_tx_path_exit_lpi_mode_n: 48      irq_rx_path_in_lpi_mode_n: 325      irq_rx_path_exit_lpi_mode_n: 325      phy_eee_wakeup_error_n: 0 This happens with : 1) Default configuration: EEE enabled on the MAC, PHY with reset settings (EEE advertised) 2) EEE disabled on the MAC, PHY still with reset settings (EEE advertised). In such case there is no irq_tx_path_*_lpi_mode interrupts at all but still a lot of irq_rx_path_*_lpi_mode interrupts. So even if the mac does not drive anything EEE related, there is still something happening between the PHY and the link partner regarding EEE. 3) Disabling EEE advertisement for 1000t: no irq_*_lpi_mode at all. The feature is not negotiated and the Tx works well. By the way, EEE work well for the 100tx on the same HW. > > > > > > > > > On the SolidRun boards, they're using AR8035, and have suffered > > > this > > > occasional link drop problem.  What has been found is that it > > > seems > > > to > > > be to do with the timing parameters, and it seemed to only be > > > 1000bT > > > that was affected.  I don't remember off hand exactly which or > > > what > > > the change was they made to stabilise it though, but I can > > > probabily > > > find out tomorrow. > > > > > > > Since the same combination of MAC-PHY works well on other designs, > > it > > is also my feeling that is has something to do with some timing > > parameter, maybe related to this particular PCB. > > Maybe a different PHY interface?  Meson seems to use RGMII, maybe > others use SGMII - but then I'd expect 100base-Tx to also be broken. > So not really sure. Nope, same interface (RGMII), same SoC. Only the PCB layout and external components might be different. > > I was talking to Florian about that last night, because the mis-named > phy_init_eee() tests for various phy interface modes before > proceeding, > which seems to be fairly rubbish as the list of interface modes is > gradually increasing since it was introduced (and I need to add SGMII > to it.)  The conclusion I've come to there is that the test should > never have been part of phylib, because if there are restrictions on > which phy interface modes are allowable for EEE, they're likely to be > either PHY or MAC specific. > > The other problem that having the test there causes is that if the > existing users can't handle EEE over SGMII, then when I add SGMII to > support my hardware, they end up breaking - far from desirable. > There's no information on why the test is there, or even which PHYs > or MACs it's applicable to, which makes this unnecessarily more > difficult to now resolve. > > My feeling is that the integration of EEE into phylib is fairly poor > at the moment, and we need to be a lot smarter about it. You know a lot more than I do on this topic obviously. I'm just trying to make GbE work (as cleanly as possible) on that board to be honest. So I'm not sure I understand, are you against EEE integration in phylib entirely, or specifically against the test I added in set_eee to filter out broken modes ? Since set_eee directly set the register, I don't see where else I could have put this test to prevent EEE broken modes from being re-enabled. > > BTW, one of the problems (not caused by your patch) is that changing > the EEE advertisment does not (on all PHY drivers) cause the link to > be renegotiated - there's no call to phy_start_aneg() when the advert > changes, and even if there was, there's no guarantee that > phy_start_aneg() will even set the AN restart bit in the control > register. > > However, given that you're hooking into the set_eee function, I'm not > sure why you placed your EEE advertisment thing into config_aneg() - > isn't it more an initialisation thing (so should be in > config_init()?) What I change is what the PHY advertise, so it seems logical to do it where "genphy_config_advert" was called. Just taking the existing code as an example >