public inbox for linux-sunxi@lists.linux.dev
From: Jerome Brunet <jbrunet@baylibre.com>
To: Heiner Kallweit <hkallweit1@gmail.com>,
	Erico Nunes <nunes.erico@gmail.com>,
	Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>,
	Giuseppe Cavallaro <peppe.cavallaro@st.com>,
	Jose Abreu <joabreu@synopsys.com>,
	Kevin Hilman <khilman@baylibre.com>,
	Neil Armstrong <narmstrong@baylibre.com>,
	linux-amlogic@lists.infradead.org, netdev@vger.kernel.org,
	"open list:ARM/Rockchip SoC..."
	<linux-rockchip@lists.infradead.org>,
	linux-sunxi@lists.linux.dev
Subject: Re: net: stmmac: dwmac-meson8b: interface sometimes does not come up at boot
Date: Wed, 02 Mar 2022 14:39:47 +0100
Message-ID: <1j8rts76te.fsf@starbuckisacylon.baylibre.com>
In-Reply-To: <1e828df4-7c5d-01af-cc49-3ef9de2cf6de@gmail.com>


On Wed 02 Mar 2022 at 12:01, Heiner Kallweit <hkallweit1@gmail.com> wrote:

> On 02.03.2022 11:33, Erico Nunes wrote:
>> On Sat, Feb 26, 2022 at 2:53 PM Heiner Kallweit <hkallweit1@gmail.com> wrote:
>>> Just to rule out that the PHY may be involved:
>>> - Does the issue occur with internal and/or external PHY?
>> 
>> My target boards have the internal phy only. It is not possible for me
>> at the moment to test it with an external phy.
>> 
>>> - Issue still occurs in PHY polling mode? (disable PHY interrupt in dts)
>> 
>> Thanks for suggesting this. I did tests with this and it seems to be a
>> workaround.
>> With phy interrupt on recent kernels (around v5.17-rc3) I'm able to
>> reproduce the issue relatively easily over a batch of a hundred jobs.
>> With my tests with the phy in polling mode, I have not been able to
>> reproduce so far, even with several hundred jobs.
>> 
> It's my understanding that in the problem case the "aneg complete"
> interrupt fires, but no data flows.
> This might indicate a timing issue. According to the meson PHY driver
> (I don't have the datasheet) the PHY doesn't have a "link up" interrupt
> source, just the mentioned "aneg complete".
>
> Below I send an experimental patch that delays the link up processing
> a little and eliminates not needed interrupt sources.
> Could you please test it with PHY interrupts enabled?
>
>
> By the way, to all:
> I found that interrupt mode is broken in fixed (aneg disabled) mode,
> because link-up isn't signaled. Experiments showed that irq source
> bit 7 can be used to fix this, but this bit isn't documented in the
> driver.
>
>> For completeness I also tested 46f69ded988d (from my initial analysis)
>> and setting the phy to polling mode there does not make a difference,
>> issue still reproduces. So it may have been a different bug. Though I
>> guess at this point we can disregard that and focus on the current
>> kernel.
>> 
>> I tried adding a few debugs and delays to the interrupt code path in
>> drivers/net/phy/meson-gxl.c but nothing gave me useful info so far.
>> 
>> Do you have more advice on how to proceed from here?
>> 
>> Thanks
>> 
>> Erico
>
> Heiner

Hi,

I also did some tests on my side, mostly with v5.10.93 at the moment.
It is true that I can recall seeing this issue only on boards using the
internal PHY (g12 and gxl boards for me - I don't have meson8b boards).

I tried on the u200 (g12 based). Being the reference design, it has both
the internal and external interfaces, so I can choose.

To my surprise, I could not reproduce the issue on it with the internal
PHY ... until I noticed that eMMC was initialising more or less at the
same time as the network.

I disabled the eMMC, out of curiosity, and the issue was back.
Like Heiner, I suspect a timing issue - at this stage, I can't tell if it
is PHY related though.

I also tried with the external PHY and could not reproduce. Unfortunately,
as we can see from the first test on the u200, failing to reproduce is not
really proof, and it is difficult to conclude anything.
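
To put a rough number on why a clean batch proves little: if each boot
fails independently with probability p, the chance that n boots in a row
all come up is (1 - p)^n. The sketch below just computes that. The
function name is made up for illustration, and independence between
boots is an assumption, not something measured on these boards.

```c
#include <assert.h>

/*
 * Probability that `runs` consecutive boots all succeed when each boot
 * fails independently with probability `fail_rate`: (1 - fail_rate)^runs.
 * Hypothetical helper for illustration only.
 */
static double prob_no_failure(double fail_rate, unsigned int runs)
{
	double p = 1.0;
	unsigned int i;

	assert(fail_rate >= 0.0 && fail_rate <= 1.0);

	for (i = 0; i < runs; i++)
		p *= 1.0 - fail_rate;

	return p;
}
```

With an assumed 1% failure rate, prob_no_failure(0.01, 100) comes out at
roughly 0.37, so about one batch of a hundred in three would show nothing
at all.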

Like Erico, I tried bisecting but I ended up on a BT merge ... Clearly
inconclusive :(

Disabling the IRQ is an interesting test but, on my side, I have mixed
results (on the libretech-cc this time):

* I first tried quickly while bisecting, on commit
  5.6.0-rc3-01434-g8d4ccd7770e7:
  - With IRQ => NOK
  - POLL => NOK

Seeing Erico's report, I thought maybe I mixed things up so I tried again,
double-checked that the IRQ was disabled ... still broken. There was another
commit on which I reproduced it without the IRQ, but I lost track of it.

* I also tried on v5.10.93:
  - With IRQ => NOK
  - POLL => OK ... (well, I got bored before the issue showed up)

It seems that switching to polling, in some cases, changes the timings
just enough to hide the issue ... but not always. Unless I forgot to
consider something else? Ideas?

If I understand the proposed patch correctly, it is mostly about the phy
IRQ. Since I reproduce without the IRQ, I suppose it is not the
problem we were looking for (it might still be a problem worth fixing -
the phy is not "rock-solid" when it comes to aneg - I already tried
stabilising it a few years ago).

TBH, it bothers me that I reproduced it without the IRQ. The idea makes
sense :/

>
>
> diff --git a/drivers/net/phy/meson-gxl.c b/drivers/net/phy/meson-gxl.c
> index 7e7904fee..0acb3a99a 100644
> --- a/drivers/net/phy/meson-gxl.c
> +++ b/drivers/net/phy/meson-gxl.c
> @@ -7,6 +7,7 @@
>   * Author: Neil Armstrong <narmstrong@baylibre.com>
>   */
>  #include <linux/kernel.h>
> +#include <linux/delay.h>
>  #include <linux/module.h>
>  #include <linux/mii.h>
>  #include <linux/ethtool.h>
> @@ -209,12 +210,7 @@ static int meson_gxl_config_intr(struct phy_device *phydev)
>  		if (ret)
>  			return ret;
>  
> -		val = INTSRC_ANEG_PR
> -			| INTSRC_PARALLEL_FAULT
> -			| INTSRC_ANEG_LP_ACK
> -			| INTSRC_LINK_DOWN
> -			| INTSRC_REMOTE_FAULT
> -			| INTSRC_ANEG_COMPLETE;
> +		val = INTSRC_LINK_DOWN | INTSRC_ANEG_COMPLETE;
>  		ret = phy_write(phydev, INTSRC_MASK, val);
>  	} else {
>  		val = 0;
> @@ -240,6 +236,9 @@ static irqreturn_t meson_gxl_handle_interrupt(struct phy_device *phydev)
>  	if (irq_status == 0)
>  		return IRQ_NONE;
>  
> +	if (irq_status & INTSRC_ANEG_COMPLETE)
> +		msleep(100);
> +
>  	phy_trigger_machine(phydev);
>  
>  	return IRQ_HANDLED;
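
For anyone following along without the driver open, the effect of the
quoted patch can be modelled in user space. This is only a rough sketch,
not driver code: the bit positions are what I recall from
drivers/net/phy/meson-gxl.c and should be treated as assumptions, and
the three helper functions are hypothetical names made up here.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Assumed bit positions, recalled from drivers/net/phy/meson-gxl.c. */
#define INTSRC_ANEG_PR		BIT(1)
#define INTSRC_PARALLEL_FAULT	BIT(2)
#define INTSRC_ANEG_LP_ACK	BIT(3)
#define INTSRC_LINK_DOWN	BIT(4)
#define INTSRC_REMOTE_FAULT	BIT(5)
#define INTSRC_ANEG_COMPLETE	BIT(6)

/* Compile-time check of the patched mask under the assumed bit layout. */
static_assert((INTSRC_LINK_DOWN | INTSRC_ANEG_COMPLETE) == 0x50,
	      "unexpected patched mask");

/* Mask written to INTSRC_MASK before the patch: all six sources. */
static uint16_t intsrc_mask_before(void)
{
	return INTSRC_ANEG_PR | INTSRC_PARALLEL_FAULT | INTSRC_ANEG_LP_ACK |
	       INTSRC_LINK_DOWN | INTSRC_REMOTE_FAULT | INTSRC_ANEG_COMPLETE;
}

/* Mask written after the patch: only link down and aneg complete. */
static uint16_t intsrc_mask_after(void)
{
	return INTSRC_LINK_DOWN | INTSRC_ANEG_COMPLETE;
}

/*
 * Mirrors the check added to the interrupt handler: how long (in ms)
 * the handler would sleep before triggering the PHY state machine.
 */
static unsigned int settle_delay_ms(uint16_t irq_status)
{
	return (irq_status & INTSRC_ANEG_COMPLETE) ? 100 : 0;
}
```

Under those assumptions the mask goes from 0x7e to 0x50, and only an
"aneg complete" event gets the 100 ms delay; a plain link-down event is
handled immediately.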

