linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] PCI: mediatek-gen3: Ignore link up timeout
@ 2025-11-05  6:28 Chen-Yu Tsai
  2025-11-05  8:45 ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 6+ messages in thread
From: Chen-Yu Tsai @ 2025-11-05  6:28 UTC (permalink / raw)
  To: Matthias Brugger, AngeloGioacchino Del Regno, Ryder Lee,
	Jianjun Wang, Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas
  Cc: Chen-Yu Tsai, Bartosz Golaszewski, linux-pci, linux-mediatek,
	linux-kernel

As mentioned in commit 886a9c134755 ("PCI: dwc: Move link handling into
common code") come up later" in the code, it is possible for link up to
occur later:

  Let's standardize this to succeed as there are usecases where devices
  (and the link) appear later even without hotplug. For example, a
  reconfigured FPGA device.

Another case for this is the new PCIe power control stuff. The power
control mechanism only gets triggered in the PCI core after the driver
calls into pci_host_probe(). The power control framework then triggers
a bus rescan. In most driver implementations, this sequence happens
after link training. If the driver errors out when link training times
out, it will never get to the point where the device gets turned on.

Ignore the link up timeout, and lower the error message down to a
warning.

This makes PCIe devices that have not-always-on power rails work.
However there may be some reversal of PCIe power sequencing, since now
the PERST# and clocks are enabled in the driver, while the power is
applied afterwards.

Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
---
The change works to get my PCIe WiFi device working, but I wonder if
the driver should expose more fine grained controls for the link clock
and PERST# (when it is owned by the controller and not just a GPIO) to
the power control framework. This applies not just to this driver.

The PCI standard says that PERST# should hold the device in reset until
the power rails are valid or stable, i.e. at their designated voltages.
---
 drivers/pci/controller/pcie-mediatek-gen3.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
index 75ddb8bee168..5bdb312c9f9b 100644
--- a/drivers/pci/controller/pcie-mediatek-gen3.c
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -504,10 +504,15 @@ static int mtk_pcie_startup_port(struct mtk_gen3_pcie *pcie)
 		ltssm_index = PCIE_LTSSM_STATE(val);
 		ltssm_state = ltssm_index >= ARRAY_SIZE(ltssm_str) ?
 			      "Unknown state" : ltssm_str[ltssm_index];
-		dev_err(pcie->dev,
-			"PCIe link down, current LTSSM state: %s (%#x)\n",
-			ltssm_state, val);
-		return err;
+		dev_warn(pcie->dev,
+			 "PCIe link down, current LTSSM state: %s (%#x)\n",
+			 ltssm_state, val);
+
+		/*
+		 * Ignore the timeout, as the link may come up later,
+		 * such as when the PCI power control enables power to the
+		 * device, at which point it triggers a rescan.
+		 */
 	}
 
 	mtk_pcie_enable_msi(pcie);
-- 
2.51.2.1026.g39e6a42477-goog


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: mediatek-gen3: Ignore link up timeout
  2025-11-05  6:28 [PATCH] PCI: mediatek-gen3: Ignore link up timeout Chen-Yu Tsai
@ 2025-11-05  8:45 ` AngeloGioacchino Del Regno
  2025-11-05  9:21   ` Chen-Yu Tsai
  0 siblings, 1 reply; 6+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-11-05  8:45 UTC (permalink / raw)
  To: Chen-Yu Tsai, Matthias Brugger, Ryder Lee, Jianjun Wang,
	Lorenzo Pieralisi, Krzysztof Wilczyński,
	Manivannan Sadhasivam, Rob Herring, Bjorn Helgaas
  Cc: Bartosz Golaszewski, linux-pci, linux-mediatek, linux-kernel

Il 05/11/25 07:28, Chen-Yu Tsai ha scritto:
> As mentioned in commit 886a9c134755 ("PCI: dwc: Move link handling into
> common code") come up later" in the code, it is possible for link up to
> occur later:
> 
>    Let's standardize this to succeed as there are usecases where devices
>    (and the link) appear later even without hotplug. For example, a
>    reconfigured FPGA device.
> 
> Another case for this is the new PCIe power control stuff. The power
> control mechanism only gets triggered in the PCI core after the driver
> calls into pci_host_probe(). The power control framework then triggers
> a bus rescan. In most driver implementations, this sequence happens
> after link training. If the driver errors out when link training times
> out, it will never get to the point where the device gets turned on.
> 
> Ignore the link up timeout, and lower the error message down to a
> warning.
> 
> This makes PCIe devices that have not-always-on power rails work.
> However there may be some reversal of PCIe power sequencing, since now
> the PERST# and clocks are enabled in the driver, while the power is
> applied afterwards.
> 
> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>

Ok, that's sensible.

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>

> ---
> The change works to get my PCIe WiFi device working, but I wonder if
> the driver should expose more fine grained controls for the link clock
> and PERST# (when it is owned by the controller and not just a GPIO) to
> the power control framework. This applies not just to this driver.
> 
> The PCI standard says that PERST# should hold the device in reset until
> the power rails are valid or stable, i.e. at their designated voltages.

I completely agree with all of the above - and I can imagine multiple PCI-Express
controller drivers doing the same as what's being done in MTK Gen3.

This means that the boot process may get slowed down by the port startup sequence
on multiple PCI-Express controllers (again not just MediaTek) and it's something
that must be resolved in some way... with the fastest course of action imo being
giving controller drivers knowledge of whether there's any device that is expected
to be powered off at that time (in order to at least avoid all those waits that
are expected to fail).

P.S.: Chen-Yu, did you check if the same applies to the MTK previous gen driver?
       Could you please check and eventually send a commit to do the same there?

Cheers,
Angelo

> ---
>   drivers/pci/controller/pcie-mediatek-gen3.c | 13 +++++++++----
>   1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
> index 75ddb8bee168..5bdb312c9f9b 100644
> --- a/drivers/pci/controller/pcie-mediatek-gen3.c
> +++ b/drivers/pci/controller/pcie-mediatek-gen3.c
> @@ -504,10 +504,15 @@ static int mtk_pcie_startup_port(struct mtk_gen3_pcie *pcie)
>   		ltssm_index = PCIE_LTSSM_STATE(val);
>   		ltssm_state = ltssm_index >= ARRAY_SIZE(ltssm_str) ?
>   			      "Unknown state" : ltssm_str[ltssm_index];
> -		dev_err(pcie->dev,
> -			"PCIe link down, current LTSSM state: %s (%#x)\n",
> -			ltssm_state, val);
> -		return err;
> +		dev_warn(pcie->dev,
> +			 "PCIe link down, current LTSSM state: %s (%#x)\n",
> +			 ltssm_state, val);
> +
> +		/*
> +		 * Ignore the timeout, as the link may come up later,
> +		 * such as when the PCI power control enables power to the
> +		 * device, at which point it triggers a rescan.
> +		 */
>   	}
>   
>   	mtk_pcie_enable_msi(pcie);



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: mediatek-gen3: Ignore link up timeout
  2025-11-05  8:45 ` AngeloGioacchino Del Regno
@ 2025-11-05  9:21   ` Chen-Yu Tsai
  2025-11-05 11:32     ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 6+ messages in thread
From: Chen-Yu Tsai @ 2025-11-05  9:21 UTC (permalink / raw)
  To: AngeloGioacchino Del Regno, Bartosz Golaszewski,
	Manivannan Sadhasivam
  Cc: Matthias Brugger, Ryder Lee, Jianjun Wang, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
	linux-mediatek, linux-kernel

On Wed, Nov 5, 2025 at 4:45 PM AngeloGioacchino Del Regno
<angelogioacchino.delregno@collabora.com> wrote:
>
> Il 05/11/25 07:28, Chen-Yu Tsai ha scritto:
> > As mentioned in commit 886a9c134755 ("PCI: dwc: Move link handling into
> > common code") come up later" in the code, it is possible for link up to
> > occur later:
> >
> >    Let's standardize this to succeed as there are usecases where devices
> >    (and the link) appear later even without hotplug. For example, a
> >    reconfigured FPGA device.
> >
> > Another case for this is the new PCIe power control stuff. The power
> > control mechanism only gets triggered in the PCI core after the driver
> > calls into pci_host_probe(). The power control framework then triggers
> > a bus rescan. In most driver implementations, this sequence happens
> > after link training. If the driver errors out when link training times
> > out, it will never get to the point where the device gets turned on.
> >
> > Ignore the link up timeout, and lower the error message down to a
> > warning.
> >
> > This makes PCIe devices that have not-always-on power rails work.
> > However there may be some reversal of PCIe power sequencing, since now
> > the PERST# and clocks are enabled in the driver, while the power is
> > applied afterwards.
> >
> > Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
>
> Ok, that's sensible.
>
> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>
> > ---
> > The change works to get my PCIe WiFi device working, but I wonder if
> > the driver should expose more fine grained controls for the link clock
> > and PERST# (when it is owned by the controller and not just a GPIO) to
> > the power control framework. This applies not just to this driver.
> >
> > The PCI standard says that PERST# should hold the device in reset until
> > the power rails are valid or stable, i.e. at their designated voltages.
>
> I completely agree with all of the above - and I can imagine multiple PCI-Express
> controller drivers doing the same as what's being done in MTK Gen3.
>
> This means that the boot process may get slowed down by the port startup sequence
> on multiple PCI-Express controllers (again not just MediaTek) and it's something
> that must be resolved in some way... with the fastest course of action imo being
> giving controller drivers knowledge of whether there's any device that is expected
> to be powered off at that time (in order to at least avoid all those waits that
> are expected to fail).

That also requires some refactoring, since all the drivers _wait_ for link
up before going into the PCI core, which does the actual child node parsing.

I would like some input from Bartosz, who introduced the PCI power control
framework, and Manivannan, who added slot power control.

> P.S.: Chen-Yu, did you check if the same applies to the MTK previous gen driver?
>        Could you please check and eventually send a commit to do the same there?

My quick survey last week indicated that all the drivers except for the
dwc family error out if link up timed out.

I don't have any hardware for the older generation though. And it looks
like for the previous gen, the driver performs even worse, since it can
support multiple slots, and each slot is brought up sequentially. A slot
is discarded if link up times out. And the whole driver errors out if no
slots are working.


ChenYu

> Cheers,
> Angelo
>
> > ---
> >   drivers/pci/controller/pcie-mediatek-gen3.c | 13 +++++++++----
> >   1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
> > index 75ddb8bee168..5bdb312c9f9b 100644
> > --- a/drivers/pci/controller/pcie-mediatek-gen3.c
> > +++ b/drivers/pci/controller/pcie-mediatek-gen3.c
> > @@ -504,10 +504,15 @@ static int mtk_pcie_startup_port(struct mtk_gen3_pcie *pcie)
> >               ltssm_index = PCIE_LTSSM_STATE(val);
> >               ltssm_state = ltssm_index >= ARRAY_SIZE(ltssm_str) ?
> >                             "Unknown state" : ltssm_str[ltssm_index];
> > -             dev_err(pcie->dev,
> > -                     "PCIe link down, current LTSSM state: %s (%#x)\n",
> > -                     ltssm_state, val);
> > -             return err;
> > +             dev_warn(pcie->dev,
> > +                      "PCIe link down, current LTSSM state: %s (%#x)\n",
> > +                      ltssm_state, val);
> > +
> > +             /*
> > +              * Ignore the timeout, as the link may come up later,
> > +              * such as when the PCI power control enables power to the
> > +              * device, at which point it triggers a rescan.
> > +              */
> >       }
> >
> >       mtk_pcie_enable_msi(pcie);
>
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: mediatek-gen3: Ignore link up timeout
  2025-11-05  9:21   ` Chen-Yu Tsai
@ 2025-11-05 11:32     ` AngeloGioacchino Del Regno
  2025-11-06  5:52       ` Chen-Yu Tsai
  0 siblings, 1 reply; 6+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-11-05 11:32 UTC (permalink / raw)
  To: Chen-Yu Tsai, Bartosz Golaszewski, Manivannan Sadhasivam
  Cc: Matthias Brugger, Ryder Lee, Jianjun Wang, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
	linux-mediatek, linux-kernel

Il 05/11/25 10:21, Chen-Yu Tsai ha scritto:
> On Wed, Nov 5, 2025 at 4:45 PM AngeloGioacchino Del Regno
> <angelogioacchino.delregno@collabora.com> wrote:
>>
>> Il 05/11/25 07:28, Chen-Yu Tsai ha scritto:
>>> As mentioned in commit 886a9c134755 ("PCI: dwc: Move link handling into
>>> common code") come up later" in the code, it is possible for link up to
>>> occur later:
>>>
>>>     Let's standardize this to succeed as there are usecases where devices
>>>     (and the link) appear later even without hotplug. For example, a
>>>     reconfigured FPGA device.
>>>
>>> Another case for this is the new PCIe power control stuff. The power
>>> control mechanism only gets triggered in the PCI core after the driver
>>> calls into pci_host_probe(). The power control framework then triggers
>>> a bus rescan. In most driver implementations, this sequence happens
>>> after link training. If the driver errors out when link training times
>>> out, it will never get to the point where the device gets turned on.
>>>
>>> Ignore the link up timeout, and lower the error message down to a
>>> warning.
>>>
>>> This makes PCIe devices that have not-always-on power rails work.
>>> However there may be some reversal of PCIe power sequencing, since now
>>> the PERST# and clocks are enabled in the driver, while the power is
>>> applied afterwards.
>>>
>>> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
>>
>> Ok, that's sensible.
>>
>> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>
>>> ---
>>> The change works to get my PCIe WiFi device working, but I wonder if
>>> the driver should expose more fine grained controls for the link clock
>>> and PERST# (when it is owned by the controller and not just a GPIO) to
>>> the power control framework. This applies not just to this driver.
>>>
>>> The PCI standard says that PERST# should hold the device in reset until
>>> the power rails are valid or stable, i.e. at their designated voltages.
>>
>> I completely agree with all of the above - and I can imagine multiple PCI-Express
>> controller drivers doing the same as what's being done in MTK Gen3.
>>
>> This means that the boot process may get slowed down by the port startup sequence
>> on multiple PCI-Express controllers (again not just MediaTek) and it's something
>> that must be resolved in some way... with the fastest course of action imo being
>> giving controller drivers knowledge of whether there's any device that is expected
>> to be powered off at that time (in order to at least avoid all those waits that
>> are expected to fail).
> 
> That also requires some refactoring, since all the drivers _wait_ for link
> up before going into the PCI core, which does the actual child node parsing.
> 
> I would like some input from Bartosz, who introduced the PCI power control
> framework, and Manivannan, who added slot power control.
> 
>> P.S.: Chen-Yu, did you check if the same applies to the MTK previous gen driver?
>>         Could you please check and eventually send a commit to do the same there?
> 
> My quick survey last week indicated that all the drivers except for the
> dwc family error out if link up timed out.
> 
> I don't have any hardware for the older generation though. And it looks
> like for the previous gen, the driver performs even worse, since it can
> support multiple slots, and each slot is brought up sequentially. A slot
> is discarded if link up times out. And the whole driver errors out if no
> slots are working.
> 

Hey, that's bold.

If only one driver (DWC) is working okay, there's something wrong that must be
fixed before that behavior change goes upstream (which it already did, ugh).

This needs attention from both Bartosz and Mani really-right-now.

I'm not sure about possible good solutions, and unfortunately I don't really have
any time to explore, so I'm not spitting any words on that - leaving this to both
Bartosz and Mani as that's also the right thing to do anyway.

Angelo

> 
> ChenYu
> 
>> Cheers,
>> Angelo
>>
>>> ---
>>>    drivers/pci/controller/pcie-mediatek-gen3.c | 13 +++++++++----
>>>    1 file changed, 9 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
>>> index 75ddb8bee168..5bdb312c9f9b 100644
>>> --- a/drivers/pci/controller/pcie-mediatek-gen3.c
>>> +++ b/drivers/pci/controller/pcie-mediatek-gen3.c
>>> @@ -504,10 +504,15 @@ static int mtk_pcie_startup_port(struct mtk_gen3_pcie *pcie)
>>>                ltssm_index = PCIE_LTSSM_STATE(val);
>>>                ltssm_state = ltssm_index >= ARRAY_SIZE(ltssm_str) ?
>>>                              "Unknown state" : ltssm_str[ltssm_index];
>>> -             dev_err(pcie->dev,
>>> -                     "PCIe link down, current LTSSM state: %s (%#x)\n",
>>> -                     ltssm_state, val);
>>> -             return err;
>>> +             dev_warn(pcie->dev,
>>> +                      "PCIe link down, current LTSSM state: %s (%#x)\n",
>>> +                      ltssm_state, val);
>>> +
>>> +             /*
>>> +              * Ignore the timeout, as the link may come up later,
>>> +              * such as when the PCI power control enables power to the
>>> +              * device, at which point it triggers a rescan.
>>> +              */
>>>        }
>>>
>>>        mtk_pcie_enable_msi(pcie);
>>
>>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: mediatek-gen3: Ignore link up timeout
  2025-11-05 11:32     ` AngeloGioacchino Del Regno
@ 2025-11-06  5:52       ` Chen-Yu Tsai
  2025-11-06  9:00         ` AngeloGioacchino Del Regno
  0 siblings, 1 reply; 6+ messages in thread
From: Chen-Yu Tsai @ 2025-11-06  5:52 UTC (permalink / raw)
  To: AngeloGioacchino Del Regno
  Cc: Bartosz Golaszewski, Manivannan Sadhasivam, Matthias Brugger,
	Ryder Lee, Jianjun Wang, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
	linux-mediatek, linux-kernel

On Wed, Nov 5, 2025 at 7:32 PM AngeloGioacchino Del Regno
<angelogioacchino.delregno@collabora.com> wrote:
>
> Il 05/11/25 10:21, Chen-Yu Tsai ha scritto:
> > On Wed, Nov 5, 2025 at 4:45 PM AngeloGioacchino Del Regno
> > <angelogioacchino.delregno@collabora.com> wrote:
> >>
> >> Il 05/11/25 07:28, Chen-Yu Tsai ha scritto:
> >>> As mentioned in commit 886a9c134755 ("PCI: dwc: Move link handling into
> >>> common code") come up later" in the code, it is possible for link up to
> >>> occur later:
> >>>
> >>>     Let's standardize this to succeed as there are usecases where devices
> >>>     (and the link) appear later even without hotplug. For example, a
> >>>     reconfigured FPGA device.
> >>>
> >>> Another case for this is the new PCIe power control stuff. The power
> >>> control mechanism only gets triggered in the PCI core after the driver
> >>> calls into pci_host_probe(). The power control framework then triggers
> >>> a bus rescan. In most driver implementations, this sequence happens
> >>> after link training. If the driver errors out when link training times
> >>> out, it will never get to the point where the device gets turned on.
> >>>
> >>> Ignore the link up timeout, and lower the error message down to a
> >>> warning.
> >>>
> >>> This makes PCIe devices that have not-always-on power rails work.
> >>> However there may be some reversal of PCIe power sequencing, since now
> >>> the PERST# and clocks are enabled in the driver, while the power is
> >>> applied afterwards.
> >>>
> >>> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
> >>
> >> Ok, that's sensible.
> >>
> >> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> >>
> >>> ---
> >>> The change works to get my PCIe WiFi device working, but I wonder if
> >>> the driver should expose more fine grained controls for the link clock
> >>> and PERST# (when it is owned by the controller and not just a GPIO) to
> >>> the power control framework. This applies not just to this driver.
> >>>
> >>> The PCI standard says that PERST# should hold the device in reset until
> >>> the power rails are valid or stable, i.e. at their designated voltages.
> >>
> >> I completely agree with all of the above - and I can imagine multiple PCI-Express
> >> controller drivers doing the same as what's being done in MTK Gen3.
> >>
> >> This means that the boot process may get slowed down by the port startup sequence
> >> on multiple PCI-Express controllers (again not just MediaTek) and it's something
> >> that must be resolved in some way... with the fastest course of action imo being
> >> giving controller drivers knowledge of whether there's any device that is expected
> >> to be powered off at that time (in order to at least avoid all those waits that
> >> are expected to fail).
> >
> > That also requires some refactoring, since all the drivers _wait_ for link
> > up before going into the PCI core, which does the actual child node parsing.
> >
> > I would like some input from Bartosz, who introduced the PCI power control
> > framework, and Manivannan, who added slot power control.
> >
> >> P.S.: Chen-Yu, did you check if the same applies to the MTK previous gen driver?
> >>         Could you please check and eventually send a commit to do the same there?
> >
> > My quick survey last week indicated that all the drivers except for the
> > dwc family error out if link up timed out.
> >
> > I don't have any hardware for the older generation though. And it looks
> > like for the previous gen, the driver performs even worse, since it can
> > support multiple slots, and each slot is brought up sequentially. A slot
> > is discarded if link up times out. And the whole driver errors out if no
> > slots are working.
> >
>
> Hey, that's bold.
>
> If only one driver (DWC) is working okay, there's something wrong that must be
> fixed before that behavior change goes upstream (which it already did, ugh).

To be fair one only runs into it if they convert over to the PCI slot power
description in the device tree, and their hardware isn't DWC based. This
is pretty new.

> This needs attention from both Bartosz and Mani really-right-now.
>
> I'm not sure about possible good solutions, and unfortunately I don't really have
> any time to explore, so I'm not spitting any words on that - leaving this to both
> Bartosz and Mani as that's also the right thing to do anyway.

Mani mentioned [1] that work towards moving the pwrctrl stuff into drivers
is almost complete. So I think we're covered.

ChenYu

[1] https://lore.kernel.org/all/rz6ajnl7l25hfl2u7lloywtw7sq7smhb63hg76wjslyuwyjb7a@fhuafuino5kv/

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] PCI: mediatek-gen3: Ignore link up timeout
  2025-11-06  5:52       ` Chen-Yu Tsai
@ 2025-11-06  9:00         ` AngeloGioacchino Del Regno
  0 siblings, 0 replies; 6+ messages in thread
From: AngeloGioacchino Del Regno @ 2025-11-06  9:00 UTC (permalink / raw)
  To: Chen-Yu Tsai
  Cc: Bartosz Golaszewski, Manivannan Sadhasivam, Matthias Brugger,
	Ryder Lee, Jianjun Wang, Lorenzo Pieralisi,
	Krzysztof Wilczyński, Rob Herring, Bjorn Helgaas, linux-pci,
	linux-mediatek, linux-kernel

Il 06/11/25 06:52, Chen-Yu Tsai ha scritto:
> On Wed, Nov 5, 2025 at 7:32 PM AngeloGioacchino Del Regno
> <angelogioacchino.delregno@collabora.com> wrote:
>>
>> Il 05/11/25 10:21, Chen-Yu Tsai ha scritto:
>>> On Wed, Nov 5, 2025 at 4:45 PM AngeloGioacchino Del Regno
>>> <angelogioacchino.delregno@collabora.com> wrote:
>>>>
>>>> Il 05/11/25 07:28, Chen-Yu Tsai ha scritto:
>>>>> As mentioned in commit 886a9c134755 ("PCI: dwc: Move link handling into
>>>>> common code") come up later" in the code, it is possible for link up to
>>>>> occur later:
>>>>>
>>>>>      Let's standardize this to succeed as there are usecases where devices
>>>>>      (and the link) appear later even without hotplug. For example, a
>>>>>      reconfigured FPGA device.
>>>>>
>>>>> Another case for this is the new PCIe power control stuff. The power
>>>>> control mechanism only gets triggered in the PCI core after the driver
>>>>> calls into pci_host_probe(). The power control framework then triggers
>>>>> a bus rescan. In most driver implementations, this sequence happens
>>>>> after link training. If the driver errors out when link training times
>>>>> out, it will never get to the point where the device gets turned on.
>>>>>
>>>>> Ignore the link up timeout, and lower the error message down to a
>>>>> warning.
>>>>>
>>>>> This makes PCIe devices that have not-always-on power rails work.
>>>>> However there may be some reversal of PCIe power sequencing, since now
>>>>> the PERST# and clocks are enabled in the driver, while the power is
>>>>> applied afterwards.
>>>>>
>>>>> Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
>>>>
>>>> Ok, that's sensible.
>>>>
>>>> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>>>
>>>>> ---
>>>>> The change works to get my PCIe WiFi device working, but I wonder if
>>>>> the driver should expose more fine grained controls for the link clock
>>>>> and PERST# (when it is owned by the controller and not just a GPIO) to
>>>>> the power control framework. This applies not just to this driver.
>>>>>
>>>>> The PCI standard says that PERST# should hold the device in reset until
>>>>> the power rails are valid or stable, i.e. at their designated voltages.
>>>>
>>>> I completely agree with all of the above - and I can imagine multiple PCI-Express
>>>> controller drivers doing the same as what's being done in MTK Gen3.
>>>>
>>>> This means that the boot process may get slowed down by the port startup sequence
>>>> on multiple PCI-Express controllers (again not just MediaTek) and it's something
>>>> that must be resolved in some way... with the fastest course of action imo being
>>>> giving controller drivers knowledge of whether there's any device that is expected
>>>> to be powered off at that time (in order to at least avoid all those waits that
>>>> are expected to fail).
>>>
>>> That also requires some refactoring, since all the drivers _wait_ for link
>>> up before going into the PCI core, which does the actual child node parsing.
>>>
>>> I would like some input from Bartosz, who introduced the PCI power control
>>> framework, and Manivannan, who added slot power control.
>>>
>>>> P.S.: Chen-Yu, did you check if the same applies to the MTK previous gen driver?
>>>>          Could you please check and eventually send a commit to do the same there?
>>>
>>> My quick survey last week indicated that all the drivers except for the
>>> dwc family error out if link up timed out.
>>>
>>> I don't have any hardware for the older generation though. And it looks
>>> like for the previous gen, the driver performs even worse, since it can
>>> support multiple slots, and each slot is brought up sequentially. A slot
>>> is discarded if link up times out. And the whole driver errors out if no
>>> slots are working.
>>>
>>
>> Hey, that's bold.
>>
>> If only one driver (DWC) is working okay, there's something wrong that must be
>> fixed before that behavior change goes upstream (which it already did, ugh).
> 
> To be fair one only runs into it if they convert over to the PCI slot power
> description in the device tree, and their hardware isn't DWC based. This
> is pretty new.
> 

That changes a lot of things then - I thought it was a regression, but it's not.

>> This needs attention from both Bartosz and Mani really-right-now.
>>
>> I'm not sure about possible good solutions, and unfortunately I don't really have
>> any time to explore, so I'm not spitting any words on that - leaving this to both
>> Bartosz and Mani as that's also the right thing to do anyway.
> 
> Mani mentioned [1] that work towards moving the pwrctrl stuff into drivers
> is almost complete. So I think we're covered.
> 

We're covered. Yes.

Cheers,
Angelo

> ChenYu
> 
> [1] https://lore.kernel.org/all/rz6ajnl7l25hfl2u7lloywtw7sq7smhb63hg76wjslyuwyjb7a@fhuafuino5kv/



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-11-06  9:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-05  6:28 [PATCH] PCI: mediatek-gen3: Ignore link up timeout Chen-Yu Tsai
2025-11-05  8:45 ` AngeloGioacchino Del Regno
2025-11-05  9:21   ` Chen-Yu Tsai
2025-11-05 11:32     ` AngeloGioacchino Del Regno
2025-11-06  5:52       ` Chen-Yu Tsai
2025-11-06  9:00         ` AngeloGioacchino Del Regno

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).