* imx8mp: pci enumeration fails with current HEAD
@ 2025-04-17 14:37 Heiko Schocher
2025-04-17 14:46 ` Heiko Schocher
2025-04-17 14:54 ` Miquel Raynal
0 siblings, 2 replies; 6+ messages in thread
From: Heiko Schocher @ 2025-04-17 14:37 UTC (permalink / raw)
To: Miquel Raynal; +Cc: Fabio Estevam, u-boot@lists.denx.de
Hi Miquel,
I have here an imx8mp based board for which I just prepare mainlining.
pci enumeration works fine with U-Boot 2025.04
u-boot=> pci enum
PCIE-0: Link up (Gen1-x1, Bus0)
u-boot=>
and tftp through the ethernet interface works fine.
Using current HEAD
* 5b4ae0f3f04 - (origin/master, origin/HEAD) mailmap: update my name and email (vor 2 Tagen) <Casey
Connolly>
It breaks with:
u-boot=> pci enum
nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110.
pcie_dw_imx pcie@33800000: failed to power on PHY
u-boot=>
git bisect dropped your patch:
197376fbf300e92afa0a1583815d9c9eb52d613a is the first bad commit
commit 197376fbf300e92afa0a1583815d9c9eb52d613a
Author: Miquel Raynal <miquel.raynal@bootlin.com>
Date: Thu Apr 3 09:39:05 2025 +0200
power-domain: Add refcounting
It is very surprising that such an uclass, specifically designed to
handle resources that may be shared by different devices, is not keeping
the count of the number of times a power domain has been
enabled/disabled to avoid shutting it down unexpectedly or disabling it
several times.
Doing this causes troubles on eg. i.MX8MP because disabling power
domains can be done in recursive loops were the same power domain
disabled up to 4 times in a row. PGCs seem to have tight FSM internal
timings to respect and it is easy to produce a race condition that puts
the power domains in an unstable state, leading to ADB400 errors and
later crashes in Linux.
CI tests using power domains are slightly updated to make sure the count
of on/off calls is even and the results match what we *now* expect.
As we do not want to break existing users while stile getting
interesting error codes, the implementation is split between:
- a low-level helper reporting error codes if the requested transition
could not be operated,
- a higher-level helper ignoring the "non error" codes, like EALREADY and
EBUSY.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
reverting this patch, and it works again fine for me!
Do you have a imx8mp based hardware with pci?
Can you (or anyone else with an imx8mp based hardware with pci) approve
this on a similiar HW?
I try to dig into it, but may you have a fast idea!
Okay, thought about it ... it tries to power on "blk-ctrl@32f10000"
driver:/drivers/power/domain/imx8mp-hsiomix.c
imx8mp.dtsi:
hsio_blk_ctrl: blk-ctrl@32f10000 {
compatible = "fsl,imx8mp-hsio-blk-ctrl", "syscon";
which has several different power domains ... and your patch prevents to
enable more than one of them ... adding some debugs shows:
u-boot=> pci enum
[...]
power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 priv->on_count: 0
power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 power_domain->id: 4
Here pwoer domain id
#define IMX8MP_HSIOBLK_PD_PCIE_PHY 4
gets enabled and on_count increases...
imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4
power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 priv->on_count: 0
power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 power_domain->id: 0
power_domain_on_lowlevel: ret: 0
power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 priv->on_count: 0
power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 power_domain->id: 0
power_domain_on_lowlevel: ret: 0
imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 END OKAY
power_domain_on_lowlevel: ret: 0
power_domain_get_by_index(dev=00000000fe5b2410, power_domain=00000000fe594458)
power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 priv->on_count: 1
power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 power_domain->id: 3
Now power domain
#define IMX8MP_HSIOBLK_PD_PCIE 3
should be enabled, but since your patch this gets prevented as "priv->on_count: 1"
imx8_pcie_phy_power_on: ------------ pad_mode: 1
imx8_pcie_phy_power_on: ------------ pad_mode: 1 ret: -110
nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110.
pcie_dw_imx pcie@33800000: failed to power on PHY
u-boot=>
I have not a fast solution, may you have an idea? I try to look
after the easter holidays into it. May we need for each power_domain-> id such
an on counter...
Thanks!
bye,
Heiko
--
DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: imx8mp: pci enumeration fails with current HEAD 2025-04-17 14:37 imx8mp: pci enumeration fails with current HEAD Heiko Schocher @ 2025-04-17 14:46 ` Heiko Schocher 2025-04-17 15:20 ` Miquel Raynal 2025-04-17 14:54 ` Miquel Raynal 1 sibling, 1 reply; 6+ messages in thread From: Heiko Schocher @ 2025-04-17 14:46 UTC (permalink / raw) To: Miquel Raynal; +Cc: Fabio Estevam, u-boot@lists.denx.de Hello Miquel, On 17.04.25 16:37, Heiko Schocher wrote: > Hi Miquel, > > I have here an imx8mp based board for which I just prepare mainlining. > > pci enumeration works fine with U-Boot 2025.04 > > u-boot=> pci enum > PCIE-0: Link up (Gen1-x1, Bus0) > u-boot=> > > and tftp through the ethernet interface works fine. > > Using current HEAD > > * 5b4ae0f3f04 - (origin/master, origin/HEAD) mailmap: update my name and email (vor 2 Tagen) <Casey > Connolly> > > It breaks with: > > u-boot=> pci enum > nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. > pcie_dw_imx pcie@33800000: failed to power on PHY > u-boot=> > > git bisect dropped your patch: > > 197376fbf300e92afa0a1583815d9c9eb52d613a is the first bad commit > commit 197376fbf300e92afa0a1583815d9c9eb52d613a > Author: Miquel Raynal <miquel.raynal@bootlin.com> > Date: Thu Apr 3 09:39:05 2025 +0200 > > power-domain: Add refcounting > > It is very surprising that such an uclass, specifically designed to > handle resources that may be shared by different devices, is not keeping > the count of the number of times a power domain has been > enabled/disabled to avoid shutting it down unexpectedly or disabling it > several times. > > Doing this causes troubles on eg. i.MX8MP because disabling power > domains can be done in recursive loops were the same power domain > disabled up to 4 times in a row. PGCs seem to have tight FSM internal > timings to respect and it is easy to produce a race condition that puts > the power domains in an unstable state, leading to ADB400 errors and > later crashes in Linux. > > CI tests using power domains are slightly updated to make sure the count > of on/off calls is even and the results match what we *now* expect. > > As we do not want to break existing users while stile getting > interesting error codes, the implementation is split between: > - a low-level helper reporting error codes if the requested transition > could not be operated, > - a higher-level helper ignoring the "non error" codes, like EALREADY and > EBUSY. > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> > > > reverting this patch, and it works again fine for me! > > Do you have a imx8mp based hardware with pci? > > Can you (or anyone else with an imx8mp based hardware with pci) approve > this on a similiar HW? > > I try to dig into it, but may you have a fast idea! > > Okay, thought about it ... it tries to power on "blk-ctrl@32f10000" > driver:/drivers/power/domain/imx8mp-hsiomix.c > > imx8mp.dtsi: > hsio_blk_ctrl: blk-ctrl@32f10000 { > compatible = "fsl,imx8mp-hsio-blk-ctrl", "syscon"; > > which has several different power domains ... and your patch prevents to > enable more than one of them ... adding some debugs shows: > > u-boot=> pci enum > [...] > power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 priv->on_count: 0 > power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 power_domain->id: 4 > > Here pwoer domain id > > #define IMX8MP_HSIOBLK_PD_PCIE_PHY 4 > > gets enabled and on_count increases... > > imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 > power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 priv->on_count: 0 > power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 power_domain->id: 0 > power_domain_on_lowlevel: ret: 0 > power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 priv->on_count: 0 > power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 power_domain->id: 0 > power_domain_on_lowlevel: ret: 0 > imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 END OKAY > power_domain_on_lowlevel: ret: 0 > power_domain_get_by_index(dev=00000000fe5b2410, power_domain=00000000fe594458) > power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 priv->on_count: 1 > power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 power_domain->id: 3 okay, the "power_domain=00000000fe594458" for the IMX8MP_HSIOBLK_PD_PCIE case is different as the one for the IMX8MP_HSIOBLK_PD_PCIE_PHY case ... so priv pointer should also be different... so that should be fine ... I do not see any debug output where this "power_domain=00000000fe594458" gets enabled (or disabled before) ... I have to debug deeper into it... sorry. But may you have an idea... bye, Heiko > > Now power domain > > #define IMX8MP_HSIOBLK_PD_PCIE 3 > > should be enabled, but since your patch this gets prevented as "priv->on_count: 1" > > imx8_pcie_phy_power_on: ------------ pad_mode: 1 > imx8_pcie_phy_power_on: ------------ pad_mode: 1 ret: -110 > nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. > pcie_dw_imx pcie@33800000: failed to power on PHY > u-boot=> > > I have not a fast solution, may you have an idea? I try to look > after the easter holidays into it. May we need for each power_domain-> id such > an on counter... > > Thanks! > > bye, > Heiko -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: imx8mp: pci enumeration fails with current HEAD 2025-04-17 14:46 ` Heiko Schocher @ 2025-04-17 15:20 ` Miquel Raynal 2025-04-17 15:32 ` Heiko Schocher 0 siblings, 1 reply; 6+ messages in thread From: Miquel Raynal @ 2025-04-17 15:20 UTC (permalink / raw) To: Heiko Schocher; +Cc: Fabio Estevam, u-boot@lists.denx.de On 17/04/2025 at 16:46:13 +02, Heiko Schocher <hs@denx.de> wrote: > Hello Miquel, > > On 17.04.25 16:37, Heiko Schocher wrote: >> Hi Miquel, >> I have here an imx8mp based board for which I just prepare mainlining. >> pci enumeration works fine with U-Boot 2025.04 >> u-boot=> pci enum >> PCIE-0: Link up (Gen1-x1, Bus0) >> u-boot=> >> and tftp through the ethernet interface works fine. >> Using current HEAD >> * 5b4ae0f3f04 - (origin/master, origin/HEAD) mailmap: update my name >> and email (vor 2 Tagen) <Casey Connolly> >> It breaks with: >> u-boot=> pci enum >> nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. >> pcie_dw_imx pcie@33800000: failed to power on PHY >> u-boot=> >> git bisect dropped your patch: >> 197376fbf300e92afa0a1583815d9c9eb52d613a is the first bad commit >> commit 197376fbf300e92afa0a1583815d9c9eb52d613a >> Author: Miquel Raynal <miquel.raynal@bootlin.com> >> Date: Thu Apr 3 09:39:05 2025 +0200 >> power-domain: Add refcounting >> It is very surprising that such an uclass, specifically designed >> to >> handle resources that may be shared by different devices, is not keeping >> the count of the number of times a power domain has been >> enabled/disabled to avoid shutting it down unexpectedly or disabling it >> several times. >> Doing this causes troubles on eg. i.MX8MP because disabling power >> domains can be done in recursive loops were the same power domain >> disabled up to 4 times in a row. PGCs seem to have tight FSM internal >> timings to respect and it is easy to produce a race condition that puts >> the power domains in an unstable state, leading to ADB400 errors and >> later crashes in Linux. >> CI tests using power domains are slightly updated to make sure >> the count >> of on/off calls is even and the results match what we *now* expect. >> As we do not want to break existing users while stile getting >> interesting error codes, the implementation is split between: >> - a low-level helper reporting error codes if the requested transition >> could not be operated, >> - a higher-level helper ignoring the "non error" codes, like EALREADY and >> EBUSY. >> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> >> reverting this patch, and it works again fine for me! >> Do you have a imx8mp based hardware with pci? >> Can you (or anyone else with an imx8mp based hardware with pci) >> approve >> this on a similiar HW? >> I try to dig into it, but may you have a fast idea! >> Okay, thought about it ... it tries to power on "blk-ctrl@32f10000" >> driver:/drivers/power/domain/imx8mp-hsiomix.c >> imx8mp.dtsi: >> hsio_blk_ctrl: blk-ctrl@32f10000 { >> compatible = "fsl,imx8mp-hsio-blk-ctrl", "syscon"; >> which has several different power domains ... and your patch prevents >> to >> enable more than one of them ... adding some debugs shows: >> u-boot=> pci enum >> [...] >> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 priv->on_count: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 power_domain->id: 4 >> Here pwoer domain id >> #define IMX8MP_HSIOBLK_PD_PCIE_PHY 4 >> gets enabled and on_count increases... >> imx8mp_hsiomix_set: ------------ power_domain->id: 4 >> IMX8MP_HSIOBLK_PD_PCIE: 3 4 >> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 priv->on_count: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 power_domain->id: 0 >> power_domain_on_lowlevel: ret: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 priv->on_count: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 power_domain->id: 0 >> power_domain_on_lowlevel: ret: 0 >> imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 END OKAY >> power_domain_on_lowlevel: ret: 0 >> power_domain_get_by_index(dev=00000000fe5b2410, power_domain=00000000fe594458) >> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 priv->on_count: 1 >> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 power_domain->id: 3 > > okay, the "power_domain=00000000fe594458" for the IMX8MP_HSIOBLK_PD_PCIE case > is different as the one for the IMX8MP_HSIOBLK_PD_PCIE_PHY case ... so priv pointer > should also be different... That's part of the problem. The API is unusual, you probably allocate a structure (either on the stack or dynamically) and pass its pointer to be filled by the uclass. In general the approach is that you ask the uclass to give you a pointer towards a unique version of a structure that defines an opaque object you want to manipulate. So I do not think the printing is relevant, what is more relevant is the node the udev targets. In both cases it is blk-ctrl@32f10000, so the second time on_count is already set. > so that should be fine ... I do not see any debug output > where this "power_domain=00000000fe594458" gets enabled (or disabled before) ... I > have to debug deeper into it... sorry. But may you have an idea... Let me know if you think my theory does not stand. Thanks, Miquèl ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: imx8mp: pci enumeration fails with current HEAD 2025-04-17 15:20 ` Miquel Raynal @ 2025-04-17 15:32 ` Heiko Schocher 0 siblings, 0 replies; 6+ messages in thread From: Heiko Schocher @ 2025-04-17 15:32 UTC (permalink / raw) To: Miquel Raynal; +Cc: Fabio Estevam, u-boot@lists.denx.de Hello Miquel, On 17.04.25 17:20, Miquel Raynal wrote: > On 17/04/2025 at 16:46:13 +02, Heiko Schocher <hs@denx.de> wrote: > >> Hello Miquel, >> >> On 17.04.25 16:37, Heiko Schocher wrote: >>> Hi Miquel, >>> I have here an imx8mp based board for which I just prepare mainlining. >>> pci enumeration works fine with U-Boot 2025.04 >>> u-boot=> pci enum >>> PCIE-0: Link up (Gen1-x1, Bus0) >>> u-boot=> >>> and tftp through the ethernet interface works fine. >>> Using current HEAD >>> * 5b4ae0f3f04 - (origin/master, origin/HEAD) mailmap: update my name >>> and email (vor 2 Tagen) <Casey Connolly> >>> It breaks with: >>> u-boot=> pci enum >>> nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. >>> pcie_dw_imx pcie@33800000: failed to power on PHY >>> u-boot=> >>> git bisect dropped your patch: >>> 197376fbf300e92afa0a1583815d9c9eb52d613a is the first bad commit >>> commit 197376fbf300e92afa0a1583815d9c9eb52d613a >>> Author: Miquel Raynal <miquel.raynal@bootlin.com> >>> Date: Thu Apr 3 09:39:05 2025 +0200 >>> power-domain: Add refcounting >>> It is very surprising that such an uclass, specifically designed >>> to >>> handle resources that may be shared by different devices, is not keeping >>> the count of the number of times a power domain has been >>> enabled/disabled to avoid shutting it down unexpectedly or disabling it >>> several times. >>> Doing this causes troubles on eg. i.MX8MP because disabling power >>> domains can be done in recursive loops were the same power domain >>> disabled up to 4 times in a row. PGCs seem to have tight FSM internal >>> timings to respect and it is easy to produce a race condition that puts >>> the power domains in an unstable state, leading to ADB400 errors and >>> later crashes in Linux. >>> CI tests using power domains are slightly updated to make sure >>> the count >>> of on/off calls is even and the results match what we *now* expect. >>> As we do not want to break existing users while stile getting >>> interesting error codes, the implementation is split between: >>> - a low-level helper reporting error codes if the requested transition >>> could not be operated, >>> - a higher-level helper ignoring the "non error" codes, like EALREADY and >>> EBUSY. >>> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> >>> reverting this patch, and it works again fine for me! >>> Do you have a imx8mp based hardware with pci? >>> Can you (or anyone else with an imx8mp based hardware with pci) >>> approve >>> this on a similiar HW? >>> I try to dig into it, but may you have a fast idea! >>> Okay, thought about it ... it tries to power on "blk-ctrl@32f10000" >>> driver:/drivers/power/domain/imx8mp-hsiomix.c >>> imx8mp.dtsi: >>> hsio_blk_ctrl: blk-ctrl@32f10000 { >>> compatible = "fsl,imx8mp-hsio-blk-ctrl", "syscon"; >>> which has several different power domains ... and your patch prevents >>> to >>> enable more than one of them ... adding some debugs shows: >>> u-boot=> pci enum >>> [...] >>> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 priv->on_count: 0 >>> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 power_domain->id: 4 >>> Here pwoer domain id >>> #define IMX8MP_HSIOBLK_PD_PCIE_PHY 4 >>> gets enabled and on_count increases... >>> imx8mp_hsiomix_set: ------------ power_domain->id: 4 >>> IMX8MP_HSIOBLK_PD_PCIE: 3 4 >>> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 priv->on_count: 0 >>> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 power_domain->id: 0 >>> power_domain_on_lowlevel: ret: 0 >>> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 priv->on_count: 0 >>> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 power_domain->id: 0 >>> power_domain_on_lowlevel: ret: 0 >>> imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 END OKAY >>> power_domain_on_lowlevel: ret: 0 >>> power_domain_get_by_index(dev=00000000fe5b2410, power_domain=00000000fe594458) >>> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 priv->on_count: 1 >>> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 power_domain->id: 3 >> >> okay, the "power_domain=00000000fe594458" for the IMX8MP_HSIOBLK_PD_PCIE case >> is different as the one for the IMX8MP_HSIOBLK_PD_PCIE_PHY case ... so priv pointer >> should also be different... > > That's part of the problem. The API is unusual, you probably allocate a > structure (either on the stack or dynamically) and pass its pointer to > be filled by the uclass. > > In general the approach is that you ask the uclass to give you a pointer > towards a unique version of a structure that defines an opaque object > you want to manipulate. > > So I do not think the printing is relevant, what is more relevant is the > node the udev targets. In both cases it is blk-ctrl@32f10000, so the > second time on_count is already set. Yep... >> so that should be fine ... I do not see any debug output >> where this "power_domain=00000000fe594458" gets enabled (or disabled before) ... I >> have to debug deeper into it... sorry. But may you have an idea... > > Let me know if you think my theory does not stand. I am unsure here, and not to deep in this part of the code ... I thought that both have a different priv pointer ... so it should be fine.. but thats also just theory... I have to debug... sorry, that I could not help more here currently. bye, Heiko -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: imx8mp: pci enumeration fails with current HEAD 2025-04-17 14:37 imx8mp: pci enumeration fails with current HEAD Heiko Schocher 2025-04-17 14:46 ` Heiko Schocher @ 2025-04-17 14:54 ` Miquel Raynal 2025-04-17 15:18 ` Heiko Schocher 1 sibling, 1 reply; 6+ messages in thread From: Miquel Raynal @ 2025-04-17 14:54 UTC (permalink / raw) To: Heiko Schocher; +Cc: Fabio Estevam, u-boot@lists.denx.de Hello Heiko, On 17/04/2025 at 16:37:39 +02, Heiko Schocher <hs@denx.de> wrote: > Hi Miquel, > > I have here an imx8mp based board for which I just prepare mainlining. > > pci enumeration works fine with U-Boot 2025.04 > > u-boot=> pci enum > PCIE-0: Link up (Gen1-x1, Bus0) > u-boot=> > > and tftp through the ethernet interface works fine. > > Using current HEAD > > * 5b4ae0f3f04 - (origin/master, origin/HEAD) mailmap: update my name and > email (vor 2 Tagen) <Casey Connolly> > > It breaks with: > > u-boot=> pci enum > nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. > pcie_dw_imx pcie@33800000: failed to power on PHY > u-boot=> > > git bisect dropped your patch: > > 197376fbf300e92afa0a1583815d9c9eb52d613a is the first bad commit > commit 197376fbf300e92afa0a1583815d9c9eb52d613a > Author: Miquel Raynal <miquel.raynal@bootlin.com> > Date: Thu Apr 3 09:39:05 2025 +0200 > > power-domain: Add refcounting > > It is very surprising that such an uclass, specifically designed to > handle resources that may be shared by different devices, is not keeping > the count of the number of times a power domain has been > enabled/disabled to avoid shutting it down unexpectedly or disabling it > several times. > > Doing this causes troubles on eg. i.MX8MP because disabling power > domains can be done in recursive loops were the same power domain > disabled up to 4 times in a row. PGCs seem to have tight FSM internal > timings to respect and it is easy to produce a race condition that puts > the power domains in an unstable state, leading to ADB400 errors and > later crashes in Linux. > > CI tests using power domains are slightly updated to make sure the count > of on/off calls is even and the results match what we *now* expect. > > As we do not want to break existing users while stile getting > interesting error codes, the implementation is split between: > - a low-level helper reporting error codes if the requested transition > could not be operated, > - a higher-level helper ignoring the "non error" codes, like EALREADY and > EBUSY. > > Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> > > > reverting this patch, and it works again fine for me! That's right, this patch made assumptions that are wrong, I am very sorry about that. In my understanding there was a power_domain structure per ID, but in practice there is only one per node, so when two consumer devices point to the same node that has #power-domain-cells = <1> it fails to power up the second power domain. I tested this patch using the imx8mp video pipeline (which works fine), and I wasn't using PCI on it. It's been reported: https://lore.kernel.org/u-boot/20250403-ge-mainline-display-support-v6-0-478b5e3dd872@bootlin.com/T/#m68e9d16204a61450084324b99fd571d95932ece0 https://lore.kernel.org/u-boot/20250403-ge-mainline-display-support-v6-0-478b5e3dd872@bootlin.com/T/#me570681ae3cf6c42cd45912de15cb968af55be28 And the patch is being reverted: https://lore.kernel.org/u-boot/20250417115311.1905411-1-w.egorov@phytec.de/T/#u > Do you have a imx8mp based hardware with pci? I do not have PCI on the board I am using for testing, but now that my attention has been focused on the power domain counting issue I managed to observe the problem by alternately enabling lcdif1 and lcdif2 which share the same power domain node. > Can you (or anyone else with an imx8mp based hardware with pci) approve > this on a similiar HW? > > I try to dig into it, but may you have a fast idea! > > Okay, thought about it ... it tries to power on "blk-ctrl@32f10000" > driver:/drivers/power/domain/imx8mp-hsiomix.c > > imx8mp.dtsi: > hsio_blk_ctrl: blk-ctrl@32f10000 { > compatible = "fsl,imx8mp-hsio-blk-ctrl", "syscon"; > > which has several different power domains ... and your patch prevents to > enable more than one of them ... adding some debugs shows: > > u-boot=> pci enum > [...] > power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 priv->on_count: 0 > power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 power_domain->id: 4 > > Here pwoer domain id > > #define IMX8MP_HSIOBLK_PD_PCIE_PHY 4 > > gets enabled and on_count increases... > > imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 > power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 priv->on_count: 0 > power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 power_domain->id: 0 > power_domain_on_lowlevel: ret: 0 > power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 priv->on_count: 0 > power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 power_domain->id: 0 > power_domain_on_lowlevel: ret: 0 > imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 END OKAY > power_domain_on_lowlevel: ret: 0 > power_domain_get_by_index(dev=00000000fe5b2410, power_domain=00000000fe594458) > power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 priv->on_count: 1 > power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 power_domain->id: 3 > > Now power domain > > #define IMX8MP_HSIOBLK_PD_PCIE 3 > > should be enabled, but since your patch this gets prevented as "priv->on_count: 1" > > imx8_pcie_phy_power_on: ------------ pad_mode: 1 > imx8_pcie_phy_power_on: ------------ pad_mode: 1 ret: -110 > nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. > pcie_dw_imx pcie@33800000: failed to power on PHY > u-boot=> > > I have not a fast solution, may you have an idea? I try to look > after the easter holidays into it. May we need for each power_domain-> id such > an on counter... I fully agree with your observations and conclusion, I was hoping for an easy fix but the fact that the power domain uclass would not retain any useful data nor know how many sub-domains there are is a bit difficult to workaround. This needs to be thought deeper, I will report whenever I have an idea how to do it, for now you can revert the patch locally (or wait for the upstream proposal to get in), and do not hesitate to share your findings if you have an idea on how to handle that properly. Sorry for the breakage, Miquèl ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: imx8mp: pci enumeration fails with current HEAD 2025-04-17 14:54 ` Miquel Raynal @ 2025-04-17 15:18 ` Heiko Schocher 0 siblings, 0 replies; 6+ messages in thread From: Heiko Schocher @ 2025-04-17 15:18 UTC (permalink / raw) To: Miquel Raynal; +Cc: Fabio Estevam, u-boot@lists.denx.de Hello Miquel, thanks for the fast response! On 17.04.25 16:54, Miquel Raynal wrote: > Hello Heiko, > > On 17/04/2025 at 16:37:39 +02, Heiko Schocher <hs@denx.de> wrote: > >> Hi Miquel, >> >> I have here an imx8mp based board for which I just prepare mainlining. >> >> pci enumeration works fine with U-Boot 2025.04 >> >> u-boot=> pci enum >> PCIE-0: Link up (Gen1-x1, Bus0) >> u-boot=> >> >> and tftp through the ethernet interface works fine. >> >> Using current HEAD >> >> * 5b4ae0f3f04 - (origin/master, origin/HEAD) mailmap: update my name and >> email (vor 2 Tagen) <Casey Connolly> >> >> It breaks with: >> >> u-boot=> pci enum >> nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. >> pcie_dw_imx pcie@33800000: failed to power on PHY >> u-boot=> >> >> git bisect dropped your patch: >> >> 197376fbf300e92afa0a1583815d9c9eb52d613a is the first bad commit >> commit 197376fbf300e92afa0a1583815d9c9eb52d613a >> Author: Miquel Raynal <miquel.raynal@bootlin.com> >> Date: Thu Apr 3 09:39:05 2025 +0200 >> >> power-domain: Add refcounting >> >> It is very surprising that such an uclass, specifically designed to >> handle resources that may be shared by different devices, is not keeping >> the count of the number of times a power domain has been >> enabled/disabled to avoid shutting it down unexpectedly or disabling it >> several times. >> >> Doing this causes troubles on eg. i.MX8MP because disabling power >> domains can be done in recursive loops were the same power domain >> disabled up to 4 times in a row. PGCs seem to have tight FSM internal >> timings to respect and it is easy to produce a race condition that puts >> the power domains in an unstable state, leading to ADB400 errors and >> later crashes in Linux. >> >> CI tests using power domains are slightly updated to make sure the count >> of on/off calls is even and the results match what we *now* expect. >> >> As we do not want to break existing users while stile getting >> interesting error codes, the implementation is split between: >> - a low-level helper reporting error codes if the requested transition >> could not be operated, >> - a higher-level helper ignoring the "non error" codes, like EALREADY and >> EBUSY. >> >> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> >> >> >> reverting this patch, and it works again fine for me! > > That's right, this patch made assumptions that are wrong, I am very > sorry about that. In my understanding there was a power_domain structure > per ID, but in practice there is only one per node, so when two consumer > devices point to the same node that has #power-domain-cells = <1> it > fails to power up the second power domain. I tested this patch using the > imx8mp video pipeline (which works fine), and I wasn't using PCI on it. > > It's been reported: > https://lore.kernel.org/u-boot/20250403-ge-mainline-display-support-v6-0-478b5e3dd872@bootlin.com/T/#m68e9d16204a61450084324b99fd571d95932ece0 > https://lore.kernel.org/u-boot/20250403-ge-mainline-display-support-v6-0-478b5e3dd872@bootlin.com/T/#me570681ae3cf6c42cd45912de15cb968af55be28 > And the patch is being reverted: > https://lore.kernel.org/u-boot/20250417115311.1905411-1-w.egorov@phytec.de/T/#u Ah, thanks for the info! >> Do you have a imx8mp based hardware with pci? > > I do not have PCI on the board I am using for testing, but now that my > attention has been focused on the power domain counting issue I managed > to observe the problem by alternately enabling lcdif1 and lcdif2 which > share the same power domain node. Yep, should show the problem too... >> Can you (or anyone else with an imx8mp based hardware with pci) approve >> this on a similiar HW? >> >> I try to dig into it, but may you have a fast idea! >> >> Okay, thought about it ... it tries to power on "blk-ctrl@32f10000" >> driver:/drivers/power/domain/imx8mp-hsiomix.c >> >> imx8mp.dtsi: >> hsio_blk_ctrl: blk-ctrl@32f10000 { >> compatible = "fsl,imx8mp-hsio-blk-ctrl", "syscon"; >> >> which has several different power domains ... and your patch prevents to >> enable more than one of them ... adding some debugs shows: >> >> u-boot=> pci enum >> [...] >> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 priv->on_count: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5942f8) blk-ctrl@32f10000 power_domain->id: 4 >> >> Here pwoer domain id >> >> #define IMX8MP_HSIOBLK_PD_PCIE_PHY 4 >> >> gets enabled and on_count increases... >> >> imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 >> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 priv->on_count: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5d2408) power-domain@17 power_domain->id: 0 >> power_domain_on_lowlevel: ret: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 priv->on_count: 0 >> power_domain_on_lowlevel(power_domain=00000000fe5d2480) power-domain@1 power_domain->id: 0 >> power_domain_on_lowlevel: ret: 0 >> imx8mp_hsiomix_set: ------------ power_domain->id: 4 IMX8MP_HSIOBLK_PD_PCIE: 3 4 END OKAY >> power_domain_on_lowlevel: ret: 0 >> power_domain_get_by_index(dev=00000000fe5b2410, power_domain=00000000fe594458) >> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 priv->on_count: 1 >> power_domain_on_lowlevel(power_domain=00000000fe594458) blk-ctrl@32f10000 power_domain->id: 3 >> >> Now power domain >> >> #define IMX8MP_HSIOBLK_PD_PCIE 3 >> >> should be enabled, but since your patch this gets prevented as "priv->on_count: 1" >> >> imx8_pcie_phy_power_on: ------------ pad_mode: 1 >> imx8_pcie_phy_power_on: ------------ pad_mode: 1 ret: -110 >> nxp_imx8_pcie_phy pcie-phy@32f00000: PHY: Failed to power on pcie-phy@32f00000: -110. >> pcie_dw_imx pcie@33800000: failed to power on PHY >> u-boot=> >> >> I have not a fast solution, may you have an idea? I try to look >> after the easter holidays into it. May we need for each power_domain-> id such >> an on counter... > > I fully agree with your observations and conclusion, I was hoping for an > easy fix but the fact that the power domain uclass would not retain any > useful data nor know how many sub-domains there are is a bit difficult > to workaround. > > This needs to be thought deeper, I will report whenever I have an idea > how to do it, for now you can revert the patch locally (or wait for the > upstream proposal to get in), and do not hesitate to share your findings > if you have an idea on how to handle that properly. Yep, I have to debug ito it... may there is an easy solution. > Sorry for the breakage, No problem! bye, Heiko -- DENX Software Engineering GmbH, Managing Director: Erika Unter HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-52 Fax: +49-8142-66989-80 Email: hs@denx.de ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-04-17 15:35 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-17 14:37 imx8mp: pci enumeration fails with current HEAD Heiko Schocher 2025-04-17 14:46 ` Heiko Schocher 2025-04-17 15:20 ` Miquel Raynal 2025-04-17 15:32 ` Heiko Schocher 2025-04-17 14:54 ` Miquel Raynal 2025-04-17 15:18 ` Heiko Schocher
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.