* sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Brian Masney @ 2024-01-04 2:09 UTC
To: linux-arm-msm, Bartosz Golaszewski, Eric Chanudet, Shazad Hussain,
Prasad Sodagudi
Right now when we boot the RideSX4 (sa8775p) board on linux-next with
a quiet kernel log (specifically loglevel=4 - warning) about 50% of
the boots fail since UFS cannot be mounted. Changing the loglevel to
5 (notice) or higher to show more logging makes the race condition
and error go away. I tracked the error down to the following:
- The ice driver fails to probe due to the error
  "gcc_ufs_phy_ice_core_clk status stuck at 'off'", returns
  -EBUSY, and is not retried. As a result, platform_set_drvdata()
  is never called.
- The qcom UFS host driver calls of_qcom_ice_get(); however, this
  will always return -EPROBE_DEFER since the ice probe failed
  and platform_get_drvdata() always returns NULL.
Here are the relevant log messages that I was able to get from a failed
boot once I configured dracut to time out:

    gcc_ufs_phy_ice_core_clk status stuck at 'off'
    qcom-ice: probe of 1d88000.crypto failed with error -16
    ufshcd-qcom 1d84000.ufs: Cannot get ice instance from 1d88000.crypto
    ufshcd-qcom 1d84000.ufs: Cannot get ice instance from 1d88000.crypto
    platform 1d84000.ufs: deferred probe pending: ufshcd-qcom: ufshcd_pltfrm_init() failed

I assume that there's some kind of vote (icc, clk, regulator, etc.)
that's missing from the ice driver, and another driver is performing
the necessary votes. However, I don't have access to the hardware docs
to tell if that's the case. Can someone who has access take a look? I
can post patch(es) if someone can point me to what needs to be configured.
Brian
* Re: sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Shazad Hussain @ 2024-01-08 18:14 UTC
To: Brian Masney, linux-arm-msm, Bartosz Golaszewski, Eric Chanudet,
Prasad Sodagudi
On 1/4/2024 7:39 AM, Brian Masney wrote:
> Right now when we boot the RideSX4 (sa8775p) board on linux-next with
> a quiet kernel log (specifically loglevel=4 - warning) about 50% of
> the boots fail since UFS cannot be mounted. Changing the loglevel to
> 5 (notice) or higher to show more logging makes the race condition
> and error go away. I tracked the error down to the following:
>
> - The ice driver fails to probe due to the error
> "gcc_ufs_phy_ice_core_clk status stuck at 'off'" and returns
> -EBUSY and is not retried. platform_set_drvdata() is never
> called as expected.
>
> - The qcom UFS host driver calls of_qcom_ice_get(), however this
> will always return -EPROBE_DEFER since the ice probe failed,
> and platform_get_drvdata() is always null.
>
> Here's the relevant log messages that I was able to get from a failed
> boot once I configured dracut to time out:
>
> gcc_ufs_phy_ice_core_clk status stuck at 'off'
> qcom-ice: probe of 1d88000.crypto failed with error -16
> ufshcd-qcom 1d84000.ufs: Cannot get ice instance from 1d88000.crypto
> ufshcd-qcom 1d84000.ufs: Cannot get ice instance from 1d88000.crypto
> platform 1d84000.ufs: deferred probe pending: ufshcd-qcom: ufshcd_pltfrm_init() failed
>
> I assume that there's some kind of vote (icc, clk, regulator, etc)
> that's missing from the ice driver, and another driver is performing
> the necessary votes. However, I don't have access to the hardware docs
> to tell if that's the case. Can someone that has access take a look? I
> can post patch(es) if someone can point me to what needs configured.
>
> Brian
>
Hi Brian,
I can see that gcc_ufs_phy_ice_core_clk needs the gcc_ufs_phy_gdsc to be
enabled before this particular clk is enabled. However, I do not see
that required power-domain in the ice DT node, which could cause this
problem.
-Shazad
* Re: sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Brian Masney @ 2024-01-08 20:50 UTC
To: Shazad Hussain
Cc: linux-arm-msm, Bartosz Golaszewski, Eric Chanudet,
Prasad Sodagudi
On Mon, Jan 08, 2024 at 11:44:35PM +0530, Shazad Hussain wrote:
> I can see that gcc_ufs_phy_ice_core_clk needs the gcc_ufs_phy_gdsc to be
> enabled before this particular clk is enabled. But that required
> power-domain I do not see in the ice DT node. That can cause this
> problem.
Thank you! I'll work on and post a patch set as I find free time over
the next week or two.
Brian
* Re: sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Elliot Berman @ 2024-01-08 23:35 UTC
To: Brian Masney, Shazad Hussain
Cc: linux-arm-msm, Bartosz Golaszewski, Eric Chanudet,
Prasad Sodagudi, Neil Armstrong
On 1/8/2024 12:50 PM, Brian Masney wrote:
> On Mon, Jan 08, 2024 at 11:44:35PM +0530, Shazad Hussain wrote:
>> I can see that gcc_ufs_phy_ice_core_clk needs the gcc_ufs_phy_gdsc to be
>> enabled before this particular clk is enabled. But that required
>> power-domain I do not see in the ice DT node. That can cause this
>> problem.
>
> Thank you! I'll work on and post a patch set as I find free time over
> the next week or two.
I think I observe the same issue on sm8650. Symptoms seem to be the same as
you've described. I'll test out the following diff and see if things
seem more reliable:

diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
index fd4f9dac48a3..c9ea50834dc9 100644
--- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
@@ -2526,6 +2526,7 @@ ice: crypto@1d88000 {
 				     "qcom,inline-crypto-engine";
 			reg = <0 0x01d88000 0 0x8000>;

+			power-domains = <&gcc UFS_PHY_GDSC>;
 			clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
 		};

If yes, I can post a patch for sm8650 if no one else has yet.
* Re: sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Brian Masney @ 2024-01-09 21:44 UTC
To: Elliot Berman, Shazad Hussain
Cc: linux-arm-msm, Bartosz Golaszewski, Eric Chanudet,
Prasad Sodagudi, Neil Armstrong
On Mon, Jan 08, 2024 at 03:35:55PM -0800, Elliot Berman wrote:
> On 1/8/2024 12:50 PM, Brian Masney wrote:
> > On Mon, Jan 08, 2024 at 11:44:35PM +0530, Shazad Hussain wrote:
> >> I can see that gcc_ufs_phy_ice_core_clk needs the gcc_ufs_phy_gdsc to be
> >> enabled before this particular clk is enabled. But that required
> >> power-domain I do not see in the ice DT node. That can cause this
> >> problem.
> >
> > Thank you! I'll work on and post a patch set as I find free time over
> > the next week or two.
> I think I observe the same issue on sm8650. Symptoms seem to be same as
> you've described. I'll test out the following diff and see if things
> seem more reliable:
>
> diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> index fd4f9dac48a3..c9ea50834dc9 100644
> --- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> @@ -2526,6 +2526,7 @@ ice: crypto@1d88000 {
> "qcom,inline-crypto-engine";
> reg = <0 0x01d88000 0 0x8000>;
>
> + power-domains = <&gcc UFS_PHY_GDSC>;
> clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
> };
>
>
> If yes, I can post a patch for sm8650 if no else has yet.
The intermittent boot issue is still present against
linux-next-20240109 with the following patch:

--- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi
+++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
@@ -1556,6 +1556,7 @@ ice: crypto@1d88000 {
 			compatible = "qcom,sa8775p-inline-crypto-engine",
 				     "qcom,inline-crypto-engine";
 			reg = <0x0 0x01d88000 0x0 0x8000>;
+			power-domains = <&gcc UFS_PHY_GDSC>;
 			clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
 		};

Based on digging through the power domain code, I also added
"required-opps = <&rpmhpd_opp_nom>;" to match what the UFS host
controller has, and those tests fail as well.
Shazad: Any other suggestions for other resources that should also be
referenced on the ice node?
Brian
* Re: sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Elliot Berman @ 2024-01-09 21:56 UTC
To: Brian Masney, Shazad Hussain
Cc: linux-arm-msm, Bartosz Golaszewski, Eric Chanudet,
Prasad Sodagudi, Neil Armstrong
On 1/9/2024 1:44 PM, Brian Masney wrote:
> On Mon, Jan 08, 2024 at 03:35:55PM -0800, Elliot Berman wrote:
>> On 1/8/2024 12:50 PM, Brian Masney wrote:
>>> On Mon, Jan 08, 2024 at 11:44:35PM +0530, Shazad Hussain wrote:
>>>> I can see that gcc_ufs_phy_ice_core_clk needs the gcc_ufs_phy_gdsc to be
>>>> enabled before this particular clk is enabled. But that required
>>>> power-domain I do not see in the ice DT node. That can cause this
>>>> problem.
>>>
>>> Thank you! I'll work on and post a patch set as I find free time over
>>> the next week or two.
>> I think I observe the same issue on sm8650. Symptoms seem to be same as
>> you've described. I'll test out the following diff and see if things
>> seem more reliable:
>>
>> diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
>> index fd4f9dac48a3..c9ea50834dc9 100644
>> --- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
>> +++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
>> @@ -2526,6 +2526,7 @@ ice: crypto@1d88000 {
>> "qcom,inline-crypto-engine";
>> reg = <0 0x01d88000 0 0x8000>;
>>
>> + power-domains = <&gcc UFS_PHY_GDSC>;
>> clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
>> };
>>
>>
>> If yes, I can post a patch for sm8650 if no else has yet.
>
> The intermittent boot issue is still present against
> linux-next-20240109 with the following patch:
>
> --- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
> @@ -1556,6 +1556,7 @@ ice: crypto@1d88000 {
> compatible = "qcom,sa8775p-inline-crypto-engine",
> "qcom,inline-crypto-engine";
> reg = <0x0 0x01d88000 0x0 0x8000>;
> + power-domains = <&gcc UFS_PHY_GDSC>;
> clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
> };
>
Things have been a bit more reliable for me after adding the power-domains.
Are you getting stuck at the same spot or somewhere else?
I've been looking at a similar issue to [1], so I wonder if maybe you're
facing that instead.
[1]: https://lore.kernel.org/linux-arm-msm/20240104101735.48694-1-laura.nao@collabora.com/T/#m39f7c80b59c750ee4c0082474c5c15b6055927ef
* Re: sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Brian Masney @ 2024-01-09 23:45 UTC
To: Elliot Berman
Cc: Shazad Hussain, linux-arm-msm, Bartosz Golaszewski, Eric Chanudet,
Prasad Sodagudi, Neil Armstrong
On Tue, Jan 09, 2024 at 01:56:30PM -0800, Elliot Berman wrote:
>
>
> On 1/9/2024 1:44 PM, Brian Masney wrote:
> > On Mon, Jan 08, 2024 at 03:35:55PM -0800, Elliot Berman wrote:
> >> On 1/8/2024 12:50 PM, Brian Masney wrote:
> >>> On Mon, Jan 08, 2024 at 11:44:35PM +0530, Shazad Hussain wrote:
> >>>> I can see that gcc_ufs_phy_ice_core_clk needs the gcc_ufs_phy_gdsc to be
> >>>> enabled before this particular clk is enabled. But that required
> >>>> power-domain I do not see in the ice DT node. That can cause this
> >>>> problem.
> >>>
> >>> Thank you! I'll work on and post a patch set as I find free time over
> >>> the next week or two.
> >> I think I observe the same issue on sm8650. Symptoms seem to be same as
> >> you've described. I'll test out the following diff and see if things
> >> seem more reliable:
> >>
> >> diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> >> index fd4f9dac48a3..c9ea50834dc9 100644
> >> --- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
> >> +++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
> >> @@ -2526,6 +2526,7 @@ ice: crypto@1d88000 {
> >> "qcom,inline-crypto-engine";
> >> reg = <0 0x01d88000 0 0x8000>;
> >>
> >> + power-domains = <&gcc UFS_PHY_GDSC>;
> >> clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
> >> };
> >>
> >>
> >> If yes, I can post a patch for sm8650 if no else has yet.
> >
> > The intermittent boot issue is still present against
> > linux-next-20240109 with the following patch:
> >
> > --- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi
> > @@ -1556,6 +1556,7 @@ ice: crypto@1d88000 {
> > compatible = "qcom,sa8775p-inline-crypto-engine",
> > "qcom,inline-crypto-engine";
> > reg = <0x0 0x01d88000 0x0 0x8000>;
> > + power-domains = <&gcc UFS_PHY_GDSC>;
> > clocks = <&gcc GCC_UFS_PHY_ICE_CORE_CLK>;
> > };
> >
>
> Things have been a bit more reliable for me after adding the power-domains.
>
> Are you getting stuck at the same spot or somewhere else?
>
> I've been looking at a similar issue to [1], so I wonder if maybe you're
> facing that instead.
>
> [1]: https://lore.kernel.org/linux-arm-msm/20240104101735.48694-1-laura.nao@collabora.com/T/#m39f7c80b59c750ee4c0082474c5c15b6055927ef
So it could be that issue that I'm also encountering. Previously I
could configure a timeout on dracut and it would drop me to a shell
when the system failed to boot. That's how I was able to get the
dmesg for the ice error. However, dracut did not always time out, and
when that happened the system wouldn't respond over the serial console.
Now the boot still hangs for me about 50% of the time, however I have
not been able to get dracut to time out after probably 20 reboots. I
have magic sysrq enabled in my kernel, however I haven't been able to
get it to trigger when going through Beaker. Let me ask internally about
sysrq to see if I can get an interesting stack dump.
If I boot with the standard verbose logging, then the race condition
doesn't occur and -next boots fine for me.
Brian
* Re: sa8755p ufs ice bug: gcc_ufs_phy_ice_core_clk status stuck at 'off'
From: Brian Masney @ 2024-01-15 15:47 UTC
To: Elliot Berman
Cc: Shazad Hussain, linux-arm-msm, Bartosz Golaszewski, Eric Chanudet,
Prasad Sodagudi, Neil Armstrong
On Tue, Jan 09, 2024 at 01:56:30PM -0800, Elliot Berman wrote:
> Things have been a bit more reliable for me after adding the power-domains.
>
> Are you getting stuck at the same spot or somewhere else?
>
> I've been looking at a similar issue to [1], so I wonder if maybe you're
> facing that instead.
>
> [1]: https://lore.kernel.org/linux-arm-msm/20240104101735.48694-1-laura.nao@collabora.com/T/#m39f7c80b59c750ee4c0082474c5c15b6055927ef
I had some time Friday to set up some automation to bisect this.
Unfortunately, I can't reproduce the hang on an sa8775p with the
upstream ARM64 defconfig on linux-next-20240112. I can reproduce it
using Fedora's arm64 defconfig with the same version of linux-next.
We have a lab move going on this week and our sa8775p boards will be
unavailable. I'll start a bisect with the Fedora config once our boards
are available again.
Brian