From: Niklas Cassel <niklas.cassel@linaro.org>
To: Paolo Pisati <p.pisati@gmail.com>
Cc: linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: msm8996: qcom-qmp: apq8096-db820c fails to boot, reset back to fastboot and locks up
Date: Tue, 11 Jun 2019 19:12:25 +0200
Message-ID: <20190611171225.GA21992@centauri.ideon.se>
In-Reply-To: <20190610134401.GA12964@harukaze>

On Mon, Jun 10, 2019 at 03:44:01PM +0200, Paolo Pisati wrote:
> From time to time, my apq8096-db820c fails to boot to userspace, reset back to
> fastboot and locks up: to easily reproduce the issue, i'm boot looping using a
> cron job with a 1 min reboot entry on the board while leaving a "while 1; do
> fastboot boot boot.img; done" on the host pc.
> 
> The issue is present in mainline up to 5.2-rc4, using defconfig and:
> 
> CONFIG_SCSI_UFS_QCOM=y
> CONFIG_PHY_QCOM_QMP=y
> CONFIG_PHY_QCOM_UFS=y
> 
> but was present in previous releases too (e.g. 4.14., 4.19, etc qcom-lt or
> mainline), where it's even easier to reproduce (e.g. takes way less reboots to
> trigger it).

Hello Paolo,

I have a guess as to what is going on.
The db820c has 3 PCIe controllers
that share a single QMP block (which has clocks, regulators, and resets).
The QMP block has 3 PCIe PHYs, which have their own clocks and resets.

> 
> These are the last lines printed out:
> ...
> [    7.407209] qcom-qmp-phy 34000.phy: Registered Qcom-QMP phy
> [    7.448058] qcom-qmp-phy 7410000.phy: Registered Qcom-QMP phy
> [    7.461859] ufs_qcom_phy_qmp_14nm 627000.phy: invalid resource
> [    7.535434] qcom-qmp-phy 34000.phy: phy common block init timed-out

^^ this is the phy_init() called from pcie-qcom.c,
which ends up calling qcom_qmp_phy_enable(),

which has this code:

        ret = qcom_qmp_phy_com_init(qphy);
        if (ret)
                return ret;

qcom_qmp_phy_com_init() has this code:

        if (qmp->init_count++) {
                mutex_unlock(&qmp->phy_mutex);
                return 0;
        }

qcom_qmp_phy_com_init() later fails
because the common block init times out, so the qmp driver
disables clocks, asserts resets, and disables regulators.
By then, however, init_count has already been incremented.


> [    7.538596] phy phy-34000.phy.0: phy init failed --> -110
> [    7.550891] qcom-pcie: probe of 600000.pcie failed with error -110

^^ here the first PCIe controller instance fails to probe

> [    7.619008] qcom-pcie 608000.pcie: 608000.pcie supply vddpe-3v3 not found,
> using dummy regulator

^^ here the second PCIe controller is probed.

It will call phy_init(),

which will again call qcom_qmp_phy_enable(), which will call
qcom_qmp_phy_com_init(),

where this code:

        if (qmp->init_count++) {
                mutex_unlock(&qmp->phy_mutex);
                return 0;
        }

will now return 0, because init_count is already non-zero,

so clocks will never be enabled, resets never deasserted, and regulators
never enabled.

Since qcom_qmp_phy_com_init() returns success in this case,
qcom_qmp_phy_enable() will try to continue with the init,
and writing to disabled hardware is usually not a good idea.
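
To make the sequence concrete, here is a rough user-space model of it. This is
only a sketch, not the driver code: struct model_qmp, model_com_init() and the
hw_times_out flag are made up for illustration, and clocks, resets and
regulators are folded into a single hw_enabled flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared QMP common block. */
struct model_qmp {
	int init_count;    /* how many PHYs think the block is initialized */
	bool hw_enabled;   /* clocks, resets and regulators as one flag */
	bool hw_times_out; /* simulate "phy common block init timed-out" */
};

/* Follows the shape of the quoted init_count check, nothing more. */
static int model_com_init(struct model_qmp *qmp)
{
	assert(qmp != NULL);

	if (qmp->init_count++)
		return 0; /* block is assumed to be up already */

	qmp->hw_enabled = true;
	if (qmp->hw_times_out) {
		/* Error path: power down again, but init_count stays at 1. */
		qmp->hw_enabled = false;
		return -ETIMEDOUT;
	}
	return 0;
}
```

Calling model_com_init() twice on a block with hw_times_out set returns
-ETIMEDOUT the first time and 0 the second time, while hw_enabled is false
after both calls.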

I think the proper fix for this is:

diff --git a/drivers/phy/qualcomm/phy-qcom-qmp.c b/drivers/phy/qualcomm/phy-qcom-qmp.c
index cd91b4179b10..22352e3b0ec5 100644
--- a/drivers/phy/qualcomm/phy-qcom-qmp.c
+++ b/drivers/phy/qualcomm/phy-qcom-qmp.c
@@ -1490,7 +1490,7 @@ static int qcom_qmp_phy_enable(struct phy *phy)
 
        ret = qcom_qmp_phy_com_init(qphy);
        if (ret)
-               return ret;
+               goto err_lane_rst;
 
        if (cfg->has_lane_rst) {
                ret = reset_control_deassert(qphy->lane_rst);
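
The intent of the change can be sketched in user space as follows. This is a
model under assumptions, not the driver: it assumes that the err_lane_rst
label leads to a common-block exit that drops one init_count reference (the
body of that label is not shown in this mail), and all sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the shared common block. */
struct sketch_qmp {
	int init_count;
	bool hw_enabled;   /* clocks, resets and regulators as one flag */
	bool hw_times_out; /* simulate the common block init timeout */
};

static int sketch_com_init(struct sketch_qmp *qmp)
{
	if (qmp->init_count++)
		return 0;

	qmp->hw_enabled = true;
	if (qmp->hw_times_out) {
		qmp->hw_enabled = false; /* count is not dropped here */
		return -ETIMEDOUT;
	}
	return 0;
}

/* Assumed exit behaviour: drop one reference, power down at zero. */
static void sketch_com_exit(struct sketch_qmp *qmp)
{
	assert(qmp->init_count > 0);
	if (--qmp->init_count)
		return;
	qmp->hw_enabled = false;
}

static int sketch_phy_enable(struct sketch_qmp *qmp)
{
	int ret;

	ret = sketch_com_init(qmp);
	if (ret)
		goto err_lane_rst; /* was: return ret; */

	/* Lane reset, pipe clock and PCS ready polling are elided. */
	return 0;

err_lane_rst:
	sketch_com_exit(qmp);
	return ret;
}
```

In this model, a failed sketch_phy_enable() leaves init_count at 0, so a later
call goes through the full init again instead of taking the early return.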



Kind regards,
Niklas

> 
> Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
> S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
> S - IMAGE_VARIANT_STRING=M8996LAB
> S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
> S - Boot Interface: UFS
> S - Secure Boot: Off
> ...
> 
> Full boot here: https://pastebin.ubuntu.com/p/rtjVrD3yzk/
> 
> Any idea what is going on? Am i doing something wrong?
> -- 
> bye,
> p.

