From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
To: Loic Poulain <loic.poulain@linaro.org>
Cc: Hemant Kumar <hemantk@codeaurora.org>,
linux-arm-msm <linux-arm-msm@vger.kernel.org>,
Bhaumik Bhatt <bbhatt@codeaurora.org>
Subject: Re: [PATCH v5 06/10] mhi: pci_generic: Add suspend/resume/recovery procedure
Date: Wed, 23 Dec 2020 15:11:10 +0530 [thread overview]
Message-ID: <20201223094110.GA2644@thinkpad> (raw)
In-Reply-To: <CAMZdPi8kDLwqKBUjjkA2mkBpnj=AB53itEU=nObVXDVK+2jqYg@mail.gmail.com>
On Wed, Dec 23, 2020 at 09:25:37AM +0100, Loic Poulain wrote:
> Hi Mani,
>
> On Tue, 22 Dec 2020 at 18:05, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
[...]
> > > +
> > > + /* Check if we can recover without full reset */
> > > + pci_set_power_state(pdev, PCI_D0);
> > > + pci_load_saved_state(pdev, mhi_pdev->pci_state);
> > > + pci_restore_state(pdev);
> >
> > These pci state settings seems redundant with resume().
> >
> > In this function you should first check if MHI is alive, if yes then do
> > power up. Else you should just exit.
>
> Recovery is not only executed on a resume but also when a crash or
> reboot is detected, that why we need to restore PCI state here.
> Moreover, contrary to resume, the restored PCI state is not the one
> saved in suspend, but the known working (and saved) initial pci state
> (mhi_pdev->pci_state).
>
Ah I missed it!
> >
> > > +
> > > + if (!mhi_pci_is_alive(mhi_cntrl))
> > > + goto err_try_reset;
> > > +
> > > + err = mhi_prepare_for_power_up(mhi_cntrl);
> > > + if (err)
> > > + goto err_try_reset;
> > > +
> > > + err = mhi_sync_power_up(mhi_cntrl);
> > > + if (err)
> > > + goto err_unprepare;
> >
> > Add a debug log for recovery success.
>
> Yes, will do.
>
> >
> > > +
> > > + set_bit(MHI_PCI_DEV_STARTED, &mhi_pdev->status);
> > > + return;
> > > +
> > > +err_unprepare:
> > > + mhi_unprepare_after_power_down(mhi_cntrl);
> > > +err_try_reset:
> > > + if (pci_reset_function(pdev))
> >
> > So if the device recovers, who will make sure reinitializing the MHI
> > controller? That's why I think we should convey the recovery result to
> > PM core. I don't think using workqueue here is a good idea.
>
> The mhi controller is reinitialized in the recovery work itself.
> Recovery can be a long process, and play with device
> registering/deregistering. We can not do that synchronously in the
> system resume path since it causes unexpected resume latency (this is
> actually no more a resume but a complete reset), moving it
> synchronously in resume cause hang on my side. However I agree that
> the PM core should be informed about the resume failure, so instead of
> unconditionally returning success in the resume callback I'm going to
> forward the error to PM core (and trigger recovery in parallel).
>
okay.
> >
> > > + dev_err(&pdev->dev, "Recovery failed\n");
> > > +}
> > > +
> > > static int mhi_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > {
> > > const struct mhi_pci_dev_info *info = (struct mhi_pci_dev_info *) id->driver_data;
> > > @@ -327,6 +371,8 @@ static int mhi_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > > if (!mhi_pdev)
> > > return -ENOMEM;
> > >
> > > + INIT_WORK(&mhi_pdev->recovery_work, mhi_pci_recovery_work);
> > > +
> > > mhi_cntrl_config = info->config;
> > > mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > >
> > > @@ -391,6 +437,8 @@ static void mhi_pci_remove(struct pci_dev *pdev)
> > > struct mhi_pci_device *mhi_pdev = pci_get_drvdata(pdev);
> > > struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > >
> > > + cancel_work_sync(&mhi_pdev->recovery_work);
> > > +
> > > if (test_and_clear_bit(MHI_PCI_DEV_STARTED, &mhi_pdev->status)) {
> > > mhi_power_down(mhi_cntrl, true);
> > > mhi_unprepare_after_power_down(mhi_cntrl);
> > > @@ -456,12 +504,66 @@ static const struct pci_error_handlers mhi_pci_err_handler = {
> > > .reset_done = mhi_pci_reset_done,
> > > };
> > >
> > > +static int __maybe_unused mhi_pci_suspend(struct device *dev)
> > > +{
> > > + struct pci_dev *pdev = to_pci_dev(dev);
> > > + struct mhi_pci_device *mhi_pdev = dev_get_drvdata(dev);
> > > + struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > > +
> > > + cancel_work_sync(&mhi_pdev->recovery_work);
> > > +
> > > + /* Transition to M3 state */
> > > + mhi_pm_suspend(mhi_cntrl);
> > > +
> > > + pci_save_state(pdev);
> > > + pci_disable_device(pdev);
> > > + pci_wake_from_d3(pdev, true);
> > > + pci_set_power_state(pdev, PCI_D3hot);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +static int __maybe_unused mhi_pci_resume(struct device *dev)
> > > +{
> > > + struct pci_dev *pdev = to_pci_dev(dev);
> > > + struct mhi_pci_device *mhi_pdev = dev_get_drvdata(dev);
> > > + struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > > + int err;
> > > +
> > > + pci_set_power_state(pdev, PCI_D0);
> > > + pci_restore_state(pdev);
> > > + pci_set_master(pdev);
> > > +
> > > + err = pci_enable_device(pdev);
> > > + if (err)
> > > + goto err_recovery;
> > > +
> > > + /* Exit M3, transition to M0 state */
> > > + err = mhi_pm_resume(mhi_cntrl);
> > > + if (err) {
> > > + dev_err(&pdev->dev, "failed to resume device: %d\n", err);
> > > + goto err_recovery;
> > > + }
> > > +
> > > + return 0;
> > > +
> > > +err_recovery:
> > > + /* The device may have loose power or crashed, try recovering it */
> >
> > Did you actually hit this scenario? In the case of power loss or crash, can we
> > access the MHI register space?
>
> Yes I hit this scenario on my computer since PCI power is not
> maintained, mhi_pm_resume behaves correctly whether the MHI register
> space is available or not since it will hit and return an error moving
> to M0 state:
> mhi mhi0: Did not enter M0 state, MHI state: M3, PM state: M3->M0
>
Okay. As long as you are returning the error code to PM core I'm fine.
Thanks,
Mani
>
> Regards,
> Loic
next prev parent reply other threads:[~2020-12-23 9:42 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-14 14:25 [PATCH v5 00/10] mhi: pci_generic: Misc improvements Loic Poulain
2020-12-14 14:25 ` [PATCH v5 01/10] mhi: Add mhi_controller_initialize helper Loic Poulain
2020-12-15 3:07 ` Hemant Kumar
2020-12-14 14:25 ` [PATCH v5 02/10] bus: mhi: core: Add device hardware reset support Loic Poulain
2020-12-14 14:25 ` [PATCH v5 03/10] mhi: pci-generic: Increase number of hardware events Loic Poulain
2020-12-14 14:25 ` [PATCH v5 04/10] mhi: pci_generic: Enable burst mode for hardware channels Loic Poulain
2020-12-14 14:25 ` [PATCH v5 05/10] mhi: pci_generic: Add support for reset Loic Poulain
2020-12-15 2:19 ` Hemant Kumar
2020-12-14 14:25 ` [PATCH v5 06/10] mhi: pci_generic: Add suspend/resume/recovery procedure Loic Poulain
2020-12-22 17:05 ` Manivannan Sadhasivam
2020-12-23 8:25 ` Loic Poulain
2020-12-23 9:41 ` Manivannan Sadhasivam [this message]
2020-12-14 14:25 ` [PATCH v5 07/10] mhi: pci_generic: Add PCI error handlers Loic Poulain
2020-12-14 14:25 ` [PATCH v5 08/10] mhi: pci_generic: Add health-check Loic Poulain
2020-12-14 14:25 ` [PATCH v5 09/10] mhi: pci_generic: Increase controller timeout value Loic Poulain
2020-12-14 14:25 ` [PATCH v5 10/10] mhi: pci_generic: Add diag channels Loic Poulain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201223094110.GA2644@thinkpad \
--to=manivannan.sadhasivam@linaro.org \
--cc=bbhatt@codeaurora.org \
--cc=hemantk@codeaurora.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=loic.poulain@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox