From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
To: Loic Poulain <loic.poulain@linaro.org>
Cc: Hemant Kumar <hemantk@codeaurora.org>,
linux-arm-msm <linux-arm-msm@vger.kernel.org>,
Bhaumik Bhatt <bbhatt@codeaurora.org>
Subject: Re: [PATCH v5 06/10] mhi: pci_generic: Add suspend/resume/recovery procedure
Date: Wed, 23 Dec 2020 15:11:10 +0530
Message-ID: <20201223094110.GA2644@thinkpad>
In-Reply-To: <CAMZdPi8kDLwqKBUjjkA2mkBpnj=AB53itEU=nObVXDVK+2jqYg@mail.gmail.com>
On Wed, Dec 23, 2020 at 09:25:37AM +0100, Loic Poulain wrote:
> Hi Mani,
>
> On Tue, 22 Dec 2020 at 18:05, Manivannan Sadhasivam
> <manivannan.sadhasivam@linaro.org> wrote:
> >
[...]
> > > +
> > > +	/* Check if we can recover without full reset */
> > > +	pci_set_power_state(pdev, PCI_D0);
> > > +	pci_load_saved_state(pdev, mhi_pdev->pci_state);
> > > +	pci_restore_state(pdev);
> >
> > These PCI state settings seem redundant with resume().
> >
> > In this function you should first check whether MHI is alive; if it is,
> > do the power up, otherwise just exit.
>
> Recovery is not only executed on a resume but also when a crash or
> reboot is detected; that's why we need to restore the PCI state here.
> Moreover, contrary to resume, the restored PCI state is not the one
> saved in suspend, but the known working (and saved) initial PCI state
> (mhi_pdev->pci_state).
>
Ah I missed it!
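
To make the distinction concrete for anyone reading along, here is a minimal userspace sketch in C. It is only a model under stated assumptions, not driver code: `struct pci_model` and the `model_*()` functions are hypothetical names, and a single `command` field stands in for the whole PCI config space. It shows why restoring the state saved in suspend does not help once that state is already bad, while restoring the state saved at probe does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one field stands in for the PCI config space. */
struct pci_state_model {
	unsigned int command;
};

struct pci_model {
	struct pci_state_model live;      /* what the device currently holds */
	struct pci_state_model initial;   /* models mhi_pdev->pci_state */
	struct pci_state_model suspended; /* models the state saved in suspend */
};

/* Probe: remember the known working state once. */
void model_probe(struct pci_model *d, unsigned int command)
{
	assert(d != NULL);
	d->live.command = command;
	d->initial = d->live;
}

/* Suspend: save whatever the device holds right now, good or bad. */
void model_suspend(struct pci_model *d)
{
	d->suspended = d->live;
}

/* Resume: restore the state that was saved in suspend. */
void model_resume(struct pci_model *d)
{
	d->live = d->suspended;
}

/* Recovery: restore the known working initial state instead. */
void model_recover(struct pci_model *d)
{
	d->live = d->initial;
}
```

In this model, a device whose state was lost before suspend stays broken after model_resume(), and only model_recover() brings back the value recorded at probe.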
> >
> > > +
> > > +	if (!mhi_pci_is_alive(mhi_cntrl))
> > > +		goto err_try_reset;
> > > +
> > > +	err = mhi_prepare_for_power_up(mhi_cntrl);
> > > +	if (err)
> > > +		goto err_try_reset;
> > > +
> > > +	err = mhi_sync_power_up(mhi_cntrl);
> > > +	if (err)
> > > +		goto err_unprepare;
> >
> > Add a debug log for recovery success.
>
> Yes, will do.
>
> >
> > > +
> > > +	set_bit(MHI_PCI_DEV_STARTED, &mhi_pdev->status);
> > > +	return;
> > > +
> > > +err_unprepare:
> > > +	mhi_unprepare_after_power_down(mhi_cntrl);
> > > +err_try_reset:
> > > +	if (pci_reset_function(pdev))
> >
> > So if the device recovers, who will make sure the MHI controller is
> > reinitialized? That's why I think we should convey the recovery result to
> > PM core. I don't think using a workqueue here is a good idea.
>
> The MHI controller is reinitialized in the recovery work itself.
> Recovery can be a long process and involves registering/deregistering
> devices. We cannot do that synchronously in the system resume path
> since it causes unexpected resume latency (this is actually no longer
> a resume but a complete reset); moving it synchronously into resume
> causes a hang on my side. However, I agree that the PM core should be
> informed about the resume failure, so instead of unconditionally
> returning success in the resume callback I'm going to forward the
> error to the PM core (and trigger recovery in parallel).
>
okay.
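
For reference, the agreed behaviour could be modelled roughly as below. This is a userspace sketch under assumptions, not the actual patch: `struct resume_model`, `model_pm_resume()` and `model_pci_resume()` are hypothetical stand-ins, the `recovery_scheduled` flag marks the point where the driver would queue its recovery work, and `-EIO` is only an assumed error value for the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state, not struct mhi_pci_device. */
struct resume_model {
	bool link_ok;            /* false models lost power or a crashed device */
	bool recovery_scheduled; /* set where the driver would queue recovery work */
};

/* Models mhi_pm_resume(): fails when the device cannot reach M0.
 * The -EIO value is an assumption made for this sketch. */
int model_pm_resume(const struct resume_model *d)
{
	return d->link_ok ? 0 : -EIO;
}

/* Models the resume callback: on failure, kick off recovery
 * asynchronously and still report the error to the caller. */
int model_pci_resume(struct resume_model *d)
{
	int err;

	assert(d != NULL);

	err = model_pm_resume(d);
	if (err) {
		d->recovery_scheduled = true; /* recovery runs later, not here */
		return err;                   /* error is forwarded, not hidden */
	}

	return 0;
}
```

The point of the sketch is only the ordering: the failure is reported to the caller right away, while the long recovery is deferred rather than run in the resume path.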
> >
> > > +		dev_err(&pdev->dev, "Recovery failed\n");
> > > +}
> > > +
> > >  static int mhi_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  {
> > >  	const struct mhi_pci_dev_info *info = (struct mhi_pci_dev_info *) id->driver_data;
> > > @@ -327,6 +371,8 @@ static int mhi_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> > >  	if (!mhi_pdev)
> > >  		return -ENOMEM;
> > >
> > > +	INIT_WORK(&mhi_pdev->recovery_work, mhi_pci_recovery_work);
> > > +
> > >  	mhi_cntrl_config = info->config;
> > >  	mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > >
> > > @@ -391,6 +437,8 @@ static void mhi_pci_remove(struct pci_dev *pdev)
> > >  	struct mhi_pci_device *mhi_pdev = pci_get_drvdata(pdev);
> > >  	struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > >
> > > +	cancel_work_sync(&mhi_pdev->recovery_work);
> > > +
> > >  	if (test_and_clear_bit(MHI_PCI_DEV_STARTED, &mhi_pdev->status)) {
> > >  		mhi_power_down(mhi_cntrl, true);
> > >  		mhi_unprepare_after_power_down(mhi_cntrl);
> > > @@ -456,12 +504,66 @@ static const struct pci_error_handlers mhi_pci_err_handler = {
> > >  	.reset_done = mhi_pci_reset_done,
> > >  };
> > >
> > > +static int __maybe_unused mhi_pci_suspend(struct device *dev)
> > > +{
> > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > +	struct mhi_pci_device *mhi_pdev = dev_get_drvdata(dev);
> > > +	struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > > +
> > > +	cancel_work_sync(&mhi_pdev->recovery_work);
> > > +
> > > +	/* Transition to M3 state */
> > > +	mhi_pm_suspend(mhi_cntrl);
> > > +
> > > +	pci_save_state(pdev);
> > > +	pci_disable_device(pdev);
> > > +	pci_wake_from_d3(pdev, true);
> > > +	pci_set_power_state(pdev, PCI_D3hot);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static int __maybe_unused mhi_pci_resume(struct device *dev)
> > > +{
> > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > +	struct mhi_pci_device *mhi_pdev = dev_get_drvdata(dev);
> > > +	struct mhi_controller *mhi_cntrl = &mhi_pdev->mhi_cntrl;
> > > +	int err;
> > > +
> > > +	pci_set_power_state(pdev, PCI_D0);
> > > +	pci_restore_state(pdev);
> > > +	pci_set_master(pdev);
> > > +
> > > +	err = pci_enable_device(pdev);
> > > +	if (err)
> > > +		goto err_recovery;
> > > +
> > > +	/* Exit M3, transition to M0 state */
> > > +	err = mhi_pm_resume(mhi_cntrl);
> > > +	if (err) {
> > > +		dev_err(&pdev->dev, "failed to resume device: %d\n", err);
> > > +		goto err_recovery;
> > > +	}
> > > +
> > > +	return 0;
> > > +
> > > +err_recovery:
> > > +	/* The device may have lost power or crashed, try recovering it */
> >
> > Did you actually hit this scenario? In the case of power loss or crash, can we
> > access the MHI register space?
>
> Yes, I hit this scenario on my computer since PCI power is not
> maintained. mhi_pm_resume behaves correctly whether the MHI register
> space is available or not, since it hits and returns an error when
> moving to M0 state:
> mhi mhi0: Did not enter M0 state, MHI state: M3, PM state: M3->M0
>
Okay. As long as you are returning the error code to PM core I'm fine.
Thanks,
Mani
>
> Regards,
> Loic
Thread overview: 16+ messages
2020-12-14 14:25 [PATCH v5 00/10] mhi: pci_generic: Misc improvements Loic Poulain
2020-12-14 14:25 ` [PATCH v5 01/10] mhi: Add mhi_controller_initialize helper Loic Poulain
2020-12-15 3:07 ` Hemant Kumar
2020-12-14 14:25 ` [PATCH v5 02/10] bus: mhi: core: Add device hardware reset support Loic Poulain
2020-12-14 14:25 ` [PATCH v5 03/10] mhi: pci-generic: Increase number of hardware events Loic Poulain
2020-12-14 14:25 ` [PATCH v5 04/10] mhi: pci_generic: Enable burst mode for hardware channels Loic Poulain
2020-12-14 14:25 ` [PATCH v5 05/10] mhi: pci_generic: Add support for reset Loic Poulain
2020-12-15 2:19 ` Hemant Kumar
2020-12-14 14:25 ` [PATCH v5 06/10] mhi: pci_generic: Add suspend/resume/recovery procedure Loic Poulain
2020-12-22 17:05 ` Manivannan Sadhasivam
2020-12-23 8:25 ` Loic Poulain
2020-12-23 9:41 ` Manivannan Sadhasivam [this message]
2020-12-14 14:25 ` [PATCH v5 07/10] mhi: pci_generic: Add PCI error handlers Loic Poulain
2020-12-14 14:25 ` [PATCH v5 08/10] mhi: pci_generic: Add health-check Loic Poulain
2020-12-14 14:25 ` [PATCH v5 09/10] mhi: pci_generic: Increase controller timeout value Loic Poulain
2020-12-14 14:25 ` [PATCH v5 10/10] mhi: pci_generic: Add diag channels Loic Poulain