All of lore.kernel.org
 help / color / mirror / Atom feed
From: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
To: Niklas Cassel <Niklas.Cassel@wdc.com>
Cc: "linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Damien Le Moal <dlemoal@kernel.org>,
	Vidya Sagar <vidyas@nvidia.com>
Subject: Re: [PATCH v7 1/2] PCI: designware-ep: Fix DBI access before core init
Date: Thu, 30 Nov 2023 12:08:00 +0530	[thread overview]
Message-ID: <20231130063800.GD3043@thinkpad> (raw)
In-Reply-To: <ZWdob6tgWf6919/s@x1-carbon>

On Wed, Nov 29, 2023 at 04:36:04PM +0000, Niklas Cassel wrote:
> On Wed, Nov 29, 2023 at 09:21:40PM +0530, Manivannan Sadhasivam wrote:
> > On Wed, Nov 29, 2023 at 11:38:34AM +0000, Niklas Cassel wrote:
> > > On Tue, Nov 28, 2023 at 06:41:51PM +0100, Niklas Cassel wrote:
> > > > On Mon, Nov 20, 2023 at 02:10:13PM +0530, Manivannan Sadhasivam wrote:
> > > > > The drivers for platforms requiring reference clock from the PCIe host for
> > > > > initializing their PCIe EP core, make use of the 'core_init_notifier'
> > > > > feature exposed by the DWC common code. On these platforms, access to the
> > > > > hw registers like DBI before completing the core initialization will result
> > > > > in a whole system hang. But the current DWC EP driver tries to access DBI
> > > > > registers during dw_pcie_ep_init() without waiting for core initialization
> > > > > and it results in system hang on platforms making use of
> > > > > 'core_init_notifier' such as Tegra194 and Qcom SM8450.
> > > > > 
> > > > > To workaround this issue, users of the above mentioned platforms have to
> > > > > maintain the dependency with the PCIe host by booting the PCIe EP after
> > > > > host boot. But this won't provide a good user experience, since PCIe EP is
> > > > > _one_ of the features of those platforms and it doesn't make sense to
> > > > > delay the whole platform booting due to the PCIe dependency.
> > > > > 
> > > > > So to fix this issue, let's move all the DBI access during
> > > > > dw_pcie_ep_init() in the DWC EP driver to the dw_pcie_ep_init_complete()
> > > > > API that gets called only after core initialization on these platforms.
> > > > > This makes sure that the DBI register accesses are skipped during
> > > > > dw_pcie_ep_init() and accessed later once the core initialization happens.
> > > > > 
> > > > > For the rest of the platforms, DBI access happens as usual.
> > > > > 
> > > > > Co-developed-by: Vidya Sagar <vidyas@nvidia.com>
> > > > > Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
> > > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> > > > > ---
> > > > 
> > > > Hello Mani,
> > > > 
> > > > I tried this patch on top of a work in progress EP driver,
> > > > which, similar to pcie-qcom-ep.c has a perst gpio as input,
> > > > and a .core_init_notifier.
> > > > 
> > > > What I noticed is the following every time I reboot the RC, I get:
> > > > 
> > > > [  604.735115] debugfs: Directory 'a40000000.pcie_ep' with parent 'dmaengine' already present!
> > > > [ 1000.713582] debugfs: Directory 'a40000000.pcie_ep' with parent 'dmaengine' already present!
> > > > [ 1000.714355] debugfs: File 'mf' in directory '/' already present!
> > > > [ 1000.714890] debugfs: File 'wr_ch_cnt' in directory '/' already present!
> > > > [ 1000.715476] debugfs: File 'rd_ch_cnt' in directory '/' already present!
> > > > [ 1000.716061] debugfs: Directory 'registers' with parent '/' already present!
> > > > 
> > > > 
> > > > Also:
> > > > 
> > > > # ls -al /sys/class/dma/dma*/device | grep pcie
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma3chan0/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma3chan1/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma3chan2/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma3chan3/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma4chan0/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma4chan1/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma4chan2/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:14 /sys/class/dma/dma4chan3/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:16 /sys/class/dma/dma5chan0/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:16 /sys/class/dma/dma5chan1/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:16 /sys/class/dma/dma5chan2/device -> ../../../a40000000.pcie_ep
> > > > lrwxrwxrwx    1 root     root             0 Jan  1 00:16 /sys/class/dma/dma5chan3/device -> ../../../a40000000.pcie_ep
> > > > 
> > > > Adds another dmaX entry for each reboot.
> > > > 
> > > > 
> > > > I'm quite sure that you will see the same with pcie-qcom-ep.
> > > > 
> > > > I think that either the DWC drivers using core_init (only tegra and qcom)
> > > > need to deinit the eDMA in their assert_perst() function, or this patch
> > > > needs to deinit the eDMA before initializing it.
> > > > 
> > > > 
> > > > A problem with the current code, if you do NOT have this patch, which I assume
> > > > is also problem on pcie-qcom-ep, is that since assert_perst() function performs
> > > > a core reset, all the eDMA setting written in the dbi by the eDMA driver will be
> > > > cleared, so a PERST assert + deassert by the RC will wipe the eDMA settings.
> > > > Hopefully, this will no longer be a problem after this patch has been merged.
> > > > 
> > > > 
> > > > Kind regards,
> > > > Niklas
> > > 
> > > I'm sorry that I'm just looking at this patch now (it's v7 already).
> > > But I did notice that the DWC code is inconsistent for drivers having
> > > a .core_init_notifier and drivers not having a .core_init_notifier.
> > > 
> > > When receiving a hot reset or link-down reset, the DWC core gets reset,
> > > which means that most DBI settings get reset to their reset value.
> > > 
> > 
> > I believe you are talking about the hardware here and not the DWC core driver.
> > 
> > > 
> > > Both tegra and qcom-ep does have a start_link() that is basically a no-op.
> > 
> > That's because of the fact that we cannot write to LTSSM registers before perst
> > deassert.
> > 
> > > Instead, ep_init_complete() (and LTSSM enable) is called when PERST is
> > > deasserted, so settings written by ep_init_complete() will always get set
> > > after PERST is asserted + deasserted.
> > > 
> > > However, for a driver without a .core_init_notifier, a pci-epf-test unbind
> > > + bind, will currently NOT write the DBI settings written by
> > > ep_init_complete() when starting the link the second time.
> > > 
> > > If you unbind + bind pci-epf-test (which requires stopping and starting the
> > > link), I think that you should write all the DBI settings. Unbinding + binding
> > > will allocate memory for all the BARs, write all the iATU settings etc.
> > > It doesn't make sense that some DBI writes (those made by ep_init_complete())
> > > are not redone.
> > > 

Looking at this issue again, I think your statement may not be correct. In the
stop_link() callback, non-core_init_notifier platforms are just disabling the
LTSSM and enabling it again in start_link(). So all the rest of the
configurations (DBI etc...) should not be affected during EPF bind/unbind.

Can you point me to the specific controller driver that is doing otherwise?

- Mani

> > > The problem is that if you do not have a .core_init_notifier,
> > > ep_init_complete() (which does DBI writes) is only called by ep_init(),
> > > and never ever again.
> > > 
> > 
> > Hmm, right. I did not look close into non-core_init_notifier platforms because
> > of the lack of hardware.
> > 
> > > 
> > > Considering that .start_link() is a no-op for DWC drivers with a
> > > .core_init_notifier (they instead call ep_init_complete() when perst is
> > > deasserted), I think the most logical thing would be for .start_link() to
> > > call ep_init_complete() (for drivers without a .core_init_notifier), that way,
> > > all DBI settings (and not just some) will be written on an unbind + bind.
> > > 
> > > 
> > > Something like this:
> > > 
> > > --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> > > @@ -465,6 +465,16 @@ static int dw_pcie_ep_start(struct pci_epc *epc)
> > >  {
> > >         struct dw_pcie_ep *ep = epc_get_drvdata(epc);
> > >         struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> > > +       const struct pci_epc_features *epc_features;
> > > +
> > > +       if (ep->ops->get_features) {
> > > +               epc_features = ep->ops->get_features(ep);
> > > +               if (!epc_features->core_init_notifier) {
> > > +                       ret = dw_pcie_ep_init_complete(ep);
> > > +                       if (ret)
> > > +                               return ret;
> > > +               }
> > > +       }
> > >  
> > >         return dw_pcie_start_link(pci);
> > >  }
> > > @@ -729,7 +739,6 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
> > >         struct device *dev = pci->dev;
> > >         struct platform_device *pdev = to_platform_device(dev);
> > >         struct device_node *np = dev->of_node;
> > > -       const struct pci_epc_features *epc_features;
> > >         struct dw_pcie_ep_func *ep_func;
> > >  
> > >         INIT_LIST_HEAD(&ep->func_list);
> > > @@ -817,16 +826,6 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
> > >         if (ret)
> > >                 goto err_free_epc_mem;
> > >  
> > > -       if (ep->ops->get_features) {
> > > -               epc_features = ep->ops->get_features(ep);
> > > -               if (epc_features->core_init_notifier)
> > > -                       return 0;
> > > -       }
> > > -
> > > -       ret = dw_pcie_ep_init_complete(ep);
> > > -       if (ret)
> > > -               goto err_remove_edma;
> > > -
> > >         return 0;
> > >  
> > >  err_remove_edma:
> > > 
> > > 
> > > I could send a patch, but it would be conflicting with your patch.
> > > And you also need to handle deiniting + initing the eDMA in a nice way,
> > > but that seems to be a problem that also needs to be addressed with the
> > > patch in $subject.
> > > 
> > > What do you think?
> > > 
> > 
> > Your proposed solution looks good to me. If you are OK, I can spin up a patch on
> > behalf of you (with your authorship ofc) and integrate it with this series.
> 
> Sounds good to me!
> 
> Since you will need to also fix the error paths (my suggestion didn't)),
> I think that you deserve the authorship if you cook up a patch.
> 
> It would be nice to hear Vidya's opinion on my suggestion as well, since he
> appears to be the one who added the .core_init_notifier in the first place.
> 
> 
> Kind regards,
> Niklas

-- 
மணிவண்ணன் சதாசிவம்

  reply	other threads:[~2023-11-30  6:38 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-20  8:40 [PATCH v7 0/2] PCI: designware-ep: Fix DBI access before core init Manivannan Sadhasivam
2023-11-20  8:40 ` [PATCH v7 1/2] " Manivannan Sadhasivam
2023-11-28 17:41   ` Niklas Cassel
2023-11-29 11:38     ` Niklas Cassel
2023-11-29 15:51       ` Manivannan Sadhasivam
2023-11-29 16:36         ` Niklas Cassel
2023-11-30  6:38           ` Manivannan Sadhasivam [this message]
2023-11-30 11:22             ` Niklas Cassel
2023-11-30 13:57               ` Manivannan Sadhasivam
2023-12-23  1:52                 ` Niklas Cassel
2024-01-06 15:12                   ` Manivannan Sadhasivam
2024-01-18 18:33                     ` Niklas Cassel
2024-01-19  7:28                       ` Manivannan Sadhasivam
2024-01-23  9:18                         ` Niklas Cassel
2024-01-23  9:59                           ` Manivannan Sadhasivam
2023-11-29 14:08     ` Manivannan Sadhasivam
2024-01-09 21:12   ` Bjorn Helgaas
2023-11-20  8:40 ` [PATCH v7 2/2] PCI: designware-ep: Move pci_epc_init_notify() inside dw_pcie_ep_init_complete() Manivannan Sadhasivam
2024-01-07  7:27 ` [PATCH v7 0/2] PCI: designware-ep: Fix DBI access before core init Krzysztof Wilczyński
2024-01-07  7:34   ` Krzysztof Wilczyński
2024-01-09 17:58   ` Niklas Cassel
2024-01-10  3:11     ` Manivannan Sadhasivam
2024-01-10 10:01       ` Niklas Cassel
2024-01-10 15:27         ` Manivannan Sadhasivam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231130063800.GD3043@thinkpad \
    --to=manivannan.sadhasivam@linaro.org \
    --cc=Niklas.Cassel@wdc.com \
    --cc=dlemoal@kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=vidyas@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.