From: Bjorn Helgaas <helgaas@kernel.org>
To: Niklas Cassel <cassel@kernel.org>
Cc: "Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Manivannan Sadhasivam" <mani@kernel.org>,
	"Rob Herring" <robh@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Heiko Stuebner" <heiko@sntech.de>,
	"Wilfred Mallawa" <wilfred.mallawa@wdc.com>,
	linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-rockchip@lists.infradead.org
Subject: Re: [PATCH v2] PCI: dw-rockchip: Delay link training after hot reset in EP mode
Date: Wed, 18 Jun 2025 14:59:59 -0500
Message-ID: <20250618195959.GA1207191@bhelgaas>
In-Reply-To: <aFLHYfs1iDgwMdcp@ryzen>

On Wed, Jun 18, 2025 at 04:04:17PM +0200, Niklas Cassel wrote:
> On Tue, Jun 17, 2025 at 05:01:14PM -0500, Bjorn Helgaas wrote:
> > On Fri, Jun 13, 2025 at 12:19:09PM +0200, Niklas Cassel wrote:
> > > From: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> > > 
> > > RK3588 TRM, section "11.6.1.3.3 Hot Reset and Link-Down Reset" states that:
> > > """
> > > If you want to delay link re-establishment (after reset) so that you can
> > > reprogram some registers through DBI, you must set app_ltssm_enable =0
> > > immediately after core_rst_n as shown in above. This can be achieved by
> > > enable the app_dly2_en, and end-up the delay by assert app_dly2_done.
> > > """
> > >
> > > I.e. setting app_dly2_en will automatically deassert app_ltssm_enable on
> > > a hot reset, and setting app_dly2_done will re-assert app_ltssm_enable,
> > > re-enabling link training.
> > > 
> > > When receiving a hot reset/link-down IRQ when running in EP mode, we will
> > > call dw_pcie_ep_linkdown(), which will call the .link_down() callback in
> > > the currently bound endpoint function (EPF) drivers.
> > > 
> > > The callback in an EPF driver can theoretically take a long time to
> > > complete, so make sure that the link is not re-established until after
> > > dw_pcie_ep_linkdown() (which calls the .link_down() callback(s)
> > > synchronously) has returned.
> > 
> > I don't know why we care *how long* EPF callbacks might take.
> 
> Well, because currently, we do NOT delay link training, and everything
> works as expected.
> 
> Most likely we are just lucky, because dw_pcie_ep_linkdown() calls
> dw_pcie_ep_init_non_sticky_registers(), which is quite a short function.

I'm just making the point that IIUC there's a race between link
training and any DBI accesses done by
dw_pcie_ep_init_non_sticky_registers() and potentially EPF callbacks,
and the time those paths take is immaterial.

If this is indeed a race and this patch is the fix, I think it's
misleading to describe it as "this path might take a long time and
lose the race."  We have to assume arbitrary delays can be added to
either path, so we can never rely on a path being "fast enough" to
avoid the race.
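
To illustrate why the timing is immaterial, here is a minimal sketch of a
toy model (not driver code). The function name, the time units, and
BAR_WANTED_SIZE are hypothetical; only the 1 GB reset default comes from
this thread. It returns the BAR size the host would observe at link-up,
given arbitrary durations for the DBI path and for link training:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BAR_RESET_SIZE  (1ull << 30) /* 1 GB reset default mentioned in this thread */
#define BAR_WANTED_SIZE (1ull << 20) /* hypothetical size the endpoint wants to expose */

/*
 * Toy model, not real hardware behaviour: both durations are measured from
 * the reset, in arbitrary units. When "gated" is true, link training starts
 * only after the DBI programming has finished.
 */
static uint64_t bar_size_seen_by_host(unsigned int dbi_time,
				      unsigned int train_time, bool gated)
{
	unsigned int link_up_at = gated ? dbi_time + train_time : train_time;

	/* The host sees the programmed size only if DBI finished by link-up. */
	return (dbi_time <= link_up_at) ? BAR_WANTED_SIZE : BAR_RESET_SIZE;
}
```

In the ungated case the result depends on which duration is larger; in the
gated case it is BAR_WANTED_SIZE for every pair of durations, which is the
property the fix needs.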

Is the following basically what we're doing?

  Set PCIE_LTSSM_APP_DLY2_EN so the controller never automatically
  trains the link after a link-down interrupt.  That way any DBI
  updates done in the dw_pcie_ep_linkdown() path will happen while the
  link is still down.  Then allow link training by setting
  PCIE_LTSSM_APP_DLY2_DONE.
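
  A rough sketch of that sequence as a userspace model, assuming the
  behaviour described in the TRM quote. struct model_ctrl and the model_*()
  helpers are hypothetical, as is treating PCIE_LTSSM_APP_DLY2_DONE as a
  strobe that is not stored. The HIWORD macros are restated here as I
  understand them from the driver (upper 16 bits act as a write-enable
  mask for the lower 16):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n)                    (1u << (n))
#define HIWORD_UPDATE(mask, val)  (((mask) << 16) | (val))
#define HIWORD_UPDATE_BIT(val)    HIWORD_UPDATE(val, val)

#define PCIE_LTSSM_APP_DLY2_EN    BIT(1)
#define PCIE_LTSSM_APP_DLY2_DONE  BIT(3)
#define PCIE_LTSSM_ENABLE_ENHANCE BIT(4)

/* Hypothetical software model of the controller state; not real hardware. */
struct model_ctrl {
	uint32_t hot_reset_ctrl; /* low 16 bits of PCIE_CLIENT_HOT_RESET_CTRL */
	bool ltssm_enable;       /* models app_ltssm_enable */
};

/* Apply a write-masked value: only bits enabled in the high half change. */
static void model_write_hot_reset_ctrl(struct model_ctrl *c, uint32_t wval)
{
	uint32_t mask = wval >> 16;
	uint32_t data = wval & mask;

	/* Assumption: DLY2_DONE ends the delay and is not latched. */
	if (data & PCIE_LTSSM_APP_DLY2_DONE) {
		c->ltssm_enable = true;
		data &= ~PCIE_LTSSM_APP_DLY2_DONE;
	}
	c->hot_reset_ctrl = (c->hot_reset_ctrl & ~mask) | data;
}

/* Per the TRM quote: with app_dly2_en set, a reset deasserts app_ltssm_enable. */
static void model_hot_reset(struct model_ctrl *c)
{
	if (c->hot_reset_ctrl & PCIE_LTSSM_APP_DLY2_EN)
		c->ltssm_enable = false;
}
```

  In this model, writing HIWORD_UPDATE_BIT(PCIE_LTSSM_ENABLE_ENHANCE |
  PCIE_LTSSM_APP_DLY2_EN) once, then calling model_hot_reset(), leaves
  ltssm_enable false until HIWORD_UPDATE_BIT(PCIE_LTSSM_APP_DLY2_DONE) is
  written, so any DBI updates in between happen with training held off.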

We don't set PCIE_LTSSM_APP_DLY2_DONE anywhere in the initial probe
path.  Obviously the link must train in that case, so I guess
PCIE_LTSSM_APP_DLY2_EN only applies to the case of link state
transition from link-up to link-down?

> During a hot reset, the BARs get resized to 1 GB (yes, that is the
> default/reset value on rk3588), so the fact that the host sees a smaller
> BAR size means that dw_pcie_ep_init_non_sticky_registers() must have had
> time to run before link training completed.
> 
> But we do not want to rely on luck for these DBI writes to finish before
> link training is complete, hence this patch.
> 
> The .link_down() callback in drivers/pci/endpoint/functions/pci-epf-test.c
> simply does a cancel_delayed_work_sync().
> 
> I could imagine an EPF driver doing some more time-consuming work in the
> callback, like allocating memory (which could trigger direct reclaim), and
> then calling pci_epc_set_bar() which will eventually result in some DBI
> writes. That most likely would not work without this patch.
> 
> > From the TRM quote, it sounds like the important thing is that you
> > don't want the link to train before dw_pcie_ep_linkdown() calls
> > dw_pcie_ep_init_non_sticky_registers(), which looks like it programs
> > registers through DBI.
> > 
> > Maybe you also want to allow the EPF ->link_down() callbacks to
> > program things via DBI before link training?  But I don't think the
> > amount of time they take is relevant.  If you need to do *anything*
> > via DBI before the link trains, you have to prevent training until
> > you're finished with DBI.
> > 
> > > Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
> > > Co-developed-by: Niklas Cassel <cassel@kernel.org>
> > > Signed-off-by: Niklas Cassel <cassel@kernel.org>
> > > ---
> > > Changes since v1:
> > > -Rebased on v6.16-rc1
> > > 
> > >  drivers/pci/controller/dwc/pcie-dw-rockchip.c | 15 ++++++++++++---
> > >  1 file changed, 12 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/pci/controller/dwc/pcie-dw-rockchip.c b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > > index 93171a392879..cd1e9352b21f 100644
> > > --- a/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > > +++ b/drivers/pci/controller/dwc/pcie-dw-rockchip.c
> > > @@ -58,6 +58,8 @@
> > >  
> > >  /* Hot Reset Control Register */
> > >  #define PCIE_CLIENT_HOT_RESET_CTRL	0x180
> > > +#define  PCIE_LTSSM_APP_DLY2_EN		BIT(1)
> > > +#define  PCIE_LTSSM_APP_DLY2_DONE	BIT(3)
> > >  #define  PCIE_LTSSM_ENABLE_ENHANCE	BIT(4)
> > >  
> > >  /* LTSSM Status Register */
> > > @@ -474,7 +476,7 @@ static irqreturn_t rockchip_pcie_ep_sys_irq_thread(int irq, void *arg)
> > >  	struct rockchip_pcie *rockchip = arg;
> > >  	struct dw_pcie *pci = &rockchip->pci;
> > >  	struct device *dev = pci->dev;
> > > -	u32 reg;
> > > +	u32 reg, val;
> > >  
> > >  	reg = rockchip_pcie_readl_apb(rockchip, PCIE_CLIENT_INTR_STATUS_MISC);
> > >  	rockchip_pcie_writel_apb(rockchip, reg, PCIE_CLIENT_INTR_STATUS_MISC);
> > > @@ -485,6 +487,10 @@ static irqreturn_t rockchip_pcie_ep_sys_irq_thread(int irq, void *arg)
> > >  	if (reg & PCIE_LINK_REQ_RST_NOT_INT) {
> > >  		dev_dbg(dev, "hot reset or link-down reset\n");
> > >  		dw_pcie_ep_linkdown(&pci->ep);
> > > +		/* Stop delaying link training. */
> > > +		val = HIWORD_UPDATE_BIT(PCIE_LTSSM_APP_DLY2_DONE);
> > > +		rockchip_pcie_writel_apb(rockchip, val,
> > > +					 PCIE_CLIENT_HOT_RESET_CTRL);
> > >  	}
> > >  
> > >  	if (reg & PCIE_RDLH_LINK_UP_CHGED) {
> > > @@ -566,8 +572,11 @@ static int rockchip_pcie_configure_ep(struct platform_device *pdev,
> > >  		return ret;
> > >  	}
> > >  
> > > -	/* LTSSM enable control mode */
> > > -	val = HIWORD_UPDATE_BIT(PCIE_LTSSM_ENABLE_ENHANCE);
> > > +	/*
> > > +	 * LTSSM enable control mode, and automatically delay link training on
> > > +	 * hot reset/link-down reset.
> > > +	 */
> > > +	val = HIWORD_UPDATE_BIT(PCIE_LTSSM_ENABLE_ENHANCE | PCIE_LTSSM_APP_DLY2_EN);
> > >  	rockchip_pcie_writel_apb(rockchip, val, PCIE_CLIENT_HOT_RESET_CTRL);
> > >  
> > >  	rockchip_pcie_writel_apb(rockchip, PCIE_CLIENT_EP_MODE,
> > > -- 
> > > 2.49.0
> > > 
