Re: [PATCH v4 3/4] PCI: tegra: Add Tegra264 support

public inbox for linux-arm-kernel@lists.infradead.org
 help / color / mirror / Atom feed

From: Thierry Reding <thierry.reding@kernel.org>
To: Manivannan Sadhasivam <mani@kernel.org>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Rob Herring" <robh@kernel.org>,
	"Krzysztof Kozlowski" <krzk+dt@kernel.org>,
	"Conor Dooley" <conor+dt@kernel.org>,
	"Thierry Reding" <thierry.reding@gmail.com>,
	"Jonathan Hunter" <jonathanh@nvidia.com>,
	"Karthikeyan Mitran" <m.karthikeyan@mobiveil.co.in>,
	"Hou Zhiqiang" <Zhiqiang.Hou@nxp.com>,
	"Thomas Petazzoni" <thomas.petazzoni@bootlin.com>,
	"Pali Rohár" <pali@kernel.org>,
	"Michal Simek" <michal.simek@amd.com>,
	"Kevin Xie" <kevin.xie@starfivetech.com>,
	linux-pci@vger.kernel.org, devicetree@vger.kernel.org,
	linux-tegra@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	"Thierry Reding" <treding@nvidia.com>,
	"Manikanta Maddireddy" <mmaddireddy@nvidia.com>
Subject: Re: [PATCH v4 3/4] PCI: tegra: Add Tegra264 support
Date: Tue, 7 Apr 2026 11:38:28 +0200	[thread overview]
Message-ID: <adTAVYEzfD9FQl8N@orome> (raw)
In-Reply-To: <iaoee5r5e2w52fap7ex23wdikbuvpjpesinedgjkehsedszhzo@64yoo2avmxle>

[-- Attachment #1: Type: text/plain, Size: 12894 bytes --]

On Thu, Apr 02, 2026 at 11:02:02PM +0530, Manivannan Sadhasivam wrote:
> On Thu, Apr 02, 2026 at 04:27:37PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > Add a driver for the PCIe controller found on NVIDIA Tegra264 SoCs. The
> > driver is very small, with its main purpose being to set up the address
> > translation registers and then creating a standard PCI host using ECAM.
> > 
> > Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> 
> What is the rationale for adding a new driver? Can't you reuse the existing one?
> If so, that should be mentioned in the description.

Which existing one? Tegra PCI controllers for previou generations
(Tegra194 and Tegra234) were DesignWare IP, but Tegra264 is an internal
IP, so the programming is entirely different. I'll add something to that
effect to the commit message.

> > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> > index 5aaed8ac6e44..6ead04f7bd6e 100644
> > --- a/drivers/pci/controller/Kconfig
> > +++ b/drivers/pci/controller/Kconfig
> > @@ -254,7 +254,15 @@ config PCI_TEGRA
> >  	select IRQ_MSI_LIB
> >  	help
> >  	  Say Y here if you want support for the PCIe host controller found
> > -	  on NVIDIA Tegra SoCs.
> > +	  on NVIDIA Tegra SoCs (Tegra20 through Tegra186).
> > +
> > +config PCIE_TEGRA264
> > +	bool "NVIDIA Tegra264 PCIe controller"
> 
> This driver seems to be using external MSI controller. So it can be built as a
> module. Also, you have the remove() callback for some reason.

Okay, I can turn this into a tristate symbol.

> > +	depends on ARCH_TEGRA || COMPILE_TEST
> > +	depends on PCI_MSI
> 
> Why?

I suppose it's not necessary in the sense of it being a build
dependency. At runtime, however, the root complex is not useful if PCI
MSI is not enabled. We can drop this dependency and rely on .config to
have it enabled as needed.

> > diff --git a/drivers/pci/controller/pcie-tegra264.c b/drivers/pci/controller/pcie-tegra264.c
> > new file mode 100644
> > index 000000000000..3ce1ad971bdb
> > --- /dev/null
> > +++ b/drivers/pci/controller/pcie-tegra264.c
[...]
> > +struct tegra264_pcie {
> > +	struct device *dev;
> > +	bool link_up;
> 
> Keep bool types at the end to avoid holes.

Done.

> > +static int tegra264_pcie_parse_dt(struct tegra264_pcie *pcie)
> > +{
> > +	int err;
> > +
> > +	pcie->wake_gpio = devm_gpiod_get_optional(pcie->dev, "nvidia,pex-wake",
> 
> You should switch to standard 'wake-gpios' property.

Will do.

> > +						  GPIOD_IN);
> > +	if (IS_ERR(pcie->wake_gpio))
> > +		return PTR_ERR(pcie->wake_gpio);
> > +
> > +	if (pcie->wake_gpio) {
> 
> Since you are bailing out above, you don't need this check.

I think we still want to have this check to handle the case of optional
wake GPIOs. Not all controllers may have this wired up and
devm_gpiod_get_optional() will return NULL (not an ERR_PTR()-encoded
error) if the wake-gpios property is missing.

> > +static void tegra264_pcie_bpmp_set_rp_state(struct tegra264_pcie *pcie)
> 
> I don't think this function name is self explanatory. Looks like it is turning
> off the PCIe controller, so how about tegra264_pcie_power_off()?

Agreed. The name is a relic from when this was potentially being used to
toggle on and off the controller. But it's only used for disabling, so
tegra264_pcie__power_off() sounds much better.

> > +{
> > +	struct tegra_bpmp_message msg = {};
> > +	struct mrq_pcie_request req = {};
> > +	int err;
> > +
> > +	req.cmd = CMD_PCIE_RP_CONTROLLER_OFF;
> > +	req.rp_ctrlr_off.rp_controller = pcie->ctl_id;
> > +
> > +	msg.mrq = MRQ_PCIE;
> > +	msg.tx.data = &req;
> > +	msg.tx.size = sizeof(req);
> > +
> > +	err = tegra_bpmp_transfer(pcie->bpmp, &msg);
> > +	if (err)
> > +		dev_info(pcie->dev, "failed to turn off PCIe #%u: %pe\n",
> 
> Why not dev_err()?
> 
> > +			 pcie->ctl_id, ERR_PTR(err));
> > +
> > +	if (msg.rx.ret)
> > +		dev_info(pcie->dev, "failed to turn off PCIe #%u: %d\n",
> 
> Same here.

These are not fatal errors and are safe to ignore. dev_err() seemed too
strong for this. They also really shouldn't happen. Though I now realize
that's a bad argument, or rather, actually an argument for making them
dev_err() so that they do stand out if they really should happen.

> 
> > +			 pcie->ctl_id, msg.rx.ret);
> > +}
> > +
> > +static void tegra264_pcie_icc_set(struct tegra264_pcie *pcie)
> > +{
> > +	u32 value, speed, width, bw;
> > +	int err;
> > +
> > +	value = readw(pcie->ecam + XTL_RC_PCIE_CFG_LINK_STATUS);
> > +	speed = FIELD_GET(PCI_EXP_LNKSTA_CLS, value);
> > +	width = FIELD_GET(PCI_EXP_LNKSTA_NLW, value);
> > +
> > +	bw = width * (PCIE_SPEED2MBS_ENC(speed) / BITS_PER_BYTE);
> > +	value = MBps_to_icc(bw);
> 
> So this becomes, 'width * (PCIE_SPEED2MBS_ENC(speed) / 8) * 1000 / 8'. But don't
> you want, 'width * (PCIE_SPEED2MBS_ENC(speed)) * 1000 / 8'?

This is M*B*ps_to_icc(), not M*b*ps_to_icc(), so we do in fact get the
latter. I almost fell for this as well because I got confused by some of
these macros being all-caps and other times the case actually mattering.

> > +	err = icc_set_bw(pcie->icc_path, bw, bw);
> > +	if (err < 0)
> > +		dev_err(pcie->dev,
> > +			"failed to request bandwidth (%u MBps): %pe\n",
> > +			bw, ERR_PTR(err));
> 
> So you don't want to error out if this fails?

No. This is not a fatal error and the system will continue to work,
albeit perhaps at suboptimal performance. Given that Ethernet and mass
storage are connected to these, a failure to set the bandwidth and
erroring out here may leave the system unusable, but continuing on would
let the system boot and update firmware, kernel or whatever to recover.

I'll add a comment explaining this.

[...]
> > +static void tegra264_pcie_init(struct tegra264_pcie *pcie)
> > +{
> > +	enum pci_bus_speed speed;
> > +	unsigned int i;
> > +	u32 value;
> > +
> > +	/* bring the link out of reset */
> 
> s/link/controller or endpoint?

This controls the PERST# signal, so I guess "endpoint" would be more
correct.

> > +	value = readl(pcie->xtl + XTL_RC_MGMT_PERST_CONTROL);
> > +	value |= XTL_RC_MGMT_PERST_CONTROL_PERST_O_N;
> > +	writel(value, pcie->xtl + XTL_RC_MGMT_PERST_CONTROL);
> > +
> > +	if (!tegra_is_silicon()) {
> 
> This looks like some pre-silicon validation thing. Do you really want it to be
> present in the upstream driver?

At this point there is silicon for this chip, but we've been trying to
get some of the pre-silicon code merged upstream as well because
occasionally people will want to run upstream on simulation, even after
silicon is available. At other times we may want to reuse these drivers
on future chips during pre-silicon validation.

Obviously there needs to be a balance. We don't want to have excessive
amounts of code specifically for pre-silicon validation, but in
relatively simple cases like this it is useful.

> 
> > +		dev_info(pcie->dev,
> > +			 "skipping link state for PCIe #%u in simulation\n",
> > +			 pcie->ctl_id);
> > +		pcie->link_up = true;
> > +		return;
> > +	}
> > +
> > +	for (i = 0; i < PCIE_LINK_WAIT_MAX_RETRIES; i++) {
> > +		if (tegra264_pcie_link_up(pcie, NULL))
> > +			break;
> > +
> > +		usleep_range(PCIE_LINK_WAIT_US_MIN, PCIE_LINK_WAIT_US_MAX);
> > +	}
> > +
> > +	if (tegra264_pcie_link_up(pcie, &speed)) {
> 
> Why are you doing it for the second time?

It's just a last-resort check to see if it's really not come up after
the retries. Also, in this call we're actually interested in retrieving
the detected link speed.

> 
> > +		/* Per PCIe r5.0, 6.6.1 wait for 100ms after DLL up */
> 
> No need of this comment.

Fair enough. This was perhaps more useful in earlier versions of the
patch before the line below used the standardize wait time.

[...]
> > +static int tegra264_pcie_probe(struct platform_device *pdev)
> > +{
> > +	struct device *dev = &pdev->dev;
> > +	struct pci_host_bridge *bridge;
> > +	struct tegra264_pcie *pcie;
> > +	struct resource_entry *bus;
> > +	struct resource *res;
> > +	int err;
> > +
> > +	bridge = devm_pci_alloc_host_bridge(dev, sizeof(struct tegra264_pcie));
> > +	if (!bridge)
> > +		return dev_err_probe(dev, -ENOMEM,
> > +				     "failed to allocate host bridge\n");
> > +
> > +	pcie = pci_host_bridge_priv(bridge);
> > +	platform_set_drvdata(pdev, pcie);
> > +	pcie->bridge = bridge;
> > +	pcie->dev = dev;
> > +
> > +	err = pinctrl_pm_select_default_state(dev);
> 
> I questioned this before:
> https://lore.kernel.org/linux-pci/o5sxxdikdjwd76zsedvkpsl54nw6wrhopwsflt43y5st67mrub@uuw3yfjfqthd/

I'll remove this. Looks like we should be fine with just relying on the
default state being set by the pinctrl core. We might need to move it
into the resume callback.

> > +	if (err < 0)
> > +		return dev_err_probe(dev, err,
> > +				     "failed to configure sideband pins\n");
> > +
> > +	err = tegra264_pcie_parse_dt(pcie);
> > +	if (err < 0)
> > +		return dev_err_probe(dev, err, "failed to parse device tree");
> > +
> > +	pcie->xal = devm_platform_ioremap_resource_byname(pdev, "xal");
> > +	if (IS_ERR(pcie->xal))
> > +		return dev_err_probe(dev, PTR_ERR(pcie->xal),
> > +				     "failed to map XAL memory\n");
> > +
> > +	pcie->xtl = devm_platform_ioremap_resource_byname(pdev, "xtl-pri");
> > +	if (IS_ERR(pcie->xtl))
> > +		return dev_err_probe(dev, PTR_ERR(pcie->xtl),
> > +				     "failed to map XTL-PRI memory\n");
> > +
> > +	bus = resource_list_first_type(&bridge->windows, IORESOURCE_BUS);
> > +	if (!bus)
> > +		return dev_err_probe(dev, -ENODEV,
> > +				     "failed to get bus resources\n");
> > +
> > +	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "ecam");
> > +	if (!res)
> > +		return dev_err_probe(dev, -ENXIO,
> > +				     "failed to get ECAM resource\n");
> > +
> > +	pcie->icc_path = devm_of_icc_get(&pdev->dev, "write");
> > +	if (IS_ERR(pcie->icc_path))
> > +		return dev_err_probe(&pdev->dev, PTR_ERR(pcie->icc_path),
> > +				     "failed to get ICC");
> > +
> > +	/*
> > +	 * Parse BPMP property only for silicon, as interaction with BPMP is
> > +	 * not needed for other platforms.
> > +	 */
> > +	if (tegra_is_silicon()) {
> > +		pcie->bpmp = tegra_bpmp_get_with_id(dev, &pcie->ctl_id);
> > +		if (IS_ERR(pcie->bpmp))
> > +			return dev_err_probe(dev, PTR_ERR(pcie->bpmp),
> > +					     "failed to get BPMP\n");
> > +	}
> > +
> 
> pm_runtime_set_active()
> 
> > +	pm_runtime_enable(dev);
> 
> devm_pm_runtime_enable()?

Looks like I can even use devm_pm_runtime_set_active_enabled() to
combine the two.

> 
> > +	pm_runtime_get_sync(dev);
> > +
> > +	/* sanity check that programmed ranges match what's in DT */
> > +	if (!tegra264_pcie_check_ranges(pdev)) {
> > +		err = -EINVAL;
> > +		goto put_pm;
> > +	}
> > +
> > +	pcie->cfg = pci_ecam_create(dev, res, bus->res, &pci_generic_ecam_ops);
> > +	if (IS_ERR(pcie->cfg)) {
> > +		err = dev_err_probe(dev, PTR_ERR(pcie->cfg),
> > +				    "failed to create ECAM\n");
> > +		goto put_pm;
> > +	}
> > +
> > +	bridge->ops = (struct pci_ops *)&pci_generic_ecam_ops.pci_ops;
> > +	bridge->sysdata = pcie->cfg;
> > +	pcie->ecam = pcie->cfg->win;
> > +
> > +	tegra264_pcie_init(pcie);
> > +
> > +	if (!pcie->link_up)
> > +		goto free;
> 
> goto free_ecam;

It's not clear to me, but are you suggesting to rename the existing
"free" label to "free_ecam"? I can do that.

> > +	err = pci_host_probe(bridge);
> > +	if (err < 0) {
> > +		dev_err(dev, "failed to register host: %pe\n", ERR_PTR(err));
> 
> dev_err_probe()

Okay.

> 
> > +		goto free;
> > +	}
> > +
> > +	return err;
> 
> return 0;

Done.

[...]
> > +static int tegra264_pcie_resume_noirq(struct device *dev)
> > +{
> > +	struct tegra264_pcie *pcie = dev_get_drvdata(dev);
> > +	int err;
> > +
> > +	if (pcie->wake_gpio && device_may_wakeup(dev)) {
> > +		err = disable_irq_wake(pcie->wake_irq);
> > +		if (err < 0)
> > +			dev_err(dev, "failed to disable wake IRQ: %pe\n",
> > +				ERR_PTR(err));
> > +	}
> > +
> > +	if (pcie->link_up == false)
> > +		return 0;
> > +
> > +	tegra264_pcie_init(pcie);
> > +
> 
> Why do you need init() here without deinit() in tegra264_pcie_suspend_noirq()?

That's because when we come out of suspend the link may have gone down
again, so we need to take the endpoint out of reset to retrigger the
link training. I think we could possibly explicitly clear that PERST_O_N
bit in the PERST_CONTROL register in a new tegra264_pcie_deinit() to
mirror what tegra264_pcie_init() does, but it's automatically done by
firmware anyway, so not needed.

Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

next prev parent reply	other threads:[~2026-04-07  9:38 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-02 14:27 [PATCH v4 0/4] PCI: tegra: Add Tegra264 support Thierry Reding
2026-04-02 14:27 ` [PATCH v4 1/4] dt-bindings: pci: Document the NVIDIA Tegra264 PCIe controller Thierry Reding
2026-04-02 14:27 ` [PATCH v4 2/4] PCI: Use standard wait times for PCIe link monitoring Thierry Reding
2026-04-02 14:27 ` [PATCH v4 3/4] PCI: tegra: Add Tegra264 support Thierry Reding
2026-04-02 17:32   ` Manivannan Sadhasivam
2026-04-07  9:38     ` Thierry Reding [this message]
2026-04-02 14:27 ` [PATCH v4 4/4] arm64: tegra: Add PCI controllers on Tegra264 Thierry Reding

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adTAVYEzfD9FQl8N@orome \
    --to=thierry.reding@kernel.org \
    --cc=Zhiqiang.Hou@nxp.com \
    --cc=bhelgaas@google.com \
    --cc=conor+dt@kernel.org \
    --cc=devicetree@vger.kernel.org \
    --cc=jonathanh@nvidia.com \
    --cc=kevin.xie@starfivetech.com \
    --cc=krzk+dt@kernel.org \
    --cc=kwilczynski@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=m.karthikeyan@mobiveil.co.in \
    --cc=mani@kernel.org \
    --cc=michal.simek@amd.com \
    --cc=mmaddireddy@nvidia.com \
    --cc=pali@kernel.org \
    --cc=robh@kernel.org \
    --cc=thierry.reding@gmail.com \
    --cc=thomas.petazzoni@bootlin.com \
    --cc=treding@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox