linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Val Packett <val@packett.cool>
Cc: linux-pci@vger.kernel.org, "Bjorn Helgaas" <bhelgaas@google.com>,
	"Krzysztof Wilczyński" <kw@linux.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	"Lucas De Marchi" <lucas.demarchi@intel.com>
Subject: Re: [PATCH 1/2] PCI: Setup bridge resources earlier
Date: Fri, 10 Oct 2025 20:01:05 +0300 (EEST)	[thread overview]
Message-ID: <094618f2-947e-a66e-dc27-f4dedc8b5bb5@linux.intel.com> (raw)
In-Reply-To: <2faffdc3-f956-4071-a6a4-de9b5889096d@packett.cool>

[-- Attachment #1: Type: text/plain, Size: 9140 bytes --]

On Thu, 9 Oct 2025, Val Packett wrote:

> On 10/7/25 12:43 PM, Ilpo Järvinen wrote:
> 
> > On Mon, 6 Oct 2025, Val Packett wrote:
> > > [..]
> > > 
> > > I think it's that early check in pci_read_bridge_bases that avoids the
> > > setup
> > > here:
> > > 
> > >      if (pci_is_root_bus(child)) /* It's a host bus, nothing to read */
> > >          return;
> > If there's a PCI device as is the case in pci_read_bridge_windows()
> > which inputs non-NULL pci_dev, the config space of that device can be read
> > normally (or should be readable normally, AFAIK). The case where bus->self
> > is NULL is different, we can't read from a non-existing PCI device, but
> > it doesn't apply to pci_read_bridge_windows().
> > 
> > I don't think reading the window is the real issue here but how the
> > resource fitting algorithm corners itself by reserving space for bridge
> > windows before it knows their sizes, so basically these lines from the
> > earlier email:
> > 
> > pci 0004:00:00.0: bridge window [mem 0x7c300000-0x7c3fffff]: assigned
> > pci 0004:00:00.0: bridge window [mem 0x7c400000-0x7c4fffff 64bit pref]:
> > assigned
> > pci 0004:00:00.0: BAR 0 [mem 0x7c500000-0x7c500fff]: assigned
> > 
> > ...which seem to occur before the child buses have been scanned so that
> > space reserved is either hotplug reservation or due to "old_size" lower
> > bounding. That non-prefetchable bridge window is too small to fit the
> > child resources.
> > 
> > Could you try passing pci=hpmemsize=0M to kernel command line if that
> > helps?
> > 
> > The other case is the "old_size" in calculate_memsize() which too can
> > cause the same effect preventing sizing bridge window truly to zero when
> > it's not needed (== disable it == not assign it at all at that point).
> > Forcing it to zero would perhaps be worth a test (or removing the max()
> > related to old_size)
> > 
> > I've no idea why the old_size should decide anything, I hate that black
> > magic but I've just not dared to remove it (it's hard to know why some
> > things were made in the past, there could have been some HW issue worked
> > around by such odd feature but it's so old code that there isn't any real
> > information about whys anymore to find).
>
> Well, you did dare to mess with resource assignment sequence, and it got very
> quickly and quietly merged into linux-next causing a big regression on
> hardware that's not made by your company.. so maybe it's better not to touch
> anything there at all (:

Even if you were very likely joking here, I'm still sorry for breaking it, 
no matter which company's device.

Perhaps I'll start Cc you in all upcoming resource changes as you seem to 
be so willing to volunteer to review them. ;-D

> > pci=realloc on command line might help too, but I'm not sure. There seems
> > to be some extra space within the root bus resource so it might work.
> > 
> > I'm not sure what call chain is causing the assignment of those 3 bridge
> > windows. One easy way to find out where it comes from would be to put
> > WARN_ON(res->start == 0x7c400000); into pci_assign_resource() next to the
> > line which prints "...: assigned".
> 
> OK, I've uploaded the full big chungus logs (all with the WARN_ON):
> 
> https://owo.packett.cool/lin/pcifail.reverted.dmesg
> https://owo.packett.cool/lin/pcifail.noarg.dmesg
> https://owo.packett.cool/lin/pcifail.hpmeme.dmesg (hpmemsize didn't help)
> https://owo.packett.cool/lin/pcifail.realloc.dmesg (realloc didn't help
> either)

Thanks for the logs.

> So without your change, the assignment first comes from pci_rescan_bus →
> pci_assign_unassigned_bus_resources *via IRQ*, and then in the probe of the
> wifi driver.

There seem to be cases where pci_assign_unassigned_root_bus_resources() 
executes for bus 0000:04 before 0004:01:00.0 is scanned as it is only 
recorded later.

Hmm, qcom_pcie_global_irq_thread() seems to indicate the real enumeration 
can only occur after the link up event has arrived, and it starts another 
scan from there.

Perhaps this could be solved by inhibiting resource sizing and assignment 
per host bridge until the bus could be enumerated for real. Otherwise the 
resource assignment has no idea how the bridge windows should be sized 
which then can lead to this cornering itself if the initial assignment 
without knowledge of the necessary sizes.

Could you please try if the patch below helps?


--
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen@linux.intel.com>
Date: Fri, 10 Oct 2025 19:55:49 +0300
Subject: [PATCH 1/1] PCI: Delay resource sizing until devices can be enumerated
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

It seems pci-qcom cannot yet enumerate devices while scanning bus from
pci_host_probe(). Instead, there is IRQ waiting for a link up event
which starts the scan from qcom_pcie_global_irq_thread().

pci_host_probe() also calls pci_assign_unassigned_root_bus_resources()
which tries to size the bridge windows without any knowledge about the
attached devices and assigns them. If the assigned window is too small
to accomodate all the devices resource once they appear, the windows
may have become so pinned in place that they cannot be successfully
resized.

It is not very useful to size bridge windows without the children.
Thus, mark into the struct pci_host_bridge that delayed enumeration is
performed. Inhibit resource assignment until the enumeration is known
to be possible.

FIXME: There could be other entrypoints to resource assignment code
beyond those that include check for ->enumeration_pending but these
were the ones seen in the logs.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/controller/dwc/pcie-qcom.c |  3 +++
 drivers/pci/setup-bus.c                | 10 ++++++++++
 include/linux/pci.h                    |  1 +
 3 files changed, 14 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 294babe1816e..33683d751de0 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -1644,6 +1644,7 @@ static irqreturn_t qcom_pcie_global_irq_thread(int irq, void *data)
 		msleep(PCIE_RESET_CONFIG_WAIT_MS);
 		dev_dbg(dev, "Received Link up event. Starting enumeration!\n");
 		/* Rescan the bus to enumerate endpoint devices */
+		pp->bridge->enumeration_pending = 0;
 		pci_lock_rescan_remove();
 		pci_rescan_bus(pp->bridge->bus);
 		pci_unlock_rescan_remove();
@@ -1825,6 +1826,8 @@ static int qcom_pcie_probe(struct platform_device *pdev)
 		bridge->sysdata = cfg;
 		bridge->ops = (struct pci_ops *)&pci_qcom_ecam_ops.pci_ops;
 		bridge->msi_domain = true;
+		// FIXME: Should this be specific to just some host bridges?
+		bridge->enumeration_pending = 1;
 
 		ret = pci_host_probe(bridge);
 		if (ret)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 7853ac6999e2..a54a4dda6b60 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2266,6 +2266,7 @@ static void pci_prepare_next_assign_round(struct list_head *fail_head,
  */
 void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 {
+	struct pci_host_bridge *host = pci_find_host_bridge(bus);
 	LIST_HEAD(realloc_head);
 	/* List of resources that want additional resources */
 	struct list_head *add_list = NULL;
@@ -2275,6 +2276,10 @@ void pci_assign_unassigned_root_bus_resources(struct pci_bus *bus)
 	int pci_try_num = 1;
 	enum enable_type enable_local;
 
+	/* Wait until enumeration to avoid pinning windows with wrong sizes. */
+	if (host->enumeration_pending)
+		return;
+
 	/* Don't realloc if asked to do so */
 	enable_local = pci_realloc_detect(bus, pci_realloc_enable);
 	if (pci_realloc_enabled(enable_local)) {
@@ -2489,10 +2494,15 @@ int pci_reassign_bridge_resources(struct pci_dev *bridge, unsigned long type)
 
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
 {
+	struct pci_host_bridge *host = pci_find_host_bridge(bus);
 	struct pci_dev *dev;
 	/* List of resources that want additional resources */
 	LIST_HEAD(add_list);
 
+	/* Wait until enumeration to avoid pinning windows with wrong sizes. */
+	if (host->enumeration_pending)
+		return;
+
 	down_read(&pci_bus_sem);
 	for_each_pci_bridge(dev, bus)
 		if (pci_has_subordinate(dev))
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 59876de13860..10fb5aaecd8e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -619,6 +619,7 @@ struct pci_host_bridge {
 	unsigned int	preserve_config:1;	/* Preserve FW resource setup */
 	unsigned int	size_windows:1;		/* Enable root bus sizing */
 	unsigned int	msi_domain:1;		/* Bridge wants MSI domain */
+	unsigned int	enumeration_pending:1;	/* Delayed enumeration */
 
 	/* Resource alignment requirements */
 	resource_size_t (*align_resource)(struct pci_dev *dev,

base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
-- 
2.39.5

  reply	other threads:[~2025-10-10 17:01 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-24 13:42 [PATCH 0/2] PCI: Fix bogus resource overlaps Ilpo Järvinen
2025-09-24 13:42 ` [PATCH 1/2] PCI: Setup bridge resources earlier Ilpo Järvinen
2025-10-06  8:00   ` Val Packett
2025-10-06 10:46     ` Ilpo Järvinen
2025-10-06 20:08       ` Val Packett
2025-10-07 15:43         ` Ilpo Järvinen
2025-10-09  7:29           ` Val Packett
2025-10-10 17:01             ` Ilpo Järvinen [this message]
2025-10-12  6:29               ` Val Packett
2025-10-16  7:42           ` Manivannan Sadhasivam
2025-10-13 21:01     ` Bjorn Helgaas
2025-10-28 22:47       ` Bjorn Helgaas
2025-10-30 22:08         ` Bjorn Helgaas
2025-10-13 18:07   ` Guenter Roeck
2025-10-14 11:20     ` Ilpo Järvinen
2025-10-17 18:22   ` Bhanu Seshu Kumar Valluri
2025-10-17 18:27     ` Bhanu Seshu Kumar Valluri
2025-10-17 18:52     ` Bjorn Helgaas
2025-10-18  1:57       ` Bhanu Seshu Kumar Valluri
2025-10-20 18:46         ` Ilpo Järvinen
2025-10-27  8:10           ` Bhanu Seshu Kumar Valluri
2025-10-27 13:49             ` Ilpo Järvinen
2025-09-24 13:42 ` [PATCH 2/2] PCI: Resources outside their window must set IORESOURCE_UNSET Ilpo Järvinen
2025-09-25 21:21   ` Bjorn Helgaas
2025-09-26 12:21     ` Ilpo Järvinen
2025-09-26 19:30       ` Bjorn Helgaas
2025-09-29 10:34         ` Ilpo Järvinen
2025-09-30 15:47   ` Geert Uytterhoeven
2025-09-30 16:32     ` Ilpo Järvinen
2025-10-01 11:49       ` Geert Uytterhoeven
2025-10-01 13:06         ` Ilpo Järvinen
2025-10-01 14:08           ` Geert Uytterhoeven
2025-10-02 14:54             ` Ilpo Järvinen
2025-10-02 15:25               ` Geert Uytterhoeven
2025-10-02 16:59                 ` Ilpo Järvinen
2025-10-03  8:36                   ` Geert Uytterhoeven
2025-10-03 14:58                     ` Ilpo Järvinen
2025-10-06 10:14                       ` Geert Uytterhoeven
2025-10-06 12:37                         ` Ilpo Järvinen
2025-10-06 13:17                           ` Geert Uytterhoeven
2025-10-07 17:30                             ` Ilpo Järvinen
2025-10-08  8:40                               ` Kai-Heng Feng
2025-10-08 13:57                               ` Geert Uytterhoeven
2025-09-24 23:48 ` [PATCH 0/2] PCI: Fix bogus resource overlaps Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=094618f2-947e-a66e-dc27-f4dedc8b5bb5@linux.intel.com \
    --to=ilpo.jarvinen@linux.intel.com \
    --cc=bhelgaas@google.com \
    --cc=kw@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lucas.demarchi@intel.com \
    --cc=rafael.j.wysocki@intel.com \
    --cc=val@packett.cool \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).