public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 01/11] PCI: Prevent resource tree corruption when BAR resize fails
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 02/11] PCI/IOV: Adjust ->barsz[] when changing BAR size Ilpo Järvinen
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

pbus_reassign_bridge_resources() saves bridge windows into the saved
list before attempting to adjust resource assignments to perform a BAR
resize operation. If resource adjustments cannot be completed fully,
rollback is attempted by restoring the resource from the saved list.

The rollback, however, does not check whether the resources it restores were
assigned by the partial resize attempt. If restore changes addresses of the
resource, it can result in corrupting the resource tree.

An example of a corrupted resource tree with overlapping addresses:

  6200000000000-6203fbfffffff : pciex@620c3c0000000
    6200000000000-6203fbff0ffff : PCI Bus 0030:01
      6200020000000-62000207fffff : 0030:01:00.0
      6200000000000-6203fbff0ffff : PCI Bus 0030:02

A resource that are assigned into the resource tree must remain
unchanged. Thus, release such a resource before attempting to restore
and claim it back.

For simplicity, always do the release and claim back for the resource
even in the cases where it is restored to the same address range.

Note: this fix may "break" some cases where devices "worked" because
the resource tree corruption allowed address space double counting to
fit more resource than what can now be assigned without double
counting. The upcoming changes to BAR resizing should address those
scenarios (to the extent possible).

Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs")
Link: https://lore.kernel.org/linux-pci/67840a16-99b4-4d8c-9b5c-4721ab0970a2@hogyros.de/
Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 4a8735b275e4..e6984bb530ae 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2504,6 +2504,11 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 		bridge = dev_res->dev;
 		i = pci_resource_num(bridge, res);
 
+		if (res->parent) {
+			release_child_resources(res);
+			pci_release_resource(bridge, i);
+		}
+
 		restore_dev_resource(dev_res);
 
 		pci_claim_resource(bridge, i);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 02/11] PCI/IOV: Adjust ->barsz[] when changing BAR size
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
  2025-11-13 16:26 ` [PATCH v2 01/11] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 03/11] PCI: Change pci_dev variable from 'bridge' to 'dev' Ilpo Järvinen
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
struct pci_srvio keeps a cached copy of BAR size in unit of resource_size_t
in ->barsz[] which is not adjusted by pci_rebar_set_size() but by
pci_iov_resource_set_size(). pci_iov_resource_set_size() is called also
from pci_resize_resource_set_size().

The current arrangement is problematic once BAR resize algorithm starts to
roll back changes properly in case of a failure. The normal resource
fitting algorithm rolls back resource size using the struct
pci_dev_resource easily but also calling pci_resize_resource_set_size() or
pci_iov_resource_set_size() to roll back BAR size would be an extra burden,
whereas combining ->barsz[] update with pci_rebar_set_size() naturally
rolls back it when restoring the old BAR size on a different layer of the
BAR resize operation.

Thus, rework pci_rebar_set_size() to also update ->barsz[].

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/iov.c       | 15 ++++-----------
 drivers/pci/pci.c       |  4 ++++
 drivers/pci/pci.h       |  5 ++---
 drivers/pci/setup-res.c | 10 ++++------
 4 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 77dee43b7858..04b675e90963 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -158,8 +158,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 	return dev->sriov->barsz[pci_resource_num_to_vf_bar(resno)];
 }
 
-void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
-			       resource_size_t size)
+void pci_iov_resource_set_size(struct pci_dev *dev, int resno, int size)
 {
 	if (!pci_resource_is_iov(resno)) {
 		pci_warn(dev, "%s is not an IOV resource\n",
@@ -167,7 +166,8 @@ void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
 		return;
 	}
 
-	dev->sriov->barsz[pci_resource_num_to_vf_bar(resno)] = size;
+	resno = pci_resource_num_to_vf_bar(resno);
+	dev->sriov->barsz[resno] = pci_rebar_size_to_bytes(size);
 }
 
 bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev)
@@ -1340,7 +1340,6 @@ EXPORT_SYMBOL_GPL(pci_sriov_configure_simple);
 int pci_iov_vf_bar_set_size(struct pci_dev *dev, int resno, int size)
 {
 	u32 sizes;
-	int ret;
 
 	if (!pci_resource_is_iov(resno))
 		return -EINVAL;
@@ -1355,13 +1354,7 @@ int pci_iov_vf_bar_set_size(struct pci_dev *dev, int resno, int size)
 	if (!(sizes & BIT(size)))
 		return -EINVAL;
 
-	ret = pci_rebar_set_size(dev, resno, size);
-	if (ret)
-		return ret;
-
-	pci_iov_resource_set_size(dev, resno, pci_rebar_size_to_bytes(size));
-
-	return 0;
+	return pci_rebar_set_size(dev, resno, size);
 }
 EXPORT_SYMBOL_GPL(pci_iov_vf_bar_set_size);
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..7dfc58b0e55e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3803,6 +3803,10 @@ int pci_rebar_set_size(struct pci_dev *pdev, int bar, int size)
 	ctrl &= ~PCI_REBAR_CTRL_BAR_SIZE;
 	ctrl |= FIELD_PREP(PCI_REBAR_CTRL_BAR_SIZE, size);
 	pci_write_config_dword(pdev, pos + PCI_REBAR_CTRL, ctrl);
+
+	if (pci_resource_is_iov(bar))
+		pci_iov_resource_set_size(pdev, bar, size);
+
 	return 0;
 }
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4492b809094b..bf1a577e9623 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -808,8 +808,7 @@ void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
 int pci_iov_bus_range(struct pci_bus *bus);
-void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
-			       resource_size_t size);
+void pci_iov_resource_set_size(struct pci_dev *dev, int resno, int size);
 bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev);
 static inline u16 pci_iov_vf_rebar_cap(struct pci_dev *dev)
 {
@@ -851,7 +850,7 @@ static inline int pci_iov_bus_range(struct pci_bus *bus)
 	return 0;
 }
 static inline void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
-					     resource_size_t size) { }
+					     int size) { }
 static inline bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev)
 {
 	return false;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index c3ba4ccecd43..3d0b0b3f60c4 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -450,12 +450,10 @@ static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
 	resource_size_t res_size = pci_rebar_size_to_bytes(size);
 	struct resource *res = pci_resource_n(dev, resno);
 
-	if (!pci_resource_is_iov(resno)) {
-		resource_set_size(res, res_size);
-	} else {
-		resource_set_size(res, res_size * pci_sriov_get_totalvfs(dev));
-		pci_iov_resource_set_size(dev, resno, res_size);
-	}
+	if (pci_resource_is_iov(resno))
+		res_size *= pci_sriov_get_totalvfs(dev);
+
+	resource_set_size(res, res_size);
 }
 
 int pci_resize_resource(struct pci_dev *dev, int resno, int size)
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 03/11] PCI: Change pci_dev variable from 'bridge' to 'dev'
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
  2025-11-13 16:26 ` [PATCH v2 01/11] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 02/11] PCI/IOV: Adjust ->barsz[] when changing BAR size Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-14 14:35   ` Alex Bennée
  2025-11-13 16:26 ` [PATCH v2 04/11] PCI: Try BAR resize even when no window was released Ilpo Järvinen
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

Upcoming fix to BAR resize will store also device BAR resource in the
saved list. Change the pci_dev variable in the loop from 'bridge' to
'dev' as the former would be misleading with non-bridges in the list.

This is in a separate change to reduce churn in the upcoming BAR resize
fix.

While it appears that the logic in the loop doing pci_setup_bridge() is
altered as 'bridge' variable is no longer updated, a bridge should
never appear more than once in the saved list so the if check can only
match to the first entry. As such, the code with two distinct pci_dev
variables better represents the intention of the check compared with the
old code where bridge variable was reused for a different purpose.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index e6984bb530ae..d58f025aeaff 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2479,12 +2479,13 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 	}
 
 	list_for_each_entry(dev_res, &saved, list) {
+		struct pci_dev *dev = dev_res->dev;
+
 		/* Skip the bridge we just assigned resources for */
-		if (bridge == dev_res->dev)
+		if (bridge == dev)
 			continue;
 
-		bridge = dev_res->dev;
-		pci_setup_bridge(bridge->subordinate);
+		pci_setup_bridge(dev->subordinate);
 	}
 
 	free_list(&saved);
@@ -2500,19 +2501,19 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 	/* Revert to the old configuration */
 	list_for_each_entry(dev_res, &saved, list) {
 		struct resource *res = dev_res->res;
+		struct pci_dev *dev = dev_res->dev;
 
-		bridge = dev_res->dev;
-		i = pci_resource_num(bridge, res);
+		i = pci_resource_num(dev, res);
 
 		if (res->parent) {
 			release_child_resources(res);
-			pci_release_resource(bridge, i);
+			pci_release_resource(dev, i);
 		}
 
 		restore_dev_resource(dev_res);
 
-		pci_claim_resource(bridge, i);
-		pci_setup_bridge(bridge->subordinate);
+		pci_claim_resource(dev, i);
+		pci_setup_bridge(dev->subordinate);
 	}
 	free_list(&saved);
 	up_read(&pci_bus_sem);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 04/11] PCI: Try BAR resize even when no window was released
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (2 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 03/11] PCI: Change pci_dev variable from 'bridge' to 'dev' Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 05/11] PCI: Freeing saved list does not require holding pci_bus_sem Ilpo Järvinen
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

Usually, resizing BARs requires releasing bridge windows in order to
resize it to fit a larger BAR into the window. This is not always the
case, however, FW could have made the window large enough to accomodate
larger BAR as is, or the user might prefer to shrink a BAR to make more
space for another Resizable BAR.

Thus, replace the check that requires that at least one bridge window
was released with a check that simply ensures bridge is not NULL.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index d58f025aeaff..1a3d54563854 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2424,7 +2424,7 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 {
 	unsigned long type = res->flags;
 	struct pci_dev_resource *dev_res;
-	struct pci_dev *bridge;
+	struct pci_dev *bridge = NULL;
 	LIST_HEAD(saved);
 	LIST_HEAD(added);
 	LIST_HEAD(failed);
@@ -2459,10 +2459,8 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 		bus = bus->parent;
 	}
 
-	if (list_empty(&saved)) {
-		up_read(&pci_bus_sem);
+	if (!bridge)
 		return -ENOENT;
-	}
 
 	__pci_bus_size_bridges(bridge->subordinate, &added);
 	__pci_bridge_assign_resources(bridge, &added, &failed);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 05/11] PCI: Freeing saved list does not require holding pci_bus_sem
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (3 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 04/11] PCI: Try BAR resize even when no window was released Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

Freeing the saved list does not require holding pci_bus_sem, so the
critical section can be made shorter.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 1a3d54563854..51f5e5a80b54 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2513,8 +2513,8 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 		pci_claim_resource(dev, i);
 		pci_setup_bridge(dev->subordinate);
 	}
-	free_list(&saved);
 	up_read(&pci_bus_sem);
+	free_list(&saved);
 
 	return ret;
 }
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (4 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 05/11] PCI: Freeing saved list does not require holding pci_bus_sem Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-13 16:46   ` Ilpo Järvinen
                     ` (2 more replies)
  2025-11-13 16:26 ` [PATCH v2 07/11] PCI: Add kerneldoc for pci_resize_resource() Ilpo Järvinen
                   ` (4 subsequent siblings)
  10 siblings, 3 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

BAR resize operation is implemented in the pci_resize_resource() and
pbus_reassign_bridge_resources() functions. pci_resize_resource() can
be called either from __resource_resize_store() from sysfs or directly
by the driver for the Endpoint Device.

The pci_resize_resource() requires that caller has released the device
resources that share the bridge window with the BAR to be resized as
otherwise the bridge window is pinned in place and cannot be changed.

pbus_reassign_bridge_resources() implement rollback of the resources if
the resize operation fails, but rollback is performed only for the
bridge windows. Because releasing the device resources are done by the
caller of the BAR resize interface, these functions performing the BAR
resize do not have access to the device resources as they were before
the resize.

pbus_reassign_bridge_resources() could try to
__pci_bridge_assign_resources() after rolling back the bridge windows
as they were, however, it will not guarantee the resource are assigned
due to differences how FW and the kernel may want to assign the
resources (alignment of the start address and tail).

In order to perform rollback robustly, the BAR resize interface has to
be altered to also release the device resources that share the bridge
window with the BAR to be resized.

Also, remove restoring from the entries failed list as saved list
should now contain both the bridge windows and device resources so
the extra restore is duplicated work.

Some drivers (currently only amdgpu) want to prevent releasing some
resources. Add exclude_bars param to pci_resize_resource() and make
amdgpu to pass its register BAR (BAR 5) which should never be released
during resize operation. Normally 64-bit prefetchable resources do not
share bridge window with it as the register BAR (32-bit only) but there
are various fallbacks in the resource assignment logic which may make
the resources to share the bridge window in rare cases.

This change (together with the driver side changes) is to counter the
resource releases that had to be done to prevent resource tree
corruption in the ("PCI: Release assigned resource before restoring
them") change. As such, it likely restores functionality in cases where
device resources were released to avoid resource tree conflicts which
appeared to be "working" when such conflicts were not correctly
detected by the kernel.

Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/
Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   2 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c |   2 +-
 drivers/gpu/drm/xe/xe_vram.c                |   2 +-
 drivers/pci/pci-sysfs.c                     |  17 +---
 drivers/pci/pci.h                           |   4 +-
 drivers/pci/setup-bus.c                     | 100 +++++++++++++++-----
 drivers/pci/setup-res.c                     |  23 ++---
 include/linux/pci.h                         |   3 +-
 8 files changed, 91 insertions(+), 62 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a899fb4de29..4e241836ae68 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1736,7 +1736,7 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
 
 	pci_release_resource(adev->pdev, 0);
 
-	r = pci_resize_resource(adev->pdev, 0, rbar_size);
+	r = pci_resize_resource(adev->pdev, 0, rbar_size, 1 << 5);
 	if (r == -ENOSPC)
 		dev_info(adev->dev,
 			 "Not enough PCI address space for a large BAR.");
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 51bb27e10a4f..7699e8fcf5ed 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -37,7 +37,7 @@ _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
 
 	_release_bars(pdev);
 
-	ret = pci_resize_resource(pdev, resno, bar_size);
+	ret = pci_resize_resource(pdev, resno, bar_size, 0);
 	if (ret) {
 		drm_info(&i915->drm, "Failed to resize BAR%d to %dM (%pe)\n",
 			 resno, 1 << bar_size, ERR_PTR(ret));
diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
index b44ebf50fedb..00dd027057df 100644
--- a/drivers/gpu/drm/xe/xe_vram.c
+++ b/drivers/gpu/drm/xe/xe_vram.c
@@ -36,7 +36,7 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
 	if (pci_resource_len(pdev, resno))
 		pci_release_resource(pdev, resno);
 
-	ret = pci_resize_resource(pdev, resno, bar_size);
+	ret = pci_resize_resource(pdev, resno, bar_size, 0);
 	if (ret) {
 		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",
 			 resno, 1 << bar_size, ERR_PTR(ret));
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 9d6f74bd95f8..2a1b5456c2dc 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1599,18 +1599,13 @@ static ssize_t __resource_resize_store(struct device *dev, int n,
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct pci_bus *bus = pdev->bus;
-	struct resource *b_win, *res;
 	unsigned long size;
-	int ret, i;
+	int ret;
 	u16 cmd;
 
 	if (kstrtoul(buf, 0, &size) < 0)
 		return -EINVAL;
 
-	b_win = pbus_select_window(bus, pci_resource_n(pdev, n));
-	if (!b_win)
-		return -EINVAL;
-
 	device_lock(dev);
 	if (dev->driver || pci_num_vf(pdev)) {
 		ret = -EBUSY;
@@ -1632,15 +1627,7 @@ static ssize_t __resource_resize_store(struct device *dev, int n,
 
 	pci_remove_resource_files(pdev);
 
-	pci_dev_for_each_resource(pdev, res, i) {
-		if (i >= PCI_BRIDGE_RESOURCES)
-			break;
-
-		if (b_win == pbus_select_window(bus, res))
-			pci_release_resource(pdev, i);
-	}
-
-	ret = pci_resize_resource(pdev, n, size);
+	ret = pci_resize_resource(pdev, n, size, 0);
 
 	pci_assign_unassigned_bus_resources(bus);
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index bf1a577e9623..9893ea12d1f2 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -421,8 +421,10 @@ enum pci_bar_type {
 struct device *pci_get_host_bridge_device(struct pci_dev *dev);
 void pci_put_host_bridge_device(struct device *dev);
 
+void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size);
+int pci_do_resource_release_and_resize(struct pci_dev *dev, int resno, int size,
+				       int exclude_bars);
 unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge);
-int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res);
 int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 51f5e5a80b54..7e268960954b 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2420,18 +2420,16 @@ EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
  * release it when possible. If the bridge window contains assigned
  * resources, it cannot be released.
  */
-int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
+static int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res,
+					  struct list_head *saved)
 {
 	unsigned long type = res->flags;
 	struct pci_dev_resource *dev_res;
 	struct pci_dev *bridge = NULL;
-	LIST_HEAD(saved);
 	LIST_HEAD(added);
 	LIST_HEAD(failed);
 	unsigned int i;
-	int ret;
-
-	down_read(&pci_bus_sem);
+	int ret = 0;
 
 	while (!pci_is_root_bus(bus)) {
 		bridge = bus->self;
@@ -2443,9 +2441,9 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 
 		/* Ignore BARs which are still in use */
 		if (!res->child) {
-			ret = add_to_list(&saved, bridge, res, 0, 0);
+			ret = add_to_list(saved, bridge, res, 0, 0);
 			if (ret)
-				goto cleanup;
+				return ret;
 
 			pci_release_resource(bridge, i);
 		} else {
@@ -2468,34 +2466,78 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 		free_list(&added);
 
 	if (!list_empty(&failed)) {
-		if (pci_required_resource_failed(&failed, type)) {
+		if (pci_required_resource_failed(&failed, type))
 			ret = -ENOSPC;
-			goto cleanup;
-		}
-		/* Only resources with unrelated types failed (again) */
 		free_list(&failed);
+		if (ret)
+			return ret;
+
+		/* Only resources with unrelated types failed (again) */
 	}
 
-	list_for_each_entry(dev_res, &saved, list) {
+	list_for_each_entry(dev_res, saved, list) {
 		struct pci_dev *dev = dev_res->dev;
 
 		/* Skip the bridge we just assigned resources for */
 		if (bridge == dev)
 			continue;
 
+		if (!dev->subordinate)
+			continue;
+
 		pci_setup_bridge(dev->subordinate);
 	}
 
-	free_list(&saved);
-	up_read(&pci_bus_sem);
 	return 0;
+}
 
-cleanup:
-	/* Restore size and flags */
-	list_for_each_entry(dev_res, &failed, list)
-		restore_dev_resource(dev_res);
-	free_list(&failed);
+int pci_do_resource_release_and_resize(struct pci_dev *pdev, int resno, int size,
+				       int exclude_bars)
+{
+	struct resource *res = pci_resource_n(pdev, resno);
+	struct pci_dev_resource *dev_res;
+	struct pci_bus *bus = pdev->bus;
+	struct resource *b_win, *r;
+	LIST_HEAD(saved);
+	unsigned int i;
+	int ret = 0;
+
+	b_win = pbus_select_window(bus, res);
+	if (!b_win)
+		return -EINVAL;
+
+	pci_dev_for_each_resource(pdev, r, i) {
+		if (i >= PCI_BRIDGE_RESOURCES)
+			break;
+
+		if (exclude_bars & BIT(i))
+			continue;
 
+		if (b_win != pbus_select_window(bus, r))
+			continue;
+
+		ret = add_to_list(&saved, pdev, r, 0, 0);
+		if (ret)
+			goto restore;
+		pci_release_resource(pdev, i);
+	}
+
+	pci_resize_resource_set_size(pdev, resno, size);
+
+	if (!bus->self)
+		goto out;
+
+	down_read(&pci_bus_sem);
+	ret = pbus_reassign_bridge_resources(bus, res, &saved);
+	if (ret)
+		goto restore;
+
+out:
+	up_read(&pci_bus_sem);
+	free_list(&saved);
+	return ret;
+
+restore:
 	/* Revert to the old configuration */
 	list_for_each_entry(dev_res, &saved, list) {
 		struct resource *res = dev_res->res;
@@ -2510,13 +2552,21 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 
 		restore_dev_resource(dev_res);
 
-		pci_claim_resource(dev, i);
-		pci_setup_bridge(dev->subordinate);
-	}
-	up_read(&pci_bus_sem);
-	free_list(&saved);
+		ret = pci_claim_resource(dev, i);
+		if (ret)
+			continue;
 
-	return ret;
+		if (i < PCI_BRIDGE_RESOURCES) {
+			const char *res_name = pci_resource_name(dev, i);
+
+			pci_update_resource(dev, i);
+			pci_info(dev, "%s %pR: old value restored\n",
+				 res_name, res);
+		}
+		if (dev->subordinate)
+			pci_setup_bridge(dev->subordinate);
+	}
+	goto out;
 }
 
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 3d0b0b3f60c4..e4486d7030c0 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -444,8 +444,7 @@ static bool pci_resize_is_memory_decoding_enabled(struct pci_dev *dev,
 	return cmd & PCI_COMMAND_MEMORY;
 }
 
-static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
-					 int size)
+void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size)
 {
 	resource_size_t res_size = pci_rebar_size_to_bytes(size);
 	struct resource *res = pci_resource_n(dev, resno);
@@ -456,9 +455,9 @@ static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
 	resource_set_size(res, res_size);
 }
 
-int pci_resize_resource(struct pci_dev *dev, int resno, int size)
+int pci_resize_resource(struct pci_dev *dev, int resno, int size,
+			int exclude_bars)
 {
-	struct resource *res = pci_resource_n(dev, resno);
 	struct pci_host_bridge *host;
 	int old, ret;
 	u32 sizes;
@@ -468,10 +467,6 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 	if (host->preserve_config)
 		return -ENOTSUPP;
 
-	/* Make sure the resource isn't assigned before resizing it. */
-	if (!(res->flags & IORESOURCE_UNSET))
-		return -EBUSY;
-
 	if (pci_resize_is_memory_decoding_enabled(dev, resno))
 		return -EBUSY;
 
@@ -490,19 +485,13 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 	if (ret)
 		return ret;
 
-	pci_resize_resource_set_size(dev, resno, size);
-
-	/* Check if the new config works by trying to assign everything. */
-	if (dev->bus->self) {
-		ret = pbus_reassign_bridge_resources(dev->bus, res);
-		if (ret)
-			goto error_resize;
-	}
+	ret = pci_do_resource_release_and_resize(dev, resno, size, exclude_bars);
+	if (ret)
+		goto error_resize;
 	return 0;
 
 error_resize:
 	pci_rebar_set_size(dev, resno, old);
-	pci_resize_resource_set_size(dev, resno, old);
 	return ret;
 }
 EXPORT_SYMBOL(pci_resize_resource);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d1fdf81fbe1e..cc58abbd2b20 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1428,7 +1428,8 @@ static inline int pci_rebar_bytes_to_size(u64 bytes)
 }
 
 u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar);
-int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size);
+int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size,
+				     int exlucde_bars);
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
 bool pci_device_is_present(struct pci_dev *pdev);
 void pci_ignore_hotplug(struct pci_dev *dev);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 07/11] PCI: Add kerneldoc for pci_resize_resource()
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (5 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

As pci_resize_resource() is meant to be used also outside of PCI core,
document the interface with kerneldoc.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-res.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index e4486d7030c0..558e452fc799 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -455,6 +455,25 @@ void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size)
 	resource_set_size(res, res_size);
 }
 
+/**
+ * pci_resize_resource - reconfigure a Resizable BAR and resources
+ * @dev: the PCI device
+ * @resno: index of the BAR to be resized
+ * @size: new size as defined in the spec (0=1MB, 31=128TB)
+ * @exclude_bars: a mask of BARs that should not be released
+ *
+ * Reconfigures @resno to @size and re-runs resource assignment algorithm
+ * with the new size.
+ *
+ * Prior to resize, @dev resources that share the bridge window with @resno
+ * are released (unpins the bridge window resource to allow changing it).
+ * The caller may prevent releasing a particular BAR by providing
+ * @exclude_bars mask but it may result in the resize operation failing due
+ * to insufficient space.
+ *
+ * Return: 0 on success, or negative on error. In case of an error, the
+ *         resources are restored to their original places.
+ */
 int pci_resize_resource(struct pci_dev *dev, int resno, int size,
 			int exclude_bars)
 {
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (6 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 07/11] PCI: Add kerneldoc for pci_resize_resource() Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-14 13:10   ` Alex Bennée
  2025-11-14 15:58   ` Lucas De Marchi
  2025-11-13 16:26 ` [PATCH v2 09/11] drm/i915: " Ilpo Järvinen
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

PCI core handles releasing device's resources and their rollback in
case of failure of a BAR resizing operation. Releasing resource prior
to calling pci_resize_resource() prevents PCI core from restoring the
BARs as they were.

Remove driver-side release of BARs from the xe driver.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_vram.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
index 00dd027057df..5aacab9358a4 100644
--- a/drivers/gpu/drm/xe/xe_vram.c
+++ b/drivers/gpu/drm/xe/xe_vram.c
@@ -33,9 +33,6 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
 	int bar_size = pci_rebar_bytes_to_size(size);
 	int ret;
 
-	if (pci_resource_len(pdev, resno))
-		pci_release_resource(pdev, resno);
-
 	ret = pci_resize_resource(pdev, resno, bar_size, 0);
 	if (ret) {
 		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 09/11] drm/i915: Remove driver side BAR release before resize
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (7 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-14 15:55   ` Lucas De Marchi
  2025-11-13 16:26 ` [PATCH v2 10/11] drm/amdgpu: " Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 11/11] PCI: Prevent restoring assigned resources Ilpo Järvinen
  10 siblings, 1 reply; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

PCI core handles releasing device's resources and their rollback in
case of failure of a BAR resizing operation. Releasing resource prior
to calling pci_resize_resource() prevents PCI core from restoring the
BARs as they were.

Remove driver-side release of BARs from the i915 driver.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_region_lmem.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 7699e8fcf5ed..c37a0560ebe0 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -18,16 +18,6 @@
 #include "gt/intel_gt_regs.h"
 
 #ifdef CONFIG_64BIT
-static void _release_bars(struct pci_dev *pdev)
-{
-	int resno;
-
-	for (resno = PCI_STD_RESOURCES; resno < PCI_STD_RESOURCE_END; resno++) {
-		if (pci_resource_len(pdev, resno))
-			pci_release_resource(pdev, resno);
-	}
-}
-
 static void
 _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
 {
@@ -35,8 +25,6 @@ _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
 	int bar_size = pci_rebar_bytes_to_size(size);
 	int ret;
 
-	_release_bars(pdev);
-
 	ret = pci_resize_resource(pdev, resno, bar_size, 0);
 	if (ret) {
 		drm_info(&i915->drm, "Failed to resize BAR%d to %dM (%pe)\n",
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 10/11] drm/amdgpu: Remove driver side BAR release before resize
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (8 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 09/11] drm/i915: " Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  2025-11-13 16:26 ` [PATCH v2 11/11] PCI: Prevent restoring assigned resources Ilpo Järvinen
  10 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

PCI core handles releasing device's resources and their rollback in
case of failure of a BAR resizing operation. Releasing resource prior
to calling pci_resize_resource() prevents PCI core from restoring the
BARs as they were.

Remove driver-side release of BARs from the amdgpu driver.

Also remove the driver initiated assignment as pci_resize_resource()
should try to assign as much as possible. If the driver side call
manages to get more required resources assigned in some scenario, such
a problem should be fixed inside pci_resize_resource() instead.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4e241836ae68..f11a255786d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1729,12 +1729,8 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
 	pci_write_config_word(adev->pdev, PCI_COMMAND,
 			      cmd & ~PCI_COMMAND_MEMORY);
 
-	/* Free the VRAM and doorbell BAR, we most likely need to move both. */
+	/* Tear down doorbell as resizing will release BARs */
 	amdgpu_doorbell_fini(adev);
-	if (adev->asic_type >= CHIP_BONAIRE)
-		pci_release_resource(adev->pdev, 2);
-
-	pci_release_resource(adev->pdev, 0);
 
 	r = pci_resize_resource(adev->pdev, 0, rbar_size, 1 << 5);
 	if (r == -ENOSPC)
@@ -1743,8 +1739,6 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
 	else if (r && r != -ENOTSUPP)
 		dev_err(adev->dev, "Problem resizing BAR0 (%d).", r);
 
-	pci_assign_unassigned_bus_resources(adev->pdev->bus);
-
 	/* When the doorbell or fb BAR isn't available we have no chance of
 	 * using the device.
 	 */
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 11/11] PCI: Prevent restoring assigned resources
       [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
                   ` (9 preceding siblings ...)
  2025-11-13 16:26 ` [PATCH v2 10/11] drm/amdgpu: " Ilpo Järvinen
@ 2025-11-13 16:26 ` Ilpo Järvinen
  10 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:26 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

restore_dev_resource() copies saved addresses and flags from the struct
pci_dev_resource back to the struct resource, typically, during
rollback from a failure or in preparation for a retry attempt.

If the resource is within resource tree, the resource must not be
modified as the resource tree could be corrupted. Thus, it's a bug to
call restore_dev_resource() for assigned resources (which did happen
due to logic flaws in the BAR resize rollback).

Add WARN_ON_ONCE() into restore_dev_resource() to detect such bugs
easily and return without altering the resource to prevent corruption.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 7e268960954b..1d9fc078c7ad 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -15,6 +15,7 @@
  */
 
 #include <linux/bitops.h>
+#include <linux/bug.h>
 #include <linux/init.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -135,6 +136,9 @@ static void restore_dev_resource(struct pci_dev_resource *dev_res)
 {
 	struct resource *res = dev_res->res;
 
+	if (WARN_ON_ONCE(res->parent))
+		return;
+
 	res->start = dev_res->start;
 	res->end = dev_res->end;
 	res->flags = dev_res->flags;
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path
  2025-11-13 16:26 ` [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
@ 2025-11-13 16:46   ` Ilpo Järvinen
  2025-11-13 21:01     ` Bjorn Helgaas
  2025-11-14  9:34   ` Christian König
  2026-01-20 21:17   ` Ville Syrjälä
  2 siblings, 1 reply; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:46 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 14778 bytes --]

On Thu, 13 Nov 2025, Ilpo Järvinen wrote:

> BAR resize operation is implemented in the pci_resize_resource() and
> pbus_reassign_bridge_resources() functions. pci_resize_resource() can
> be called either from __resource_resize_store() from sysfs or directly
> by the driver for the Endpoint Device.
> 
> The pci_resize_resource() requires that caller has released the device
> resources that share the bridge window with the BAR to be resized as
> otherwise the bridge window is pinned in place and cannot be changed.
> 
> pbus_reassign_bridge_resources() implement rollback of the resources if
> the resize operation fails, but rollback is performed only for the
> bridge windows. Because releasing the device resources are done by the
> caller of the BAR resize interface, these functions performing the BAR
> resize do not have access to the device resources as they were before
> the resize.
> 
> pbus_reassign_bridge_resources() could try to
> __pci_bridge_assign_resources() after rolling back the bridge windows
> as they were, however, it will not guarantee the resource are assigned
> due to differences how FW and the kernel may want to assign the
> resources (alignment of the start address and tail).
> 
> In order to perform rollback robustly, the BAR resize interface has to
> be altered to also release the device resources that share the bridge
> window with the BAR to be resized.
> 
> Also, remove restoring from the entries failed list as saved list
> should now contain both the bridge windows and device resources so
> the extra restore is duplicated work.
> 
> Some drivers (currently only amdgpu) want to prevent releasing some
> resources. Add exclude_bars param to pci_resize_resource() and make
> amdgpu to pass its register BAR (BAR 5) which should never be released
> during resize operation. Normally 64-bit prefetchable resources do not
> share bridge window with it as the register BAR (32-bit only) but there
> are various fallbacks in the resource assignment logic which may make
> the resources to share the bridge window in rare cases.
> 
> This change (together with the driver side changes) is to counter the
> resource releases that had to be done to prevent resource tree
> corruption in the ("PCI: Release assigned resource before restoring
> them") change. As such, it likely restores functionality in cases where
> device resources were released to avoid resource tree conflicts which
> appeared to be "working" when such conflicts were not correctly
> detected by the kernel.
> 
> Link: https://lore.kernel.org/linux-pci/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/
> Link: https://lore.kernel.org/linux-pci/874irqop6b.fsf@draig.linaro.org/
> Reported-by: Simon Richter <Simon.Richter@hogyros.de>
> Reported-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   2 +-
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c |   2 +-
>  drivers/gpu/drm/xe/xe_vram.c                |   2 +-
>  drivers/pci/pci-sysfs.c                     |  17 +---
>  drivers/pci/pci.h                           |   4 +-
>  drivers/pci/setup-bus.c                     | 100 +++++++++++++++-----
>  drivers/pci/setup-res.c                     |  23 ++---
>  include/linux/pci.h                         |   3 +-
>  8 files changed, 91 insertions(+), 62 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7a899fb4de29..4e241836ae68 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1736,7 +1736,7 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>  
>  	pci_release_resource(adev->pdev, 0);
>  
> -	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> +	r = pci_resize_resource(adev->pdev, 0, rbar_size, 1 << 5);
>  	if (r == -ENOSPC)
>  		dev_info(adev->dev,
>  			 "Not enough PCI address space for a large BAR.");
> diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> index 51bb27e10a4f..7699e8fcf5ed 100644
> --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> @@ -37,7 +37,7 @@ _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
>  
>  	_release_bars(pdev);
>  
> -	ret = pci_resize_resource(pdev, resno, bar_size);
> +	ret = pci_resize_resource(pdev, resno, bar_size, 0);
>  	if (ret) {
>  		drm_info(&i915->drm, "Failed to resize BAR%d to %dM (%pe)\n",
>  			 resno, 1 << bar_size, ERR_PTR(ret));
> diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
> index b44ebf50fedb..00dd027057df 100644
> --- a/drivers/gpu/drm/xe/xe_vram.c
> +++ b/drivers/gpu/drm/xe/xe_vram.c
> @@ -36,7 +36,7 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
>  	if (pci_resource_len(pdev, resno))
>  		pci_release_resource(pdev, resno);
>  
> -	ret = pci_resize_resource(pdev, resno, bar_size);
> +	ret = pci_resize_resource(pdev, resno, bar_size, 0);
>  	if (ret) {
>  		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",
>  			 resno, 1 << bar_size, ERR_PTR(ret));
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 9d6f74bd95f8..2a1b5456c2dc 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1599,18 +1599,13 @@ static ssize_t __resource_resize_store(struct device *dev, int n,
>  {
>  	struct pci_dev *pdev = to_pci_dev(dev);
>  	struct pci_bus *bus = pdev->bus;
> -	struct resource *b_win, *res;
>  	unsigned long size;
> -	int ret, i;
> +	int ret;
>  	u16 cmd;
>  
>  	if (kstrtoul(buf, 0, &size) < 0)
>  		return -EINVAL;
>  
> -	b_win = pbus_select_window(bus, pci_resource_n(pdev, n));
> -	if (!b_win)
> -		return -EINVAL;
> -
>  	device_lock(dev);
>  	if (dev->driver || pci_num_vf(pdev)) {
>  		ret = -EBUSY;
> @@ -1632,15 +1627,7 @@ static ssize_t __resource_resize_store(struct device *dev, int n,
>  
>  	pci_remove_resource_files(pdev);
>  
> -	pci_dev_for_each_resource(pdev, res, i) {
> -		if (i >= PCI_BRIDGE_RESOURCES)
> -			break;
> -
> -		if (b_win == pbus_select_window(bus, res))
> -			pci_release_resource(pdev, i);
> -	}
> -
> -	ret = pci_resize_resource(pdev, n, size);
> +	ret = pci_resize_resource(pdev, n, size, 0);
>  
>  	pci_assign_unassigned_bus_resources(bus);
>  
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index bf1a577e9623..9893ea12d1f2 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -421,8 +421,10 @@ enum pci_bar_type {
>  struct device *pci_get_host_bridge_device(struct pci_dev *dev);
>  void pci_put_host_bridge_device(struct device *dev);
>  
> +void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size);
> +int pci_do_resource_release_and_resize(struct pci_dev *dev, int resno, int size,
> +				       int exclude_bars);
>  unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge);
> -int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res);
>  int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
>  
>  int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 51f5e5a80b54..7e268960954b 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -2420,18 +2420,16 @@ EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
>   * release it when possible. If the bridge window contains assigned
>   * resources, it cannot be released.
>   */
> -int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
> +static int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res,
> +					  struct list_head *saved)
>  {
>  	unsigned long type = res->flags;
>  	struct pci_dev_resource *dev_res;
>  	struct pci_dev *bridge = NULL;
> -	LIST_HEAD(saved);
>  	LIST_HEAD(added);
>  	LIST_HEAD(failed);
>  	unsigned int i;
> -	int ret;
> -
> -	down_read(&pci_bus_sem);
> +	int ret = 0;
>  
>  	while (!pci_is_root_bus(bus)) {
>  		bridge = bus->self;
> @@ -2443,9 +2441,9 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
>  
>  		/* Ignore BARs which are still in use */
>  		if (!res->child) {
> -			ret = add_to_list(&saved, bridge, res, 0, 0);
> +			ret = add_to_list(saved, bridge, res, 0, 0);
>  			if (ret)
> -				goto cleanup;
> +				return ret;
>  
>  			pci_release_resource(bridge, i);
>  		} else {
> @@ -2468,34 +2466,78 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
>  		free_list(&added);
>  
>  	if (!list_empty(&failed)) {
> -		if (pci_required_resource_failed(&failed, type)) {
> +		if (pci_required_resource_failed(&failed, type))
>  			ret = -ENOSPC;
> -			goto cleanup;
> -		}
> -		/* Only resources with unrelated types failed (again) */
>  		free_list(&failed);
> +		if (ret)
> +			return ret;
> +
> +		/* Only resources with unrelated types failed (again) */
>  	}
>  
> -	list_for_each_entry(dev_res, &saved, list) {
> +	list_for_each_entry(dev_res, saved, list) {
>  		struct pci_dev *dev = dev_res->dev;
>  
>  		/* Skip the bridge we just assigned resources for */
>  		if (bridge == dev)
>  			continue;
>  
> +		if (!dev->subordinate)
> +			continue;
> +
>  		pci_setup_bridge(dev->subordinate);
>  	}
>  
> -	free_list(&saved);
> -	up_read(&pci_bus_sem);
>  	return 0;
> +}
>  
> -cleanup:
> -	/* Restore size and flags */
> -	list_for_each_entry(dev_res, &failed, list)
> -		restore_dev_resource(dev_res);
> -	free_list(&failed);
> +int pci_do_resource_release_and_resize(struct pci_dev *pdev, int resno, int size,
> +				       int exclude_bars)
> +{
> +	struct resource *res = pci_resource_n(pdev, resno);
> +	struct pci_dev_resource *dev_res;
> +	struct pci_bus *bus = pdev->bus;
> +	struct resource *b_win, *r;
> +	LIST_HEAD(saved);
> +	unsigned int i;
> +	int ret = 0;
> +
> +	b_win = pbus_select_window(bus, res);
> +	if (!b_win)
> +		return -EINVAL;
> +
> +	pci_dev_for_each_resource(pdev, r, i) {
> +		if (i >= PCI_BRIDGE_RESOURCES)
> +			break;
> +
> +		if (exclude_bars & BIT(i))
> +			continue;
>  
> +		if (b_win != pbus_select_window(bus, r))
> +			continue;
> +
> +		ret = add_to_list(&saved, pdev, r, 0, 0);
> +		if (ret)
> +			goto restore;
> +		pci_release_resource(pdev, i);
> +	}
> +
> +	pci_resize_resource_set_size(pdev, resno, size);
> +
> +	if (!bus->self)
> +		goto out;
> +
> +	down_read(&pci_bus_sem);
> +	ret = pbus_reassign_bridge_resources(bus, res, &saved);
> +	if (ret)
> +		goto restore;
> +
> +out:
> +	up_read(&pci_bus_sem);
> +	free_list(&saved);
> +	return ret;
> +
> +restore:
>  	/* Revert to the old configuration */
>  	list_for_each_entry(dev_res, &saved, list) {
>  		struct resource *res = dev_res->res;
> @@ -2510,13 +2552,21 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
>  
>  		restore_dev_resource(dev_res);
>  
> -		pci_claim_resource(dev, i);
> -		pci_setup_bridge(dev->subordinate);
> -	}
> -	up_read(&pci_bus_sem);
> -	free_list(&saved);
> +		ret = pci_claim_resource(dev, i);
> +		if (ret)
> +			continue;
>  
> -	return ret;
> +		if (i < PCI_BRIDGE_RESOURCES) {
> +			const char *res_name = pci_resource_name(dev, i);
> +
> +			pci_update_resource(dev, i);
> +			pci_info(dev, "%s %pR: old value restored\n",
> +				 res_name, res);
> +		}
> +		if (dev->subordinate)
> +			pci_setup_bridge(dev->subordinate);
> +	}
> +	goto out;
>  }
>  
>  void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
> index 3d0b0b3f60c4..e4486d7030c0 100644
> --- a/drivers/pci/setup-res.c
> +++ b/drivers/pci/setup-res.c
> @@ -444,8 +444,7 @@ static bool pci_resize_is_memory_decoding_enabled(struct pci_dev *dev,
>  	return cmd & PCI_COMMAND_MEMORY;
>  }
>  
> -static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
> -					 int size)
> +void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size)
>  {
>  	resource_size_t res_size = pci_rebar_size_to_bytes(size);
>  	struct resource *res = pci_resource_n(dev, resno);
> @@ -456,9 +455,9 @@ static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
>  	resource_set_size(res, res_size);
>  }
>  
> -int pci_resize_resource(struct pci_dev *dev, int resno, int size)
> +int pci_resize_resource(struct pci_dev *dev, int resno, int size,
> +			int exclude_bars)
>  {
> -	struct resource *res = pci_resource_n(dev, resno);
>  	struct pci_host_bridge *host;
>  	int old, ret;
>  	u32 sizes;
> @@ -468,10 +467,6 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
>  	if (host->preserve_config)
>  		return -ENOTSUPP;
>  
> -	/* Make sure the resource isn't assigned before resizing it. */
> -	if (!(res->flags & IORESOURCE_UNSET))
> -		return -EBUSY;
> -
>  	if (pci_resize_is_memory_decoding_enabled(dev, resno))
>  		return -EBUSY;
>  
> @@ -490,19 +485,13 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
>  	if (ret)
>  		return ret;
>  
> -	pci_resize_resource_set_size(dev, resno, size);
> -
> -	/* Check if the new config works by trying to assign everything. */
> -	if (dev->bus->self) {
> -		ret = pbus_reassign_bridge_resources(dev->bus, res);
> -		if (ret)
> -			goto error_resize;
> -	}
> +	ret = pci_do_resource_release_and_resize(dev, resno, size, exclude_bars);
> +	if (ret)
> +		goto error_resize;
>  	return 0;
>  
>  error_resize:
>  	pci_rebar_set_size(dev, resno, old);
> -	pci_resize_resource_set_size(dev, resno, old);
>  	return ret;
>  }
>  EXPORT_SYMBOL(pci_resize_resource);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index d1fdf81fbe1e..cc58abbd2b20 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1428,7 +1428,8 @@ static inline int pci_rebar_bytes_to_size(u64 bytes)
>  }
>  
>  u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar);
> -int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size);
> +int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size,
> +				     int exlucde_bars);

It seems I managed to mistype this in the prototype, please let me know if 
you want me to send v3 addressing this and the other comments.

-- 
 i.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path
  2025-11-13 16:46   ` Ilpo Järvinen
@ 2025-11-13 21:01     ` Bjorn Helgaas
  0 siblings, 0 replies; 21+ messages in thread
From: Bjorn Helgaas @ 2025-11-13 21:01 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Bjorn Helgaas, Alex Bennée, Simon Richter, Lucas De Marchi,
	Alex Deucher, amd-gfx, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

On Thu, Nov 13, 2025 at 06:46:07PM +0200, Ilpo Järvinen wrote:
> On Thu, 13 Nov 2025, Ilpo Järvinen wrote:

> > -int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size);
> > +int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size,
> > +				     int exlucde_bars);
> 
> It seems I managed to mistype this in the prototype, please let me know if 
> you want me to send v3 addressing this and the other comments.

Fixed this one, thanks.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path
  2025-11-13 16:26 ` [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
  2025-11-13 16:46   ` Ilpo Järvinen
@ 2025-11-14  9:34   ` Christian König
  2026-01-20 21:17   ` Ville Syrjälä
  2 siblings, 0 replies; 21+ messages in thread
From: Christian König @ 2025-11-14  9:34 UTC (permalink / raw)
  To: Ilpo Järvinen, Alex Bennée, Simon Richter,
	Lucas De Marchi, Alex Deucher, amd-gfx, Bjorn Helgaas,
	David Airlie, dri-devel, intel-gfx, intel-xe, Jani Nikula,
	Joonas Lahtinen, linux-pci, Rodrigo Vivi, Simona Vetter,
	Tvrtko Ursulin, Thomas Hellström, Michał Winiarski,
	linux-kernel

Hi,

On 11/13/25 17:26, Ilpo Järvinen wrote:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7a899fb4de29..4e241836ae68 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1736,7 +1736,7 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>  
>  	pci_release_resource(adev->pdev, 0);
>  
> -	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> +	r = pci_resize_resource(adev->pdev, 0, rbar_size, 1 << 5);

It is HW generation specific if that is BAR 5 or 2 which needs protection.

There is an "if (adev->asic_type >= CHIP_BONAIRE) pci_release_resource(...);" which depends on the same criteria.

Something like this should work:

if (adev->asic_type >= CHIP_BONAIRE)
	r = pci_resize_resource(adev->pdev, 0, rbar_size, 1 << 5);
else
	r = pci_resize_resource(adev->pdev, 0, rbar_size, 1 << 2);

Apart from that the patch set looks really good to me and is certainly a very nice cleanup.

We can potentially also remove a bunch of the extra checks in amdgpu. Going to take care of it after this set landed.

Thanks,
Christian.

>  	if (r == -ENOSPC)
>  		dev_info(adev->dev,
>  			 "Not enough PCI address space for a large BAR.");

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize
  2025-11-13 16:26 ` [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
@ 2025-11-14 13:10   ` Alex Bennée
  2025-11-14 13:16     ` Ilpo Järvinen
  2025-11-14 15:58   ` Lucas De Marchi
  1 sibling, 1 reply; 21+ messages in thread
From: Alex Bennée @ 2025-11-14 13:10 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Simon Richter, Lucas De Marchi, Alex Deucher, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> writes:

> PCI core handles releasing device's resources and their rollback in
> case of failure of a BAR resizing operation. Releasing resource prior
> to calling pci_resize_resource() prevents PCI core from restoring the
> BARs as they were.
>
> Remove driver-side release of BARs from the xe driver.
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> ---
>  drivers/gpu/drm/xe/xe_vram.c | 3 ---
>  1 file changed, 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
> index 00dd027057df..5aacab9358a4 100644
> --- a/drivers/gpu/drm/xe/xe_vram.c
> +++ b/drivers/gpu/drm/xe/xe_vram.c
> @@ -33,9 +33,6 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
>  	int bar_size = pci_rebar_bytes_to_size(size);
>  	int ret;
>  
> -	if (pci_resource_len(pdev, resno))
> -		pci_release_resource(pdev, resno);
> -
>  	ret = pci_resize_resource(pdev, resno, bar_size, 0);
>  	if (ret) {
>  		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",

This didn't apply, I assume due to a clash with:

  d30203739be79 (drm/xe: Move rebar to be done earlier)

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize
  2025-11-14 13:10   ` Alex Bennée
@ 2025-11-14 13:16     ` Ilpo Järvinen
  0 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2025-11-14 13:16 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Simon Richter, Lucas De Marchi, Alex Deucher, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 1797 bytes --]

On Fri, 14 Nov 2025, Alex Bennée wrote:

> Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> writes:
> 
> > PCI core handles releasing device's resources and their rollback in
> > case of failure of a BAR resizing operation. Releasing resource prior
> > to calling pci_resize_resource() prevents PCI core from restoring the
> > BARs as they were.
> >
> > Remove driver-side release of BARs from the xe driver.
> >
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > ---
> >  drivers/gpu/drm/xe/xe_vram.c | 3 ---
> >  1 file changed, 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
> > index 00dd027057df..5aacab9358a4 100644
> > --- a/drivers/gpu/drm/xe/xe_vram.c
> > +++ b/drivers/gpu/drm/xe/xe_vram.c
> > @@ -33,9 +33,6 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
> >  	int bar_size = pci_rebar_bytes_to_size(size);
> >  	int ret;
> >  
> > -	if (pci_resource_len(pdev, resno))
> > -		pci_release_resource(pdev, resno);
> > -
> >  	ret = pci_resize_resource(pdev, resno, bar_size, 0);
> >  	if (ret) {
> >  		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",
> 
> This didn't apply, I assume due to a clash with:
> 
>   d30203739be79 (drm/xe: Move rebar to be done earlier)

The xe driver changes do not matter if you using only amdgpu.

We know those xe changes in the drm tree conflict as the need for this 
BAR resizing rework was not know when the xe changes were made. The 
resolution is just to remove the release_bars() function from xe driver 
completely as BAR releasing prior to resize is now handled by 
pci_resize_resource().

-- 
 i.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 03/11] PCI: Change pci_dev variable from 'bridge' to 'dev'
  2025-11-13 16:26 ` [PATCH v2 03/11] PCI: Change pci_dev variable from 'bridge' to 'dev' Ilpo Järvinen
@ 2025-11-14 14:35   ` Alex Bennée
  0 siblings, 0 replies; 21+ messages in thread
From: Alex Bennée @ 2025-11-14 14:35 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Simon Richter, Lucas De Marchi, Alex Deucher, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> writes:

> Upcoming fix to BAR resize will store also device BAR resource in the
> saved list. Change the pci_dev variable in the loop from 'bridge' to
> 'dev' as the former would be misleading with non-bridges in the list.
>
> This is in a separate change to reduce churn in the upcoming BAR resize
> fix.
>
> While it appears that the logic in the loop doing pci_setup_bridge() is
> altered as 'bridge' variable is no longer updated, a bridge should
> never appear more than once in the saved list so the if check can only
> match to the first entry. As such, the code with two distinct pci_dev
> variables better represents the intention of the check compared with the
> old code where bridge variable was reused for a different purpose.
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 09/11] drm/i915: Remove driver side BAR release before resize
  2025-11-13 16:26 ` [PATCH v2 09/11] drm/i915: " Ilpo Järvinen
@ 2025-11-14 15:55   ` Lucas De Marchi
  0 siblings, 0 replies; 21+ messages in thread
From: Lucas De Marchi @ 2025-11-14 15:55 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Alex Deucher, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

On Thu, Nov 13, 2025 at 06:26:26PM +0200, Ilpo Järvinen wrote:
>PCI core handles releasing device's resources and their rollback in
>case of failure of a BAR resizing operation. Releasing resource prior
>to calling pci_resize_resource() prevents PCI core from restoring the
>BARs as they were.
>
>Remove driver-side release of BARs from the i915 driver.
>
>Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

I also applied it to drm-tip from today and submitted for xe and i915
tests: https://lore.kernel.org/all/20251114154338.4161137-2-lucas.demarchi@intel.com

pass by comment... the forcewake handling in the middle of BAR resizing
is insane but it's probably because it's already very late in the probe
sequence. Moving it to early probe like was done in xe would allow it to
be simplified too.

Lucas De Marchi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize
  2025-11-13 16:26 ` [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
  2025-11-14 13:10   ` Alex Bennée
@ 2025-11-14 15:58   ` Lucas De Marchi
  1 sibling, 0 replies; 21+ messages in thread
From: Lucas De Marchi @ 2025-11-14 15:58 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Alex Deucher, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

On Thu, Nov 13, 2025 at 06:26:25PM +0200, Ilpo Järvinen wrote:
>PCI core handles releasing device's resources and their rollback in
>case of failure of a BAR resizing operation. Releasing resource prior
>to calling pci_resize_resource() prevents PCI core from restoring the
>BARs as they were.
>
>Remove driver-side release of BARs from the xe driver.
>
>Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>Cc: Lucas De Marchi <lucas.demarchi@intel.com>


Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>

as noted in the previous version, the conflict with drm-xe-next is
trivial: just drop the release_bars() function.

Lucas De Marchi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path
  2025-11-13 16:26 ` [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
  2025-11-13 16:46   ` Ilpo Järvinen
  2025-11-14  9:34   ` Christian König
@ 2026-01-20 21:17   ` Ville Syrjälä
  2026-01-21 10:02     ` Ilpo Järvinen
  2 siblings, 1 reply; 21+ messages in thread
From: Ville Syrjälä @ 2026-01-20 21:17 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

On Thu, Nov 13, 2025 at 06:26:23PM +0200, Ilpo Järvinen wrote:
> BAR resize operation is implemented in the pci_resize_resource() and
> pbus_reassign_bridge_resources() functions. pci_resize_resource() can
> be called either from __resource_resize_store() from sysfs or directly
> by the driver for the Endpoint Device.
> 
> The pci_resize_resource() requires that caller has released the device
> resources that share the bridge window with the BAR to be resized as
> otherwise the bridge window is pinned in place and cannot be changed.
> 
> pbus_reassign_bridge_resources() implement rollback of the resources if
> the resize operation fails, but rollback is performed only for the
> bridge windows. Because releasing the device resources are done by the
> caller of the BAR resize interface, these functions performing the BAR
> resize do not have access to the device resources as they were before
> the resize.
> 
> pbus_reassign_bridge_resources() could try to
> __pci_bridge_assign_resources() after rolling back the bridge windows
> as they were, however, it will not guarantee the resource are assigned
> due to differences how FW and the kernel may want to assign the
> resources (alignment of the start address and tail).
> 
> In order to perform rollback robustly, the BAR resize interface has to
> be altered to also release the device resources that share the bridge
> window with the BAR to be resized.
> 
> Also, remove restoring from the entries failed list as saved list
> should now contain both the bridge windows and device resources so
> the extra restore is duplicated work.
> 
> Some drivers (currently only amdgpu) want to prevent releasing some
> resources. Add exclude_bars param to pci_resize_resource() and make
> amdgpu to pass its register BAR (BAR 5) which should never be released
> during resize operation. Normally 64-bit prefetchable resources do not
> share bridge window with it as the register BAR (32-bit only) but there
> are various fallbacks in the resource assignment logic which may make
> the resources to share the bridge window in rare cases.
> 
> This change (together with the driver side changes) is to counter the
> resource releases that had to be done to prevent resource tree
> corruption in the ("PCI: Release assigned resource before restoring
> them") change. As such, it likely restores functionality in cases where
> device resources were released to avoid resource tree conflicts which
> appeared to be "working" when such conflicts were not correctly
> detected by the kernel.

This thing completely broke my DG2 + non-reBAR system.  The bisect
points to commit 4efaa80b3d75 ("drm/i915: Remove driver side BAR
release before resize") but the real problems seem to be in this
patch. A had a quick look at the code and spotted a few issues...

<snip>
> @@ -2468,34 +2466,78 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
>  		free_list(&added);
>  
>  	if (!list_empty(&failed)) {
> -		if (pci_required_resource_failed(&failed, type)) {
> +		if (pci_required_resource_failed(&failed, type))
>  			ret = -ENOSPC;
> -			goto cleanup;
> -		}
> -		/* Only resources with unrelated types failed (again) */
>  		free_list(&failed);
> +		if (ret)
> +			return ret;
> +
> +		/* Only resources with unrelated types failed (again) */
>  	}
>  
> -	list_for_each_entry(dev_res, &saved, list) {
> +	list_for_each_entry(dev_res, saved, list) {
>  		struct pci_dev *dev = dev_res->dev;
>  
>  		/* Skip the bridge we just assigned resources for */
>  		if (bridge == dev)
>  			continue;
>  
> +		if (!dev->subordinate)
> +			continue;
> +
>  		pci_setup_bridge(dev->subordinate);
>  	}
>  
> -	free_list(&saved);
> -	up_read(&pci_bus_sem);
>  	return 0;
> +}
>  
> -cleanup:
> -	/* Restore size and flags */
> -	list_for_each_entry(dev_res, &failed, list)
> -		restore_dev_resource(dev_res);
> -	free_list(&failed);
> +int pci_do_resource_release_and_resize(struct pci_dev *pdev, int resno, int size,
> +				       int exclude_bars)
> +{
> +	struct resource *res = pci_resource_n(pdev, resno);
> +	struct pci_dev_resource *dev_res;
> +	struct pci_bus *bus = pdev->bus;
> +	struct resource *b_win, *r;
> +	LIST_HEAD(saved);
> +	unsigned int i;
> +	int ret = 0;
> +
> +	b_win = pbus_select_window(bus, res);
> +	if (!b_win)
> +		return -EINVAL;
> +
> +	pci_dev_for_each_resource(pdev, r, i) {
> +		if (i >= PCI_BRIDGE_RESOURCES)
> +			break;
> +
> +		if (exclude_bars & BIT(i))
> +			continue;
>  
> +		if (b_win != pbus_select_window(bus, r))
> +			continue;
> +
> +		ret = add_to_list(&saved, pdev, r, 0, 0);
> +		if (ret)
> +			goto restore;
> +		pci_release_resource(pdev, i);
> +	}
> +
> +	pci_resize_resource_set_size(pdev, resno, size);
> +
> +	if (!bus->self)
> +		goto out;
> +
> +	down_read(&pci_bus_sem);
> +	ret = pbus_reassign_bridge_resources(bus, res, &saved);
> +	if (ret)
> +		goto restore;
> +
> +out:
> +	up_read(&pci_bus_sem);
> +	free_list(&saved);
> +	return ret;
> +
> +restore:
>  	/* Revert to the old configuration */
>  	list_for_each_entry(dev_res, &saved, list) {
>  		struct resource *res = dev_res->res;
> @@ -2510,13 +2552,21 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
>  
>  		restore_dev_resource(dev_res);
>  
> -		pci_claim_resource(dev, i);
> -		pci_setup_bridge(dev->subordinate);
> -	}
> -	up_read(&pci_bus_sem);
> -	free_list(&saved);
> +		ret = pci_claim_resource(dev, i);
> +		if (ret)
> +			continue;

This clobbers 'ret' was supposedly meant to be returned to the
caller in the end. Thus (at least in my case) the caller always
sees a return value of 0, and then it forgets to restores the
reBAR setting back to the original value.

>  
> -	return ret;
> +		if (i < PCI_BRIDGE_RESOURCES) {
> +			const char *res_name = pci_resource_name(dev, i);
> +
> +			pci_update_resource(dev, i);
> +			pci_info(dev, "%s %pR: old value restored\n",
> +				 res_name, res);
> +		}
> +		if (dev->subordinate)
> +			pci_setup_bridge(dev->subordinate);
> +	}
> +	goto out;
>  }
>  
>  void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
> index 3d0b0b3f60c4..e4486d7030c0 100644
> --- a/drivers/pci/setup-res.c
> +++ b/drivers/pci/setup-res.c
> @@ -444,8 +444,7 @@ static bool pci_resize_is_memory_decoding_enabled(struct pci_dev *dev,
>  	return cmd & PCI_COMMAND_MEMORY;
>  }
>  
> -static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
> -					 int size)
> +void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size)
>  {
>  	resource_size_t res_size = pci_rebar_size_to_bytes(size);
>  	struct resource *res = pci_resource_n(dev, resno);
> @@ -456,9 +455,9 @@ static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
>  	resource_set_size(res, res_size);
>  }
>  
> -int pci_resize_resource(struct pci_dev *dev, int resno, int size)
> +int pci_resize_resource(struct pci_dev *dev, int resno, int size,
> +			int exclude_bars)
>  {
> -	struct resource *res = pci_resource_n(dev, resno);
>  	struct pci_host_bridge *host;
>  	int old, ret;
>  	u32 sizes;
> @@ -468,10 +467,6 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
>  	if (host->preserve_config)
>  		return -ENOTSUPP;
>  
> -	/* Make sure the resource isn't assigned before resizing it. */
> -	if (!(res->flags & IORESOURCE_UNSET))
> -		return -EBUSY;
> -
>  	if (pci_resize_is_memory_decoding_enabled(dev, resno))
>  		return -EBUSY;
>  
> @@ -490,19 +485,13 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
>  	if (ret)
>  		return ret;
>  
> -	pci_resize_resource_set_size(dev, resno, size);
> -
> -	/* Check if the new config works by trying to assign everything. */
> -	if (dev->bus->self) {
> -		ret = pbus_reassign_bridge_resources(dev->bus, res);
> -		if (ret)
> -			goto error_resize;
> -	}
> +	ret = pci_do_resource_release_and_resize(dev, resno, size, exclude_bars);
> +	if (ret)
> +		goto error_resize;
>  	return 0;
>  
>  error_resize:
>  	pci_rebar_set_size(dev, resno, old);
> -	pci_resize_resource_set_size(dev, resno, old);

This new order is very broken because the new reBAR setting
may have turned some of the originally set bits in the BAR
to read-only. Thus we may not be able to restore the BAR back
to the original value.

In my case the original settings are 256MiB / 0x4030000000,
followed by a failed resize to 8GiB, and finally we see a
failed restore 'BAR 2: error updating (0x3000000c != 0x0000000c)'
due to the read-only bits.

i915 limps along after the failure but nothing really works,
and xe just completely explodes.

>  	return ret;
>  }
>  EXPORT_SYMBOL(pci_resize_resource);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index d1fdf81fbe1e..cc58abbd2b20 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1428,7 +1428,8 @@ static inline int pci_rebar_bytes_to_size(u64 bytes)
>  }
>  
>  u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar);
> -int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size);
> +int __must_check pci_resize_resource(struct pci_dev *dev, int i, int size,
> +				     int exlucde_bars);
>  int pci_select_bars(struct pci_dev *dev, unsigned long flags);
>  bool pci_device_is_present(struct pci_dev *pdev);
>  void pci_ignore_hotplug(struct pci_dev *dev);
> -- 
> 2.39.5

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path
  2026-01-20 21:17   ` Ville Syrjälä
@ 2026-01-21 10:02     ` Ilpo Järvinen
  0 siblings, 0 replies; 21+ messages in thread
From: Ilpo Järvinen @ 2026-01-21 10:02 UTC (permalink / raw)
  To: Ville Syrjälä
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 9727 bytes --]

On Tue, 20 Jan 2026, Ville Syrjälä wrote:

> On Thu, Nov 13, 2025 at 06:26:23PM +0200, Ilpo Järvinen wrote:
> > BAR resize operation is implemented in the pci_resize_resource() and
> > pbus_reassign_bridge_resources() functions. pci_resize_resource() can
> > be called either from __resource_resize_store() from sysfs or directly
> > by the driver for the Endpoint Device.
> > 
> > The pci_resize_resource() requires that caller has released the device
> > resources that share the bridge window with the BAR to be resized as
> > otherwise the bridge window is pinned in place and cannot be changed.
> > 
> > pbus_reassign_bridge_resources() implement rollback of the resources if
> > the resize operation fails, but rollback is performed only for the
> > bridge windows. Because releasing the device resources are done by the
> > caller of the BAR resize interface, these functions performing the BAR
> > resize do not have access to the device resources as they were before
> > the resize.
> > 
> > pbus_reassign_bridge_resources() could try to
> > __pci_bridge_assign_resources() after rolling back the bridge windows
> > as they were, however, it will not guarantee the resource are assigned
> > due to differences how FW and the kernel may want to assign the
> > resources (alignment of the start address and tail).
> > 
> > In order to perform rollback robustly, the BAR resize interface has to
> > be altered to also release the device resources that share the bridge
> > window with the BAR to be resized.
> > 
> > Also, remove restoring from the entries failed list as saved list
> > should now contain both the bridge windows and device resources so
> > the extra restore is duplicated work.
> > 
> > Some drivers (currently only amdgpu) want to prevent releasing some
> > resources. Add exclude_bars param to pci_resize_resource() and make
> > amdgpu to pass its register BAR (BAR 5) which should never be released
> > during resize operation. Normally 64-bit prefetchable resources do not
> > share bridge window with it as the register BAR (32-bit only) but there
> > are various fallbacks in the resource assignment logic which may make
> > the resources to share the bridge window in rare cases.
> > 
> > This change (together with the driver side changes) is to counter the
> > resource releases that had to be done to prevent resource tree
> > corruption in the ("PCI: Release assigned resource before restoring
> > them") change. As such, it likely restores functionality in cases where
> > device resources were released to avoid resource tree conflicts which
> > appeared to be "working" when such conflicts were not correctly
> > detected by the kernel.
> 
> This thing completely broke my DG2 + non-reBAR system.  The bisect
> points to commit 4efaa80b3d75 ("drm/i915: Remove driver side BAR
> release before resize") but the real problems seem to be in this
> patch. A had a quick look at the code and spotted a few issues...
> 
> <snip>
> > @@ -2468,34 +2466,78 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
> >  		free_list(&added);
> >  
> >  	if (!list_empty(&failed)) {
> > -		if (pci_required_resource_failed(&failed, type)) {
> > +		if (pci_required_resource_failed(&failed, type))
> >  			ret = -ENOSPC;
> > -			goto cleanup;
> > -		}
> > -		/* Only resources with unrelated types failed (again) */
> >  		free_list(&failed);
> > +		if (ret)
> > +			return ret;
> > +
> > +		/* Only resources with unrelated types failed (again) */
> >  	}
> >  
> > -	list_for_each_entry(dev_res, &saved, list) {
> > +	list_for_each_entry(dev_res, saved, list) {
> >  		struct pci_dev *dev = dev_res->dev;
> >  
> >  		/* Skip the bridge we just assigned resources for */
> >  		if (bridge == dev)
> >  			continue;
> >  
> > +		if (!dev->subordinate)
> > +			continue;
> > +
> >  		pci_setup_bridge(dev->subordinate);
> >  	}
> >  
> > -	free_list(&saved);
> > -	up_read(&pci_bus_sem);
> >  	return 0;
> > +}
> >  
> > -cleanup:
> > -	/* Restore size and flags */
> > -	list_for_each_entry(dev_res, &failed, list)
> > -		restore_dev_resource(dev_res);
> > -	free_list(&failed);
> > +int pci_do_resource_release_and_resize(struct pci_dev *pdev, int resno, int size,
> > +				       int exclude_bars)
> > +{
> > +	struct resource *res = pci_resource_n(pdev, resno);
> > +	struct pci_dev_resource *dev_res;
> > +	struct pci_bus *bus = pdev->bus;
> > +	struct resource *b_win, *r;
> > +	LIST_HEAD(saved);
> > +	unsigned int i;
> > +	int ret = 0;
> > +
> > +	b_win = pbus_select_window(bus, res);
> > +	if (!b_win)
> > +		return -EINVAL;
> > +
> > +	pci_dev_for_each_resource(pdev, r, i) {
> > +		if (i >= PCI_BRIDGE_RESOURCES)
> > +			break;
> > +
> > +		if (exclude_bars & BIT(i))
> > +			continue;
> >  
> > +		if (b_win != pbus_select_window(bus, r))
> > +			continue;
> > +
> > +		ret = add_to_list(&saved, pdev, r, 0, 0);
> > +		if (ret)
> > +			goto restore;
> > +		pci_release_resource(pdev, i);
> > +	}
> > +
> > +	pci_resize_resource_set_size(pdev, resno, size);
> > +
> > +	if (!bus->self)
> > +		goto out;
> > +
> > +	down_read(&pci_bus_sem);
> > +	ret = pbus_reassign_bridge_resources(bus, res, &saved);
> > +	if (ret)
> > +		goto restore;
> > +
> > +out:
> > +	up_read(&pci_bus_sem);
> > +	free_list(&saved);
> > +	return ret;
> > +
> > +restore:
> >  	/* Revert to the old configuration */
> >  	list_for_each_entry(dev_res, &saved, list) {
> >  		struct resource *res = dev_res->res;
> > @@ -2510,13 +2552,21 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
> >  
> >  		restore_dev_resource(dev_res);
> >  
> > -		pci_claim_resource(dev, i);
> > -		pci_setup_bridge(dev->subordinate);
> > -	}
> > -	up_read(&pci_bus_sem);
> > -	free_list(&saved);
> > +		ret = pci_claim_resource(dev, i);
> > +		if (ret)
> > +			continue;
> 
> This clobbers 'ret' was supposedly meant to be returned to the
> caller in the end. Thus (at least in my case) the caller always
> sees a return value of 0, and then it forgets to restores the
> reBAR setting back to the original value.

Thanks for the report.

Yes, you're right it's wrong, I'll move that call inside the if.

> > -	return ret;
> > +		if (i < PCI_BRIDGE_RESOURCES) {
> > +			const char *res_name = pci_resource_name(dev, i);
> > +
> > +			pci_update_resource(dev, i);
> > +			pci_info(dev, "%s %pR: old value restored\n",
> > +				 res_name, res);
> > +		}
> > +		if (dev->subordinate)
> > +			pci_setup_bridge(dev->subordinate);
> > +	}
> > +	goto out;
> >  }
> >  
> >  void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
> > diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
> > index 3d0b0b3f60c4..e4486d7030c0 100644
> > --- a/drivers/pci/setup-res.c
> > +++ b/drivers/pci/setup-res.c
> > @@ -444,8 +444,7 @@ static bool pci_resize_is_memory_decoding_enabled(struct pci_dev *dev,
> >  	return cmd & PCI_COMMAND_MEMORY;
> >  }
> >  
> > -static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
> > -					 int size)
> > +void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size)
> >  {
> >  	resource_size_t res_size = pci_rebar_size_to_bytes(size);
> >  	struct resource *res = pci_resource_n(dev, resno);
> > @@ -456,9 +455,9 @@ static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
> >  	resource_set_size(res, res_size);
> >  }
> >  
> > -int pci_resize_resource(struct pci_dev *dev, int resno, int size)
> > +int pci_resize_resource(struct pci_dev *dev, int resno, int size,
> > +			int exclude_bars)
> >  {
> > -	struct resource *res = pci_resource_n(dev, resno);
> >  	struct pci_host_bridge *host;
> >  	int old, ret;
> >  	u32 sizes;
> > @@ -468,10 +467,6 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
> >  	if (host->preserve_config)
> >  		return -ENOTSUPP;
> >  
> > -	/* Make sure the resource isn't assigned before resizing it. */
> > -	if (!(res->flags & IORESOURCE_UNSET))
> > -		return -EBUSY;
> > -
> >  	if (pci_resize_is_memory_decoding_enabled(dev, resno))
> >  		return -EBUSY;
> >  
> > @@ -490,19 +485,13 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
> >  	if (ret)
> >  		return ret;
> >  
> > -	pci_resize_resource_set_size(dev, resno, size);
> > -
> > -	/* Check if the new config works by trying to assign everything. */
> > -	if (dev->bus->self) {
> > -		ret = pbus_reassign_bridge_resources(dev->bus, res);
> > -		if (ret)
> > -			goto error_resize;
> > -	}
> > +	ret = pci_do_resource_release_and_resize(dev, resno, size, exclude_bars);
> > +	if (ret)
> > +		goto error_resize;
> >  	return 0;
> >  
> >  error_resize:
> >  	pci_rebar_set_size(dev, resno, old);
> > -	pci_resize_resource_set_size(dev, resno, old);
> 
> This new order is very broken because the new reBAR setting
> may have turned some of the originally set bits in the BAR
> to read-only. Thus we may not be able to restore the BAR back
> to the original value.
> 
> In my case the original settings are 256MiB / 0x4030000000,
> followed by a failed resize to 8GiB, and finally we see a
> failed restore 'BAR 2: error updating (0x3000000c != 0x0000000c)'
> due to the read-only bits.
> 
> i915 limps along after the failure but nothing really works,
> and xe just completely explodes.

Hmm, I certainly didn't foresee this happening. It seems I need to break 
the nice rebar/setup-bus layering to fix this (by moving the 
pci_rebar_set_size() calls into pci_do_resource_release_and_resize()).


-- 
 i.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2026-01-21 10:02 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20251113162628.5946-1-ilpo.jarvinen@linux.intel.com>
2025-11-13 16:26 ` [PATCH v2 01/11] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
2025-11-13 16:26 ` [PATCH v2 02/11] PCI/IOV: Adjust ->barsz[] when changing BAR size Ilpo Järvinen
2025-11-13 16:26 ` [PATCH v2 03/11] PCI: Change pci_dev variable from 'bridge' to 'dev' Ilpo Järvinen
2025-11-14 14:35   ` Alex Bennée
2025-11-13 16:26 ` [PATCH v2 04/11] PCI: Try BAR resize even when no window was released Ilpo Järvinen
2025-11-13 16:26 ` [PATCH v2 05/11] PCI: Freeing saved list does not require holding pci_bus_sem Ilpo Järvinen
2025-11-13 16:26 ` [PATCH v2 06/11] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
2025-11-13 16:46   ` Ilpo Järvinen
2025-11-13 21:01     ` Bjorn Helgaas
2025-11-14  9:34   ` Christian König
2026-01-20 21:17   ` Ville Syrjälä
2026-01-21 10:02     ` Ilpo Järvinen
2025-11-13 16:26 ` [PATCH v2 07/11] PCI: Add kerneldoc for pci_resize_resource() Ilpo Järvinen
2025-11-13 16:26 ` [PATCH v2 08/11] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
2025-11-14 13:10   ` Alex Bennée
2025-11-14 13:16     ` Ilpo Järvinen
2025-11-14 15:58   ` Lucas De Marchi
2025-11-13 16:26 ` [PATCH v2 09/11] drm/i915: " Ilpo Järvinen
2025-11-14 15:55   ` Lucas De Marchi
2025-11-13 16:26 ` [PATCH v2 10/11] drm/amdgpu: " Ilpo Järvinen
2025-11-13 16:26 ` [PATCH v2 11/11] PCI: Prevent restoring assigned resources Ilpo Järvinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox