public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] PCI: BAR resizing fix/rework
@ 2025-10-28 17:35 Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
                   ` (8 more replies)
  0 siblings, 9 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski
  Cc: linux-kernel, Ilpo Järvinen

Simon and Alex, could you please test if this series eliminates the
claim conflicts and makes the BAR resize either succeed or not break
things while rolling back resource changes? It should be tested without
other fix patches (from me; if you need some random unrelated fix,
that's okay).

Hi all,

Thanks to issue reports from Simon Richter and Alex Bennée, I
discovered BAR resize rollback can corrupt the resource tree. As fixing
corruption requires avoiding overlapping resource assignments, the
correct fix can unfortunately results in worse user experience, what
appeared to be "working" previously might no longer do so. Thus, I had
to do a larger rework to pci_resize_resource() in order to properly
restore resource states as it was prior to BAR resize.

This rework has been on my TODO list anyway but it wasn't the highest
prio item until pci_resize_resource() started to cause regressions due
to other resource assignment algorithm changes.

BAR resize rollback does not always restore BAR resources as they were
before the resize operation was started. Currently, when
pci_resize_resource() call is made by a driver, the driver must release
device resource prior to the call. This is a design flaw in
pci_resize_resource() API as PCI core cannot then save the state of
those resources from what it was prior to release so it could restore
them later if the BAR size change has to be rolled back.

PCI core's BAR resize operation doesn't even attempt to restore the
device resources currently when rolling back BAR resize operation. If
the normal resource assignment algorithm assigned those resources, then
device resources might be assigned after pci_resize_resource() call but
that could also trigger the resource tree corruption issue so what
appeared to an user as "working" might be a corrupted state.

With the new pci_resize_resource() interface, the driver calling
pci_resize_resource() should no longer release the device resources.

I've added WARN_ON_ONCE() to pick up similar bugs that cause resource
tree corruption. At least in my tests all looked clear on that front
after this series.

I was a bit on the edge how to split this series. Between patches 1 and
5-8, there might be cases where user experience is made worse if only
part of the series are applied. But at the same time I was hesitant to
merge all those changes together either as the changes way easier to
understand when split properly. Personally I think BAR resize rollback
code has not really functioned okay prior to series at all because
touching an assigned resource on the rollback path is a bug, plain and
simple. If that got things "working" it's still a bad bug (that one can
get lucky and corruption results in non-corrupted numbers doesn't make
it any better). If those patches need to be merged into one, just let
me know and I can rearrange the patch order to make it easier.

This series will conflict what's in pci/rebar and likely with some xe
changes from Lucas De Marchi that might also be rendered in part
unnecessary due to pci_resize_resource() API change. My suggestion is
that this series takes precedence over what's in pci/rebar to make
things easier for stable people (I can rebase the pci/rebar patches on
top of these so feel free to drop those other patches, if needed).


Ilpo Järvinen (9):
  PCI: Prevent resource tree corruption when BAR resize fails
  PCI/IOV: Adjust ->barsz[] when changing BAR size
  PCI: Change pci_dev variable from 'bridge' to 'dev'
  PCI: Try BAR resize even when no window was released
  PCI: Fix restoring BARs on BAR resize rollback path
  drm/xe: Remove driver side BAR release before resize
  drm/i915: Remove driver side BAR release before resize
  drm/amdgpu: Remove driver side BAR release before resize
  PCI: Prevent restoring assigned resources

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  |   8 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c |  12 --
 drivers/gpu/drm/xe/xe_vram.c                |   3 -
 drivers/pci/iov.c                           |  15 +--
 drivers/pci/pci-sysfs.c                     |  15 +--
 drivers/pci/pci.c                           |   4 +
 drivers/pci/pci.h                           |   8 +-
 drivers/pci/setup-bus.c                     | 119 ++++++++++++++------
 drivers/pci/setup-res.c                     |  30 ++---
 9 files changed, 108 insertions(+), 106 deletions(-)


base-commit: 3a8660878839faadb4f1a6dd72c3179c1df56787
-- 
2.39.5


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-10-29 23:36   ` Bjorn Helgaas
  2025-10-28 17:35 ` [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size Ilpo Järvinen
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

pbus_reassign_bridge_resources() saves bridge windows into the saved
list before attempting to adjust resource assignments to perform a BAR
resize operation. If resource adjustments cannot be completed fully,
rollback is attempted by restoring the resource from the saved list.

The rollback, however, does not check whether the resources it restores were
assigned by the partial resize attempt. If restore changes addresses of the
resource, it can result in corrupting the resource tree.

An example of a corrupted resource tree with overlapping addresses:

6200000000000-6203fbfffffff : pciex@620c3c0000000
  6200000000000-6203fbff0ffff : PCI Bus 0030:01
    6200020000000-62000207fffff : 0030:01:00.0
    6200000000000-6203fbff0ffff : PCI Bus 0030:02

A resource that are assigned into the resource tree must remain
unchanged. Thus, release such a resource before attempting to restore
and claim it back.

For simplicity, always do the release and claim back for the resource
even in the cases where it is restored to the same address range.

Note: this fix may "break" some cases where devices "worked" because
the resource tree corruption allowed address space double counting to
fit more resource than what can now be assigned without double
counting. The upcoming changes to BAR resizing should address those
scenarios (to the extent possible).

Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs")
Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 4a8735b275e4..e6984bb530ae 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2504,6 +2504,11 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 		bridge = dev_res->dev;
 		i = pci_resource_num(bridge, res);
 
+		if (res->parent) {
+			release_child_resources(res);
+			pci_release_resource(bridge, i);
+		}
+
 		restore_dev_resource(dev_res);
 
 		pci_claim_resource(bridge, i);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-11-13 16:29   ` Bjorn Helgaas
  2025-10-28 17:35 ` [PATCH 3/9] PCI: Change pci_dev variable from 'bridge' to 'dev' Ilpo Järvinen
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
struct pci_srvio keeps a cached copy of BAR size in unit of
resource_size_t in ->barsz[] which is not adjusted by
pci_rebar_set_size() but by pci_iov_resource_set_size().
pci_iov_resource_set_size() is called also from
pci_resize_resource_set_size().

The current arrangement is problematic once BAR resize algorithm starts
to roll back changes properly in case of a failure. The normal resource
fitting algorithm rolls back resource size using the struct
pci_dev_resource easily but having to call also
pci_resize_resource_set_size() or pci_iov_resource_set_size() to roll
back BAR size would be an extra burden, whereas combining ->barsz[]
update with pci_rebar_set_size() naturally rolls back it when restoring
the old BAR size on a different layer of the BAR resize operation.

Thus, rework pci_rebar_set_size() to also update ->barsz[].

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/iov.c       | 15 ++++-----------
 drivers/pci/pci.c       |  4 ++++
 drivers/pci/pci.h       |  5 ++---
 drivers/pci/setup-res.c | 10 ++++------
 4 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 77dee43b7858..04b675e90963 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -158,8 +158,7 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
 	return dev->sriov->barsz[pci_resource_num_to_vf_bar(resno)];
 }
 
-void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
-			       resource_size_t size)
+void pci_iov_resource_set_size(struct pci_dev *dev, int resno, int size)
 {
 	if (!pci_resource_is_iov(resno)) {
 		pci_warn(dev, "%s is not an IOV resource\n",
@@ -167,7 +166,8 @@ void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
 		return;
 	}
 
-	dev->sriov->barsz[pci_resource_num_to_vf_bar(resno)] = size;
+	resno = pci_resource_num_to_vf_bar(resno);
+	dev->sriov->barsz[resno] = pci_rebar_size_to_bytes(size);
 }
 
 bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev)
@@ -1340,7 +1340,6 @@ EXPORT_SYMBOL_GPL(pci_sriov_configure_simple);
 int pci_iov_vf_bar_set_size(struct pci_dev *dev, int resno, int size)
 {
 	u32 sizes;
-	int ret;
 
 	if (!pci_resource_is_iov(resno))
 		return -EINVAL;
@@ -1355,13 +1354,7 @@ int pci_iov_vf_bar_set_size(struct pci_dev *dev, int resno, int size)
 	if (!(sizes & BIT(size)))
 		return -EINVAL;
 
-	ret = pci_rebar_set_size(dev, resno, size);
-	if (ret)
-		return ret;
-
-	pci_iov_resource_set_size(dev, resno, pci_rebar_size_to_bytes(size));
-
-	return 0;
+	return pci_rebar_set_size(dev, resno, size);
 }
 EXPORT_SYMBOL_GPL(pci_iov_vf_bar_set_size);
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..7dfc58b0e55e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3803,6 +3803,10 @@ int pci_rebar_set_size(struct pci_dev *pdev, int bar, int size)
 	ctrl &= ~PCI_REBAR_CTRL_BAR_SIZE;
 	ctrl |= FIELD_PREP(PCI_REBAR_CTRL_BAR_SIZE, size);
 	pci_write_config_dword(pdev, pos + PCI_REBAR_CTRL, ctrl);
+
+	if (pci_resource_is_iov(bar))
+		pci_iov_resource_set_size(pdev, bar, size);
+
 	return 0;
 }
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4492b809094b..bf1a577e9623 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -808,8 +808,7 @@ void pci_iov_update_resource(struct pci_dev *dev, int resno);
 resource_size_t pci_sriov_resource_alignment(struct pci_dev *dev, int resno);
 void pci_restore_iov_state(struct pci_dev *dev);
 int pci_iov_bus_range(struct pci_bus *bus);
-void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
-			       resource_size_t size);
+void pci_iov_resource_set_size(struct pci_dev *dev, int resno, int size);
 bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev);
 static inline u16 pci_iov_vf_rebar_cap(struct pci_dev *dev)
 {
@@ -851,7 +850,7 @@ static inline int pci_iov_bus_range(struct pci_bus *bus)
 	return 0;
 }
 static inline void pci_iov_resource_set_size(struct pci_dev *dev, int resno,
-					     resource_size_t size) { }
+					     int size) { }
 static inline bool pci_iov_is_memory_decoding_enabled(struct pci_dev *dev)
 {
 	return false;
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index c3ba4ccecd43..3d0b0b3f60c4 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -450,12 +450,10 @@ static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
 	resource_size_t res_size = pci_rebar_size_to_bytes(size);
 	struct resource *res = pci_resource_n(dev, resno);
 
-	if (!pci_resource_is_iov(resno)) {
-		resource_set_size(res, res_size);
-	} else {
-		resource_set_size(res, res_size * pci_sriov_get_totalvfs(dev));
-		pci_iov_resource_set_size(dev, resno, res_size);
-	}
+	if (pci_resource_is_iov(resno))
+		res_size *= pci_sriov_get_totalvfs(dev);
+
+	resource_set_size(res, res_size);
 }
 
 int pci_resize_resource(struct pci_dev *dev, int resno, int size)
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/9] PCI: Change pci_dev variable from 'bridge' to 'dev'
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 4/9] PCI: Try BAR resize even when no window was released Ilpo Järvinen
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

Upcoming fix to BAR resize will store also device BAR resource into the
saved list. Change the pci_dev variable in the loop from 'bridge' to
'dev' as the former would be misleading with non-bridges in the list.

This is in a separate change to reduce churn in the upcoming BAR resize
fix.

While it appears that the logic in the loop doing pci_setup_bridge() is
altered as 'bridge' variable is no longer updated, a bridge should
never appear more than once in the saved list so the if check can only
match to the first entry. As such, the code with two distinct pci_dev
variables better represent the intention of the check compared with the
old code where bridge variable was reused for different purpose.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index e6984bb530ae..d58f025aeaff 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2479,12 +2479,13 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 	}
 
 	list_for_each_entry(dev_res, &saved, list) {
+		struct pci_dev *dev = dev_res->dev;
+
 		/* Skip the bridge we just assigned resources for */
-		if (bridge == dev_res->dev)
+		if (bridge == dev)
 			continue;
 
-		bridge = dev_res->dev;
-		pci_setup_bridge(bridge->subordinate);
+		pci_setup_bridge(dev->subordinate);
 	}
 
 	free_list(&saved);
@@ -2500,19 +2501,19 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 	/* Revert to the old configuration */
 	list_for_each_entry(dev_res, &saved, list) {
 		struct resource *res = dev_res->res;
+		struct pci_dev *dev = dev_res->dev;
 
-		bridge = dev_res->dev;
-		i = pci_resource_num(bridge, res);
+		i = pci_resource_num(dev, res);
 
 		if (res->parent) {
 			release_child_resources(res);
-			pci_release_resource(bridge, i);
+			pci_release_resource(dev, i);
 		}
 
 		restore_dev_resource(dev_res);
 
-		pci_claim_resource(bridge, i);
-		pci_setup_bridge(bridge->subordinate);
+		pci_claim_resource(dev, i);
+		pci_setup_bridge(dev->subordinate);
 	}
 	free_list(&saved);
 	up_read(&pci_bus_sem);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/9] PCI: Try BAR resize even when no window was released
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
                   ` (2 preceding siblings ...)
  2025-10-28 17:35 ` [PATCH 3/9] PCI: Change pci_dev variable from 'bridge' to 'dev' Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 5/9] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

Usually, resizing BARs requires releasing bridge windows in order to
resize it to fit a larged BAR into the window. This is not always the
case, however, FW could have made the window large enough to accomodate
larger BAR as is, or the user might prefer to shrink a BAR to make more
space for another Resizable BAR.

Thus, replace the check that requires that at least one bridge window
was released with a check that simply ensures bridge is not NULL.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index d58f025aeaff..76a4259ab076 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2459,10 +2459,8 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 		bus = bus->parent;
 	}
 
-	if (list_empty(&saved)) {
-		up_read(&pci_bus_sem);
+	if (!bridge)
 		return -ENOENT;
-	}
 
 	__pci_bus_size_bridges(bridge->subordinate, &added);
 	__pci_bridge_assign_resources(bridge, &added, &failed);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/9] PCI: Fix restoring BARs on BAR resize rollback path
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
                   ` (3 preceding siblings ...)
  2025-10-28 17:35 ` [PATCH 4/9] PCI: Try BAR resize even when no window was released Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 6/9] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

BAR resize operation is implemented in the pci_resize_resource() and
pbus_reassign_bridge_resources() functions. pci_resize_resource() can
be called either from __resource_resize_store() from sysfs or directly
by the driver for the Endpoint Device.

The pci_resize_resource() requires that caller has released the device
resources that share the bridge window with the BAR to be resized as
otherwise the bridge window is pinned in place and cannot be changed.

pbus_reassign_bridge_resources() implement rollback of the resources if
the resize operation fails, but rollback is performed only for the
bridge windows. Because releasing the device resources are done by the
caller of the BAR resize interface, these functions performing the BAR
resize do not have access to the device resources as they were before
the resize.

pbus_reassign_bridge_resources() could try to
__pci_bridge_assign_resources() after rolling back the bridge windows
as they were, however, it will not guarantee the resource are assigned
due to differences how FW and the kernel may want to assign the
resources (alignment of the start address and tail).

In order to perform rollback robustly, the BAR resize interface has to
be altered to also release the device resources that share the bridge
window with the BAR to be resized.

Also, remove restoring from the entries failed list as saved list
should now contain both the bridge windows and device resources so
the extra restore is duplicated work.

This change (together with the driver side changes) is to counter the
resource releases that had to be done to prevent resource tree
corruption in the ("PCI: Release assigned resource before restoring
them") change. As such, it likely restores functionality in cases where
device resources were released to avoid resource tree conflicts which
appeared to be "working" when such conflicts were not correctly
detected by the kernel.

Reported-by: Simon Richter <Simon.Richter@hogyros.de>
Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/pci-sysfs.c | 15 +------
 drivers/pci/pci.h       |  3 +-
 drivers/pci/setup-bus.c | 95 ++++++++++++++++++++++++++++++-----------
 drivers/pci/setup-res.c | 20 ++-------
 4 files changed, 77 insertions(+), 56 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 9d6f74bd95f8..caffd20abb9f 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1599,18 +1599,13 @@ static ssize_t __resource_resize_store(struct device *dev, int n,
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct pci_bus *bus = pdev->bus;
-	struct resource *b_win, *res;
 	unsigned long size;
-	int ret, i;
+	int ret;
 	u16 cmd;
 
 	if (kstrtoul(buf, 0, &size) < 0)
 		return -EINVAL;
 
-	b_win = pbus_select_window(bus, pci_resource_n(pdev, n));
-	if (!b_win)
-		return -EINVAL;
-
 	device_lock(dev);
 	if (dev->driver || pci_num_vf(pdev)) {
 		ret = -EBUSY;
@@ -1632,14 +1627,6 @@ static ssize_t __resource_resize_store(struct device *dev, int n,
 
 	pci_remove_resource_files(pdev);
 
-	pci_dev_for_each_resource(pdev, res, i) {
-		if (i >= PCI_BRIDGE_RESOURCES)
-			break;
-
-		if (b_win == pbus_select_window(bus, res))
-			pci_release_resource(pdev, i);
-	}
-
 	ret = pci_resize_resource(pdev, n, size);
 
 	pci_assign_unassigned_bus_resources(bus);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index bf1a577e9623..d22e595b3891 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -421,8 +421,9 @@ enum pci_bar_type {
 struct device *pci_get_host_bridge_device(struct pci_dev *dev);
 void pci_put_host_bridge_device(struct device *dev);
 
+void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size);
+int pci_do_resource_release_and_resize(struct pci_dev *dev, int resno, int size);
 unsigned int pci_rescan_bus_bridge_resize(struct pci_dev *bridge);
-int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res);
 int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 76a4259ab076..8da83b612c59 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -2420,18 +2420,16 @@ EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
  * release it when possible. If the bridge window contains assigned
  * resources, it cannot be released.
  */
-int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
+static int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res,
+					  struct list_head *saved)
 {
 	unsigned long type = res->flags;
 	struct pci_dev_resource *dev_res;
 	struct pci_dev *bridge;
-	LIST_HEAD(saved);
 	LIST_HEAD(added);
 	LIST_HEAD(failed);
 	unsigned int i;
-	int ret;
-
-	down_read(&pci_bus_sem);
+	int ret = 0;
 
 	while (!pci_is_root_bus(bus)) {
 		bridge = bus->self;
@@ -2443,9 +2441,9 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 
 		/* Ignore BARs which are still in use */
 		if (!res->child) {
-			ret = add_to_list(&saved, bridge, res, 0, 0);
+			ret = add_to_list(saved, bridge, res, 0, 0);
 			if (ret)
-				goto cleanup;
+				return ret;
 
 			pci_release_resource(bridge, i);
 		} else {
@@ -2468,34 +2466,73 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 		free_list(&added);
 
 	if (!list_empty(&failed)) {
-		if (pci_required_resource_failed(&failed, type)) {
+		if (pci_required_resource_failed(&failed, type))
 			ret = -ENOSPC;
-			goto cleanup;
-		}
-		/* Only resources with unrelated types failed (again) */
 		free_list(&failed);
+		if (ret)
+			return ret;
+
+		/* Only resources with unrelated types failed (again) */
 	}
 
-	list_for_each_entry(dev_res, &saved, list) {
+	list_for_each_entry(dev_res, saved, list) {
 		struct pci_dev *dev = dev_res->dev;
 
 		/* Skip the bridge we just assigned resources for */
 		if (bridge == dev)
 			continue;
 
+		if (!dev->subordinate)
+			continue;
+
 		pci_setup_bridge(dev->subordinate);
 	}
 
-	free_list(&saved);
-	up_read(&pci_bus_sem);
 	return 0;
+}
 
-cleanup:
-	/* Restore size and flags */
-	list_for_each_entry(dev_res, &failed, list)
-		restore_dev_resource(dev_res);
-	free_list(&failed);
+int pci_do_resource_release_and_resize(struct pci_dev *pdev, int resno, int size)
+{
+	struct resource *res = pci_resource_n(pdev, resno);
+	struct pci_dev_resource *dev_res;
+	struct pci_bus *bus = pdev->bus;
+	struct resource *b_win, *r;
+	LIST_HEAD(saved);
+	unsigned int i;
+	int ret;
+
+	b_win = pbus_select_window(bus, res);
+	if (!b_win)
+		return -EINVAL;
+
+	pci_dev_for_each_resource(pdev, r, i) {
+		if (i >= PCI_BRIDGE_RESOURCES)
+			break;
 
+		if (b_win != pbus_select_window(bus, r))
+			continue;
+
+		ret = add_to_list(&saved, pdev, r, 0, 0);
+		if (ret)
+			goto restore;
+		pci_release_resource(pdev, i);
+	}
+
+	pci_resize_resource_set_size(pdev, resno, size);
+
+	if (!bus->self)
+		goto out;
+
+	guard(rwsem_read)(&pci_bus_sem);
+	ret = pbus_reassign_bridge_resources(bus, res, &saved);
+	if (ret)
+		goto restore;
+
+out:
+	free_list(&saved);
+	return ret;
+
+restore:
 	/* Revert to the old configuration */
 	list_for_each_entry(dev_res, &saved, list) {
 		struct resource *res = dev_res->res;
@@ -2510,13 +2547,21 @@ int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *res)
 
 		restore_dev_resource(dev_res);
 
-		pci_claim_resource(dev, i);
-		pci_setup_bridge(dev->subordinate);
-	}
-	free_list(&saved);
-	up_read(&pci_bus_sem);
+		ret = pci_claim_resource(dev, i);
+		if (ret)
+			continue;
 
-	return ret;
+		if (i < PCI_BRIDGE_RESOURCES) {
+			const char *res_name = pci_resource_name(dev, i);
+
+			pci_update_resource(dev, i);
+			pci_info(dev, "%s %pR: old value restored\n",
+				 res_name, res);
+		}
+		if (dev->subordinate)
+			pci_setup_bridge(dev->subordinate);
+	}
+	goto out;
 }
 
 void pci_assign_unassigned_bus_resources(struct pci_bus *bus)
diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
index 3d0b0b3f60c4..93c70f8a8552 100644
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -444,8 +444,7 @@ static bool pci_resize_is_memory_decoding_enabled(struct pci_dev *dev,
 	return cmd & PCI_COMMAND_MEMORY;
 }
 
-static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
-					 int size)
+void pci_resize_resource_set_size(struct pci_dev *dev, int resno, int size)
 {
 	resource_size_t res_size = pci_rebar_size_to_bytes(size);
 	struct resource *res = pci_resource_n(dev, resno);
@@ -458,7 +457,6 @@ static void pci_resize_resource_set_size(struct pci_dev *dev, int resno,
 
 int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 {
-	struct resource *res = pci_resource_n(dev, resno);
 	struct pci_host_bridge *host;
 	int old, ret;
 	u32 sizes;
@@ -468,10 +466,6 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 	if (host->preserve_config)
 		return -ENOTSUPP;
 
-	/* Make sure the resource isn't assigned before resizing it. */
-	if (!(res->flags & IORESOURCE_UNSET))
-		return -EBUSY;
-
 	if (pci_resize_is_memory_decoding_enabled(dev, resno))
 		return -EBUSY;
 
@@ -490,19 +484,13 @@ int pci_resize_resource(struct pci_dev *dev, int resno, int size)
 	if (ret)
 		return ret;
 
-	pci_resize_resource_set_size(dev, resno, size);
-
-	/* Check if the new config works by trying to assign everything. */
-	if (dev->bus->self) {
-		ret = pbus_reassign_bridge_resources(dev->bus, res);
-		if (ret)
-			goto error_resize;
-	}
+	ret = pci_do_resource_release_and_resize(dev, resno, size);
+	if (ret)
+		goto error_resize;
 	return 0;
 
 error_resize:
 	pci_rebar_set_size(dev, resno, old);
-	pci_resize_resource_set_size(dev, resno, old);
 	return ret;
 }
 EXPORT_SYMBOL(pci_resize_resource);
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/9] drm/xe: Remove driver side BAR release before resize
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
                   ` (4 preceding siblings ...)
  2025-10-28 17:35 ` [PATCH 5/9] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-10-28 21:24   ` Lucas De Marchi
  2025-10-28 17:35 ` [PATCH 7/9] drm/i915: " Ilpo Järvinen
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

PCI core handles releasing device's resources and their rollback in
case of failure of a BAR resizing operation. Releasing resource prior
to calling pci_resize_resource() prevents PCI core from restoring the
BARs as they were.

Remove driver-side release of BARs from the xe driver.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_vram.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
index b44ebf50fedb..929412f0d131 100644
--- a/drivers/gpu/drm/xe/xe_vram.c
+++ b/drivers/gpu/drm/xe/xe_vram.c
@@ -33,9 +33,6 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
 	int bar_size = pci_rebar_bytes_to_size(size);
 	int ret;
 
-	if (pci_resource_len(pdev, resno))
-		pci_release_resource(pdev, resno);
-
 	ret = pci_resize_resource(pdev, resno, bar_size);
 	if (ret) {
 		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/9] drm/i915: Remove driver side BAR release before resize
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
                   ` (5 preceding siblings ...)
  2025-10-28 17:35 ` [PATCH 6/9] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-11-10 22:53   ` Bjorn Helgaas
  2025-10-28 17:35 ` [PATCH 8/9] drm/amdgpu: " Ilpo Järvinen
  2025-10-28 17:35 ` [PATCH 9/9] PCI: Prevent restoring assigned resources Ilpo Järvinen
  8 siblings, 1 reply; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

PCI core handles releasing device's resources and their rollback in
case of failure of a BAR resizing operation. Releasing resource prior
to calling pci_resize_resource() prevents PCI core from restoring the
BARs as they were.

Remove driver-side release of BARs from the i915 driver.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/gpu/drm/i915/gt/intel_region_lmem.c | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 51bb27e10a4f..ca3de61451a3 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -18,16 +18,6 @@
 #include "gt/intel_gt_regs.h"
 
 #ifdef CONFIG_64BIT
-static void _release_bars(struct pci_dev *pdev)
-{
-	int resno;
-
-	for (resno = PCI_STD_RESOURCES; resno < PCI_STD_RESOURCE_END; resno++) {
-		if (pci_resource_len(pdev, resno))
-			pci_release_resource(pdev, resno);
-	}
-}
-
 static void
 _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
 {
@@ -35,8 +25,6 @@ _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
 	int bar_size = pci_rebar_bytes_to_size(size);
 	int ret;
 
-	_release_bars(pdev);
-
 	ret = pci_resize_resource(pdev, resno, bar_size);
 	if (ret) {
 		drm_info(&i915->drm, "Failed to resize BAR%d to %dM (%pe)\n",
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
                   ` (6 preceding siblings ...)
  2025-10-28 17:35 ` [PATCH 7/9] drm/i915: " Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  2025-11-10 22:54   ` Bjorn Helgaas
  2025-11-11  9:08   ` Christian König
  2025-10-28 17:35 ` [PATCH 9/9] PCI: Prevent restoring assigned resources Ilpo Järvinen
  8 siblings, 2 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

PCI core handles releasing device's resources and their rollback in
case of failure of a BAR resizing operation. Releasing resource prior
to calling pci_resize_resource() prevents PCI core from restoring the
BARs as they were.

Remove driver-side release of BARs from the amdgpu driver.

Also remove the driver initiated assignment as pci_resize_resource()
should try to assign as much as possible. If the driver side call
manages to get more required resources assigned in some scenario, such
a problem should be fixed inside pci_resize_resource() instead.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7a899fb4de29..65474d365229 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1729,12 +1729,8 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
 	pci_write_config_word(adev->pdev, PCI_COMMAND,
 			      cmd & ~PCI_COMMAND_MEMORY);
 
-	/* Free the VRAM and doorbell BAR, we most likely need to move both. */
+	/* Tear down doorbell as resizing will release BARs */
 	amdgpu_doorbell_fini(adev);
-	if (adev->asic_type >= CHIP_BONAIRE)
-		pci_release_resource(adev->pdev, 2);
-
-	pci_release_resource(adev->pdev, 0);
 
 	r = pci_resize_resource(adev->pdev, 0, rbar_size);
 	if (r == -ENOSPC)
@@ -1743,8 +1739,6 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
 	else if (r && r != -ENOTSUPP)
 		dev_err(adev->dev, "Problem resizing BAR0 (%d).", r);
 
-	pci_assign_unassigned_bus_resources(adev->pdev->bus);
-
 	/* When the doorbell or fb BAR isn't available we have no chance of
 	 * using the device.
 	 */
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 9/9] PCI: Prevent restoring assigned resources
  2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
                   ` (7 preceding siblings ...)
  2025-10-28 17:35 ` [PATCH 8/9] drm/amdgpu: " Ilpo Järvinen
@ 2025-10-28 17:35 ` Ilpo Järvinen
  8 siblings, 0 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-28 17:35 UTC (permalink / raw)
  To: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel
  Cc: Ilpo Järvinen

restore_dev_resource() copies saved addresses and flags from the struct
pci_dev_resource back to the struct resource, typically, during
rollback from a failure or in preparation for a retry attempt.

If the resource is within resource tree, the resource must not be
modified as the resource tree could be corrupted. Thus, it's a bug to
call restore_dev_resource() for assigned resources (which did happen
due to logic flaws in the BAR resize rollback).

Add WARN_ON_ONCE() into restore_dev_resource() to detect such bugs
easily and return without altering the resource to prevent corruption.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 drivers/pci/setup-bus.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 8da83b612c59..28d6ae822c0b 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -15,6 +15,7 @@
  */
 
 #include <linux/bitops.h>
+#include <linux/bug.h>
 #include <linux/init.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -135,6 +136,9 @@ static void restore_dev_resource(struct pci_dev_resource *dev_res)
 {
 	struct resource *res = dev_res->res;
 
+	if (WARN_ON_ONCE(res->parent))
+		return;
+
 	res->start = dev_res->start;
 	res->end = dev_res->end;
 	res->flags = dev_res->flags;
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/9] drm/xe: Remove driver side BAR release before resize
  2025-10-28 17:35 ` [PATCH 6/9] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
@ 2025-10-28 21:24   ` Lucas De Marchi
  2025-10-30 14:37     ` Lucas De Marchi
  0 siblings, 1 reply; 28+ messages in thread
From: Lucas De Marchi @ 2025-10-28 21:24 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Alex Deucher, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

On Tue, Oct 28, 2025 at 07:35:48PM +0200, Ilpo Järvinen wrote:
>PCI core handles releasing device's resources and their rollback in
>case of failure of a BAR resizing operation. Releasing resource prior
>to calling pci_resize_resource() prevents PCI core from restoring the
>BARs as they were.
>
>Remove driver-side release of BARs from the xe driver.
>
>Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>---
> drivers/gpu/drm/xe/xe_vram.c | 3 ---
> 1 file changed, 3 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
>index b44ebf50fedb..929412f0d131 100644
>--- a/drivers/gpu/drm/xe/xe_vram.c
>+++ b/drivers/gpu/drm/xe/xe_vram.c
>@@ -33,9 +33,6 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
> 	int bar_size = pci_rebar_bytes_to_size(size);
> 	int ret;
>
>-	if (pci_resource_len(pdev, resno))
>-		pci_release_resource(pdev, resno);
>-

conflict with drm-xe-next:

++<<<<<<< ours
  +      release_bars(pdev);
  +
++=======
++>>>>>>> theirs

if we don't need to release the BARs anymore to call
pci_resize_resource(), then the resolution is simply to drop the
function release_bars() function.

I'm sending that to our CI for coverage:
https://lore.kernel.org/intel-xe/20251028211613.3228940-2-lucas.demarchi@intel.com/T/#u

thanks
Lucas De Marchi

> 	ret = pci_resize_resource(pdev, resno, bar_size);
> 	if (ret) {
> 		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",
>-- 
>2.39.5
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails
  2025-10-28 17:35 ` [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
@ 2025-10-29 23:36   ` Bjorn Helgaas
  2025-10-30  8:22     ` Ilpo Järvinen
  0 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2025-10-29 23:36 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

On Tue, Oct 28, 2025 at 07:35:43PM +0200, Ilpo Järvinen wrote:
> pbus_reassign_bridge_resources() saves bridge windows into the saved
> list before attempting to adjust resource assignments to perform a BAR
> resize operation. If resource adjustments cannot be completed fully,
> rollback is attempted by restoring the resource from the saved list.

> Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs")
> Reported-by: Simon Richter <Simon.Richter@hogyros.de>
> Reported-by: Alex Bennée <alex.bennee@linaro.org>

If these reports were public, can we include lore URLs for them?

Same question for [PATCH 5/9] PCI: Fix restoring BARs on BAR resize
rollback path.

I put these all on pci/resource for build testing.  I assume we'll
tweak these based on testing reports and sorting out the pci/rebar
conflicts.

Bjorn

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails
  2025-10-29 23:36   ` Bjorn Helgaas
@ 2025-10-30  8:22     ` Ilpo Järvinen
  2025-11-10 22:59       ` Bjorn Helgaas
  0 siblings, 1 reply; 28+ messages in thread
From: Ilpo Järvinen @ 2025-10-30  8:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 1220 bytes --]

On Wed, 29 Oct 2025, Bjorn Helgaas wrote:

> On Tue, Oct 28, 2025 at 07:35:43PM +0200, Ilpo Järvinen wrote:
> > pbus_reassign_bridge_resources() saves bridge windows into the saved
> > list before attempting to adjust resource assignments to perform a BAR
> > resize operation. If resource adjustments cannot be completed fully,
> > rollback is attempted by restoring the resource from the saved list.
> 
> > Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs")
> > Reported-by: Simon Richter <Simon.Richter@hogyros.de>
> > Reported-by: Alex Bennée <alex.bennee@linaro.org>
> 
> If these reports were public, can we include lore URLs for them?
> 
> Same question for [PATCH 5/9] PCI: Fix restoring BARs on BAR resize
> rollback path.
> 
> I put these all on pci/resource for build testing.  I assume we'll
> tweak these based on testing reports and sorting out the pci/rebar
> conflicts.

Thanks, the links will come in v2 along with fixing a few things found by 
more extensive tests by LKP. E.g., it seems clang thinks guard() cannot be 
used here because goto jumps over it (auto variable initialization gets 
skipped so it's kind of understandable limitation).

-- 
 i.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/9] drm/xe: Remove driver side BAR release before resize
  2025-10-28 21:24   ` Lucas De Marchi
@ 2025-10-30 14:37     ` Lucas De Marchi
  0 siblings, 0 replies; 28+ messages in thread
From: Lucas De Marchi @ 2025-10-30 14:37 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Alex Deucher, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

On Tue, Oct 28, 2025 at 04:24:04PM -0500, Lucas De Marchi wrote:
>On Tue, Oct 28, 2025 at 07:35:48PM +0200, Ilpo Järvinen wrote:
>>PCI core handles releasing device's resources and their rollback in
>>case of failure of a BAR resizing operation. Releasing resource prior
>>to calling pci_resize_resource() prevents PCI core from restoring the
>>BARs as they were.
>>
>>Remove driver-side release of BARs from the xe driver.
>>
>>Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>Cc: Lucas De Marchi <lucas.demarchi@intel.com>
>>---
>>drivers/gpu/drm/xe/xe_vram.c | 3 ---
>>1 file changed, 3 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/xe/xe_vram.c b/drivers/gpu/drm/xe/xe_vram.c
>>index b44ebf50fedb..929412f0d131 100644
>>--- a/drivers/gpu/drm/xe/xe_vram.c
>>+++ b/drivers/gpu/drm/xe/xe_vram.c
>>@@ -33,9 +33,6 @@ _resize_bar(struct xe_device *xe, int resno, resource_size_t size)
>>	int bar_size = pci_rebar_bytes_to_size(size);
>>	int ret;
>>
>>-	if (pci_resource_len(pdev, resno))
>>-		pci_release_resource(pdev, resno);
>>-
>
>conflict with drm-xe-next:
>
>++<<<<<<< ours
> +      release_bars(pdev);
> +
>++=======
>++>>>>>>> theirs
>
>if we don't need to release the BARs anymore to call
>pci_resize_resource(), then the resolution is simply to drop the
>function release_bars() function.
>
>I'm sending that to our CI for coverage:
>https://lore.kernel.org/intel-xe/20251028211613.3228940-2-lucas.demarchi@intel.com/T/#u

CI came back clean. Looks good from xe side:

Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>

thanks
Lucas De Marchi

>
>thanks
>Lucas De Marchi
>
>>	ret = pci_resize_resource(pdev, resno, bar_size);
>>	if (ret) {
>>		drm_info(&xe->drm, "Failed to resize BAR%d to %dM (%pe). Consider enabling 'Resizable BAR' support in your BIOS\n",
>>-- 
>>2.39.5
>>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 7/9] drm/i915: Remove driver side BAR release before resize
  2025-10-28 17:35 ` [PATCH 7/9] drm/i915: " Ilpo Järvinen
@ 2025-11-10 22:53   ` Bjorn Helgaas
  0 siblings, 0 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2025-11-10 22:53 UTC (permalink / raw)
  To: Ilpo Järvinen, Jani Nikula, Joonas Lahtinen, Rodrigo Vivi,
	Tvrtko Ursulin
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, linux-pci, Simona Vetter, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

i915 folks, any objection to this?

On Tue, Oct 28, 2025 at 07:35:49PM +0200, Ilpo Järvinen wrote:
> PCI core handles releasing device's resources and their rollback in
> case of failure of a BAR resizing operation. Releasing resource prior
> to calling pci_resize_resource() prevents PCI core from restoring the
> BARs as they were.
> 
> Remove driver-side release of BARs from the i915 driver.
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_region_lmem.c | 12 ------------
>  1 file changed, 12 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> index 51bb27e10a4f..ca3de61451a3 100644
> --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
> @@ -18,16 +18,6 @@
>  #include "gt/intel_gt_regs.h"
>  
>  #ifdef CONFIG_64BIT
> -static void _release_bars(struct pci_dev *pdev)
> -{
> -	int resno;
> -
> -	for (resno = PCI_STD_RESOURCES; resno < PCI_STD_RESOURCE_END; resno++) {
> -		if (pci_resource_len(pdev, resno))
> -			pci_release_resource(pdev, resno);
> -	}
> -}
> -
>  static void
>  _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
>  {
> @@ -35,8 +25,6 @@ _resize_bar(struct drm_i915_private *i915, int resno, resource_size_t size)
>  	int bar_size = pci_rebar_bytes_to_size(size);
>  	int ret;
>  
> -	_release_bars(pdev);
> -
>  	ret = pci_resize_resource(pdev, resno, bar_size);
>  	if (ret) {
>  		drm_info(&i915->drm, "Failed to resize BAR%d to %dM (%pe)\n",
> -- 
> 2.39.5
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-10-28 17:35 ` [PATCH 8/9] drm/amdgpu: " Ilpo Järvinen
@ 2025-11-10 22:54   ` Bjorn Helgaas
  2025-11-11  9:08   ` Christian König
  1 sibling, 0 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2025-11-10 22:54 UTC (permalink / raw)
  To: Ilpo Järvinen, Alex Deucher, Christian König
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, amd-gfx,
	Bjorn Helgaas, David Airlie, dri-devel, intel-gfx, intel-xe,
	Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Thomas Hellström,
	Michał Winiarski, linux-kernel

amdgpu folks, any objection to this?

On Tue, Oct 28, 2025 at 07:35:50PM +0200, Ilpo Järvinen wrote:
> PCI core handles releasing device's resources and their rollback in
> case of failure of a BAR resizing operation. Releasing resource prior
> to calling pci_resize_resource() prevents PCI core from restoring the
> BARs as they were.
> 
> Remove driver-side release of BARs from the amdgpu driver.
> 
> Also remove the driver initiated assignment as pci_resize_resource()
> should try to assign as much as possible. If the driver side call
> manages to get more required resources assigned in some scenario, such
> a problem should be fixed inside pci_resize_resource() instead.
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7a899fb4de29..65474d365229 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1729,12 +1729,8 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>  	pci_write_config_word(adev->pdev, PCI_COMMAND,
>  			      cmd & ~PCI_COMMAND_MEMORY);
>  
> -	/* Free the VRAM and doorbell BAR, we most likely need to move both. */
> +	/* Tear down doorbell as resizing will release BARs */
>  	amdgpu_doorbell_fini(adev);
> -	if (adev->asic_type >= CHIP_BONAIRE)
> -		pci_release_resource(adev->pdev, 2);
> -
> -	pci_release_resource(adev->pdev, 0);
>  
>  	r = pci_resize_resource(adev->pdev, 0, rbar_size);
>  	if (r == -ENOSPC)
> @@ -1743,8 +1739,6 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>  	else if (r && r != -ENOTSUPP)
>  		dev_err(adev->dev, "Problem resizing BAR0 (%d).", r);
>  
> -	pci_assign_unassigned_bus_resources(adev->pdev->bus);
> -
>  	/* When the doorbell or fb BAR isn't available we have no chance of
>  	 * using the device.
>  	 */
> -- 
> 2.39.5
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails
  2025-10-30  8:22     ` Ilpo Järvinen
@ 2025-11-10 22:59       ` Bjorn Helgaas
  0 siblings, 0 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2025-11-10 22:59 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

On Thu, Oct 30, 2025 at 10:22:27AM +0200, Ilpo Järvinen wrote:
> On Wed, 29 Oct 2025, Bjorn Helgaas wrote:
> 
> > On Tue, Oct 28, 2025 at 07:35:43PM +0200, Ilpo Järvinen wrote:
> > > pbus_reassign_bridge_resources() saves bridge windows into the saved
> > > list before attempting to adjust resource assignments to perform a BAR
> > > resize operation. If resource adjustments cannot be completed fully,
> > > rollback is attempted by restoring the resource from the saved list.
> > 
> > > Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs")
> > > Reported-by: Simon Richter <Simon.Richter@hogyros.de>
> > > Reported-by: Alex Bennée <alex.bennee@linaro.org>
> > 
> > If these reports were public, can we include lore URLs for them?
> > 
> > Same question for [PATCH 5/9] PCI: Fix restoring BARs on BAR resize
> > rollback path.
> > 
> > I put these all on pci/resource for build testing.  I assume we'll
> > tweak these based on testing reports and sorting out the pci/rebar
> > conflicts.
> 
> Thanks, the links will come in v2 along with fixing a few things found by 
> more extensive tests by LKP. E.g., it seems clang thinks guard() cannot be 
> used here because goto jumps over it (auto variable initialization gets 
> skipped so it's kind of understandable limitation).

Just a ping on this.  The lkp robot did build this fine:
https://lore.kernel.org/r/202510311139.1VIkw3Ez-lkp@intel.com

I'm happy to put it in pci/next as-is, especially if the amdgpu and
i915 folks are ok with it.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-10-28 17:35 ` [PATCH 8/9] drm/amdgpu: " Ilpo Järvinen
  2025-11-10 22:54   ` Bjorn Helgaas
@ 2025-11-11  9:08   ` Christian König
  2025-11-11 11:08     ` Ilpo Järvinen
  2025-11-11 23:30     ` Liu, Monk
  1 sibling, 2 replies; 28+ messages in thread
From: Christian König @ 2025-11-11  9:08 UTC (permalink / raw)
  To: Ilpo Järvinen, Alex Bennée, Simon Richter,
	Lucas De Marchi, Alex Deucher, amd-gfx, Bjorn Helgaas,
	David Airlie, dri-devel, intel-gfx, intel-xe, Jani Nikula,
	Joonas Lahtinen, linux-pci, Rodrigo Vivi, Simona Vetter,
	Tvrtko Ursulin, Thomas Hellström, Michał Winiarski,
	linux-kernel

Sorry for the late reply I'm really busy at the moment.

On 10/28/25 18:35, Ilpo Järvinen wrote:
> PCI core handles releasing device's resources and their rollback in
> case of failure of a BAR resizing operation. Releasing resource prior
> to calling pci_resize_resource() prevents PCI core from restoring the
> BARs as they were.

I've intentionally didn't do it this way because at least on AMD HW we could only release the VRAM and doorbell BAR (both 64bit), but not the register BAR (32bit only).

This patch set looks like the right thing in general, but which BARs are now released by pci_resize_resource()?

If we avoid releasing the 32bit BAR as well then that should work, otherwise we will probably cause problems.

Regards,
Christian.

> 
> Remove driver-side release of BARs from the amdgpu driver.
> 
> Also remove the driver initiated assignment as pci_resize_resource()
> should try to assign as much as possible. If the driver side call
> manages to get more required resources assigned in some scenario, such
> a problem should be fixed inside pci_resize_resource() instead.
> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7a899fb4de29..65474d365229 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1729,12 +1729,8 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>  	pci_write_config_word(adev->pdev, PCI_COMMAND,
>  			      cmd & ~PCI_COMMAND_MEMORY);
>  
> -	/* Free the VRAM and doorbell BAR, we most likely need to move both. */
> +	/* Tear down doorbell as resizing will release BARs */
>  	amdgpu_doorbell_fini(adev);
> -	if (adev->asic_type >= CHIP_BONAIRE)
> -		pci_release_resource(adev->pdev, 2);
> -
> -	pci_release_resource(adev->pdev, 0);
>  
>  	r = pci_resize_resource(adev->pdev, 0, rbar_size);
>  	if (r == -ENOSPC)
> @@ -1743,8 +1739,6 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>  	else if (r && r != -ENOTSUPP)
>  		dev_err(adev->dev, "Problem resizing BAR0 (%d).", r);
>  
> -	pci_assign_unassigned_bus_resources(adev->pdev->bus);
> -
>  	/* When the doorbell or fb BAR isn't available we have no chance of
>  	 * using the device.
>  	 */


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-11-11  9:08   ` Christian König
@ 2025-11-11 11:08     ` Ilpo Järvinen
  2025-11-11 12:08       ` Christian König
  2025-11-11 23:30     ` Liu, Monk
  1 sibling, 1 reply; 28+ messages in thread
From: Ilpo Järvinen @ 2025-11-11 11:08 UTC (permalink / raw)
  To: Christian König
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Thomas Hellström,
	Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 3273 bytes --]

On Tue, 11 Nov 2025, Christian König wrote:

> Sorry for the late reply I'm really busy at the moment.
> 
> On 10/28/25 18:35, Ilpo Järvinen wrote:
> > PCI core handles releasing device's resources and their rollback in
> > case of failure of a BAR resizing operation. Releasing resource prior
> > to calling pci_resize_resource() prevents PCI core from restoring the
> > BARs as they were.
> 
> I've intentionally didn't do it this way because at least on AMD HW we 
> could only release the VRAM and doorbell BAR (both 64bit), but not the 
> register BAR (32bit only).
> 
> This patch set looks like the right thing in general, but which BARs are 
> now released by pci_resize_resource()?
> 
> If we avoid releasing the 32bit BAR as well then that should work, 
> otherwise we will probably cause problems.

After these changes, pci_resize_resource() releases BARs that share the 
bridge window with the BAR to be resized. So the answer depends on the 
upstream bridge.

However, amdgpu_device_resize_fb_bar() also checks that root bus has a
resource with a 64-bit address. That won't tell what the nearest bridge 
has though. Maybe that check should be converted to check the resources of 
the nearest bus instead? It would make it impossible to have the 
32-bit resource share the bridge window with the 64-bit resources so the 
resize would be safe.

Thanks a lot for checking this out!

-- 
 i.

> Regards,
> Christian.
> 
> > 
> > Remove driver-side release of BARs from the amdgpu driver.
> > 
> > Also remove the driver initiated assignment as pci_resize_resource()
> > should try to assign as much as possible. If the driver side call
> > manages to get more required resources assigned in some scenario, such
> > a problem should be fixed inside pci_resize_resource() instead.
> > 
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +-------
> >  1 file changed, 1 insertion(+), 7 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 7a899fb4de29..65474d365229 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -1729,12 +1729,8 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
> >  	pci_write_config_word(adev->pdev, PCI_COMMAND,
> >  			      cmd & ~PCI_COMMAND_MEMORY);
> >  
> > -	/* Free the VRAM and doorbell BAR, we most likely need to move both. */
> > +	/* Tear down doorbell as resizing will release BARs */
> >  	amdgpu_doorbell_fini(adev);
> > -	if (adev->asic_type >= CHIP_BONAIRE)
> > -		pci_release_resource(adev->pdev, 2);
> > -
> > -	pci_release_resource(adev->pdev, 0);
> >  
> >  	r = pci_resize_resource(adev->pdev, 0, rbar_size);
> >  	if (r == -ENOSPC)
> > @@ -1743,8 +1739,6 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
> >  	else if (r && r != -ENOTSUPP)
> >  		dev_err(adev->dev, "Problem resizing BAR0 (%d).", r);
> >  
> > -	pci_assign_unassigned_bus_resources(adev->pdev->bus);
> > -
> >  	/* When the doorbell or fb BAR isn't available we have no chance of
> >  	 * using the device.
> >  	 */
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-11-11 11:08     ` Ilpo Järvinen
@ 2025-11-11 12:08       ` Christian König
  2025-11-11 12:56         ` Ilpo Järvinen
  0 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2025-11-11 12:08 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Thomas Hellström,
	Michał Winiarski, LKML

On 11/11/25 12:08, Ilpo Järvinen wrote:
> On Tue, 11 Nov 2025, Christian König wrote:
> 
>> Sorry for the late reply I'm really busy at the moment.
>>
>> On 10/28/25 18:35, Ilpo Järvinen wrote:
>>> PCI core handles releasing device's resources and their rollback in
>>> case of failure of a BAR resizing operation. Releasing resource prior
>>> to calling pci_resize_resource() prevents PCI core from restoring the
>>> BARs as they were.
>>
>> I've intentionally didn't do it this way because at least on AMD HW we 
>> could only release the VRAM and doorbell BAR (both 64bit), but not the 
>> register BAR (32bit only).
>>
>> This patch set looks like the right thing in general, but which BARs are 
>> now released by pci_resize_resource()?
>>
>> If we avoid releasing the 32bit BAR as well then that should work, 
>> otherwise we will probably cause problems.
> 
> After these changes, pci_resize_resource() releases BARs that share the 
> bridge window with the BAR to be resized. So the answer depends on the 
> upstream bridge.
> 
> However, amdgpu_device_resize_fb_bar() also checks that root bus has a
> resource with a 64-bit address. That won't tell what the nearest bridge 
> has though. Maybe that check should be converted to check the resources of 
> the nearest bus instead? It would make it impossible to have the 
> 32-bit resource share the bridge window with the 64-bit resources so the 
> resize would be safe.

Mhm, I don't think that will work.


I've added the check for the root bus to avoid a couple of issues during resize, but checking the nearest bridge would block a whole bunch of use cases and isn't even 100% save.

See one use case of this is that all the BARs of the device start in the same 32bit bridge window (or a mixture of 64bit and 32bit window).

What we have is that BAR 0 and 2 are 64bit BARs which can (after some preparation) move around freely. But IIRC BAR 4 are the legacy I/O ports and BAR 5 is the 32bit MMIO registers (don't nail me on that, could be just the other way around).

Especially that 32bit MMIO BAR *can't* move! Not only because it is 32bit, but also because the amdgpu driver as well as the HW itself through the VGA emulation, as well as the EFI/VESA/VBIOS code might reference its absolute address.


Could we give pci_resize_resource() a mask of BARs which are save to release? Or maybe a flag to indicate that it can only free up 64bit BARs?

Regards,
Christian.

> 
> Thanks a lot for checking this out!
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-11-11 12:08       ` Christian König
@ 2025-11-11 12:56         ` Ilpo Järvinen
  2025-11-11 15:07           ` Christian König
  0 siblings, 1 reply; 28+ messages in thread
From: Ilpo Järvinen @ 2025-11-11 12:56 UTC (permalink / raw)
  To: Christian König
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Thomas Hellström,
	Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 4172 bytes --]

On Tue, 11 Nov 2025, Christian König wrote:

> On 11/11/25 12:08, Ilpo Järvinen wrote:
> > On Tue, 11 Nov 2025, Christian König wrote:
> > 
> >> Sorry for the late reply I'm really busy at the moment.
> >>
> >> On 10/28/25 18:35, Ilpo Järvinen wrote:
> >>> PCI core handles releasing device's resources and their rollback in
> >>> case of failure of a BAR resizing operation. Releasing resource prior
> >>> to calling pci_resize_resource() prevents PCI core from restoring the
> >>> BARs as they were.
> >>
> >> I've intentionally didn't do it this way because at least on AMD HW we 
> >> could only release the VRAM and doorbell BAR (both 64bit), but not the 
> >> register BAR (32bit only).
> >>
> >> This patch set looks like the right thing in general, but which BARs are 
> >> now released by pci_resize_resource()?
> >>
> >> If we avoid releasing the 32bit BAR as well then that should work, 
> >> otherwise we will probably cause problems.
> > 
> > After these changes, pci_resize_resource() releases BARs that share the 
> > bridge window with the BAR to be resized. So the answer depends on the 
> > upstream bridge.
> > 
> > However, amdgpu_device_resize_fb_bar() also checks that root bus has a
> > resource with a 64-bit address. That won't tell what the nearest bridge 
> > has though. Maybe that check should be converted to check the resources of 
> > the nearest bus instead? It would make it impossible to have the 
> > 32-bit resource share the bridge window with the 64-bit resources so the 
> > resize would be safe.
> 
> Mhm, I don't think that will work.
> 
> 
> I've added the check for the root bus to avoid a couple of issues during 
> resize, but checking the nearest bridge would block a whole bunch of use 
> cases and isn't even 100% save.
> 
> See one use case of this is that all the BARs of the device start in the 
> same 32bit bridge window (or a mixture of 64bit and 32bit window).

"32bit bridge window" is ambiguous. There are non-prefetchable and 
prefetchable bridge windows, out of which the latter can be 64-bit as 
well. Which one you're talking about?

If a 64-bit prefetchable window exists, pbus_size_mem() nor 
__pci_assign_resource() would not have produced such a configuration where 
they're put into the same bridge window, even before the commit 
ae88d0b9c57f ("PCI: Use pbus_select_window_for_type() during mem window 
sizing") (I think). Now pbus_size_mem() certainly doesn't.

> What we have is that BAR 0 and 2 are 64bit BARs which can (after some 
> preparation) move around freely. But IIRC BAR 4 are the legacy I/O ports 
> and BAR 5 is the 32bit MMIO registers (don't nail me on that, could be 
> just the other way around).
>
> Especially that 32bit MMIO BAR *can't* move! Not only because it is 
> 32bit, but also because the amdgpu driver as well as the HW itself 
> through the VGA emulation, as well as the EFI/VESA/VBIOS code might 
> reference its absolute address.

So if the 64-bit check is replaced with this:

+       /* Check if the parent bridge has a 64-bit (pref) memory resource */
+       res = pci_resource_n(adev->pdev, 0)->parent;
+       /* Trying to resize is pointless without a window above 4GB */
+       if (!(res->flags & IORESOURCE_MEM_64))
		return 0;

...I don't think it's possible for 32-bit resource to share that window 
under _any_ circumstance.

If you say that ->parent somehow points to a non-IORESOURCE_MEM_64 window 
at this point, you're implying allocation for the 64-bit prefetchable 
window was tried and failed, and __pci_assign_resource() then used one of 
its fallbacks.

Are you saying that "some preparation" includes making room for that 
64-bit prefetchable window that failed to assign earlier as I cannot see 
how else it would ever get assigned so that the 64-bit BARs could be moved 
there?

> Could we give pci_resize_resource() a mask of BARs which are save to 
> release?

It is possible.

> Or maybe a flag to indicate that it can only free up 64bit BARs?
> 
> Regards,
> Christian.
> 
> > 
> > Thanks a lot for checking this out!
> > 
> 

-- 
 i.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-11-11 12:56         ` Ilpo Järvinen
@ 2025-11-11 15:07           ` Christian König
  2025-11-11 15:52             ` Ilpo Järvinen
  0 siblings, 1 reply; 28+ messages in thread
From: Christian König @ 2025-11-11 15:07 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Thomas Hellström,
	Michał Winiarski, LKML

On 11/11/25 13:56, Ilpo Järvinen wrote:
> On Tue, 11 Nov 2025, Christian König wrote:
> 
>> On 11/11/25 12:08, Ilpo Järvinen wrote:
>>> On Tue, 11 Nov 2025, Christian König wrote:
>>>
>>>> Sorry for the late reply I'm really busy at the moment.
>>>>
>>>> On 10/28/25 18:35, Ilpo Järvinen wrote:
>>>>> PCI core handles releasing device's resources and their rollback in
>>>>> case of failure of a BAR resizing operation. Releasing resource prior
>>>>> to calling pci_resize_resource() prevents PCI core from restoring the
>>>>> BARs as they were.
>>>>
>>>> I've intentionally didn't do it this way because at least on AMD HW we 
>>>> could only release the VRAM and doorbell BAR (both 64bit), but not the 
>>>> register BAR (32bit only).
>>>>
>>>> This patch set looks like the right thing in general, but which BARs are 
>>>> now released by pci_resize_resource()?
>>>>
>>>> If we avoid releasing the 32bit BAR as well then that should work, 
>>>> otherwise we will probably cause problems.
>>>
>>> After these changes, pci_resize_resource() releases BARs that share the 
>>> bridge window with the BAR to be resized. So the answer depends on the 
>>> upstream bridge.
>>>
>>> However, amdgpu_device_resize_fb_bar() also checks that root bus has a
>>> resource with a 64-bit address. That won't tell what the nearest bridge 
>>> has though. Maybe that check should be converted to check the resources of 
>>> the nearest bus instead? It would make it impossible to have the 
>>> 32-bit resource share the bridge window with the 64-bit resources so the 
>>> resize would be safe.
>>
>> Mhm, I don't think that will work.
>>
>>
>> I've added the check for the root bus to avoid a couple of issues during 
>> resize, but checking the nearest bridge would block a whole bunch of use 
>> cases and isn't even 100% save.
>>
>> See one use case of this is that all the BARs of the device start in the 
>> same 32bit bridge window (or a mixture of 64bit and 32bit window).
> 
> "32bit bridge window" is ambiguous. There are non-prefetchable and 
> prefetchable bridge windows, out of which the latter can be 64-bit as 
> well. Which one you're talking about?

The non-prefetchable 32bit window.

> If a 64-bit prefetchable window exists, pbus_size_mem() nor 
> __pci_assign_resource() would not have produced such a configuration where 
> they're put into the same bridge window, even before the commit 
> ae88d0b9c57f ("PCI: Use pbus_select_window_for_type() during mem window 
> sizing") (I think). Now pbus_size_mem() certainly doesn't.

I need to double check, but if I'm not completely mistaken that is assigned by the BIOS.

Here is an example of a "good" configuration where both VRAM (BAR0) and doorbell (BAR2) is in the prefetchable window and MMIO in the non-prefetchable:

Device:
	Region 0: Memory at 80000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at 90000000 (64-bit, prefetchable) [size=2M]
	Region 4: I/O ports at 3000 [size=256]
	Region 5: Memory at 9f300000 (32-bit, non-prefetchable) [size=1M]

Bridge:
	Memory behind bridge: 9f300000-9f4fffff [size=2M] [32-bit]
	Prefetchable memory behind bridge: 80000000-901fffff [size=258M] [32-bit]

And here is an example of another system where things are mixed up:

Device:
	Region 0: Memory at 2c00000000 (64-bit, prefetchable) [size=256M]
	Region 2: Memory at 94000000 (64-bit, prefetchable) [size=2M]
	Region 4: I/O ports at 1000 [size=256]
	Region 5: Memory at 94600000 (32-bit, non-prefetchable) [size=512K]

Bridge:
	Memory behind bridge: 94000000-946fffff [size=7M] [32-bit]
	Prefetchable memory behind bridge: 2c00000000-2c107fffff [size=264M] [32-bit]

In that example the doorbell ended up in the non-prefetchable window for some reason. And that config comes in all possible variations.

On AMD GPUs both BAR0 and BAR2 are resizeable.

So far we have only implemented resizing of BAR0, but essentially we want to have both for some use cases.

>> What we have is that BAR 0 and 2 are 64bit BARs which can (after some 
>> preparation) move around freely. But IIRC BAR 4 are the legacy I/O ports 
>> and BAR 5 is the 32bit MMIO registers (don't nail me on that, could be 
>> just the other way around).
>>
>> Especially that 32bit MMIO BAR *can't* move! Not only because it is 
>> 32bit, but also because the amdgpu driver as well as the HW itself 
>> through the VGA emulation, as well as the EFI/VESA/VBIOS code might 
>> reference its absolute address.
> 
> So if the 64-bit check is replaced with this:
> 
> +       /* Check if the parent bridge has a 64-bit (pref) memory resource */
> +       res = pci_resource_n(adev->pdev, 0)->parent;
> +       /* Trying to resize is pointless without a window above 4GB */
> +       if (!(res->flags & IORESOURCE_MEM_64))
> 		return 0;
> 
> ...I don't think it's possible for 32-bit resource to share that window 
> under _any_ circumstance.

Well see the example above. I have SSH access to a system where exactly that is the configuration.

> If you say that ->parent somehow points to a non-IORESOURCE_MEM_64 window 
> at this point, you're implying allocation for the 64-bit prefetchable 
> window was tried and failed, and __pci_assign_resource() then used one of 
> its fallbacks.

No, as I said that comes from the BIOS.

> Are you saying that "some preparation" includes making room for that 
> 64-bit prefetchable window that failed to assign earlier as I cannot see 
> how else it would ever get assigned so that the 64-bit BARs could be moved 
> there?

No, at least from the amdgpu driver side we don't touch the resource allocation at all.

In this case preparation means disabling the VGA emulation, cause otherwise trying to resize the BAR can just cause a spontaneous system reboot for some reason. 

>> Could we give pci_resize_resource() a mask of BARs which are save to 
>> release?
> 
> It is possible.

Then let us solve this issue by this somehow.

Regards,
Christian.

> 
>> Or maybe a flag to indicate that it can only free up 64bit BARs?
>>
>> Regards,
>> Christian.
>>
>>>
>>> Thanks a lot for checking this out!
>>>
>>
> 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-11-11 15:07           ` Christian König
@ 2025-11-11 15:52             ` Ilpo Järvinen
  0 siblings, 0 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-11-11 15:52 UTC (permalink / raw)
  To: Christian König
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Thomas Hellström,
	Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 7863 bytes --]

On Tue, 11 Nov 2025, Christian König wrote:

> On 11/11/25 13:56, Ilpo Järvinen wrote:
> > On Tue, 11 Nov 2025, Christian König wrote:
> > 
> >> On 11/11/25 12:08, Ilpo Järvinen wrote:
> >>> On Tue, 11 Nov 2025, Christian König wrote:
> >>>
> >>>> Sorry for the late reply I'm really busy at the moment.
> >>>>
> >>>> On 10/28/25 18:35, Ilpo Järvinen wrote:
> >>>>> PCI core handles releasing device's resources and their rollback in
> >>>>> case of failure of a BAR resizing operation. Releasing resource prior
> >>>>> to calling pci_resize_resource() prevents PCI core from restoring the
> >>>>> BARs as they were.
> >>>>
> >>>> I've intentionally didn't do it this way because at least on AMD HW we 
> >>>> could only release the VRAM and doorbell BAR (both 64bit), but not the 
> >>>> register BAR (32bit only).
> >>>>
> >>>> This patch set looks like the right thing in general, but which BARs are 
> >>>> now released by pci_resize_resource()?
> >>>>
> >>>> If we avoid releasing the 32bit BAR as well then that should work, 
> >>>> otherwise we will probably cause problems.
> >>>
> >>> After these changes, pci_resize_resource() releases BARs that share the 
> >>> bridge window with the BAR to be resized. So the answer depends on the 
> >>> upstream bridge.
> >>>
> >>> However, amdgpu_device_resize_fb_bar() also checks that root bus has a
> >>> resource with a 64-bit address. That won't tell what the nearest bridge 
> >>> has though. Maybe that check should be converted to check the resources of 
> >>> the nearest bus instead? It would make it impossible to have the 
> >>> 32-bit resource share the bridge window with the 64-bit resources so the 
> >>> resize would be safe.
> >>
> >> Mhm, I don't think that will work.
> >>
> >>
> >> I've added the check for the root bus to avoid a couple of issues during 
> >> resize, but checking the nearest bridge would block a whole bunch of use 
> >> cases and isn't even 100% save.
> >>
> >> See one use case of this is that all the BARs of the device start in the 
> >> same 32bit bridge window (or a mixture of 64bit and 32bit window).
> > 
> > "32bit bridge window" is ambiguous. There are non-prefetchable and 
> > prefetchable bridge windows, out of which the latter can be 64-bit as 
> > well. Which one you're talking about?
> 
> The non-prefetchable 32bit window.
> 
> > If a 64-bit prefetchable window exists, pbus_size_mem() nor 
> > __pci_assign_resource() would not have produced such a configuration where 
> > they're put into the same bridge window, even before the commit 
> > ae88d0b9c57f ("PCI: Use pbus_select_window_for_type() during mem window 
> > sizing") (I think). Now pbus_size_mem() certainly doesn't.
> 
> I need to double check, but if I'm not completely mistaken that is assigned by the BIOS.
> 
> Here is an example of a "good" configuration where both VRAM (BAR0) and doorbell (BAR2) is in the prefetchable window and MMIO in the non-prefetchable:
> 
> Device:
> 	Region 0: Memory at 80000000 (64-bit, prefetchable) [size=256M]
> 	Region 2: Memory at 90000000 (64-bit, prefetchable) [size=2M]
> 	Region 4: I/O ports at 3000 [size=256]
> 	Region 5: Memory at 9f300000 (32-bit, non-prefetchable) [size=1M]
> 
> Bridge:
> 	Memory behind bridge: 9f300000-9f4fffff [size=2M] [32-bit]
> 	Prefetchable memory behind bridge: 80000000-901fffff [size=258M] [32-bit]
> 
> And here is an example of another system where things are mixed up:
> 
> Device:
> 	Region 0: Memory at 2c00000000 (64-bit, prefetchable) [size=256M]
> 	Region 2: Memory at 94000000 (64-bit, prefetchable) [size=2M]
> 	Region 4: I/O ports at 1000 [size=256]
> 	Region 5: Memory at 94600000 (32-bit, non-prefetchable) [size=512K]
> 
> Bridge:
> 	Memory behind bridge: 94000000-946fffff [size=7M] [32-bit]
> 	Prefetchable memory behind bridge: 2c00000000-2c107fffff [size=264M] [32-bit]
> 
> In that example the doorbell ended up in the non-prefetchable window for 
> some reason. And that config comes in all possible variations.

The really odd thing is that there seems to be even room in that 
prefetchable window for a 2MB BAR.

(Unless it's ppc which I heard is placing small BARs in a weird way.)

> On AMD GPUs both BAR0 and BAR2 are resizeable.
>
> So far we have only implemented resizing of BAR0, but essentially we 
> want to have both for some use cases. 

Okay. My plan is anyway to change the resource fitting logic so it will 
leave enough space to fit as large resources as possible at where the 
resizable BARs is at (once I get that far). Then BAR resize itself can be
mostly done in-place without need to release the bridge windows at all.

> >> What we have is that BAR 0 and 2 are 64bit BARs which can (after some 
> >> preparation) move around freely. But IIRC BAR 4 are the legacy I/O ports 
> >> and BAR 5 is the 32bit MMIO registers (don't nail me on that, could be 
> >> just the other way around).
> >>
> >> Especially that 32bit MMIO BAR *can't* move! Not only because it is 
> >> 32bit, but also because the amdgpu driver as well as the HW itself 
> >> through the VGA emulation, as well as the EFI/VESA/VBIOS code might 
> >> reference its absolute address.
> > 
> > So if the 64-bit check is replaced with this:
> > 
> > +       /* Check if the parent bridge has a 64-bit (pref) memory resource */
> > +       res = pci_resource_n(adev->pdev, 0)->parent;
> > +       /* Trying to resize is pointless without a window above 4GB */
> > +       if (!(res->flags & IORESOURCE_MEM_64))
> > 		return 0;
> > 
> > ...I don't think it's possible for 32-bit resource to share that window 
> > under _any_ circumstance.
> 
> Well see the example above.

For the record, BAR0 would pass that 64-bit check above and could be 
resized safely too.

But I hear your point that this kind of mixed config seems possible for 
some reason.

> I have SSH access to a system where exactly that is the configuration.
>
> > If you say that ->parent somehow points to a non-IORESOURCE_MEM_64 window 
> > at this point, you're implying allocation for the 64-bit prefetchable 
> > window was tried and failed, and __pci_assign_resource() then used one of 
> > its fallbacks.
> 
> No, as I said that comes from the BIOS.

Normally we don't abide the BIOS allocations for normal BARs, only 
bridge windows are claimed using pci_claim_resource(). Only if 
preserve_config is set for the host bridge, also dev resources are 
claimed as they were discovered.

The normal BARs are normally added into the resource tree using 
pci_assign_resource() which will not end up using the resource address of 
the resource itself in determining where to place the 
resource (AFAICT from the code in __pci_assign_resource() ->
pci_bus_alloc_resource() -> pci_bus_alloc_from_region() -> 
allocate_resource() -> __find_resource_space()).

> > Are you saying that "some preparation" includes making room for that 
> > 64-bit prefetchable window that failed to assign earlier as I cannot see 
> > how else it would ever get assigned so that the 64-bit BARs could be moved 
> > there?
> 
> No, at least from the amdgpu driver side we don't touch the resource 
> allocation at all. 
> 
> In this case preparation means disabling the VGA emulation, cause 
> otherwise trying to resize the BAR can just cause a spontaneous system 
> reboot for some reason. 
>
> >> Could we give pci_resize_resource() a mask of BARs which are save to 
> >> release?
> > 
> > It is possible.
> 
> Then let us solve this issue by this somehow.

I've added exclude_bars parameters to pci_resize_resource() and made
amdgpu to pass 1 << 5 to it, will send v2 in a day or two with that. If 
you've better idea than using a literal like that, please let me know.

-- 
 i.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* RE: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize
  2025-11-11  9:08   ` Christian König
  2025-11-11 11:08     ` Ilpo Järvinen
@ 2025-11-11 23:30     ` Liu, Monk
  1 sibling, 0 replies; 28+ messages in thread
From: Liu, Monk @ 2025-11-11 23:30 UTC (permalink / raw)
  To: Koenig, Christian, Ilpo Järvinen, Alex Bennée,
	Simon Richter, Lucas De Marchi, Deucher, Alexander,
	amd-gfx@lists.freedesktop.org, Bjorn Helgaas, David Airlie,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, Jani Nikula, Joonas Lahtinen,
	linux-pci@vger.kernel.org, Rodrigo Vivi, Simona Vetter,
	Tvrtko Ursulin, Thomas Hellström, Michał Winiarski,
	linux-kernel@vger.kernel.org, Shi, Lianjie

[AMD Official Use Only - AMD Internal Distribution Only]

Hi @Shi, Lianjie

Do we think this patch series impact our use case ?

Monk Liu | Cloud GPU & Virtualization | AMD


-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Christian König
Sent: Tuesday, November 11, 2025 5:09 PM
To: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>; Alex Bennée <alex.bennee@linaro.org>; Simon Richter <Simon.Richter@hogyros.de>; Lucas De Marchi <lucas.demarchi@intel.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; amd-gfx@lists.freedesktop.org; Bjorn Helgaas <bhelgaas@google.com>; David Airlie <airlied@gmail.com>; dri-devel@lists.freedesktop.org; intel-gfx@lists.freedesktop.org; intel-xe@lists.freedesktop.org; Jani Nikula <jani.nikula@linux.intel.com>; Joonas Lahtinen <joonas.lahtinen@linux.intel.com>; linux-pci@vger.kernel.org; Rodrigo Vivi <rodrigo.vivi@intel.com>; Simona Vetter <simona@ffwll.ch>; Tvrtko Ursulin <tursulin@ursulin.net>; Thomas Hellström <thomas.hellstrom@linux.intel.com>; Michał Winiarski <michal.winiarski@intel.com>; linux-kernel@vger.kernel.org
Subject: Re: [PATCH 8/9] drm/amdgpu: Remove driver side BAR release before resize

Sorry for the late reply I'm really busy at the moment.

On 10/28/25 18:35, Ilpo Järvinen wrote:
> PCI core handles releasing device's resources and their rollback in
> case of failure of a BAR resizing operation. Releasing resource prior
> to calling pci_resize_resource() prevents PCI core from restoring the
> BARs as they were.

I've intentionally didn't do it this way because at least on AMD HW we could only release the VRAM and doorbell BAR (both 64bit), but not the register BAR (32bit only).

This patch set looks like the right thing in general, but which BARs are now released by pci_resize_resource()?

If we avoid releasing the 32bit BAR as well then that should work, otherwise we will probably cause problems.

Regards,
Christian.

>
> Remove driver-side release of BARs from the amdgpu driver.
>
> Also remove the driver initiated assignment as pci_resize_resource()
> should try to assign as much as possible. If the driver side call
> manages to get more required resources assigned in some scenario, such
> a problem should be fixed inside pci_resize_resource() instead.
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7a899fb4de29..65474d365229 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -1729,12 +1729,8 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>       pci_write_config_word(adev->pdev, PCI_COMMAND,
>                             cmd & ~PCI_COMMAND_MEMORY);
>
> -     /* Free the VRAM and doorbell BAR, we most likely need to move both. */
> +     /* Tear down doorbell as resizing will release BARs */
>       amdgpu_doorbell_fini(adev);
> -     if (adev->asic_type >= CHIP_BONAIRE)
> -             pci_release_resource(adev->pdev, 2);
> -
> -     pci_release_resource(adev->pdev, 0);
>
>       r = pci_resize_resource(adev->pdev, 0, rbar_size);
>       if (r == -ENOSPC)
> @@ -1743,8 +1739,6 @@ int amdgpu_device_resize_fb_bar(struct amdgpu_device *adev)
>       else if (r && r != -ENOTSUPP)
>               dev_err(adev->dev, "Problem resizing BAR0 (%d).", r);
>
> -     pci_assign_unassigned_bus_resources(adev->pdev->bus);
> -
>       /* When the doorbell or fb BAR isn't available we have no chance of
>        * using the device.
>        */


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size
  2025-10-28 17:35 ` [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size Ilpo Järvinen
@ 2025-11-13 16:29   ` Bjorn Helgaas
  2025-11-13 16:35     ` Ilpo Järvinen
  0 siblings, 1 reply; 28+ messages in thread
From: Bjorn Helgaas @ 2025-11-13 16:29 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, linux-kernel

On Tue, Oct 28, 2025 at 07:35:44PM +0200, Ilpo Järvinen wrote:
> pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
> struct pci_srvio keeps a cached copy of BAR size in unit of
> resource_size_t in ->barsz[] ...

Nit: s/pci_srvio/pci/sriov/  (fixed locally, FYI in case you post a v2)

I'm not sure what "unit of resource_size_t" adds here, maybe could be
removed to just say this?

  struct pci_srvio keeps a cached copy of BAR size in ->barsz[] ...

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size
  2025-11-13 16:29   ` Bjorn Helgaas
@ 2025-11-13 16:35     ` Ilpo Järvinen
  2025-11-13 16:57       ` Bjorn Helgaas
  2025-11-13 21:02       ` Bjorn Helgaas
  0 siblings, 2 replies; 28+ messages in thread
From: Ilpo Järvinen @ 2025-11-13 16:35 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

[-- Attachment #1: Type: text/plain, Size: 901 bytes --]

On Thu, 13 Nov 2025, Bjorn Helgaas wrote:

> On Tue, Oct 28, 2025 at 07:35:44PM +0200, Ilpo Järvinen wrote:
> > pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
> > struct pci_srvio keeps a cached copy of BAR size in unit of
> > resource_size_t in ->barsz[] ...
> 
> Nit: s/pci_srvio/pci/sriov/  (fixed locally, FYI in case you post a v2)

I just posted v2 without seeing this first. :-(

I seem to never learn to type those letters in the correct order, I don't 
know why I always keep typing them wrong.

> I'm not sure what "unit of resource_size_t" adds here, maybe could be
> removed to just say this?
> 
>   struct pci_srvio keeps a cached copy of BAR size in ->barsz[] ...

Seems okay with me. I just had it there to differentiate from "BAR size" 
which happens to often be the format directly compatible with field in the 
capability.

-- 
 i.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size
  2025-11-13 16:35     ` Ilpo Järvinen
@ 2025-11-13 16:57       ` Bjorn Helgaas
  2025-11-13 21:02       ` Bjorn Helgaas
  1 sibling, 0 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2025-11-13 16:57 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

On Thu, Nov 13, 2025 at 06:35:26PM +0200, Ilpo Järvinen wrote:
> On Thu, 13 Nov 2025, Bjorn Helgaas wrote:
> > On Tue, Oct 28, 2025 at 07:35:44PM +0200, Ilpo Järvinen wrote:
> > > pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
> > > struct pci_srvio keeps a cached copy of BAR size in unit of
> > > resource_size_t in ->barsz[] ...
> > 
> > Nit: s/pci_srvio/pci/sriov/  (fixed locally, FYI in case you post a v2)
> 
> I just posted v2 without seeing this first. :-(

Perfect, we crossed in the mail!  I'll tweak this locally.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size
  2025-11-13 16:35     ` Ilpo Järvinen
  2025-11-13 16:57       ` Bjorn Helgaas
@ 2025-11-13 21:02       ` Bjorn Helgaas
  1 sibling, 0 replies; 28+ messages in thread
From: Bjorn Helgaas @ 2025-11-13 21:02 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: Alex Bennée, Simon Richter, Lucas De Marchi, Alex Deucher,
	amd-gfx, Bjorn Helgaas, David Airlie, dri-devel, intel-gfx,
	intel-xe, Jani Nikula, Joonas Lahtinen, linux-pci, Rodrigo Vivi,
	Simona Vetter, Tvrtko Ursulin, Christian König,
	Thomas Hellström, Michał Winiarski, LKML

On Thu, Nov 13, 2025 at 06:35:26PM +0200, Ilpo Järvinen wrote:
> On Thu, 13 Nov 2025, Bjorn Helgaas wrote:
> 
> > On Tue, Oct 28, 2025 at 07:35:44PM +0200, Ilpo Järvinen wrote:
> > > pci_rebar_set_size() adjusts BAR size for both normal and IOV BARs. The
> > > struct pci_srvio keeps a cached copy of BAR size in unit of
> > > resource_size_t in ->barsz[] ...

> > I'm not sure what "unit of resource_size_t" adds here, maybe could be
> > removed to just say this?
> > 
> >   struct pci_srvio keeps a cached copy of BAR size in ->barsz[] ...
> 
> Seems okay with me. I just had it there to differentiate from "BAR size" 
> which happens to often be the format directly compatible with field in the 
> capability.

Ah, I see now, "size" used to be in bytes, but now it's the encoded
size.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2025-11-13 21:02 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-28 17:35 [PATCH 0/9] PCI: BAR resizing fix/rework Ilpo Järvinen
2025-10-28 17:35 ` [PATCH 1/9] PCI: Prevent resource tree corruption when BAR resize fails Ilpo Järvinen
2025-10-29 23:36   ` Bjorn Helgaas
2025-10-30  8:22     ` Ilpo Järvinen
2025-11-10 22:59       ` Bjorn Helgaas
2025-10-28 17:35 ` [PATCH 2/9] PCI/IOV: Adjust ->barsz[] when changing BAR size Ilpo Järvinen
2025-11-13 16:29   ` Bjorn Helgaas
2025-11-13 16:35     ` Ilpo Järvinen
2025-11-13 16:57       ` Bjorn Helgaas
2025-11-13 21:02       ` Bjorn Helgaas
2025-10-28 17:35 ` [PATCH 3/9] PCI: Change pci_dev variable from 'bridge' to 'dev' Ilpo Järvinen
2025-10-28 17:35 ` [PATCH 4/9] PCI: Try BAR resize even when no window was released Ilpo Järvinen
2025-10-28 17:35 ` [PATCH 5/9] PCI: Fix restoring BARs on BAR resize rollback path Ilpo Järvinen
2025-10-28 17:35 ` [PATCH 6/9] drm/xe: Remove driver side BAR release before resize Ilpo Järvinen
2025-10-28 21:24   ` Lucas De Marchi
2025-10-30 14:37     ` Lucas De Marchi
2025-10-28 17:35 ` [PATCH 7/9] drm/i915: " Ilpo Järvinen
2025-11-10 22:53   ` Bjorn Helgaas
2025-10-28 17:35 ` [PATCH 8/9] drm/amdgpu: " Ilpo Järvinen
2025-11-10 22:54   ` Bjorn Helgaas
2025-11-11  9:08   ` Christian König
2025-11-11 11:08     ` Ilpo Järvinen
2025-11-11 12:08       ` Christian König
2025-11-11 12:56         ` Ilpo Järvinen
2025-11-11 15:07           ` Christian König
2025-11-11 15:52             ` Ilpo Järvinen
2025-11-11 23:30     ` Liu, Monk
2025-10-28 17:35 ` [PATCH 9/9] PCI: Prevent restoring assigned resources Ilpo Järvinen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox