Linux PCI subsystem development
 help / color / mirror / Atom feed
From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Geramy Loveless <gloveless@jqluv.com>
Cc: "Cristian Cocos" <cristi@ieee.org>,
	"Christian König" <christian.koenig@amd.com>,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH v3 RESEND] PCI: release empty sibling bridge windows during window resize
Date: Wed, 29 Apr 2026 13:34:19 +0300 (EEST)	[thread overview]
Message-ID: <c762ef52-11f8-cf82-7e0e-32323c59c6e6@linux.intel.com> (raw)
In-Reply-To: <20260428225532.3108047-1-gloveless@jqluv.com>

[-- Attachment #1: Type: text/plain, Size: 8015 bytes --]

On Tue, 28 Apr 2026, Geramy Loveless wrote:

> pbus_reassign_bridge_resources() refuses to release bridge windows that
> have child resources. On multi-port PCIe switches (e.g. Thunderbolt
> docks), empty sibling downstream ports hold small reservations that
> prevent the parent window from being freed and re-sized for rebar.
> 
> Walk descendants of the target window depth-first, saving and freeing
> each empty bridge window so the parent can then be released and grown.
> 
> Without this change, attempts to grow a window deep below a switch
> fabric leave nested empty windows pinning the ancestor:
> 
>   pcieport 0000:65:00.0: bridge window [mem 0x8880000000-0x98800fffff
>     64bit pref]: not released, active children present
>   pcieport 0000:00:03.2: bridge window [mem 0x8880000000-0x98800fffff
>     64bit pref]: not released, active children present
> 
> With the bottom-up walk and upstream ancestry check, the chain unwinds
> and the rebar succeeds:
> 
>   pcieport 0000:96:00.0: bridge window [...]: releasing
>   pcieport 0000:95:00.0: bridge window [...]: releasing
>   pcieport 0000:94:00.0: bridge window [...]: releasing
>   pcieport 0000:65:00.0: bridge window [...]: releasing
>   amdgpu 0000:97:00.0: BAR 0 [mem 0x9000000000-0x97ffffffff 64bit pref]:
>     assigned
> 
> Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Signed-off-by: Geramy Loveless <gloveless@jqluv.com>
> ---
> RESEND: my earlier v3 (Message-ID 20260428225146.3104063-1-gloveless@jqluv.com)
> was sent with stale code — the diff was byte-identical to v2 because the
> v3 edits weren't staged before commit --amend. This resend contains the
> actual v3 content described below. Apologies for the noise.

Hi,

I already came to that conclusion before seeing this... It happens, no 
worries. :-)

> v3:
>  - Restructure helper: outer loop only recurses; inner iterates bus
>    resources via pci_bus_for_each_resource(). The release path is now
>    unified, so the caller-side block in pbus_reassign_bridge_resources()
>    is gone.
>  - Walk the resource tree upstream when checking ancestry of @b_win,
>    so descendants in nested topologies are also released (Ilpo).
>  - Drop pci_resource_is_bridge_win() filter; bus resources are
>    inherently bridge windows.
>  - Rename pci_bus_release_empty_bridge_resources() to
>    pbus_release_empty_bridge_resources().
>  - Drop misleading "Uses PCI bus/device iterators..." paragraph from
>    commit message.
>  - Kerneldoc tweaks per Ilpo's review.
> 
> v2:
>  - Use PCI bus/device iterators instead of walking raw resource tree.
>  - Save released resources to rollback list.
>  - Filter via pci_resource_is_bridge_win() (now removed in v3).
>  - Call release before the !res->child check.
>  - Check bridge->subordinate before recursion.
> ---
>  drivers/pci/setup-bus.c | 78 ++++++++++++++++++++++++++++++++---------
>  1 file changed, 61 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> index 4cf120ebe5a..b47dc1a2925 100644
> --- a/drivers/pci/setup-bus.c
> +++ b/drivers/pci/setup-bus.c
> @@ -2292,6 +2292,63 @@ void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge)
>  }
>  EXPORT_SYMBOL_GPL(pci_assign_unassigned_bridge_resources);
>  
> +/*
> + * pbus_release_empty_bridge_resources - Bottom-up release of bridge windows
> + * in empty subtrees.
> + * @bus: PCI bus whose child bridges to process
> + * @b_win: Ancestor bridge window; only resources that are (grand)parented
> + *	   by @b_win are released
> + * @saved: List to store released resources into for rollback
> + *
> + * Recurses depth-first into subordinate buses, then releases bridge windows
> + * that are (grand)parented by @b_win on the way back up. Each resource is
> + * stored into @saved before release so the entire operation can be rolled
> + * back.
> + */
> +static void pbus_release_empty_bridge_resources(struct pci_bus *bus,
> +						struct resource *b_win,
> +						struct list_head *saved)
> +{
> +	struct pci_dev *dev;
> +	struct resource *r;
> +	unsigned int i;
> +
> +	list_for_each_entry(dev, &bus->devices, bus_list) {
> +		if (dev->subordinate)
> +			pbus_release_empty_bridge_resources(dev->subordinate,
> +							    b_win, saved);
> +	}
> +
> +	pci_bus_for_each_resource(bus, r, i) {

While looking into i vs idx (which should be the same thing), I realized
there's one bad bug here:

pci_bus_for_each_resource() also iterates over subtractive decode bridge 
resources for which you cannot take pci_resource_num(bus->self, r) 
because that's invalid memorywise (they're resource of another struct 
pci_dev). And passing idx then to pci_release_resource() is also invalid 
(it might actually "work" as the second bug might reverse the effects of 
the first one but it's still really bad).

What we want here is to only consider resources local to this bridge 
(bus->self) so we need to add:

		if (i < PCI_BRIDGE_RESOURCE_NUM)
			break;

BUT, it might be worth adding a new iterator that doesn't have this trap:

	pci_bridge_for_each_window(bridge, r, ...)

...There would be some use for it in the existing code as well so I've 
been considering adding it anyway to eliminate the various loop forms 
that do this same iteration.

> +		struct resource *p;
> +		unsigned int idx;
> +
> +		if (!r || !resource_assigned(r))
> +			continue;
> +
> +		/* Walk upstream to confirm r descends from (or equals) b_win */
> +		for (p = r; p && p != b_win; p = p->parent)
> +			;
> +		if (!p)
> +			continue;

Can you place this loop into a new function that returns bool. I believe 
it will also make the code easier to follow as you don't need p and can 
return immmediately on match. With a reasonable name, you don't need the 
comment anymore as the naming explains what the code does.

I've need for the same check myself (in some other changes I'm 
developing), in preparation for that, I suggest placing the function under 
pbus_select_window() as they're kind of related.

> +
> +		idx = pci_resource_num(bus->self, r);

This can be dropped.

You can use i (+ PCI_BRIDGE_RESOURCES) in pci_release_resource() call.


The general structure of the code seems fine now.

> +
> +		if (r->child) {
> +			const char *res_name = pci_resource_name(bus->self, idx);
> +
> +			pci_info(bus->self, "%s %pR: not released, active children present\n",
> +				 res_name, r);
> +			continue;
> +		}
> +
> +		if (pci_dev_res_add_to_list(saved, bus->self, r, 0, 0))
> +			continue;
> +
> +		pci_release_resource(bus->self, idx);
> +	}
> +}
> +
>  /*
>   * Walk to the root bus, find the bridge window relevant for @res and
>   * release it when possible. If the bridge window contains assigned
> @@ -2305,7 +2362,6 @@ static int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *
>  	struct pci_dev *bridge = NULL;
>  	LIST_HEAD(added);
>  	LIST_HEAD(failed);
> -	unsigned int i;
>  	int ret = 0;
>  
>  	while (!pci_is_root_bus(bus)) {
> @@ -2314,22 +2370,10 @@ static int pbus_reassign_bridge_resources(struct pci_bus *bus, struct resource *
>  		if (!res)
>  			break;
>  
> -		i = pci_resource_num(bridge, res);
> -
> -		/* Ignore BARs which are still in use */
> -		if (!res->child) {
> -			ret = pci_dev_res_add_to_list(saved, bridge, res, 0, 0);
> -			if (ret)
> -				return ret;
> -
> -			pci_release_resource(bridge, i);
> -		} else {
> -			const char *res_name = pci_resource_name(bridge, i);
> -
> -			pci_warn(bridge,
> -				 "%s %pR: was not released (still contains assigned resources)\n",
> -				 res_name, res);
> -		}
> +		/* Release this bridge window and any empty descendants bottom-up */
> +		if (bridge->subordinate)
> +			pbus_release_empty_bridge_resources(bridge->subordinate,
> +							    res, saved);
>  
>  		bus = bus->parent;
>  	}
> 

-- 
 i.

      reply	other threads:[~2026-04-29 10:34 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-28 22:55 [PATCH v3 RESEND] PCI: release empty sibling bridge windows during window resize Geramy Loveless
2026-04-29 10:34 ` Ilpo Järvinen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c762ef52-11f8-cf82-7e0e-32323c59c6e6@linux.intel.com \
    --to=ilpo.jarvinen@linux.intel.com \
    --cc=christian.koenig@amd.com \
    --cc=cristi@ieee.org \
    --cc=gloveless@jqluv.com \
    --cc=linux-pci@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox