* [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()
From: Rui He @ 2025-08-14  9:39 UTC
To: Bjorn Helgaas
Cc: linux-pci, linux-kernel, Prashant.Chikhalkar, Jiguang.Xiao, Rui.He

For a preconfigured PCI bridge, the child bus is created on the first
scan. If, for some reason (e.g. register mutation), the secondary and
subordinate registers are reset to 0 on the second scan, a PCI bus is
created twice for the same PCI device.

Following is the related log:
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
[Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]

Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.

This patch checks whether the child PCI bus has already been created
on the second scan of the bridge. If so, it returns directly instead
of creating a new one.

Signed-off-by: Rui He <rui.he@windriver.com>
---
 drivers/pci/probe.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index f41128f91ca76..ec67adbf31738 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1444,6 +1444,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
 			goto out;
 		}
 
+		if(pci_has_subordinate(dev))
+			goto out;
+
 		/* Clear errors */
 		pci_write_config_word(dev, PCI_STATUS, 0xffff);
 
-- 
2.43.0

* Re: [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()
From: Bjorn Helgaas @ 2025-08-14 20:36 UTC
To: Rui He
Cc: Bjorn Helgaas, linux-pci, linux-kernel, Prashant.Chikhalkar, Jiguang.Xiao

On Thu, Aug 14, 2025 at 05:39:37PM +0800, Rui He wrote:
> For a preconfigured PCI bridge, the child bus is created on the first
> scan. If, for some reason (e.g. register mutation), the secondary and
> subordinate registers are reset to 0 on the second scan, a PCI bus is
> created twice for the same PCI device.

I don't quite follow this.  Do you mean something is changing the
bridge configuration between the first and second scans?

> Following is the related log:
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]

Drop the timestamps (since they don't contribute to understanding the
problem) and indent the logs a couple spaces.

> Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.

It looks like the [bus 0f-10] range is assigned to both bridges
(0b:01.0 and 0b:05.0), which would definitely be a problem.

I'm surprised that we haven't tripped over this before, and I'm
curious about how we got here.  Can you set CONFIG_DYNAMIC_DEBUG=y,
boot with the dyndbg="file drivers/pci/* +p" kernel parameter, and
collect the complete dmesg log?

> This patch checks whether the child PCI bus has already been created
> on the second scan of the bridge. If so, it returns directly instead
> of creating a new one.
>
> Signed-off-by: Rui He <rui.he@windriver.com>
> ---
>  drivers/pci/probe.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index f41128f91ca76..ec67adbf31738 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1444,6 +1444,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
>  			goto out;
>  		}
>
> +		if(pci_has_subordinate(dev))
> +			goto out;

Follow the coding style, i.e., add a space in "if (pci_..."

>  		/* Clear errors */
>  		pci_write_config_word(dev, PCI_STATUS, 0xffff);
>
> --
> 2.43.0
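
To make the overlap noted in the review concrete, here is a minimal standalone C sketch. It is not kernel code: struct bus_range and bus_ranges_overlap() are hypothetical names introduced only for illustration, and the only inputs assumed are the [secondary-subordinate] ranges printed in the quoted log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a bridge's downstream bus range, e.g. [bus 0e-10]. */
struct bus_range {
	uint8_t secondary;	/* first bus number behind the bridge */
	uint8_t subordinate;	/* last bus number behind the bridge */
};

/*
 * Two inclusive ranges overlap when each one starts no later than the
 * other one ends. Sibling bridges on the same bus must never overlap,
 * otherwise config cycles for a shared bus number are ambiguous.
 */
static inline bool bus_ranges_overlap(struct bus_range a, struct bus_range b)
{
	return a.secondary <= b.subordinate && b.secondary <= a.subordinate;
}
```

With the values from the log, [bus 0e-10] on 0b:01.0 and [bus 0f-10] on 0b:05.0 overlap, whereas the preconfigured [bus 0d] on 0b:01.0 and [bus 0f-10] would not.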

* RE: [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()
From: He, Rui @ 2025-08-15  2:31 UTC
To: Bjorn Helgaas
Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Chikhalkar, Prashant, Xiao, Jiguang

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: August 15, 2025 4:36
>
> On Thu, Aug 14, 2025 at 05:39:37PM +0800, Rui He wrote:
> > For a preconfigured PCI bridge, the child bus is created on the first
> > scan. If, for some reason (e.g. register mutation), the secondary and
> > subordinate registers are reset to 0 on the second scan, a PCI bus is
> > created twice for the same PCI device.
>
> I don't quite follow this.  Do you mean something is changing the bridge
> configuration between the first and second scans?

I'm not sure what changed the bridge configuration, but the secondary
and subordinate numbers were indeed 0 on the second scan, because
[bus 0e-10] was created for 0000:0b:01.0.

In my opinion, it might be caused by an invalid communication or a
register mutation in the PCI bridge.

> > Following is the related log:
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]
>
> Drop the timestamps (since they don't contribute to understanding the
> problem) and indent the logs a couple spaces.

OK

> > Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.
>
> It looks like the [bus 0f-10] range is assigned to both bridges
> (0b:01.0 and 0b:05.0), which would definitely be a problem.
>
> I'm surprised that we haven't tripped over this before, and I'm curious about
> how we got here.  Can you set CONFIG_DYNAMIC_DEBUG=y, boot with the
> dyndbg="file drivers/pci/* +p" kernel parameter, and collect the complete
> dmesg log?

Sorry, this is an isolated issue that cannot be reproduced, so I
cannot offer more detailed logs.

> > This patch checks whether the child PCI bus has already been created
> > on the second scan of the bridge. If so, it returns directly instead
> > of creating a new one.
> >
> > Signed-off-by: Rui He <rui.he@windriver.com>
> > ---
> >  drivers/pci/probe.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index f41128f91ca76..ec67adbf31738 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1444,6 +1444,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
> >  			goto out;
> >  		}
> >
> > +		if(pci_has_subordinate(dev))
> > +			goto out;
>
> Follow the coding style, i.e., add a space in "if (pci_..."

Will be changed in v2.

> >  		/* Clear errors */
> >  		pci_write_config_word(dev, PCI_STATUS, 0xffff);
> >
> > --
> > 2.43.0

* Re: [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()
From: Bjorn Helgaas @ 2025-08-15 14:22 UTC
To: He, Rui
Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Chikhalkar, Prashant, Xiao, Jiguang

On Fri, Aug 15, 2025 at 02:31:31AM +0000, He, Rui wrote:
> > -----Original Message-----
> > From: Bjorn Helgaas <helgaas@kernel.org>
> > Sent: August 15, 2025 4:36
> >
> > On Thu, Aug 14, 2025 at 05:39:37PM +0800, Rui He wrote:
> > > For a preconfigured PCI bridge, the child bus is created on the first
> > > scan. If, for some reason (e.g. register mutation), the secondary and
> > > subordinate registers are reset to 0 on the second scan, a PCI bus is
> > > created twice for the same PCI device.
> >
> > I don't quite follow this.  Do you mean something is changing the
> > bridge configuration between the first and second scans?
>
> I'm not sure what changed the bridge configuration, but the
> secondary and subordinate numbers were indeed 0 on the second scan,
> because [bus 0e-10] was created for 0000:0b:01.0.
>
> In my opinion, it might be caused by an invalid communication or a
> register mutation in the PCI bridge.

> > > Following is the related log:
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
> > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]

> > > Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.
> >
> > It looks like the [bus 0f-10] range is assigned to both bridges
> > (0b:01.0 and 0b:05.0), which would definitely be a problem.
> >
> > I'm surprised that we haven't tripped over this before, and I'm
> > curious about how we got here.  Can you set
> > CONFIG_DYNAMIC_DEBUG=y, boot with the dyndbg="file drivers/pci/*
> > +p" kernel parameter, and collect the complete dmesg log?
>
> Sorry, this is an isolated issue that cannot be reproduced, so I
> cannot offer more detailed logs.

Do you have the complete dmesg log from this one time you saw the
problem?

As-is, I don't think there's quite enough here to move forward with
this.  I think we need some more detailed analysis to figure out how
this happens.

Bjorn

* RE: [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()
From: He, Rui @ 2025-08-26  7:01 UTC
To: Bjorn Helgaas
Cc: Bjorn Helgaas, linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Chikhalkar, Prashant, Xiao, Jiguang

[-- Attachment #1: Type: text/plain, Size: 5290 bytes --]

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@kernel.org>
> Sent: August 15, 2025 22:23
>
> On Fri, Aug 15, 2025 at 02:31:31AM +0000, He, Rui wrote:
> > > -----Original Message-----
> > > From: Bjorn Helgaas <helgaas@kernel.org>
> > > Sent: August 15, 2025 4:36
> > >
> > > On Thu, Aug 14, 2025 at 05:39:37PM +0800, Rui He wrote:
> > > > For a preconfigured PCI bridge, the child bus is created on the
> > > > first scan. If, for some reason (e.g. register mutation), the
> > > > secondary and subordinate registers are reset to 0 on the second
> > > > scan, a PCI bus is created twice for the same PCI device.
> > >
> > > I don't quite follow this.  Do you mean something is changing the
> > > bridge configuration between the first and second scans?
> >
> > I'm not sure what changed the bridge configuration, but the secondary
> > and subordinate numbers were indeed 0 on the second scan, because
> > [bus 0e-10] was created for 0000:0b:01.0.
> >
> > In my opinion, it might be caused by an invalid communication or a
> > register mutation in the PCI bridge.
>
> > > > Following is the related log:
> > > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
> > > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
> > > > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]
>
> > > > Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.
> > >
> > > It looks like the [bus 0f-10] range is assigned to both bridges
> > > (0b:01.0 and 0b:05.0), which would definitely be a problem.
> > >
> > > I'm surprised that we haven't tripped over this before, and I'm
> > > curious about how we got here.  Can you set CONFIG_DYNAMIC_DEBUG=y,
> > > boot with the dyndbg="file drivers/pci/*
> > > +p" kernel parameter, and collect the complete dmesg log?
> >
> > Sorry, this is an isolated issue that cannot be reproduced, so I
> > cannot offer more detailed logs.
>
> Do you have the complete dmesg log from this one time you saw the problem?
>
> As-is, I don't think there's quite enough here to move forward with this.  I think
> we need some more detailed analysis to figure out how this happens.
>
> Bjorn

Attached are the dmesg logs captured while scanning the PCI bus.

As the dmesg is customer sensitive, I have removed irrelevant logs and
kept only the PCI enumeration logs that could be obtained.

In those logs, "Quirks: Set bus range to 0xAABBCC" is printed by a
custom hook declared through DECLARE_PCI_FIXUP_EARLY() in
drivers/pci/quirks.c. 0xAABBCC refers to the predefined PCI bridge
Subordinate Bus Number (AA), Secondary Bus Number (BB), and Primary
Bus Number (CC).

According to the logs, buses 0d, 0e, and 0c are all created for PCI
device 0000:0b:01.0, but when PCI device 0000:0b:01.0 is removed, only
bus 0c is removed; buses 0d and 0e are still there.

Following is the related "lspci" output:

# lspci -tv
......
 \-[0000:00]-+-00.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMI2
             +-03.1-[08-73]----00.0-[09-73]--+-00.0  Microsemi / PMC / IDT PES24NT24G2 PCI Express Switch
             |                               +-02.0-[0a-10]----00.0-[0b-10]--+-01.0-[0d]----00.0  Device xxxx
             |                               |                               +-04.0  PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch
             |                               |                               +-05.0-[00]--
             |                               |                               +-07.0-[00]--
             |                               |                               \-09.0-[00]--
             |                               +-03.0-[11-17]----00.0-[12-17]--+-01.0-[14]----00.0  Device xxxx
             |                               |                               +-04.0  PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch
             |                               |                               +-05.0-[00]--
             |                               |                               +-07.0-[00]--
             |                               |                               \-09.0-[00]--
......

Thanks,
Rui

[-- Attachment #2: dmesg-pci-scan.txt --]
[-- Type: text/plain, Size: 10955 bytes --]

pci 0000:11:00.0: [10b5:8606] type 01 class 0x060400
pci 0000:11:00.0: Quirks: Set bus range to 0x171211
pci 0000:11:00.0: reg 0x10: [mem 0x00000000-0x0001ffff]
pci 0000:11:00.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:11:00.0: PME# supported from D0 D3hot D3cold
pci 0000:11:00.0: Adding to iommu group 28
pci 0000:12:01.0: [10b5:8606] type 01 class 0x060400
pci 0000:12:01.0: Quirks: Set bus range to 0x141412
pci 0000:12:01.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:12:01.0: PME# supported from D0 D3hot D3cold
pci 0000:12:01.0: Adding to iommu group 28
pci 0000:12:04.0: [10b5:8606] type 00 class 0x068000
pci 0000:12:04.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:12:04.0: Adding to iommu group 28
pci 0000:12:05.0: [10b5:8606] type 01 class 0x060400
pci 0000:12:05.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:12:05.0: PME# supported from D0 D3hot D3cold
pci 0000:12:05.0: Adding to iommu group 28
pci 0000:12:07.0: [10b5:8606] type 01 class 0x060400
pci 0000:12:07.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:12:07.0: PME# supported from D0 D3hot D3cold
pci 0000:12:07.0: Adding to iommu group 28
pci 0000:12:09.0: [10b5:8606] type 01 class 0x060400
pci 0000:12:09.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:12:09.0: PME# supported from D0 D3hot D3cold
pci 0000:12:09.0: Adding to iommu group 28
pci 0000:11:00.0: PCI bridge to [bus 12-17]
pci 0000:11:00.0: bridge window [mem 0xa1000000-0xa1ffffff]
......
(removed the enumeration of 0000:14:00.0 as it's customer sensitive)
......
pci 0000:12:01.0: PCI bridge to [bus 14]
pci 0000:12:01.0: bridge window [mem 0xa1000000-0xa17fffff]
pci 0000:12:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:12:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:12:09.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:12:05.0: PCI bridge to [bus 15-17]
pci_bus 0000:15: busn_res: [bus 15-17] end is updated to 15
pci 0000:12:07.0: PCI bridge to [bus 16-17]
pci_bus 0000:16: busn_res: [bus 16-17] end is updated to 16
pci 0000:12:09.0: PCI bridge to [bus 17]
pci_bus 0000:17: busn_res: [bus 17] end is updated to 17
pcieport 0000:09:02.0: bridge window [io 0x1000-0x0fff] to [bus 0a-10] add_size 1000
pcieport 0000:09:03.0: bridge window [io 0x1000-0x0fff] to [bus 11-17] add_size 1000
pci 0000:11:00.0: BAR 14: assigned [mem 0xa1000000-0xa1ffffff]
pci 0000:12:01.0: BAR 14: assigned [mem 0xa1000000-0xa17fffff]
pci 0000:14:00.0: BAR 0: assigned [mem 0xa1000000-0xa17fffff 64bit]
pci 0000:12:01.0: PCI bridge to [bus 14]
pci 0000:12:01.0: bridge window [mem 0xa1000000-0xa17fffff]
pci 0000:12:05.0: PCI bridge to [bus 15]
pci 0000:12:07.0: PCI bridge to [bus 16]
pci 0000:12:09.0: PCI bridge to [bus 17]
pci 0000:11:00.0: PCI bridge to [bus 12-17]
pci 0000:11:00.0: bridge window [mem 0xa1000000-0xa1ffffff]
pcieport 0000:12:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:09.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci_bus 0000:15: busn_res: [bus 15] end is updated to 15
pci_bus 0000:16: busn_res: [bus 16] end is updated to 16
pci_bus 0000:17: busn_res: [bus 17] end is updated to 17
pcieport 0000:09:02.0: bridge window [io 0x1000-0x0fff] to [bus 0a-10] add_size 1000
pcieport 0000:09:03.0: bridge window [io 0x1000-0x0fff] to [bus 11-17] add_size 1000
pci 0000:0a:00.0: [10b5:8606] type 01 class 0x060400
pci 0000:0a:00.0: Quirks: Set bus range to 0x100b0a
pci 0000:0a:00.0: reg 0x10: [mem 0x00000000-0x0001ffff]
pci 0000:0a:00.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:0a:00.0: PME# supported from D0 D3hot D3cold
pci 0000:0a:00.0: Adding to iommu group 27
pci 0000:0b:01.0: [10b5:8606] type 01 class 0x060400
pci 0000:0b:01.0: Quirks: Set bus range to 0x0d0d0b
pci 0000:0b:01.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:0b:01.0: PME# supported from D0 D3hot D3cold
pci 0000:0b:01.0: Adding to iommu group 27
pci 0000:0b:04.0: [10b5:8606] type 00 class 0x068000
pci 0000:0b:04.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:0b:04.0: Adding to iommu group 27
pci 0000:0b:05.0: [10b5:8606] type 01 class 0x060400
pci 0000:0b:05.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:0b:05.0: PME# supported from D0 D3hot D3cold
pci 0000:0b:05.0: Adding to iommu group 27
pci 0000:0b:07.0: [10b5:8606] type 01 class 0x060400
pci 0000:0b:07.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:0b:07.0: PME# supported from D0 D3hot D3cold
pci 0000:0b:07.0: Adding to iommu group 27
pci 0000:0b:09.0: [10b5:8606] type 01 class 0x060400
pci 0000:0b:09.0: Max Payload Size set to 256 (was 128, max 512)
pci 0000:0b:09.0: PME# supported from D0 D3hot D3cold
pci 0000:0b:09.0: Adding to iommu group 27
pci 0000:0a:00.0: PCI bridge to [bus 0b-10]
pci 0000:0a:00.0: bridge window [mem 0xa0000000-0xa0ffffff]
......
(removed the enumeration of 0000:0d:00.0 as it's customer sensitive)
......
pci 0000:0b:01.0: PCI bridge to [bus 0d]
pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:0b:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:0b:09.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
pci_bus 0000:0e: busn_res: [bus 0e-10] end is updated to 0e
pci 0000:0b:05.0: PCI bridge to [bus 0f-10]
pci_bus 0000:0f: busn_res: [bus 0f-10] end is updated to 0f
pci 0000:0b:07.0: PCI bridge to [bus 10]
pci_bus 0000:10: busn_res: [bus 10] end is updated to 10
pcieport 0000:12:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:09.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci_bus 0000:15: busn_res: [bus 15] end is updated to 15
pci_bus 0000:16: busn_res: [bus 16] end is updated to 16
pci_bus 0000:17: busn_res: [bus 17] end is updated to 17
pci_bus 0000:11: busn_res: [bus 11-17] end is updated to 17
pci 0000:0b:09.0: devices behind bridge are unusable because [bus 11-17] cannot be assigned for them
pci 0000:0a:00.0: bridge has subordinate 10 but max busn 17
pcieport 0000:12:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:09.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci_bus 0000:15: busn_res: [bus 15] end is updated to 15
pci_bus 0000:16: busn_res: [bus 16] end is updated to 16
pci_bus 0000:17: busn_res: [bus 17] end is updated to 17
pcieport 0000:09:02.0: bridge window [io 0x1000-0x0fff] to [bus 0a-10] add_size 1000
pcieport 0000:09:03.0: bridge window [io 0x1000-0x0fff] to [bus 11-17] add_size 1000
pci 0000:0a:00.0: BAR 14: assigned [mem 0xa0000000-0xa0ffffff]
pci 0000:0a:00.0: BAR 0: no space for [mem size 0x00020000]
pci 0000:0a:00.0: BAR 0: failed to assign [mem size 0x00020000]
pci 0000:0b:01.0: PCI bridge to [bus 0e]
pci 0000:0b:05.0: PCI bridge to [bus 0f]
pci 0000:0b:07.0: PCI bridge to [bus 10]
pci 0000:0a:00.0: PCI bridge to [bus 0b-10]
pci 0000:0a:00.0: bridge window [mem 0xa0000000-0xa0ffffff]
pcieport 0000:0b:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:0b:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:0b:09.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci_bus 0000:0c: busn_res: can not insert [bus 0c-10] under [bus 0b-10] (conflicts with (null) [bus 0d])
......
(removed the enumeration of 0000:0d:00.0 as it's customer sensitive)
......
pcieport 0000:0b:01.0: PCI bridge to [bus 0c-10]
pci_bus 0000:0c: busn_res: [bus 0c-10] end is updated to 0c
pci_bus 0000:0d: busn_res: [bus 0d] end is updated to 0d
pci_bus 0000:0e: busn_res: [bus 0e] end is updated to 0e
pci_bus 0000:0f: busn_res: [bus 0f] end is updated to 0f
pcieport 0000:12:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:07.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pcieport 0000:12:09.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci_bus 0000:15: busn_res: [bus 15] end is updated to 15
pci_bus 0000:16: busn_res: [bus 16] end is updated to 16
pci_bus 0000:17: busn_res: [bus 17] end is updated to 17
pcieport 0000:09:02.0: bridge window [io 0x1000-0x0fff] to [bus 0a-10] add_size 1000
pcieport 0000:09:03.0: bridge window [io 0x1000-0x0fff] to [bus 11-17] add_size 1000
pcieport 0000:0b:01.0: BAR 14: assigned [mem 0xa0000000-0xa07fffff]
pci 0000:0c:00.0: BAR 0: assigned [mem 0xa0000000-0xa07fffff 64bit]
pcieport 0000:00:03.1: AER: Multiple Uncorrected (Non-Fatal) error received: 0000:0a:00.0
pcieport 0000:0a:00.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
pcieport 0000:0a:00.0: AER: device [10b5:8606] error status/mask=00100000/00000000
pcieport 0000:0a:00.0: AER: [20] UnsupReq (First)
pcieport 0000:0a:00.0: AER: TLP Header: 40000001 0000060f a0000204 00000000
pcieport 0000:0a:00.0: AER: Error of this Agent is reported first
pcieport 0000:09:02.0: AER: Device recovery failed
pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:0a:00.0
pcieport 0000:0a:00.0: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
pcieport 0000:0a:00.0: AER: device [10b5:8606] error status/mask=00100000/00000000
pcieport 0000:0a:00.0: AER: [20] UnsupReq (First)
pcieport 0000:0a:00.0: AER: TLP Header: 40000001 0000060f a0000204 00000000
pcieport 0000:09:02.0: AER: Device recovery failed
pci 0000:0c:00.0: Removing from iommu group 27
pci_bus 0000:0c: busn_res: [bus 0c] is released
pci 0000:0b:01.0: Removing from iommu group 27
pci 0000:0b:04.0: Removing from iommu group 27
pci_bus 0000:0f: busn_res: [bus 0f] is released
pci 0000:0b:05.0: Removing from iommu group 27
pci_bus 0000:10: busn_res: [bus 10] is released
pci 0000:0b:07.0: Removing from iommu group 27
pci 0000:0b:09.0: Removing from iommu group 27
pci_bus 0000:0b: busn_res: [bus 0b-10] is released
pci 0000:0a:00.0: Removing from iommu group 27
pcieport 0000:00:03.1: AER: Uncorrected (Non-Fatal) error received: 0000:00:03.1
pcieport 0000:00:03.1: AER: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
pcieport 0000:00:03.1: AER: device [8086:2f09] error status/mask=00004000/00000000
pcieport 0000:00:03.1: AER: [14] CmpltTO (First)
pcieport 0000:00:03.1: AER: Device recovery failed
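
The "Quirks: Set bus range to 0xAABBCC" values in the attached log can be decoded with a few shifts. The following is a minimal standalone C sketch, assuming the layout described in the message above (subordinate in bits 23:16, secondary in bits 15:8, primary in bits 7:0). struct bus_numbers and decode_bus_range() are hypothetical names used only for illustration; the custom quirk itself is not shown in this thread.

```c
#include <stdint.h>

/* Hypothetical container for the three bus numbers packed in 0xAABBCC. */
struct bus_numbers {
	uint8_t primary;	/* CC: bus the bridge itself sits on */
	uint8_t secondary;	/* BB: first bus behind the bridge */
	uint8_t subordinate;	/* AA: highest bus behind the bridge */
};

/* Unpack a 24-bit "bus range" value as printed by the quirk. */
static inline struct bus_numbers decode_bus_range(uint32_t value)
{
	struct bus_numbers n;

	n.primary = value & 0xff;
	n.secondary = (value >> 8) & 0xff;
	n.subordinate = (value >> 16) & 0xff;
	return n;
}
```

For example, 0x0d0d0b for 0000:0b:01.0 decodes to primary 0b, secondary 0d, subordinate 0d, which matches the "PCI bridge to [bus 0d]" line from the first scan, and 0x171211 for 0000:11:00.0 decodes to [bus 12-17] behind a bridge on bus 11.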

* Re: [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()
From: Ethan Zhao @ 2025-08-17  2:46 UTC
To: Rui He, Bjorn Helgaas
Cc: linux-pci, linux-kernel, Prashant.Chikhalkar, Jiguang.Xiao

On 8/14/2025 5:39 PM, Rui He wrote:
> For a preconfigured PCI bridge, the child bus is created on the first
> scan. If, for some reason (e.g. register mutation), the secondary and
> subordinate registers are reset to 0 on the second scan, a PCI bus is
> created twice for the same PCI device.
>
> Following is the related log:
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
> [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]

Could you attach 'lspci -t' output showing the topology?
Bridges 0000:0b:01.0 and 0000:0b:05.0 have the same subordinate bus
number. That is odd; it seems they are not connected as upstream and
downstream, but are siblings.

Does the device behind the bridge 0000:0b:05.0 work after the second
scan (are TLPs forwarded)?

> Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.
>
> This patch checks whether the child PCI bus has already been created
> on the second scan of the bridge. If so, it returns directly instead
> of creating a new one.
>
> Signed-off-by: Rui He <rui.he@windriver.com>
> ---
>  drivers/pci/probe.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index f41128f91ca76..ec67adbf31738 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1444,6 +1444,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
>  			goto out;
>  		}
>

The bridge should already have been marked broken=1 and bailed out
earlier. With bridge forwarding disabled, it would not get here, and
there would be no further configuration. What is your kernel version?

Thanks,
Ethan

> +		if(pci_has_subordinate(dev))
> +			goto out;
> +
>  		/* Clear errors */
>  		pci_write_config_word(dev, PCI_STATUS, 0xffff);
>

* RE: [PATCH 1/1] pci: Add subordinate check before pci_add_new_bus()
From: He, Rui @ 2025-08-25  8:47 UTC
To: Ethan Zhao, Bjorn Helgaas
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org, Chikhalkar, Prashant, Xiao, Jiguang

> -----Original Message-----
> From: Ethan Zhao <etzhao1900@gmail.com>
> Sent: August 17, 2025 10:46
>
> On 8/14/2025 5:39 PM, Rui He wrote:
> > For a preconfigured PCI bridge, the child bus is created on the first
> > scan. If, for some reason (e.g. register mutation), the secondary and
> > subordinate registers are reset to 0 on the second scan, a PCI bus is
> > created twice for the same PCI device.
> >
> > Following is the related log:
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0d]
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:01.0: PCI bridge to [bus 0e-10]
> > [Wed May 28 20:38:36 CST 2025] pci 0000:0b:05.0: PCI bridge to [bus 0f-10]
> Could you attach 'lspci -t' output showing the topology?
> Bridges 0000:0b:01.0 and 0000:0b:05.0 have the same subordinate bus
> number. That is odd; it seems they are not connected as upstream and
> downstream, but are siblings.

Following is the related lspci output.

# lspci -tv
......
 \-[0000:00]-+-00.0  Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 DMI2
             +-03.1-[08-73]----00.0-[09-73]--+-00.0  Microsemi / PMC / IDT PES24NT24G2 PCI Express Switch
             |                               +-02.0-[0a-10]----00.0-[0b-10]--+-01.0-[0d]----00.0  Device xxxx
             |                               |                               +-04.0  PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch
             |                               |                               +-05.0-[00]--
             |                               |                               +-07.0-[00]--
             |                               |                               \-09.0-[00]--
             |                               +-03.0-[11-17]----00.0-[12-17]--+-01.0-[14]----00.0  Device xxxx
             |                               |                               +-04.0  PLX Technology, Inc. PEX 8606 6 Lane, 6 Port PCI Express Gen 2 (5.0 GT/s) Switch
             |                               |                               +-05.0-[00]--
             |                               |                               +-07.0-[00]--
             |                               |                               \-09.0-[00]--
......

Yes, you are right. 0000:0b:01.0 and 0000:0b:05.0 are siblings. I
included 0000:0b:05.0 in the log to show that [bus 0d] is created
during the first scan, while [bus 0e-10] is created during the second
scan.

Here, 0000:0b:01.0 is pre-assigned to bus 0d; 0000:0b:05.0,
0000:0b:07.0, and 0000:0b:09.0 are not configured.

> Does the device behind the bridge 0000:0b:05.0 work after the second scan
> (are TLPs forwarded)?
>
> > Here PCI device 0000:0b:01.0 is assigned to both bus 0d and bus 0e.
> >
> > This patch checks whether the child PCI bus has already been created
> > on the second scan of the bridge. If so, it returns directly instead
> > of creating a new one.
> >
> > Signed-off-by: Rui He <rui.he@windriver.com>
> > ---
> >  drivers/pci/probe.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index f41128f91ca76..ec67adbf31738 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1444,6 +1444,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
> >  			goto out;
> >  		}
> >
> The bridge should already have been marked broken=1 and bailed out
> earlier. With bridge forwarding disabled, it would not get here, and
> there would be no further configuration. What is your kernel version?
>
> Thanks,
> Ethan

My kernel version is v5.2.60
(https://git.yoctoproject.org/linux-yocto/tree/Makefile?h=v5.2/standard/base).

The bridge can only be marked broken=1 on the first scan, but this
error happens on the second scan. There pass=1, so (!pass) is always
false and broken cannot be set to 1.

[bus 0e-10] was created for 0000:0b:01.0 on the second scan, which
means that the following if condition is false:

-> if ((secondary || subordinate) && !pcibios_assign_all_busses() &&
->     !is_cardbus && !broken) {

pcibios_assign_all_busses() is always false because "pci=assign-busses"
is not on the kernel command line.
is_cardbus is always false because 0000:0b:01.0 is a PCI bridge.
broken is always false on the second scan because pass=1.
The only remaining possibility is that (secondary || subordinate) is
false.

Thanks,
Rui

> > +		if(pci_has_subordinate(dev))
> > +			goto out;
> > +
> >  		/* Clear errors */
> >  		pci_write_config_word(dev, PCI_STATUS, 0xffff);
> >
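
The elimination argument in the message above can be checked with a small standalone model. This is a sketch only, not the kernel function: struct scan_state and keeps_existing_config() are hypothetical names, and the fields simply mirror the terms of the quoted if condition.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values tested by the quoted condition. */
struct scan_state {
	unsigned int secondary;		/* Secondary Bus Number read from the bridge */
	unsigned int subordinate;	/* Subordinate Bus Number read from the bridge */
	bool assign_all_busses;		/* models pcibios_assign_all_busses() */
	bool is_cardbus;		/* true only for CardBus bridges */
	bool broken;			/* only ever set while pass == 0 */
};

/*
 * Mirrors the quoted condition: when it is true the existing bus
 * numbers are kept; when it is false the bridge takes the path that
 * assigns a new child bus.
 */
static inline bool keeps_existing_config(const struct scan_state *s)
{
	return (s->secondary || s->subordinate) &&
	       !s->assign_all_busses && !s->is_cardbus && !s->broken;
}
```

With assign_all_busses, is_cardbus, and broken all false, as argued for the second pass, the function returns false only when both secondary and subordinate are 0, which is the case the message concludes must have occurred.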