* [RFC] PCI: Unassigned Expansion ROM BARs @ 2015-09-24 2:47 Myron Stowe 2015-09-24 3:21 ` Yinghai Lu ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Myron Stowe @ 2015-09-24 2:47 UTC (permalink / raw) To: linux-pci; +Cc: LKML, Bjorn Helgaas, Yinghai Lu, Alex Williamson I've encountered numerous bugzilla reports related to platform BIOS' not programming valid values into a PCI device's Type 0 Configuration space "Expansion ROM Base Address" field (a.k.a. Expansion ROM BAR). The main observed consequence being 'dmesg' entries like the following that get customers excited enough to file reports against the kernel. pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window pci 0000:04:03.0: can't claim BAR 6 [mem 0xffff0000-0xffffffff pref]: no compatible bridge window After I've provided an analysis similar to [1] the respective BIOS response (teams from two of the major vendors) is typically: "The OS has no business touching the Expansion ROM BARs and it provides no value to the equation here. The Expansion ROM BAR is only useful in pre-boot for the BIOS to get boot code from a device." This scenario has occurred enough times now that I'd like to attempt to "raise the bar" and invite a technically merit based discussion concerning this topic - via a public forum that is archived and provides a source of reference for use upon future occurrences - and see if a consensus can be reached between the various vendor's BIOS engineers and kernel engineers. A little more background context - The kernel expects device Expansion ROM BARs to be programmed with valid values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the device’s expansion ROM address space is disabled). This seems to be the main contention point with said BIOS engineers. If an Expansion ROM BAR is not programmed, the kernel will attempt to find available resources and, if successful, program it. As this occurs various 'dmesg' entries related to kernel's actions are output. Note that for devices that share decoders between the Expansion ROM BAR and other BARs the firmware (probably) should not enable the Expansion ROM BAR at hand-off to the operating system (see the last paragraph of the PCI Firmware Specification, Rev 3.2, Section 3.5 "Device State at Firmware/Operating System Handoff"). There is a kernel boot parameter, pci=norom, that is intended to disable the kernel's resource assignment actions for Expansion ROMs that do not already have BIOS assigned address ranges. Note however, if I remember correctly, that this only works if the Expansion ROM BAR is set to "0" by the BIOS before hand-off. I've opened https://bugzilla.kernel.org/show_bug.cgi?id=104931 and attached the full 'dmesg' that exhibits a typical occurrence as an example. I'd like to use the bugzilla to archive any discussion that takes place. I'll copy all relevant discussion that takes place here into the bugzilla as "Additional Comments". Please continue with this thread, adding your views in these regards. Citing's from pertinent specifications that back up your position would be appreciated. Thanks, Myron [1] Annotated 'dmesg' log concerning Expansion ROM BARs not setup by BIOS The "can't claim" messages of interest are: pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window pci 0000:04:03.0: can't claim BAR 6 [mem 0xffff0000-0xffffffff pref]: no compatible bridge window The PCI devices of interest are a device at PCI Bus 1, Device 0, Function 0 (01:00.0) and another device at PCI Bus 4, Device 3, Function 0 (04:03.0). The "root bridge" that leads to PCI buses 1 and 4 - the buses of interest - is "PCI0" and its I/O Port space and Memory Mapped I/O (MMIO) space are: ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-fe]) PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [bus 00-fe] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfeafffff] It's helpful to gather up all the resource related information pertaining to the devices of interest in one place. Concentrating on the PCI-to-PCI bridges and individual PCI devices that lead to 01:00.0, the first device exhibiting the "can't claim" message (everything that is consuming resources on PCI bus 0 and PCI bus 1): pci 0000:00:1a.0: [8086:1c2d] type 00 class 0x0c0320 pci 0000:00:1a.0: reg 0x10: [mem 0xc1305000-0xc13053ff] pci 0000:00:1d.0: [8086:1c26] type 00 class 0x0c0320 pci 0000:00:1d.0: reg 0x10: [mem 0xc1304000-0xc13043ff] pci 0000:00:1f.2: [8086:1c00] type 00 class 0x01018f pci 0000:00:1f.2: reg 0x10: [io 0x3078-0x307f] pci 0000:00:1f.2: reg 0x14: [io 0x308c-0x308f] pci 0000:00:1f.2: reg 0x18: [io 0x3070-0x3077] pci 0000:00:1f.2: reg 0x1c: [io 0x3088-0x308b] pci 0000:00:1f.2: reg 0x20: [io 0x3050-0x305f] pci 0000:00:1f.2: reg 0x24: [io 0x3040-0x304f] pci 0000:00:1f.3: [8086:1c22] type 00 class 0x0c0500 pci 0000:00:1f.3: reg 0x10: [mem 0xc1302000-0xc13020ff 64bit] pci 0000:00:1f.3: reg 0x20: [io 0x3000-0x301f] pci 0000:00:1f.5: [8086:1c08] type 00 class 0x010185 pci 0000:00:1f.5: reg 0x10: [io 0x3068-0x306f] pci 0000:00:1f.5: reg 0x14: [io 0x3084-0x3087] pci 0000:00:1f.5: reg 0x18: [io 0x3060-0x3067] pci 0000:00:1f.5: reg 0x1c: [io 0x3080-0x3083] pci 0000:00:1f.5: reg 0x20: [io 0x3030-0x303f] pci 0000:00:1f.5: reg 0x24: [io 0x3020-0x302f] pci 0000:00:01.0: PCI bridge to [bus 01] pci 0000:00:01.0: bridge window [io 0x2000-0x2fff] pci 0000:00:01.0: bridge window [mem 0xc1200000-0xc12fffff] pci 0000:01:00.0: [1000:0072] type 00 class 0x010700 [1000:0072] - LSI (Symbios) Logic : SAS2008 PCIe Fusion-MPT SAS-2 pci 0000:01:00.0: reg 0x10: [io 0x2000-0x20ff] pci 0000:01:00.0: reg 0x14: [mem 0xc1240000-0xc124ffff 64bit] pci 0000:01:00.0: reg 0x1c: [mem 0xc1200000-0xc123ffff 64bit] x pci 0000:01:00.0: reg 0x30: [mem 0xfff00000-0xffffffff pref] The PCI-to-PCI bridge device for Bus 0 to Bus 1 only has one memory space apeture ("bridge window") active - [mem 0xc1200000-0xc12fffff]. This must be a subset of one of the "root bus" memory resources and looking at those, it is. The target device - 01:00.0; an 'mpt2sas' device - consumes three memory ranges. These correspond to the device's BAR 1 and 2 (64 bit addresses consume two BAR registers), BAR 3 and 4, and the "Expansion ROM Base Address (a.k.a. BAR 6). These also must be a subset of both the corresponding root bus resources and all PCI-to-PCI bridge devices in the PCI hiearchy leading to the device itself. Looking at them we see that the first two satisfy the requirement but the third - [mem 0xfff00000-0xffffffff pref] - does not! It's because the "Expansion ROM Base Address" register (a.k.a. BAR 6) does not adhear to the subset requirement(s) that the kernel later outputs: pci 0000:01:00.0: can't claim BAR 6 [mem 0xfff00000-0xffffffff pref]: no compatible bridge window pci 0000:01:00.0: BAR 6: no space for [mem size 0x00100000 pref] pci 0000:01:00.0: BAR 6: failed to assign [mem size 0x00100000 pref] The "can't claim" message is the kernel alerting us that the BIOS has not correctly set up resources that fulfill all the requirements (subset, alignment, type, ...). The kernel then "sizes" the BAR to see how much address space that BAR requires - in this case we see BAR 6 of 01:00.0 needs 1 MB of contiguous space - and subsequently tries to work around the BIOS' failure, attempting to find available, currently unused, resource space that meets all the requirements which is where the "no space for" message comes from. There is no contiguous space that meets all the requirements (subset, alignment, type, ...) available which is fairly easy to see here; there was only a 1 MB memory aperture provided by the PCI-to-PCI bridge device to begin with and the 01:00.0 device consumed subsets of that for BARs 1 and 3 so there is no way 1 MB remains free to satisfy BAR 6's needs. And so the kernel outputs the "failed to assign" message. In a very similar scenario, the 04:03.0 device also has not been properly set up by BIOS ([mem 0xffff0000-0xffffffff pref]). The difference in this case is that there were enough available resources left to satisfy all the subset, alignment, type, ..., requirements and thus the kernel was able to allocate from such and re-program the device's BAR 6 ([mem 0xc1010000-0xc101ffff pref]) so that the device can function correctly. pci 0000:00:1e.0: PCI bridge to [bus 04] (subtractive decode) pci 0000:00:1e.0: bridge window [mem 0xc0800000-0xc10fffff] pci 0000:00:1e.0: bridge window [mem 0xc0000000-0xc07fffff 64bit pref] pci 0000:00:1e.0: bridge window [io 0x0000-0x0cf7] (subtractive decode) pci 0000:00:1e.0: bridge window [io 0x0d00-0xffff] (subtractive decode) pci 0000:00:1e.0: bridge window [mem 0x000a0000-0x000bffff] (subtract d) pci 0000:00:1e.0: bridge window [mem 0xc0000000-0xfeafffff] (subtract d) pci 0000:04:03.0: [102b:0532] type 00 class 0x030000 [102b:0532] - Matrox : MGS G200eW WPCM450 (Graphics) pci 0000:04:03.0: reg 0x10: [mem 0xc0000000-0xc07fffff pref] pci 0000:04:03.0: reg 0x14: [mem 0xc1000000-0xc1003fff] pci 0000:04:03.0: reg 0x18: [mem 0xc0800000-0xc0ffffff] x pci 0000:04:03.0: reg 0x30: [mem 0xffff0000-0xffffffff pref] pci 0000:04:03.0: can't claim BAR 6 [mem 0xffff0000-0xffffffff pref]: no compatible bridge window pci 0000:04:03.0: BAR 6: assigned [mem 0xc1010000-0xc101ffff pref] ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-24 2:47 [RFC] PCI: Unassigned Expansion ROM BARs Myron Stowe @ 2015-09-24 3:21 ` Yinghai Lu 2015-09-24 3:58 ` Alex Williamson 2015-09-24 17:06 ` Myron Stowe 2015-09-27 19:29 ` Myron Stowe 2015-09-27 20:19 ` Myron Stowe 2 siblings, 2 replies; 14+ messages in thread From: Yinghai Lu @ 2015-09-24 3:21 UTC (permalink / raw) To: Myron Stowe; +Cc: linux-pci, LKML, Bjorn Helgaas, Alex Williamson On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe <myron.stowe@gmail.com> wrote: > > The kernel expects device Expansion ROM BARs to be programmed with valid > values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the > device’s expansion ROM address space is disabled). This seems to be the > main contention point with said BIOS engineers. If an Expansion ROM BAR is > not programmed, the kernel will attempt to find available resources and, if > successful, program it. As this occurs various 'dmesg' entries > related to kernel's actions are output. ... > There is a kernel boot parameter, pci=norom, that is intended to disable the > kernel's resource assignment actions for Expansion ROMs that do not already > have BIOS assigned address ranges. Note however, if I remember correctly, > that this only works if the Expansion ROM BAR is set to "0" by the BIOS > before hand-off. option rom is used by legacy bios to enable booting from external device. usually BIOS call the option rom, so the firmware will be loaded to add on cards. and firmware get started. Also option rom would include tools that is used to configure behavior of cards like add/remove raid. Also there is some use case that kernel driver try to get some parameters from BIOS. like intel soft raid ? --- bad practice ! I would like to treat option rom BAR as optional resources during resource allocation. https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261 Subject: PCI: Treat ROM resource as optional during realloc Current on realloc path, we just ignore ROM resource if we can not assign them in first try. Treat ROM resources as optional resources,so try to allocate them together with required ones, if can not assign them, could go with other required resources only, and try to allocate them second time in expand path. Thanks Yinghai ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-24 3:21 ` Yinghai Lu @ 2015-09-24 3:58 ` Alex Williamson 2015-09-24 17:06 ` Myron Stowe 1 sibling, 0 replies; 14+ messages in thread From: Alex Williamson @ 2015-09-24 3:58 UTC (permalink / raw) To: Yinghai Lu; +Cc: Myron Stowe, linux-pci, LKML, Bjorn Helgaas On Wed, 2015-09-23 at 20:21 -0700, Yinghai Lu wrote: > On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe <myron.stowe@gmail.com> wrote: > > > > The kernel expects device Expansion ROM BARs to be programmed with valid > > values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the > > device’s expansion ROM address space is disabled). This seems to be the > > main contention point with said BIOS engineers. If an Expansion ROM BAR is > > not programmed, the kernel will attempt to find available resources and, if > > successful, program it. As this occurs various 'dmesg' entries > > related to kernel's actions are output. > ... > > There is a kernel boot parameter, pci=norom, that is intended to disable the > > kernel's resource assignment actions for Expansion ROMs that do not already > > have BIOS assigned address ranges. Note however, if I remember correctly, > > that this only works if the Expansion ROM BAR is set to "0" by the BIOS > > before hand-off. > > option rom is used by legacy bios to enable booting from external device. > usually BIOS call the option rom, so the firmware will be loaded to > add on cards. > and firmware get started. > Also option rom would include tools that is used to configure behavior of cards > like add/remove raid. > > Also there is some use case that kernel driver try to get some parameters from > BIOS. like intel soft raid ? --- bad practice ! > > I would like to treat option rom BAR as optional resources during > resource allocation. > > https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261 > > Subject: PCI: Treat ROM resource as optional during realloc > > Current on realloc path, we just ignore ROM resource if we can not assign > them in first try. > > Treat ROM resources as optional resources,so try to allocate them together > with required ones, if can not assign them, could go with other required > resources only, and try to allocate them second time in expand path. Don't forget that the physical system boot may not be the only "boot" of the PCI device. We can assign a PCI device to a VM running on top of the bare-metal OS, at which point the option ROM of the device may be re-executed and the device re-initialized by the VM BIOS. A BIOS engineer that argues that the option ROM is unnecessary after bare-metal BIOS boot is completely disregarding this use case. We do have ways to make this be a soft requirement, we can pass the option ROM as a file to the VM, but we need to be able to rip the option ROM from the device in order to do that, likely from a better behaved platform wrt option ROM mapping. Thanks, Alex ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-24 3:21 ` Yinghai Lu 2015-09-24 3:58 ` Alex Williamson @ 2015-09-24 17:06 ` Myron Stowe 2015-09-24 19:01 ` Yinghai Lu 1 sibling, 1 reply; 14+ messages in thread From: Myron Stowe @ 2015-09-24 17:06 UTC (permalink / raw) To: Yinghai Lu; +Cc: linux-pci, LKML, Bjorn Helgaas, Alex Williamson On Wed, Sep 23, 2015 at 9:21 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe <myron.stowe@gmail.com> wrote: >> >> The kernel expects device Expansion ROM BARs to be programmed with valid >> values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the >> device’s expansion ROM address space is disabled). This seems to be the >> main contention point with said BIOS engineers. If an Expansion ROM BAR is >> not programmed, the kernel will attempt to find available resources and, if >> successful, program it. As this occurs various 'dmesg' entries >> related to kernel's actions are output. > ... >> There is a kernel boot parameter, pci=norom, that is intended to disable the >> kernel's resource assignment actions for Expansion ROMs that do not already >> have BIOS assigned address ranges. Note however, if I remember correctly, >> that this only works if the Expansion ROM BAR is set to "0" by the BIOS >> before hand-off. > > option rom is used by legacy bios to enable booting from external device. > usually BIOS call the option rom, so the firmware will be loaded to > add on cards. > and firmware get started. > Also option rom would include tools that is used to configure behavior of cards > like add/remove raid. I'm not sure what you are getting at here but yes, there are use cases where the BIOS needs to access the Expansion ROM, one of the more common being PXE booting from a NIC device where the PXE boot content is retrieved from the ROM, but that has little, if anything, to do with what I'm after here. The BIOS engineers are expressing that the kernel should *never* need to access the Expansion ROM, and thus should *never* try to allocate resources for these BARs and program them to sane address range values. I know you work with bringing up new hardware. So picture yourself sitting with some members from your platform's BIOS team. They tell you: "The OS is incorrect in thinking it needs to find, and program, sane resource range values into a device's Expansion ROM BAR. We (the BIOS) hand-off the platform with these disabled, thus whatever values are in the ROMs BAR should be totally ignored, and the OS should never touch them." What would you reply with to them in an attempt to show that your position (i.e. the kernel finding, and programming values under these circumstances) is correct and that the BIOS opinion is in-correct? That is what I'm after. > > Also there is some use case that kernel driver try to get some parameters from > BIOS. like intel soft raid ? --- bad practice ! Again, your replies are so terse I have no idea what you are saying; it's undecipherable! Are you indicating that you agree with the BIOS engineers views? > > I would like to treat option rom BAR as optional resources during > resource allocation. > > https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261 > > Subject: PCI: Treat ROM resource as optional during realloc > > Current on realloc path, we just ignore ROM resource if we can not assign > them in first try. > > Treat ROM resources as optional resources,so try to allocate them together > with required ones, if can not assign them, could go with other required > resources only, and try to allocate them second time in expand path. Yes, while they may have lower priority in obtaining resources, your still attempting to do so initially. The BIOS engineers seem to believe that this is incorrect - the OS should not even attempt to allocate them in the first try. So, which side are you on and can you support your view with some technical based argument (and any references from the specifications)? Please take some time and respond with some thought out explanations and opinions. I value your opinion because I have seen your work but your terse replies are going to do nothing what so ever in trying to convince BIOS engineers that the OS should, or needs to, access such. Otherwise: "Why are we (the kernel) allocating resources for them?" Myron > > Thanks > > Yinghai ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-24 17:06 ` Myron Stowe @ 2015-09-24 19:01 ` Yinghai Lu 2015-09-25 4:35 ` Yinghai Lu 0 siblings, 1 reply; 14+ messages in thread From: Yinghai Lu @ 2015-09-24 19:01 UTC (permalink / raw) To: Myron Stowe; +Cc: linux-pci, LKML, Bjorn Helgaas, Alex Williamson On Thu, Sep 24, 2015 at 10:06 AM, Myron Stowe <myron.stowe@gmail.com> wrote: > On Wed, Sep 23, 2015 at 9:21 PM, Yinghai Lu <yinghai@kernel.org> wrote: >> On Wed, Sep 23, 2015 at 7:47 PM, Myron Stowe <myron.stowe@gmail.com> wrote: >>> >>> The kernel expects device Expansion ROM BARs to be programmed with valid >>> values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the >>> device’s expansion ROM address space is disabled). This seems to be the >>> main contention point with said BIOS engineers. If an Expansion ROM BAR is >>> not programmed, the kernel will attempt to find available resources and, if >>> successful, program it. As this occurs various 'dmesg' entries >>> related to kernel's actions are output. >> ... >>> There is a kernel boot parameter, pci=norom, that is intended to disable the >>> kernel's resource assignment actions for Expansion ROMs that do not already >>> have BIOS assigned address ranges. Note however, if I remember correctly, >>> that this only works if the Expansion ROM BAR is set to "0" by the BIOS >>> before hand-off. >> >> option rom is used by legacy bios to enable booting from external device. >> usually BIOS call the option rom, so the firmware will be loaded to >> add on cards. >> and firmware get started. >> Also option rom would include tools that is used to configure behavior of cards >> like add/remove raid. > > I'm not sure what you are getting at here but yes, there are use cases > where the BIOS > needs to access the Expansion ROM, one of the more common being PXE booting from > a NIC device where the PXE boot content is retrieved from the ROM, but > that has little, > if anything, to do with what I'm after here. > > The BIOS engineers are expressing that the kernel should *never* need > to access the Expansion > ROM, and thus should *never* try to allocate resources for these BARs > and program them > to sane address range values. They are wrong. It still depends on how addon card firmware and drivers from OS to use it. > > > I know you work with bringing up new hardware. So picture yourself > sitting with some > members from your platform's BIOS team. They tell you: "The OS is > incorrect in thinking > it needs to find, and program, sane resource range values into a > device's Expansion ROM > BAR. We (the BIOS) hand-off the platform with these disabled, thus > whatever values are in > the ROMs BAR should be totally ignored, and the OS should never touch > them." What would you > reply with to them in an attempt to show that your position (i.e. the > kernel finding, and programming > values under these circumstances) is correct and that the BIOS opinion > is in-correct? That is > what I'm after. Some addon cards drivers are use ROM bar to talk to card. like Intel DC21285 driver in drivers/mtd/maps/pci.c > >> >> Also there is some use case that kernel driver try to get some parameters from >> BIOS. like intel soft raid ? --- bad practice ! > > Again, your replies are so terse I have no idea what you are saying; it's > undecipherable! Are you indicating that you agree with the BIOS > engineers views? No. Just some addon card/drivers would avoid accessing expand rom to get parameter or settings. > >> >> I would like to treat option rom BAR as optional resources during >> resource allocation. >> >> https://git.kernel.org/cgit/linux/kernel/git/yinghai/linux-yinghai.git/patch/?id=7f689da33302e4871fd18aee2c19abb5e3ea5261 >> >> Subject: PCI: Treat ROM resource as optional during realloc >> >> Current on realloc path, we just ignore ROM resource if we can not assign >> them in first try. >> >> Treat ROM resources as optional resources,so try to allocate them together >> with required ones, if can not assign them, could go with other required >> resources only, and try to allocate them second time in expand path. > > Yes, while they may have lower priority in obtaining resources, your still > attempting to do so initially. The BIOS engineers seem to believe that this is > incorrect - the OS should not even attempt to allocate them in the first try. Some drivers may use it, and we don't know who is really need them. so just give it a try. Or do we want to keep a white list to say which device should have ROM bar as mush have, and other is optional to have ? Thanks Yinghai ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-24 19:01 ` Yinghai Lu @ 2015-09-25 4:35 ` Yinghai Lu 2015-09-25 13:31 ` Myron Stowe 2015-09-25 14:35 ` Bjorn Helgaas 0 siblings, 2 replies; 14+ messages in thread From: Yinghai Lu @ 2015-09-25 4:35 UTC (permalink / raw) To: Myron Stowe; +Cc: linux-pci, LKML, Bjorn Helgaas, Alex Williamson On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu <yinghai@kernel.org> wrote: > Or do we want to keep a white list to say which device should have > ROM bar as mush have, and other is optional to have ? Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar() Only set that for 1. if BIOS/firmware already set ROM bar. 2. via quirks for some devices. We assign those needed ROM bar as required and other ROM bars as optional resources. Signed-off-by: Yinghai Lu <yinghai@kernel.org> --- arch/x86/pci/i386.c | 9 +++++- drivers/dma/pch_dma.c | 1 drivers/gpio/gpio-ml-ioh.c | 2 - drivers/misc/pch_phub.c | 12 -------- drivers/pci/probe.c | 7 +++++ drivers/pci/quirks.c | 63 +++++++++++++++++++++++++++++++++++++++++++++ drivers/pci/setup-bus.c | 18 ++++++++++-- include/linux/pci.h | 13 +++++++++ include/linux/pci_ids.h | 7 +++++ 9 files changed, 112 insertions(+), 20 deletions(-) Index: linux-2.6/arch/x86/pci/i386.c =================================================================== --- linux-2.6.orig/arch/x86/pci/i386.c +++ linux-2.6/arch/x86/pci/i386.c @@ -377,11 +377,16 @@ static void pcibios_allocate_rom_resourc } } +bool pci_assign_roms(void) +{ + return !!(pci_probe & PCI_ASSIGN_ROMS); +} + static int __init pcibios_assign_resources(void) { struct pci_host_bridge *host_bridge = NULL; - if (!(pci_probe & PCI_ASSIGN_ROMS)) + if (!pci_assign_roms()) for_each_pci_host_bridge(host_bridge) pcibios_allocate_rom_resources(host_bridge->bus); @@ -406,7 +411,7 @@ void pcibios_resource_survey_bus(struct pcibios_allocate_resources(bus, 0); pcibios_allocate_resources(bus, 1); - if (!(pci_probe & PCI_ASSIGN_ROMS)) + if (!pci_assign_roms()) pcibios_allocate_rom_resources(bus); } Index: linux-2.6/drivers/pci/probe.c =================================================================== --- linux-2.6.orig/drivers/pci/probe.c +++ linux-2.6/drivers/pci/probe.c @@ -224,6 +224,13 @@ int __pci_read_base(struct pci_dev *dev, l64 = l & PCI_ROM_ADDRESS_MASK; sz64 = sz & PCI_ROM_ADDRESS_MASK; mask64 = (u32)PCI_ROM_ADDRESS_MASK; + /* simple validation */ + if (l64 && sz64 && + (l64 & 0xff000000) != 0xff000000 && + system_state == SYSTEM_BOOTING) { + dev_printk(KERN_DEBUG, &dev->dev, "set dev_flags NEED_ROM_BAR\n"); + pci_dev_set_need_rom_bar(dev); + } } if (res->flags & IORESOURCE_MEM_64) { Index: linux-2.6/drivers/pci/quirks.c =================================================================== --- linux-2.6.orig/drivers/pci/quirks.c +++ linux-2.6/drivers/pci/quirks.c @@ -4197,3 +4197,66 @@ static void quirk_intel_qat_vf_cap(struc } } DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap); + +/* from drivers/mtd/maps/pci.c */ +static void quirk_set_need_rom_bar(struct pci_dev *pdev) +{ + if (!pci_dev_need_rom_bar(pdev)) { + dev_printk(KERN_DEBUG, &pdev->dev, "set dev_flags NEED_ROM_BAR\n"); + pci_dev_set_need_rom_bar(pdev); + } +} +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285, + quirk_set_need_rom_bar); + +#ifdef CONFIG_PARISC +/* from drivers/video/console/sticore.c */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_EG, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX6, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX4, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX2, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FXE, + quirk_set_need_rom_bar); +#endif + +/* from drivers/misc/pch_phub.c */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PCH1_PHUB, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7213_PHUB, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_mPHUB, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_nPHUB, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7831_PHUB, + quirk_set_need_rom_bar); + +/* from drivers/net/ethernet/sun/sungem.c */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_GEM, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_RIO_GEM, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMAC, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMACP, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMAC2, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_K2_GMAC, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_SH_SUNGEM, + quirk_set_need_rom_bar); +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_IPID2_GMAC, + quirk_set_need_rom_bar); + +/* from drivers/net/ethernet/sun/sunhme.c */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_HAPPYMEAL, + quirk_set_need_rom_bar); + +/* from drm and fbdev */ +DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, + 8, quirk_set_need_rom_bar); Index: linux-2.6/drivers/pci/setup-bus.c =================================================================== --- linux-2.6.orig/drivers/pci/setup-bus.c +++ linux-2.6/drivers/pci/setup-bus.c @@ -226,6 +226,10 @@ static void pdev_assign_resources_prepar if (resource_disabled(r) || r->parent) continue; + if (i == PCI_ROM_RESOURCE && + !pci_assign_roms() && !pci_dev_need_rom_bar(dev)) + continue; + if ((r->flags & IORESOURCE_IO) && !pci_find_host_bridge(dev->bus)->has_ioport) continue; @@ -1461,11 +1465,16 @@ out: return good_size; } -static inline bool is_optional(int i) +bool __weak pci_assign_roms(void) +{ + return false; +} + +static inline bool is_optional(struct pci_dev *dev, int i) { if (i == PCI_ROM_RESOURCE) - return true; + return !pci_assign_roms() && !pci_dev_need_rom_bar(dev); #ifdef CONFIG_PCI_IOV if (i >= PCI_IOV_RESOURCES && i <= PCI_IOV_RESOURCE_END) @@ -1538,7 +1547,7 @@ static int pbus_size_mem(struct pci_bus r_size = resource_size(r); align = pci_resource_alignment(dev, r); /* put SRIOV/ROM res to realloc list */ - if (realloc_head && is_optional(i)) { + if (realloc_head && is_optional(dev, i)) { add_to_align_test_list(&align_test_add_list, align, r_size, &dev->dev, i); r->end = r->start - 1; @@ -1549,6 +1558,9 @@ static int pbus_size_mem(struct pci_bus max_add_align = align; continue; } + if (!realloc_head && i == PCI_ROM_RESOURCE && + !pci_assign_roms() && !pci_dev_need_rom_bar(dev)) + continue; if (align > (1ULL<<37)) { /*128 Gb*/ dev_warn(&dev->dev, "disabling BAR %d: %pR (bad alignment %#llx)\n", Index: linux-2.6/include/linux/pci.h =================================================================== --- linux-2.6.orig/include/linux/pci.h +++ linux-2.6/include/linux/pci.h @@ -182,6 +182,8 @@ enum pci_dev_flags { PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7), /* Get VPD from function 0 VPD */ PCI_DEV_FLAGS_VPD_REF_F0 = (__force pci_dev_flags_t) (1 << 8), + /* Need to have ROM BAR */ + PCI_DEV_FLAGS_NEED_ROM_BAR = (__force pci_dev_flags_t) (1 << 9), }; enum pci_irq_reroute_variant { @@ -1965,6 +1967,14 @@ static inline bool pci_is_dev_assigned(s { return (pdev->dev_flags & PCI_DEV_FLAGS_ASSIGNED) == PCI_DEV_FLAGS_ASSIGNED; } +static inline void pci_dev_set_need_rom_bar(struct pci_dev *pdev) +{ + pdev->dev_flags |= PCI_DEV_FLAGS_NEED_ROM_BAR; +} +static inline bool pci_dev_need_rom_bar(struct pci_dev *pdev) +{ + return !!(pdev->dev_flags & PCI_DEV_FLAGS_NEED_ROM_BAR); +} /** * pci_ari_enabled - query ARI forwarding status @@ -1976,4 +1986,7 @@ static inline bool pci_ari_enabled(struc { return bus->self && bus->self->ari_enabled; } + +bool pci_assign_roms(void); + #endif /* LINUX_PCI_H */ Index: linux-2.6/drivers/misc/pch_phub.c =================================================================== --- linux-2.6.orig/drivers/misc/pch_phub.c +++ linux-2.6/drivers/misc/pch_phub.c @@ -49,7 +49,6 @@ /* MAX number of INT_REDUCE_CONTROL registers */ #define MAX_NUM_INT_REDUCE_CONTROL_REG 128 -#define PCI_DEVICE_ID_PCH1_PHUB 0x8801 #define PCH_MINOR_NOS 1 #define CLKCFG_CAN_50MHZ 0x12000000 #define CLKCFG_CANCLK_MASK 0xFF000000 @@ -61,17 +60,6 @@ #define CLKCFG_PLL2VCO (8 << 9) #define CLKCFG_UARTCLKSEL (1 << 18) -/* Macros for ML7213 */ -#define PCI_VENDOR_ID_ROHM 0x10db -#define PCI_DEVICE_ID_ROHM_ML7213_PHUB 0x801A - -/* Macros for ML7223 */ -#define PCI_DEVICE_ID_ROHM_ML7223_mPHUB 0x8012 /* for Bus-m */ -#define PCI_DEVICE_ID_ROHM_ML7223_nPHUB 0x8002 /* for Bus-n */ - -/* Macros for ML7831 */ -#define PCI_DEVICE_ID_ROHM_ML7831_PHUB 0x8801 - /* SROM ACCESS Macro */ #define PCH_WORD_ADDR_MASK (~((1 << 2) - 1)) Index: linux-2.6/include/linux/pci_ids.h =================================================================== --- linux-2.6.orig/include/linux/pci_ids.h +++ linux-2.6/include/linux/pci_ids.h @@ -1122,6 +1122,12 @@ #define PCI_VENDOR_ID_TCONRAD 0x10da #define PCI_DEVICE_ID_TCONRAD_TOKENRING 0x0508 +#define PCI_VENDOR_ID_ROHM 0x10DB +#define PCI_DEVICE_ID_ROHM_ML7213_PHUB 0x801A +#define PCI_DEVICE_ID_ROHM_ML7223_mPHUB 0x8012 /* for Bus-m */ +#define PCI_DEVICE_ID_ROHM_ML7223_nPHUB 0x8002 /* for Bus-n */ +#define PCI_DEVICE_ID_ROHM_ML7831_PHUB 0x8801 + #define PCI_VENDOR_ID_NVIDIA 0x10de #define PCI_DEVICE_ID_NVIDIA_TNT 0x0020 #define PCI_DEVICE_ID_NVIDIA_TNT2 0x0028 @@ -2900,6 +2906,7 @@ #define PCI_DEVICE_ID_INTEL_82454NX 0x84cb #define PCI_DEVICE_ID_INTEL_84460GX 0x84ea #define PCI_DEVICE_ID_INTEL_IXP4XX 0x8500 +#define PCI_DEVICE_ID_PCH1_PHUB 0x8801 #define PCI_DEVICE_ID_INTEL_IXP2800 0x9004 #define PCI_DEVICE_ID_INTEL_S21152BB 0xb152 Index: linux-2.6/drivers/dma/pch_dma.c =================================================================== --- linux-2.6.orig/drivers/dma/pch_dma.c +++ linux-2.6/drivers/dma/pch_dma.c @@ -976,7 +976,6 @@ static void pch_dma_remove(struct pci_de } /* PCI Device ID of DMA device */ -#define PCI_VENDOR_ID_ROHM 0x10DB #define PCI_DEVICE_ID_EG20T_PCH_DMA_8CH 0x8810 #define PCI_DEVICE_ID_EG20T_PCH_DMA_4CH 0x8815 #define PCI_DEVICE_ID_ML7213_DMA1_8CH 0x8026 Index: linux-2.6/drivers/gpio/gpio-ml-ioh.c =================================================================== --- linux-2.6.orig/drivers/gpio/gpio-ml-ioh.c +++ linux-2.6/drivers/gpio/gpio-ml-ioh.c @@ -31,8 +31,6 @@ #define IOH_IRQ_BASE 0 -#define PCI_VENDOR_ID_ROHM 0x10DB - struct ioh_reg_comn { u32 ien; u32 istatus; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-25 4:35 ` Yinghai Lu @ 2015-09-25 13:31 ` Myron Stowe 2015-09-25 17:02 ` Myron Stowe 2015-09-25 14:35 ` Bjorn Helgaas 1 sibling, 1 reply; 14+ messages in thread From: Myron Stowe @ 2015-09-25 13:31 UTC (permalink / raw) To: Yinghai Lu; +Cc: linux-pci, LKML, Bjorn Helgaas, Alex Williamson On Thu, Sep 24, 2015 at 10:35 PM, Yinghai Lu <yinghai@kernel.org> wrote: > On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu <yinghai@kernel.org> wrote: > >> Or do we want to keep a white list to say which device should have >> ROM bar as mush have, and other is optional to have ? I suppose this idea is one possible outcome that could occur but I think we need to have a discussion first before we start making a lot of changes. We need to try and come to some consensus with BIOS engineers. I know that both sets have been alerted to this conversation so *if* we come up with some good arguments to support the kernel's current view/actions perhaps things will start to progress. In the prior thread you replied: "They are wrong. It still depends on how addon card firmware and drivers from OS to use it." So continuing my suggested thought experiment where you are sitting across the table from your platform's BIOS engineers having this discussion ... Do you *really* think your reply was helpful in any way? Do you *really* think you did anything what so ever to get the BIOS engineers to consider something they hadn't before. Do you *really* think your reply was technically based in any way? Do you have any specification references or such to back up such a strong claim? Come on here Yinghai - you are an intelligent person. Take 1/10th the time that you spent developing this patch and think, gather your thoughts, and then sit down calmly, have a beer or coffee or tea (which ever you prefer), relax, and take some time to reply in a logical, thoughtful manner here with enough expression that others can understand what you are getting at and hopefully even with some passion or logic to try to convince the BIOS engineers that the kernel's current behavior is correct. This is your area of expertise - so stand up and rise to the occasion here! Hacking out a patch before this thread has played out serves no purpose what so ever so I'm not even going to waste my time and look at it. It serves no purpose and will only make matters worse as there is already strong disagreement with the kernel's actions in these regards. > > Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar() > > Only set that for > 1. if BIOS/firmware already set ROM bar. > 2. via quirks for some devices. > > We assign those needed ROM bar as required > and other ROM bars as optional resources. > > Signed-off-by: Yinghai Lu <yinghai@kernel.org> > > --- > arch/x86/pci/i386.c | 9 +++++- > drivers/dma/pch_dma.c | 1 > drivers/gpio/gpio-ml-ioh.c | 2 - > drivers/misc/pch_phub.c | 12 -------- > drivers/pci/probe.c | 7 +++++ > drivers/pci/quirks.c | 63 +++++++++++++++++++++++++++++++++++++++++++++ > drivers/pci/setup-bus.c | 18 ++++++++++-- > include/linux/pci.h | 13 +++++++++ > include/linux/pci_ids.h | 7 +++++ > 9 files changed, 112 insertions(+), 20 deletions(-) > > Index: linux-2.6/arch/x86/pci/i386.c > =================================================================== > --- linux-2.6.orig/arch/x86/pci/i386.c > +++ linux-2.6/arch/x86/pci/i386.c > @@ -377,11 +377,16 @@ static void pcibios_allocate_rom_resourc > } > } > > +bool pci_assign_roms(void) > +{ > + return !!(pci_probe & PCI_ASSIGN_ROMS); > +} > + > static int __init pcibios_assign_resources(void) > { > struct pci_host_bridge *host_bridge = NULL; > > - if (!(pci_probe & PCI_ASSIGN_ROMS)) > + if (!pci_assign_roms()) > for_each_pci_host_bridge(host_bridge) > pcibios_allocate_rom_resources(host_bridge->bus); > > @@ -406,7 +411,7 @@ void pcibios_resource_survey_bus(struct > pcibios_allocate_resources(bus, 0); > pcibios_allocate_resources(bus, 1); > > - if (!(pci_probe & PCI_ASSIGN_ROMS)) > + if (!pci_assign_roms()) > pcibios_allocate_rom_resources(bus); > } > > Index: linux-2.6/drivers/pci/probe.c > =================================================================== > --- linux-2.6.orig/drivers/pci/probe.c > +++ linux-2.6/drivers/pci/probe.c > @@ -224,6 +224,13 @@ int __pci_read_base(struct pci_dev *dev, > l64 = l & PCI_ROM_ADDRESS_MASK; > sz64 = sz & PCI_ROM_ADDRESS_MASK; > mask64 = (u32)PCI_ROM_ADDRESS_MASK; > + /* simple validation */ > + if (l64 && sz64 && > + (l64 & 0xff000000) != 0xff000000 && > + system_state == SYSTEM_BOOTING) { > + dev_printk(KERN_DEBUG, &dev->dev, "set dev_flags NEED_ROM_BAR\n"); > + pci_dev_set_need_rom_bar(dev); > + } > } > > if (res->flags & IORESOURCE_MEM_64) { > Index: linux-2.6/drivers/pci/quirks.c > =================================================================== > --- linux-2.6.orig/drivers/pci/quirks.c > +++ linux-2.6/drivers/pci/quirks.c > @@ -4197,3 +4197,66 @@ static void quirk_intel_qat_vf_cap(struc > } > } > DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap); > + > +/* from drivers/mtd/maps/pci.c */ > +static void quirk_set_need_rom_bar(struct pci_dev *pdev) > +{ > + if (!pci_dev_need_rom_bar(pdev)) { > + dev_printk(KERN_DEBUG, &pdev->dev, "set dev_flags NEED_ROM_BAR\n"); > + pci_dev_set_need_rom_bar(pdev); > + } > +} > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285, > + quirk_set_need_rom_bar); > + > +#ifdef CONFIG_PARISC > +/* from drivers/video/console/sticore.c */ > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_EG, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX6, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX4, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX2, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FXE, > + quirk_set_need_rom_bar); > +#endif > + > +/* from drivers/misc/pch_phub.c */ > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PCH1_PHUB, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7213_PHUB, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_mPHUB, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_nPHUB, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7831_PHUB, > + quirk_set_need_rom_bar); > + > +/* from drivers/net/ethernet/sun/sungem.c */ > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_GEM, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_RIO_GEM, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMAC, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMACP, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMAC2, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_K2_GMAC, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_SH_SUNGEM, > + quirk_set_need_rom_bar); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_IPID2_GMAC, > + quirk_set_need_rom_bar); > + > +/* from drivers/net/ethernet/sun/sunhme.c */ > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_HAPPYMEAL, > + quirk_set_need_rom_bar); > + > +/* from drm and fbdev */ > +DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, > + 8, quirk_set_need_rom_bar); > Index: linux-2.6/drivers/pci/setup-bus.c > =================================================================== > --- linux-2.6.orig/drivers/pci/setup-bus.c > +++ linux-2.6/drivers/pci/setup-bus.c > @@ -226,6 +226,10 @@ static void pdev_assign_resources_prepar > if (resource_disabled(r) || r->parent) > continue; > > + if (i == PCI_ROM_RESOURCE && > + !pci_assign_roms() && !pci_dev_need_rom_bar(dev)) > + continue; > + > if ((r->flags & IORESOURCE_IO) && > !pci_find_host_bridge(dev->bus)->has_ioport) > continue; > @@ -1461,11 +1465,16 @@ out: > return good_size; > } > > -static inline bool is_optional(int i) > +bool __weak pci_assign_roms(void) > +{ > + return false; > +} > + > +static inline bool is_optional(struct pci_dev *dev, int i) > { > > if (i == PCI_ROM_RESOURCE) > - return true; > + return !pci_assign_roms() && !pci_dev_need_rom_bar(dev); > > #ifdef CONFIG_PCI_IOV > if (i >= PCI_IOV_RESOURCES && i <= PCI_IOV_RESOURCE_END) > @@ -1538,7 +1547,7 @@ static int pbus_size_mem(struct pci_bus > r_size = resource_size(r); > align = pci_resource_alignment(dev, r); > /* put SRIOV/ROM res to realloc list */ > - if (realloc_head && is_optional(i)) { > + if (realloc_head && is_optional(dev, i)) { > add_to_align_test_list(&align_test_add_list, > align, r_size, &dev->dev, i); > r->end = r->start - 1; > @@ -1549,6 +1558,9 @@ static int pbus_size_mem(struct pci_bus > max_add_align = align; > continue; > } > + if (!realloc_head && i == PCI_ROM_RESOURCE && > + !pci_assign_roms() && !pci_dev_need_rom_bar(dev)) > + continue; > > if (align > (1ULL<<37)) { /*128 Gb*/ > dev_warn(&dev->dev, "disabling BAR %d: %pR (bad > alignment %#llx)\n", > Index: linux-2.6/include/linux/pci.h > =================================================================== > --- linux-2.6.orig/include/linux/pci.h > +++ linux-2.6/include/linux/pci.h > @@ -182,6 +182,8 @@ enum pci_dev_flags { > PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7), > /* Get VPD from function 0 VPD */ > PCI_DEV_FLAGS_VPD_REF_F0 = (__force pci_dev_flags_t) (1 << 8), > + /* Need to have ROM BAR */ > + PCI_DEV_FLAGS_NEED_ROM_BAR = (__force pci_dev_flags_t) (1 << 9), > }; > > enum pci_irq_reroute_variant { > @@ -1965,6 +1967,14 @@ static inline bool pci_is_dev_assigned(s > { > return (pdev->dev_flags & PCI_DEV_FLAGS_ASSIGNED) == > PCI_DEV_FLAGS_ASSIGNED; > } > +static inline void pci_dev_set_need_rom_bar(struct pci_dev *pdev) > +{ > + pdev->dev_flags |= PCI_DEV_FLAGS_NEED_ROM_BAR; > +} > +static inline bool pci_dev_need_rom_bar(struct pci_dev *pdev) > +{ > + return !!(pdev->dev_flags & PCI_DEV_FLAGS_NEED_ROM_BAR); > +} > > /** > * pci_ari_enabled - query ARI forwarding status > @@ -1976,4 +1986,7 @@ static inline bool pci_ari_enabled(struc > { > return bus->self && bus->self->ari_enabled; > } > + > +bool pci_assign_roms(void); > + > #endif /* LINUX_PCI_H */ > Index: linux-2.6/drivers/misc/pch_phub.c > =================================================================== > --- linux-2.6.orig/drivers/misc/pch_phub.c > +++ linux-2.6/drivers/misc/pch_phub.c > @@ -49,7 +49,6 @@ > > /* MAX number of INT_REDUCE_CONTROL registers */ > #define MAX_NUM_INT_REDUCE_CONTROL_REG 128 > -#define PCI_DEVICE_ID_PCH1_PHUB 0x8801 > #define PCH_MINOR_NOS 1 > #define CLKCFG_CAN_50MHZ 0x12000000 > #define CLKCFG_CANCLK_MASK 0xFF000000 > @@ -61,17 +60,6 @@ > #define CLKCFG_PLL2VCO (8 << 9) > #define CLKCFG_UARTCLKSEL (1 << 18) > > -/* Macros for ML7213 */ > -#define PCI_VENDOR_ID_ROHM 0x10db > -#define PCI_DEVICE_ID_ROHM_ML7213_PHUB 0x801A > - > -/* Macros for ML7223 */ > -#define PCI_DEVICE_ID_ROHM_ML7223_mPHUB 0x8012 /* for Bus-m */ > -#define PCI_DEVICE_ID_ROHM_ML7223_nPHUB 0x8002 /* for Bus-n */ > - > -/* Macros for ML7831 */ > -#define PCI_DEVICE_ID_ROHM_ML7831_PHUB 0x8801 > - > /* SROM ACCESS Macro */ > #define PCH_WORD_ADDR_MASK (~((1 << 2) - 1)) > > Index: linux-2.6/include/linux/pci_ids.h > =================================================================== > --- linux-2.6.orig/include/linux/pci_ids.h > +++ linux-2.6/include/linux/pci_ids.h > @@ -1122,6 +1122,12 @@ > #define PCI_VENDOR_ID_TCONRAD 0x10da > #define PCI_DEVICE_ID_TCONRAD_TOKENRING 0x0508 > > +#define PCI_VENDOR_ID_ROHM 0x10DB > +#define PCI_DEVICE_ID_ROHM_ML7213_PHUB 0x801A > +#define PCI_DEVICE_ID_ROHM_ML7223_mPHUB 0x8012 /* for Bus-m */ > +#define PCI_DEVICE_ID_ROHM_ML7223_nPHUB 0x8002 /* for Bus-n */ > +#define PCI_DEVICE_ID_ROHM_ML7831_PHUB 0x8801 > + > #define PCI_VENDOR_ID_NVIDIA 0x10de > #define PCI_DEVICE_ID_NVIDIA_TNT 0x0020 > #define PCI_DEVICE_ID_NVIDIA_TNT2 0x0028 > @@ -2900,6 +2906,7 @@ > #define PCI_DEVICE_ID_INTEL_82454NX 0x84cb > #define PCI_DEVICE_ID_INTEL_84460GX 0x84ea > #define PCI_DEVICE_ID_INTEL_IXP4XX 0x8500 > +#define PCI_DEVICE_ID_PCH1_PHUB 0x8801 > #define PCI_DEVICE_ID_INTEL_IXP2800 0x9004 > #define PCI_DEVICE_ID_INTEL_S21152BB 0xb152 > > Index: linux-2.6/drivers/dma/pch_dma.c > =================================================================== > --- linux-2.6.orig/drivers/dma/pch_dma.c > +++ linux-2.6/drivers/dma/pch_dma.c > @@ -976,7 +976,6 @@ static void pch_dma_remove(struct pci_de > } > > /* PCI Device ID of DMA device */ > -#define PCI_VENDOR_ID_ROHM 0x10DB > #define PCI_DEVICE_ID_EG20T_PCH_DMA_8CH 0x8810 > #define PCI_DEVICE_ID_EG20T_PCH_DMA_4CH 0x8815 > #define PCI_DEVICE_ID_ML7213_DMA1_8CH 0x8026 > Index: linux-2.6/drivers/gpio/gpio-ml-ioh.c > =================================================================== > --- linux-2.6.orig/drivers/gpio/gpio-ml-ioh.c > +++ linux-2.6/drivers/gpio/gpio-ml-ioh.c > @@ -31,8 +31,6 @@ > > #define IOH_IRQ_BASE 0 > > -#define PCI_VENDOR_ID_ROHM 0x10DB > - > struct ioh_reg_comn { > u32 ien; > u32 istatus; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-25 13:31 ` Myron Stowe @ 2015-09-25 17:02 ` Myron Stowe 0 siblings, 0 replies; 14+ messages in thread From: Myron Stowe @ 2015-09-25 17:02 UTC (permalink / raw) To: Yinghai Lu; +Cc: linux-pci, LKML, Bjorn Helgaas, Alex Williamson On Fri, Sep 25, 2015 at 7:31 AM, Myron Stowe <myron.stowe@gmail.com> wrote: > On Thu, Sep 24, 2015 at 10:35 PM, Yinghai Lu <yinghai@kernel.org> wrote: >> On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu <yinghai@kernel.org> wrote: >> >>> Or do we want to keep a white list to say which device should have >>> ROM bar as mush have, and other is optional to have ? > > I suppose this idea is one possible outcome that could occur but I > think we need to have a discussion first before we start making a lot > of changes. We need to try and come to some consensus with BIOS > engineers. I know that both sets have been alerted to this > conversation so *if* we come up with some good arguments to support > the kernel's current view/actions perhaps things will start to > progress. > > In the prior thread you replied: > "They are wrong. > It still depends on how addon card firmware and drivers > from OS to use it." > > So continuing my suggested thought experiment where you are sitting > across the table from your platform's BIOS engineers having this > discussion ... Do you *really* think your reply was helpful in any > way? Do you *really* think you did anything what so ever to get the > BIOS engineers to consider something they hadn't before. Do you > *really* think your reply was technically based in any way? Do you > have any specification references or such to back up such a strong > claim? > > Come on here Yinghai - you are an intelligent person. Take 1/10th the > time that you spent developing this patch and think, gather your > thoughts, and then sit down calmly, have a beer or coffee or tea > (which ever you prefer), relax, and take some time to reply in a > logical, thoughtful manner here with enough expression that others can > understand what you are getting at and hopefully even with some > passion or logic to try to convince the BIOS engineers that the > kernel's current behavior is correct. This is your area of expertise > - so stand up and rise to the occasion here! > > Hacking out a patch before this thread has played out serves no > purpose what so ever so I'm not even going to waste my time and look > at it. It serves no purpose and will only make matters worse as there > is already strong disagreement with the kernel's actions in these > regards. Sigh, that was a terrible response on my part - I'm trying to encourage engagement in this discussion and yet my response likely has exactly the opposite result; shutting you down. Yinghai, I apologize, truly! Others have privately said that you may be uncomfortable with expressing your views due to language skills. If that's the case then please don't be intimidated and limit your contributions. I expect you know at least two languages which is 50% more than me so don't worry about grammar or such - that's not important and we could benefit from your experience and knowledge. English is my only language and I still too often find it difficult to express my opinions. Again, I'm sorry for my rash, harsh, response, Myron > >> >> Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar() >> >> Only set that for >> 1. if BIOS/firmware already set ROM bar. >> 2. via quirks for some devices. >> >> We assign those needed ROM bar as required >> and other ROM bars as optional resources. >> >> Signed-off-by: Yinghai Lu <yinghai@kernel.org> >> >> --- >> arch/x86/pci/i386.c | 9 +++++- >> drivers/dma/pch_dma.c | 1 >> drivers/gpio/gpio-ml-ioh.c | 2 - >> drivers/misc/pch_phub.c | 12 -------- >> drivers/pci/probe.c | 7 +++++ >> drivers/pci/quirks.c | 63 +++++++++++++++++++++++++++++++++++++++++++++ >> drivers/pci/setup-bus.c | 18 ++++++++++-- >> include/linux/pci.h | 13 +++++++++ >> include/linux/pci_ids.h | 7 +++++ >> 9 files changed, 112 insertions(+), 20 deletions(-) >> >> Index: linux-2.6/arch/x86/pci/i386.c >> =================================================================== >> --- linux-2.6.orig/arch/x86/pci/i386.c >> +++ linux-2.6/arch/x86/pci/i386.c >> @@ -377,11 +377,16 @@ static void pcibios_allocate_rom_resourc >> } >> } >> >> +bool pci_assign_roms(void) >> +{ >> + return !!(pci_probe & PCI_ASSIGN_ROMS); >> +} >> + >> static int __init pcibios_assign_resources(void) >> { >> struct pci_host_bridge *host_bridge = NULL; >> >> - if (!(pci_probe & PCI_ASSIGN_ROMS)) >> + if (!pci_assign_roms()) >> for_each_pci_host_bridge(host_bridge) >> pcibios_allocate_rom_resources(host_bridge->bus); >> >> @@ -406,7 +411,7 @@ void pcibios_resource_survey_bus(struct >> pcibios_allocate_resources(bus, 0); >> pcibios_allocate_resources(bus, 1); >> >> - if (!(pci_probe & PCI_ASSIGN_ROMS)) >> + if (!pci_assign_roms()) >> pcibios_allocate_rom_resources(bus); >> } >> >> Index: linux-2.6/drivers/pci/probe.c >> =================================================================== >> --- linux-2.6.orig/drivers/pci/probe.c >> +++ linux-2.6/drivers/pci/probe.c >> @@ -224,6 +224,13 @@ int __pci_read_base(struct pci_dev *dev, >> l64 = l & PCI_ROM_ADDRESS_MASK; >> sz64 = sz & PCI_ROM_ADDRESS_MASK; >> mask64 = (u32)PCI_ROM_ADDRESS_MASK; >> + /* simple validation */ >> + if (l64 && sz64 && >> + (l64 & 0xff000000) != 0xff000000 && >> + system_state == SYSTEM_BOOTING) { >> + dev_printk(KERN_DEBUG, &dev->dev, "set dev_flags NEED_ROM_BAR\n"); >> + pci_dev_set_need_rom_bar(dev); >> + } >> } >> >> if (res->flags & IORESOURCE_MEM_64) { >> Index: linux-2.6/drivers/pci/quirks.c >> =================================================================== >> --- linux-2.6.orig/drivers/pci/quirks.c >> +++ linux-2.6/drivers/pci/quirks.c >> @@ -4197,3 +4197,66 @@ static void quirk_intel_qat_vf_cap(struc >> } >> } >> DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x443, quirk_intel_qat_vf_cap); >> + >> +/* from drivers/mtd/maps/pci.c */ >> +static void quirk_set_need_rom_bar(struct pci_dev *pdev) >> +{ >> + if (!pci_dev_need_rom_bar(pdev)) { >> + dev_printk(KERN_DEBUG, &pdev->dev, "set dev_flags NEED_ROM_BAR\n"); >> + pci_dev_set_need_rom_bar(pdev); >> + } >> +} >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285, >> + quirk_set_need_rom_bar); >> + >> +#ifdef CONFIG_PARISC >> +/* from drivers/video/console/sticore.c */ >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_EG, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX6, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX4, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FX2, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_VISUALIZE_FXE, >> + quirk_set_need_rom_bar); >> +#endif >> + >> +/* from drivers/misc/pch_phub.c */ >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_PCH1_PHUB, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7213_PHUB, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_mPHUB, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7223_nPHUB, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ROHM, PCI_DEVICE_ID_ROHM_ML7831_PHUB, >> + quirk_set_need_rom_bar); >> + >> +/* from drivers/net/ethernet/sun/sungem.c */ >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_GEM, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_RIO_GEM, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMAC, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMACP, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_UNI_N_GMAC2, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_K2_GMAC, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_SH_SUNGEM, >> + quirk_set_need_rom_bar); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_IPID2_GMAC, >> + quirk_set_need_rom_bar); >> + >> +/* from drivers/net/ethernet/sun/sunhme.c */ >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_SUN, PCI_DEVICE_ID_SUN_HAPPYMEAL, >> + quirk_set_need_rom_bar); >> + >> +/* from drm and fbdev */ >> +DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA, >> + 8, quirk_set_need_rom_bar); >> Index: linux-2.6/drivers/pci/setup-bus.c >> =================================================================== >> --- linux-2.6.orig/drivers/pci/setup-bus.c >> +++ linux-2.6/drivers/pci/setup-bus.c >> @@ -226,6 +226,10 @@ static void pdev_assign_resources_prepar >> if (resource_disabled(r) || r->parent) >> continue; >> >> + if (i == PCI_ROM_RESOURCE && >> + !pci_assign_roms() && !pci_dev_need_rom_bar(dev)) >> + continue; >> + >> if ((r->flags & IORESOURCE_IO) && >> !pci_find_host_bridge(dev->bus)->has_ioport) >> continue; >> @@ -1461,11 +1465,16 @@ out: >> return good_size; >> } >> >> -static inline bool is_optional(int i) >> +bool __weak pci_assign_roms(void) >> +{ >> + return false; >> +} >> + >> +static inline bool is_optional(struct pci_dev *dev, int i) >> { >> >> if (i == PCI_ROM_RESOURCE) >> - return true; >> + return !pci_assign_roms() && !pci_dev_need_rom_bar(dev); >> >> #ifdef CONFIG_PCI_IOV >> if (i >= PCI_IOV_RESOURCES && i <= PCI_IOV_RESOURCE_END) >> @@ -1538,7 +1547,7 @@ static int pbus_size_mem(struct pci_bus >> r_size = resource_size(r); >> align = pci_resource_alignment(dev, r); >> /* put SRIOV/ROM res to realloc list */ >> - if (realloc_head && is_optional(i)) { >> + if (realloc_head && is_optional(dev, i)) { >> add_to_align_test_list(&align_test_add_list, >> align, r_size, &dev->dev, i); >> r->end = r->start - 1; >> @@ -1549,6 +1558,9 @@ static int pbus_size_mem(struct pci_bus >> max_add_align = align; >> continue; >> } >> + if (!realloc_head && i == PCI_ROM_RESOURCE && >> + !pci_assign_roms() && !pci_dev_need_rom_bar(dev)) >> + continue; >> >> if (align > (1ULL<<37)) { /*128 Gb*/ >> dev_warn(&dev->dev, "disabling BAR %d: %pR (bad >> alignment %#llx)\n", >> Index: linux-2.6/include/linux/pci.h >> =================================================================== >> --- linux-2.6.orig/include/linux/pci.h >> +++ linux-2.6/include/linux/pci.h >> @@ -182,6 +182,8 @@ enum pci_dev_flags { >> PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7), >> /* Get VPD from function 0 VPD */ >> PCI_DEV_FLAGS_VPD_REF_F0 = (__force pci_dev_flags_t) (1 << 8), >> + /* Need to have ROM BAR */ >> + PCI_DEV_FLAGS_NEED_ROM_BAR = (__force pci_dev_flags_t) (1 << 9), >> }; >> >> enum pci_irq_reroute_variant { >> @@ -1965,6 +1967,14 @@ static inline bool pci_is_dev_assigned(s >> { >> return (pdev->dev_flags & PCI_DEV_FLAGS_ASSIGNED) == >> PCI_DEV_FLAGS_ASSIGNED; >> } >> +static inline void pci_dev_set_need_rom_bar(struct pci_dev *pdev) >> +{ >> + pdev->dev_flags |= PCI_DEV_FLAGS_NEED_ROM_BAR; >> +} >> +static inline bool pci_dev_need_rom_bar(struct pci_dev *pdev) >> +{ >> + return !!(pdev->dev_flags & PCI_DEV_FLAGS_NEED_ROM_BAR); >> +} >> >> /** >> * pci_ari_enabled - query ARI forwarding status >> @@ -1976,4 +1986,7 @@ static inline bool pci_ari_enabled(struc >> { >> return bus->self && bus->self->ari_enabled; >> } >> + >> +bool pci_assign_roms(void); >> + >> #endif /* LINUX_PCI_H */ >> Index: linux-2.6/drivers/misc/pch_phub.c >> =================================================================== >> --- linux-2.6.orig/drivers/misc/pch_phub.c >> +++ linux-2.6/drivers/misc/pch_phub.c >> @@ -49,7 +49,6 @@ >> >> /* MAX number of INT_REDUCE_CONTROL registers */ >> #define MAX_NUM_INT_REDUCE_CONTROL_REG 128 >> -#define PCI_DEVICE_ID_PCH1_PHUB 0x8801 >> #define PCH_MINOR_NOS 1 >> #define CLKCFG_CAN_50MHZ 0x12000000 >> #define CLKCFG_CANCLK_MASK 0xFF000000 >> @@ -61,17 +60,6 @@ >> #define CLKCFG_PLL2VCO (8 << 9) >> #define CLKCFG_UARTCLKSEL (1 << 18) >> >> -/* Macros for ML7213 */ >> -#define PCI_VENDOR_ID_ROHM 0x10db >> -#define PCI_DEVICE_ID_ROHM_ML7213_PHUB 0x801A >> - >> -/* Macros for ML7223 */ >> -#define PCI_DEVICE_ID_ROHM_ML7223_mPHUB 0x8012 /* for Bus-m */ >> -#define PCI_DEVICE_ID_ROHM_ML7223_nPHUB 0x8002 /* for Bus-n */ >> - >> -/* Macros for ML7831 */ >> -#define PCI_DEVICE_ID_ROHM_ML7831_PHUB 0x8801 >> - >> /* SROM ACCESS Macro */ >> #define PCH_WORD_ADDR_MASK (~((1 << 2) - 1)) >> >> Index: linux-2.6/include/linux/pci_ids.h >> =================================================================== >> --- linux-2.6.orig/include/linux/pci_ids.h >> +++ linux-2.6/include/linux/pci_ids.h >> @@ -1122,6 +1122,12 @@ >> #define PCI_VENDOR_ID_TCONRAD 0x10da >> #define PCI_DEVICE_ID_TCONRAD_TOKENRING 0x0508 >> >> +#define PCI_VENDOR_ID_ROHM 0x10DB >> +#define PCI_DEVICE_ID_ROHM_ML7213_PHUB 0x801A >> +#define PCI_DEVICE_ID_ROHM_ML7223_mPHUB 0x8012 /* for Bus-m */ >> +#define PCI_DEVICE_ID_ROHM_ML7223_nPHUB 0x8002 /* for Bus-n */ >> +#define PCI_DEVICE_ID_ROHM_ML7831_PHUB 0x8801 >> + >> #define PCI_VENDOR_ID_NVIDIA 0x10de >> #define PCI_DEVICE_ID_NVIDIA_TNT 0x0020 >> #define PCI_DEVICE_ID_NVIDIA_TNT2 0x0028 >> @@ -2900,6 +2906,7 @@ >> #define PCI_DEVICE_ID_INTEL_82454NX 0x84cb >> #define PCI_DEVICE_ID_INTEL_84460GX 0x84ea >> #define PCI_DEVICE_ID_INTEL_IXP4XX 0x8500 >> +#define PCI_DEVICE_ID_PCH1_PHUB 0x8801 >> #define PCI_DEVICE_ID_INTEL_IXP2800 0x9004 >> #define PCI_DEVICE_ID_INTEL_S21152BB 0xb152 >> >> Index: linux-2.6/drivers/dma/pch_dma.c >> =================================================================== >> --- linux-2.6.orig/drivers/dma/pch_dma.c >> +++ linux-2.6/drivers/dma/pch_dma.c >> @@ -976,7 +976,6 @@ static void pch_dma_remove(struct pci_de >> } >> >> /* PCI Device ID of DMA device */ >> -#define PCI_VENDOR_ID_ROHM 0x10DB >> #define PCI_DEVICE_ID_EG20T_PCH_DMA_8CH 0x8810 >> #define PCI_DEVICE_ID_EG20T_PCH_DMA_4CH 0x8815 >> #define PCI_DEVICE_ID_ML7213_DMA1_8CH 0x8026 >> Index: linux-2.6/drivers/gpio/gpio-ml-ioh.c >> =================================================================== >> --- linux-2.6.orig/drivers/gpio/gpio-ml-ioh.c >> +++ linux-2.6/drivers/gpio/gpio-ml-ioh.c >> @@ -31,8 +31,6 @@ >> >> #define IOH_IRQ_BASE 0 >> >> -#define PCI_VENDOR_ID_ROHM 0x10DB >> - >> struct ioh_reg_comn { >> u32 ien; >> u32 istatus; ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-25 4:35 ` Yinghai Lu 2015-09-25 13:31 ` Myron Stowe @ 2015-09-25 14:35 ` Bjorn Helgaas 2015-09-25 16:18 ` Alex Williamson 1 sibling, 1 reply; 14+ messages in thread From: Bjorn Helgaas @ 2015-09-25 14:35 UTC (permalink / raw) To: Yinghai Lu; +Cc: Myron Stowe, linux-pci, LKML, Bjorn Helgaas, Alex Williamson On Thu, Sep 24, 2015 at 09:35:20PM -0700, Yinghai Lu wrote: > On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu <yinghai@kernel.org> wrote: > > > Or do we want to keep a white list to say which device should have > > ROM bar as mush have, and other is optional to have ? > > Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar() > > Only set that for > 1. if BIOS/firmware already set ROM bar. > 2. via quirks for some devices. > > We assign those needed ROM bar as required > and other ROM bars as optional resources. I'd rather not have a whitelist if we can avoid it. We'd be continually adding new devices to it, and it makes the system harder to understand because there's no consistent rule about how ROMs are handled. Alex mentioned the idea of ripping the ROM, and I'd like to explore that idea a little more. What if we could temporarily assign space for the ROM during enumeration, read the contents, cache the contents somewhere, then deallocate the actual BAR space? We could hang onto the cache and give it to anybody who later needs the ROM. I know there are probably issues here, but I don't know what they are, so I'd like to at least have a conversation about it. Bjorn ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-25 14:35 ` Bjorn Helgaas @ 2015-09-25 16:18 ` Alex Williamson 2015-09-26 0:04 ` Yinghai Lu 0 siblings, 1 reply; 14+ messages in thread From: Alex Williamson @ 2015-09-25 16:18 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: Yinghai Lu, Myron Stowe, linux-pci, LKML, Bjorn Helgaas On Fri, 2015-09-25 at 09:35 -0500, Bjorn Helgaas wrote: > On Thu, Sep 24, 2015 at 09:35:20PM -0700, Yinghai Lu wrote: > > On Thu, Sep 24, 2015 at 12:01 PM, Yinghai Lu <yinghai@kernel.org> wrote: > > > > > Or do we want to keep a white list to say which device should have > > > ROM bar as mush have, and other is optional to have ? > > > > Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar() > > > > Only set that for > > 1. if BIOS/firmware already set ROM bar. > > 2. via quirks for some devices. > > > > We assign those needed ROM bar as required > > and other ROM bars as optional resources. > > I'd rather not have a whitelist if we can avoid it. We'd be > continually adding new devices to it, and it makes the system harder > to understand because there's no consistent rule about how ROMs are > handled. > > Alex mentioned the idea of ripping the ROM, and I'd like to explore > that idea a little more. What if we could temporarily assign space > for the ROM during enumeration, read the contents, cache the contents > somewhere, then deallocate the actual BAR space? We could hang onto > the cache and give it to anybody who later needs the ROM. > > I know there are probably issues here, but I don't know what they are, > so I'd like to at least have a conversation about it. Ok, so for background in the case I mentioned, we can often use the pci-sysfs rom interface to get a copy of the device ROM, which we then pass to QEMU and we avoid any access to the physical ROM from the VM. We can obviously get the ROM from other sources too, but that's not really relevant here. If we want to extend this idea into the kernel, creating a buffer that holds the ROM contents that we access rather than mapping and enabling the ROM to provide access, we first need to get access to the ROM at least once. The simplification here is that we can do this on boot and we can re-use the space allocated to the standard BARs since we don't need space for both the ROM and the standard BARs at the same time. I think that in the vast majority of cases, we're going to find that the ROM BAR is smaller than the largest standard MMIO BAR of the device or at least smaller than the minimum bridge aperture. This suggests that we could simply record the BAR values, reprogram them to zero, then overlap the ROM BAR to the orignal addresses, enable, rip the ROM, then restore the configuration without needing to adjust any bridge apertures. That sounds pretty good, so long as we can consider the ROM to be perfectly static. I don't know if anything relies on an in-place update or if there are ROMs that are less read-only than others. Maybe it's safe to assume that or at least safe to assume that if the BIOS didn't leave room for them, then we can consider them static. It might be interesting to strace some of the userspace firmware update programs for add-in cards to see if they re-read the ROM to determine if it has changed. We already sort of do this for VGA ROMs. When a driver tries to map the boot VGA ROM they actually map the shadow copy at 0xC0000 (iirc) rather than the one on the device. This actually sort of sucks because this particular shadow copy certainly is not read-only and in all the glory that is VGA, the shadow sometimes gets modified by the execution of the VGA ROM itself and we no longer have access to the pristine device version of it (bad for device assignment of primary graphics). Now, if we want to make our own shadow copies for all ROMs, it seems pretty clear that we first need to get access to the ROM, which means we need to figure out if the BIOS mapped it. If the ROM BAR is outside of the bridge aperture (catching both 0 and 0xFFF..000) or overlaps a standard ROM BAR, we can consider it unprogrammed. In that case we need to try to do the trick above with unmapping standard BARs to get the shadow copy. Otherwise we should be able to get the shadow copy in place (maybe a question of which we prefer to use in this case). I would be willing to risk that if the BIOS didn't leave room for the ROM and we can't map it into the space used by other BARs or it doesn't fit the bridge aperture, we can spit out a printk warning and skip it. I expect a very low failure rate. What dependencies would we have on the BIOS programming of the ROM BAR if we took such an approach? It seems like we should always be able to detect invalid contents in the ROM BAR and it doesn't matter if they think we should have access or not, we own the device and can give ourselves access. Thanks, Alex ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-25 16:18 ` Alex Williamson @ 2015-09-26 0:04 ` Yinghai Lu 0 siblings, 0 replies; 14+ messages in thread From: Yinghai Lu @ 2015-09-26 0:04 UTC (permalink / raw) To: Alex Williamson Cc: Bjorn Helgaas, Myron Stowe, linux-pci, LKML, Bjorn Helgaas On Fri, Sep 25, 2015 at 9:18 AM, Alex Williamson <alex.williamson@redhat.com> wrote: >> > > Or do we want to keep a white list to say which device should have >> > > ROM bar as mush have, and other is optional to have ? >> > >> > Subject: [RFC PATCH] PCI: Add pci_dev_need_rom_bar() >> > >> > Only set that for >> > 1. if BIOS/firmware already set ROM bar. >> > 2. via quirks for some devices. >> > >> > We assign those needed ROM bar as required >> > and other ROM bars as optional resources. >> >> I'd rather not have a whitelist if we can avoid it. We'd be >> continually adding new devices to it, and it makes the system harder >> to understand because there's no consistent rule about how ROMs are >> handled. I don't like that whitelist way, and hope we can find better way to handle all the cases elegantly. >> >> Alex mentioned the idea of ripping the ROM, and I'd like to explore >> that idea a little more. What if we could temporarily assign space >> for the ROM during enumeration, read the contents, cache the contents >> somewhere, then deallocate the actual BAR space? We could hang onto >> the cache and give it to anybody who later needs the ROM. > > That sounds pretty good, so long as we can consider the ROM to be > perfectly static. I don't know if anything relies on an in-place update > or if there are ROMs that are less read-only than others. Maybe it's > safe to assume that or at least safe to assume that if the BIOS didn't > leave room for them, then we can consider them static. It might be > interesting to strace some of the userspace firmware update programs for > add-in cards to see if they re-read the ROM to determine if it has > changed. Should cover most cases. But there is some driver like drivers/mtd/maps/pci.c::drivers/mtd/maps/pci.c do have write operation to the mmio range. > > We already sort of do this for VGA ROMs. When a driver tries to map the > boot VGA ROM they actually map the shadow copy at 0xC0000 (iirc) rather > than the one on the device. This actually sort of sucks because this > particular shadow copy certainly is not read-only and in all the glory > that is VGA, the shadow sometimes gets modified by the execution of the > VGA ROM itself and we no longer have access to the pristine device > version of it (bad for device assignment of primary graphics). > > Now, if we want to make our own shadow copies for all ROMs, it seems > pretty clear that we first need to get access to the ROM, which means we > need to figure out if the BIOS mapped it. If the ROM BAR is outside of > the bridge aperture (catching both 0 and 0xFFF..000) or overlaps a > standard ROM BAR, we can consider it unprogrammed. In that case we need > to try to do the trick above with unmapping standard BARs to get the > shadow copy. Otherwise we should be able to get the shadow copy in > place (maybe a question of which we prefer to use in this case). I > would be willing to risk that if the BIOS didn't leave room for the ROM > and we can't map it into the space used by other BARs or it doesn't fit > the bridge aperture, we can spit out a printk warning and skip it. I > expect a very low failure rate. Maybe we can combine two methods together: 1. have NEED_ROM_BAR, and it is set a) if BIOS allocate resource to yet (maybe not, so we leave space for MMIO bars) b) via limited whitelist that will not support static copy. 2. kernel will try to allocate resource to ROM bar with NEED_ROM_BAR as required, and others as optional 3. for ROM bar that can not get assigned, kernel try to borrow mmio range from other MMIO bar on the device, and save a copy and expose that via /sys/.../rom That should happen FINAL_FIXUP stage before driver get loaded. --- There is risk on it, some add-on card firmware would stop working if kernel ever try to change MMIO bar. then we will need blacklist to skip BAR on those devices. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-24 2:47 [RFC] PCI: Unassigned Expansion ROM BARs Myron Stowe 2015-09-24 3:21 ` Yinghai Lu @ 2015-09-27 19:29 ` Myron Stowe 2015-09-27 20:19 ` Myron Stowe 2 siblings, 0 replies; 14+ messages in thread From: Myron Stowe @ 2015-09-27 19:29 UTC (permalink / raw) To: linux-pci; +Cc: LKML, Bjorn Helgaas, Yinghai Lu, Alex Williamson On Wed, Sep 23, 2015 at 8:47 PM, Myron Stowe <myron.stowe@gmail.com> wrote: snip > > The kernel expects device Expansion ROM BARs to be programmed with valid > values - even if the respective Expansion ROM's Enable bit is 0 (i.e. the > device’s expansion ROM address space is disabled). This seems to be the > main contention point with said BIOS engineers. If an Expansion ROM BAR is > not programmed, the kernel will attempt to find available resources and, if > successful, program it. As this occurs various 'dmesg' entries > related to kernel's actions are output. > The respective BIOS engineers from the two major vendors exhibiting the behavior outlined are aware of, and monitoring, this thread. With the exception of Daniel's recent post, there hasn't been much substance presented supporting the OS's viewpoint to encourage the BIOS engineers to enter into any kind of discussion. The majority of the responses have gone straight towards how the OS can effectively work around platform's that exhibit such setups. I'd like to step back and present known instances of the OS's need(s) to access the Expansion (a.k.a. option) ROMs - something for the BIOS engineers to consider; something with which to start a dialogue. There are at least three known major reasons why Linux uses the ROMs: 1) For many of the video cards, Linux has drivers that assume the card has been initialized by the ROM. The drivers work fine, but they aren't smart enough to work with the card straight out of reset - a lot of which is due to specific vendor's keeping their devices closed; the code remains proprietary. When such devices are reset when the OS is running (i.e. when X is restarted), the OS has to run the ROM before the driver works again. Alex Williamson and Daniel Blueman both covered this fairly well, including the current dificiencies of such, in prior threads. 2) As Daniel further expressed, hot-plug scenarios and PCI domains which may not be visible to the BIOS at initial boot, may need to access the ROMs. In these environments - PCI hiearchies shared among multiple, distinct, servers; hiearchies using non-transparent bridges - option ROMs handed off by the BIOS unassigned need to be allocated by the OS so that they can be accessed under these circumstances. 3) Virtualized guest environments where a device may be assigned to a virtualized guest is an interesting case. In such environments the host OS effectively functions as the meta-level BIOS, setting up a guest's environment (virtual platform) prior to instantianting it. Within such a context consider a simple example: NIC devices often have Preboot Execution Environment (PXE) code in their ROMs. In a bare-metal scenario, the BIOS (a.k.a. platform firmware) obtains the PXE code from the ROM and initiates its execution. In this scenario, once the OS is up and running there would seem to be no further need to access such device's ROMs. If we now extend the scenario one meta-level to include virtualization, the host OS [1] assumes the role of bare-metal environment's BIOS and the virtualized guest takes on the role of bare-metal OS. As such, if the guest is booted via PXE from a NIC device, the meta-level BIOS (QEMU/seabios) needs the ROM's content in order initiate PXE execution to bring up the guest. So in virtualized environments, it becomes obvious that all the traditional BIOS ROM related actions extend to the (host) OS - PXE booting, device initialization, hot-plug, and directly assigning physical devices to virtualized guests, etc. [1] "host OS" is used here in the generalized sence (i.e. it is in control and thus the subsequent use of QEMU and seabios are not specifically differentiated). ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-24 2:47 [RFC] PCI: Unassigned Expansion ROM BARs Myron Stowe 2015-09-24 3:21 ` Yinghai Lu 2015-09-27 19:29 ` Myron Stowe @ 2015-09-27 20:19 ` Myron Stowe 2016-09-01 19:16 ` Myron Stowe 2 siblings, 1 reply; 14+ messages in thread From: Myron Stowe @ 2015-09-27 20:19 UTC (permalink / raw) To: linux-pci; +Cc: LKML, Bjorn Helgaas, Yinghai Lu, Alex Williamson On Wed, Sep 23, 2015 at 8:47 PM, Myron Stowe <myron.stowe@gmail.com> wrote: snip > > There is a kernel boot parameter, pci=norom, that is intended to disable the > kernel's resource assignment actions for Expansion ROMs that do not already > have BIOS assigned address ranges. Note however, if I remember correctly, > that this only works if the Expansion ROM BAR is set to "0" by the BIOS > before hand-off. In private conversations I was asked: Why do you propose asking the BIOS to assign setting Expansion ROM BARs to "0"? That is not what I'm advocating. I think it's a complete hack. Some background context - this is effectively the defacto detente that has come to be somehow with one of the major vendor's BIOS' to circumvent 'dmesg' entries corresponding to unassigned Expansion ROM BARs which draws customer attention. Unless something has changed recently, specifying "pci=norom" when booting does not cause the kernel to completely ignore Expansion ROM BARs all together as one would expect. The kernel still outputs 'dmesg's corresponding to unassigned Expansion ROM BARs and also attempts to allocate such. This is a kernel bug in my opinion. It's only if both "pci=norom" has been specified, and, the BIOS has set the Expansion ROM BARs to "0" that the kernel completely ignores Expansion ROM BARs and no 'dmesg's are output. Customers don't want to, and shouldn't have to, utilize kernel parameters. They are indispensable for kernel engineers to use for debugging and such but not for normal, every day, (i.e. customer) use. So, no, I am not advocating the current defacto detente that is in place today ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] PCI: Unassigned Expansion ROM BARs 2015-09-27 20:19 ` Myron Stowe @ 2016-09-01 19:16 ` Myron Stowe 0 siblings, 0 replies; 14+ messages in thread From: Myron Stowe @ 2016-09-01 19:16 UTC (permalink / raw) To: linux-pci; +Cc: LKML, Bjorn Helgaas, Yinghai Lu, Alex Williamson Here it is a year later and there has basically been no progress on this ongoing situation. I still often encounter bugs raised against the kernel w.r.t. unmet resource allocations - here is the most recent example, I'll attach the 'dmesg' log from the platform at https://bugzilla.kernel.org/show_bug.cgi?id=104931. Researching device 0000:04:00.3 as it's the device with the issue (and all other devices/functions under PCI bus 04 due to possible competing resource needs). Analysis from v4.7.0 kernel run 'dmesg' log with comments interspersed ... This platform has two PCI Root Bridges. Limiting analysis to the first Root Bridge handling PCI buses 0x00 through 0x7e as it contains the PCI bus in question - bus 04. ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7e]) PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [io 0x0000-0x03bb window] pci_bus 0000:00: root bus resource [io 0x03bc-0x03df window] pci_bus 0000:00: root bus resource [io 0x03e0-0x0cf7 window] pci_bus 0000:00: root bus resource [io 0x1000-0x7fff window] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window] pci_bus 0000:00: root bus resource [mem 0x90000000-0xc7ffbfff window] pci_bus 0000:00: root bus resource [mem 0x30000000000-0x33fffffffff window] CPU addresses falling into the above resource ranges will get intercepted by the host controller and converted into PCI bus transactions. Looking further into the log we find the set of resource ranges (PCI-to-PCI bridge apertures) corresponding to PCI bus 04. pci 0000:00:02.0: PCI bridge to [bus 04] pci 0000:00:02.0: bridge window [io 0x2000-0x2fff] pci 0000:00:02.0: bridge window [mem 0x92000000-0x940fffff] 33M The following shows what the platforms BIOS programmed into the BARs of device(s) under PCI bus 04. pci 0000:04:00.0: [1924:0923] type 00 class 0x020000 pci 0000:04:00.0: reg 0x10: [io 0x2300-0x23ff] pci 0000:04:00.0: reg 0x18: [mem 0x93800000-0x93ffffff 64bit] BAR2 pci 0000:04:00.0: reg 0x20: [mem 0x9400c000-0x9400ffff 64bit] BAR4 pci 0000:04:00.0: reg 0x30: [mem 0xfffc0000-0xffffffff pref] E ROM pci 0000:04:00.1: [1924:0923] type 00 class 0x020000 pci 0000:04:00.1: reg 0x10: [io 0x2200-0x22ff] pci 0000:04:00.1: reg 0x18: [mem 0x93000000-0x937fffff 64bit] pci 0000:04:00.1: reg 0x20: [mem 0x94008000-0x9400bfff 64bit] pci 0000:04:00.1: reg 0x30: [mem 0xfffc0000-0xffffffff pref] pci 0000:04:00.2: [1924:0923] type 00 class 0x020000 pci 0000:04:00.2: reg 0x10: [io 0x2100-0x21ff] pci 0000:04:00.2: reg 0x18: [mem 0x92800000-0x92ffffff 64bit] pci 0000:04:00.2: reg 0x20: [mem 0x94004000-0x94007fff 64bit] pci 0000:04:00.2: reg 0x30: [mem 0xfffc0000-0xffffffff pref] pci 0000:04:00.3: [1924:0923] type 00 class 0x020000 pci 0000:04:00.3: reg 0x10: [io 0x2000-0x20ff] pci 0000:04:00.3: reg 0x18: [mem 0x92000000-0x927fffff 64bit] 8M pci 0000:04:00.3: reg 0x20: [mem 0x94000000-0x94003fff 64bit] 16K pci 0000:04:00.3: reg 0x30: [mem 0xfffc0000-0xffffffff pref] 256K It's already obvious that the 33M of MMIO space that the PCI-to-PCI bridge leading to PCI bus 04 provides (0x92000000-0x940fffff) is not enough space to fully satisfy the MMIO specific addressing needs of all device's BARs below it - the 4 combined ports - totaling (8M + 16K + 256K) *4) = 33M + 64K. This is _without_ taking into account any alignment constraints that likely would increase the buses needed aperture range even further. Note that the values programmed into the device's Expansion ROM BARs do not fit within any of its immediately upstream bridge's MMIO related apertures. pci 0000:04:00.0: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window pci 0000:04:00.1: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window pci 0000:04:00.2: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window pci 0000:04:00.3: can't claim BAR 6 [mem 0xfffc0000-0xffffffff pref]: no compatible bridge window The kernel notices this and attempts to allocate appropriate space for them from any remaining, available, MMIO space that meets all the alignment constraints and such. pci 0000:04:00.0: BAR 6: assigned [mem 0x94040000-0x9407ffff pref] pci 0000:04:00.1: BAR 6: assigned [mem 0x94080000-0x940bffff pref] pci 0000:04:00.2: BAR 6: assigned [mem 0x940c0000-0x940fffff pref] pci 0000:04:00.3: BAR 6: no space for [mem size 0x00040000 pref] pci 0000:04:00.3: BAR 6: failed to assign [mem size 0x00040000 pref] The kernel was able to satisfy the first three ports MMIO needs but was _not_ able to for the last port - there is no remaining available addressing space within the range to satisfy its needs! At this point the 0000:04:00.3 device just happens to work by luck due to the fact that the unmet resource needs correspond to its Expansion ROM BAR [1]. Next a "user" initiates a PCIe hot-unplug of the device, the bus is re-scanned and as a result, BAR4 of all 4 of the device's functions fail getting their appropriate resources allocated. pci 0000:00:02.0: PCI bridge to [bus 04] pci 0000:00:02.0: bridge window [io 0x2000-0x2fff] pci 0000:00:02.0: bridge window [mem 0x92000000-0x940fffff] 33M pci 0000:04:00.0: BAR 2: assigned [mem 0x92000000-0x927fffff 64bit] pci 0000:04:00.1: BAR 2: assigned [mem 0x92800000-0x92ffffff 64bit] pci 0000:04:00.2: BAR 2: assigned [mem 0x93000000-0x937fffff 64bit] pci 0000:04:00.3: BAR 2: assigned [mem 0x93800000-0x93ffffff 64bit] pci 0000:04:00.0: BAR 6: assigned [mem 0x94000000-0x9403ffff pref] pci 0000:04:00.1: BAR 6: assigned [mem 0x94040000-0x9407ffff pref] pci 0000:04:00.2: BAR 6: assigned [mem 0x94080000-0x940bffff pref] pci 0000:04:00.3: BAR 6: assigned [mem 0x940c0000-0x940fffff pref] At this point -all- available MMIO resource space has been consumed. For the more visually inclined (if it's not already obvious). There's probably an easier way to visualize the exhaustion but here is my lame attempt: PCI Bridge 04's MMIO aperture resource range totals 33M ( 0x92000000-0x940fffff ). The first line below denotes the 33M in 1M increments (chunks). The second line denotes the addressing range; specifically bytes 7 and 6 withing the resource's range ( 0x9--xxxxx ). The last line denotes the port (0 through 3) consuming that portion of the resource's range. 1 2 3 4 5 6 7 8 9101112131415161718192021222324252627282930313233 33M 202122232425262728292a2b2c2d232f303132333435363738393a3b3c3d3e3f40 [76] 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3-- The last 1M is consumed by a smaller granularity so expanding the above conceptualization to a finer level. 1M of resource range ( 94000000-940fffff ) visualized in 32K increments ( bytes 5 and 4; 0x940--xxx ). 1 2 3 4 5 6 7 8 91011121314151617181920212223242526272829303132 1M 0008101820283038404850586068707880889098a0a8b0b8c0c8d0d8e0e8f0f8 [54] 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 and the remaining needed resource allocation attempts are going to fail. pci 0000:04:00.0: BAR 4: no space for [mem size 0x00004000 64bit] pci 0000:04:00.0: BAR 4: failed to assign [mem size 0x00004000 64bit] pci 0000:04:00.1: BAR 4: no space for [mem size 0x00004000 64bit] pci 0000:04:00.1: BAR 4: failed to assign [mem size 0x00004000 64bit] pci 0000:04:00.2: BAR 4: no space for [mem size 0x00004000 64bit] pci 0000:04:00.2: BAR 4: failed to assign [mem size 0x00004000 64bit] pci 0000:04:00.3: BAR 4: no space for [mem size 0x00004000 64bit] pci 0000:04:00.3: BAR 4: failed to assign [mem size 0x00004000 64bit] pci 0000:04:00.0: BAR 0: assigned [io 0x2000-0x20ff] pci 0000:04:00.1: BAR 0: assigned [io 0x2400-0x24ff] pci 0000:04:00.2: BAR 0: assigned [io 0x2800-0x28ff] pci 0000:04:00.3: BAR 0: assigned [io 0x2c00-0x2cff] At this point none of the four functions (ports) - 0000:04:00.{0..3} were able to get their necessary resource needs met and thus the device's functions (NIC ports) do not work. In fact, I would expect the driver's call into the kernel's PCI core 'pci_enable_device()' routine to fail [1]. Conclusion ... The root cause of the issue(s) [2] is the platform's BIOS not providing enough, and setting up properly, resource needs that the device requires - specifically MMIO addressing space related resources. Most notably conspicuous is the device's Expansion ROM BAR(s) as they are improperly programmed - the initial BIOS programmed values do not fall within any valid resource ranges of the immediately upstream PCI-to-PCI Bridge's MMIO apertures. As for "symptomatic" solutions (just a band-aid to treat the symptom and not addressing the root cause) ... Short of getting the platform's BIOS updated to appropriately account for the device's total needs, a "compromized" solution has been to get them to program device's Expansion ROM BAR values with '0'. This has been done in the past so why this platform's BIOS engineers have chosen not to do that again in this instance is "out of character" and concerning. If, and only if, a device's Expansion ROM BAR is programmed with '0', then adding the "norom" kernel boot parameter will cause the kernel to ignore, and not attempt to assign resources to, such. Short of that, drivers can use, and check the return value of, pci_enable_rom(). That should fail if it's unassigned. Looking at it, it only fails if 'flags == 0' so I'm not sure that catches all cases of it being unassigned. [1] For a device's normal BARs - the BARs corresponding to the PCI specification's "Base Address 0 through 5" Type 0 configuration header space entries - that are initially ill programmed and the kernel can not subsequently assign appropriate resources for such, then the kernel's PCI core subsystem's 'pci_enable_device()' routine should fail. [2] While the analysis only covers one specific device, the 'dmesg' log shows that the same base root cause occurs in at least two additional instances. ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2016-09-01 21:14 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-09-24 2:47 [RFC] PCI: Unassigned Expansion ROM BARs Myron Stowe 2015-09-24 3:21 ` Yinghai Lu 2015-09-24 3:58 ` Alex Williamson 2015-09-24 17:06 ` Myron Stowe 2015-09-24 19:01 ` Yinghai Lu 2015-09-25 4:35 ` Yinghai Lu 2015-09-25 13:31 ` Myron Stowe 2015-09-25 17:02 ` Myron Stowe 2015-09-25 14:35 ` Bjorn Helgaas 2015-09-25 16:18 ` Alex Williamson 2015-09-26 0:04 ` Yinghai Lu 2015-09-27 19:29 ` Myron Stowe 2015-09-27 20:19 ` Myron Stowe 2016-09-01 19:16 ` Myron Stowe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).