* pci_generic_config_write32() access size problem @ 2015-09-28 22:08 Bjorn Helgaas 2015-09-28 22:42 ` Rob Herring 0 siblings, 1 reply; 10+ messages in thread From: Bjorn Helgaas @ 2015-09-28 22:08 UTC (permalink / raw) To: Rob Herring; +Cc: Russell King, linux-pci Hi Rob, Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config accessors"). pci_generic_config_write32() does a read/modify/write if the size is less than 32 bits, so I think we have problem if this is used with 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can fix this? Bjorn ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: pci_generic_config_write32() access size problem 2015-09-28 22:08 pci_generic_config_write32() access size problem Bjorn Helgaas @ 2015-09-28 22:42 ` Rob Herring 2015-09-28 23:05 ` Russell King - ARM Linux 0 siblings, 1 reply; 10+ messages in thread From: Rob Herring @ 2015-09-28 22:42 UTC (permalink / raw) To: Bjorn Helgaas, Russell King; +Cc: linux-pci@vger.kernel.org On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > Hi Rob, > > Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config > accessors"). pci_generic_config_write32() does a read/modify/write if the > size is less than 32 bits, so I think we have problem if this is used with > 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can > fix this? My series didn't change access sizes unless I made a made a mistake somewhere, so the problem should have existed before. Is it known addresses we need to deal with? We could do special case handling of certain addresses to mask out RW1C bits. This could be in the generic functions or in wrappers around the generic functions. There's already some similar examples of special address handling IIRC. Rob ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: pci_generic_config_write32() access size problem 2015-09-28 22:42 ` Rob Herring @ 2015-09-28 23:05 ` Russell King - ARM Linux 2015-09-29 16:47 ` Rob Herring 0 siblings, 1 reply; 10+ messages in thread From: Russell King - ARM Linux @ 2015-09-28 23:05 UTC (permalink / raw) To: Rob Herring; +Cc: Bjorn Helgaas, linux-pci@vger.kernel.org On Mon, Sep 28, 2015 at 05:42:20PM -0500, Rob Herring wrote: > On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > > Hi Rob, > > > > Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config > > accessors"). pci_generic_config_write32() does a read/modify/write if the > > size is less than 32 bits, so I think we have problem if this is used with > > 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can > > fix this? > > My series didn't change access sizes unless I made a made a mistake > somewhere, so the problem should have existed before. > > Is it known addresses we need to deal with? We could do special case > handling of certain addresses to mask out RW1C bits. This could be in > the generic functions or in wrappers around the generic functions. > There's already some similar examples of special address handling > IIRC. I think this originally came into being because Intel decided that their IOP platforms wouldn't support anything except 32-bit accesses, and cargo cult programming took over. I complained about it on the Intel IOP platforms, but obviously if the hardware doesn't support anything else then your stuck with it. However, what's utterly wrong is that the generic PCI layer should give credence to this crap - it should _not_ provide a generic helper for performing only 32-bit accesses which is non-conformant to the PCI specs, at least without a big fat warning about it. The problem is not limited to what the PCI specs say - remember, vendors are free to add their own PCI configuration registers: you don't know where these registers are in the PCI config address space. What's even worse is that you don't know what use vendors have put the neighbouring bytes of PCI config space to in the case of an 8-bit or 16-bit vendor register. For the standard PCI config registers, some addresses are well known. For example, the PCI status register is at fixed address 6, sharing the PCI command register at address 4. Others depend on the function, and whether there's capabilities present. For example, the PCIe device status register shares the same double-word as the PCI device control register, and the status register contains RW1C bits. The same is true for the PCIe AER capability. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: pci_generic_config_write32() access size problem 2015-09-28 23:05 ` Russell King - ARM Linux @ 2015-09-29 16:47 ` Rob Herring 2015-09-29 17:06 ` Russell King - ARM Linux [not found] ` <CADaLNDndQ6qZ2yuGuff6=cTm565in5e75AjkcX8MJmH2BCwyEw@mail.gmail.com> 0 siblings, 2 replies; 10+ messages in thread From: Rob Herring @ 2015-09-29 16:47 UTC (permalink / raw) To: Russell King - ARM Linux, Ray Jui, Tanmay Inamdar, Thierry Reding, Scott Branden Cc: Bjorn Helgaas, linux-pci@vger.kernel.org + several affected driver maintainers On Mon, Sep 28, 2015 at 6:05 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Mon, Sep 28, 2015 at 05:42:20PM -0500, Rob Herring wrote: >> On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: >> > Hi Rob, >> > >> > Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config >> > accessors"). pci_generic_config_write32() does a read/modify/write if the >> > size is less than 32 bits, so I think we have problem if this is used with >> > 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can >> > fix this? >> >> My series didn't change access sizes unless I made a made a mistake >> somewhere, so the problem should have existed before. >> >> Is it known addresses we need to deal with? We could do special case >> handling of certain addresses to mask out RW1C bits. This could be in >> the generic functions or in wrappers around the generic functions. >> There's already some similar examples of special address handling >> IIRC. > > I think this originally came into being because Intel decided that their > IOP platforms wouldn't support anything except 32-bit accesses, and > cargo cult programming took over. I complained about it on the Intel IOP > platforms, but obviously if the hardware doesn't support anything else > then your stuck with it. What about SA1100 nanoengine? Others are Tegra, XGene, iproc, and maybe rcar. These are all relatively active platforms, so hopefully some testing can be done. Rob > However, what's utterly wrong is that the generic PCI layer should give > credence to this crap - it should _not_ provide a generic helper for > performing only 32-bit accesses which is non-conformant to the PCI specs, > at least without a big fat warning about it. > > The problem is not limited to what the PCI specs say - remember, vendors > are free to add their own PCI configuration registers: you don't know > where these registers are in the PCI config address space. What's even > worse is that you don't know what use vendors have put the neighbouring > bytes of PCI config space to in the case of an 8-bit or 16-bit vendor > register. > > For the standard PCI config registers, some addresses are well known. > For example, the PCI status register is at fixed address 6, sharing the > PCI command register at address 4. > > Others depend on the function, and whether there's capabilities present. > For example, the PCIe device status register shares the same double-word > as the PCI device control register, and the status register contains RW1C > bits. > > The same is true for the PCIe AER capability. > > -- > FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up > according to speedtest.net. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: pci_generic_config_write32() access size problem 2015-09-29 16:47 ` Rob Herring @ 2015-09-29 17:06 ` Russell King - ARM Linux [not found] ` <CADaLNDndQ6qZ2yuGuff6=cTm565in5e75AjkcX8MJmH2BCwyEw@mail.gmail.com> 1 sibling, 0 replies; 10+ messages in thread From: Russell King - ARM Linux @ 2015-09-29 17:06 UTC (permalink / raw) To: Rob Herring Cc: Ray Jui, Tanmay Inamdar, Thierry Reding, Scott Branden, Bjorn Helgaas, linux-pci@vger.kernel.org On Tue, Sep 29, 2015 at 11:47:57AM -0500, Rob Herring wrote: > + several affected driver maintainers > > On Mon, Sep 28, 2015 at 6:05 PM, Russell King - ARM Linux > <linux@arm.linux.org.uk> wrote: > > On Mon, Sep 28, 2015 at 05:42:20PM -0500, Rob Herring wrote: > >> On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > >> > Hi Rob, > >> > > >> > Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config > >> > accessors"). pci_generic_config_write32() does a read/modify/write if the > >> > size is less than 32 bits, so I think we have problem if this is used with > >> > 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can > >> > fix this? > >> > >> My series didn't change access sizes unless I made a made a mistake > >> somewhere, so the problem should have existed before. > >> > >> Is it known addresses we need to deal with? We could do special case > >> handling of certain addresses to mask out RW1C bits. This could be in > >> the generic functions or in wrappers around the generic functions. > >> There's already some similar examples of special address handling > >> IIRC. > > > > I think this originally came into being because Intel decided that their > > IOP platforms wouldn't support anything except 32-bit accesses, and > > cargo cult programming took over. I complained about it on the Intel IOP > > platforms, but obviously if the hardware doesn't support anything else > > then your stuck with it. > > What about SA1100 nanoengine? No idea about that, sorry. I never had any information about what those comprised. I see that you've deleted a really useful comment which is relevant to this discussion though: - /* nanoEngine PCI bridge does not return -1 for a non-existing - * device. We must fake the answer. We know that the only valid - * device is device zero at bus 0, which is the network chip. */ Note "network chip" not "network card". That, coupled with: pci 0000:00:00.0: [8086:1209] type 0 class 0x000200 ... Kernel modules: e100 in other comments tells us that it's probably not a real PCI slot, but an e100 soldered down on the board. My guess is that the host bridge is a custom device (FPGA?) which interfaces with the rather unique method of gaining access to the SDRAM on SA1110. Basically, you share the SDRAM bus, and negotiate with the SA1110 through a hand-over process to take direct control of the SDRAM itself - no commercially produced PCI host chip would ever support this. Given that it's probably a custom and fixed setup, I don't think anyone has to worry about these corner cases. I doubt that it supports any of the SERR/PERR error reporting, and so I doubt anyone cares about the RW1C bits in the PCI status register there. There's certainly no chance of any PCIe stuff appearing there... if there are any of the boards still in existence. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <CADaLNDndQ6qZ2yuGuff6=cTm565in5e75AjkcX8MJmH2BCwyEw@mail.gmail.com>]
* Re: pci_generic_config_write32() access size problem [not found] ` <CADaLNDndQ6qZ2yuGuff6=cTm565in5e75AjkcX8MJmH2BCwyEw@mail.gmail.com> @ 2015-09-30 21:30 ` Rob Herring [not found] ` <CADaLNDmjv+SkRxtGDToUTxo4DKFF+yd2LRmte=k4-HRsGOHYVg@mail.gmail.com> 0 siblings, 1 reply; 10+ messages in thread From: Rob Herring @ 2015-09-30 21:30 UTC (permalink / raw) To: Duc Dang Cc: Russell King - ARM Linux, Ray Jui, Tanmay Inamdar, Thierry Reding, Scott Branden, Bjorn Helgaas, linux-pci@vger.kernel.org On Wed, Sep 30, 2015 at 4:17 PM, Duc Dang <dhdang@apm.com> wrote: > On Tue, Sep 29, 2015 at 9:47 AM, Rob Herring <robh@kernel.org> wrote: >> + several affected driver maintainers >> >> On Mon, Sep 28, 2015 at 6:05 PM, Russell King - ARM Linux >> <linux@arm.linux.org.uk> wrote: >>> On Mon, Sep 28, 2015 at 05:42:20PM -0500, Rob Herring wrote: >>>> On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: >>>> > Hi Rob, >>>> > >>>> > Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config >>>> > accessors"). pci_generic_config_write32() does a read/modify/write if the >>>> > size is less than 32 bits, so I think we have problem if this is used with >>>> > 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can >>>> > fix this? >>>> >>>> My series didn't change access sizes unless I made a made a mistake >>>> somewhere, so the problem should have existed before. >>>> >>>> Is it known addresses we need to deal with? We could do special case >>>> handling of certain addresses to mask out RW1C bits. This could be in >>>> the generic functions or in wrappers around the generic functions. >>>> There's already some similar examples of special address handling >>>> IIRC. >>> >>> I think this originally came into being because Intel decided that their >>> IOP platforms wouldn't support anything except 32-bit accesses, and >>> cargo cult programming took over. I complained about it on the Intel IOP >>> platforms, but obviously if the hardware doesn't support anything else >>> then your stuck with it. >> >> What about SA1100 nanoengine? >> >> Others are Tegra, XGene, iproc, and maybe rcar. These are all >> relatively active platforms, so hopefully some testing can be done. > > On X-Gene, so far I don't see anything strange after switching to use > pci_generic_config_write32 with all the cards that I can test. You mean switching to use pci_generic_config_write, right? > > But looks like it's still safer to use pci_generic_config_write > instead as this will make sure no unexpected read-modify-write will > happen? Yes. You can also change xgene_pcie_config_read32 to use pci_generic_config_read instead to simplify the function some. Rob ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <CADaLNDmjv+SkRxtGDToUTxo4DKFF+yd2LRmte=k4-HRsGOHYVg@mail.gmail.com>]
* Re: pci_generic_config_write32() access size problem [not found] ` <CADaLNDmjv+SkRxtGDToUTxo4DKFF+yd2LRmte=k4-HRsGOHYVg@mail.gmail.com> @ 2015-09-30 21:53 ` Rob Herring 2015-10-01 7:46 ` Thierry Reding 0 siblings, 1 reply; 10+ messages in thread From: Rob Herring @ 2015-09-30 21:53 UTC (permalink / raw) To: Duc Dang Cc: Russell King - ARM Linux, Ray Jui, Tanmay Inamdar, Thierry Reding, Scott Branden, Bjorn Helgaas, linux-pci@vger.kernel.org On Wed, Sep 30, 2015 at 4:47 PM, Duc Dang <dhdang@apm.com> wrote: > On Wed, Sep 30, 2015 at 2:30 PM, Rob Herring <robh@kernel.org> wrote: >> On Wed, Sep 30, 2015 at 4:17 PM, Duc Dang <dhdang@apm.com> wrote: >>> On Tue, Sep 29, 2015 at 9:47 AM, Rob Herring <robh@kernel.org> wrote: >>>> + several affected driver maintainers >>>> >>>> On Mon, Sep 28, 2015 at 6:05 PM, Russell King - ARM Linux >>>> <linux@arm.linux.org.uk> wrote: >>>>> On Mon, Sep 28, 2015 at 05:42:20PM -0500, Rob Herring wrote: >>>>>> On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: >>>>>> > Hi Rob, >>>>>> > >>>>>> > Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config >>>>>> > accessors"). pci_generic_config_write32() does a read/modify/write if the >>>>>> > size is less than 32 bits, so I think we have problem if this is used with >>>>>> > 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can >>>>>> > fix this? >>>>>> >>>>>> My series didn't change access sizes unless I made a made a mistake >>>>>> somewhere, so the problem should have existed before. >>>>>> >>>>>> Is it known addresses we need to deal with? We could do special case >>>>>> handling of certain addresses to mask out RW1C bits. This could be in >>>>>> the generic functions or in wrappers around the generic functions. >>>>>> There's already some similar examples of special address handling >>>>>> IIRC. >>>>> >>>>> I think this originally came into being because Intel decided that their >>>>> IOP platforms wouldn't support anything except 32-bit accesses, and >>>>> cargo cult programming took over. I complained about it on the Intel IOP >>>>> platforms, but obviously if the hardware doesn't support anything else >>>>> then your stuck with it. >>>> >>>> What about SA1100 nanoengine? >>>> >>>> Others are Tegra, XGene, iproc, and maybe rcar. These are all >>>> relatively active platforms, so hopefully some testing can be done. >>> >>> On X-Gene, so far I don't see anything strange after switching to use >>> pci_generic_config_write32 with all the cards that I can test. >> >> You mean switching to use pci_generic_config_write, right? > Hi Rob, > > I meant even using pci_generic_config_write32 , all the cards that I > have right now are still running fine. No doubt. I would hope your driver was already tested to some level. The question is can you switch to pci_generic_config_write? Is there some h/w bug like hanging the bus or garbage data that prevents 8 and 16 bit accesses from working? Rob ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: pci_generic_config_write32() access size problem 2015-09-30 21:53 ` Rob Herring @ 2015-10-01 7:46 ` Thierry Reding 2015-10-01 15:45 ` Ray Jui 2015-10-01 16:11 ` Russell King - ARM Linux 0 siblings, 2 replies; 10+ messages in thread From: Thierry Reding @ 2015-10-01 7:46 UTC (permalink / raw) To: Rob Herring Cc: Duc Dang, Russell King - ARM Linux, Ray Jui, Tanmay Inamdar, Scott Branden, Bjorn Helgaas, linux-pci@vger.kernel.org [-- Attachment #1: Type: text/plain, Size: 8201 bytes --] On Wed, Sep 30, 2015 at 04:53:00PM -0500, Rob Herring wrote: > On Wed, Sep 30, 2015 at 4:47 PM, Duc Dang <dhdang@apm.com> wrote: > > On Wed, Sep 30, 2015 at 2:30 PM, Rob Herring <robh@kernel.org> wrote: > >> On Wed, Sep 30, 2015 at 4:17 PM, Duc Dang <dhdang@apm.com> wrote: > >>> On Tue, Sep 29, 2015 at 9:47 AM, Rob Herring <robh@kernel.org> wrote: > >>>> + several affected driver maintainers > >>>> > >>>> On Mon, Sep 28, 2015 at 6:05 PM, Russell King - ARM Linux > >>>> <linux@arm.linux.org.uk> wrote: > >>>>> On Mon, Sep 28, 2015 at 05:42:20PM -0500, Rob Herring wrote: > >>>>>> On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: > >>>>>> > Hi Rob, > >>>>>> > > >>>>>> > Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config > >>>>>> > accessors"). pci_generic_config_write32() does a read/modify/write if the > >>>>>> > size is less than 32 bits, so I think we have problem if this is used with > >>>>>> > 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can > >>>>>> > fix this? > >>>>>> > >>>>>> My series didn't change access sizes unless I made a made a mistake > >>>>>> somewhere, so the problem should have existed before. > >>>>>> > >>>>>> Is it known addresses we need to deal with? We could do special case > >>>>>> handling of certain addresses to mask out RW1C bits. This could be in > >>>>>> the generic functions or in wrappers around the generic functions. > >>>>>> There's already some similar examples of special address handling > >>>>>> IIRC. > >>>>> > >>>>> I think this originally came into being because Intel decided that their > >>>>> IOP platforms wouldn't support anything except 32-bit accesses, and > >>>>> cargo cult programming took over. I complained about it on the Intel IOP > >>>>> platforms, but obviously if the hardware doesn't support anything else > >>>>> then your stuck with it. > >>>> > >>>> What about SA1100 nanoengine? > >>>> > >>>> Others are Tegra, XGene, iproc, and maybe rcar. These are all > >>>> relatively active platforms, so hopefully some testing can be done. > >>> > >>> On X-Gene, so far I don't see anything strange after switching to use > >>> pci_generic_config_write32 with all the cards that I can test. > >> > >> You mean switching to use pci_generic_config_write, right? > > Hi Rob, > > > > I meant even using pci_generic_config_write32 , all the cards that I > > have right now are still running fine. > > No doubt. I would hope your driver was already tested to some level. > > The question is can you switch to pci_generic_config_write? Is there > some h/w bug like hanging the bus or garbage data that prevents 8 and > 16 bit accesses from working? It seems like at least on Tegra there is a hardware bug that prevents 8 and 16 bit accesses from working. I've tried next-20151001 on Jetson TK1 (I have root on NFS, and the network card is on PCIe) and I see this: [ 1.535877] tegra-pcie 1003000.pcie-controller: probing port 0, using 2 lanes [ 1.545099] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000008 [ 1.985057] tegra-pcie 1003000.pcie-controller: link 0 down, retrying [ 2.420307] tegra-pcie 1003000.pcie-controller: link 0 down, retrying [ 2.856198] tegra-pcie 1003000.pcie-controller: link 0 down, retrying [ 2.864783] tegra-pcie 1003000.pcie-controller: link 0 down, ignoring [ 2.871296] tegra-pcie 1003000.pcie-controller: probing port 1, using 1 lanes [ 2.880794] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000000 [ 2.898441] tegra-pcie 1003000.pcie-controller: PCI host bridge to bus 0000:00 [ 2.905718] pci_bus 0000:00: root bus resource [mem 0x13000000-0x1fffffff] [ 2.912587] pci_bus 0000:00: root bus resource [mem 0x20000000-0x3fffffff pref] [ 2.919905] pci_bus 0000:00: root bus resource [bus 00-ff] [ 2.925386] pci_bus 0000:00: root bus resource [io 0x1000-0xffff] [ 2.931625] pci 0000:00:02.0: [10de:0e13] type 01 class 0x060400 [ 2.937733] pci 0000:00:02.0: PME# supported from D0 D1 D2 D3hot D3cold [ 2.945062] PCI: bus0: Fast back to back transfers disabled [ 2.950649] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring [ 2.959096] pci 0000:01:00.0: [10ec:8168] type 00 class 0x020000 [ 2.965148] pci 0000:01:00.0: reg 0x10: [io 0x0000-0x00ff] [ 2.970762] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00000fff 64bit] [ 2.977575] pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref] [ 2.984879] pci 0000:01:00.0: supports D1 D2 [ 2.989162] pci 0000:01:00.0: PME# supported from D0 D1 D2 D3hot D3cold [ 3.015680] PCI: bus1: Fast back to back transfers disabled [ 3.021261] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01 [ 3.028012] pci 0000:00:02.0: BAR 8: assigned [mem 0x13000000-0x130fffff] [ 3.034796] pci 0000:00:02.0: BAR 9: assigned [mem 0x20000000-0x200fffff 64bit pref] [ 3.042574] pci 0000:00:02.0: BAR 7: assigned [io 0x1000-0x1fff] [ 3.048715] pci 0000:01:00.0: BAR 4: assigned [mem 0x20000000-0x20003fff 64bit pref] [ 3.056491] pci 0000:01:00.0: BAR 2: assigned [mem 0x13000000-0x13000fff 64bit] [ 3.063807] pci 0000:01:00.0: BAR 0: assigned [io 0x1000-0x10ff] [ 3.070021] pci 0000:00:02.0: PCI bridge to [bus 01] [ 3.074984] pci 0000:00:02.0: bridge window [io 0x1000-0x1fff] [ 3.081093] pci 0000:00:02.0: bridge window [mem 0x13000000-0x130fffff] [ 3.087890] pci 0000:00:02.0: bridge window [mem 0x20000000-0x200fffff 64bit pref] [ 3.095668] pci 0000:00:02.0: nv_msi_ht_cap_quirk didn't locate host bridge [ 3.102813] pcieport 0000:00:02.0: enabling device (0140 -> 0143) [ 3.109455] pcieport 0000:00:02.0: Signaling PME through PCIe PME interrupt [ 3.116428] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt [ 3.122950] pcie_pme 0000:00:02.0:pcie01: service driver pcie_pme loaded [ 3.130024] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded [ 3.135670] r8169 0000:01:00.0: enabling device (0140 -> 0143) [ 3.158326] r8169 0000:01:00.0 eth0: RTL8168g/8111g at 0xf08f6000, 00:04:4b:1e:07:bb, XID 0c000800 IRQ 390 [ 3.168015] r8169 0000:01:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko] The network card is detected and boot continues normally. When I change the pci_ops to use pci_generic_config_{read,write} the system no longer boots because the network card never shows up. The kernel log has this: [ 1.546563] tegra-pcie 1003000.pcie-controller: probing port 0, using 2 lanes [ 1.556137] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000008 [ 1.996905] tegra-pcie 1003000.pcie-controller: link 0 down, retrying [ 2.432731] tegra-pcie 1003000.pcie-controller: link 0 down, retrying [ 2.868648] tegra-pcie 1003000.pcie-controller: link 0 down, retrying [ 2.877241] tegra-pcie 1003000.pcie-controller: link 0 down, ignoring [ 2.883707] tegra-pcie 1003000.pcie-controller: probing port 1, using 1 lanes [ 2.893239] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000000 [ 2.908770] tegra-pcie 1003000.pcie-controller: PCI host bridge to bus 0000:00 [ 2.916039] pci_bus 0000:00: root bus resource [mem 0x13000000-0x1fffffff] [ 2.922906] pci_bus 0000:00: root bus resource [mem 0x20000000-0x3fffffff pref] [ 2.930225] pci_bus 0000:00: root bus resource [bus 00-ff] [ 2.935721] pci_bus 0000:00: root bus resource [io 0x1000-0xffff] [ 2.941935] pci 0000:00:02.0: [10de:0e13] type 00 class 0x060400 [ 2.947958] pci 0000:00:02.0: ignoring class 0x060400 (doesn't match header type 00) [ 2.956292] PCI: bus0: Fast back to back transfers disabled So it seems like it's failing to properly read the header type. That's retrieved from byte 0xe as an 8 bit access. That said, even with 32-bit accesses I haven't seen any problems, but that might just be because none of the devices I've tested with have R1WC bits in problematic places. Thierry [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 819 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: pci_generic_config_write32() access size problem 2015-10-01 7:46 ` Thierry Reding @ 2015-10-01 15:45 ` Ray Jui 2015-10-01 16:11 ` Russell King - ARM Linux 1 sibling, 0 replies; 10+ messages in thread From: Ray Jui @ 2015-10-01 15:45 UTC (permalink / raw) To: Thierry Reding, Rob Herring Cc: Duc Dang, Russell King - ARM Linux, Tanmay Inamdar, Scott Branden, Bjorn Helgaas, linux-pci@vger.kernel.org On 10/1/2015 12:46 AM, Thierry Reding wrote: > On Wed, Sep 30, 2015 at 04:53:00PM -0500, Rob Herring wrote: >> On Wed, Sep 30, 2015 at 4:47 PM, Duc Dang <dhdang@apm.com> wrote: >>> On Wed, Sep 30, 2015 at 2:30 PM, Rob Herring <robh@kernel.org> wrote: >>>> On Wed, Sep 30, 2015 at 4:17 PM, Duc Dang <dhdang@apm.com> wrote: >>>>> On Tue, Sep 29, 2015 at 9:47 AM, Rob Herring <robh@kernel.org> wrote: >>>>>> + several affected driver maintainers >>>>>> >>>>>> On Mon, Sep 28, 2015 at 6:05 PM, Russell King - ARM Linux >>>>>> <linux@arm.linux.org.uk> wrote: >>>>>>> On Mon, Sep 28, 2015 at 05:42:20PM -0500, Rob Herring wrote: >>>>>>>> On Mon, Sep 28, 2015 at 5:08 PM, Bjorn Helgaas <helgaas@kernel.org> wrote: >>>>>>>>> Hi Rob, >>>>>>>>> >>>>>>>>> Russell pointed out a problem with 1f94a94f67e1 ("PCI: Add generic config >>>>>>>>> accessors"). pci_generic_config_write32() does a read/modify/write if the >>>>>>>>> size is less than 32 bits, so I think we have problem if this is used with >>>>>>>>> 8- or 16-bit registers that contain RW1C bits. Any thoughts on how we can >>>>>>>>> fix this? >>>>>>>> >>>>>>>> My series didn't change access sizes unless I made a made a mistake >>>>>>>> somewhere, so the problem should have existed before. >>>>>>>> >>>>>>>> Is it known addresses we need to deal with? We could do special case >>>>>>>> handling of certain addresses to mask out RW1C bits. This could be in >>>>>>>> the generic functions or in wrappers around the generic functions. >>>>>>>> There's already some similar examples of special address handling >>>>>>>> IIRC. >>>>>>> >>>>>>> I think this originally came into being because Intel decided that their >>>>>>> IOP platforms wouldn't support anything except 32-bit accesses, and >>>>>>> cargo cult programming took over. I complained about it on the Intel IOP >>>>>>> platforms, but obviously if the hardware doesn't support anything else >>>>>>> then your stuck with it. >>>>>> >>>>>> What about SA1100 nanoengine? >>>>>> >>>>>> Others are Tegra, XGene, iproc, and maybe rcar. These are all >>>>>> relatively active platforms, so hopefully some testing can be done. >>>>> >>>>> On X-Gene, so far I don't see anything strange after switching to use >>>>> pci_generic_config_write32 with all the cards that I can test. >>>> >>>> You mean switching to use pci_generic_config_write, right? >>> Hi Rob, >>> >>> I meant even using pci_generic_config_write32 , all the cards that I >>> have right now are still running fine. >> >> No doubt. I would hope your driver was already tested to some level. >> >> The question is can you switch to pci_generic_config_write? Is there >> some h/w bug like hanging the bus or garbage data that prevents 8 and >> 16 bit accesses from working? > > It seems like at least on Tegra there is a hardware bug that prevents 8 > and 16 bit accesses from working. We have the same issue with iProc based PCIe controllers. The access needs to be 32-bit aligned. > I've tried next-20151001 on Jetson TK1 > (I have root on NFS, and the network card is on PCIe) and I see this: > > [ 1.535877] tegra-pcie 1003000.pcie-controller: probing port 0, using 2 lanes > [ 1.545099] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000008 > [ 1.985057] tegra-pcie 1003000.pcie-controller: link 0 down, retrying > [ 2.420307] tegra-pcie 1003000.pcie-controller: link 0 down, retrying > [ 2.856198] tegra-pcie 1003000.pcie-controller: link 0 down, retrying > [ 2.864783] tegra-pcie 1003000.pcie-controller: link 0 down, ignoring > [ 2.871296] tegra-pcie 1003000.pcie-controller: probing port 1, using 1 lanes > [ 2.880794] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000000 > [ 2.898441] tegra-pcie 1003000.pcie-controller: PCI host bridge to bus 0000:00 > [ 2.905718] pci_bus 0000:00: root bus resource [mem 0x13000000-0x1fffffff] > [ 2.912587] pci_bus 0000:00: root bus resource [mem 0x20000000-0x3fffffff pref] > [ 2.919905] pci_bus 0000:00: root bus resource [bus 00-ff] > [ 2.925386] pci_bus 0000:00: root bus resource [io 0x1000-0xffff] > [ 2.931625] pci 0000:00:02.0: [10de:0e13] type 01 class 0x060400 > [ 2.937733] pci 0000:00:02.0: PME# supported from D0 D1 D2 D3hot D3cold > [ 2.945062] PCI: bus0: Fast back to back transfers disabled > [ 2.950649] pci 0000:00:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring > [ 2.959096] pci 0000:01:00.0: [10ec:8168] type 00 class 0x020000 > [ 2.965148] pci 0000:01:00.0: reg 0x10: [io 0x0000-0x00ff] > [ 2.970762] pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00000fff 64bit] > [ 2.977575] pci 0000:01:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref] > [ 2.984879] pci 0000:01:00.0: supports D1 D2 > [ 2.989162] pci 0000:01:00.0: PME# supported from D0 D1 D2 D3hot D3cold > [ 3.015680] PCI: bus1: Fast back to back transfers disabled > [ 3.021261] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01 > [ 3.028012] pci 0000:00:02.0: BAR 8: assigned [mem 0x13000000-0x130fffff] > [ 3.034796] pci 0000:00:02.0: BAR 9: assigned [mem 0x20000000-0x200fffff 64bit pref] > [ 3.042574] pci 0000:00:02.0: BAR 7: assigned [io 0x1000-0x1fff] > [ 3.048715] pci 0000:01:00.0: BAR 4: assigned [mem 0x20000000-0x20003fff 64bit pref] > [ 3.056491] pci 0000:01:00.0: BAR 2: assigned [mem 0x13000000-0x13000fff 64bit] > [ 3.063807] pci 0000:01:00.0: BAR 0: assigned [io 0x1000-0x10ff] > [ 3.070021] pci 0000:00:02.0: PCI bridge to [bus 01] > [ 3.074984] pci 0000:00:02.0: bridge window [io 0x1000-0x1fff] > [ 3.081093] pci 0000:00:02.0: bridge window [mem 0x13000000-0x130fffff] > [ 3.087890] pci 0000:00:02.0: bridge window [mem 0x20000000-0x200fffff 64bit pref] > [ 3.095668] pci 0000:00:02.0: nv_msi_ht_cap_quirk didn't locate host bridge > [ 3.102813] pcieport 0000:00:02.0: enabling device (0140 -> 0143) > [ 3.109455] pcieport 0000:00:02.0: Signaling PME through PCIe PME interrupt > [ 3.116428] pci 0000:01:00.0: Signaling PME through PCIe PME interrupt > [ 3.122950] pcie_pme 0000:00:02.0:pcie01: service driver pcie_pme loaded > [ 3.130024] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded > [ 3.135670] r8169 0000:01:00.0: enabling device (0140 -> 0143) > [ 3.158326] r8169 0000:01:00.0 eth0: RTL8168g/8111g at 0xf08f6000, 00:04:4b:1e:07:bb, XID 0c000800 IRQ 390 > [ 3.168015] r8169 0000:01:00.0 eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko] > > The network card is detected and boot continues normally. When I change > the pci_ops to use pci_generic_config_{read,write} the system no longer > boots because the network card never shows up. The kernel log has this: > > [ 1.546563] tegra-pcie 1003000.pcie-controller: probing port 0, using 2 lanes > [ 1.556137] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000008 > [ 1.996905] tegra-pcie 1003000.pcie-controller: link 0 down, retrying > [ 2.432731] tegra-pcie 1003000.pcie-controller: link 0 down, retrying > [ 2.868648] tegra-pcie 1003000.pcie-controller: link 0 down, retrying > [ 2.877241] tegra-pcie 1003000.pcie-controller: link 0 down, ignoring > [ 2.883707] tegra-pcie 1003000.pcie-controller: probing port 1, using 1 lanes > [ 2.893239] tegra-pcie 1003000.pcie-controller: Slot present pin change, signature: 00000000 > [ 2.908770] tegra-pcie 1003000.pcie-controller: PCI host bridge to bus 0000:00 > [ 2.916039] pci_bus 0000:00: root bus resource [mem 0x13000000-0x1fffffff] > [ 2.922906] pci_bus 0000:00: root bus resource [mem 0x20000000-0x3fffffff pref] > [ 2.930225] pci_bus 0000:00: root bus resource [bus 00-ff] > [ 2.935721] pci_bus 0000:00: root bus resource [io 0x1000-0xffff] > [ 2.941935] pci 0000:00:02.0: [10de:0e13] type 00 class 0x060400 > [ 2.947958] pci 0000:00:02.0: ignoring class 0x060400 (doesn't match header type 00) > [ 2.956292] PCI: bus0: Fast back to back transfers disabled > > So it seems like it's failing to properly read the header type. That's > retrieved from byte 0xe as an 8 bit access. That said, even with 32-bit > accesses I haven't seen any problems, but that might just be because > none of the devices I've tested with have R1WC bits in problematic > places. We have not seen any issue so far. The devices we tested include Intel e1000e based gigabit NIC and Broadcom BCM4356 5G WiFi card. But yeah, I'm not sure if it ever hits the R1WC bits issue in our test cases. > > Thierry > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: pci_generic_config_write32() access size problem 2015-10-01 7:46 ` Thierry Reding 2015-10-01 15:45 ` Ray Jui @ 2015-10-01 16:11 ` Russell King - ARM Linux 1 sibling, 0 replies; 10+ messages in thread From: Russell King - ARM Linux @ 2015-10-01 16:11 UTC (permalink / raw) To: Thierry Reding Cc: Rob Herring, Duc Dang, Ray Jui, Tanmay Inamdar, Scott Branden, Bjorn Helgaas, linux-pci@vger.kernel.org On Thu, Oct 01, 2015 at 09:46:59AM +0200, Thierry Reding wrote: > So it seems like it's failing to properly read the header type. That's > retrieved from byte 0xe as an 8 bit access. That said, even with 32-bit > accesses I haven't seen any problems, but that might just be because > none of the devices I've tested with have R1WC bits in problematic > places. As I've explained, the registers which are most likely to be affected in PCIe are: - PCI status register - PCIe device status register - PCIe error status(es) and possibly others. Some of these registers are accessed by the PCIe AER driver. Here's the problem: how do you know whether the PCIe on your board is operating correctly if your PCI host driver goes around writing '1's to the RW1C bits in these status registers, thereby clearing the error status(es)? It may appear to be working correctly, but generating a number of correctable errors. Also, as I've pointed out, how do you know whether 8-bit or 16-bit registers neighbouring the register you're accessing are safe to be written, especially those in the space above 0x40, where vendors are allowed to place their own vendor specific registers - this is the bigger problem. Yes, you can get away with saying "the cards I have in front of me work today" but as long as the bus doesn't support 8-bit or 16-bit accesses, you're at the mercy of these issues, because what you have is an implementation which doesn't conform to the PCI specifications. The point that I initially raised was that (IMHO) generic PCI code should not be making it easy to make "mistakes" like always accessing config space via 32-bit accesses, at least not without taking some counter-measures against it - even at a basic level of detecting a write to the PCI command register and zeroing the upper 16-bits of the 32-bit quantity it's about to write back. Maybe also looking up the PCI device, and then its capabilities, and doing a similar thing for the PCIe status registers containing RW1C bits. Right now, it's all too easy to just plug the 32-bit accessors in without thinking, and without even checking or testing whether the PCI conformant accessors should be used. It seems that the Marvell platforms have suffered from something like this: the later hardware definitely can (it's specified to) but it's not mentioned whether or not they're supported in older hardware - but practical tests have shown that the 8-bit and 16-bit accesses are supported there too. However, the driver opted to just use the 32-bit accessors anyway. I'd hope that having the code more complex behind the 32-bit accessors to at least try a little harder to avoid clearing these RW1C bits will be enough to make people think twice about using them - especially if they have to walk the list of PCI devices on the bus to try and find the appropriate PCI device struct corresponding to the requested devfn. Those who are dealing with non-conformant hardware then get a slight performance impact when accessing PCI configuration space, which I think is entirely justified given that it _is_ non-conformant hardware. -- FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up according to speedtest.net. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2015-10-01 16:11 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-09-28 22:08 pci_generic_config_write32() access size problem Bjorn Helgaas 2015-09-28 22:42 ` Rob Herring 2015-09-28 23:05 ` Russell King - ARM Linux 2015-09-29 16:47 ` Rob Herring 2015-09-29 17:06 ` Russell King - ARM Linux [not found] ` <CADaLNDndQ6qZ2yuGuff6=cTm565in5e75AjkcX8MJmH2BCwyEw@mail.gmail.com> 2015-09-30 21:30 ` Rob Herring [not found] ` <CADaLNDmjv+SkRxtGDToUTxo4DKFF+yd2LRmte=k4-HRsGOHYVg@mail.gmail.com> 2015-09-30 21:53 ` Rob Herring 2015-10-01 7:46 ` Thierry Reding 2015-10-01 15:45 ` Ray Jui 2015-10-01 16:11 ` Russell King - ARM Linux
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).