* [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan @ 2014-03-07 14:33 Andreas Noever 2014-03-07 16:45 ` Bjorn Helgaas 0 siblings, 1 reply; 9+ messages in thread From: Andreas Noever @ 2014-03-07 14:33 UTC (permalink / raw) To: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Bjorn Helgaas Hi, After upgrading to the latest RC I noticed that suprise removal stopped working. Linux did not notice that the devices where gone. Bisection points to 1f42db786b14a31bf807fc41ee5583a00c08fcb1 PCI: Enable INTx if BIOS left them disabled http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1f42db786b14a31bf807fc41ee5583a00c08fcb1 It seems that the above patch is triggered on the bridge itself (!) when a new device is added below it. At this point the hotplug driver for the bridge has already enabled MSI. Reenabling INTX kills MSI and prevents later suprise removal notifications. The following stacktrace is from a "echo 1 > rescan" on the bridge: do_pci_enable_device+0x59/0x140 pci_reenable_device+0x1f/0x30 pci_assign_unassigned_bridge_resources+0x123/0x160 pci_rescan_bus_bridge_resize+0x28/0x40 dev_bus_rescan_store+0x85/0xa0 dev_attr_store+0x18/0x30 sysfs_kf_write+0x3d/0x50 kernfs_fop_write+0xd2/0x140 vfs_write+0xba/0x1e0 SyS_write+0x49/0xa0 system_call_fastpath+0x1a/0x1f Similarly for a hotplug event: do_pci_enable_device+0x59/0x140 pci_reenable_device+0x1f/0x30 pci_assign_unassigned_bridge_resources+0x123/0x160 pciehp_configure_device+0x90/0x160 pciehp_enable_slot+0x163/0x280 pciehp_power_thread+0xb8/0xe0 process_one_work+0x167/0x420 worker_thread+0x121/0x3a0 In both cases DisINTx is turned off and the hotplug driver stops reacting to events. Andreas ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan 2014-03-07 14:33 [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan Andreas Noever @ 2014-03-07 16:45 ` Bjorn Helgaas 2014-03-07 19:39 ` Yinghai Lu 0 siblings, 1 reply; 9+ messages in thread From: Bjorn Helgaas @ 2014-03-07 16:45 UTC (permalink / raw) To: Andreas Noever Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Yinghai Lu, Rajat Jain [+cc Yinghai, Rajat] On Fri, Mar 7, 2014 at 7:33 AM, Andreas Noever <andreas.noever@gmail.com> wrote: > Hi, > > After upgrading to the latest RC I noticed that suprise removal > stopped working. Linux did not notice that the devices where gone. > Bisection points to > > 1f42db786b14a31bf807fc41ee5583a00c08fcb1 PCI: Enable INTx if BIOS left > them disabled > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1f42db786b14a31bf807fc41ee5583a00c08fcb1 > > It seems that the above patch is triggered on the bridge itself (!) > when a new device is added below it. At this point the hotplug driver > for the bridge has already enabled MSI. Reenabling INTX kills MSI and > prevents later suprise removal notifications. > > The following stacktrace is from a "echo 1 > rescan" on the bridge: > do_pci_enable_device+0x59/0x140 > pci_reenable_device+0x1f/0x30 > pci_assign_unassigned_bridge_resources+0x123/0x160 > pci_rescan_bus_bridge_resize+0x28/0x40 > dev_bus_rescan_store+0x85/0xa0 > dev_attr_store+0x18/0x30 > sysfs_kf_write+0x3d/0x50 > kernfs_fop_write+0xd2/0x140 > vfs_write+0xba/0x1e0 > SyS_write+0x49/0xa0 > system_call_fastpath+0x1a/0x1f > > Similarly for a hotplug event: > do_pci_enable_device+0x59/0x140 > pci_reenable_device+0x1f/0x30 > pci_assign_unassigned_bridge_resources+0x123/0x160 > pciehp_configure_device+0x90/0x160 > pciehp_enable_slot+0x163/0x280 > pciehp_power_thread+0xb8/0xe0 > process_one_work+0x167/0x420 > worker_thread+0x121/0x3a0 > > In both cases DisINTx is turned off and the hotplug driver stops > reacting to events. Thanks a lot for noticing and bisecting this! I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691 It seems like clearing DisINTx has some effect on MSI. I don't see anything in the spec that would suggest this (I'm looking at the PCIe r3.0 spec, sec 7.5.1.1). Can somebody point out a connection between DisINTx and MSI? If not, maybe we'll need some sort of quirk to deal with this. Andrea, can you attach a complete dmesg log and "lspci -vv" output to the bugzilla? Thanks again! Bjorn ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan 2014-03-07 16:45 ` Bjorn Helgaas @ 2014-03-07 19:39 ` Yinghai Lu 2014-03-08 1:04 ` Bjorn Helgaas 0 siblings, 1 reply; 9+ messages in thread From: Yinghai Lu @ 2014-03-07 19:39 UTC (permalink / raw) To: Bjorn Helgaas Cc: Andreas Noever, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Rajat Jain [-- Attachment #1: Type: text/plain, Size: 2450 bytes --] On Fri, Mar 7, 2014 at 8:45 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > > I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691 > > It seems like clearing DisINTx has some effect on MSI. I don't see > anything in the spec that would suggest this (I'm looking at the PCIe > r3.0 spec, sec 7.5.1.1). > > Can somebody point out a connection between DisINTx and MSI? If not, > maybe we'll need some sort of quirk to deal with this. I had different impression: if you disable INTx in some chipset, MSI will not work anymore. so we have static void pci_intx_for_msi(struct pci_dev *dev, int enable) { if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG)) pci_intx(dev, enable); } and have quirks for ati and broadcom chip to set that FLAG. regarding the regression: i would suggest move out do_pci_enable_intx() from re-enable path. diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 5a24cb3..92718c9 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct pci_dev *dev, int bars) return pci_enable_resources(dev, bars); } +static void do_pci_enable_intx(struct pci_dev *dev) +{ + u8 pin; + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); + if (pin) { + pci_read_config_word(dev, PCI_COMMAND, &cmd); + if (cmd & PCI_COMMAND_INTX_DISABLE) + pci_write_config_word(dev, PCI_COMMAND, + cmd & ~PCI_COMMAND_INTX_DISABLE); + } +} + static int do_pci_enable_device(struct pci_dev *dev, int bars) { int err; u16 cmd; - u8 pin; err = pci_set_power_state(dev, PCI_D0); if (err < 0 && err != -EIO) @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars) return err; pci_fixup_device(pci_fixup_enable, dev); - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); - if (pin) { - pci_read_config_word(dev, PCI_COMMAND, &cmd); - if (cmd & PCI_COMMAND_INTX_DISABLE) - pci_write_config_word(dev, PCI_COMMAND, - cmd & ~PCI_COMMAND_INTX_DISABLE); - } - return 0; } @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) err = do_pci_enable_device(dev, bars); if (err < 0) atomic_dec(&dev->enable_cnt); + else + do_pci_enable_intx(dev); return err; } [-- Attachment #2: do_not_enable_intx_again.patch --] [-- Type: text/x-patch, Size: 1413 bytes --] diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 5a24cb3..92718c9 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct pci_dev *dev, int bars) return pci_enable_resources(dev, bars); } +static void do_pci_enable_intx(struct pci_dev *dev) +{ + u8 pin; + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); + if (pin) { + pci_read_config_word(dev, PCI_COMMAND, &cmd); + if (cmd & PCI_COMMAND_INTX_DISABLE) + pci_write_config_word(dev, PCI_COMMAND, + cmd & ~PCI_COMMAND_INTX_DISABLE); + } +} + static int do_pci_enable_device(struct pci_dev *dev, int bars) { int err; u16 cmd; - u8 pin; err = pci_set_power_state(dev, PCI_D0); if (err < 0 && err != -EIO) @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars) return err; pci_fixup_device(pci_fixup_enable, dev); - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); - if (pin) { - pci_read_config_word(dev, PCI_COMMAND, &cmd); - if (cmd & PCI_COMMAND_INTX_DISABLE) - pci_write_config_word(dev, PCI_COMMAND, - cmd & ~PCI_COMMAND_INTX_DISABLE); - } - return 0; } @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) err = do_pci_enable_device(dev, bars); if (err < 0) atomic_dec(&dev->enable_cnt); + else + do_pci_enable_intx(dev); return err; } ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan 2014-03-07 19:39 ` Yinghai Lu @ 2014-03-08 1:04 ` Bjorn Helgaas 2014-03-08 9:55 ` Andreas Noever 0 siblings, 1 reply; 9+ messages in thread From: Bjorn Helgaas @ 2014-03-08 1:04 UTC (permalink / raw) To: Yinghai Lu Cc: Andreas Noever, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki [+cc Rafael] On Fri, Mar 07, 2014 at 11:39:48AM -0800, Yinghai Lu wrote: > On Fri, Mar 7, 2014 at 8:45 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > > > > I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691 > > > > It seems like clearing DisINTx has some effect on MSI. I don't see > > anything in the spec that would suggest this (I'm looking at the PCIe > > r3.0 spec, sec 7.5.1.1). > > > > Can somebody point out a connection between DisINTx and MSI? If not, > > maybe we'll need some sort of quirk to deal with this. > > I had different impression: if you disable INTx in some chipset, MSI will not > work anymore. > > so we have > > static void pci_intx_for_msi(struct pci_dev *dev, int enable) > { > if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG)) > pci_intx(dev, enable); > } > > and have quirks for ati and broadcom chip to set that FLAG. Setting INTX_DISABLE on some chipsets seems to disable MSI, and that behavior seems to be a hardware defect (see ba698ad4b7e4 "PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.") Andreas has a device where *clearing* INTX_DISABLE seems to disable MSI. Do you think that's also a hardware defect? If it's not a defect, is there something in the spec that explains why that happens? > regarding the regression: i would suggest move out > do_pci_enable_intx() from re-enable path. Today we have this: pcie_portdrv_probe pci_enable_device # clears INTX_DISABLE pci_enable_msi # sets INTX_DISABLE pciehp_configure_device pci_reenable_device # clears INTX_DISABLE again This is clearly not the intent; we set INTX_DISABLE when MSI was enabled, then we clear it again later even though MSI is still enabled. Maybe we should just leave INTX_DISABLE alone if (dev->msi_enabled || dev->msix_enabled). pci_reenable_device() is also used in the device resume path. I don't know what should happen there. But I'm curious about why we set INTX_DISABLE when enabling MSI/MSI-X in the first place. Per the PCI 3.0 spec, sec 6.8.3.3: While enabled for MSI or MSI-X operation, a function is prohibited from using its INTx# pin (if implemented) to request service (MSI, MSI-X, and INTx# are mutually exclusive). This suggests that we might not need to touch INTX_DISABLE when we're enabling MSI/MSI-X. I looked at these commits related to it: ba698ad4b7e4 PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set. b1cbf4e4dddd msi: fix up the msi enable/disable logic 1769b46a3ed9 PCI MSI: always toggle legacy-INTx-enable bit upon MSI entry/exit 986162d3239a ia32 Message Signalled Interrupt support and none of them mentions a problem that requires us to set INTX_DISABLE. It's possible that we're causing ourselves trouble by being overly defensive. I wonder what would happen if we stopped fiddling with it in the MSI/MSI-X paths, e.g., something like this (just as an experiment, of course): diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 7a0fec6ce571..9ef7bd608add 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -442,8 +442,6 @@ static struct msi_desc *alloc_msi_entry(struct pci_dev *dev) static void pci_intx_for_msi(struct pci_dev *dev, int enable) { - if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG)) - pci_intx(dev, enable); } static void __pci_restore_msi_state(struct pci_dev *dev) If we did that, INTX_DISABLE would be cleared by the first pci_enable_device() and pci_reenable_device() wouldn't do anything, leaving it cleared. The resulting state (cleared) would be the same, but the transitions would be gone, and maybe those are important. The thing I don't like about the patch below is that it's magical: the code doesn't have any obvious connection with the problem. How would one deduce that this is necessary, or explain why it's necessary? A changelog like "this makes things work" is not really very useful. If we make a change like this, it needs to be connected with MSI/MSI-X somehow so a reader can figure out why we twiddle INTX_DISABLE in the enable path but not the reenable path. Bjorn > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 5a24cb3..92718c9 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct > pci_dev *dev, int bars) > return pci_enable_resources(dev, bars); > } > > +static void do_pci_enable_intx(struct pci_dev *dev) > +{ > + u8 pin; > + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); > + if (pin) { > + pci_read_config_word(dev, PCI_COMMAND, &cmd); > + if (cmd & PCI_COMMAND_INTX_DISABLE) > + pci_write_config_word(dev, PCI_COMMAND, > + cmd & ~PCI_COMMAND_INTX_DISABLE); > + } > +} > + > static int do_pci_enable_device(struct pci_dev *dev, int bars) > { > int err; > u16 cmd; > - u8 pin; > > err = pci_set_power_state(dev, PCI_D0); > if (err < 0 && err != -EIO) > @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev > *dev, int bars) > return err; > pci_fixup_device(pci_fixup_enable, dev); > > - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); > - if (pin) { > - pci_read_config_word(dev, PCI_COMMAND, &cmd); > - if (cmd & PCI_COMMAND_INTX_DISABLE) > - pci_write_config_word(dev, PCI_COMMAND, > - cmd & ~PCI_COMMAND_INTX_DISABLE); > - } > - > return 0; > } > > @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct > pci_dev *dev, unsigned long flags) > err = do_pci_enable_device(dev, bars); > if (err < 0) > atomic_dec(&dev->enable_cnt); > + else > + do_pci_enable_intx(dev); > return err; > } > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 5a24cb3..92718c9 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct pci_dev *dev, int bars) > return pci_enable_resources(dev, bars); > } > > +static void do_pci_enable_intx(struct pci_dev *dev) > +{ > + u8 pin; > + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); > + if (pin) { > + pci_read_config_word(dev, PCI_COMMAND, &cmd); > + if (cmd & PCI_COMMAND_INTX_DISABLE) > + pci_write_config_word(dev, PCI_COMMAND, > + cmd & ~PCI_COMMAND_INTX_DISABLE); > + } > +} > + > static int do_pci_enable_device(struct pci_dev *dev, int bars) > { > int err; > u16 cmd; > - u8 pin; > > err = pci_set_power_state(dev, PCI_D0); > if (err < 0 && err != -EIO) > @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars) > return err; > pci_fixup_device(pci_fixup_enable, dev); > > - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); > - if (pin) { > - pci_read_config_word(dev, PCI_COMMAND, &cmd); > - if (cmd & PCI_COMMAND_INTX_DISABLE) > - pci_write_config_word(dev, PCI_COMMAND, > - cmd & ~PCI_COMMAND_INTX_DISABLE); > - } > - > return 0; > } > > @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) > err = do_pci_enable_device(dev, bars); > if (err < 0) > atomic_dec(&dev->enable_cnt); > + else > + do_pci_enable_intx(dev); > return err; > } > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan 2014-03-08 1:04 ` Bjorn Helgaas @ 2014-03-08 9:55 ` Andreas Noever 2014-03-10 20:43 ` Bjorn Helgaas 0 siblings, 1 reply; 9+ messages in thread From: Andreas Noever @ 2014-03-08 9:55 UTC (permalink / raw) To: Bjorn Helgaas Cc: Yinghai Lu, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > [+cc Rafael] > > On Fri, Mar 07, 2014 at 11:39:48AM -0800, Yinghai Lu wrote: >> On Fri, Mar 7, 2014 at 8:45 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> > >> > I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691 >> > >> > It seems like clearing DisINTx has some effect on MSI. I don't see >> > anything in the spec that would suggest this (I'm looking at the PCIe >> > r3.0 spec, sec 7.5.1.1). >> > >> > Can somebody point out a connection between DisINTx and MSI? If not, >> > maybe we'll need some sort of quirk to deal with this. >> >> I had different impression: if you disable INTx in some chipset, MSI will not >> work anymore. >> >> so we have >> >> static void pci_intx_for_msi(struct pci_dev *dev, int enable) >> { >> if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG)) >> pci_intx(dev, enable); >> } >> >> and have quirks for ati and broadcom chip to set that FLAG. > > Setting INTX_DISABLE on some chipsets seems to disable MSI, and that > behavior seems to be a hardware defect (see ba698ad4b7e4 "PCI: Add > quirk for devices which disable MSI when INTX_DISABLE is set.") > > Andreas has a device where *clearing* INTX_DISABLE seems to disable > MSI. Do you think that's also a hardware defect? If it's not a > defect, is there something in the spec that explains why that happens? > >> regarding the regression: i would suggest move out >> do_pci_enable_intx() from re-enable path. > > Today we have this: > > pcie_portdrv_probe > pci_enable_device # clears INTX_DISABLE > pci_enable_msi # sets INTX_DISABLE > > pciehp_configure_device > pci_reenable_device # clears INTX_DISABLE again > > This is clearly not the intent; we set INTX_DISABLE when MSI was > enabled, then we clear it again later even though MSI is still > enabled. Maybe we should just leave INTX_DISABLE alone if > (dev->msi_enabled || dev->msix_enabled). > > pci_reenable_device() is also used in the device resume path. I don't > know what should happen there. > > But I'm curious about why we set INTX_DISABLE when enabling MSI/MSI-X > in the first place. Per the PCI 3.0 spec, sec 6.8.3.3: > > While enabled for MSI or MSI-X operation, a function is prohibited > from using its INTx# pin (if implemented) to request service (MSI, > MSI-X, and INTx# are mutually exclusive). > > This suggests that we might not need to touch INTX_DISABLE when we're > enabling MSI/MSI-X. I looked at these commits related to it: > > ba698ad4b7e4 PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set. > b1cbf4e4dddd msi: fix up the msi enable/disable logic > 1769b46a3ed9 PCI MSI: always toggle legacy-INTx-enable bit upon MSI entry/exit > 986162d3239a ia32 Message Signalled Interrupt support > > and none of them mentions a problem that requires us to set > INTX_DISABLE. It's possible that we're causing ourselves trouble by > being overly defensive. I wonder what would happen if we stopped > fiddling with it in the MSI/MSI-X paths, e.g., something like this > (just as an experiment, of course): > > diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c > index 7a0fec6ce571..9ef7bd608add 100644 > --- a/drivers/pci/msi.c > +++ b/drivers/pci/msi.c > @@ -442,8 +442,6 @@ static struct msi_desc *alloc_msi_entry(struct pci_dev *dev) > > static void pci_intx_for_msi(struct pci_dev *dev, int enable) > { > - if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG)) > - pci_intx(dev, enable); > } > > static void __pci_restore_msi_state(struct pci_dev *dev) > > If we did that, INTX_DISABLE would be cleared by the first > pci_enable_device() and pci_reenable_device() wouldn't do anything, > leaving it cleared. The resulting state (cleared) would be the same, > but the transitions would be gone, and maybe those are important. Just a quick note: With pci_intx_for_msi removed no hotplug events are ever delivered. Everything else still works though. So it is either a problem specific to Thunderbolt bridges or maybe it just affects hotplug (and PME?) interrupts. I also attempted booting with pcie_hp=nomsi and now everything works. Interestingly pciehp now also gets an interrupt from 09 (event though that card has just been removed). I suspect this is just pciehp not noticing that it itself is gone. pciehp 0000:06:03.0:pcie24: Card not present on Slot(3-1) pciehp 0000:09:00.0:pcie24: Latch open on Slot(9) pciehp 0000:09:00.0:pcie24: Button pressed on Slot(9) pciehp 0000:09:00.0:pcie24: Card present on Slot(9) pciehp 0000:09:00.0:pcie24: Power fault on slot 9 pciehp 0000:09:00.0:pcie24: Power fault bit 0 set pciehp 0000:09:00.0:pcie24: PCI slot #9 - powering on due to button press. pciehp 0000:09:00.0:pcie24: unloading service driver pciehp pciehp 0000:09:00.0:pcie24: Link Training Error occurs pciehp 0000:09:00.0:pcie24: Failed to check link status > The thing I don't like about the patch below is that it's magical: the > code doesn't have any obvious connection with the problem. How would > one deduce that this is necessary, or explain why it's necessary? A > changelog like "this makes things work" is not really very useful. If > we make a change like this, it needs to be connected with MSI/MSI-X > somehow so a reader can figure out why we twiddle INTX_DISABLE in the > enable path but not the reenable path. > > Bjorn > >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index 5a24cb3..92718c9 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct >> pci_dev *dev, int bars) >> return pci_enable_resources(dev, bars); >> } >> >> +static void do_pci_enable_intx(struct pci_dev *dev) >> +{ >> + u8 pin; >> + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); >> + if (pin) { >> + pci_read_config_word(dev, PCI_COMMAND, &cmd); >> + if (cmd & PCI_COMMAND_INTX_DISABLE) >> + pci_write_config_word(dev, PCI_COMMAND, >> + cmd & ~PCI_COMMAND_INTX_DISABLE); >> + } >> +} >> + >> static int do_pci_enable_device(struct pci_dev *dev, int bars) >> { >> int err; >> u16 cmd; >> - u8 pin; >> >> err = pci_set_power_state(dev, PCI_D0); >> if (err < 0 && err != -EIO) >> @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev >> *dev, int bars) >> return err; >> pci_fixup_device(pci_fixup_enable, dev); >> >> - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); >> - if (pin) { >> - pci_read_config_word(dev, PCI_COMMAND, &cmd); >> - if (cmd & PCI_COMMAND_INTX_DISABLE) >> - pci_write_config_word(dev, PCI_COMMAND, >> - cmd & ~PCI_COMMAND_INTX_DISABLE); >> - } >> - >> return 0; >> } >> >> @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct >> pci_dev *dev, unsigned long flags) >> err = do_pci_enable_device(dev, bars); >> if (err < 0) >> atomic_dec(&dev->enable_cnt); >> + else >> + do_pci_enable_intx(dev); >> return err; >> } > >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index 5a24cb3..92718c9 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct pci_dev *dev, int bars) >> return pci_enable_resources(dev, bars); >> } >> >> +static void do_pci_enable_intx(struct pci_dev *dev) >> +{ >> + u8 pin; >> + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); >> + if (pin) { >> + pci_read_config_word(dev, PCI_COMMAND, &cmd); >> + if (cmd & PCI_COMMAND_INTX_DISABLE) >> + pci_write_config_word(dev, PCI_COMMAND, >> + cmd & ~PCI_COMMAND_INTX_DISABLE); >> + } >> +} >> + >> static int do_pci_enable_device(struct pci_dev *dev, int bars) >> { >> int err; >> u16 cmd; >> - u8 pin; >> >> err = pci_set_power_state(dev, PCI_D0); >> if (err < 0 && err != -EIO) >> @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars) >> return err; >> pci_fixup_device(pci_fixup_enable, dev); >> >> - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); >> - if (pin) { >> - pci_read_config_word(dev, PCI_COMMAND, &cmd); >> - if (cmd & PCI_COMMAND_INTX_DISABLE) >> - pci_write_config_word(dev, PCI_COMMAND, >> - cmd & ~PCI_COMMAND_INTX_DISABLE); >> - } >> - >> return 0; >> } >> >> @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags) >> err = do_pci_enable_device(dev, bars); >> if (err < 0) >> atomic_dec(&dev->enable_cnt); >> + else >> + do_pci_enable_intx(dev); >> return err; >> } >> > ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan 2014-03-08 9:55 ` Andreas Noever @ 2014-03-10 20:43 ` Bjorn Helgaas [not found] ` <CAMxnaaWr0z1zbpFQPX6UvYbnxhfA+3aj4ffhCBpp1i4ZLsGTPg@mail.gmail.com> 0 siblings, 1 reply; 9+ messages in thread From: Bjorn Helgaas @ 2014-03-10 20:43 UTC (permalink / raw) To: Andreas Noever Cc: Yinghai Lu, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote: > On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> If we did that, INTX_DISABLE would be cleared by the first >> pci_enable_device() and pci_reenable_device() wouldn't do anything, >> leaving it cleared. The resulting state (cleared) would be the same, >> but the transitions would be gone, and maybe those are important. > Just a quick note: With pci_intx_for_msi removed no hotplug events are > ever delivered. Everything else still works though. So it is either a > problem specific to Thunderbolt bridges or maybe it just affects > hotplug (and PME?) interrupts. Interesting. This is on a MacBook, isn't it? If you have Mac OS on it, is there a way you can do the equivalent of lspci on it? I'm curious about whether it sets INTx_DISABLE when it enables MSI. I still haven't found any indication that INTx_DISABLE is intended or required as part of enabling MSI/MSI-X, so I'm quite dubious about Linux using it that way. The references I've seen, e.g., http://www.pcisig.com/reflector/msg05301.html, http://www.pcisig.com/reflector/msg05302.html, say the purpose is to better manage shared IRQs. > I also attempted booting with pcie_hp=nomsi and now everything works. > Interestingly pciehp now also gets an interrupt from 09 (event though > that card has just been removed). I suspect this is just pciehp not > noticing that it itself is gone. > pciehp 0000:06:03.0:pcie24: Card not present on Slot(3-1) > pciehp 0000:09:00.0:pcie24: Latch open on Slot(9) > pciehp 0000:09:00.0:pcie24: Button pressed on Slot(9) > pciehp 0000:09:00.0:pcie24: Card present on Slot(9) > pciehp 0000:09:00.0:pcie24: Power fault on slot 9 > pciehp 0000:09:00.0:pcie24: Power fault bit 0 set > pciehp 0000:09:00.0:pcie24: PCI slot #9 - powering on due to button press. > pciehp 0000:09:00.0:pcie24: unloading service driver pciehp > pciehp 0000:09:00.0:pcie24: Link Training Error occurs > pciehp 0000:09:00.0:pcie24: Failed to check link status This is a good clue. I think the portdrv registration thing is a bit more confusing than necessary. I'll poke around in there a bit. Unfortunately, I don't think this is going to lead to a quick easy fix suitable for -rc7, so we'll probably have to do something simple like skipping the INTx enable if MSI is already enabled. Bjorn ^ permalink raw reply [flat|nested] 9+ messages in thread
[parent not found: <CAMxnaaWr0z1zbpFQPX6UvYbnxhfA+3aj4ffhCBpp1i4ZLsGTPg@mail.gmail.com>]
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan [not found] ` <CAMxnaaWr0z1zbpFQPX6UvYbnxhfA+3aj4ffhCBpp1i4ZLsGTPg@mail.gmail.com> @ 2014-03-11 0:10 ` Bjorn Helgaas 2014-03-11 12:52 ` Andreas Noever 0 siblings, 1 reply; 9+ messages in thread From: Bjorn Helgaas @ 2014-03-11 0:10 UTC (permalink / raw) To: Andreas Noever Cc: Yinghai Lu, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki On Tue, Mar 11, 2014 at 12:10:02AM +0100, Andreas Noever wrote: > On Mon, Mar 10, 2014 at 9:43 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: > > On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote: > >> On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > > > >>> If we did that, INTX_DISABLE would be cleared by the first > >>> pci_enable_device() and pci_reenable_device() wouldn't do anything, > >>> leaving it cleared. The resulting state (cleared) would be the same, > >>> but the transitions would be gone, and maybe those are important. > >> Just a quick note: With pci_intx_for_msi removed no hotplug events are > >> ever delivered. Everything else still works though. So it is either a > >> problem specific to Thunderbolt bridges or maybe it just affects > >> hotplug (and PME?) interrupts. > > > > Interesting. This is on a MacBook, isn't it? If you have Mac OS on > > it, is there a way you can do the equivalent of lspci on it? I'm > > curious about whether it sets INTx_DISABLE when it enables MSI. > > lspci -vv and lspci -vv -xxxx attached (yes, someone made a port). > > It looks like Mac OS sets DisINTx for all devices that have MSI > enabled. The only exception is the Thunderbolt controller (no > idea...). But the hotplug bridges all have DisINTx+. OK, thanks. I don't know what to make of that. Here's a possible patch; can you try it out? PCI: Do not enable INTx in pci_reenable_device() From: Bjorn Helgaas <bhelgaas@google.com> Previously we cleared the Interrupt Disable bit in do_pci_enable_device(), which is used by both pci_enable_device() and pci_reenable_device(). But we use pci_reenable_device() after the driver may have enabled MSI or MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X. The pciehp hot-plug path uses pci_reenable_device() on the hotplug bridge, and clearing its Interrupt Disable bit makes its hotplug event-reporting MSI stop working. Fixes: 1f42db786b14 PCI: Enable INTx if BIOS left them disabled Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> --- drivers/pci/pci.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 8dc3e701ec57..79fc89c6c3f3 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1192,6 +1192,9 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars) return err; pci_fixup_device(pci_fixup_enable, dev); + if (dev->msi_enabled || dev->msix_enabled) + return 0; + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); if (pin) { pci_read_config_word(dev, PCI_COMMAND, &cmd); ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan 2014-03-11 0:10 ` Bjorn Helgaas @ 2014-03-11 12:52 ` Andreas Noever 2014-03-17 16:36 ` Bjorn Helgaas 0 siblings, 1 reply; 9+ messages in thread From: Andreas Noever @ 2014-03-11 12:52 UTC (permalink / raw) To: Bjorn Helgaas Cc: Yinghai Lu, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki On Tue, Mar 11, 2014 at 1:10 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: > On Tue, Mar 11, 2014 at 12:10:02AM +0100, Andreas Noever wrote: >> On Mon, Mar 10, 2014 at 9:43 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> > On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote: >> >> On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> > >> >>> If we did that, INTX_DISABLE would be cleared by the first >> >>> pci_enable_device() and pci_reenable_device() wouldn't do anything, >> >>> leaving it cleared. The resulting state (cleared) would be the same, >> >>> but the transitions would be gone, and maybe those are important. >> >> Just a quick note: With pci_intx_for_msi removed no hotplug events are >> >> ever delivered. Everything else still works though. So it is either a >> >> problem specific to Thunderbolt bridges or maybe it just affects >> >> hotplug (and PME?) interrupts. >> > >> > Interesting. This is on a MacBook, isn't it? If you have Mac OS on >> > it, is there a way you can do the equivalent of lspci on it? I'm >> > curious about whether it sets INTx_DISABLE when it enables MSI. >> >> lspci -vv and lspci -vv -xxxx attached (yes, someone made a port). >> >> It looks like Mac OS sets DisINTx for all devices that have MSI >> enabled. The only exception is the Thunderbolt controller (no >> idea...). But the hotplug bridges all have DisINTx+. > > OK, thanks. I don't know what to make of that. > > Here's a possible patch; can you try it out? > > > PCI: Do not enable INTx in pci_reenable_device() > > From: Bjorn Helgaas <bhelgaas@google.com> > > Previously we cleared the Interrupt Disable bit in do_pci_enable_device(), > which is used by both pci_enable_device() and pci_reenable_device(). But > we use pci_reenable_device() after the driver may have enabled MSI or > MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X. > > The pciehp hot-plug path uses pci_reenable_device() on the hotplug bridge, > and clearing its Interrupt Disable bit makes its hotplug event-reporting > MSI stop working. > > Fixes: 1f42db786b14 PCI: Enable INTx if BIOS left them disabled > Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691 > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> > --- > drivers/pci/pci.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index 8dc3e701ec57..79fc89c6c3f3 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -1192,6 +1192,9 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars) > return err; > pci_fixup_device(pci_fixup_enable, dev); > > + if (dev->msi_enabled || dev->msix_enabled) > + return 0; > + > pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); > if (pin) { > pci_read_config_word(dev, PCI_COMMAND, &cmd); That fixes it. Thanks, Andreas ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan 2014-03-11 12:52 ` Andreas Noever @ 2014-03-17 16:36 ` Bjorn Helgaas 0 siblings, 0 replies; 9+ messages in thread From: Bjorn Helgaas @ 2014-03-17 16:36 UTC (permalink / raw) To: Andreas Noever Cc: Yinghai Lu, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki On Tue, Mar 11, 2014 at 6:52 AM, Andreas Noever <andreas.noever@gmail.com> wrote: > On Tue, Mar 11, 2014 at 1:10 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >> On Tue, Mar 11, 2014 at 12:10:02AM +0100, Andreas Noever wrote: >>> On Mon, Mar 10, 2014 at 9:43 PM, Bjorn Helgaas <bhelgaas@google.com> wrote: >>> > On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote: >>> >> On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote: >>> > >>> >>> If we did that, INTX_DISABLE would be cleared by the first >>> >>> pci_enable_device() and pci_reenable_device() wouldn't do anything, >>> >>> leaving it cleared. The resulting state (cleared) would be the same, >>> >>> but the transitions would be gone, and maybe those are important. >>> >> Just a quick note: With pci_intx_for_msi removed no hotplug events are >>> >> ever delivered. Everything else still works though. So it is either a >>> >> problem specific to Thunderbolt bridges or maybe it just affects >>> >> hotplug (and PME?) interrupts. >>> > >>> > Interesting. This is on a MacBook, isn't it? If you have Mac OS on >>> > it, is there a way you can do the equivalent of lspci on it? I'm >>> > curious about whether it sets INTx_DISABLE when it enables MSI. >>> >>> lspci -vv and lspci -vv -xxxx attached (yes, someone made a port). >>> >>> It looks like Mac OS sets DisINTx for all devices that have MSI >>> enabled. The only exception is the Thunderbolt controller (no >>> idea...). But the hotplug bridges all have DisINTx+. >> >> OK, thanks. I don't know what to make of that. >> >> Here's a possible patch; can you try it out? >> >> >> PCI: Do not enable INTx in pci_reenable_device() >> >> From: Bjorn Helgaas <bhelgaas@google.com> >> >> Previously we cleared the Interrupt Disable bit in do_pci_enable_device(), >> which is used by both pci_enable_device() and pci_reenable_device(). But >> we use pci_reenable_device() after the driver may have enabled MSI or >> MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X. >> >> The pciehp hot-plug path uses pci_reenable_device() on the hotplug bridge, >> and clearing its Interrupt Disable bit makes its hotplug event-reporting >> MSI stop working. >> >> Fixes: 1f42db786b14 PCI: Enable INTx if BIOS left them disabled >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691 >> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> >> --- >> drivers/pci/pci.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index 8dc3e701ec57..79fc89c6c3f3 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -1192,6 +1192,9 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars) >> return err; >> pci_fixup_device(pci_fixup_enable, dev); >> >> + if (dev->msi_enabled || dev->msix_enabled) >> + return 0; >> + >> pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin); >> if (pin) { >> pci_read_config_word(dev, PCI_COMMAND, &cmd); > > That fixes it. Thanks for testing this, Andreas. I merged it and it's in Linus' tree now. Would you mind booting with "pci=earlydump" sometime and attaching the dmesg to the bugzilla? Your lspci output shows that INTx_DISABLE is set, and I'm curious about whether it was set by the BIOS or by Mac OS. Bjorn ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-03-17 16:36 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-07 14:33 [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan Andreas Noever
2014-03-07 16:45 ` Bjorn Helgaas
2014-03-07 19:39 ` Yinghai Lu
2014-03-08 1:04 ` Bjorn Helgaas
2014-03-08 9:55 ` Andreas Noever
2014-03-10 20:43 ` Bjorn Helgaas
[not found] ` <CAMxnaaWr0z1zbpFQPX6UvYbnxhfA+3aj4ffhCBpp1i4ZLsGTPg@mail.gmail.com>
2014-03-11 0:10 ` Bjorn Helgaas
2014-03-11 12:52 ` Andreas Noever
2014-03-17 16:36 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).