* [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
@ 2014-03-07 14:33 Andreas Noever
2014-03-07 16:45 ` Bjorn Helgaas
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Noever @ 2014-03-07 14:33 UTC (permalink / raw)
To: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
Bjorn Helgaas
Hi,
After upgrading to the latest RC I noticed that suprise removal
stopped working. Linux did not notice that the devices where gone.
Bisection points to
1f42db786b14a31bf807fc41ee5583a00c08fcb1 PCI: Enable INTx if BIOS left
them disabled
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1f42db786b14a31bf807fc41ee5583a00c08fcb1
It seems that the above patch is triggered on the bridge itself (!)
when a new device is added below it. At this point the hotplug driver
for the bridge has already enabled MSI. Reenabling INTX kills MSI and
prevents later suprise removal notifications.
The following stacktrace is from a "echo 1 > rescan" on the bridge:
do_pci_enable_device+0x59/0x140
pci_reenable_device+0x1f/0x30
pci_assign_unassigned_bridge_resources+0x123/0x160
pci_rescan_bus_bridge_resize+0x28/0x40
dev_bus_rescan_store+0x85/0xa0
dev_attr_store+0x18/0x30
sysfs_kf_write+0x3d/0x50
kernfs_fop_write+0xd2/0x140
vfs_write+0xba/0x1e0
SyS_write+0x49/0xa0
system_call_fastpath+0x1a/0x1f
Similarly for a hotplug event:
do_pci_enable_device+0x59/0x140
pci_reenable_device+0x1f/0x30
pci_assign_unassigned_bridge_resources+0x123/0x160
pciehp_configure_device+0x90/0x160
pciehp_enable_slot+0x163/0x280
pciehp_power_thread+0xb8/0xe0
process_one_work+0x167/0x420
worker_thread+0x121/0x3a0
In both cases DisINTx is turned off and the hotplug driver stops
reacting to events.
Andreas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
2014-03-07 14:33 [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan Andreas Noever
@ 2014-03-07 16:45 ` Bjorn Helgaas
2014-03-07 19:39 ` Yinghai Lu
0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2014-03-07 16:45 UTC (permalink / raw)
To: Andreas Noever
Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
Yinghai Lu, Rajat Jain
[+cc Yinghai, Rajat]
On Fri, Mar 7, 2014 at 7:33 AM, Andreas Noever <andreas.noever@gmail.com> wrote:
> Hi,
>
> After upgrading to the latest RC I noticed that suprise removal
> stopped working. Linux did not notice that the devices where gone.
> Bisection points to
>
> 1f42db786b14a31bf807fc41ee5583a00c08fcb1 PCI: Enable INTx if BIOS left
> them disabled
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=1f42db786b14a31bf807fc41ee5583a00c08fcb1
>
> It seems that the above patch is triggered on the bridge itself (!)
> when a new device is added below it. At this point the hotplug driver
> for the bridge has already enabled MSI. Reenabling INTX kills MSI and
> prevents later suprise removal notifications.
>
> The following stacktrace is from a "echo 1 > rescan" on the bridge:
> do_pci_enable_device+0x59/0x140
> pci_reenable_device+0x1f/0x30
> pci_assign_unassigned_bridge_resources+0x123/0x160
> pci_rescan_bus_bridge_resize+0x28/0x40
> dev_bus_rescan_store+0x85/0xa0
> dev_attr_store+0x18/0x30
> sysfs_kf_write+0x3d/0x50
> kernfs_fop_write+0xd2/0x140
> vfs_write+0xba/0x1e0
> SyS_write+0x49/0xa0
> system_call_fastpath+0x1a/0x1f
>
> Similarly for a hotplug event:
> do_pci_enable_device+0x59/0x140
> pci_reenable_device+0x1f/0x30
> pci_assign_unassigned_bridge_resources+0x123/0x160
> pciehp_configure_device+0x90/0x160
> pciehp_enable_slot+0x163/0x280
> pciehp_power_thread+0xb8/0xe0
> process_one_work+0x167/0x420
> worker_thread+0x121/0x3a0
>
> In both cases DisINTx is turned off and the hotplug driver stops
> reacting to events.
Thanks a lot for noticing and bisecting this!
I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691
It seems like clearing DisINTx has some effect on MSI. I don't see
anything in the spec that would suggest this (I'm looking at the PCIe
r3.0 spec, sec 7.5.1.1).
Can somebody point out a connection between DisINTx and MSI? If not,
maybe we'll need some sort of quirk to deal with this.
Andrea, can you attach a complete dmesg log and "lspci -vv" output to
the bugzilla?
Thanks again!
Bjorn
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
2014-03-07 16:45 ` Bjorn Helgaas
@ 2014-03-07 19:39 ` Yinghai Lu
2014-03-08 1:04 ` Bjorn Helgaas
0 siblings, 1 reply; 9+ messages in thread
From: Yinghai Lu @ 2014-03-07 19:39 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Andreas Noever, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Rajat Jain
[-- Attachment #1: Type: text/plain, Size: 2450 bytes --]
On Fri, Mar 7, 2014 at 8:45 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>
> I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691
>
> It seems like clearing DisINTx has some effect on MSI. I don't see
> anything in the spec that would suggest this (I'm looking at the PCIe
> r3.0 spec, sec 7.5.1.1).
>
> Can somebody point out a connection between DisINTx and MSI? If not,
> maybe we'll need some sort of quirk to deal with this.
I had different impression: if you disable INTx in some chipset, MSI will not
work anymore.
so we have
static void pci_intx_for_msi(struct pci_dev *dev, int enable)
{
if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG))
pci_intx(dev, enable);
}
and have quirks for ati and broadcom chip to set that FLAG.
regarding the regression: i would suggest move out
do_pci_enable_intx() from re-enable path.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 5a24cb3..92718c9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct
pci_dev *dev, int bars)
return pci_enable_resources(dev, bars);
}
+static void do_pci_enable_intx(struct pci_dev *dev)
+{
+ u8 pin;
+ pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
+ if (pin) {
+ pci_read_config_word(dev, PCI_COMMAND, &cmd);
+ if (cmd & PCI_COMMAND_INTX_DISABLE)
+ pci_write_config_word(dev, PCI_COMMAND,
+ cmd & ~PCI_COMMAND_INTX_DISABLE);
+ }
+}
+
static int do_pci_enable_device(struct pci_dev *dev, int bars)
{
int err;
u16 cmd;
- u8 pin;
err = pci_set_power_state(dev, PCI_D0);
if (err < 0 && err != -EIO)
@@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev
*dev, int bars)
return err;
pci_fixup_device(pci_fixup_enable, dev);
- pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
- if (pin) {
- pci_read_config_word(dev, PCI_COMMAND, &cmd);
- if (cmd & PCI_COMMAND_INTX_DISABLE)
- pci_write_config_word(dev, PCI_COMMAND,
- cmd & ~PCI_COMMAND_INTX_DISABLE);
- }
-
return 0;
}
@@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct
pci_dev *dev, unsigned long flags)
err = do_pci_enable_device(dev, bars);
if (err < 0)
atomic_dec(&dev->enable_cnt);
+ else
+ do_pci_enable_intx(dev);
return err;
}
[-- Attachment #2: do_not_enable_intx_again.patch --]
[-- Type: text/x-patch, Size: 1413 bytes --]
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 5a24cb3..92718c9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct pci_dev *dev, int bars)
return pci_enable_resources(dev, bars);
}
+static void do_pci_enable_intx(struct pci_dev *dev)
+{
+ u8 pin;
+ pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
+ if (pin) {
+ pci_read_config_word(dev, PCI_COMMAND, &cmd);
+ if (cmd & PCI_COMMAND_INTX_DISABLE)
+ pci_write_config_word(dev, PCI_COMMAND,
+ cmd & ~PCI_COMMAND_INTX_DISABLE);
+ }
+}
+
static int do_pci_enable_device(struct pci_dev *dev, int bars)
{
int err;
u16 cmd;
- u8 pin;
err = pci_set_power_state(dev, PCI_D0);
if (err < 0 && err != -EIO)
@@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
return err;
pci_fixup_device(pci_fixup_enable, dev);
- pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
- if (pin) {
- pci_read_config_word(dev, PCI_COMMAND, &cmd);
- if (cmd & PCI_COMMAND_INTX_DISABLE)
- pci_write_config_word(dev, PCI_COMMAND,
- cmd & ~PCI_COMMAND_INTX_DISABLE);
- }
-
return 0;
}
@@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
err = do_pci_enable_device(dev, bars);
if (err < 0)
atomic_dec(&dev->enable_cnt);
+ else
+ do_pci_enable_intx(dev);
return err;
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
2014-03-07 19:39 ` Yinghai Lu
@ 2014-03-08 1:04 ` Bjorn Helgaas
2014-03-08 9:55 ` Andreas Noever
0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2014-03-08 1:04 UTC (permalink / raw)
To: Yinghai Lu
Cc: Andreas Noever, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki
[+cc Rafael]
On Fri, Mar 07, 2014 at 11:39:48AM -0800, Yinghai Lu wrote:
> On Fri, Mar 7, 2014 at 8:45 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >
> > I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691
> >
> > It seems like clearing DisINTx has some effect on MSI. I don't see
> > anything in the spec that would suggest this (I'm looking at the PCIe
> > r3.0 spec, sec 7.5.1.1).
> >
> > Can somebody point out a connection between DisINTx and MSI? If not,
> > maybe we'll need some sort of quirk to deal with this.
>
> I had different impression: if you disable INTx in some chipset, MSI will not
> work anymore.
>
> so we have
>
> static void pci_intx_for_msi(struct pci_dev *dev, int enable)
> {
> if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG))
> pci_intx(dev, enable);
> }
>
> and have quirks for ati and broadcom chip to set that FLAG.
Setting INTX_DISABLE on some chipsets seems to disable MSI, and that
behavior seems to be a hardware defect (see ba698ad4b7e4 "PCI: Add
quirk for devices which disable MSI when INTX_DISABLE is set.")
Andreas has a device where *clearing* INTX_DISABLE seems to disable
MSI. Do you think that's also a hardware defect? If it's not a
defect, is there something in the spec that explains why that happens?
> regarding the regression: i would suggest move out
> do_pci_enable_intx() from re-enable path.
Today we have this:
pcie_portdrv_probe
pci_enable_device # clears INTX_DISABLE
pci_enable_msi # sets INTX_DISABLE
pciehp_configure_device
pci_reenable_device # clears INTX_DISABLE again
This is clearly not the intent; we set INTX_DISABLE when MSI was
enabled, then we clear it again later even though MSI is still
enabled. Maybe we should just leave INTX_DISABLE alone if
(dev->msi_enabled || dev->msix_enabled).
pci_reenable_device() is also used in the device resume path. I don't
know what should happen there.
But I'm curious about why we set INTX_DISABLE when enabling MSI/MSI-X
in the first place. Per the PCI 3.0 spec, sec 6.8.3.3:
While enabled for MSI or MSI-X operation, a function is prohibited
from using its INTx# pin (if implemented) to request service (MSI,
MSI-X, and INTx# are mutually exclusive).
This suggests that we might not need to touch INTX_DISABLE when we're
enabling MSI/MSI-X. I looked at these commits related to it:
ba698ad4b7e4 PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.
b1cbf4e4dddd msi: fix up the msi enable/disable logic
1769b46a3ed9 PCI MSI: always toggle legacy-INTx-enable bit upon MSI entry/exit
986162d3239a ia32 Message Signalled Interrupt support
and none of them mentions a problem that requires us to set
INTX_DISABLE. It's possible that we're causing ourselves trouble by
being overly defensive. I wonder what would happen if we stopped
fiddling with it in the MSI/MSI-X paths, e.g., something like this
(just as an experiment, of course):
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 7a0fec6ce571..9ef7bd608add 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -442,8 +442,6 @@ static struct msi_desc *alloc_msi_entry(struct pci_dev *dev)
static void pci_intx_for_msi(struct pci_dev *dev, int enable)
{
- if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG))
- pci_intx(dev, enable);
}
static void __pci_restore_msi_state(struct pci_dev *dev)
If we did that, INTX_DISABLE would be cleared by the first
pci_enable_device() and pci_reenable_device() wouldn't do anything,
leaving it cleared. The resulting state (cleared) would be the same,
but the transitions would be gone, and maybe those are important.
The thing I don't like about the patch below is that it's magical: the
code doesn't have any obvious connection with the problem. How would
one deduce that this is necessary, or explain why it's necessary? A
changelog like "this makes things work" is not really very useful. If
we make a change like this, it needs to be connected with MSI/MSI-X
somehow so a reader can figure out why we twiddle INTX_DISABLE in the
enable path but not the reenable path.
Bjorn
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 5a24cb3..92718c9 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct
> pci_dev *dev, int bars)
> return pci_enable_resources(dev, bars);
> }
>
> +static void do_pci_enable_intx(struct pci_dev *dev)
> +{
> + u8 pin;
> + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
> + if (pin) {
> + pci_read_config_word(dev, PCI_COMMAND, &cmd);
> + if (cmd & PCI_COMMAND_INTX_DISABLE)
> + pci_write_config_word(dev, PCI_COMMAND,
> + cmd & ~PCI_COMMAND_INTX_DISABLE);
> + }
> +}
> +
> static int do_pci_enable_device(struct pci_dev *dev, int bars)
> {
> int err;
> u16 cmd;
> - u8 pin;
>
> err = pci_set_power_state(dev, PCI_D0);
> if (err < 0 && err != -EIO)
> @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev
> *dev, int bars)
> return err;
> pci_fixup_device(pci_fixup_enable, dev);
>
> - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
> - if (pin) {
> - pci_read_config_word(dev, PCI_COMMAND, &cmd);
> - if (cmd & PCI_COMMAND_INTX_DISABLE)
> - pci_write_config_word(dev, PCI_COMMAND,
> - cmd & ~PCI_COMMAND_INTX_DISABLE);
> - }
> -
> return 0;
> }
>
> @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct
> pci_dev *dev, unsigned long flags)
> err = do_pci_enable_device(dev, bars);
> if (err < 0)
> atomic_dec(&dev->enable_cnt);
> + else
> + do_pci_enable_intx(dev);
> return err;
> }
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 5a24cb3..92718c9 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct pci_dev *dev, int bars)
> return pci_enable_resources(dev, bars);
> }
>
> +static void do_pci_enable_intx(struct pci_dev *dev)
> +{
> + u8 pin;
> + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
> + if (pin) {
> + pci_read_config_word(dev, PCI_COMMAND, &cmd);
> + if (cmd & PCI_COMMAND_INTX_DISABLE)
> + pci_write_config_word(dev, PCI_COMMAND,
> + cmd & ~PCI_COMMAND_INTX_DISABLE);
> + }
> +}
> +
> static int do_pci_enable_device(struct pci_dev *dev, int bars)
> {
> int err;
> u16 cmd;
> - u8 pin;
>
> err = pci_set_power_state(dev, PCI_D0);
> if (err < 0 && err != -EIO)
> @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
> return err;
> pci_fixup_device(pci_fixup_enable, dev);
>
> - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
> - if (pin) {
> - pci_read_config_word(dev, PCI_COMMAND, &cmd);
> - if (cmd & PCI_COMMAND_INTX_DISABLE)
> - pci_write_config_word(dev, PCI_COMMAND,
> - cmd & ~PCI_COMMAND_INTX_DISABLE);
> - }
> -
> return 0;
> }
>
> @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
> err = do_pci_enable_device(dev, bars);
> if (err < 0)
> atomic_dec(&dev->enable_cnt);
> + else
> + do_pci_enable_intx(dev);
> return err;
> }
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
2014-03-08 1:04 ` Bjorn Helgaas
@ 2014-03-08 9:55 ` Andreas Noever
2014-03-10 20:43 ` Bjorn Helgaas
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Noever @ 2014-03-08 9:55 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Yinghai Lu, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki
On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> [+cc Rafael]
>
> On Fri, Mar 07, 2014 at 11:39:48AM -0800, Yinghai Lu wrote:
>> On Fri, Mar 7, 2014 at 8:45 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> >
>> > I opened a bugzilla report at https://bugzilla.kernel.org/show_bug.cgi?id=71691
>> >
>> > It seems like clearing DisINTx has some effect on MSI. I don't see
>> > anything in the spec that would suggest this (I'm looking at the PCIe
>> > r3.0 spec, sec 7.5.1.1).
>> >
>> > Can somebody point out a connection between DisINTx and MSI? If not,
>> > maybe we'll need some sort of quirk to deal with this.
>>
>> I had different impression: if you disable INTx in some chipset, MSI will not
>> work anymore.
>>
>> so we have
>>
>> static void pci_intx_for_msi(struct pci_dev *dev, int enable)
>> {
>> if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG))
>> pci_intx(dev, enable);
>> }
>>
>> and have quirks for ati and broadcom chip to set that FLAG.
>
> Setting INTX_DISABLE on some chipsets seems to disable MSI, and that
> behavior seems to be a hardware defect (see ba698ad4b7e4 "PCI: Add
> quirk for devices which disable MSI when INTX_DISABLE is set.")
>
> Andreas has a device where *clearing* INTX_DISABLE seems to disable
> MSI. Do you think that's also a hardware defect? If it's not a
> defect, is there something in the spec that explains why that happens?
>
>> regarding the regression: i would suggest move out
>> do_pci_enable_intx() from re-enable path.
>
> Today we have this:
>
> pcie_portdrv_probe
> pci_enable_device # clears INTX_DISABLE
> pci_enable_msi # sets INTX_DISABLE
>
> pciehp_configure_device
> pci_reenable_device # clears INTX_DISABLE again
>
> This is clearly not the intent; we set INTX_DISABLE when MSI was
> enabled, then we clear it again later even though MSI is still
> enabled. Maybe we should just leave INTX_DISABLE alone if
> (dev->msi_enabled || dev->msix_enabled).
>
> pci_reenable_device() is also used in the device resume path. I don't
> know what should happen there.
>
> But I'm curious about why we set INTX_DISABLE when enabling MSI/MSI-X
> in the first place. Per the PCI 3.0 spec, sec 6.8.3.3:
>
> While enabled for MSI or MSI-X operation, a function is prohibited
> from using its INTx# pin (if implemented) to request service (MSI,
> MSI-X, and INTx# are mutually exclusive).
>
> This suggests that we might not need to touch INTX_DISABLE when we're
> enabling MSI/MSI-X. I looked at these commits related to it:
>
> ba698ad4b7e4 PCI: Add quirk for devices which disable MSI when INTX_DISABLE is set.
> b1cbf4e4dddd msi: fix up the msi enable/disable logic
> 1769b46a3ed9 PCI MSI: always toggle legacy-INTx-enable bit upon MSI entry/exit
> 986162d3239a ia32 Message Signalled Interrupt support
>
> and none of them mentions a problem that requires us to set
> INTX_DISABLE. It's possible that we're causing ourselves trouble by
> being overly defensive. I wonder what would happen if we stopped
> fiddling with it in the MSI/MSI-X paths, e.g., something like this
> (just as an experiment, of course):
>
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 7a0fec6ce571..9ef7bd608add 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -442,8 +442,6 @@ static struct msi_desc *alloc_msi_entry(struct pci_dev *dev)
>
> static void pci_intx_for_msi(struct pci_dev *dev, int enable)
> {
> - if (!(dev->dev_flags & PCI_DEV_FLAGS_MSI_INTX_DISABLE_BUG))
> - pci_intx(dev, enable);
> }
>
> static void __pci_restore_msi_state(struct pci_dev *dev)
>
> If we did that, INTX_DISABLE would be cleared by the first
> pci_enable_device() and pci_reenable_device() wouldn't do anything,
> leaving it cleared. The resulting state (cleared) would be the same,
> but the transitions would be gone, and maybe those are important.
Just a quick note: With pci_intx_for_msi removed no hotplug events are
ever delivered. Everything else still works though. So it is either a
problem specific to Thunderbolt bridges or maybe it just affects
hotplug (and PME?) interrupts.
I also attempted booting with pcie_hp=nomsi and now everything works.
Interestingly pciehp now also gets an interrupt from 09 (event though
that card has just been removed). I suspect this is just pciehp not
noticing that it itself is gone.
pciehp 0000:06:03.0:pcie24: Card not present on Slot(3-1)
pciehp 0000:09:00.0:pcie24: Latch open on Slot(9)
pciehp 0000:09:00.0:pcie24: Button pressed on Slot(9)
pciehp 0000:09:00.0:pcie24: Card present on Slot(9)
pciehp 0000:09:00.0:pcie24: Power fault on slot 9
pciehp 0000:09:00.0:pcie24: Power fault bit 0 set
pciehp 0000:09:00.0:pcie24: PCI slot #9 - powering on due to button press.
pciehp 0000:09:00.0:pcie24: unloading service driver pciehp
pciehp 0000:09:00.0:pcie24: Link Training Error occurs
pciehp 0000:09:00.0:pcie24: Failed to check link status
> The thing I don't like about the patch below is that it's magical: the
> code doesn't have any obvious connection with the problem. How would
> one deduce that this is necessary, or explain why it's necessary? A
> changelog like "this makes things work" is not really very useful. If
> we make a change like this, it needs to be connected with MSI/MSI-X
> somehow so a reader can figure out why we twiddle INTX_DISABLE in the
> enable path but not the reenable path.
>
> Bjorn
>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 5a24cb3..92718c9 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct
>> pci_dev *dev, int bars)
>> return pci_enable_resources(dev, bars);
>> }
>>
>> +static void do_pci_enable_intx(struct pci_dev *dev)
>> +{
>> + u8 pin;
>> + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
>> + if (pin) {
>> + pci_read_config_word(dev, PCI_COMMAND, &cmd);
>> + if (cmd & PCI_COMMAND_INTX_DISABLE)
>> + pci_write_config_word(dev, PCI_COMMAND,
>> + cmd & ~PCI_COMMAND_INTX_DISABLE);
>> + }
>> +}
>> +
>> static int do_pci_enable_device(struct pci_dev *dev, int bars)
>> {
>> int err;
>> u16 cmd;
>> - u8 pin;
>>
>> err = pci_set_power_state(dev, PCI_D0);
>> if (err < 0 && err != -EIO)
>> @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev
>> *dev, int bars)
>> return err;
>> pci_fixup_device(pci_fixup_enable, dev);
>>
>> - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
>> - if (pin) {
>> - pci_read_config_word(dev, PCI_COMMAND, &cmd);
>> - if (cmd & PCI_COMMAND_INTX_DISABLE)
>> - pci_write_config_word(dev, PCI_COMMAND,
>> - cmd & ~PCI_COMMAND_INTX_DISABLE);
>> - }
>> -
>> return 0;
>> }
>>
>> @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct
>> pci_dev *dev, unsigned long flags)
>> err = do_pci_enable_device(dev, bars);
>> if (err < 0)
>> atomic_dec(&dev->enable_cnt);
>> + else
>> + do_pci_enable_intx(dev);
>> return err;
>> }
>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 5a24cb3..92718c9 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1190,11 +1190,22 @@ int __weak pcibios_enable_device(struct pci_dev *dev, int bars)
>> return pci_enable_resources(dev, bars);
>> }
>>
>> +static void do_pci_enable_intx(struct pci_dev *dev)
>> +{
>> + u8 pin;
>> + pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
>> + if (pin) {
>> + pci_read_config_word(dev, PCI_COMMAND, &cmd);
>> + if (cmd & PCI_COMMAND_INTX_DISABLE)
>> + pci_write_config_word(dev, PCI_COMMAND,
>> + cmd & ~PCI_COMMAND_INTX_DISABLE);
>> + }
>> +}
>> +
>> static int do_pci_enable_device(struct pci_dev *dev, int bars)
>> {
>> int err;
>> u16 cmd;
>> - u8 pin;
>>
>> err = pci_set_power_state(dev, PCI_D0);
>> if (err < 0 && err != -EIO)
>> @@ -1204,14 +1215,6 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
>> return err;
>> pci_fixup_device(pci_fixup_enable, dev);
>>
>> - pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
>> - if (pin) {
>> - pci_read_config_word(dev, PCI_COMMAND, &cmd);
>> - if (cmd & PCI_COMMAND_INTX_DISABLE)
>> - pci_write_config_word(dev, PCI_COMMAND,
>> - cmd & ~PCI_COMMAND_INTX_DISABLE);
>> - }
>> -
>> return 0;
>> }
>>
>> @@ -1287,6 +1290,8 @@ static int pci_enable_device_flags(struct pci_dev *dev, unsigned long flags)
>> err = do_pci_enable_device(dev, bars);
>> if (err < 0)
>> atomic_dec(&dev->enable_cnt);
>> + else
>> + do_pci_enable_intx(dev);
>> return err;
>> }
>>
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
2014-03-08 9:55 ` Andreas Noever
@ 2014-03-10 20:43 ` Bjorn Helgaas
[not found] ` <CAMxnaaWr0z1zbpFQPX6UvYbnxhfA+3aj4ffhCBpp1i4ZLsGTPg@mail.gmail.com>
0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2014-03-10 20:43 UTC (permalink / raw)
To: Andreas Noever
Cc: Yinghai Lu, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki
On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote:
> On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> If we did that, INTX_DISABLE would be cleared by the first
>> pci_enable_device() and pci_reenable_device() wouldn't do anything,
>> leaving it cleared. The resulting state (cleared) would be the same,
>> but the transitions would be gone, and maybe those are important.
> Just a quick note: With pci_intx_for_msi removed no hotplug events are
> ever delivered. Everything else still works though. So it is either a
> problem specific to Thunderbolt bridges or maybe it just affects
> hotplug (and PME?) interrupts.
Interesting. This is on a MacBook, isn't it? If you have Mac OS on
it, is there a way you can do the equivalent of lspci on it? I'm
curious about whether it sets INTx_DISABLE when it enables MSI.
I still haven't found any indication that INTx_DISABLE is intended or
required as part of enabling MSI/MSI-X, so I'm quite dubious about
Linux using it that way. The references I've seen, e.g.,
http://www.pcisig.com/reflector/msg05301.html,
http://www.pcisig.com/reflector/msg05302.html, say the purpose is to
better manage shared IRQs.
> I also attempted booting with pcie_hp=nomsi and now everything works.
> Interestingly pciehp now also gets an interrupt from 09 (event though
> that card has just been removed). I suspect this is just pciehp not
> noticing that it itself is gone.
> pciehp 0000:06:03.0:pcie24: Card not present on Slot(3-1)
> pciehp 0000:09:00.0:pcie24: Latch open on Slot(9)
> pciehp 0000:09:00.0:pcie24: Button pressed on Slot(9)
> pciehp 0000:09:00.0:pcie24: Card present on Slot(9)
> pciehp 0000:09:00.0:pcie24: Power fault on slot 9
> pciehp 0000:09:00.0:pcie24: Power fault bit 0 set
> pciehp 0000:09:00.0:pcie24: PCI slot #9 - powering on due to button press.
> pciehp 0000:09:00.0:pcie24: unloading service driver pciehp
> pciehp 0000:09:00.0:pcie24: Link Training Error occurs
> pciehp 0000:09:00.0:pcie24: Failed to check link status
This is a good clue. I think the portdrv registration thing is a bit
more confusing than necessary. I'll poke around in there a bit.
Unfortunately, I don't think this is going to lead to a quick easy fix
suitable for -rc7, so we'll probably have to do something simple like
skipping the INTx enable if MSI is already enabled.
Bjorn
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
[not found] ` <CAMxnaaWr0z1zbpFQPX6UvYbnxhfA+3aj4ffhCBpp1i4ZLsGTPg@mail.gmail.com>
@ 2014-03-11 0:10 ` Bjorn Helgaas
2014-03-11 12:52 ` Andreas Noever
0 siblings, 1 reply; 9+ messages in thread
From: Bjorn Helgaas @ 2014-03-11 0:10 UTC (permalink / raw)
To: Andreas Noever
Cc: Yinghai Lu, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki
On Tue, Mar 11, 2014 at 12:10:02AM +0100, Andreas Noever wrote:
> On Mon, Mar 10, 2014 at 9:43 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote:
> >> On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >
> >>> If we did that, INTX_DISABLE would be cleared by the first
> >>> pci_enable_device() and pci_reenable_device() wouldn't do anything,
> >>> leaving it cleared. The resulting state (cleared) would be the same,
> >>> but the transitions would be gone, and maybe those are important.
> >> Just a quick note: With pci_intx_for_msi removed no hotplug events are
> >> ever delivered. Everything else still works though. So it is either a
> >> problem specific to Thunderbolt bridges or maybe it just affects
> >> hotplug (and PME?) interrupts.
> >
> > Interesting. This is on a MacBook, isn't it? If you have Mac OS on
> > it, is there a way you can do the equivalent of lspci on it? I'm
> > curious about whether it sets INTx_DISABLE when it enables MSI.
>
> lspci -vv and lspci -vv -xxxx attached (yes, someone made a port).
>
> It looks like Mac OS sets DisINTx for all devices that have MSI
> enabled. The only exception is the Thunderbolt controller (no
> idea...). But the hotplug bridges all have DisINTx+.
OK, thanks. I don't know what to make of that.
Here's a possible patch; can you try it out?
PCI: Do not enable INTx in pci_reenable_device()
From: Bjorn Helgaas <bhelgaas@google.com>
Previously we cleared the Interrupt Disable bit in do_pci_enable_device(),
which is used by both pci_enable_device() and pci_reenable_device(). But
we use pci_reenable_device() after the driver may have enabled MSI or
MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X.
The pciehp hot-plug path uses pci_reenable_device() on the hotplug bridge,
and clearing its Interrupt Disable bit makes its hotplug event-reporting
MSI stop working.
Fixes: 1f42db786b14 PCI: Enable INTx if BIOS left them disabled
Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
drivers/pci/pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 8dc3e701ec57..79fc89c6c3f3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1192,6 +1192,9 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
return err;
pci_fixup_device(pci_fixup_enable, dev);
+ if (dev->msi_enabled || dev->msix_enabled)
+ return 0;
+
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
if (pin) {
pci_read_config_word(dev, PCI_COMMAND, &cmd);
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
2014-03-11 0:10 ` Bjorn Helgaas
@ 2014-03-11 12:52 ` Andreas Noever
2014-03-17 16:36 ` Bjorn Helgaas
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Noever @ 2014-03-11 12:52 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Yinghai Lu, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki
On Tue, Mar 11, 2014 at 1:10 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Tue, Mar 11, 2014 at 12:10:02AM +0100, Andreas Noever wrote:
>> On Mon, Mar 10, 2014 at 9:43 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> > On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote:
>> >> On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> >
>> >>> If we did that, INTX_DISABLE would be cleared by the first
>> >>> pci_enable_device() and pci_reenable_device() wouldn't do anything,
>> >>> leaving it cleared. The resulting state (cleared) would be the same,
>> >>> but the transitions would be gone, and maybe those are important.
>> >> Just a quick note: With pci_intx_for_msi removed no hotplug events are
>> >> ever delivered. Everything else still works though. So it is either a
>> >> problem specific to Thunderbolt bridges or maybe it just affects
>> >> hotplug (and PME?) interrupts.
>> >
>> > Interesting. This is on a MacBook, isn't it? If you have Mac OS on
>> > it, is there a way you can do the equivalent of lspci on it? I'm
>> > curious about whether it sets INTx_DISABLE when it enables MSI.
>>
>> lspci -vv and lspci -vv -xxxx attached (yes, someone made a port).
>>
>> It looks like Mac OS sets DisINTx for all devices that have MSI
>> enabled. The only exception is the Thunderbolt controller (no
>> idea...). But the hotplug bridges all have DisINTx+.
>
> OK, thanks. I don't know what to make of that.
>
> Here's a possible patch; can you try it out?
>
>
> PCI: Do not enable INTx in pci_reenable_device()
>
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> Previously we cleared the Interrupt Disable bit in do_pci_enable_device(),
> which is used by both pci_enable_device() and pci_reenable_device(). But
> we use pci_reenable_device() after the driver may have enabled MSI or
> MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X.
>
> The pciehp hot-plug path uses pci_reenable_device() on the hotplug bridge,
> and clearing its Interrupt Disable bit makes its hotplug event-reporting
> MSI stop working.
>
> Fixes: 1f42db786b14 PCI: Enable INTx if BIOS left them disabled
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
> drivers/pci/pci.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 8dc3e701ec57..79fc89c6c3f3 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1192,6 +1192,9 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
> return err;
> pci_fixup_device(pci_fixup_enable, dev);
>
> + if (dev->msi_enabled || dev->msix_enabled)
> + return 0;
> +
> pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
> if (pin) {
> pci_read_config_word(dev, PCI_COMMAND, &cmd);
That fixes it.
Thanks,
Andreas
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan
2014-03-11 12:52 ` Andreas Noever
@ 2014-03-17 16:36 ` Bjorn Helgaas
0 siblings, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2014-03-17 16:36 UTC (permalink / raw)
To: Andreas Noever
Cc: Yinghai Lu, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, Rajat Jain, Rafael J. Wysocki
On Tue, Mar 11, 2014 at 6:52 AM, Andreas Noever
<andreas.noever@gmail.com> wrote:
> On Tue, Mar 11, 2014 at 1:10 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Tue, Mar 11, 2014 at 12:10:02AM +0100, Andreas Noever wrote:
>>> On Mon, Mar 10, 2014 at 9:43 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> > On Sat, Mar 8, 2014 at 2:55 AM, Andreas Noever <andreas.noever@gmail.com> wrote:
>>> >> On Sat, Mar 8, 2014 at 2:04 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>>> >
>>> >>> If we did that, INTX_DISABLE would be cleared by the first
>>> >>> pci_enable_device() and pci_reenable_device() wouldn't do anything,
>>> >>> leaving it cleared. The resulting state (cleared) would be the same,
>>> >>> but the transitions would be gone, and maybe those are important.
>>> >> Just a quick note: With pci_intx_for_msi removed no hotplug events are
>>> >> ever delivered. Everything else still works though. So it is either a
>>> >> problem specific to Thunderbolt bridges or maybe it just affects
>>> >> hotplug (and PME?) interrupts.
>>> >
>>> > Interesting. This is on a MacBook, isn't it? If you have Mac OS on
>>> > it, is there a way you can do the equivalent of lspci on it? I'm
>>> > curious about whether it sets INTx_DISABLE when it enables MSI.
>>>
>>> lspci -vv and lspci -vv -xxxx attached (yes, someone made a port).
>>>
>>> It looks like Mac OS sets DisINTx for all devices that have MSI
>>> enabled. The only exception is the Thunderbolt controller (no
>>> idea...). But the hotplug bridges all have DisINTx+.
>>
>> OK, thanks. I don't know what to make of that.
>>
>> Here's a possible patch; can you try it out?
>>
>>
>> PCI: Do not enable INTx in pci_reenable_device()
>>
>> From: Bjorn Helgaas <bhelgaas@google.com>
>>
>> Previously we cleared the Interrupt Disable bit in do_pci_enable_device(),
>> which is used by both pci_enable_device() and pci_reenable_device(). But
>> we use pci_reenable_device() after the driver may have enabled MSI or
>> MSI-X, and we *set* Interrupt Disable as part of enabling MSI/MSI-X.
>>
>> The pciehp hot-plug path uses pci_reenable_device() on the hotplug bridge,
>> and clearing its Interrupt Disable bit makes its hotplug event-reporting
>> MSI stop working.
>>
>> Fixes: 1f42db786b14 PCI: Enable INTx if BIOS left them disabled
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=71691
>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>> ---
>> drivers/pci/pci.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index 8dc3e701ec57..79fc89c6c3f3 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -1192,6 +1192,9 @@ static int do_pci_enable_device(struct pci_dev *dev, int bars)
>> return err;
>> pci_fixup_device(pci_fixup_enable, dev);
>>
>> + if (dev->msi_enabled || dev->msix_enabled)
>> + return 0;
>> +
>> pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
>> if (pin) {
>> pci_read_config_word(dev, PCI_COMMAND, &cmd);
>
> That fixes it.
Thanks for testing this, Andreas. I merged it and it's in Linus' tree now.
Would you mind booting with "pci=earlydump" sometime and attaching the
dmesg to the bugzilla? Your lspci output shows that INTx_DISABLE is
set, and I'm curious about whether it was set by the BIOS or by Mac
OS.
Bjorn
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-03-17 16:36 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-07 14:33 [Bug] PCI: Enable INTx if BIOS left them disabled - triggers during rescan Andreas Noever
2014-03-07 16:45 ` Bjorn Helgaas
2014-03-07 19:39 ` Yinghai Lu
2014-03-08 1:04 ` Bjorn Helgaas
2014-03-08 9:55 ` Andreas Noever
2014-03-10 20:43 ` Bjorn Helgaas
[not found] ` <CAMxnaaWr0z1zbpFQPX6UvYbnxhfA+3aj4ffhCBpp1i4ZLsGTPg@mail.gmail.com>
2014-03-11 0:10 ` Bjorn Helgaas
2014-03-11 12:52 ` Andreas Noever
2014-03-17 16:36 ` Bjorn Helgaas
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).