From: Bjorn Helgaas <helgaas@kernel.org>
To: linux-pci@vger.kernel.org, Lukas Wunner <lukas@wunner.de>
Cc: Richard Weinberger <richard@nod.at>, aaron@sigma-star.at
Subject: Re: [bugzilla-daemon@kernel.org: [Bug 216511] New: Spurious PCI_EXP_SLTSTA_DLLSC when hot plugging]
Date: Wed, 21 Sep 2022 13:03:26 -0500
Message-ID: <20220921180326.GA1221419@bhelgaas>
In-Reply-To: <20220921114020.GA1191462@bhelgaas>
On Wed, Sep 21, 2022 at 06:40:20AM -0500, Bjorn Helgaas wrote:
> ----- Forwarded message from bugzilla-daemon@kernel.org -----
>
> Date: Wed, 21 Sep 2022 11:30:47 +0000
> From: bugzilla-daemon@kernel.org
> To: bjorn@helgaas.com
> Subject: [Bug 216511] New: Spurious PCI_EXP_SLTSTA_DLLSC when hot plugging
> Message-ID: <bug-216511-41252@https.bugzilla.kernel.org/>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=216511
>
> Bug ID: 216511
> Summary: Spurious PCI_EXP_SLTSTA_DLLSC when hot plugging
> ...
> An x86_64 machine has a PCI switch (PEX 8747) with four ports; NVMe disks
> can be attached to two of them.
> Using a vendor-specific tool I can power each port on and off.
> When I power on both ports and hot-plug an NVMe into either port, it works
> perfectly fine,
> but as soon as I plug in a second one, *both* ports receive a
> PCI_EXP_SLTSTA_DLLSC event.
> As a consequence, the previously attached NVMe is either detached, so that
> only one device remains, or detached and immediately reattached, but all
> IO fails later.
>
> To me it seems very wrong that both ports see PCI_EXP_SLTSTA_DLLSC.
>
> The problem can be observed with any kernel so far.
> Could this be a firmware issue? What further debugging methods do you suggest?

Relevant devices from lspci:

  0a:00.0 PLX 8748 Upstream Port to [bus 0b-1b]
  0b:08.0 PLX 8747 Downstream Port to [bus 0c-0f]   # Slot 0
  0c:00.0 NVMe
  0b:09.0 PLX 8747 Downstream Port to [bus 10-13]   # Slot 0-1
  10:00.0 NVMe

From the dmesg log, we add 10:00.0 in Slot 0-1 first, then add 0c:00.0 in
Slot 0. When 0c:00.0 is added, Slot 0-1 gets a PCI_EXP_SLTSTA_DLLSC
interrupt for 10:00.0:

  pcieport 0000:0b:09.0: pciehp: pending interrupts 0x0008 from Slot Status
    presence detect changed                  # Slot 0-1
  pcieport 0000:0b:09.0: pciehp: pending interrupts 0x0100 from Slot Status
    DLL state changed                        # Slot 0-1
  pcieport 0000:0b:09.0: pciehp: pciehp_check_link_status: lnk_status = a023
    PCI_EXP_LNKSTA_LABS
    PCI_EXP_LNKSTA_DLLLA
    PCI_EXP_LNKSTA_NLW_X2
    PCI_EXP_LNKSTA_CLS_8_0GB
  pci 0000:10:00.0: [27d1:5216] type 00 class 0x010802   # NVMe in Slot 0-1
  pcieport 0000:0b:08.0: pciehp: pending interrupts 0x0008 from Slot Status
    presence detect changed                  # Slot 0
  pcieport 0000:0b:09.0: pciehp: pending interrupts 0x0100 from Slot Status
    DLL state changed                        # Slot 0-1 (?)
  pcieport 0000:0b:09.0: pciehp: Slot(0-1): Link Down

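The annotations on the register values in the log above can be reproduced by masking the raw words. Here is a minimal userspace sketch, assuming the bit definitions from include/uapi/linux/pci_regs.h. The names struct lnksta_fields, decode_lnksta() and print_sltsta_events() are made up for illustration and are not kernel functions:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Bit definitions as found in include/uapi/linux/pci_regs.h */
#define PCI_EXP_LNKSTA_CLS        0x000f  /* Current Link Speed */
#define PCI_EXP_LNKSTA_NLW        0x03f0  /* Negotiated Link Width */
#define PCI_EXP_LNKSTA_NLW_SHIFT  4
#define PCI_EXP_LNKSTA_DLLLA      0x2000  /* Data Link Layer Link Active */
#define PCI_EXP_LNKSTA_LABS       0x8000  /* Link Autonomous Bandwidth Status */
#define PCI_EXP_SLTSTA_PDC        0x0008  /* Presence Detect Changed */
#define PCI_EXP_SLTSTA_DLLSC      0x0100  /* Data Link Layer State Changed */

/* Illustrative container, not a kernel type. */
struct lnksta_fields {
	unsigned int speed_code;  /* 1 = 2.5 GT/s, 2 = 5.0 GT/s, 3 = 8.0 GT/s */
	unsigned int width;       /* number of lanes */
	bool dll_active;
	bool labs;
};

/* Split a raw Link Status word into the fields annotated in the log. */
struct lnksta_fields decode_lnksta(uint16_t lnksta)
{
	struct lnksta_fields f = {
		.speed_code = lnksta & PCI_EXP_LNKSTA_CLS,
		.width = (lnksta & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT,
		.dll_active = (lnksta & PCI_EXP_LNKSTA_DLLLA) != 0,
		.labs = (lnksta & PCI_EXP_LNKSTA_LABS) != 0,
	};
	return f;
}

/* Name the two Slot Status events that appear in the log. */
void print_sltsta_events(uint16_t events)
{
	printf("pending 0x%04x:%s%s\n", (unsigned int)events,
	       (events & PCI_EXP_SLTSTA_PDC) ? " presence-detect-changed" : "",
	       (events & PCI_EXP_SLTSTA_DLLSC) ? " dll-state-changed" : "");
}
```

For lnk_status = a023, decode_lnksta(0xa023) gives speed code 3 (8.0 GT/s), width 2, and both DLLLA and LABS set, which matches the four annotations under that log line.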
Here's the call chain when handling that DLL state change:

  pciehp_ist
    pcie_capability_read_word(pdev, PCI_EXP_SLTSTA, &status)
    status &= ... PCI_EXP_SLTSTA_DLLSC
    events |= status
    if (events & PCI_EXP_SLTSTA_DLLSC)
      pciehp_handle_presence_or_link_change
        pciehp_disable_slot
          __pciehp_disable_slot
            remove_board
              pciehp_unconfigure_device
                pci_stop_and_remove_bus_device

Per spec, "software must read the Data Link Layer Link Active bit of
the Link Status Register to determine if the Link is active before
initiating configuration cycles to the hot plugged device" (PCIe r6.0,
sec 7.5.3.11).
It looks like Linux depends on PCI_EXP_SLTSTA_DLLSC but does not
actually read PCI_EXP_LNKSTA in this path, so this looks like a pciehp
defect.
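To make the idea concrete, here is a small userspace model of cross-checking DLLSC against the Data Link Layer Link Active bit. This is only a sketch under stated assumptions: classify_dllsc() and enum link_event are hypothetical names, not pciehp code, and the model deliberately ignores a link that dropped and retrained before Link Status was read, in which case DLLLA would read as set even though the device had been reset.

```c
#include <stdint.h>

/* Bit definitions as found in include/uapi/linux/pci_regs.h */
#define PCI_EXP_SLTSTA_DLLSC  0x0100  /* Data Link Layer State Changed */
#define PCI_EXP_LNKSTA_DLLLA  0x2000  /* Data Link Layer Link Active */

/* Hypothetical classification, for illustration only. */
enum link_event {
	LINK_EVENT_NONE,      /* no DLLSC pending */
	LINK_EVENT_DOWN,      /* DLLSC pending and link not active */
	LINK_EVENT_STILL_UP,  /* DLLSC pending but link reads as active */
};

/*
 * Classify a DLLSC event using the current Link Status word instead of
 * treating every DLLSC on an occupied slot as "Link Down".
 */
enum link_event classify_dllsc(uint16_t events, uint16_t lnksta)
{
	if (!(events & PCI_EXP_SLTSTA_DLLSC))
		return LINK_EVENT_NONE;

	if (lnksta & PCI_EXP_LNKSTA_DLLLA)
		return LINK_EVENT_STILL_UP;

	return LINK_EVENT_DOWN;
}
```

With the pending value 0x0100 and a Link Status of 0xa023, this returns LINK_EVENT_STILL_UP. Note that the log does not show what Link Status read at the time of the second DLLSC on 0b:09.0, so whether that event would have been classified this way is not known.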
Bjorn