From: Hans de Goede <hdegoede@redhat.com>
To: Bjorn Helgaas <helgaas@kernel.org>, linux-pci@vger.kernel.org
Cc: Blazej Kucman <blazej.kucman@intel.com>,
Lukas Wunner <lukas@wunner.de>,
Naveen Naidu <naveennaidu479@gmail.com>,
Keith Busch <kbusch@kernel.org>,
Nirmal Patel <nirmal.patel@linux.intel.com>,
Jonathan Derrick <jonathan.derrick@linux.dev>
Subject: Re: [Bug 215525] New: HotPlug does not work on upstream kernel 5.17.0-rc1
Date: Tue, 25 Jan 2022 09:58:18 +0100
Message-ID: <50702e5f-96e8-bc68-67ee-bcf11a5ccdc8@redhat.com>
In-Reply-To: <20220124214635.GA1553164@bhelgaas>

Hi,

On 1/24/22 22:46, Bjorn Helgaas wrote:
> [+cc linux-pci, Hans, Lukas, Naveen, Keith, Nirmal, Jonathan]
>
> On Mon, Jan 24, 2022 at 11:46:14AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=215525
>>
>> Bug ID: 215525
>> Summary: HotPlug does not work on upstream kernel 5.17.0-rc1
>> Product: Drivers
>> Version: 2.5
>> Kernel Version: 5.17.0-rc1 upstream
>> Hardware: x86-64
>> OS: Linux
>> Tree: Mainline
>> Status: NEW
>> Severity: normal
>> Priority: P1
>> Component: PCI
>> Assignee: drivers_pci@kernel-bugs.osdl.org
>> Reporter: blazej.kucman@intel.com
>> Regression: No
>>
>> Created attachment 300308
>> --> https://bugzilla.kernel.org/attachment.cgi?id=300308&action=edit
>> dmesg
>>
>> While testing on latest upstream
>> kernel(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/) we
>> noticed that with the merge commit
>> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0a231f01e5b25bacd23e6edc7c979a18a517b2b)
>> hotplug and hotunplug of nvme drives stopped working.
>>
>> Rescan PCI does not help.
>> echo "1" > /sys/bus/pci/rescan
>>
>> Issue does not reproduce on a kernel built on an antecedent
>> commit(88db8458086b1dcf20b56682504bdb34d2bca0e2).
>>
>>
>> During hot-remove device does not disappear, however when we try to do I/O on
>> the disk then there is an I/O error, and the device disappears.
>>
>> Before I/O no logs regarding the disk appeared in the dmesg, only after I/O the
>> entries appeared like below:
>> [ 177.943703] nvme nvme5: controller is down; will reset: CSTS=0xffffffff,
>> PCI_STATUS=0xffff
>> [ 177.971661] nvme 10000:0b:00.0: can't change power state from D3cold to D0
>> (config space inaccessible)
>> [ 177.981121] pcieport 10000:00:02.0: can't derive routing for PCI INT A
>> [ 177.987749] nvme 10000:0b:00.0: PCI INT A: no GSI
>> [ 177.992633] nvme nvme5: Removing after probe failure status: -19
>> [ 178.004633] nvme5n1: detected capacity change from 83984375 to 0
>> [ 178.004677] I/O error, dev nvme5n1, sector 0 op 0x0:(READ) flags 0x0
>> phys_seg 1 prio class 0
>>
>>
>> OS: RHEL 8.4 GA
>> Platform: Intel Purley
>>
>> The logs were collected on an older upstream kernel, but the issue also occurs
>> on the newest upstream kernel (dd81e1c7d5fb126e5fbc5c9e334d7b3ec29a16a0)
>
> Apparently worked immediately before merging the PCI changes for
> v5.17 and failed immediately after:
>
> good: 88db8458086b ("Merge tag 'exfat-for-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat")
> bad: d0a231f01e5b ("Merge tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci")
>
> Only three commits touch pciehp:
>
> 085a9f43433f ("PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors")
> 23584c1ed3e1 ("PCI: pciehp: Fix infinite loop in IRQ handler upon power fault")
> a3b0f10db148 ("PCI: pciehp: Use PCI_POSSIBLE_ERROR() to check config reads")
>
> None seems obviously related to me. Blazej, could you try setting
> CONFIG_DYNAMIC_DEBUG=y and booting with 'dyndbg="file pciehp* +p"' to
> enable more debug messages?

Since only three commits touch pciehp, maybe try reverting them one by one
in reverse history order (so revert the latest commit first)? Then see
whether running a kernel with the reverted commit(s) fixes things.
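
Something along these lines might help with the one-by-one reverts. It is
only a rough sketch: revert_next is a made-up helper name, and it assumes
the commit timestamps reflect the history order, which is worth
double-checking with "git log" since commits with equal or out-of-order
timestamps could be sorted differently. Each call reverts the newest commit
in the list that has not been reverted yet:

```shell
#!/bin/sh
# revert_next <commit>...   (hypothetical helper, not part of git)
# Revert the newest of the given commits that has not been reverted yet,
# so each call peels off one suspect commit for a rebuild and retest.
revert_next() {
    # --no-walk=sorted lists only the named commits, newest commit time first.
    for c in $(git rev-list --no-walk=sorted "$@"); do
        # "git revert" records "This reverts commit <sha>." in its message;
        # skip commits that already have such a revert in the history.
        if [ -z "$(git log -1 --format=%H --grep="This reverts commit $c")" ]; then
            git revert --no-edit "$c" || return 1
            echo "reverted $c"
            return 0
        fi
    done
    echo "nothing left to revert"
}
```

Run from a tree checked out at the bad merge, that would be
"revert_next 085a9f43433f 23584c1ed3e1 a3b0f10db148", followed by a rebuild,
reboot and hotplug retest, repeated until hotplug works again or nothing is
left to revert.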

Regards,

Hans