Linux PCI subsystem development
 help / color / mirror / Atom feed
* VPD blacklist of Marvell QLogic 1077/2261
@ 2020-12-17  0:44 Arun Easi
  2020-12-17 17:48 ` Bjorn Helgaas
  0 siblings, 1 reply; 3+ messages in thread
From: Arun Easi @ 2020-12-17  0:44 UTC (permalink / raw)
  To: linux-pci; +Cc: Bjorn Helgaas, Girish Basrur, Quinn Tran

Hi Bjorn,

This is regarding the blacklisting of one of the Marvell QLogic FC
adapter (1077/2261) on VPD area access. The commit that did was
this:

--8<-- pruned commit message --8<--
| commit 0d5370d1d85251e5893ab7c90a429464de2e140b
| Author: Ethan Zhao <ethan.zhao@oracle.com>
| Date:   Mon Feb 27 17:08:44 2017 +0900
| 
|     PCI: Prevent VPD access for QLogic ISP2722
|     Call Trace:
|      <NMI>  [<ffffffff817193de>] dump_stack+0x63/0x81
|      [<ffffffff81714072>] panic+0xd0/0x20e
|      [<ffffffff8101c8b4>] do_nmi+0xf4/0x170
|      <<EOE>>  [<ffffffff815db4b3>] raw_pci_read+0x23/0x40
|      [<ffffffff815db4fc>] pci_read+0x2c/0x30
|      [<ffffffff8136f612>] pci_user_read_config_word+0x72/0x110
|      [<ffffffff8136f746>] pci_vpd_pci22_wait+0x96/0x130
|      [<ffffffff8136ff9b>] pci_vpd_pci22_read+0xdb/0x1a0
|      [<ffffffff8136ea30>] pci_read_vpd+0x20/0x30


https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d5370d1d85251e5893ab7c90a429464de2e140b

While investigating the original report of the issue, I found an
interesting information that may explain why Ethan Zhao was
hitting the NMI/crash.

If you notice the stack referred in the commit, you could see that
a bunch of old vpd access functions, pci_vpd_pci22*, were referred.
When these functions were in use (these functions were renamed
around 2016 Feb by f1cd93f9aabe), there was a critical VPD
access bug missing; missing in fact most of the life of those
functions.

This one:
    104daa71b396: PCI: Determine actual VPD size on first access

Without this patch, a read of the vpd sysfs file can access area
outside of VPD space, which is not allowed by the spec.

My guess here is that, Ethan, when trying to access VPD of the QLogic
1077/2261 adapter, was using a kernel that had the bug present and
it led up to the NMI/crash he was observing on his machine.

We had an early firmware that returned CA on VPD access beyond
bounds that is known to NMI some servers. The FW has since changed to
not clear the VPD flag upon out-of-bound access.

In light of the above, plus the fact that I did try the
experiment on multiple setups and was not able to reproduce the
issue, would you be willing to revert the above patch? If so, I
could send a git revert (or equivalent) patch of the commit.

This blacklisting is preventing multiple customers from accessing
the VPD area of the said production adapter and making their life
a bit difficult.

Regards,
-Arun

Old discussion of the topic:
    https://lkml.org/lkml/2019/5/21/991

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2020-12-19  0:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-12-17  0:44 VPD blacklist of Marvell QLogic 1077/2261 Arun Easi
2020-12-17 17:48 ` Bjorn Helgaas
2020-12-19  0:27   ` [EXT] " Arun Easi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox