From: Andrew Murray <andrew.murray@arm.com>
To: "Chocron, Jonathan" <jonnyc@amazon.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"robh+dt@kernel.org" <robh+dt@kernel.org>,
"jingoohan1@gmail.com" <jingoohan1@gmail.com>,
"Woodhouse, David" <dwmw@amazon.co.uk>,
"Hanoch, Uri" <hanochu@amazon.com>,
"devicetree@vger.kernel.org" <devicetree@vger.kernel.org>,
"lorenzo.pieralisi@arm.com" <lorenzo.pieralisi@arm.com>,
"gustavo.pimentel@synopsys.com" <gustavo.pimentel@synopsys.com>,
"Wasserstrom, Barak" <barakw@amazon.com>,
"Saidi, Ali" <alisaidi@amazon.com>,
"mark.rutland@arm.com" <mark.rutland@arm.com>,
"Hawa, Hanna" <hhhawa@amazon.com>,
"Shenhar, Talel" <talel@amazon.com>,
"Krupnik, Ronen" <ronenk@amazon.com>,
"bhelgaas@google.com" <bhelgaas@google.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
benh@kernel.crashing
Subject: Re: [PATCH v3 4/8] PCI: Add quirk to disable MSI-X support for Amazon's Annapurna Labs Root Port
Date: Tue, 20 Aug 2019 20:54:01 +0100 [thread overview]
Message-ID: <20190820195400.GH23903@e119886-lin.cambridge.arm.com> (raw)
In-Reply-To: <87417da8ccea10b84c1a968700ef2ff783c751be.camel@amazon.com>
On Tue, Aug 20, 2019 at 05:06:22PM +0000, Chocron, Jonathan wrote:
> On Tue, 2019-08-20 at 16:25 +0100, Andrew Murray wrote:
> > On Tue, Aug 20, 2019 at 02:52:30PM +0000, Chocron, Jonathan wrote:
> > > On Mon, 2019-08-19 at 19:23 +0100, Andrew Murray wrote:
> > > > On Tue, Jul 23, 2019 at 12:25:29PM +0300, Jonathan Chocron wrote:
> > > > > The Root Port (identified by [1c36:0032]) doesn't support MSI-
> > > > > X. On
> > > > > some
> > > >
> > > > Shouldn't this read [1c36:0031]?
> > > >
> > >
> > > Indeed. Thanks for catching this.
> > >
> > > >
> > > > > platforms it is configured to not advertise the capability at
> > > > > all,
> > > > > while
> > > > > on others it (mistakenly) does. This causes a panic during
> > > > > initialization by the pcieport driver, since it tries to
> > > > > configure
> > > > > the
> > > > > MSI-X capability. Specifically, when trying to access the MSI-X
> > > > > table
> > > > > a "non-existing addr" exception occurs.
> > > > >
> > > > > Example stacktrace snippet:
> > > > >
> > > > > [ 1.632363] SError Interrupt on CPU2, code 0xbf000000 --
> > > > > SError
> > > > > [ 1.632364] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-
> > > > > rc1-
> > > > > Jonny-14847-ge76f1d4a1828-dirty #33
> > > > > [ 1.632365] Hardware name: Annapurna Labs Alpine V3 EVP (DT)
> > > > > [ 1.632365] pstate: 80000005 (Nzcv daif -PAN -UAO)
> > > > > [ 1.632366] pc : __pci_enable_msix_range+0x4e4/0x608
> > > > > [ 1.632367] lr : __pci_enable_msix_range+0x498/0x608
> > > > > [ 1.632367] sp : ffffff80117db700
> > > > > [ 1.632368] x29: ffffff80117db700 x28: 0000000000000001
> > > > > [ 1.632370] x27: 0000000000000001 x26: 0000000000000000
> > > > > [ 1.632372] x25: ffffffd3e9d8c0b0 x24: 0000000000000000
> > > > > [ 1.632373] x23: 0000000000000000 x22: 0000000000000000
> > > > > [ 1.632375] x21: 0000000000000001 x20: 0000000000000000
> > > > > [ 1.632376] x19: ffffffd3e9d8c000 x18: ffffffffffffffff
> > > > > [ 1.632378] x17: 0000000000000000 x16: 0000000000000000
> > > > > [ 1.632379] x15: ffffff80116496c8 x14: ffffffd3e9844503
> > > > > [ 1.632380] x13: ffffffd3e9844502 x12: 0000000000000038
> > > > > [ 1.632382] x11: ffffffffffffff00 x10: 0000000000000040
> > > > > [ 1.632384] x9 : ffffff801165e270 x8 : ffffff801165e268
> > > > > [ 1.632385] x7 : 0000000000000002 x6 : 00000000000000b2
> > > > > [ 1.632387] x5 : ffffffd3e9d8c2c0 x4 : 0000000000000000
> > > > > [ 1.632388] x3 : 0000000000000000 x2 : 0000000000000000
> > > > > [ 1.632390] x1 : 0000000000000000 x0 : ffffffd3e9844680
> > > > > [ 1.632392] Kernel panic - not syncing: Asynchronous SError
> > > > > Interrupt
> > > > > [ 1.632393] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.2.0-
> > > > > rc1-
> > > > > Jonny-14847-ge76f1d4a1828-dirty #33
> > > > > [ 1.632394] Hardware name: Annapurna Labs Alpine V3 EVP (DT)
> > > > > [ 1.632394] Call trace:
> > > > > [ 1.632395] dump_backtrace+0x0/0x140
> > > > > [ 1.632395] show_stack+0x14/0x20
> > > > > [ 1.632396] dump_stack+0xa8/0xcc
> > > > > [ 1.632396] panic+0x140/0x334
> > > > > [ 1.632397] nmi_panic+0x6c/0x70
> > > > > [ 1.632398] arm64_serror_panic+0x74/0x88
> > > > > [ 1.632398] __pte_error+0x0/0x28
> > > > > [ 1.632399] el1_error+0x84/0xf8
> > > > > [ 1.632400] __pci_enable_msix_range+0x4e4/0x608
> > > > > [ 1.632400] pci_alloc_irq_vectors_affinity+0xdc/0x150
> > > > > [ 1.632401] pcie_port_device_register+0x2b8/0x4e0
> > > > > [ 1.632402] pcie_portdrv_probe+0x34/0xf0
> > > > >
> > > > > Signed-off-by: Jonathan Chocron <jonnyc@amazon.com>
> > > > > Reviewed-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
> > > > > ---
> > > > > drivers/pci/quirks.c | 15 +++++++++++++++
> > > > > 1 file changed, 15 insertions(+)
> > > > >
> > > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > > index 23672680dba7..11f843aa96b3 100644
> > > > > --- a/drivers/pci/quirks.c
> > > > > +++ b/drivers/pci/quirks.c
> > > > > @@ -2925,6 +2925,21 @@
> > > > > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0x10a1,
> > > > > quirk_msi_intx_disable_qca_bug);
> > > > > DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATTANSIC, 0xe091,
> > > > > quirk_msi_intx_disable_qca_bug);
> > > > > +
> > > > > +/*
> > > > > + * Amazon's Annapurna Labs 1c36:0031 Root Ports don't support
> > > > > MSI-
> > > > > X, so it
> > > > > + * should be disabled on platforms where the device
> > > > > (mistakenly)
> > > > > advertises it.
> > > > > + *
> > > > > + * The 0031 device id is reused for other non Root Port device
> > > > > types,
> > > > > + * therefore the quirk is registered for the
> > > > > PCI_CLASS_BRIDGE_PCI
> > > > > class.
> > > > > + */
> > > > > +static void quirk_al_msi_disable(struct pci_dev *dev)
> > > > > +{
> > > > > + dev->no_msi = 1;
> > > > > + pci_warn(dev, "Disabling MSI-X\n");
> > > >
> > > > This will disable both MSI and MSI-X support - is this really the
> > > > intention
> > > > here? Do the root ports support MSI and legacy, or just legacy?
> > > >
> > >
> > > The HW should support MSI, but we currently don't have a use case
> > > for
> > > it so it hasn't been tested and therefore we are okay with
> > > disabling
> > > it.
> >
> > OK - then the commit message and comment (for quirk_al_msi_disable)
> > should
> > probably be updated to reflect this.
>
> If you mean that we should explicitly state that MSI is possibly
> supported then I don't think it is very relevant here. If we decide to
> test and enable it in the future, then the new quirk (which would only
> disable MSI-x) would reflect the fact that MSI works. Sounds ok?
>
My thought process is that 1c36:0031 probably advertises an MSI capability
reflecting that the device probably supports MSI - if there is no valid
reason for disabling MSI support then we shouldn't disable it.
It doesn't make sense to me to disable it just because that's the path of
least resistance - a better course of action would be to add support
to disable MSI-X in the kernel.
The kernel panic generated in attempting to enable MSI-X shows that this
device has a user for its MSI/MSI-X interrupts. Thus if you disabled only
MSI-X the PCIe port driver would use MSI interrupts instead - Due to
"no_msi = 1" I guess it will be using legacy.
> > Especially given that the device id
> > may be used on other device types - implying that a use-case for this
> > may later arise.
> >
> Not sure I understood that last line. This patch is relevant only for
> the BRIDGE class 0x0031 device. The other device types, which (sadly)
> happen to re-use the dev_id, have are totally different and shouldn't
> be mixed in here.
>
It's probably unlikely, but if 1c36:0031 was the device ID given to a
future incarnation of this hardware in a PCIe switch (which has BRIDGE
class), then using legacy interrupts instead of MSI wouldn't be ideal
and an un-necessary limitation.
This is why adding a comment to indicate that MSI may work but hasn't
been tested may be helpful for others in the future.
> > >
> > > For future knowledge, how can just MSI-X be disabled?
> > > I see that in pcie_port_enable_irq_vec(), the pcieport driver calls
> > > pci_alloc_irq_vectors() with PCI_IRQ_MSIX | PCI_IRQ_MSI. And
> > > internally, both __pci_enable_msix_range() and
> > > __pci_enable_msi_range()
> > > use pci_msi_supported() which doesn't differentiate between MSI and
> > > MSI-x.
> >
> > The documentation [1] would suggest that once upon a time
> > pci_disable_msix
> > was used - but now should let pci_alloc_irq_vectors cap the max
> > number
> > of vectors. However in your case it's the PCIe port driver that is
> > attempting
> > to allocate MSI-X's and so the solution isn't an obvious one.
> >
> > Setting dev->msix_enabled to false (i.e. through pci_disable_msix)
> > would
> > result in an un-necessary WARN_ON_ONCE.
>
> Per my understanding, it seems that dev->msix_enabled indicates if the
> MSI-X capability has already been enabled or not, and not as an
> indication if it is supported by the device. If that is the case, then
> not sure pci_disable_msix() would result in the desired effect.
Yes this isn't the right approach.
>
> > I think you'd need to ensure
> > devi->msix_cap is NULL (which makes sense as your hardware shouldn't
> > be
> > exposing this capability).
> >
> > I guess the right way of achieving this would be through a quirk,
> > though you'd
> > be the first to do this and you'd have to ensure the quirk runs
> > before
> > anyone tests for msix_cap.
> >
> > That's my view, though others may have different suggestions.
> >
> I think that maybe a dev->no_msix field should be added and then a
> corresponding pci_msix_supported() function as well. I'd be glad to
> take it offline or on a dedicated thread, but it shouldn't block this
> patchset.
Thanks,
Andrew Murray
>
> > [1] Documentation/PCI/msi-howto.rst
> >
> > Thanks,
> >
> > Andrew Murray
> >
> > >
> > > > Thanks,
> > > >
> > > > Andrew Murray
> > > >
> > > > > +}
> > > > > +DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMAZON_ANNAPURNA_L
> > > > > ABS,
> > > > > 0x0031,
> > > > > + PCI_CLASS_BRIDGE_PCI, 8,
> > > > > quirk_al_msi_disable);
> > > > > #endif /* CONFIG_PCI_MSI */
> > > > >
> > > > > /*
> > > > > --
> > > > > 2.17.1
> > > > >
next prev parent reply other threads:[~2019-08-20 19:54 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-23 9:25 [PATCH v3 0/8] Amazon's Annapurna Labs DT-based PCIe host controller driver Jonathan Chocron
2019-07-23 9:25 ` [PATCH v3 1/8] PCI: Add Amazon's Annapurna Labs vendor ID Jonathan Chocron
2019-08-19 17:51 ` Andrew Murray
2019-07-23 9:25 ` [PATCH v3 2/8] PCI: Add ACS quirk for Amazon Annapurna Labs root ports Jonathan Chocron
2019-07-23 9:25 ` [PATCH v3 3/8] PCI/VPD: Add VPD release quirk for Amazon's Annapurna Labs Root Port Jonathan Chocron
2019-07-23 9:25 ` [PATCH v3 4/8] PCI: Add quirk to disable MSI-X support " Jonathan Chocron
2019-08-19 18:23 ` Andrew Murray
2019-08-20 14:52 ` Chocron, Jonathan
2019-08-20 15:25 ` Andrew Murray
2019-08-20 17:06 ` Chocron, Jonathan
2019-08-20 19:54 ` Andrew Murray [this message]
2019-08-21 10:30 ` Chocron, Jonathan
2019-07-23 9:27 ` [PATCH v3 5/8] dt-bindings: PCI: Add Amazon's Annapurna Labs PCIe host bridge binding Jonathan Chocron
2019-08-13 15:30 ` Rob Herring
2019-08-13 16:48 ` Chocron, Jonathan
2019-08-13 20:46 ` Rob Herring
2019-08-19 16:15 ` Chocron, Jonathan
2019-07-23 9:27 ` [PATCH v3 6/8] PCI: al: Add support for DW based driver type Jonathan Chocron
2019-07-23 9:40 ` Gustavo Pimentel
2019-08-12 17:03 ` Lorenzo Pieralisi
2019-08-14 11:39 ` Chocron, Jonathan
2019-07-23 9:27 ` [PATCH v3 7/8] PCI: dw: Add validation that PCIe core is set to correct mode Jonathan Chocron
2019-07-23 9:27 ` [PATCH v3 8/8] PCI: dw: Add support for PCI_PROBE_ONLY/PCI_REASSIGN_ALL_BUS flags Jonathan Chocron
2019-08-07 16:36 ` Lorenzo Pieralisi
2019-08-08 9:30 ` Chocron, Jonathan
2019-08-08 10:58 ` Lorenzo Pieralisi
2019-08-12 18:05 ` Chocron, Jonathan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190820195400.GH23903@e119886-lin.cambridge.arm.com \
--to=andrew.murray@arm.com \
--cc=alisaidi@amazon.com \
--cc=barakw@amazon.com \
--cc=benh@kernel.crashing \
--cc=bhelgaas@google.com \
--cc=devicetree@vger.kernel.org \
--cc=dwmw@amazon.co.uk \
--cc=gustavo.pimentel@synopsys.com \
--cc=hanochu@amazon.com \
--cc=hhhawa@amazon.com \
--cc=jingoohan1@gmail.com \
--cc=jonnyc@amazon.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lorenzo.pieralisi@arm.com \
--cc=mark.rutland@arm.com \
--cc=robh+dt@kernel.org \
--cc=ronenk@amazon.com \
--cc=talel@amazon.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).