From: Bjorn Helgaas <helgaas@kernel.org>
To: Andre Przywara <andre.przywara@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
Andrew Murray <andrew.murray@arm.com>,
"Rafael J . Wysocki" <rjw@rjwysocki.net>,
Len Brown <lenb@kernel.org>, Will Deacon <will@kernel.org>,
Catalin Marinas <catalin.marinas@arm.com>,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
linux-acpi@vger.kernel.org
Subject: Re: [PATCH] pcie: Add quirk for the Arm Neoverse N1SDP platform
Date: Tue, 10 Dec 2019 08:41:15 -0600 [thread overview]
Message-ID: <20191210144115.GA94877@google.com> (raw)
In-Reply-To: <20191209160638.141431-1-andre.przywara@arm.com>
On Mon, Dec 09, 2019 at 04:06:38PM +0000, Andre Przywara wrote:
> From: Deepak Pandey <Deepak.Pandey@arm.com>
>
> The Arm N1SDP SoC suffers from some PCIe integration issues, most
> prominently config space accesses to not existing BDFs being answered
> with a bus abort, resulting in an SError.
Can we tease this apart a little more? Linux doesn't program all the
bits that control error signaling, so even on hardware that works
perfectly, much of this behavior is determined by what firmware did.
I wonder if Linux could be more careful about this.
"Bus abort" is not a term used in PCIe. IIUC, a config read to a
device that doesn't exist should terminate with an Unsupported Request
completion, e.g., see the implementation note in PCIe r5.0 sec 2.3.1.
The UR should be an uncorrectable non-fatal error (Table 6-5), and
Figures 6-2 and 6-3 show how it should be handled and when it should
be signaled as a system error. In case you don't have a copy of the
spec, I extracted those two figures and put them at [1].
Can you collect "lspci -vvxxx" output to see if we can correlate it
with those figures and the behavior you see?
[1] https://drive.google.com/file/d/1ihhdQvr0a7ZEJG-3gPddw1Tq7cTFAsah/view?usp=sharing
> To mitigate this, the firmware scans the bus before boot (catching the
> SErrors) and creates a table with valid BDFs, which acts as a filter for
> Linux' config space accesses.
>
> Add code consulting the table as an ACPI PCIe quirk, also register the
> corresponding device tree based description of the host controller.
> Also fix the other two minor issues on the way, namely not being fully
> ECAM compliant and config space accesses being restricted to 32-bit
> accesses only.
As I'm sure you've noticed, controllers that support only 32-bit
config writes are not spec compliant and devices may not work
correctly. The comment in pci_generic_config_write32() explains why.
You may not trip over this problem frequently, but I wouldn't call it
a "minor" issue because when you *do* trip over it, you have no
indication that a register was corrupted.
Even ECAM compliance is not really minor -- if this controller were
fully compliant with the spec, you would need ZERO Linux changes to
support it. Every quirk like this means additional maintenance
burden, and it's not just a one-time thing. It means old kernels that
*should* "just work" on your system will not work unless somebody
backports the quirk.
> This allows the Arm Neoverse N1SDP board to boot Linux without crashing
> and to access *any* devices (there are no platform devices except UART).
WARNING: multiple messages have this Message-ID (diff)
From: Bjorn Helgaas <helgaas@kernel.org>
To: Andre Przywara <andre.przywara@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
"Rafael J . Wysocki" <rjw@rjwysocki.net>,
linux-kernel@vger.kernel.org, linux-acpi@vger.kernel.org,
linux-pci@vger.kernel.org, Andrew Murray <andrew.murray@arm.com>,
Will Deacon <will@kernel.org>,
linux-arm-kernel@lists.infradead.org, Len Brown <lenb@kernel.org>
Subject: Re: [PATCH] pcie: Add quirk for the Arm Neoverse N1SDP platform
Date: Tue, 10 Dec 2019 08:41:15 -0600 [thread overview]
Message-ID: <20191210144115.GA94877@google.com> (raw)
In-Reply-To: <20191209160638.141431-1-andre.przywara@arm.com>
On Mon, Dec 09, 2019 at 04:06:38PM +0000, Andre Przywara wrote:
> From: Deepak Pandey <Deepak.Pandey@arm.com>
>
> The Arm N1SDP SoC suffers from some PCIe integration issues, most
> prominently config space accesses to not existing BDFs being answered
> with a bus abort, resulting in an SError.
Can we tease this apart a little more? Linux doesn't program all the
bits that control error signaling, so even on hardware that works
perfectly, much of this behavior is determined by what firmware did.
I wonder if Linux could be more careful about this.
"Bus abort" is not a term used in PCIe. IIUC, a config read to a
device that doesn't exist should terminate with an Unsupported Request
completion, e.g., see the implementation note in PCIe r5.0 sec 2.3.1.
The UR should be an uncorrectable non-fatal error (Table 6-5), and
Figures 6-2 and 6-3 show how it should be handled and when it should
be signaled as a system error. In case you don't have a copy of the
spec, I extracted those two figures and put them at [1].
Can you collect "lspci -vvxxx" output to see if we can correlate it
with those figures and the behavior you see?
[1] https://drive.google.com/file/d/1ihhdQvr0a7ZEJG-3gPddw1Tq7cTFAsah/view?usp=sharing
> To mitigate this, the firmware scans the bus before boot (catching the
> SErrors) and creates a table with valid BDFs, which acts as a filter for
> Linux' config space accesses.
>
> Add code consulting the table as an ACPI PCIe quirk, also register the
> corresponding device tree based description of the host controller.
> Also fix the other two minor issues on the way, namely not being fully
> ECAM compliant and config space accesses being restricted to 32-bit
> accesses only.
As I'm sure you've noticed, controllers that support only 32-bit
config writes are not spec compliant and devices may not work
correctly. The comment in pci_generic_config_write32() explains why.
You may not trip over this problem frequently, but I wouldn't call it
a "minor" issue because when you *do* trip over it, you have no
indication that a register was corrupted.
Even ECAM compliance is not really minor -- if this controller were
fully compliant with the spec, you would need ZERO Linux changes to
support it. Every quirk like this means additional maintenance
burden, and it's not just a one-time thing. It means old kernels that
*should* "just work" on your system will not work unless somebody
backports the quirk.
> This allows the Arm Neoverse N1SDP board to boot Linux without crashing
> and to access *any* devices (there are no platform devices except UART).
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2019-12-10 14:41 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-09 16:06 [PATCH] pcie: Add quirk for the Arm Neoverse N1SDP platform Andre Przywara
2019-12-09 16:06 ` Andre Przywara
2019-12-09 16:26 ` Will Deacon
2019-12-09 16:26 ` Will Deacon
2019-12-18 2:21 ` Jon Masters
2019-12-18 2:21 ` Jon Masters
2019-12-18 10:22 ` Andre Przywara
2019-12-18 10:22 ` Andre Przywara
2019-12-18 13:53 ` Marc Zyngier
2019-12-18 13:53 ` Marc Zyngier
2019-12-10 14:41 ` Bjorn Helgaas [this message]
2019-12-10 14:41 ` Bjorn Helgaas
2019-12-11 11:00 ` Andre Przywara
2019-12-11 11:00 ` Andre Przywara
2019-12-11 20:17 ` Bjorn Helgaas
2019-12-11 20:17 ` Bjorn Helgaas
2019-12-12 11:05 ` Andre Przywara
2019-12-12 11:05 ` Andre Przywara
2019-12-12 13:44 ` Robin Murphy
2019-12-12 13:44 ` Robin Murphy
2019-12-13 21:07 ` Bjorn Helgaas
2019-12-13 21:07 ` Bjorn Helgaas
2019-12-18 15:07 ` Andre Przywara
2019-12-18 15:07 ` Andre Przywara
2019-12-12 21:07 ` Andrew Murray
2019-12-12 21:07 ` Andrew Murray
2019-12-13 14:39 ` Andre Przywara
2019-12-13 14:39 ` Andre Przywara
2019-12-12 12:37 ` Andrew Murray
2019-12-12 12:37 ` Andrew Murray
2019-12-13 14:23 ` Andre Przywara
2019-12-13 14:23 ` Andre Przywara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191210144115.GA94877@google.com \
--to=helgaas@kernel.org \
--cc=andre.przywara@arm.com \
--cc=andrew.murray@arm.com \
--cc=catalin.marinas@arm.com \
--cc=lenb@kernel.org \
--cc=linux-acpi@vger.kernel.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lorenzo.pieralisi@arm.com \
--cc=rjw@rjwysocki.net \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.