From: Rob Herring <robh+dt@kernel.org>
To: James Morse <james.morse@arm.com>
Cc: linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Mark Rutland <mark.rutland@arm.com>,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
Sudeep Holla <sudeep.holla@arm.com>,
Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>,
Andre Przywara <andre.przywara@arm.com>
Subject: Re: [PATCH 1/6] dt-bindings: firmware: Add arm,errata-management
Date: Mon, 3 Apr 2023 10:45:51 -0500 [thread overview]
Message-ID: <CAL_JsqKQRNmJkSTio+h6C92dEUoSimoGbmJQ4dMVW_ZAeRku7A@mail.gmail.com> (raw)
In-Reply-To: <cbbcd0e3-7269-315f-af13-208a3282f17b@arm.com>
On Fri, Mar 31, 2023 at 11:59 AM James Morse <james.morse@arm.com> wrote:
>
> Hi Rob,
>
> On 31/03/2023 14:46, Rob Herring wrote:
> > On Thu, Mar 30, 2023 at 11:52 AM James Morse <james.morse@arm.com> wrote:
> >> The Errata Management SMCCC interface allows firmware to advertise whether
> >> the OS is affected by an erratum, or if a higher exception level has
> >> mitigated the issue. This allows properties of the device that are not
> >> discoverable by the OS to be described. e.g. some errata depend on the
> >> behaviour of the interconnect, which is not visible to the OS.
> >>
> >> Deployed devices may find it significantly harder to update EL3
> >> firmware than the device tree. Erratum workarounds typically have to
> >> fail safe, and assume the platform is affected putting correctness
> >> above performance.
> >
> > Updating the DT is still harder than updating the kernel. A well
> > designed binding allows for errata fixes without DT changes. That
> > generally means specific compatibles up front rather than adding
> > properties to turn things on/off.
>
> I started with a per-erratum identifier, but there are 8 of them, and its hard to know
> where to put it.
That's still requiring updating the DT to fix things.
> The CPU side is detectable from the MIDR,its an interconnect property
> that we need to know ... but the interconnect isn't described in the DT. (but the obvious
> compatible string identifies the PMU)
But the interconnect could be described. In fact, there's a binding
for such things already. Surprisingly, it's called 'interconnects'...
Of course, there are lots of interconnects in SoCs and the one you
need may not be described ('cause it is invisible to s/w (until it's
not)). There's a binding going back to the CCI-400 in fact. So it
isn't really that interconnects aren't described, it's that they
aren't consistently described.
If you can add this errata table to the DT, then why not add
describing the interconnect? Then it will be there for the next thing
we need the interconnect for. I imagine some of the interconnects are
already described if not explicitly in bits and pieces (i.e. clocks or
power domains for the interconnect get tossed into some other node).
> > Do we really want to encourage/endorse/support non-updatable firmware?
> > We've rejected plenty of things with 'fix your firmware'.
>
> A DT property was explicitly requested by Marc Z on the RFC:
> https://lore.kernel.org/linux-arm-kernel/86mt5dxxbc.wl-maz@kernel.org/
>
> The concern is that platforms where the CPU is affected, but the issue is masked by the
> interconnect will never bother with a firmware interface. The kernel can't know this, so
> has to enable the workaround at the cost of performance.
Sure it can. Worst case, the kernel knows the exact SoC and board it
is running on because we have root level compatibles for those. It's
just a question of whether using those would scale or not. Whether
that scales or not depends on both how long the lists are and how
distributed the implementation is (e.g. PCI quirks). More on that
below.
> Adding a DT property to describe the hardware state of the erratum avoids the need for
> separate kernel builds to work around the missing firmware.
DT is not the kernel's runtime configuration mechanism. That assumes a
tight coupling of the DT and kernel. That may work for some cases like
Android with kernel and DT updated together, but for upstream we can't
assume that coupling and must treat it as an ABI (not an issue for
this case).
The kernel command line is a runtime config mechanism for the kernel.
So why not use only that? There's certainly some room for improvement
with handling it. For example, it's not well defined with what happens
with 'bootargs' as you go from a dtb to bootloader to kernel in terms
of overriding/prepending/appending, but that could be addressed.
> >> Instead of adding a device-tree entry for any CPU errata that is
> >> relevant (or not) to the platform, allow the device-tree to describe
> >> firmware's responses for the SMCCC interface. This could be used as
> >> the data source for the firmware interface, or be parsed by the OS if
> >> the firmware interface is missing.
>
> > What's to prevent vendors from only using the DT mechanism and never
> > supporting the SMCCC interface? I'm sure some will love to not have to
> > make a firmware update when they can just fix it in DT.
>
> The firmware interface has to exist for ACPI systems where EL3 can't influence the ACPI
> tables, which typically get replaced if the part is OEM'd.
>
> Ultimately all the interface is describing is a non-discoverable hardware property
> relevant to an erratum. These are often configurations the silicon manufacturer chooses.
> In this case its the behaviour of the interconnect.
>
> If we didn't have to support ACPI systems, this stuff would only have been in the DT. With
With...?
I fail to see what ACPI has to do with DT platforms adopting the SMCCC
interface or not.
> > The downside to this extendable binding compared to simple properties
> > is parsing a flat tree is slow and more complicated. So it may be
> > difficult to support if you need this early in boot.
>
> I do need this early in the boot, but I'm not worried about the delay as these tables
> should be small.
>
>
> >> Most errata can be detected from CPU id registers. These mechanisms
> >> are only needed for the rare cases that external knowledge is needed.
> >
> > And also have significant performance impact. In the end, how many
> > platforms are there that can't fix these in firmware and need a
> > mainline/distro kernel optimized to avoid the work-around. I suspect
> > the number is small enough it could be a list in the kernel.
>
> At a guess, its all mobile phones produced in the last 2 years that want to run a version
> of Android that uses virtualisation. Cortex-A78 is affected, but I don't know when the
> first products were shipped.
How many run mainline and run it well enough to even care about the
optimization yet?
Even if we went with the above list, that's 2 years x 4 vendors (QCom,
Mediatek, Samsung, Google) x 1-2 Soc (high and mid tier). Subtract out
any vendors capable of updating their firmware. So a worst case list
of potentially 8-16 SoCs? It shouldn't grow because anything newer is
going to implement the SMCCC interface, right? That's not unmanageable
in my book.
Rob
next prev parent reply other threads:[~2023-04-03 15:47 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-30 16:51 [PATCH 0/6] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
2023-03-30 16:51 ` [PATCH 1/6] dt-bindings: firmware: Add arm,errata-management James Morse
2023-03-30 20:50 ` Rob Herring
2023-03-31 8:29 ` Krzysztof Kozlowski
2023-03-31 16:58 ` James Morse
2023-04-03 9:15 ` Krzysztof Kozlowski
2023-04-03 12:05 ` Marc Zyngier
2023-03-31 13:46 ` Rob Herring
2023-03-31 16:58 ` James Morse
2023-04-03 15:45 ` Rob Herring [this message]
2023-04-04 15:19 ` James Morse
2023-03-30 16:51 ` [PATCH 2/6] firmware: smccc: Add support for erratum discovery API James Morse
2023-03-30 20:34 ` kernel test robot
2023-03-30 16:51 ` [PATCH 3/6] arm64: cputype: Add new part numbers for Cortex-X3, and Neoverse-V2 James Morse
2023-03-30 16:51 ` [PATCH 4/6] arm64: errata: Disable FWB on parts with non-ARM interconnects James Morse
2023-03-30 16:51 ` [PATCH 5/6] firmware: smccc: Allow errata management to be overridden by device tree James Morse
2023-03-30 20:44 ` kernel test robot
2023-03-31 17:05 ` James Morse
2023-03-30 16:51 ` [PATCH 6/6] arm64: errata: Add a commandline option to enable/disable #2701951 James Morse
2023-03-31 12:57 ` [PATCH 0/6] arm64: errata: Disable FWB on parts with non-ARM interconnects Rob Herring
2023-03-31 13:03 ` Rob Herring
2023-05-11 17:15 ` Catalin Marinas
2023-05-11 18:42 ` Marc Zyngier
2023-05-11 21:13 ` Catalin Marinas
2023-05-16 16:29 ` James Morse
2023-05-23 12:24 ` Robin Murphy
2023-05-23 10:48 ` Will Deacon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAL_JsqKQRNmJkSTio+h6C92dEUoSimoGbmJQ4dMVW_ZAeRku7A@mail.gmail.com \
--to=robh+dt@kernel.org \
--cc=andre.przywara@arm.com \
--cc=catalin.marinas@arm.com \
--cc=devicetree@vger.kernel.org \
--cc=james.morse@arm.com \
--cc=krzysztof.kozlowski+dt@linaro.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=lpieralisi@kernel.org \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=oliver.upton@linux.dev \
--cc=sudeep.holla@arm.com \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).