* Kernels on Bad Firmware (was Re: kernel entry for thumb2-only cpus)
From: Matt Sealey @ 2012-08-08 17:36 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Aug 8, 2012 at 10:33 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Tue, Aug 07, 2012 at 05:34:15PM -0500, Matt Sealey wrote:
>> Just because there is a ton of absolutely awful, broken firmware code out there
>> doesn't mean it can and will always be the case, and Linux policy should not
>> be dictated on an architecture basis on a few bad eggs, especially if it means
>> developers in the big wide world have to jump through hoops. Surely it is just
>> as good for Linux to loudly advocate the correct solutions in firmware and
>> implement the workarounds anyway, like a device quirk, rather than just out and
>> out say "firmwares suck, ignore what they did, do it again. We don't trust them
>> and never will."
>
> Sadly, firmware developers have taught us time and time and time and time
> again that they can not be trusted. You may be the single one who can be
> trusted to validate their stuff properly, but you would be in a severe
> minority if that is true.
>
> Many firmware developers do the barest minimum that's required. Once their
> job is done, that's the end of the development cycle and nothing further
> happens, not even for bugs.
>
> We've seen this time and time again. We see it with corrupted ATAG lists,
> we see it with bad memory tags passed to the kernel that the kernel has to
> then screw around with to fix up the broken firmware developers crap.
> Let me show you the crappy workarounds that we have:
No need, we're guilty of that too; line 2600 onwards of
arch/powerpc/kernel/prom_init.c
However we do provide a Forth script that can run the kernel that does
all this, but nobody
would accept removing the fixups from the kernel (an old patch is
inside the zip archive..).
http://www.powerdeveloper.org/platforms/efika/devicetree
We definitely learned from our mistakes - nasty firmware shipped before Linux
had properly implemented device trees for that SoC, and many, many user
problems that just caused support and update hassle. It costs money to support
bad code, far more than it costs to get it right in the first place.
The fact we could update it with a script was the awesome thing about using
OpenFirmware - but U-Boot can do this too, since libfdt is there and it's one
option to enable it to allow script based modification of the blob. If the DT
is hardcoded into the firmware somehow (CONFIG_OF_CONTROL I think) then
platforms can load "boot.scr" from the root filesystem (cramfs or jffs2 or
ubifs if necessary) or similar and fix their device trees in-place, and if
they're loaded from a filesystem and "known bad", fix them after loading them.
Highly embedded platforms might make this clumsy, but it can ALL be done
there.. one problem is, this is also way, way too late to do pin muxing :]
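As a rough illustration of the scripted fixup idea, here is a toy model in
Python. It is only a sketch under stated assumptions: the tree is a plain
nested dict (properties and child nodes share one namespace), and the
function name, the "set"/"rm" operations and the example values are all
hypothetical. It is not U-Boot or libfdt code; on a real board the same job
would be done from a boot script with U-Boot's fdt command.

```python
def apply_fixups(tree, fixups):
    """Apply (op, path, prop, value) fixups, in order, to a nested-dict tree.

    Toy model only: nodes are dicts, and intermediate nodes named in a path
    are created on demand so a fixup can also add something that is missing.
    """
    for op, path, prop, value in fixups:
        node = tree
        for name in filter(None, path.split("/")):
            node = node.setdefault(name, {})
        if op == "set":
            node[prop] = value        # add or overwrite a property
        elif op == "rm":
            node.pop(prop, None)      # drop a property if it is present
        else:
            raise ValueError(f"unknown fixup op: {op}")
    return tree


# A "known bad" tree as it might come out of firmware (made-up values).
known_bad = {"memory": {"reg": (0, 0)}, "chosen": {"bogus": 1}}

fixed = apply_fixups(known_bad, [
    ("set", "/memory", "reg", (0x00000000, 0x08000000)),
    ("rm", "/chosen", "bogus", None),
])
```

The point of keeping the fixups as data is that the list can live in a
script on the filesystem and be updated without reflashing the firmware.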
On UEFI, updates to the device tree or ACPI DSDT could be done via a small EBC
loader that chained to the next boot device. By the way, did anyone make any
inroads into what the real spec for booting from UEFI should be? I noticed the
Beagle TianoCore "only" supports zImage, which is infuriating as this flouts
the existing EFI/UEFI standard. Wrapping the kernel in a PE/COFF image would
not be all that hard, and then what you get is the ability to write the exact
nature of the entry point into the architecture id field of the header (the
latest spec from 2010 or so includes ARMv7 Thumb2).
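To make the header field concrete: in PE/COFF the architecture id is the
16-bit Machine field at the start of the COFF header. The Python sketch
below reads it. The field offsets and machine values follow the PE/COFF
format; the function names and the stub builder are made up for the demo,
and the stub is not a loadable image.

```python
import struct

# Machine values defined by the PE/COFF format.
PE_MACHINE_NAMES = {
    0x014C: "x86 (i386)",
    0x01C0: "ARM (little endian)",
    0x01C2: "ARM Thumb / mixed ARM-Thumb",
    0x01C4: "ARM Thumb-2 (little endian)",
    0x8664: "x86-64",
}


def pe_machine_type(image: bytes) -> int:
    """Return the 16-bit Machine field of a PE/COFF image."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ (DOS) signature")
    # Offset 0x3C holds a 32-bit little-endian offset to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The COFF header follows the signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", image, pe_offset + 4)
    return machine


def make_stub_header(machine: int) -> bytes:
    """Build a minimal fake header for demonstration purposes only."""
    pe_offset = 0x40
    dos = bytearray(pe_offset)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, pe_offset)
    # 20-byte COFF header: Machine (2 bytes) plus 18 bytes left as zero.
    return bytes(dos) + b"PE\0\0" + struct.pack("<H", machine) + bytes(18)


stub = make_stub_header(0x01C2)
machine_name = PE_MACHINE_NAMES[pe_machine_type(stub)]
```

A loader that checks this field can refuse an image built for the wrong
instruction set before it ever jumps to the entry point.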
> Spot a pattern there? The one which stands out to me is that boot loaders
> can not get the trivial task of passing the simple information about where
> the RAM is in the system to the kernel right.
>
> Many boot loaders for _years_ have not been able to get the very very
> trivially simple issue of passing the right machine ID value in r1 to the
> kernel either.
Also guilty, although we inherited that problem from a large Taiwanese
manufacturer who hacked U-Boot to load their machine ID from the filesystem
(type_id.bin) before being able to boot. Otherwise it'd throw in a 0 and
everything would break. We still have to ship that to update firmware on old
machines. It never made any sense, since you couldn't boot their boards from a
common bootloader binary anyway where the machine_id would be that dynamic..
> Have we tried to push the onus back on firmware people? I've tried damned
> hard to the extent of preventing some of these work-arounds getting into
> mainline, but the sole result of that was that mainline would not work on
> those platforms. It didn't magically cause the firmware to get fixed in
> any way. We just ended up with a detrimental situation to everyone because
> mainline just didn't work on those platforms.
I'm fairly sure you're not above saying "tough shit" to these guys, though?
If their firmware is broken they don't get to boot mainline Linux. If
they have to
stick to BSPs, then, that is their problem. If BSPs get hard to maintain, maybe
someone writing those BSPs will finally get up and implement some change.
Worst case, they'd end up with a mainline tree in git somewhere with a couple
patches that never got accepted to enable booting your board.
This is one reason I'm fairly excited about the proposition of UEFI - it's very
well defined on x86 right now and there's a chance to lock down everything
absolutely necessary to perform boot on ARM and have it be there from the
first consumer board, 100%.
> In order for that to change, they must change, and they must _earn_ our trust
> that they _can_ be trusted to do a good job. Until then...
Until then I think what's missing is someone important kernel-side being
involved in bootloader specifications. ePAPR was a nice try, since Grant Likely
got to be the one to push it through, and we got a nice base for device tree.
But although we were on the technical committee for PAPR and ePAPR and founder
members of Power.org at the time, they basically ignored us, because what we
wanted out of it was to change the status quo for something better - what they
wanted to do was re-ratify the spec from 10 years ago and put a new trademark
stamp on it.
No offense intended to anyone on the list, but Linaro is kind of doing the same
thing right now on a lot of things. It needs to be specified first then
implemented, but the "Linux development model" is patch first, patch again,
patch again, patch again and retrofit the binding to match. Right now it's
possible there won't be a line of common code between the device tree I just
pushed and the one required to boot the board in 2 years.. no wonder you can't
trust the bootloader guys, they're on a train going east at 200 km/h, Linux is
on a train going west at 200 km/h, and you're both trying to shoot each other.
Moving targets are very hard to implement on a consumer device since you can't
force Grandma to update her U-Boot every week and match a kernel to it. The
only reason it's acceptable right now is because there are no consumer devices
in the mainline Linux ARM tree (except, arguably, ours and the AlwaysInnovating
Touchbook, but I would be happy to know there is someone running a phone or
another tablet capable of booting a mainline Linux and a device tree), only
reference designs and platform experiments.
--
Matt Sealey <matt@genesi-usa.com>
From: Stephen Warren @ 2012-08-08 17:50 UTC (permalink / raw)
To: linux-arm-kernel
On 08/08/2012 11:36 AM, Matt Sealey wrote:
...
> The fact we could update it with a script was the awesome thing about
> using OpenFirmware -
> but U-Boot can do this too, since libfdt is there and it's one option
> to enable it to allow script
> based modification of the blob. If the DT is hardcoded into the firmware somehow
> (CONFIG_OF_CONTROL I think) then platforms can load "boot.scr" from
> the root filesystem ...
Just a comment on CONFIG_OF_CONTROL...
In U-Boot, CONFIG_OF_CONTROL determines whether U-Boot uses a device
tree to configure itself. This is completely orthogonal to whether a
device tree is passed to the kernel, and where the kernel DT comes from,
which is still controlled by the bootm/bootz command parameters.
The DT used to configure U-Boot isn't the same one passed to the kernel
typically. The one for U-Boot is typically appended to the U-Boot image,
whereas the one passed to the kernel is likely loaded from a file in
/boot alongside the uImage/zImage of the kernel. I suppose the U-Boot
script /could/ be written to encode the location of the appended DTB
used by U-Boot and so pass the same one to the kernel, I don't believe
anyone has done that. Besides, U-Boot's copy of the .dts files has
diverged a little from the kernel's...:-(
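For reference, a flattened device tree blob is easy to spot inside a larger
image because its header starts with a big-endian magic number followed by
the total blob size. The Python sketch below scans an image for an appended
DTB on that basis. The magic value and the 40-byte header size come from
the flattened device tree format; the function name and the scan-for-magic
approach are assumptions for illustration, not how U-Boot itself locates
its control DTB.

```python
import struct

FDT_MAGIC = 0xD00DFEED   # first 32-bit big-endian word of every DTB
FDT_HEADER_SIZE = 40     # ten 32-bit fields in a version 17 header


def find_appended_dtb(image: bytes):
    """Return (offset, size) of the first plausible DTB in image, or None."""
    needle = struct.pack(">I", FDT_MAGIC)
    pos = image.find(needle)
    while pos != -1:
        if pos + 8 <= len(image):
            # totalsize is the second 32-bit big-endian word of the header.
            (totalsize,) = struct.unpack_from(">I", image, pos + 4)
            # Accept only blobs that hold a full header and fit in the image.
            if FDT_HEADER_SIZE <= totalsize <= len(image) - pos:
                return pos, totalsize
        pos = image.find(needle, pos + 1)
    return None


# Demo: 100 bytes of fake bootloader code followed by a 48-byte fake blob.
fake_blob = struct.pack(">II", FDT_MAGIC, 48) + bytes(40)
fake_image = b"U" * 100 + fake_blob
location = find_appended_dtb(fake_image)
```

A script that wanted to hand the same blob to the kernel would still have
to deal with the divergence between the two sets of .dts files mentioned
above; finding the blob is the easy part.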
From: Matt Sealey @ 2012-08-08 18:54 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Aug 8, 2012 at 12:50 PM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/08/2012 11:36 AM, Matt Sealey wrote:
> ...
>> The fact we could update it with a script was the awesome thing about
>> using OpenFirmware -
>> but U-Boot can do this too, since libfdt is there and it's one option
>> to enable it to allow script
>> based modification of the blob. If the DT is hardcoded into the firmware somehow
>> (CONFIG_OF_CONTROL I think) then platforms can load "boot.scr" from
>> the root filesystem ...
>
> Just a comment on CONFIG_OF_CONTROL...
>
> In U-Boot, CONFIG_OF_CONTROL determines whether U-Boot uses a device
> tree to configure itself. This is completely orthogonal to whether a
> device tree is passed to the kernel, and where the kernel DT comes from,
> which is still controlled by the bootm/bootz command parameters.
>
> The DT used to configure U-Boot isn't the same one passed to the kernel
> typically. The one for U-Boot is typically appended to the U-Boot image,
> whereas the one passed to the kernel is likely loaded from a file in
> /boot alongside the uImage/zImage of the kernel. I suppose the U-Boot
> script /could/ be written to encode the location of the appended DTB
> used by U-Boot and so pass the same one to the kernel, I don't believe
> anyone has done that. Besides, U-Boot's copy of the .dts files has
> diverged a little from the kernel's...:-(
... excuse me while I find a quiet room to scream in.
The ideal solution is as follows, though (taken from experience writing
OpenFirmware and Forth..)
* Bootloader straps from the processor and initializes the CPU. It may
(SHOULD) contain a basic device tree at this point specialized for the
CPU which defines the basic elements (memory start/size where it is
fixed, cpus, basic stuff as in dtsi skeletons)
* Board init code then uses a set of macros or internal functions to add
entries to the device tree as it brings up devices. Every device on the
system needs to be initialized as used. Anything that is not running
by default needs a "driver" to do this which does the minimal amount
required to give an environment for the OS (for instance, setting a
reasonable default for a clock or doing a mux setting). Adding a device
to the tree instantly makes it available to the rest of the bootloader,
such that devicetree_add_node("/foo/bar/mmc", ...) adds an MMC
device to the block devices list internally and can do generic boot stuff
from it.
* If Linux advances way beyond the driver bindings, the bootloader needs a
"runtime" way of externally specifying the hardware configuration such
that it can be scripted or otherwise automated (even if it's copious use
of mw.l and fdt_add or so in U-Boot hush shell scripting).
As such, actual device trees would go away from the kernel and be
a function of the bootloader's built-in specification, bootloader device
drivers, and pre-kernel execution of automated fixup.
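A minimal sketch of the first two steps, in Python for brevity. Everything
here is hypothetical: the DeviceTree class, add_node() (standing in for the
devicetree_add_node() idea above), the block_device flag, and the addresses
and vendor compatible string are illustrative only and do not correspond to
any existing bootloader API.

```python
class DeviceTree:
    """Toy bootloader-side device tree; all names here are hypothetical."""

    def __init__(self):
        self.nodes = {"/": {}}      # node path -> property dict
        self.block_devices = []     # node paths the bootloader can boot from

    def add_node(self, path, props=None, block_device=False):
        """Add a node, creating missing parents, and register it if bootable."""
        current = ""
        for part in (p for p in path.split("/") if p):
            current += "/" + part
            self.nodes.setdefault(current, {})
        current = current or "/"
        self.nodes[current].update(props or {})
        # Adding the node is what makes the device visible to generic boot code.
        if block_device and current not in self.block_devices:
            self.block_devices.append(current)
        return current


def board_init(tree):
    # Skeleton known at CPU bring-up (addresses are made up).
    tree.add_node("/memory", {"reg": (0x90000000, 0x20000000)})
    tree.add_node("/cpus/cpu@0", {"compatible": "arm,cortex-a8"})
    # Each driver adds its own node as it brings the hardware up.
    tree.add_node("/soc/mmc@0", {"compatible": "vendor,example-mmc"},
                  block_device=True)
    return tree


tree = board_init(DeviceTree())
```

With this shape the tree handed to the OS is a by-product of bringing the
board up, rather than a separate file that has to be kept in sync.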
What would be perfect is someone gets a clue about SoC design and
implements the bare minimum boot code inside the SoC. i.MX does this
about halfway - it has a common set of possibilities for iomux settings,
where it probes for SD/MMC, SPI, I2C, PATA, SATA, USB boot sources
and then provides a mask rom function to be able to stream *more* data
(using DMA, interrupts and with L2 cache enabled which is something
you cannot say about 99.99% of running U-Boots today) from the first
valid found source (overridden by fuses and boot config pins). Since it
can read from all these devices anyway it should be capable of selecting
a new boot device and streaming data to memory from these so the
bootloader no longer needs particular drivers for these boot sources.
Less advanced designs would just need a specific driver for it, as today,
with the added extra caveat that they would require adding their DT
bindings at the driver init time, as above.
I've been looking into the tinykernel/moboot solutions to create something
very similar to this but my urge to actually do it is quenched by the fact
we really should be using UEFI. The complexity of, for example, TianoCore
and the horrible HOB/PEI/PEIM legacy from x86 makes me want to cry,
though.
Has nobody really thought of the benefits of actually fixing the UEFI spec
on ARM such that you can grab something, write some bare platform code
that initializes it without the complicated chaining of pre-EFI, sort-of-EFI,
EFI-but-pre-kernel, EFI-booting-but-not-quite blocks? After all what the
kernel needs is valid tables in memory and a working call interface. How
you initialize the platform when standardized is nice, but it's overcomplex
and I think is lowering adoption on ARM... what's going on with device
trees or ACPI DSDT (and how is that going to make you guys cry)?
We had an internal architecture planning document here that basically
reworked UEFI ignoring some of the platform restrictions we had from
OpenFirmware or RTAS call interface (disable mmu, interrupts, caches
and then jump) - we did have a proof of concept (which we still call
Aura, although the actual "Aura" product has advanced to something
almost completely different) which allowed Linux to jump into a
pre-emptive code environment installed by firmware and perform any
driver code you needed at the firmware level. We used it to knock
down some performance problems on the PCI side and provide an
interface to several things which otherwise get done a lot (RTC handling,
PCI configuration space and domain handling) but you don't want to
basically kick Linux in the kidneys while you do it. After all, since
when did you want to disable interrupts, ditch the virtual
memory map, and turn off the cache just to set or read the time?
UEFI provides a lot of the core functionality Linux needs to boot in
a totally SoC-independent way, and even functionality Linux needs
to run, but it fails in that it requires that same disable-everything,
jump, re-enable everything architecture to persist for every call. We
were going to define a new system table that allowed for a new kind of
device tree (with the same kind of functionality as the current one,
but more flexible in that it allowed cross-referencing phandles to
actual UEFI-AURA runtime services). Where PPC and SPARC
platforms pull all the DT code out via calls to OF, ARM UEFI-AURA
would do this the same way. It may parse everything into a flattened
tree and use common functions after that, but the idea would be the
UEFI-AURA runtime would be able to dynamically update the tree
depending on things happening (like hotplug events that could be
hidden from the OS somehow, DMA scripting, IPMI or other remote
management) or even encapsulate things like OpenGLES API calls
or media decoding (which is the ultimate expression of the requirements
of all these binary blob people) along with the ability to do funky things
like use the FPU in the firmware (would require the OS to understand
that this happens though) or NEON or so.
What it ends up being though is a mostly functional OS (which UEFI
basically is anyway if you add a GUI, which does exist) with Linux as
the shiny real-world interface on top: Linux tells the firmware it's done
with this functionality and has a real driver, tells the firmware it wants
to do this and that, and passes userspace calls and data by virtual
addresses and references, reducing memory copies etc..
What we're talking about there though is 100 engineers, a couple years
and millions of dollars of funding to make the world a better place. I guess
we could always sign up for Kickstarter and put a few ads for new positions
on Stack Overflow.. :) so, just so you know, I'm not proposing we change
the world. We tried it, it's too big a job and there are too many people saying
"U-Boot is free and libfdt works".
I think the bare minimum "define the board properly and allow scripting changes"
would not be too big a job though, and as long as it can be done on UEFI
too.. that's fine. The work for the current device tree model would not go to
waste as it would be broadly compatible or at least immediately portable?
--
Matt Sealey <matt@genesi-usa.com>
Product Development Analyst, Genesi USA, Inc.
From: Olof Johansson @ 2012-08-08 20:17 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Aug 8, 2012 at 10:50 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 08/08/2012 11:36 AM, Matt Sealey wrote:
> ...
>> The fact we could update it with a script was the awesome thing about
>> using OpenFirmware -
>> but U-Boot can do this too, since libfdt is there and it's one option
>> to enable it to allow script
>> based modification of the blob. If the DT is hardcoded into the firmware somehow
>> (CONFIG_OF_CONTROL I think) then platforms can load "boot.scr" from
>> the root filesystem ...
>
> Just a comment on CONFIG_OF_CONTROL...
>
> In U-Boot, CONFIG_OF_CONTROL determines whether U-Boot uses a device
> tree to configure itself. This is completely orthogonal to whether a
> device tree is passed to the kernel, and where the kernel DT comes from,
> which is still controlled by the bootm/bootz command parameters.
>
> The DT used to configure U-Boot isn't the same one passed to the kernel
> typically. The one for U-Boot is typically appended to the U-Boot image,
> whereas the one passed to the kernel is likely loaded from a file in
> /boot alongside the uImage/zImage of the kernel. I suppose the U-Boot
> script /could/ be written to encode the location of the appended DTB
> used by U-Boot and so pass the same one to the kernel, I don't believe
> anyone has done that. Besides, U-Boot's copy of the .dts files has
> diverged a little from the kernel's...:-(
Right, OF_CONTROL was instigated around here, and I explicitly have
kept them from attempting to do that. Why? Because the U-Boot device
tree is non-standard, and has a bunch of U-Boot cruft in it. We quite
frankly don't want to see it in the kernel.
But the bigger argument is that the device tree bindings are still
rapidly evolving together with the kernel, and until we reach a more
stable state, it's a bad idea to provide a tree that is separate from
the kernel (on ARM), since there will be no way to undo early
mistakes.
The plan is to, at some point in time, take the device tree bindings,
as well as the DTS sources, out of the kernel tree and to a separate
repo. But not until the bindings have settled down quite a bit more.
-Olof