* [RFC] ACPI on arm64 TODO List
@ 2015-01-10 14:44 ` Grant Likely
From: Grant Likely @ 2015-01-10 14:44 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
> On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Monday 15 December 2014 19:18:16 Al Stone wrote:
>>> 7. Why is ACPI required?
>>> * Problem:
>>> * arm64 maintainers still haven't been convinced that ACPI is
>>> necessary.
>>> * Why do hardware and OS vendors say ACPI is required?
>>> * Status: Al & Grant collecting statements from OEMs to be posted
>>> publicly early in the new year; firmware summit for broader
>>> discussion planned.
>>
>> I was particularly hoping to see better progress on this item. It
>> really shouldn't be that hard to explain why someone wants this feature.
>
> I've written something up as a reply on the firmware summit thread.
> I'm going to rework it to be a standalone document and post it
> publicly. I hope that should resolve this issue.
I've posted an article on my blog, but I'm reposting it here because
the mailing list is more conducive to discussion...
http://www.secretlab.ca/archives/151
Why ACPI on ARM?
----------------
Why are we doing ACPI on ARM? That question has been asked many times,
but we haven't yet had a good summary of the most important reasons
for wanting ACPI on ARM. This article is an attempt to state the
rationale clearly.
During an email conversation late last year, Catalin Marinas asked for
a summary of exactly why we want ACPI on ARM. Dong Wei replied with
the following list:
> 1. Support multiple OSes, including Linux and Windows
> 2. Support device configurations
> 3. Support dynamic device configurations (hot add/removal)
> 4. Support hardware abstraction through control methods
> 5. Support power management
> 6. Support thermal management
> 7. Support RAS interfaces
The above list is certainly true in that all of them need to be
supported. However, that list doesn't give the rationale for choosing
ACPI. We already have DT mechanisms for doing most of the above, and
can certainly create new bindings for anything that is missing. So, if
it isn't an issue of functionality, then how does ACPI differ from DT
and why is ACPI a better fit for general purpose ARM servers?
The difference is in the support model. To explain what I mean, I'm
first going to expand on each of the items above and discuss the
similarities and differences between ACPI and DT. Then, with that as
the groundwork, I'll discuss how ACPI is a better fit for the general
purpose hardware support model.
Device Configurations
---------------------
2. Support device configurations
3. Support dynamic device configurations (hot add/removal)
From day one, DT was about device configurations. There isn't any
significant difference between ACPI & DT here. In fact, the majority
of ACPI tables are completely analogous to DT descriptions. With the
exception of the DSDT and SSDT tables, most ACPI tables are merely
flat data used to describe hardware.
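
To make "flat data" concrete, here is a minimal sketch of decoding the
common 36-byte header that starts every ACPI system description table.
The header layout and the sum-to-zero checksum rule come from the ACPI
specification; the function names and the example table (signature
"XMPL") are made up for illustration and are not taken from any real
firmware or kernel code.

```python
import struct

# Layout of the 36-byte header that begins every ACPI system description
# table: signature, length, revision, checksum, OEM ID, OEM table ID,
# OEM revision, creator ID, creator revision (little-endian, packed).
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")


def parse_header(table: bytes) -> dict:
    """Decode the common header and verify the whole-table checksum."""
    (sig, length, rev, _csum, oem_id, _oem_table_id,
     _oem_rev, _creator_id, _creator_rev) = ACPI_HEADER.unpack_from(table)
    if length > len(table):
        raise ValueError("table is truncated")
    # All bytes of a valid table sum to zero modulo 256.
    if sum(table[:length]) % 256 != 0:
        raise ValueError("bad checksum")
    return {
        "signature": sig.decode("ascii"),
        "length": length,
        "revision": rev,
        "oem_id": oem_id.decode("ascii").rstrip(),
    }


def build_example_table(payload: bytes = b"") -> bytes:
    """Build a made-up table (signature 'XMPL') purely for demonstration."""
    length = ACPI_HEADER.size + len(payload)
    raw = bytearray(ACPI_HEADER.pack(b"XMPL", length, 1, 0, b"EXAMPL",
                                     b"EXAMPLE1", 1, b"TEST", 1) + payload)
    raw[9] = (-sum(raw)) % 256  # the checksum byte lives at offset 9
    return bytes(raw)
```

Nothing here executes firmware-provided code: the table is read the
same way a flattened DT blob would be, by walking fixed-layout fields.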
DT platforms have also supported dynamic configuration and hotplug for
years. There isn't a lot here that differentiates between ACPI and DT.
The biggest difference is that dynamic changes to the ACPI namespace
can be triggered by ACPI methods, whereas for DT changes are received
as messages from firmware and have been very much platform specific
(e.g. IBM pSeries does this).
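
As a rough sketch of the "changes received as messages" style, the
snippet below applies add/remove messages to a device tree modelled as
nested dicts. The message format ("op", "path", "props") is invented
for illustration; it is not the pSeries interface or any other real
firmware protocol.

```python
def apply_change(tree: dict, msg: dict) -> None:
    """Apply one add/remove message to a nested-dict device tree model."""
    *parents, leaf = msg["path"].strip("/").split("/")
    node = tree
    for name in parents:
        node = node[name]  # raises KeyError if a parent node is missing
    if msg["op"] == "add":
        node[leaf] = dict(msg.get("props", {}))
    elif msg["op"] == "remove":
        del node[leaf]
    else:
        raise ValueError(f"unknown op {msg['op']!r}")


# Hypothetical usage: firmware reports one CPU added and one removed.
tree = {"cpus": {"cpu@0": {"reg": 0}}}
apply_change(tree, {"op": "add", "path": "/cpus/cpu@1", "props": {"reg": 1}})
apply_change(tree, {"op": "remove", "path": "/cpus/cpu@0"})
```

In the ACPI case the equivalent mutation would be triggered by a method
supplied by the platform rather than by platform-specific message
handling in the OS.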
Power Management Model
----------------------
4. Support hardware abstraction through control methods
5. Support power management
6. Support thermal management
Power, thermal, and clock management can all be dealt with as a group.
ACPI defines a power management model (OSPM) that both the platform
and the OS conform to. The OS implements the OSPM state machine, but
the platform can provide state change behaviour in the form of
bytecode methods. Methods can access hardware directly or hand off PM
operations to a coprocessor. The OS really doesn't have to care about
the details as long as the platform obeys the rules of the OSPM model.
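
The split described above can be sketched as a toy model: the OS owns
the state machine and decides when to change state, while the platform
supplies the methods that decide how. _PS0 and _PS3 are real ACPI
control-method names for entering D0 and D3; the class, the function
names, and the logged actions are hypothetical stand-ins, and real
methods are AML bytecode run by the OS's interpreter, not Python
callables.

```python
from typing import Callable, Dict

# Device power states this toy OS-side state machine knows about.
STATES = ("D0", "D3")

# ACPI control-method name used to enter each state.
METHOD_FOR_STATE = {"D0": "_PS0", "D3": "_PS3"}


class OspmDevice:
    """OS side: owns the state machine, knows nothing about the hardware."""

    def __init__(self, methods: Dict[str, Callable[[], None]]):
        self.methods = methods  # supplied by the platform firmware
        self.state = "D0"

    def set_state(self, target: str) -> None:
        if target not in STATES:
            raise ValueError(f"unknown state {target}")
        if target == self.state:
            return
        # The OS decides *when* to change state; the platform's method
        # decides *how* the hardware gets there.
        self.methods[METHOD_FOR_STATE[target]]()
        self.state = target


def make_platform_methods(log: list) -> Dict[str, Callable[[], None]]:
    """Platform side: stand-in for vendor-supplied bytecode methods."""
    return {
        "_PS0": lambda: log.append("enable clock, raise regulator"),
        "_PS3": lambda: log.append("ask coprocessor to power down"),
    }
```

Swapping in a different make_platform_methods() changes what the
hardware does on each transition without touching OspmDevice, which is
the property the OSPM model is meant to provide.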
With DT, the kernel has device drivers for each and every component in
the platform, and configures them using DT data. DT itself doesn't
have a PM model. Rather the PM model is an implementation detail of
the kernel. Device drivers use DT data to decide how to handle PM
state changes. We have clock, pinctrl, and regulator frameworks in the
kernel for working out runtime PM. However, this only works when all
the drivers and support code have been merged into the kernel. When
the kernel's PM model doesn't work for new hardware, then we change
the model. This works very well for mobile/embedded because the vendor
controls the kernel. We can change things when we need to, but we also
struggle with getting board support mainlined.
This difference has a big impact when it comes to OS support.
Engineers from hardware vendors, Microsoft, and most vocally Red Hat
have all told me bluntly that rebuilding the kernel doesn't work for
enterprise OS support. Their model is based around a fixed OS release
that ideally boots out-of-the-box. It may still need additional device
drivers for specific peripherals/features, but from a system view, the
OS works. When additional drivers are provided separately, those
drivers fit within the existing OSPM model for power management. This
is where ACPI has a technical advantage over DT. The ACPI OSPM model
and its bytecode give the HW vendors a level of abstraction under
their control, not the kernel's. When the hardware behaves differently
from what the OS expects, the vendor is able to change the behaviour
without changing the HW or patching the OS.
At this point you'd be right to point out that it is harder to get the
whole system working correctly when behaviour is split between the
kernel and the platform. The OS must trust that the platform doesn't
violate the OSPM model. All manner of bad things happen if it does.
That is exactly why the DT model doesn't encode behaviour: It is
easier to make changes and fix bugs when everything is within the same
code base. We don't need a platform/kernel split when we can modify
the kernel.
However, the enterprise folks don't have that luxury. The
platform/kernel split isn't a design choice. It is a characteristic of
the market. Hardware and OS vendors each have their own product
timetables, and they don't line up. The timeline for getting patches
into the kernel and flowing through into OS releases puts OS support
far downstream from the actual release of hardware. Hardware vendors
simply cannot wait for OS support to come online to be able to release
their products. They need to be able to work with available releases,
and make their hardware behave in the way the OS expects. The
advantage of ACPI OSPM is that it defines behaviour and limits what
the hardware is allowed to do without involving the kernel.
What remains is sorting out how we make sure everything works. How do
we make sure there is enough cross platform testing to ensure new
hardware doesn't ship broken and that new OS releases don't break on
old hardware? Those are the reasons why a UEFI/ACPI firmware summit is
being organized, it's why the UEFI forum holds plugfests 3 times a
year, and it is why we're working on FWTS and LuvOS.
Reliability, Availability & Serviceability (RAS)
------------------------------------------------
7. Support RAS interfaces
This isn't a question of whether or not DT can support RAS. Of course
it can. Rather it is a matter of RAS bindings already existing for
ACPI, including a usage model. We've barely begun to explore this on
DT. This item doesn't make ACPI technically superior to DT, but it
certainly makes it more mature.
Multiplatform support
---------------------
1. Support multiple OSes, including Linux and Windows
I'm tackling this item last because I think it is the most contentious
for those of us in the Linux world. I wanted to get the other issues
out of the way before addressing it.
The separation between hardware vendors and OS vendors in the server
market is new for ARM. For the first time ARM hardware and OS release
cycles are completely decoupled from each other, and neither is
expected to have specific knowledge of the other (i.e. the hardware
vendor doesn't control the choice of OS). ARM and their partners want
to create an ecosystem of independent OSes and hardware platforms that
don't explicitly require the former to be ported to the latter.
Now, one could argue that Linux is driving the potential market for
ARM servers, and therefore Linux is the only thing that matters, but
hardware vendors don't see it that way. For hardware vendors it is in
their best interest to support as wide a choice of OSes as possible in
order to catch the widest potential customer base. Even if the
majority choose Linux, some will choose BSD, some will choose Windows,
and some will choose something else. Whether or not we think this is
foolish is beside the point; it isn't something we have influence
over.
During early ARM server planning meetings between ARM, its partners
and other industry representatives (myself included) we discussed this
exact point. Before us were two options, DT and ACPI. As one of the
Linux people in the room, I advised that ACPI's closed governance
model was a show stopper for Linux and that DT is the working
interface. Microsoft on the other hand made it abundantly clear that
ACPI was the only interface that they would support. For their part,
the hardware vendors stated the platform abstraction behaviour of ACPI
is a hard requirement for their support model and that they would not
close the door on either Linux or Windows.
However, the one thing that all of us could agree on was that
supporting multiple interfaces doesn't help anyone: It would require
twice as much effort on defining bindings (once for Linux-DT and once
for Windows-ACPI) and it would require firmware to describe everything
twice. Eventually we reached the compromise to use ACPI, but on the
condition of opening the governance process to give Linux engineers
equal influence over the specification. The fact that we now have a
much better seat at the ACPI table, for both ARM and x86, is a direct
result of these early ARM server negotiations. We are no longer second
class citizens in the ACPI world and are actually driving much of the
recent development.
I know that this line of thought is more about market forces rather
than a hard technical argument between ACPI and DT, but it is an
equally significant one. Agreeing on a single way of doing things is
important. The ARM server ecosystem is better for the agreement to use
the same interface for all operating systems. This is what is meant by
standards compliant. The standard is a codification of the mutually
agreed interface. It provides confidence that all vendors are using
the same rules for interoperability.
Summary
-------
To summarize, here is the short form rationale for ACPI on ARM:
- ACPI's bytecode allows the platform to encode behaviour. DT
explicitly does not support this. For hardware vendors, being able to
encode behaviour is an important tool for supporting operating system
releases on new hardware.
- ACPI's OSPM defines a power management model that constrains what
the platform is allowed to do, while still leaving flexibility in
hardware design.
- For enterprise use-cases, ACPI has established bindings, such as
for RAS, which are used in production. DT does not. Yes, we can define
those bindings but doing so means ARM and x86 will use completely
different code paths in both firmware and the kernel.
- Choosing a single interface for platform/OS abstraction is
important. It is not reasonable to require vendors to implement both
DT and ACPI if they want to support multiple operating systems.
Agreeing on a single interface instead of being fragmented into per-OS
interfaces makes for better interoperability overall.
- The ACPI governance process works well and we're at the same table
as HW vendors and other OS vendors. In fact, there is no longer any
reason to feel that ACPI is a Windows thing or that we are playing
second fiddle to Microsoft. The move of ACPI governance into the UEFI
forum has significantly opened up the processes, and currently, a
large portion of the changes being made to ACPI is being driven by
Linux.
At the beginning of this article I made the statement that the
difference is in the support model. For servers, responsibility for
hardware behaviour cannot be purely the domain of the kernel, but
rather is split between the platform and the kernel. ACPI frees the OS
from needing to understand all the minute details of the hardware so
that the OS doesn't need to be ported to each and every device
individually. It allows the hardware vendors to take responsibility
for PM behaviour without depending on an OS release cycle which is
not under their control.
ACPI is also important because hardware and OS vendors have already
worked out how to use it to support the general purpose ecosystem. The
infrastructure is in place, the bindings are in place, and the process
is in place. DT does exactly what we need it to when working with
vertically integrated devices, but we don't have good processes for
supporting what the server vendors need. We could potentially get
there with DT, but doing so doesn't buy us anything. ACPI already does
what the hardware vendors need, Microsoft won't collaborate with us on
DT, and the hardware vendors would still need to provide two
completely separate firmware interfaces: one for Linux and one for
Windows.
* Re: [RFC] ACPI on arm64 TODO List
2015-01-10 14:44 ` Grant Likely
@ 2015-01-12 10:21 ` Arnd Bergmann
From: Arnd Bergmann @ 2015-01-12 10:21 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Grant Likely, Al Stone, linaro-acpi@lists.linaro.org,
Catalin Marinas, Rafael J. Wysocki, ACPI Devel Mailing List,
Olof Johansson
On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
> I've posted an article on my blog, but I'm reposting it here because
> the mailing list is more conducive to discussion...
>
> http://www.secretlab.ca/archives/151
>
> Why ACPI on ARM?
> ----------------
>
> Why are we doing ACPI on ARM? That question has been asked many times,
> but we haven't yet had a good summary of the most important reasons
> for wanting ACPI on ARM. This article is an attempt to state the
> rationale clearly.
Thanks for writing this up, much appreciated. I'd like to comment
on some of the points here, which seems easier than commenting on the
blog post.
> Device Configurations
> ---------------------
> 2. Support device configurations
> 3. Support dynamic device configurations (hot add/removal)
>
...
>
> DT platforms have also supported dynamic configuration and hotplug for
> years. There isn't a lot here that differentiates between ACPI and DT.
> The biggest difference is that dynamic changes to the ACPI namespace
> can be triggered by ACPI methods, whereas for DT changes are received
> as messages from firmware and have been very much platform specific
> (e.g. IBM pSeries does this)
This seems like a great fit for AML indeed, but I wonder what exactly
we want to hotplug here, since everything I can think of wouldn't need
AML support for the specific use case of SBSA compliant servers:
- CPU: I don't think a lot of people outside mainframes consider
CPUs to be runtime-serviceable parts, so for practical purposes
this would be for power-management purposes triggered by the OS,
and we have PSCI for managing the CPUs here. In case of virtual
machines, we will actually need hotplugging CPUs into the guest,
but this can be done through the existing hypervisor based interfaces
for KVM and Xen.
- memory: quite similar, I don't have runtime memory replacement on
my radar for normal servers yet, and in virtual machines, we'd use
the existing balloon drivers. Memory power management (per-bank
self-refresh or powerdown) would be a good use-case but the Linux
patches we had for this 5 years ago were never merged and I don't
think anybody is working on them any more.
- standard AHCI/OHCI/EHCI/XHCI/PCIe-port/...: these all have register
level support for hotplugging and don't need SoC-specific driver
support or ACPI, as can easily be verified by hotplugging devices on
x86 machines with ACPI turned off.
- nonstandard SATA/USB/PCI-X/PCI-e/...: These are common on embedded
ARM SoCs and could to a certain extent be handled using AML, but for
good reasons are not allowed by SBSA.
- anything else?
> Power Management Model
> ----------------------
> 4. Support hardware abstraction through control methods
> 5. Support power management
> 6. Support thermal management
>
> Power, thermal, and clock management can all be dealt with as a group.
> ACPI defines a power management model (OSPM) that both the platform
> and the OS conform to. The OS implements the OSPM state machine, but
> the platform can provide state change behaviour in the form of
> bytecode methods. Methods can access hardware directly or hand off PM
> operations to a coprocessor. The OS really doesn't have to care about
> the details as long as the platform obeys the rules of the OSPM model.
>
> With DT, the kernel has device drivers for each and every component in
> the platform, and configures them using DT data. DT itself doesn't
> have a PM model. Rather the PM model is an implementation detail of
> the kernel. Device drivers use DT data to decide how to handle PM
> state changes. We have clock, pinctrl, and regulator frameworks in the
> kernel for working out runtime PM. However, this only works when all
> the drivers and support code have been merged into the kernel. When
> the kernel's PM model doesn't work for new hardware, then we change
> the model. This works very well for mobile/embedded because the vendor
> controls the kernel. We can change things when we need to, but we also
> struggle with getting board support mainlined.
I can definitely see this point, but I can also see two important
downsides to the ACPI model that need to be considered for an
individual implementor:
* As a high-level abstraction, there are limits to how fine-grained
the power management can be done, or is implemented in a particular
BIOS. The thinner the abstraction, the better the power savings can
get when implemented right.
* From the experience with x86, Linux tends to prefer using drivers
for hardware registers over the AML based drivers when both are
implemented, because of efficiency and correctness.
We should probably discuss at some point how to get the best of
both. I really don't like the idea of putting the low-level
details that we tend to have in DT into ACPI, but there are two
things we can do: For systems that have a high-level abstraction
for their PM in hardware (e.g. talking to an embedded controller
that does the actual work), the ACPI description should contain
enough information to implement a kernel-level driver for it as
we have on Intel machines. For more traditional SoCs that do everything
themselves, I would recommend always having a working DT for
those people wanting to get the most out of their hardware. This will
also enable any other SoC features that cannot be represented in
ACPI.
> What remains is sorting out how we make sure everything works. How do
> we make sure there is enough cross platform testing to ensure new
> hardware doesn't ship broken and that new OS releases don't break on
> old hardware? Those are the reasons why a UEFI/ACPI firmware summit is
> being organized, it's why the UEFI forum holds plugfests 3 times a
> year, and it is why we're working on FWTS and LuvOS.
Right.
> Reliability, Availability & Serviceability (RAS)
> ------------------------------------------------
> 7. Support RAS interfaces
>
> This isn't a question of whether or not DT can support RAS. Of course
> it can. Rather it is a matter of RAS bindings already existing for
> ACPI, including a usage model. We've barely begun to explore this on
> DT. This item doesn't make ACPI technically superior to DT, but it
> certainly makes it more mature.
Unfortunately, RAS can mean a lot of things to different people.
Is there some high-level description of what the ACPI idea of RAS
is? On systems I've worked on in the past, this was generally done
out of band (e.g. in an IPMI BMC) because you can't really trust
the running OS when you report errors that may impact data consistency
of that OS.
> Multiplatform support
> ---------------------
> 1. Support multiple OSes, including Linux and Windows
>
> I'm tackling this item last because I think it is the most contentious
> for those of us in the Linux world. I wanted to get the other issues
> out of the way before addressing it.
>
> I know that this line of thought is more about market forces rather
> than a hard technical argument between ACPI and DT, but it is an
> equally significant one. Agreeing on a single way of doing things is
> important. The ARM server ecosystem is better for the agreement to use
> the same interface for all operating systems. This is what is meant by
> standards compliant. The standard is a codification of the mutually
> agreed interface. It provides confidence that all vendors are using
> the same rules for interoperability.
I do think that this is in fact the most important argument in favor
of doing ACPI on Linux, because a number of companies are betting on
Windows (or some in-house OS that uses ACPI) support. At the same time,
I don't think talking of a single 'ARM server ecosystem' that needs to
agree on one interface is helpful here. Each server company has their
own business plan and their own constraints. I absolutely think that
getting as many companies as possible to agree on SBSA and UEFI is
helpful here because it reduces the differences between the platforms
as seen by a distro. For companies that want to support Windows, it's
obvious they want to have ACPI on their machines; for others the
factors you mention above can be enough to justify the move to ACPI
even without Windows support. Then there are other companies for
which the tradeoffs are different, and I see no reason for forcing
it on them. Finally there are and will likely always be chips that
are not built around SBSA and someone will use the chips in creative
ways to build servers from them, so we already don't have a homogeneous
ecosystem.
Arnd
* Re: [RFC] ACPI on arm64 TODO List
2015-01-12 10:21 ` Arnd Bergmann
@ 2015-01-12 12:00 ` Grant Likely
From: Grant Likely @ 2015-01-12 12:00 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linux-arm-kernel@lists.infradead.org, Al Stone,
linaro-acpi@lists.linaro.org, Catalin Marinas, Rafael J. Wysocki,
ACPI Devel Mailing List, Olof Johansson
On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
>> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>
>> I've posted an article on my blog, but I'm reposting it here because
>> the mailing list is more conducive to discussion...
>>
>> http://www.secretlab.ca/archives/151
>>
>> Why ACPI on ARM?
>> ----------------
>>
>> Why are we doing ACPI on ARM? That question has been asked many times,
>> but we haven't yet had a good summary of the most important reasons
>> for wanting ACPI on ARM. This article is an attempt to state the
>> rationale clearly.
>
> Thanks for writing this up, much appreciated. I'd like to comment
> on some of the points here, which seems easier than commenting on the
> blog post.
Thanks for reading through it. Replies below...
>
>> Device Configurations
>> ---------------------
>> 2. Support device configurations
>> 3. Support dynamic device configurations (hot add/removal)
>>
> ...
>>
>> DT platforms have also supported dynamic configuration and hotplug for
>> years. There isn't a lot here that differentiates between ACPI and DT.
>> The biggest difference is that dynamic changes to the ACPI namespace
>> can be triggered by ACPI methods, whereas for DT changes are received
>> as messages from firmware and have been very much platform specific
>> (e.g. IBM pSeries does this)
>
> This seems like a great fit for AML indeed, but I wonder what exactly
> we want to hotplug here, since everything I can think of wouldn't need
> AML support for the specific use case of SBSA compliant servers:
[...]
I've trimmed the specific examples here because I think that misses
the point. The point is that regardless of interface (either ACPI or
DT) there are always going to be cases where the data needs to change
at runtime. Not all platforms will need to change the CPU data, but
some will (say for a machine that detects a failed CPU and removes
it). Some PCI add-in boards will carry along with them additional data
that needs to be inserted into the ACPI namespace or DT. Some
platforms will have system-level components (i.e. non-PCI) that may not
always be accessible.
ACPI has an interface baked in already for tying data changes to
events. DT currently needs platform specific support (which we can
improve on). I'm not even trying to argue for ACPI over DT in this
section, but I included it in this document because it is one of the
reasons often given for choosing ACPI and I felt it required a more
nuanced discussion.
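To make "tying data changes to events" concrete, here is a minimal
sketch in C. It is not taken from the kernel or from any real platform:
the event codes, the device_table structure and handle_event() are all
hypothetical names invented for illustration. It only shows the shape of
the idea, namely that a hardware event arrives and the OS's view of the
device data is updated in response, whichever description (ACPI
namespace or DT) is underneath.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVICES 8

/* Hypothetical event codes, chosen for this sketch only. */
enum hw_event { EVENT_DEVICE_ADDED, EVENT_DEVICE_REMOVED };

/* Stand-in for the OS's runtime view of the device description. */
struct device_table {
    const char *names[MAX_DEVICES];
    size_t count;
};

/* Return the index of a named device, or -1 if it is not present. */
static int table_find(const struct device_table *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return (int)i;
    return -1;
}

/* Apply one event to the table. Returns true if the table changed. */
bool handle_event(struct device_table *t, enum hw_event ev, const char *name)
{
    int idx = table_find(t, name);

    switch (ev) {
    case EVENT_DEVICE_ADDED:
        if (idx >= 0 || t->count == MAX_DEVICES)
            return false;               /* duplicate, or table full */
        t->names[t->count++] = name;
        return true;
    case EVENT_DEVICE_REMOVED:
        if (idx < 0)
            return false;               /* nothing to remove */
        t->names[idx] = t->names[--t->count];  /* fill the gap */
        return true;
    }
    return false;
}
```

With ACPI the delivery of the event is part of the specification; with
DT today the equivalent of handle_event() would be reached through
platform-specific code.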
>> Power Management Model
>> ----------------------
>> 4. Support hardware abstraction through control methods
>> 5. Support power management
>> 6. Support thermal management
>>
>> Power, thermal, and clock management can all be dealt with as a group.
>> ACPI defines a power management model (OSPM) that both the platform
>> and the OS conform to. The OS implements the OSPM state machine, but
>> the platform can provide state change behaviour in the form of
>> bytecode methods. Methods can access hardware directly or hand off PM
>> operations to a coprocessor. The OS really doesn't have to care about
>> the details as long as the platform obeys the rules of the OSPM model.
>>
>> With DT, the kernel has device drivers for each and every component in
>> the platform, and configures them using DT data. DT itself doesn't
>> have a PM model. Rather the PM model is an implementation detail of
>> the kernel. Device drivers use DT data to decide how to handle PM
>> state changes. We have clock, pinctrl, and regulator frameworks in the
>> kernel for working out runtime PM. However, this only works when all
>> the drivers and support code have been merged into the kernel. When
>> the kernel's PM model doesn't work for new hardware, then we change
>> the model. This works very well for mobile/embedded because the vendor
>> controls the kernel. We can change things when we need to, but we also
>> struggle with getting board support mainlined.
>
> I can definitely see this point, but I can also see two important
> downsides to the ACPI model that need to be considered for an
> individual implementor:
>
> * As a high-level abstraction, there are limits to how fine-grained
> the power management can be done, or is implemented in a particular
> BIOS. The thinner the abstraction, the better the power savings can
> get when implemented right.
Agreed. That is the tradeoff. OSPM defines a power model, and the
machine must restrict any PM behaviour to fit within that power model.
This is important for interoperability, but it also leaves performance
on the table. ACPI at least gives us the option to pick that
performance back up by adding better power management to the drivers,
without sacrificing the interoperability provided by OSPM.
In other words, OSPM gets us going, but we can add specific
optimizations when required.
Also important: vendors can choose not to implement any PM in their
ACPI tables at all. In this case the machine would be left running
at full tilt. It will be compatible with everything, but it won't be
optimized. Then they have the option of loading a PM driver at runtime
to optimize the system with the caveat that the PM driver must not be
required for the machine to be operational. In this case, as far as
the OS is concerned, it is still applying the OSPM state machine, but
the OSPM behaviour never changes the state of the hardware.
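The "no PM in the tables" case can be sketched in a few lines of C.
This is a toy model under stated assumptions, not the kernel's
implementation: struct device_pm, pm_set_state() and count_calls() are
hypothetical, and only two power states are modelled. The point it
illustrates is that the OS keeps running the same state machine whether
or not the platform supplies a method for a given transition; when no
method exists, the state is tracked but the hardware is never touched.

```c
#include <stddef.h>

/* Two device power states, named after the ACPI D-states. */
enum power_state { STATE_D0, STATE_D3 };

/*
 * Hypothetical hook a platform may supply for a state transition.
 * Returns 0 on success. NULL means "no method provided".
 */
typedef int (*platform_method)(void *hw);

struct device_pm {
    enum power_state state;  /* state as tracked by the OS */
    platform_method on;      /* optional: enter D0 */
    platform_method off;     /* optional: enter D3 */
    void *hw;                /* opaque handle passed to the methods */
};

/*
 * OS-side state machine. The tracked state always follows the request;
 * hardware is only touched when the platform supplied a method.
 */
int pm_set_state(struct device_pm *dev, enum power_state target)
{
    platform_method m = (target == STATE_D0) ? dev->on : dev->off;

    if (dev->state == target)
        return 0;                /* already there, nothing to do */
    if (m != NULL) {
        int err = m(dev->hw);
        if (err)
            return err;          /* leave tracked state unchanged */
    }
    dev->state = target;
    return 0;
}

/* Example platform method for experiments: counts how often it runs. */
int count_calls(void *hw)
{
    (*(int *)hw)++;
    return 0;
}
```

A device set up with both hooks NULL models the unoptimized machine: the
state machine still cycles, and a PM driver loaded later could fill in
the hooks without the OS-side logic changing.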
> * From the experience with x86, Linux tends to prefer using drivers
> for hardware registers over the AML based drivers when both are
> implemented, because of efficiency and correctness.
>
> We should probably discuss at some point how to get the best of
> both. I really don't like the idea of putting the low-level
> details that we tend to have in DT into ACPI, but there are two
> things we can do: For systems that have a high-level abstraction
> for their PM in hardware (e.g. talking to an embedded controller
> that does the actual work), the ACPI description should contain
> enough information to implement a kernel-level driver for it as
> we have on Intel machines. For more traditional SoCs that do everything
> themselves, I would recommend to always have a working DT for
> those people wanting to get the most of their hardware. This will
> also enable any other SoC features that cannot be represented in
> ACPI.
The nice thing about ACPI is that we always have the option of
ignoring it when the driver knows better since it is always executed
under the control of the kernel interpreter. There is no ACPI going
off and doing something behind the kernel's back. To start with we
have the OSPM state model and devices can use additional ACPI methods
as needed, but as an optimization, the driver can do those operations
directly if the driver author has enough knowledge about the device.
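As a rough illustration of that fallback order, here is a small
self-contained C sketch. Everything in it is hypothetical (struct
sketch_dev, device_enable(), the CTRL_ENABLE bit, and the fw_method type
standing in for a firmware-provided control method); it does not use any
real kernel or ACPICA API. It only shows the decision being made by the
driver, under the OS's control: use direct register access when the
driver knows the hardware, otherwise fall back to the firmware method.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CTRL_ENABLE 0x1u   /* hypothetical enable bit */

/* Hypothetical stand-in for a firmware-provided control method. */
typedef int (*fw_method)(uint32_t *reg);

struct sketch_dev {
    uint32_t *ctrl_reg;     /* control register (plain memory here) */
    bool driver_knows_hw;   /* driver author knows the register layout */
    fw_method fw_enable;    /* optional firmware method, may be NULL */
};

enum enable_path { PATH_NATIVE, PATH_FIRMWARE, PATH_NONE };

/* Enable the device and report which path was taken. */
enum enable_path device_enable(struct sketch_dev *dev)
{
    if (dev->driver_knows_hw) {
        /* Optimization: skip the firmware method entirely. */
        *dev->ctrl_reg |= CTRL_ENABLE;
        return PATH_NATIVE;
    }
    if (dev->fw_enable != NULL && dev->fw_enable(dev->ctrl_reg) == 0)
        return PATH_FIRMWARE;
    return PATH_NONE;       /* no way to enable the device */
}

/* Example firmware method: sets the same bit on the driver's behalf. */
int example_fw_enable(uint32_t *reg)
{
    *reg |= CTRL_ENABLE;
    return 0;
}
```

Either path leaves the register in the same state; the difference is
only in who carries the knowledge of how to get there.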
>> Reliability, Availability & Serviceability (RAS)
>> ------------------------------------------------
>> 7. Support RAS interfaces
>>
>> This isn't a question of whether or not DT can support RAS. Of course
>> it can. Rather it is a matter of RAS bindings already existing for
>> ACPI, including a usage model. We've barely begun to explore this on
>> DT. This item doesn't make ACPI technically superior to DT, but it
>> certainly makes it more mature.
>
> Unfortunately, RAS can mean a lot of things to different people.
> Is there some high-level description of what the ACPI idea of RAS
> is? On systems I've worked on in the past, this was generally done
> out of band (e.g. in an IPMI BMC) because you can't really trust
> the running OS when you report errors that may impact data consistency
> of that OS.
RAS is also something where every company already has something that
they are using on their x86 machines. Those interfaces are being
ported over to the ARM platforms and will be equivalent to what they
already do for x86. So, for example, an ARM server from DELL will use
mostly the same RAS interfaces as an x86 server from DELL.
>
>> Multiplatform support
>> ---------------------
>> 1. Support multiple OSes, including Linux and Windows
>>
>> I'm tackling this item last because I think it is the most contentious
>> for those of us in the Linux world. I wanted to get the other issues
>> out of the way before addressing it.
>>
>> I know that this line of thought is more about market forces rather
>> than a hard technical argument between ACPI and DT, but it is an
>> equally significant one. Agreeing on a single way of doing things is
>> important. The ARM server ecosystem is better for the agreement to use
>> the same interface for all operating systems. This is what is meant by
>> standards compliant. The standard is a codification of the mutually
>> agreed interface. It provides confidence that all vendors are using
>> the same rules for interoperability.
>
> I do think that this is in fact the most important argument in favor
> of doing ACPI on Linux, because a number of companies are betting on
> Windows (or some in-house OS that uses ACPI) support. At the same time,
> I don't think talking of a single 'ARM server ecosystem' that needs to
> agree on one interface is helpful here. Each server company has their
> own business plan and their own constraints. I absolutely think that
> getting as many companies as possible to agree on SBSA and UEFI is
> helpful here because it reduces the differences between the platforms
> as seen by a distro. For companies that want to support Windows, it's
> obvious they want to have ACPI on their machines, for others the
> factors you mention above can be enough to justify the move to ACPI
> even without Windows support. Then there are other companies for
> which the tradeoffs are different, and I see no reason for forcing
> it on them. Finally there are and will likely always be chips that
> are not built around SBSA and someone will use the chips in creative
> ways to build servers from them, so we already don't have a homogeneous
> ecosystem.
Allow me to clarify my position here. This entire document is about
why ACPI was chosen for the ARM SBBR specification. The SBBR and the
SBSA are important because they document the agreements and
compromises made by vendors and industry representatives to get
interoperability. It is a tool for vendors to say that they are aiming
for compatibility with a particular hardware/software ecosystem.
*Nobody* is forced to implement these specifications. Any company is
free to ignore them and go their own way. The tradeoff in doing so is
that they are on their own for support. Non-compliant hardware
vendors have to convince OS vendors to support them, and similarly,
non-compliant OS vendors need to convince hardware vendors of the
same. Red Hat has stated very clearly that they won't support any
hardware that isn't SBSA/SBBR compliant. So has Microsoft. Canonical
on the other hand has said they will support anything for which there
is a business case. This certainly is a business decision and each company
needs to make its own choices.
As far as we (Linux maintainers) are concerned, we've also been really
clear that DT is not a second class citizen to ACPI. Mainline cannot
and should not force certain classes of machines to use ACPI and other
classes of machines to use DT. As long as the code is well written and
conforms to our rules for what ACPI or DT code is allowed to do, then
we should be happy to take the patches.
g.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List
2015-01-12 12:00 ` Grant Likely
@ 2015-01-12 19:40 ` Arnd Bergmann
-1 siblings, 0 replies; 76+ messages in thread
From: Arnd Bergmann @ 2015-01-12 19:40 UTC (permalink / raw)
To: linaro-acpi
Cc: Grant Likely, Catalin Marinas, Rafael J. Wysocki,
ACPI Devel Mailing List, Olof Johansson,
linux-arm-kernel@lists.infradead.org
On Monday 12 January 2015 12:00:31 Grant Likely wrote:
> On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
> >> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
> > This seems like a great fit for AML indeed, but I wonder what exactly
> > we want to hotplug here, since everything I can think of wouldn't need
> > AML support for the specific use case of SBSA compliant servers:
>
> [...]
>
> I've trimmed the specific examples here because I think that misses
> the point. The point is that regardless of interface (either ACPI or
> DT) there are always going to be cases where the data needs to change
> at runtime. Not all platforms will need to change the CPU data, but
> some will (say for a machine that detects a failed CPU and removes
> it). Some PCI add-in boards will carry along with them additional data
> that needs to be inserted into the ACPI namespace or DT. Some
> platforms will have system-level components (i.e. non-PCI) that may not
> always be accessible.
Just to be sure I get this right: do you mean runtime or boot-time
(re-)configuration for those?
> ACPI has an interface baked in already for tying data changes to
> events. DT currently needs platform specific support (which we can
> improve on). I'm not even trying to argue for ACPI over DT in this
> section, but I included it in this document because it is one of the
> reasons often given for choosing ACPI and I felt it required a more
> nuanced discussion.
I can definitely see the need for an architected interface for
dynamic reconfiguration in cases like this, and I think the ACPI
model actually does this better than the IBM Power hypervisor
model, I just didn't see the need on servers as opposed to something
like a laptop docking station to give a more obvious example I know
from x86.
> > * From the experience with x86, Linux tends to prefer using drivers
> > for hardware registers over the AML based drivers when both are
> > implemented, because of efficiency and correctness.
> >
> > We should probably discuss at some point how to get the best of
> > both. I really don't like the idea of putting the low-level
> > details that we tend to have in DT into ACPI, but there are two
> > things we can do: For systems that have a high-level abstraction
> > for their PM in hardware (e.g. talking to an embedded controller
> > that does the actual work), the ACPI description should contain
> > enough information to implement a kernel-level driver for it as
> > we have on Intel machines. For more traditional SoCs that do everything
> > themselves, I would recommend to always have a working DT for
> > those people wanting to get the most of their hardware. This will
> > also enable any other SoC features that cannot be represented in
> > ACPI.
>
> The nice thing about ACPI is that we always have the option of
> ignoring it when the driver knows better since it is always executed
> under the control of the kernel interpreter. There is no ACPI going
> off and doing something behind the kernel's back. To start with we
> have the OSPM state model and devices can use additional ACPI methods
> as needed, but as an optimization, the driver can do those operations
> directly if the driver author has enough knowledge about the device.
Ok, makes sense.
> >> Reliability, Availability & Serviceability (RAS)
> >> ------------------------------------------------
> >> 7. Support RAS interfaces
> >>
> >> This isn't a question of whether or not DT can support RAS. Of course
> >> it can. Rather it is a matter of RAS bindings already existing for
> >> ACPI, including a usage model. We've barely begun to explore this on
> >> DT. This item doesn't make ACPI technically superior to DT, but it
> >> certainly makes it more mature.
> >
> > Unfortunately, RAS can mean a lot of things to different people.
> > Is there some high-level description of what the ACPI idea of RAS
> > is? On systems I've worked on in the past, this was generally done
> > out of band (e.g. in an IPMI BMC) because you can't really trust
> > the running OS when you report errors that may impact data consistency
> > of that OS.
>
> RAS is also something where every company already has something that
> they are using on their x86 machines. Those interfaces are being
> ported over to the ARM platforms and will be equivalent to what they
> already do for x86. So, for example, an ARM server from DELL will use
> mostly the same RAS interfaces as an x86 server from DELL.
Right, I'm still curious about what those are, in case we have to
add DT bindings for them as well.
> > I do think that this is in fact the most important argument in favor
> > of doing ACPI on Linux, because a number of companies are betting on
> > Windows (or some in-house OS that uses ACPI) support. At the same time,
> > I don't think talking of a single 'ARM server ecosystem' that needs to
> > agree on one interface is helpful here. Each server company has their
> > own business plan and their own constraints. I absolutely think that
> > getting as many companies as possible to agree on SBSA and UEFI is
> > helpful here because it reduces the differences between the platforms
> > as seen by a distro. For companies that want to support Windows, it's
> > obvious they want to have ACPI on their machines, for others the
> > factors you mention above can be enough to justify the move to ACPI
> > even without Windows support. Then there are other companies for
> > which the tradeoffs are different, and I see no reason for forcing
> > it on them. Finally there are and will likely always be chips that
> > are not built around SBSA and someone will use the chips in creative
> > ways to build servers from them, so we already don't have a homogeneous
> > ecosystem.
>
> Allow me to clarify my position here. This entire document is about
> why ACPI was chosen for the ARM SBBR specification.
I thought it was about why we should merge ACPI support into the kernel,
which seems to me like a different thing.
> As far as we (Linux maintainers) are concerned, we've also been really
> clear that DT is not a second class citizen to ACPI. Mainline cannot
> and should not force certain classes of machines to use ACPI and other
> classes of machines to use DT. As long as the code is well written and
> conforms to our rules for what ACPI or DT code is allowed to do, then
> we should be happy to take the patches.
What we are still missing though is a recommendation for a boot protocol.
The UEFI bits in SBBR are generally useful for having compatibility
across machines that we support in the kernel regardless of the device
description, and we also need to have guidelines along the lines of
"if you do ACPI, then do it like this" that are in SBBR. However, the
way that these two are coupled into "you have to use ACPI and UEFI
this way to build a compliant server" really does make the document
much less useful for Linux.
Arnd
^ permalink raw reply [flat|nested] 76+ messages in thread
* [Linaro-acpi] [RFC] ACPI on arm64 TODO List
@ 2015-01-12 19:40 ` Arnd Bergmann
0 siblings, 0 replies; 76+ messages in thread
From: Arnd Bergmann @ 2015-01-12 19:40 UTC (permalink / raw)
To: linux-arm-kernel
On Monday 12 January 2015 12:00:31 Grant Likely wrote:
> On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
> >> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
> > This seems like a great fit for AML indeed, but I wonder what exactly
> > we want to hotplug here, since everything I can think of wouldn't need
> > AML support for the specific use case of SBSA compliant servers:
>
> [...]
>
> I've trimmed the specific examples here because I think that misses
> the point. The point is that regardless of interface (either ACPI or
> DT) there are always going to be cases where the data needs to change
> at runtime. Not all platforms will need to change the CPU data, but
> some will (say for a machine that detects a failed CPU and removes
> it). Some PCI add-in boards will carry along with them additional data
> that needs to be inserted into the ACPI namespace or DT. Some
> platforms will have system-level components (i.e. non-PCI) that may not
> always be accessible.
Just to be sure I get this right: do you mean runtime or boot-time
(re-)configuration for those?
> ACPI has an interface baked in already for tying data changes to
> events. DT currently needs platform specific support (which we can
> improve on). I'm not even trying to argue for ACPI over DT in this
> section, but I included it in this document because it is one of the
> reasons often given for choosing ACPI and I felt it required a more
> nuanced discussion.
I can definitely see the need for an architected interface for
dynamic reconfiguration in cases like this, and I think the ACPI
model actually does this better than the IBM Power hypervisor
model. I just didn't see the need on servers, as opposed to something
like a laptop docking station, to give a more obvious example I know
from x86.
> > * From the experience with x86, Linux tends to prefer using drivers
> > for hardware registers over the AML based drivers when both are
> > implemented, because of efficiency and correctness.
> >
> > We should probably discuss at some point how to get the best of
> > both. I really don't like the idea of putting the low-level
> > details that we tend to have in DT into ACPI, but there are two
> > things we can do: For systems that have a high-level abstraction
> > for their PM in hardware (e.g. talking to an embedded controller
> > that does the actual work), the ACPI description should contain
> > enough information to implement a kernel-level driver for it as
> > we have on Intel machines. For more traditional SoCs that do everything
> > themselves, I would recommend to always have a working DT for
> > those people wanting to get the most out of their hardware. This will
> > also enable any other SoC features that cannot be represented in
> > ACPI.
>
> The nice thing about ACPI is that we always have the option of
> ignoring it when the driver knows better since it is always executed
> under the control of the kernel interpreter. There is no ACPI going
> off and doing something behind the kernel's back. To start with we
> have the OSPM state model and devices can use additional ACPI methods
> as needed, but as an optimization, the driver can do those operations
> directly if the driver author has enough knowledge about the device.
Ok, makes sense.
> >> Reliability, Availability & Serviceability (RAS)
> >> ------------------------------------------------
> >> 7. Support RAS interfaces
> >>
> >> This isn't a question of whether or not DT can support RAS. Of course
> >> it can. Rather it is a matter of RAS bindings already existing for
> >> ACPI, including a usage model. We've barely begun to explore this on
> >> DT. This item doesn't make ACPI technically superior to DT, but it
> >> certainly makes it more mature.
> >
> > Unfortunately, RAS can mean a lot of things to different people.
> > Is there some high-level description of what the ACPI idea of RAS
> > is? On systems I've worked on in the past, this was generally done
> > out of band (e.g. in an IPMI BMC) because you can't really trust
> > the running OS when you report errors that may impact data consistency
> > of that OS.
>
> RAS is also something where every company already has something that
> they are using on their x86 machines. Those interfaces are being
> ported over to the ARM platforms and will be equivalent to what they
> already do for x86. So, for example, an ARM server from DELL will use
> mostly the same RAS interfaces as an x86 server from DELL.
Right, I'm still curious about what those are, in case we have to
add DT bindings for them as well.
> > I do think that this is in fact the most important argument in favor
> > of doing ACPI on Linux, because a number of companies are betting on
> > Windows (or some in-house OS that uses ACPI) support. At the same time,
> > I don't think talking of a single 'ARM server ecosystem' that needs to
> > agree on one interface is helpful here. Each server company has their
> > own business plan and their own constraints. I absolutely think that
> > getting as many companies as possible to agree on SBSA and UEFI is
> > helpful here because it reduces the differences between the platforms
> > as seen by a distro. For companies that want to support Windows, it's
> > obvious they want to have ACPI on their machines, for others the
> > factors you mention above can be enough to justify the move to ACPI
> > even without Windows support. Then there are other companies for
> > which the tradeoffs are different, and I see no reason for forcing
> > it on them. Finally there are and will likely always be chips that
> > are not built around SBSA and someone will use the chips in creative
> > ways to build servers from them, so we already don't have a homogeneous
> > ecosystem.
>
> Allow me to clarify my position here. This entire document is about
> why ACPI was chosen for the ARM SBBR specification.
I thought it was about why we should merge ACPI support into the kernel,
which seems to me like a different thing.
> As far as we (Linux maintainers) are concerned, we've also been really
> clear that DT is not a second class citizen to ACPI. Mainline cannot
> and should not force certain classes of machines to use ACPI and other
> classes of machines to use DT. As long as the code is well written and
> conforms to our rules for what ACPI or DT code is allowed to do, then
> we should be happy to take the patches.
What we are still missing though is a recommendation for a boot protocol.
The UEFI bits in SBBR are generally useful for having compatibility
across machines that we support in the kernel regardless of the device
description, and we also need to have guidelines along the lines of
"if you do ACPI, then do it like this" that are in SBBR. However, the
way that these two are coupled into "you have to use ACPI and UEFI
this way to build a compliant server" really does make the document
much less useful for Linux.
Arnd
* Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List
2015-01-12 19:40 ` Arnd Bergmann
@ 2015-01-13 17:22 ` Grant Likely
-1 siblings, 0 replies; 76+ messages in thread
From: Grant Likely @ 2015-01-13 17:22 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linaro-acpi, Catalin Marinas, Rafael J. Wysocki,
ACPI Devel Mailing List, Olof Johansson,
linux-arm-kernel@lists.infradead.org
On Mon, Jan 12, 2015 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Monday 12 January 2015 12:00:31 Grant Likely wrote:
>> On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
>> >> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>> > This seems like a great fit for AML indeed, but I wonder what exactly
>> > we want to hotplug here, since everything I can think of wouldn't need
>> > AML support for the specific use case of SBSA compliant servers:
>>
>> [...]
>>
>> I've trimmed the specific examples here because I think that misses
>> the point. The point is that regardless of interface (either ACPI or
>> DT) there are always going to be cases where the data needs to change
>> at runtime. Not all platforms will need to change the CPU data, but
>> some will (say for a machine that detects a failed CPU and removes
>> it). Some PCI add-in boards will carry along with them additional data
>> that needs to be inserted into the ACPI namespace or DT. Some
>> platforms will have system-level components (i.e. non-PCI) that may not
>> always be accessible.
>
> Just to be sure I get this right: do you mean runtime or boot-time
> (re-)configuration for those?
Both are important.
>> ACPI has an interface baked in already for tying data changes to
>> events. DT currently needs platform specific support (which we can
>> improve on). I'm not even trying to argue for ACPI over DT in this
>> section, but I included it in this document because it is one of the
>> reasons often given for choosing ACPI and I felt it required a more
>> nuanced discussion.
>
> I can definitely see the need for an architected interface for
> dynamic reconfiguration in cases like this, and I think the ACPI
> model actually does this better than the IBM Power hypervisor
> model. I just didn't see the need on servers, as opposed to something
> like a laptop docking station, to give a more obvious example I know
> from x86.
>
>> > * From the experience with x86, Linux tends to prefer using drivers
>> > for hardware registers over the AML based drivers when both are
>> > implemented, because of efficiency and correctness.
>> >
>> > We should probably discuss at some point how to get the best of
>> > both. I really don't like the idea of putting the low-level
>> > details that we tend to have in DT into ACPI, but there are two
>> > things we can do: For systems that have a high-level abstraction
>> > for their PM in hardware (e.g. talking to an embedded controller
>> > that does the actual work), the ACPI description should contain
>> > enough information to implement a kernel-level driver for it as
>> > we have on Intel machines. For more traditional SoCs that do everything
>> > themselves, I would recommend to always have a working DT for
>> > those people wanting to get the most out of their hardware. This will
>> > also enable any other SoC features that cannot be represented in
>> > ACPI.
>>
>> The nice thing about ACPI is that we always have the option of
>> ignoring it when the driver knows better since it is always executed
>> under the control of the kernel interpreter. There is no ACPI going
>> off and doing something behind the kernel's back. To start with we
>> have the OSPM state model and devices can use additional ACPI methods
>> as needed, but as an optimization, the driver can do those operations
>> directly if the driver author has enough knowledge about the device.
>
> Ok, makes sense.
>
>> >> Reliability, Availability & Serviceability (RAS)
>> >> ------------------------------------------------
>> >> 7. Support RAS interfaces
>> >>
>> >> This isn't a question of whether or not DT can support RAS. Of course
>> >> it can. Rather it is a matter of RAS bindings already existing for
>> >> ACPI, including a usage model. We've barely begun to explore this on
>> >> DT. This item doesn't make ACPI technically superior to DT, but it
>> >> certainly makes it more mature.
>> >
>> > Unfortunately, RAS can mean a lot of things to different people.
>> > Is there some high-level description of what the ACPI idea of RAS
>> > is? On systems I've worked on in the past, this was generally done
>> > out of band (e.g. in an IPMI BMC) because you can't really trust
>> > the running OS when you report errors that may impact data consistency
>> > of that OS.
>>
>> RAS is also something where every company already has something that
>> they are using on their x86 machines. Those interfaces are being
>> ported over to the ARM platforms and will be equivalent to what they
>> already do for x86. So, for example, an ARM server from DELL will use
>> mostly the same RAS interfaces as an x86 server from DELL.
>
> Right, I'm still curious about what those are, in case we have to
> add DT bindings for them as well.
Certainly.
>> > I do think that this is in fact the most important argument in favor
>> > of doing ACPI on Linux, because a number of companies are betting on
>> > Windows (or some in-house OS that uses ACPI) support. At the same time,
>> > I don't think talking of a single 'ARM server ecosystem' that needs to
>> > agree on one interface is helpful here. Each server company has their
>> > own business plan and their own constraints. I absolutely think that
>> > getting as many companies as possible to agree on SBSA and UEFI is
>> > helpful here because it reduces the differences between the platforms
>> > as seen by a distro. For companies that want to support Windows, it's
>> > obvious they want to have ACPI on their machines, for others the
>> > factors you mention above can be enough to justify the move to ACPI
>> > even without Windows support. Then there are other companies for
>> > which the tradeoffs are different, and I see no reason for forcing
>> > it on them. Finally there are and will likely always be chips that
>> > are not built around SBSA and someone will use the chips in creative
>> > ways to build servers from them, so we already don't have a homogeneous
>> > ecosystem.
>>
>> Allow me to clarify my position here. This entire document is about
>> why ACPI was chosen for the ARM SBBR specification.
>
> I thought it was about why we should merge ACPI support into the kernel,
> which seems to me like a different thing.
Nope! I'm not trying to make that argument here. This document is
primarily to document the rationale for choosing ACPI in the ARM
server SBBR document (from a Linux developer's perspective, granted).
I'll make arguments about actually merging the patches in a different email. :-)
>> As far as we (Linux maintainers) are concerned, we've also been really
>> clear that DT is not a second class citizen to ACPI. Mainline cannot
>> and should not force certain classes of machines to use ACPI and other
>> classes of machines to use DT. As long as the code is well written and
>> conforms to our rules for what ACPI or DT code is allowed to do, then
>> we should be happy to take the patches.
>
> What we are still missing though is a recommendation for a boot protocol.
> The UEFI bits in SBBR are generally useful for having compatibility
> across machines that we support in the kernel regardless of the device
> description, and we also need to have guidelines along the lines of
> "if you do ACPI, then do it like this" that are in SBBR. However, the
> way that these two are coupled into "you have to use ACPI and UEFI
> this way to build a compliant server" really does make the document
> much less useful for Linux.
I don't follow your argument. Exactly what problem do you have with
"You have to use ACPI and UEFI" to be compliant with the SBBR
document? Vendors absolutely have the choice to ignore those
documents, but doing so means they are explicitly rejecting the
platform that ARM has defined for server machines, and by extension,
explicitly rejecting the ecosystem and interoperability that goes with
it.
On the UEFI front, I don't see a problem. Linux mainline has the UEFI
stub on ARM, and that is the boot protocol.
For UEFI providing DT, the interface is set. We defined it, and it
works, but that is not part of the ARM server ecosystem as defined by
ARM. Why would the SBBR cover it?
My perspective is that Linux should support the SBSA+SBBR ecosystem,
but we don't need to be exclusive about it. We'll happily support
UEFI+DT platforms as well as UEFI+ACPI.
g.
* Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List
2015-01-13 17:22 ` Grant Likely
@ 2015-01-14 0:26 ` Al Stone
-1 siblings, 0 replies; 76+ messages in thread
From: Al Stone @ 2015-01-14 0:26 UTC (permalink / raw)
To: Grant Likely, Arnd Bergmann
Cc: linaro-acpi, Catalin Marinas, Rafael J. Wysocki,
ACPI Devel Mailing List, Olof Johansson,
linux-arm-kernel@lists.infradead.org
On 01/13/2015 10:22 AM, Grant Likely wrote:
> On Mon, Jan 12, 2015 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Monday 12 January 2015 12:00:31 Grant Likely wrote:
>>> On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>>> On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
>>>>> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>>>> This seems like a great fit for AML indeed, but I wonder what exactly
>>>> we want to hotplug here, since everything I can think of wouldn't need
>>>> AML support for the specific use case of SBSA compliant servers:
>>>
>>> [...]
>>>
>>> I've trimmed the specific examples here because I think that misses
>>> the point. The point is that regardless of interface (either ACPI or
>>> DT) there are always going to be cases where the data needs to change
>>> at runtime. Not all platforms will need to change the CPU data, but
>>> some will (say for a machine that detects a failed CPU and removes
>>> it). Some PCI add-in boards will carry along with them additional data
>>> that needs to be inserted into the ACPI namespace or DT. Some
>>> platforms will have system-level components (i.e. non-PCI) that may not
>>> always be accessible.
>>
>> Just to be sure I get this right: do you mean runtime or boot-time
>> (re-)configuration for those?
>
> Both are important.
>
>>> ACPI has an interface baked in already for tying data changes to
>>> events. DT currently needs platform specific support (which we can
>>> improve on). I'm not even trying to argue for ACPI over DT in this
>>> section, but I included it in this document because it is one of the
>>> reasons often given for choosing ACPI and I felt it required a more
>>> nuanced discussion.
>>
>> I can definitely see the need for an architected interface for
>> dynamic reconfiguration in cases like this, and I think the ACPI
>> model actually does this better than the IBM Power hypervisor
>> model. I just didn't see the need on servers, as opposed to something
>> like a laptop docking station, to give a more obvious example I know
>> from x86.
I know of at least one server product (non-ARM) that uses the
hot-plugging of CPUs and memory as a key feature, using the
ACPI OSPM model. Essentially, the customer buys a system with
a number of slots and pays for filling one or more of them up
front. As the need for capacity increases, CPUs and/or RAM get
enabled; i.e., you have spare capacity that you buy as you need
it. If you use up all the CPUs and RAM you have, you buy more
cards, fill the additional slots, and turn on what you need. This
is very akin to the virtual machine model, but done with real hardware
instead.
Whether or not this product is still being sold, I do not know. I
have not worked for that company for eight years, and the systems were
just coming out as I left. Regardless, this sort of hot-plug does make
sense in the server world, and has been used in shipping products.
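
As an aside on the OS side of this: once a CPU has been made available,
Linux exposes its state through sysfs regardless of whether the platform
description came from ACPI or DT. The Python sketch below only illustrates
that generic interface, not the product described above. It assumes the
usual /sys/devices/system/cpu layout (an "online" file holding a list such
as "0-3,8", and a per-CPU "cpuN/online" file); the helper names are made up
for this example.

```python
from pathlib import Path

CPU_SYSFS = "/sys/devices/system/cpu"  # assumed standard sysfs location


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8" into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. no CPUs in this state
        low, _, high = part.partition("-")
        cpus.extend(range(int(low), int(high or low) + 1))
    return cpus


def online_cpus(sysfs_root=CPU_SYSFS):
    """Return the CPU numbers the kernel currently reports as online."""
    return parse_cpu_list(Path(sysfs_root, "online").read_text())


def set_cpu_online(cpu, online, sysfs_root=CPU_SYSFS):
    """Ask the kernel to bring one CPU online (True) or offline (False)."""
    Path(sysfs_root, f"cpu{cpu}", "online").write_text("1" if online else "0")
```

Writing the per-CPU file needs root, and on many systems the boot CPU has
no "online" file at all, so a real tool would have to handle that case.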
>>>>[snip....]
>>
>>>>> Reliability, Availability & Serviceability (RAS)
>>>>> ------------------------------------------------
>>>>> 7. Support RAS interfaces
>>>>>
>>>>> This isn't a question of whether or not DT can support RAS. Of course
>>>>> it can. Rather it is a matter of RAS bindings already existing for
>>>>> ACPI, including a usage model. We've barely begun to explore this on
>>>>> DT. This item doesn't make ACPI technically superior to DT, but it
>>>>> certainly makes it more mature.
>>>>
>>>> Unfortunately, RAS can mean a lot of things to different people.
>>>> Is there some high-level description of what the ACPI idea of RAS
>>>> is? On systems I've worked on in the past, this was generally done
>>>> out of band (e.g. in an IPMI BMC) because you can't really trust
>>>> the running OS when you report errors that may impact data consistency
>>>> of that OS.
>>>
>>> RAS is also something where every company already has something that
>>> they are using on their x86 machines. Those interfaces are being
>>> ported over to the ARM platforms and will be equivalent to what they
>>> already do for x86. So, for example, an ARM server from DELL will use
>>> mostly the same RAS interfaces as an x86 server from DELL.
>>
>> Right, I'm still curious about what those are, in case we have to
>> add DT bindings for them as well.
>
> Certainly.
In ACPI terms, the features used are called APEI (ACPI Platform
Error Interfaces), and are defined in Section 18 of the specification. The
tables describe what the possible error sources are, where details about
the error are stored, and what to do when the errors occur. A lot of
the "RAS tools" out there that report and/or analyze error data rely on
this information being reported in the form given by the spec.
I only put "RAS tools" in quotes because it is indeed a very loosely
defined term -- I've had everything from webmin to SNMP to ganglia,
nagios and Tivoli described to me as a RAS tool. In all of those cases,
however, the basic idea was to capture errors as they occur, and try to
manage them properly. That is, replace disks that seem to be heading
downhill, or look for faults in RAM, or dropped packets on LANs --
anything that could help me avoid a catastrophic failure by doing some
preventive maintenance up front.
And indeed a BMC is often used for handling errors in servers, or to
report errors out to something like nagios or ganglia. It could
also just be a log in a bit of NVRAM, too, with a little daemon that
reports back somewhere. But, this is why APEI is used: it tries to
provide a well defined interface between those reporting the error
(firmware, hardware, OS, ...) and those that need to act on the error
(the BMC, the OS, or even other bits of firmware).
Does that help satisfy the curiosity a bit?
BTW, there are also some nice tools from ACPICA that, if enabled, allow
one to simulate the occurrence of an error and test out the response.
What you can do is define the error source and what response you want
the OSPM to take (HEST, or Hardware Error Source Table), then use the
EINJ (Error Injection) table to describe how to simulate the error
having occurred. You then tell ACPICA to "run" the EINJ and test how
the system actually responds. You can do this with many EINJ tables,
too, so you can experiment with or debug APEI tables as you develop
them.
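
For trying this on a running Linux system, the kernel's own APEI code also
exposes EINJ through debugfs. The sketch below is a hedged illustration of
that route rather than of the ACPICA tooling mentioned above. It assumes
debugfs is mounted at /sys/kernel/debug, that the apei/einj directory is
present (which should require CONFIG_ACPI_APEI_EINJ and firmware that
provides an EINJ table), and that available_error_type lists one
"0x... description" pair per line. The function names are hypothetical.

```python
from pathlib import Path

EINJ_DIR = "/sys/kernel/debug/apei/einj"  # assumed debugfs mount point


def parse_available_types(text):
    """Map each advertised error type value to its description."""
    types = {}
    for line in text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2:
            types[int(fields[0], 16)] = fields[1].strip()
    return types


def inject_error(error_type, einj_dir=EINJ_DIR):
    """Select an error type the firmware advertises, then trigger it."""
    root = Path(einj_dir)
    available = parse_available_types(
        (root / "available_error_type").read_text()
    )
    if error_type not in available:
        raise ValueError(f"error type {error_type:#x} is not advertised")
    (root / "error_type").write_text(f"{error_type:#x}\n")
    (root / "error_inject").write_text("1\n")  # any write triggers injection
    return available[error_type]
```

Injecting real errors can take a machine down, so this is something to run
only on a test system, as root, after checking what the platform reports
in available_error_type.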
--
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Red Hat, Inc.
ahs3@redhat.com
-----------------------------------
* [Linaro-acpi] [RFC] ACPI on arm64 TODO List
@ 2015-01-14 0:26 ` Al Stone
0 siblings, 0 replies; 76+ messages in thread
From: Al Stone @ 2015-01-14 0:26 UTC (permalink / raw)
To: linux-arm-kernel
On 01/13/2015 10:22 AM, Grant Likely wrote:
> On Mon, Jan 12, 2015 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Monday 12 January 2015 12:00:31 Grant Likely wrote:
>>> On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>>> On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
>>>>> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>>>> This seems like a great fit for AML indeed, but I wonder what exactly
>>>> we want to hotplug here, since everything I can think of wouldn't need
>>>> AML support for the specific use case of SBSA compliant servers:
>>>
>>> [...]
>>>
>>> I've trimmed the specific examples here because I think that misses
>>> the point. The point is that regardless of interface (either ACPI or
>>> DT) there are always going to be cases where the data needs to change
>>> at runtime. Not all platforms will need to change the CPU data, but
>>> some will (say for a machine that detects a failed CPU and removes
>>> it). Some PCI add-in boards will carry along with them additional data
>>> that needs to be inserted into the ACPI namespace or DT. Some
>>> platforms will have system-level components (i.e. non-PCI) that may not
>>> always be accessible.
>>
>> Just to be sure I get this right: do you mean runtime or boot-time
>> (re-)configuration for those?
>
> Both are important.
>
>>> ACPI has an interface baked in already for tying data changes to
>>> events. DT currently needs platform specific support (which we can
>>> improve on). I'm not even trying to argue for ACPI over DT in this
>>> section, but I included it in this document because it is one of the
>>> reasons often given for choosing ACPI and I felt it required a more
>>> nuanced discussion.
>>
>> I can definitely see the need for an architected interface for
>> dynamic reconfiguration in cases like this, and I think the ACPI
>> model actually does this better than the IBM Power hypervisor
>> model, I just didn't see the need on servers as opposed to something
>> like a laptop docking station to give a more obvious example I know
>> from x86.
I know of at least one server product (non-ARM) that uses the
hot-plugging of CPUs and memory as a key feature, using the
ACPI OSPM model. Essentially, the customer buys a system with
a number of slots and pays for filling one or more of them up
front. As the need for capacity increases, CPUs and/or RAM gets
enabled; i.e., you have spare capacity that you buy as you need
it. If you use up all the CPUs and RAM you have, you buy more
cards, fill the additional slots, and turn on what you need. This
is very akin to the virtual machine model, but done with real hardware
instead.
Whether or not this product is still being sold, I do not know. I
have not worked for that company for eight years, and they were just
coming out as I left. Regardless, this sort of hot-plug does make
sense in the server world, and has been used in shipping products.
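For illustration only, and assuming a Linux system (the product described
above is not tied to any particular OS): the kernel exposes CPU sets such as
"present", "online" and "offline" under /sys/devices/system/cpu as range
lists like "0-3,8". The sketch below expands such a list; the helper names
are hypothetical.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8,10-11" into a list of ints."""
    cpus = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # an empty file means "no CPUs in this set"
        first, _, last = chunk.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def read_cpu_set(name, base="/sys/devices/system/cpu"):
    """Read one CPU set file, e.g. "online", "offline" or "present"."""
    return parse_cpu_list((Path(base) / name).read_text())
```

Comparing read_cpu_set("present") with read_cpu_set("online") gives a rough
view of capacity that is installed but not yet enabled.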
>>>>[snip....]
>>
>>>>> Reliability, Availability & Serviceability (RAS)
>>>>> ------------------------------------------------
>>>>> 7. Support RAS interfaces
>>>>>
>>>>> This isn't a question of whether or not DT can support RAS. Of course
>>>>> it can. Rather it is a matter of RAS bindings already existing for
>>>>> ACPI, including a usage model. We've barely begun to explore this on
>>>>> DT. This item doesn't make ACPI technically superior to DT, but it
>>>>> certainly makes it more mature.
>>>>
>>>> Unfortunately, RAS can mean a lot of things to different people.
>>>> Is there some high-level description of what the ACPI idea of RAS
>>>> is? On systems I've worked on in the past, this was generally done
>>>> out of band (e.g. in an IPMI BMC) because you can't really trust
>>>> the running OS when you report errors that may impact data consistency
>>>> of that OS.
>>>
>>> RAS is also something where every company already has something that
>>> they are using on their x86 machines. Those interfaces are being
>>> ported over to the ARM platforms and will be equivalent to what they
>>> already do for x86. So, for example, an ARM server from DELL will use
>>> mostly the same RAS interfaces as an x86 server from DELL.
>>
>> Right, I'm still curious about what those are, in case we have to
>> add DT bindings for them as well.
>
> Certainly.
In ACPI terms, the features used are called APEI (Advanced Platform
Error Interface), and defined in Section 18 of the specification. The
tables describe what the possible error sources are, where details about
the error are stored, and what to do when the errors occur. A lot of
the "RAS tools" out there that report and/or analyze error data rely on
this information being reported in the form given by the spec.
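As a small sketch of what "the tables" means in practice: APEI is made up of
four ACPI tables (HEST, ERST, BERT, EINJ). Assuming a Linux system, where the
firmware-provided tables are listed by signature under
/sys/firmware/acpi/tables, one could check which of them a platform provides.
The function names here are hypothetical helpers, not an existing API.

```python
import os

# APEI table signatures and their names in the ACPI specification.
APEI_TABLES = {
    "HEST": "Hardware Error Source Table",
    "ERST": "Error Record Serialization Table",
    "BERT": "Boot Error Record Table",
    "EINJ": "Error Injection Table",
}


def apei_tables_in(names):
    """Return {signature: description} for the APEI tables found in names."""
    present = set(names)
    return {sig: desc for sig, desc in APEI_TABLES.items() if sig in present}


def list_firmware_tables(path="/sys/firmware/acpi/tables"):
    """List table files in the given directory (usually needs root to read)."""
    return [e for e in os.listdir(path) if os.path.isfile(os.path.join(path, e))]
```

Calling apei_tables_in(list_firmware_tables()) on a given machine shows which
parts of APEI its firmware actually implements.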
I only put "RAS tools" in quotes because it is indeed a very loosely
defined term -- I've had everything from webmin to SNMP to ganglia,
nagios and Tivoli described to me as a RAS tool. In all of those cases,
however, the basic idea was to capture errors as they occur, and try to
manage them properly. That is, replace disks that seem to be heading
downhill, or look for faults in RAM, or dropped packets on LANs --
anything that could help me avoid a catastrophic failure by doing some
preventive maintenance up front.
And indeed a BMC is often used for handling errors in servers, or to
report errors out to something like nagios or ganglia. It could
also just be a log in a bit of NVRAM, too, with a little daemon that
reports back somewhere. But, this is why APEI is used: it tries to
provide a well defined interface between those reporting the error
(firmware, hardware, OS, ...) and those that need to act on the error
(the BMC, the OS, or even other bits of firmware).
Does that help satisfy the curiosity a bit?
BTW, there are also some nice tools from ACPICA that, if enabled, allow
one to simulate the occurrence of an error and test out the response.
What you can do is define the error source and what response you want
the OSPM to take (HEST, or Hardware Error Source Table), then use the
EINJ (Error Injection) table to describe how to simulate the error
having occurred. You then tell ACPICA to "run" the EINJ and test how
the system actually responds. You can do this with many EINJ tables,
too, so you can experiment with or debug APEI tables as you develop
them.
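A related but different route, named here as an assumption rather than as the
ACPICA tooling described above: the Linux kernel has its own EINJ support
that, when enabled, exposes the injectable error types through debugfs in
apei/einj/available_error_type, one "code<TAB>description" pair per line. A
minimal sketch for parsing that file follows; the function names are
hypothetical.

```python
def parse_available_error_types(text):
    """Map error-type code -> description from "0x00000008<TAB>Memory Correctable" lines."""
    types = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        code = int(parts[0], 16)
        types[code] = parts[1].strip() if len(parts) > 1 else ""
    return types


def read_available_error_types(
    path="/sys/kernel/debug/apei/einj/available_error_type",
):
    # Assumes debugfs is mounted, EINJ support is enabled, and we run as root.
    with open(path) as f:
        return parse_available_error_types(f.read())
```

Knowing which error types the platform advertises is a reasonable first step
before trying to inject any of them.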
--
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Red Hat, Inc.
ahs3 at redhat.com
-----------------------------------
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List
2015-01-14 0:26 ` Al Stone
@ 2015-01-15 4:07 ` Hanjun Guo
-1 siblings, 0 replies; 76+ messages in thread
From: Hanjun Guo @ 2015-01-15 4:07 UTC (permalink / raw)
To: Al Stone, Grant Likely, Arnd Bergmann
Cc: linaro-acpi, Catalin Marinas, Rafael J. Wysocki,
ACPI Devel Mailing List, Olof Johansson,
linux-arm-kernel@lists.infradead.org
On 2015-01-14 08:26, Al Stone wrote:
> On 01/13/2015 10:22 AM, Grant Likely wrote:
>> On Mon, Jan 12, 2015 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>>> On Monday 12 January 2015 12:00:31 Grant Likely wrote:
>>>> On Mon, Jan 12, 2015 at 10:21 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>>>> On Saturday 10 January 2015 14:44:02 Grant Likely wrote:
>>>>>> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>>>>> This seems like a great fit for AML indeed, but I wonder what exactly
>>>>> we want to hotplug here, since everything I can think of wouldn't need
>>>>> AML support for the specific use case of SBSA compliant servers:
>>>>
>>>> [...]
>>>>
>>>> I've trimmed the specific examples here because I think that misses
>>>> the point. The point is that regardless of interface (either ACPI or
>>>> DT) there are always going to be cases where the data needs to change
>>>> at runtime. Not all platforms will need to change the CPU data, but
>>>> some will (say for a machine that detects a failed CPU and removes
>>>> it). Some PCI add-in boards will carry along with them additional data
>>>> that needs to be inserted into the ACPI namespace or DT. Some
>>>> platforms will have system level component (ie. non-PCI) that may not
>>>> always be accessible.
>>>
>>> Just to be sure I get this right: do you mean runtime or boot-time
>>> (re-)configuration for those?
>>
>> Both are important.
>>
>>>> ACPI has an interface baked in already for tying data changes to
>>>> events. DT currently needs platform specific support (which we can
>>>> improve on). I'm not even trying to argue for ACPI over DT in this
>>>> section, but I included it in this document because it is one of the
>>>> reasons often given for choosing ACPI and I felt it required a more
>>>> nuanced discussion.
>>>
>>> I can definitely see the need for an architected interface for
>>> dynamic reconfiguration in cases like this, and I think the ACPI
>>> model actually does this better than the IBM Power hypervisor
>>> model, I just didn't see the need on servers as opposed to something
>>> like a laptop docking station to give a more obvious example I know
>>> from x86.
>
> I know of at least one server product (non-ARM) that uses the
> hot-plugging of CPUs and memory as a key feature, using the
> ACPI OSPM model. Essentially, the customer buys a system with
> a number of slots and pays for filling one or more of them up
> front. As the need for capacity increases, CPUs and/or RAM gets
> enabled; i.e., you have spare capacity that you buy as you need
> it. If you use up all the CPUs and RAM you have, you buy more
> cards, fill the additional slots, and turn on what you need. This
> is very akin to the virtual machine model, but done with real hardware
> instead.
There is another important use case for RAS: systems running critical
workloads, such as a bank's billing system, need such high reliability
that the machine can't be stopped.
So when an error occurs in hardware such as a CPU or a memory DIMM on such
machines, we need to replace the failed part at run time.
>
> Whether or not this product is still being sold, I do not know. I
> have not worked for that company for eight years, and they were just
> coming out as I left. Regardless, this sort of hot-plug does make
> sense in the server world, and has been used in shipping products.
I think it still will be: Linux developers have put a lot of effort into
enabling memory hotplug and computer node hotplug in the kernel [1], and
the code has already been merged into mainline.
[1]:
http://events.linuxfoundation.org/sites/events/files/lcjp13_chen.pdf
Thanks
Hanjun
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List
2015-01-15 4:07 ` Hanjun Guo
@ 2015-01-15 17:15 ` Arnd Bergmann
-1 siblings, 0 replies; 76+ messages in thread
From: Arnd Bergmann @ 2015-01-15 17:15 UTC (permalink / raw)
To: linaro-acpi
Cc: Hanjun Guo, Al Stone, Grant Likely, Catalin Marinas,
Rafael J. Wysocki, ACPI Devel Mailing List, Olof Johansson,
linux-arm-kernel@lists.infradead.org
On Thursday 15 January 2015 12:07:45 Hanjun Guo wrote:
> On 2015-01-14 08:26, Al Stone wrote:
> > On 01/13/2015 10:22 AM, Grant Likely wrote:
> >> On Mon, Jan 12, 2015 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> >>> On Monday 12 January 2015 12:00:31 Grant Likely wrote:
> >>>> I've trimmed the specific examples here because I think that misses
> >>>> the point. The point is that regardless of interface (either ACPI or
> >>>> DT) there are always going to be cases where the data needs to change
> >>>> at runtime. Not all platforms will need to change the CPU data, but
> >>>> some will (say for a machine that detects a failed CPU and removes
> >>>> it). Some PCI add-in boards will carry along with them additional data
> >>>> that needs to be inserted into the ACPI namespace or DT. Some
> >>>> platforms will have system level component (ie. non-PCI) that may not
> >>>> always be accessible.
> >>>
> >>> Just to be sure I get this right: do you mean runtime or boot-time
> >>> (re-)configuration for those?
> >>
> >> Both are important.
But only one of them is relevant to the debate about what ACPI offers over
DT. By mixing the two, it's no longer clear which of your examples
are the ones that matter for runtime hotplugging.
> >>>> ACPI has an interface baked in already for tying data changes to
> >>>> events. DT currently needs platform specific support (which we can
> >>>> improve on). I'm not even trying to argue for ACPI over DT in this
> >>>> section, but I included it in this document because it is one of the
> >>>> reasons often given for choosing ACPI and I felt it required a more
> >>>> nuanced discussion.
> >>>
> >>> I can definitely see the need for an architected interface for
> >>> dynamic reconfiguration in cases like this, and I think the ACPI
> >>> model actually does this better than the IBM Power hypervisor
> >>> model, I just didn't see the need on servers as opposed to something
> >>> like a laptop docking station to give a more obvious example I know
> >>> from x86.
> >
> > I know of at least one server product (non-ARM) that uses the
> > hot-plugging of CPUs and memory as a key feature, using the
> > ACPI OSPM model. Essentially, the customer buys a system with
> > a number of slots and pays for filling one or more of them up
> > front. As the need for capacity increases, CPUs and/or RAM gets
> > enabled; i.e., you have spare capacity that you buy as you need
> > it. If you use up all the CPUs and RAM you have, you buy more
> > cards, fill the additional slots, and turn on what you need. This
> > is very akin to the virtual machine model, but done with real hardware
> > instead.
Yes, this is a good example, normally called Capacity-on-Demand (CoD),
and is a feature typically found in enterprise servers, but not in
commodity x86 machines. It would be helpful to hear from someone who
actually plans to do this on ARM, but I get the idea.
> There is another important user case for RAS, systems running critical
> missions such as bank billing system, system like that need high
> reliability that the machine can't be stopped.
>
> So when error happened on hardware including CPU/memory DIMM on such
> machines, we need to replace them at run-time.
>
> > Whether or not this product is still being sold, I do not know. I
> > have not worked for that company for eight years, and they were just
> > coming out as I left. Regardless, this sort of hot-plug does make
> > sense in the server world, and has been used in shipping products.
>
> I think it still will be, Linux developers put lots of effort to
> enable memory hotplug and computer node hotplug in the kernel [1], and
> the code already merged into mainline.
>
> [1]:
> http://events.linuxfoundation.org/sites/events/files/lcjp13_chen.pdf
The case of memory hotremove is interesting as well, but it has some
very significant limitations, regarding system integrity after
uncorrectable memory errors as well as nonmovable pages. The cases
I know either only support hot-add for CoD (see above), or they support
hot-replace for mirrored memory only, but that does not require any
interaction with the OS.
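To make the nonmovable-page limitation a bit more concrete, and assuming a
Linux system with memory hotplug enabled: each memory block appears as a
memoryN directory under /sys/devices/system/memory with a "state" file and,
on kernels that provide it, a "removable" file. The sketch below reads those
files; the function names are hypothetical, and a block reported as removable
can still fail to go offline.

```python
from pathlib import Path


def read_memory_blocks(base="/sys/devices/system/memory"):
    """Return {block_name: (state, removable)} for each memoryN directory."""
    blocks = {}
    for entry in sorted(Path(base).glob("memory[0-9]*")):
        state = (entry / "state").read_text().strip()
        removable_file = entry / "removable"
        # Treat a missing file as "unknown" (None) rather than guessing.
        removable = (
            removable_file.read_text().strip() == "1"
            if removable_file.exists()
            else None
        )
        blocks[entry.name] = (state, removable)
    return blocks


def hot_remove_candidates(blocks):
    """Names of online blocks that the kernel reports as removable."""
    return sorted(
        name
        for name, (state, removable) in blocks.items()
        if state == "online" and removable
    )
```

Blocks holding nonmovable pages are the ones expected to drop out of the
candidate list, which is why hot-remove is so much harder than hot-add.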
Thanks for the examples!
Arnd
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List
2015-01-14 0:26 ` Al Stone
@ 2015-01-15 17:19 ` Arnd Bergmann
-1 siblings, 0 replies; 76+ messages in thread
From: Arnd Bergmann @ 2015-01-15 17:19 UTC (permalink / raw)
To: linaro-acpi
Cc: Al Stone, Grant Likely, Catalin Marinas, Rafael J. Wysocki,
ACPI Devel Mailing List, Olof Johansson,
linux-arm-kernel@lists.infradead.org
On Tuesday 13 January 2015 17:26:33 Al Stone wrote:
> On 01/13/2015 10:22 AM, Grant Likely wrote:
> > On Mon, Jan 12, 2015 at 7:40 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> On Monday 12 January 2015 12:00:31 Grant Likely wrote:
> >>> RAS is also something where every company already has something that
> >>> they are using on their x86 machines. Those interfaces are being
> >>> ported over to the ARM platforms and will be equivalent to what they
> >>> already do for x86. So, for example, an ARM server from DELL will use
> >>> mostly the same RAS interfaces as an x86 server from DELL.
> >>
> >> Right, I'm still curious about what those are, in case we have to
> >> add DT bindings for them as well.
> >
> > Certainly.
>
> In ACPI terms, the features used are called APEI (Advanced Platform
> Error Interface), and defined in Section 18 of the specification. The
> tables describe what the possible error sources are, where details about
> the error are stored, and what to do when the errors occur. A lot of
> the "RAS tools" out there that report and/or analyze error data rely on
> this information being reported in the form given by the spec.
>
> I only put "RAS tools" in quotes because it is indeed a very loosely
> defined term -- I've had everything from webmin to SNMP to ganglia,
> nagios and Tivoli described to me as a RAS tool. In all of those cases,
> however, the basic idea was to capture errors as they occur, and try to
> manage them properly. That is, replace disks that seem to be heading
> downhill, or look for faults in RAM, or dropped packets on LANs --
> anything that could help me avoid a catastrophic failure by doing some
> preventive maintenance up front.
>
> And indeed a BMC is often used for handling errors in servers, or to
> report errors out to something like nagios or ganglia. It could
> also just be a log in a bit of NVRAM, too, with a little daemon that
> reports back somewhere. But, this is why APEI is used: it tries to
> provide a well defined interface between those reporting the error
> (firmware, hardware, OS, ...) and those that need to act on the error
> (the BMC, the OS, or even other bits of firmware).
>
> Does that help satisfy the curiosity a bit?
Yes, it's much clearer now, thanks!
Arnd
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [RFC] ACPI on arm64 TODO List
2015-01-10 14:44 ` Grant Likely
@ 2015-01-12 14:23 ` Pavel Machek
-1 siblings, 0 replies; 76+ messages in thread
From: Pavel Machek @ 2015-01-12 14:23 UTC (permalink / raw)
To: Grant Likely
Cc: Arnd Bergmann, Al Stone, linaro-acpi@lists.linaro.org,
Catalin Marinas, Rafael J. Wysocki, ACPI Devel Mailing List,
Olof Johansson, linux-arm-kernel@lists.infradead.org
On Sat 2015-01-10 14:44:02, Grant Likely wrote:
> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
> > On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> On Monday 15 December 2014 19:18:16 Al Stone wrote:
> >>> 7. Why is ACPI required?
> >>> * Problem:
> >>> * arm64 maintainers still haven't been convinced that ACPI is
> >>> necessary.
> >>> * Why do hardware and OS vendors say ACPI is required?
> >>> * Status: Al & Grant collecting statements from OEMs to be posted
> >>> publicly early in the new year; firmware summit for broader
> >>> discussion planned.
> >>
> >> I was particularly hoping to see better progress on this item. It
> >> really shouldn't be that hard to explain why someone wants this feature.
> >
> > I've written something up in as a reply on the firmware summit thread.
> > I'm going to rework it to be a standalone document and post it
> > publicly. I hope that should resolve this issue.
>
> I've posted an article on my blog, but I'm reposting it here because
> the mailing list is more conducive to discussion...
>
> http://www.secretlab.ca/archives/151
Unfortunately, I saw the blog post before the mailing list post, so
here's a reply in blog format.
Grant Likely published an article about ACPI and ARM at
http://www.secretlab.ca/archives/151 . He acknowledges that systems with
ACPI are harder to debug, but because Microsoft says so, we have to use
ACPI (basically).
I believe making the wrong technical choice "because Microsoft says so" is
the wrong thing to do.
Yes, ACPI gives more flexibility to hardware vendors. Imagine
replacing block devices with interpreted bytecode coming from
ROM. That is obviously bad, right? Why is it good for power
management?
It is not.
Besides being harder to debug, there are more disadvantages:
* Size, speed and complexity disadvantage of a bytecode interpreter in
the kernel.
* Many more drivers. Imagine a GPIO switch controlling rfkill (for
example). In the device tree case, that's a few lines in the .dts
specifying which GPIO that switch is on.
In the ACPI case, each hardware vendor initially implements the rfkill
switch in AML, differently. After a few years, each vendor implements a
(different) kernel<->AML interface for querying rfkill state and
toggling it in software. A few years after that, we implement kernel
drivers for those AML interfaces, to properly integrate them in the
kernel.
* Incompatibility. ARM servers will now be very different from other
ARM systems.
Now, are there some arguments for ACPI? Yes -- it allows hw vendors to
hack half-working drivers without touching kernel
sources. (Half-working: such drivers are not properly integrated in
all the various subsystems). Grant claims that power management is
somehow special, and requirement for real drivers is somehow ok for
normal drivers (block, video), but not for power management. Now,
getting driver merged into the kernel does not take that long -- less
than half a year if you know what you are doing. Plus, for power
management, you can really just initialize hardware in the bootloader
(into working but not optimal state). But basic drivers are likely to
be merged fast, and then you'll just have to supply DT tables.
Avoid ACPI. It only makes things more complex and harder to debug.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 76+ messages in thread
* [RFC] ACPI on arm64 TODO List
@ 2015-01-12 14:23 ` Pavel Machek
0 siblings, 0 replies; 76+ messages in thread
From: Pavel Machek @ 2015-01-12 14:23 UTC (permalink / raw)
To: linux-arm-kernel
On Sat 2015-01-10 14:44:02, Grant Likely wrote:
> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
> > On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> On Monday 15 December 2014 19:18:16 Al Stone wrote:
> >>> 7. Why is ACPI required?
> >>> * Problem:
> >>> * arm64 maintainers still haven't been convinced that ACPI is
> >>> necessary.
> >>> * Why do hardware and OS vendors say ACPI is required?
> >>> * Status: Al & Grant collecting statements from OEMs to be posted
> >>> publicly early in the new year; firmware summit for broader
> >>> discussion planned.
> >>
> >> I was particularly hoping to see better progress on this item. It
> >> really shouldn't be that hard to explain why someone wants this feature.
> >
> > I've written something up in as a reply on the firmware summit thread.
> > I'm going to rework it to be a standalone document and post it
> > publicly. I hope that should resolve this issue.
>
> I've posted an article on my blog, but I'm reposting it here because
> the mailing list is more conducive to discussion...
>
> http://www.secretlab.ca/archives/151
Unfortunately, I seen the blog post before the mailing list post, so
here's reply in blog format.
Grant Likely published article about ACPI and ARM at
http://www.secretlab.ca/archives/151
. He acknowledges systems with ACPI are harder to debug, but because
Microsoft says so, we have to use ACPI (basically).
I believe doing wrong technical choice "because Microsoft says so" is
a wrong thing to do.
Yes, ACPI gives more flexibility to hardware vendors. Imagine
replacing block devices with interpretted bytecode coming from
ROM. That is obviously bad, right? Why is it good for power
management?
It is not.
Besides being harder to debug, there are more disadvantages:
* Size, speed and complexity disadvantage of bytecode interpretter in
the kernel.
* Many more drivers. Imagine GPIO switch, controlling rfkill
(for
example). In device tree case, that's few lines in the .dts
specifying
which GPIO that switch is on.
In ACPI case, each hardware vendor initially implements rfkill switch
in AML, differently. After few years, each vendor implements
(different) kernel<->AML interface for querying rfkill state and
toggling it in software. Few years after that, we implement kernel
drivers for those AML interfaces, to properly integrate them in
the
kernel.
* Incompatibility. ARM servers will now be very different from other
ARM systems.
Now, are there some arguments for ACPI? Yes -- it allows hw vendors to
hack half-working drivers without touching kernel
sources. (Half-working: such drivers are not properly integrated in
all the various subsystems). Grant claims that power management is
somehow special, and requirement for real drivers is somehow ok for
normal drivers (block, video), but not for power management. Now,
getting driver merged into the kernel does not take that long -- less
than half a year if you know what you are doing. Plus, for power
management, you can really just initialize hardware in the bootloader
(into working but not optimal state). But basic drivers are likely to
merged fast, and then you'll just have to supply DT tables.
Avoid ACPI. It only makes things more complex and harder to debug.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [RFC] ACPI on arm64 TODO List
2015-01-12 14:23 ` Pavel Machek
@ 2015-01-12 14:41 ` Grant Likely
-1 siblings, 0 replies; 76+ messages in thread
From: Grant Likely @ 2015-01-12 14:41 UTC (permalink / raw)
To: Pavel Machek
Cc: Arnd Bergmann, Al Stone, linaro-acpi@lists.linaro.org,
Catalin Marinas, Rafael J. Wysocki, ACPI Devel Mailing List,
Olof Johansson, linux-arm-kernel@lists.infradead.org
On Mon, Jan 12, 2015 at 2:23 PM, Pavel Machek <pavel@ucw.cz> wrote:
> On Sat 2015-01-10 14:44:02, Grant Likely wrote:
>> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>> > On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> >> On Monday 15 December 2014 19:18:16 Al Stone wrote:
>> >>> 7. Why is ACPI required?
>> >>> * Problem:
>> >>> * arm64 maintainers still haven't been convinced that ACPI is
>> >>> necessary.
>> >>> * Why do hardware and OS vendors say ACPI is required?
>> >>> * Status: Al & Grant collecting statements from OEMs to be posted
>> >>> publicly early in the new year; firmware summit for broader
>> >>> discussion planned.
>> >>
>> >> I was particularly hoping to see better progress on this item. It
>> >> really shouldn't be that hard to explain why someone wants this feature.
>> >
>> > I've written something up in as a reply on the firmware summit thread.
>> > I'm going to rework it to be a standalone document and post it
>> > publicly. I hope that should resolve this issue.
>>
>> I've posted an article on my blog, but I'm reposting it here because
>> the mailing list is more conducive to discussion...
>>
>> http://www.secretlab.ca/archives/151
>
> Unfortunately, I seen the blog post before the mailing list post, so
> here's reply in blog format.
>
> Grant Likely published article about ACPI and ARM at
>
> http://www.secretlab.ca/archives/151
>
> . He acknowledges systems with ACPI are harder to debug, but because
> Microsoft says so, we have to use ACPI (basically).
Please reread the blog post. Microsoft is a factor, but it is not the
primary driver by any means.
g.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [RFC] ACPI on arm64 TODO List
2015-01-12 14:41 ` Grant Likely
@ 2015-01-12 19:39 ` Pavel Machek
-1 siblings, 0 replies; 76+ messages in thread
From: Pavel Machek @ 2015-01-12 19:39 UTC (permalink / raw)
To: Grant Likely
Cc: Arnd Bergmann, Al Stone, linaro-acpi@lists.linaro.org,
Catalin Marinas, Rafael J. Wysocki, ACPI Devel Mailing List,
Olof Johansson, linux-arm-kernel@lists.infradead.org
On Mon 2015-01-12 14:41:50, Grant Likely wrote:
> On Mon, Jan 12, 2015 at 2:23 PM, Pavel Machek <pavel@ucw.cz> wrote:
> > On Sat 2015-01-10 14:44:02, Grant Likely wrote:
> >> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
> >> > On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> >> On Monday 15 December 2014 19:18:16 Al Stone wrote:
> >> >>> 7. Why is ACPI required?
> >> >>> * Problem:
> >> >>> * arm64 maintainers still haven't been convinced that ACPI is
> >> >>> necessary.
> >> >>> * Why do hardware and OS vendors say ACPI is required?
> >> >>> * Status: Al & Grant collecting statements from OEMs to be posted
> >> >>> publicly early in the new year; firmware summit for broader
> >> >>> discussion planned.
> >> >>
> >> >> I was particularly hoping to see better progress on this item. It
> >> >> really shouldn't be that hard to explain why someone wants this feature.
> >> >
> >> > I've written something up in as a reply on the firmware summit thread.
> >> > I'm going to rework it to be a standalone document and post it
> >> > publicly. I hope that should resolve this issue.
> >>
> >> I've posted an article on my blog, but I'm reposting it here because
> >> the mailing list is more conducive to discussion...
> >>
> >> http://www.secretlab.ca/archives/151
> >
> > Unfortunately, I seen the blog post before the mailing list post, so
> > here's reply in blog format.
> >
> > Grant Likely published article about ACPI and ARM at
> >
> > http://www.secretlab.ca/archives/151
> >
> > . He acknowledges systems with ACPI are harder to debug, but because
> > Microsoft says so, we have to use ACPI (basically).
>
> Please reread the blog post. Microsoft is a factor, but it is not the
> primary driver by any means.
Ok, so what is the primary reason? As far as I could tell it is
"Microsoft wants ACPI" and "hardware people want Microsoft" and
"fragmentation is bad so we do ACPI" (1) (and maybe "someone at RedHat
says they want ACPI" -- but RedHat people should really speak for
themselves.)
You snipped quite a lot of reasons why ACPI is inferior that were
below this line in the email.
Pavel
(1) ignoring the fact that it causes fragmentation between servers and phones.
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [RFC] ACPI on arm64 TODO List
2015-01-12 19:39 ` Pavel Machek
@ 2015-01-12 19:55 ` Arnd Bergmann
-1 siblings, 0 replies; 76+ messages in thread
From: Arnd Bergmann @ 2015-01-12 19:55 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Pavel Machek, Grant Likely, Al Stone,
linaro-acpi@lists.linaro.org, Catalin Marinas, Rafael J. Wysocki,
ACPI Devel Mailing List, Olof Johansson
On Monday 12 January 2015 20:39:05 Pavel Machek wrote:
>
> Ok, so what is the primary reason? As far as I could tell it is
> "Microsoft wants ACPI" and "hardware people want Microsoft" and
> "fragmentation is bad so we do ACPI" (1) (and maybe "someone at RedHat
> says they want ACPI" -- but RedHat people should really speak for
> themselves.)
I can only find the first two in Grant's document, not the third one,
and I don't think it's on the table any more. The argument that was
in there was that for a given platform that wants to support both
Linux and Windows, they can use ACPI and Linux should work with that,
but that is different from "all servers must use ACPI", which would
be unrealistic.
Arnd
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [RFC] ACPI on arm64 TODO List
2015-01-12 19:39 ` Pavel Machek
@ 2015-01-13 14:12 ` Grant Likely
-1 siblings, 0 replies; 76+ messages in thread
From: Grant Likely @ 2015-01-13 14:12 UTC (permalink / raw)
To: Pavel Machek
Cc: Arnd Bergmann, Al Stone, linaro-acpi@lists.linaro.org,
Catalin Marinas, Rafael J. Wysocki, ACPI Devel Mailing List,
Olof Johansson, linux-arm-kernel@lists.infradead.org
On Mon, Jan 12, 2015 at 7:39 PM, Pavel Machek <pavel@ucw.cz> wrote:
> On Mon 2015-01-12 14:41:50, Grant Likely wrote:
>> On Mon, Jan 12, 2015 at 2:23 PM, Pavel Machek <pavel@ucw.cz> wrote:
>> > On Sat 2015-01-10 14:44:02, Grant Likely wrote:
>> >> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>> >> > On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> >> >> On Monday 15 December 2014 19:18:16 Al Stone wrote:
>> >> >>> 7. Why is ACPI required?
>> >> >>> * Problem:
>> >> >>> * arm64 maintainers still haven't been convinced that ACPI is
>> >> >>> necessary.
>> >> >>> * Why do hardware and OS vendors say ACPI is required?
>> >> >>> * Status: Al & Grant collecting statements from OEMs to be posted
>> >> >>> publicly early in the new year; firmware summit for broader
>> >> >>> discussion planned.
>> >> >>
>> >> >> I was particularly hoping to see better progress on this item. It
>> >> >> really shouldn't be that hard to explain why someone wants this feature.
>> >> >
>> >> > I've written something up in as a reply on the firmware summit thread.
>> >> > I'm going to rework it to be a standalone document and post it
>> >> > publicly. I hope that should resolve this issue.
>> >>
>> >> I've posted an article on my blog, but I'm reposting it here because
>> >> the mailing list is more conducive to discussion...
>> >>
>> >> http://www.secretlab.ca/archives/151
>> >
>> > Unfortunately, I seen the blog post before the mailing list post, so
>> > here's reply in blog format.
>> >
>> > Grant Likely published article about ACPI and ARM at
>> >
>> > http://www.secretlab.ca/archives/151
>> >
>> > . He acknowledges systems with ACPI are harder to debug, but because
>> > Microsoft says so, we have to use ACPI (basically).
>>
>> Please reread the blog post. Microsoft is a factor, but it is not the
>> primary driver by any means.
>
> Ok, so what is the primary reason? As far as I could tell it is
> "Microsoft wants ACPI" and "hardware people want Microsoft" and
> "fragmentation is bad so we do ACPI" (1) (and maybe "someone at RedHat
> says they want ACPI" -- but RedHat people should really speak for
> themselves.)
The primary driver is abstraction. It is a hard requirement of the
hardware vendors. They have to have the ability to adapt their
products at a software level to support existing Linux distributions
and other operating system releases. This is exactly what they do now
in the x86 market, and they are not going to enter the ARM server
market without this ability.
Even if DT had been chosen, the condition would have been to add an
abstraction model into DT, and then DT would end up looking like an
immature ACPI.
The secondary driver is consistency. When hardware vendors and OS
vendors produce independent products for the same ecosystem that must
be compatible at the binary level, then it is really important that
everyone in that ecosystem uses the same interfaces. At this level it
doesn't matter if it is DT or ACPI, just as long as everyone uses the
same thing.
[Of course, vendors have the option of completely rejecting the server
specifications as published by ARM, with the understanding that they
will probably need to either a) ship both the HW and OS themselves, or
b) create a separate and competing ecosystem.]
If the reason was merely as you say, "because Microsoft says so", then
my blog post would have been much shorter. I would have had no qualms
about saying so bluntly if that was actually the case. Instead, this
is the key paragraph to pay attention to:
> > However, the enterprise folks don't have that luxury. The
> > platform/kernel split isn't a design choice. It is a characteristic
> > of the market. Hardware and OS vendors each have their own product
> > timetables, and they don't line up. The timeline for getting patches
> > into the kernel and flowing through into OS releases puts OS support
> > far downstream from the actual release of hardware. Hardware vendors
> > simply cannot wait for OS support to come online to be able to
> > release their products. They need to be able to work with available
> > releases, and make their hardware behave in the way the OS expects.
> > The advantage of ACPI OSPM is that it defines behaviour and limits
> > what the hardware is allowed to do without involving the kernel.
All of the above applies regardless of whether or not vendors only
care about Linux support. ACPI is strongly desired regardless of
whether or not Microsoft is in the picture. Their support merely adds
weight behind an argument that is already the choice preferred by
hardware vendors.
> You snipped quite a lot of reasons why ACPI is inferior that were
> below this line in email.
Yes, I did. My primary point is that ACPI was chosen because the
hardware OEMs require some level of abstraction. If you don't agree
that the abstraction is important, then we fundamentally don't agree
on what the market looks like. In which case there is absolutely no
value in debating the details because each of us is working from a
different set of requirements.
> Pavel
>
> (1) ignoring fact that it causes fragmentation between servers and phones.
That's a red herring. ARM servers are not an extension of the ARM
phone market. The software ecosystem is completely different (the
phone vendor builds and ships the OS instead of it being provided by
one of many independent OS vendors). ARM servers are an extension of the
x86 server market, and they will be judged in terms of how they
compare to a similar x86 machine. It is in the HW vendors' best
interest to make using an ARM server as similar to their existing x86
products as possible.
When confronted with the choice of similarity with ARM phones or with
x86 servers, the vendors will choose to follow x86's lead, and
they will be right to do so.
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [RFC] ACPI on arm64 TODO List
2015-01-12 19:39 ` Pavel Machek
@ 2015-01-14 1:21 ` Al Stone
-1 siblings, 0 replies; 76+ messages in thread
From: Al Stone @ 2015-01-14 1:21 UTC (permalink / raw)
To: Pavel Machek, Grant Likely
Cc: Arnd Bergmann, Al Stone, linaro-acpi@lists.linaro.org,
Catalin Marinas, Rafael J. Wysocki, ACPI Devel Mailing List,
Olof Johansson, linux-arm-kernel@lists.infradead.org
On 01/12/2015 12:39 PM, Pavel Machek wrote:
> On Mon 2015-01-12 14:41:50, Grant Likely wrote:
>> On Mon, Jan 12, 2015 at 2:23 PM, Pavel Machek <pavel@ucw.cz> wrote:
>>> On Sat 2015-01-10 14:44:02, Grant Likely wrote:
>>>> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>>>>> On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>>>>> On Monday 15 December 2014 19:18:16 Al Stone wrote:
>>>>>>> 7. Why is ACPI required?
>>>>>>> * Problem:
>>>>>>> * arm64 maintainers still haven't been convinced that ACPI is
>>>>>>> necessary.
>>>>>>> * Why do hardware and OS vendors say ACPI is required?
>>>>>>> * Status: Al & Grant collecting statements from OEMs to be posted
>>>>>>> publicly early in the new year; firmware summit for broader
>>>>>>> discussion planned.
>>>>>>
>>>>>> I was particularly hoping to see better progress on this item. It
>>>>>> really shouldn't be that hard to explain why someone wants this feature.
>>>>>
>>>>> I've written something up in as a reply on the firmware summit thread.
>>>>> I'm going to rework it to be a standalone document and post it
>>>>> publicly. I hope that should resolve this issue.
>>>>
>>>> I've posted an article on my blog, but I'm reposting it here because
>>>> the mailing list is more conducive to discussion...
>>>>
>>>> http://www.secretlab.ca/archives/151
>>>
>>> Unfortunately, I seen the blog post before the mailing list post, so
>>> here's reply in blog format.
>>>
>>> Grant Likely published article about ACPI and ARM at
>>>
>>> http://www.secretlab.ca/archives/151
>>>
>>> . He acknowledges systems with ACPI are harder to debug, but because
>>> Microsoft says so, we have to use ACPI (basically).
>>
>> Please reread the blog post. Microsoft is a factor, but it is not the
>> primary driver by any means.
>
> Ok, so what is the primary reason? As far as I could tell it is
> "Microsoft wants ACPI" and "hardware people want Microsoft" and
> "fragmentation is bad so we do ACPI" (1) (and maybe "someone at RedHat
> says they want ACPI" -- but RedHat people should really speak for
> themselves.)
I have to say I found this statement fascinating.
I have been seconded to Linaro from Red Hat for over two years now,
working on getting ACPI running, first as a prototype on an ARMv7 box,
then on ARMv8. I have been working with Grant since very early on when
some of us first started talking about ARM servers in the enterprise
market, and what sorts of standards, if any, would be needed to build an
ecosystem.
This is the first time in at least two years that I have had someone
ask for Red Hat to speak up about ACPI on ARM servers; it's usually
quite the opposite, as in "will you Red Hat folks please shut up about
this already?" :).
For all the reasons Grant has already mentioned, my Customers need to
have ACPI on ARM servers for them to be successful in their business.
I view my job as providing what my Customers need to be successful.
So, here I am. I want ACPI on ARMv8 for my Customers.
> You snipped quite a lot of reasons why ACPI is inferior that were
> below this line in email.
>
> Pavel
>
> (1) ignoring fact that it causes fragmentation between servers and phones.
>
I see this very differently. This is a "fact" only when viewed from
the perspective of having two different technologies that can do very
similar things.
In my opinion, the issue is that these are two very, very different
markets; technologies are only relevant as the tools to be used to be
successful in those markets.
Just on a surface level, phones are expected to be completely replaced
every 18 months or less -- new hardware, new version of the OS, new
everything. That's the driving force in the market.
A server does not change that quickly; it is probable that the hardware
will change, but it is unlikely to change at that speed. It can take
18 months just for some of the certification testing needed for new
hardware or software. Further, everything from the kernel on up is
expected to be stable for a long time -- as long as 25 years, in some
cases I have worked on. "New" can be a bad word in this environment.
Best I can tell, I need different tool sets to do well in each of these
environments -- one that allows me to move quickly for phones, and one
that allows me to carefully control change for servers. I personally
don't see that as fragmentation, but as using the right tool for the
job. If I'm building a phone, I want the speed and flexibility of DT.
If I'm building a server, I want the long term stability of ACPI.
--
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Red Hat, Inc.
ahs3@redhat.com
-----------------------------------
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [Linaro-acpi] [RFC] ACPI on arm64 TODO List
2015-01-14 1:21 ` Al Stone
@ 2015-01-15 17:45 ` Linda Knippers
-1 siblings, 0 replies; 76+ messages in thread
From: Linda Knippers @ 2015-01-15 17:45 UTC (permalink / raw)
To: Al Stone, Pavel Machek, Grant Likely
Cc: Arnd Bergmann, linaro-acpi@lists.linaro.org, Catalin Marinas,
Rafael J. Wysocki, ACPI Devel Mailing List, Olof Johansson,
linux-arm-kernel@lists.infradead.org
On 1/13/2015 7:21 PM, Al Stone wrote:
> On 01/12/2015 12:39 PM, Pavel Machek wrote:
>> On Mon 2015-01-12 14:41:50, Grant Likely wrote:
>>> On Mon, Jan 12, 2015 at 2:23 PM, Pavel Machek <pavel@ucw.cz> wrote:
>>>> On Sat 2015-01-10 14:44:02, Grant Likely wrote:
>>>>> On Wed, Dec 17, 2014 at 10:26 PM, Grant Likely <grant.likely@linaro.org> wrote:
>>>>>> On Tue, Dec 16, 2014 at 11:27 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>>>>>>> On Monday 15 December 2014 19:18:16 Al Stone wrote:
>>>>>>>> 7. Why is ACPI required?
>>>>>>>> * Problem:
>>>>>>>> * arm64 maintainers still haven't been convinced that ACPI is
>>>>>>>> necessary.
>>>>>>>> * Why do hardware and OS vendors say ACPI is required?
>>>>>>>> * Status: Al & Grant collecting statements from OEMs to be posted
>>>>>>>> publicly early in the new year; firmware summit for broader
>>>>>>>> discussion planned.
>>>>>>>
>>>>>>> I was particularly hoping to see better progress on this item. It
>>>>>>> really shouldn't be that hard to explain why someone wants this feature.
>>>>>>
>>>>>> I've written something up as a reply on the firmware summit thread.
>>>>>> I'm going to rework it to be a standalone document and post it
>>>>>> publicly. I hope that should resolve this issue.
>>>>>
>>>>> I've posted an article on my blog, but I'm reposting it here because
>>>>> the mailing list is more conducive to discussion...
>>>>>
>>>>> http://www.secretlab.ca/archives/151
>>>>
>>>> Unfortunately, I saw the blog post before the mailing list post, so
>>>> here's a reply in blog format.
>>>>
>>>> Grant Likely published an article about ACPI and ARM at
>>>>
>>>> http://www.secretlab.ca/archives/151
>>>>
>>>> . He acknowledges systems with ACPI are harder to debug, but because
>>>> Microsoft says so, we have to use ACPI (basically).
>>>
>>> Please reread the blog post. Microsoft is a factor, but it is not the
>>> primary driver by any means.
>>
>> Ok, so what is the primary reason? As far as I could tell it is
>> "Microsoft wants ACPI" and "hardware people want Microsoft" and
>> "fragmentation is bad so we do ACPI" (1) (and maybe "someone at RedHat
>> says they want ACPI" -- but RedHat people should really speak for
>> themselves.)
>
> I have to say I found this statement fascinating.
>
> I have been seconded to Linaro from Red Hat for over two years now,
> working on getting ACPI running, first as a prototype on an ARMv7 box,
> then on ARMv8. I have been working with Grant since very early on when
> some of us first started talking about ARM servers in the enterprise
> market, and what sorts of standards, if any, would be needed to build an
> ecosystem.
>
> This is the first time in at least two years that I have had someone
> ask for Red Hat to speak up about ACPI on ARM servers; it's usually
> quite the opposite, as in "will you Red Hat folks please shut up about
> this already?" :).
>
> For all the reasons Grant has already mentioned, my Customers need to
> have ACPI on ARM servers for them to be successful in their business.
> I view my job as providing what my Customers need to be successful.
> So, here I am. I want ACPI on ARMv8 for my Customers.
I want that too, even for platforms that might not ever run Windows.
-- ljk
>
>> You snipped quite a lot of reasons why ACPI is inferior that were
>> below this line in email.
>>
>> Pavel
>>
>> (1) ignoring fact that it causes fragmentation between servers and phones.
>>
>
> I see this very differently. This is a "fact" only when viewed from
> the perspective of having two different technologies that can do very
> similar things.
>
> In my opinion, the issue is that these are two very, very different
> markets; technologies are only relevant as the tools to be used to be
> successful in those markets.
>
> Just on a surface level, phones are expected to be completely replaced
> every 18 months or less -- new hardware, new version of the OS, new
> everything. That's the driving force in the market.
>
> A server does not change that quickly; it is probable that the hardware
> will change, but it is unlikely to change at that speed. It can take
> 18 months just for some of the certification testing needed for new
> hardware or software. Further, everything from the kernel on up is
> expected to be stable for a long time -- as long as 25 years, in some
> cases I have worked on. "New" can be a bad word in this environment.
>
> Best I can tell, I need different tool sets to do well in each of these
> environments -- one that allows me to move quickly for phones, and one
> that allows me to carefully control change for servers. I personally
> don't see that as fragmentation, but as using the right tool for the
> job. If I'm building a phone, I want the speed and flexibility of DT.
> If I'm building a server, I want the long term stability of ACPI.
>
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [RFC] ACPI on arm64 TODO List
2015-01-12 14:23 ` Pavel Machek
@ 2015-01-13 17:02 ` Grant Likely
-1 siblings, 0 replies; 76+ messages in thread
From: Grant Likely @ 2015-01-13 17:02 UTC (permalink / raw)
To: Pavel Machek
Cc: Arnd Bergmann, Al Stone, linaro-acpi@lists.linaro.org,
Catalin Marinas, Rafael J. Wysocki, ACPI Devel Mailing List,
Olof Johansson, linux-arm-kernel@lists.infradead.org
Hi Pavel,
For the sake of argument, I'll respond to your points below, even
though we fundamentally disagree on what is required for a general
purpose server...
On Mon, Jan 12, 2015 at 2:23 PM, Pavel Machek <pavel@ucw.cz> wrote:
> On Sat 2015-01-10 14:44:02, Grant Likely wrote:
> Grant Likely published an article about ACPI and ARM at
>
> http://www.secretlab.ca/archives/151
>
> . He acknowledges systems with ACPI are harder to debug, but because
> Microsoft says so, we have to use ACPI (basically).
>
> I believe making the wrong technical choice "because Microsoft says so" is
> a wrong thing to do.
>
> Yes, ACPI gives more flexibility to hardware vendors. Imagine
> replacing block devices with interpreted bytecode coming from
> ROM. That is obviously bad, right? Why is it good for power
> management?
>
> It is not.
>
Trying to equate a block driver with the things that are done by ACPI
is a pretty big stretch. It doesn't even come close to the same level
of complexity. ACPI is merely the glue between the OS and the
behaviour of the HW & FW. (e.g. On platform A, the rfkill switch is a
GPIO, but on platform B it is an i2c transaction to a
microcontroller). There are lots of these little details on modern
hardware, and most of them are pretty trivial aside from the fact that
support requires either a kernel change (for every supported OS) or
something like ACPI.
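
To make the rfkill example concrete, here is a minimal Python sketch of the idea. It is a toy model, not how the kernel or real AML works: the function names, the GPIO number and the i2c address are all hypothetical. The point is only that the generic OS code looks up a platform-provided method and never needs to know which mechanism sits behind it.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for what each platform's firmware would provide.
def gpio_rfkill(blocked: bool) -> str:
    # Platform A: the rfkill switch is a single GPIO line (number made up).
    return f"gpio 17 <- {int(blocked)}"

def i2c_rfkill(blocked: bool) -> str:
    # Platform B: the switch is a register on a microcontroller behind i2c
    # (address and register made up).
    return f"i2c write addr=0x48 reg=0x02 val={int(blocked)}"

# Each "platform" is just a table of named methods, loosely analogous to
# control methods supplied by firmware.
PLATFORM_A: Dict[str, Callable[[bool], str]] = {"set_rfkill": gpio_rfkill}
PLATFORM_B: Dict[str, Callable[[bool], str]] = {"set_rfkill": i2c_rfkill}

def os_set_rfkill(platform: Dict[str, Callable[[bool], str]], blocked: bool) -> str:
    """Generic OS code: no per-platform knowledge, only a method lookup."""
    return platform["set_rfkill"](blocked)
```

Supporting a third platform in this model means supplying another method table, with no change to os_set_rfkill().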
You just need to look at the x86 market to see that huge drivers in
ACPI generally don't happen. In fact, aside from the high-profile
bad examples (we all like to watch a good scandal) the x86 market
works really well.
> Besides being harder to debug, there are more disadvantages:
>
> * Size, speed and complexity disadvantage of bytecode interpreter in
> the kernel.
The bytecode interpreter has been in the kernel for years. It works,
it isn't on the hot path for computing power, and it meets the
requirement for abstraction between the kernel and the platform.
> * Many more drivers. Imagine GPIO switch, controlling rfkill (for
> example). In device tree case, that's few lines in the .dts specifying
> which GPIO that switch is on.
>
> In ACPI case, each hardware vendor initially implements rfkill switch
> in AML, differently. After few years, each vendor implements
> (different) kernel<->AML interface for querying rfkill state and
> toggling it in software. Few years after that, we implement kernel
> drivers for those AML interfaces, to properly integrate them in the
> kernel.
I don't know what you're trying to argue here. If an AML interface
works, we never bother with writing direct kernel support for it. Most
of the time it works. When it doesn't, we write a quirk in the driver
and move on.
> * Incompatibility. ARM servers will now be very different from other
> ARM systems.
If anything, ARM servers are an extension of the existing x86 server
market, not any of the existing ARM markets. They don't look like
mobile, and they don't look like embedded. The fact that the OS
vendors and the HW vendors are independent companies completely
changes how this hardware gets supported. The x86 market has figured
this out and that is a big reason why it is able to scale to the size
it has. ACPI is a direct result of what that kind of market needs.
> Now, are there some arguments for ACPI? Yes -- it allows hw vendors to
> hack half-working drivers without touching kernel
> sources. (Half-working: such drivers are not properly integrated in
> all the various subsystems).
That's an unsubstantiated claim. ACPI defines a model for the fiddly
bits around the edges to be implemented by the platform, and performed
under /kernel/ control, without requiring the kernel to explicitly
have code for each and every possibility. It is a very good
abstraction from that point of view.
> Grant claims that power management is
> somehow special, and requirement for real drivers is somehow ok for
> normal drivers (block, video), but not for power management. Now,
> getting driver merged into the kernel does not take that long -- less
> than half a year if you know what you are doing. Plus, for power
> management, you can really just initialize hardware in the bootloader
> (into working but not optimal state). But basic drivers are likely to
> be merged fast, and then you'll just have to supply DT tables.
The reality doesn't play out the way you're describing. We
have two major mass markets in the Linux world: mobile and general
purpose computing. For mobile, we have yet to have a device on the
market that is well supported at launch by mainline. Vendors ship
their own kernel because they only support a single OS, and we haven't
figured out how to support their hardware at launch time (there is
plenty of blame to go around on this issue; regardless of who is to
blame, it is still a problem). The current mobile model doesn't even
remotely address the needs of server vendors and traditional Linux
distributions.
The general purpose market (or in this case, the server subset) is the only
place where there is a large ecosystem of independent hardware and OS
vendors. For all the complaints about technical problems, the existing
x86 architecture using ACPI works well and it has spawned an awful lot
of very interesting hardware.
...
Personally, I think ACPI is the wrong thing to be getting upset over.
Even supposing ACPI were rejected, that would do nothing about
firmware that is completely out of visibility of the kernel. Both ARM
and x86 CPUs have secure modes that are the domain of firmware, not
the kernel, and have basically free rein on the machine.
ACPI on the other hand can be inspected. We know when the interpreter
is running, because the kernel controls it. We can extract and
decompile the ACPI tables. It isn't quite the hidden black box that
the rhetoric against ACPI claims it to be.
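
As a small illustration of that inspectability, here is a Python sketch that reads the standard 36-byte header found at the start of an ACPI system description table and checks the table checksum (all bytes must sum to zero modulo 256). The field layout follows the ACPI specification; the helper names and the synthetic test table are assumptions made for the example.

```python
import struct
from typing import NamedTuple

class AcpiHeader(NamedTuple):
    signature: str
    length: int
    revision: int
    checksum: int
    oem_id: str
    oem_table_id: str
    oem_revision: int
    creator_id: str
    creator_revision: int

# Little-endian, no padding: 4+4+1+1+6+8+4+4+4 = 36 bytes.
_HEADER = struct.Struct("<4sIBB6s8sI4sI")

def _text(raw: bytes) -> str:
    # Fixed-width ASCII fields may be padded with NULs or spaces.
    return raw.decode("ascii", "replace").rstrip("\x00 ")

def parse_header(table: bytes) -> AcpiHeader:
    """Decode the common header at the start of a raw ACPI table."""
    if len(table) < _HEADER.size:
        raise ValueError("table shorter than the 36-byte header")
    sig, length, rev, csum, oem, oem_tbl, oem_rev, creator, creator_rev = (
        _HEADER.unpack_from(table)
    )
    return AcpiHeader(_text(sig), length, rev, csum, _text(oem),
                      _text(oem_tbl), oem_rev, _text(creator), creator_rev)

def checksum_ok(table: bytes) -> bool:
    """A valid table's bytes sum to zero modulo 256."""
    return sum(table) % 256 == 0

def make_table(signature: bytes, payload: bytes) -> bytes:
    """Build a synthetic table for testing (all field values are made up)."""
    length = _HEADER.size + len(payload)
    raw = bytearray(
        _HEADER.pack(signature, length, 2, 0, b"EXAMPL", b"TOYTABLE", 1, b"TEST", 1)
        + payload
    )
    raw[9] = (-sum(raw)) % 256  # checksum byte lives at offset 9
    return bytes(raw)
```

On a Linux system booted with ACPI, the raw tables are normally readable by root under /sys/firmware/acpi/tables/, so the output of open(path, "rb").read() can be passed to parse_header(). Full disassembly of the AML is a job for the ACPICA tools (iasl -d), not for a sketch like this.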
If you really want to challenge the industry, then push for vendors to
open source and upstream their firmware. Heck, all the necessary
building blocks are already open sourced in the Tianocore and ARM
Trusted Firmware projects. This would actually improve the visibility
and auditing of the platform behaviour. The really invisible stuff is
there, not in ACPI, and believe me, if ACPI (or similar) isn't
available then the vendors will be stuffing even more into firmware
than they are now.
g.
^ permalink raw reply [flat|nested] 76+ messages in thread