public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS
       [not found] ` <deade3c973eb9c98d2a6cfed658bbb49e693894a.1213947350.git.len.brown@intel.com>
@ 2008-07-01  8:33   ` Thomas Renninger
  2008-07-07 11:12     ` Pavel Machek
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Renninger @ 2008-07-01  8:33 UTC (permalink / raw)
  To: Len Brown
  Cc: linux-acpi, Bob Moore, Lin Ming, Len Brown, Bjorn Helgaas,
	Huang Cheng, firmwarekit-discuss, Arjan van de Ven,
	Linux Kernel Mailing List

On Friday 20 June 2008 09:43:25 Len Brown wrote:
> From: Bob Moore <robert.moore@intel.com>
>
> Some BIOSs erroneously reverse the _PRT SourceName and the
> SourceIndex.  Detect and repair this problem. MS ACPI also allows
> and repairs this problem, thus ACPICA must also.

It would be great to have an interface to report this as a BIOS defect.

Something like:

FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT source_name", ACPI_ 
Bug);

FIRMWARE_BUG_ON(severity, description, component);

If say CONFIG_FIRMWARE_DEBUG is set this could send a netlink event to 
userspace and e.g. linuxfirmwarekit can grab it.
linuxfirmwarekit currently implements it's own scripts and even worse, parses 
dmesg to find BIOS bugs, which is not really maintenable.

IMO the kernel developers themselves need an interface where they can report 
bugs while seeing and fixing them.

Huang Cheng posted a nice patch on the linuxfirmwarekit list a while ago 
(unfortunately the patch was attached not inlined):
http://www.bughost.org/pipermail/firmwarekit-discuss/2008-March/000166.html

I mean it needs some cleanup (e.g. the file name was the bug number, there 
need to be a general interface created, e.g. as  shown above, etc.).
But this is IMO a good idea and might be worth adding to the kernel (when 
written :) ).
Comments?

        Thomas

PS: I added Bjorn as in the PNP parts BIOSes are pretty much messed up.
One reason is that AFAIK ACPI parts are glued together. Having dozens/hundreds 
of device declarations in a BIOS of which huge parts have not been written by 
yourself, it is very hard for vendors to check for correctness.
While Bjorn and others are adding quirks and workarounds for specific devices, 
machines or general BIOS bugs it would be great to be able to point vendors 
to these bugs by a simple tool.

> Signed-off-by: Bob Moore <robert.moore@intel.com>
> Signed-off-by: Lin Ming <ming.m.lin@intel.com>
> Signed-off-by: Len Brown <len.brown@intel.com>
> ---
>  drivers/acpi/resources/rscreate.c |   17 +++++++++++++++++
>  1 files changed, 17 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/acpi/resources/rscreate.c
> b/drivers/acpi/resources/rscreate.c index faddaee..70c84ec 100644
> --- a/drivers/acpi/resources/rscreate.c
> +++ b/drivers/acpi/resources/rscreate.c
> @@ -285,6 +285,23 @@ acpi_rs_create_pci_routing_table(union
> acpi_operand_object *package_object, }
>
>  		/*
> +		 * If the BIOS has erroneously reversed the _PRT source_name (index 2)
> +		 * and the source_index (index 3), fix it. _PRT is important enough to
> +		 * workaround this BIOS error. This also provides compatibility with
> +		 * other ACPI implementations.
> +		 */
> +		obj_desc = sub_object_list[3];
> +		if (!obj_desc
> +		    || (ACPI_GET_OBJECT_TYPE(obj_desc) != ACPI_TYPE_INTEGER)) {
> +			sub_object_list[3] = sub_object_list[2];
> +			sub_object_list[2] = obj_desc;
> +
> +			ACPI_WARNING((AE_INFO,
> +				      "(PRT[%X].Source) SourceName and SourceIndex are reversed,
> fixed", +				      index));
> +		}
> +
> +		/*
>  		 * 3) Third subobject: Dereference the PRT.source_name
>  		 * The name may be unresolved (slack mode), so allow a null object
>  		 */



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS
  2008-07-01  8:33   ` Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS Thomas Renninger
@ 2008-07-07 11:12     ` Pavel Machek
  2008-07-08 15:40       ` Arjan van de Ven
  0 siblings, 1 reply; 7+ messages in thread
From: Pavel Machek @ 2008-07-07 11:12 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: Len Brown, linux-acpi, Bob Moore, Lin Ming, Len Brown,
	Bjorn Helgaas, Huang Cheng, firmwarekit-discuss, Arjan van de Ven,
	Linux Kernel Mailing List

Hi!

> > Some BIOSs erroneously reverse the _PRT SourceName and the
> > SourceIndex.  Detect and repair this problem. MS ACPI also allows
> > and repairs this problem, thus ACPICA must also.
> 
> It would be great to have an interface to report this as a BIOS defect.
> 
> Something like:
> 
> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT source_name", ACPI_ 
> Bug);
> 
> FIRMWARE_BUG_ON(severity, description, component);

Yes, please.

I'd also like HARDWARE_BUG_ON(), with similar usage.

With all the preload-linux-on-foo project, we have some chance to make
BIOS vendors fix their stuff if we can easily diagnose errors in
there.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS
  2008-07-07 11:12     ` Pavel Machek
@ 2008-07-08 15:40       ` Arjan van de Ven
  2008-07-08 19:34         ` Brown, Len
  0 siblings, 1 reply; 7+ messages in thread
From: Arjan van de Ven @ 2008-07-08 15:40 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Thomas Renninger, Len Brown, linux-acpi, Bob Moore, Lin Ming,
	Len Brown, Bjorn Helgaas, Huang Cheng, firmwarekit-discuss,
	Linux Kernel Mailing List

Pavel Machek wrote:
> Hi!
> 
>>> Some BIOSs erroneously reverse the _PRT SourceName and the
>>> SourceIndex.  Detect and repair this problem. MS ACPI also allows
>>> and repairs this problem, thus ACPICA must also.
>> It would be great to have an interface to report this as a BIOS defect.
>>
>> Something like:
>>
>> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT source_name", ACPI_ 
>> Bug);
>>
>> FIRMWARE_BUG_ON(severity, description, component);
> 
> Yes, please.
> 
> I'd also like HARDWARE_BUG_ON(), with similar usage.
> 
> With all the preload-linux-on-foo project, we have some chance to make
> BIOS vendors fix their stuff if we can easily diagnose errors in
> there.

also we can tie this to kerneloops.org to make a list of top known firmware/hardware bugs, seperate from the kernel code bugs
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS
  2008-07-08 15:40       ` Arjan van de Ven
@ 2008-07-08 19:34         ` Brown, Len
  2008-07-08 22:11           ` Pavel Machek
  2008-07-09  8:51           ` Thomas Renninger
  0 siblings, 2 replies; 7+ messages in thread
From: Brown, Len @ 2008-07-08 19:34 UTC (permalink / raw)
  To: Arjan van de Ven, Pavel Machek
  Cc: Thomas Renninger, Len Brown, linux-acpi, Moore, Robert,
	Lin, Ming M, Bjorn Helgaas, Huang Cheng, firmwarekit-discuss,
	Linux Kernel Mailing List

 

>>>> Some BIOSs erroneously reverse the _PRT SourceName and the
>>>> SourceIndex.  Detect and repair this problem. MS ACPI also allows
>>>> and repairs this problem, thus ACPICA must also.
>>> It would be great to have an interface to report this as a 
>BIOS defect.
>>>
>>> Something like:
>>>
>>> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT 
>source_name", ACPI_ 
>>> Bug);
>>>
>>> FIRMWARE_BUG_ON(severity, description, component);
>> 
>> Yes, please.

I'm not excited about maintaining
maintaining linux-as-a-firmware-diagnostic --
particularly when...

1. it clutters the code for normail machines
2. finding the bug is pointless, because even
   if you fix one machine, you are guaranteed to
   not fix all machines and thus must maintain
   the workaround anyway.

>> I'd also like HARDWARE_BUG_ON(), with similar usage.
>> 
>> With all the preload-linux-on-foo project, we have some 
>> chance to make
>> BIOS vendors fix their stuff if we can easily diagnose errors in
>> there.

These customers should be running 
http://linuxfirmwarekit.org/

We do maintain some degree of "high-road ACPI spec checking"
with the "acpi=strict" boot option.  If we do more of this,
I think it should stay under that option.

thanks,
-Len

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS
  2008-07-08 19:34         ` Brown, Len
@ 2008-07-08 22:11           ` Pavel Machek
  2008-07-09  8:51           ` Thomas Renninger
  1 sibling, 0 replies; 7+ messages in thread
From: Pavel Machek @ 2008-07-08 22:11 UTC (permalink / raw)
  To: Brown, Len
  Cc: Arjan van de Ven, Thomas Renninger, Len Brown, linux-acpi,
	Moore, Robert, Lin, Ming M, Bjorn Helgaas, Huang Cheng,
	firmwarekit-discuss, Linux Kernel Mailing List

Hi!

> >>> FIRMWARE_BUG_ON(severity, description, component);
> >> 
> >> Yes, please.
> 
> I'm not excited about maintaining
> maintaining linux-as-a-firmware-diagnostic --
> particularly when...
> 
> 1. it clutters the code for normail machines
> 2. finding the bug is pointless, because even
>    if you fix one machine, you are guaranteed to
>    not fix all machines and thus must maintain
>    the workaround anyway.

Well, at least we can make sure new machines are okay.

Plus, it is nice to know how common hw/BIOS problems are.

> >> I'd also like HARDWARE_BUG_ON(), with similar usage.
> >> 
> >> With all the preload-linux-on-foo project, we have some 
> >> chance to make
> >> BIOS vendors fix their stuff if we can easily diagnose errors in
> >> there.
> 
> These customers should be running 
> http://linuxfirmwarekit.org/
> 
> We do maintain some degree of "high-road ACPI spec checking"
> with the "acpi=strict" boot option.  If we do more of this,
> I think it should stay under that option.

That's okay with me, but it would be nice to have printk() markup, so
that we can tell BIOS/hw bugs from normal kernel messages.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS
  2008-07-08 19:34         ` Brown, Len
  2008-07-08 22:11           ` Pavel Machek
@ 2008-07-09  8:51           ` Thomas Renninger
  2008-07-17 20:31             ` Brown, Len
  1 sibling, 1 reply; 7+ messages in thread
From: Thomas Renninger @ 2008-07-09  8:51 UTC (permalink / raw)
  To: Brown, Len
  Cc: Arjan van de Ven, Pavel Machek, Len Brown, linux-acpi,
	Moore, Robert, Lin, Ming M, Bjorn Helgaas, Huang Cheng,
	firmwarekit-discuss, Linux Kernel Mailing List

On Tuesday 08 July 2008 21:34:55 Brown, Len wrote:
> >>>> Some BIOSs erroneously reverse the _PRT SourceName and the
> >>>> SourceIndex.  Detect and repair this problem. MS ACPI also allows
> >>>> and repairs this problem, thus ACPICA must also.
> >>>
> >>> It would be great to have an interface to report this as a
> >
> >BIOS defect.
> >
> >>> Something like:
> >>>
> >>> FIRMWARE_BUG_ON(FIRM_WARN, "erroneously reversed the _PRT
> >
> >source_name", ACPI_
> >
> >>> Bug);
> >>>
> >>> FIRMWARE_BUG_ON(severity, description, component);
> >>
> >> Yes, please.
>
> I'm not excited about maintaining
> maintaining linux-as-a-firmware-diagnostic --
> particularly when...
Above is not for ACPI only. But ACPI is probably a candidate which should make 
use of it.
>
> 1. it clutters the code for normail machines
It is the only disadvantage, "cluttering" the code.
On the other hand, this is an advantage.
While comments like "/* this must not happen, out of spec, etc. */"
should normally be added,
kernel developers tend to return -ENODEV and that's it.
A FIRMWARE_BUG("Description") would be compiled out on a
production kernel, compiled in for a kernel on which linuxfirmwarekit
boots or other debug kernels and would nicely document,
not clutter the code.

> 2. finding the bug is pointless,
No.
> because even 
>    if you fix one machine, you are guaranteed to
>    not fix all machines and thus must maintain
>    the workaround anyway.
I expect you want HP to fix their critical shut down temperature of 256 C?
If we would have known this, we had made them fix this and a lot other things 
also.
This is one of dozens of examples...

> >> I'd also like HARDWARE_BUG_ON(), with similar usage.
> >>
> >> With all the preload-linux-on-foo project, we have some
> >> chance to make
> >> BIOS vendors fix their stuff if we can easily diagnose errors in
> >> there.
>
> These customers should be running
> http://linuxfirmwarekit.org/
It is not worth much.
At least currently and a firmware bug interface to the kernel is about to 
change this.

1) linuxfirmwarekit is parsing dmesg..., these messages are changing
    all the time in the kernel.
    S390 AFAIK currently indexes all printks to avoid a simple space cleanup
    to mix up messges. It is not needed to go that far...
2) The rest is done through userspace plugins written in shell/perl or C.
     But most firmware bugs are absorbed by the kernel already. So there
     is nothing we could tell OEMs they should correct.
3) 2+3 is unmaintainable. There is enough work in the kernel, now we should
    write a linuxfirmwarekit plugin for every firmwarebug we find?!?
4) I tried to convince our kernel people to write linuxfirmwarekit plugins
     with exactly zero response. Hmm wrong, Frank wrote a thermal zone
    checker, some hundred lines of fragile shell code, it would have been less
    than ten lines in the kernel, just add some FIRMWARE_BUG(..) in the right
    if/else conditions. And it's not worth much anymore as soon as
    thermal_zone is moving to sysfs.

> We do maintain some degree of "high-road ACPI spec checking"
> with the "acpi=strict" boot option.  If we do more of this,
> I think it should stay under that option.
In the linuxfirmwarekit the ACPI tables are dumped, disassembled and 
recompiled. The errors and warnings are printed out as ACPI BIOS errors.
This works out.
The iasl compiler still is a bit too noisy and cannot disassemble SSDTs 
correctly (maybe it got fixed, don't know), but this can work out.

The problem is that you only find syntactical bugs.
But there are *a lot* bugs like e.g. critical temperature of 256 C. Or If this 
object exists, that object must exist. Logic ACPI BIOS bugs can only be found 
in the kernel itself at runtime.
Ahh Bjorn is in CC..., Wrong opregion declarations and PNP resource 
declarations are rather common. I mean theoretically you can find this in 
userspace. Not with disassembling and recompiling but with interpreting the 
tables in userspace... Again unmaintainable and who should do the checking?
The guy who writes the kernel code...

I really like the idea of the linuxfirmwarekit and tried to push it, but I do 
not see any future for it without kernel support.

       Thomas

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS
  2008-07-09  8:51           ` Thomas Renninger
@ 2008-07-17 20:31             ` Brown, Len
  0 siblings, 0 replies; 7+ messages in thread
From: Brown, Len @ 2008-07-17 20:31 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: Arjan van de Ven, Pavel Machek, Len Brown, linux-acpi,
	Moore, Robert, Lin, Ming M, Bjorn Helgaas, Huang Cheng,
	firmwarekit-discuss, Linux Kernel Mailing List


>> I'm not excited about maintaining
>> maintaining linux-as-a-firmware-diagnostic --
>> particularly when...

>Above is not for ACPI only. But ACPI is probably a candidate 
>which should make use of it.

Sure, there may be justification for doing something like this
in the kernel, but the issue that started this thread isn't it.

The issue that started this thread can be diagnosed by user-space
in linuxfirmwarekit.  Multiple machines have this bug, which means
that it is "common industry practice" and the kernel has to work
around it (silently) if we like it or not.
ie. the issue, now that it is debugged and worked around,
is effectly just firmware lint.

thanks,
-Len

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2008-07-17 20:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <1213947852-10924-1-git-send-email-lenb@kernel.org>
     [not found] ` <deade3c973eb9c98d2a6cfed658bbb49e693894a.1213947350.git.len.brown@intel.com>
2008-07-01  8:33   ` Check for BIOS bugs - Original Subject: Re: [PATCH 23/70] ACPICA: Workaround for reversed _PRT entries from BIOS Thomas Renninger
2008-07-07 11:12     ` Pavel Machek
2008-07-08 15:40       ` Arjan van de Ven
2008-07-08 19:34         ` Brown, Len
2008-07-08 22:11           ` Pavel Machek
2008-07-09  8:51           ` Thomas Renninger
2008-07-17 20:31             ` Brown, Len

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox