public inbox for linux-kernel@vger.kernel.org
* Some problems with HP DL380 G8 BIOS and SLES11 SP3
@ 2014-08-13 15:22 Ulrich Windl
  2014-08-14 17:46 ` Don Zickus
  0 siblings, 1 reply; 6+ messages in thread
From: Ulrich Windl @ 2014-08-13 15:22 UTC (permalink / raw)
  To: linux-kernel

Hello!

Running the current SLES11 SP3 kernel on an HP DL380 G8 server, I see some kernel messages that indicate a bug either in the kernel or in the HP BIOS. Maybe someone can explain them, so I can try to get the problem fixed by whichever party broke it...

Linux kernel is "3.0.101-0.35-default (geeko@buildhost) (gcc version 4.3.4 [gcc-4_3-branch revision 152973]" (latest).
HP server is "HP ProLiant DL380p Gen8, BIOS P70 02/10/2014" (latest)

During ACPI init I see:
[...]
Reserving 128MB of memory at 752MB for crashkernel (System RAM: 132095MB)
ACPI: RSDP 00000000000f4f00 00024 (v02 HP    )
ACPI: XSDT 00000000bddaed00 000D4 (v01 HP     ProLiant 00000002   322? 0000162E)
ACPI: FACP 00000000bddaee40 000F4 (v03 HP     ProLiant 00000002   322? 0000162E)
ACPI Warning: Invalid length for Pm1aControlBlock: 32, using default 16 (20110413/tbfadt-611)
ACPI Warning: Invalid length for Pm2ControlBlock: 32, using default 8 (20110413/tbfadt-611)
ACPI: DSDT 00000000bddaef40 026DC (v01 HP         DSDT 00000001 INTL 20030228)
ACPI: FACS 00000000bddac140 00040
ACPI: SPCR 00000000bddac180 00050 (v01 HP     SPCRRBSU 00000001   322? 0000162E)
ACPI: MCFG 00000000bddac200 0003C (v01 HP     ProLiant 00000001      00000000)
[...]

HPET id 0 under DRHD base 0xf4ffe000
BIOS requests to not use x2apic
Use 'intremap=no_x2apic_optout' to override BIOS request
Enabled IRQ remapping in xapic mode
x2apic not enabled, IRQ remapping is in xapic mode
Switched APIC routing to physical flat.
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz stepping 04
Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events, Broken BIOS detected, complain to your hardware vendor.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Intel PMU driver.
... version:                3
... bit width:              48
... generic registers:      4
... value mask:             0000ffffffffffff
... max period:             000000007fffffff
... fixed-purpose events:   3
... event mask:             000000070000000f
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node   0, Processors  #1
[...]
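
The "value mask" and "event mask" lines in the PMU summary above can be reproduced from the other reported numbers. The following is a minimal sketch, assuming that the event mask places the generic counters in the low bits and the fixed-purpose counters starting at bit 32; the function names and the `fixed_base` parameter are illustrative and not taken from the kernel source.

```python
# Sketch: rebuild the masks printed by the Intel PMU driver from the
# counter counts and bit width shown in the log above.
# Assumption: fixed-purpose counters occupy bits starting at bit 32.

def value_mask(bit_width: int) -> int:
    """All-ones mask covering a counter of the given bit width."""
    return (1 << bit_width) - 1


def event_mask(generic: int, fixed: int, fixed_base: int = 32) -> int:
    """One bit per generic counter (low bits) plus one bit per fixed
    counter (starting at fixed_base)."""
    generic_bits = (1 << generic) - 1
    fixed_bits = ((1 << fixed) - 1) << fixed_base
    return generic_bits | fixed_bits


if __name__ == "__main__":
    # Values from the log: 48-bit counters, 4 generic, 3 fixed-purpose.
    print(f"value mask: {value_mask(48):016x}")
    print(f"event mask: {event_mask(4, 3):016x}")
```

With the values from the log (bit width 48, 4 generic registers, 3 fixed-purpose events), both results match the masks the driver printed.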

 pci0000:00: Requesting ACPI _OSC control (0x1d)
 pci0000:00: ACPI _OSC request failed (AE_SUPPORT), returned control mask: 0x00
ACPI _OSC control for PCIe not granted, disabling ASPM
[...]

 pci0000:20: Requesting ACPI _OSC control (0x1d)
 pci0000:20: ACPI _OSC request failed (AE_SUPPORT), returned control mask: 0x00
ACPI _OSC control for PCIe not granted, disabling ASPM
[...]
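
The control mask 0x1d in the _OSC request lines above is a bit field. The following is a minimal sketch that decodes it, assuming the control-field bit assignments of the PCI Firmware Specification's _OSC definition for PCI host bridges; the descriptive labels are illustrative and are not the kernel's constant names.

```python
# Sketch: decode the _OSC control mask the kernel requested (0x1d) and the
# mask the firmware returned (0x00).
# Assumption: bit meanings follow the PCI host bridge _OSC control field.

OSC_CONTROL_BITS = {
    0x01: "PCIe native hot plug",
    0x02: "SHPC native hot plug",
    0x04: "PCIe native power management events (PME)",
    0x08: "PCIe advanced error reporting (AER)",
    0x10: "PCIe capability structure",
}


def decode_osc_control(mask: int) -> list[str]:
    """Return the names of all control bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(OSC_CONTROL_BITS.items()) if mask & bit]


if __name__ == "__main__":
    print("requested:", decode_osc_control(0x1D))
    print("granted:  ", decode_osc_control(0x00))
```

Under these assumptions the kernel asked for everything except SHPC hot plug and the firmware granted nothing, which is consistent with the "not granted, disabling ASPM" message that follows each request.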

Regards,
Ulrich
P.S. Please CC: me, as I'm not on LKML...




* Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
  2014-08-13 15:22 Some problems with HP DL380 G8 BIOS and SLES11 SP3 Ulrich Windl
@ 2014-08-14 17:46 ` Don Zickus
  2014-08-18  6:12   ` Antw: " Ulrich Windl
  0 siblings, 1 reply; 6+ messages in thread
From: Don Zickus @ 2014-08-14 17:46 UTC (permalink / raw)
  To: Ulrich Windl; +Cc: linux-kernel

On Wed, Aug 13, 2014 at 05:22:17PM +0200, Ulrich Windl wrote:
> Hello!
> 
> Running the current SLES11 SP3 kernel on a HP DL380 G8 server, there are some kernel messages that indicate a bug either in the kernel or in the HP BIOS. Maybe someone can explain, so I can try to get it fixed whatever party broke it...
> 
> Linux kernel is "3.0.101-0.35-default (geeko@buildhost) (gcc version 4.3.4 [gcc-4_3-branch revision 152973]" (latest).
> HP server is "HP ProLiant DL380p Gen8, BIOS P70 02/10/2014" (latest)

Yes, the [Firmware Bug] PMU message appears because you are letting the
firmware dynamically control your cpu frequency.  In order to accomplish
that, the firmware needs to use a perf counter or two, hence the conflict.
Set the firmware setting to OS control and the problem goes away.  Contact
HP for those instructions; they are very aware of this problem and
recommend OS control for all high-end servers.

Cheers,
Don


* Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
  2014-08-14 17:46 ` Don Zickus
@ 2014-08-18  6:12   ` Ulrich Windl
  2014-08-18 12:44     ` Don Zickus
  0 siblings, 1 reply; 6+ messages in thread
From: Ulrich Windl @ 2014-08-18  6:12 UTC (permalink / raw)
  To: Don Zickus; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3912 bytes --]

>>> Don Zickus <dzickus@redhat.com> wrote on 14.08.2014 at 19:46 in message
<20140814174658.GV49576@redhat.com>:
> Yes, it is because you are letting the firmware dynamically control your
> cpu frequency.  In order to accomplish they need to use a perf counter or
> two, hence the conflict.  Set the firmware setting to OS control and the
> problem goes away.  Contact HP for those instructions, they are very aware
> of this problem and recommend OS control to all high end servers.

Hi!

Thanks for answering, but the BIOS has set power management to "OS control" (see attachment). So I guess it must be something different.

Regards,
Ulrich

[-- Attachment #2: DL380G8-Power.JPG --]
[-- Type: image/jpeg, Size: 73517 bytes --]


* Re: Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
  2014-08-18  6:12   ` Antw: " Ulrich Windl
@ 2014-08-18 12:44     ` Don Zickus
  2014-08-18 13:48       ` Ulrich Windl
  0 siblings, 1 reply; 6+ messages in thread
From: Don Zickus @ 2014-08-18 12:44 UTC (permalink / raw)
  To: Ulrich Windl; +Cc: linux-kernel

On Mon, Aug 18, 2014 at 08:12:44AM +0200, Ulrich Windl wrote:
> Hi!
> 
> Thanks for answering, but the BIOS has set power management to "OS control" (see attachment). So I guess it must be something different.

Hmm, sounds like it.  Regardless, the error message indicates the counters
are in use, most likely by the BIOS.  So you can ask HP what is going on.

I assume this is a normal bootup and not a kdump crash kernel, correct?

Cheers,
Don


* Re: Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
  2014-08-18 12:44     ` Don Zickus
@ 2014-08-18 13:48       ` Ulrich Windl
  2014-08-18 15:02         ` Don Zickus
  0 siblings, 1 reply; 6+ messages in thread
From: Ulrich Windl @ 2014-08-18 13:48 UTC (permalink / raw)
  To: Don Zickus; +Cc: linux-kernel

>>> Don Zickus <dzickus@redhat.com> wrote on 18.08.2014 at 14:44 in message
<20140818124404.GL49576@redhat.com>:
> Hmm, sounds like it.  Regardless, the error message indicates the counters
> are in use most likely by the BIOS.  So you can ask HP what is going on.
> 
> I assume this is a normal bootup and not a kdump crash kernel, correct?

Yes, it's a normal boot. I'm afraid the standard hardware support at HP does not care much about such issues. I remember those Xeon bugs that caused memory errors during longer idle phases (in the G7 server) and that are fixed by recent microcode updates: HP changed memory modules, and they changed the board, but it took very long until they updated the BIOS.

Is there any more information I can provide to narrow down the problem?

Regards,
Ulrich


* Re: Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
  2014-08-18 13:48       ` Ulrich Windl
@ 2014-08-18 15:02         ` Don Zickus
  0 siblings, 0 replies; 6+ messages in thread
From: Don Zickus @ 2014-08-18 15:02 UTC (permalink / raw)
  To: Ulrich Windl; +Cc: linux-kernel

On Mon, Aug 18, 2014 at 03:48:00PM +0200, Ulrich Windl wrote:
> > I assume this is a normal bootup and not a kdump crash kernel, correct?
> 
> Yes, it's a normal boot. I'm afraid the standard hardware support at HP does not care much about such issues. I remember those Xeon bugs that caused memory errors during longer idle phases (in the G7 server) and that are fixed by recent microcode updates: HP changed memory modules, and they changed the board, but it took very long until they updated the BIOS.
> 
> Is there any more information I can provide to narrow down the problem?

Not really.. see below..

<snip>

> >> >> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> >> >> CPU0: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz stepping 04
> >> >> Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events, Broken BIOS detected, complain to your hardware vendor.
> >> >> [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)

What happens here is that we walk the PMU counters to see if any of them
is already enabled.  And sure enough, the fixed-counter control register
(MSR 38d) has counters 1 and 2 enabled (value 330) before the kernel even
touches them.
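
The reading "MSR 38d is 330" can be decoded by hand. The following is a minimal sketch, assuming the layout of the fixed-counter control register (IA32_FIXED_CTR_CTRL, MSR 0x38d) described in the Intel SDM: one 4-bit field per fixed counter, with bit 0 enabling counting in ring 0, bit 1 enabling counting in user mode, bit 2 selecting any-thread counting, and bit 3 enabling the overflow interrupt. The function names are illustrative and not taken from the kernel.

```python
# Sketch: decode a raw IA32_FIXED_CTR_CTRL (MSR 0x38d) value.
# Assumption: each fixed counter owns a 4-bit field, laid out as
# bit 0 = count in ring 0 (OS), bit 1 = count in user mode (USR),
# bit 2 = any-thread, bit 3 = interrupt on overflow (PMI).

def decode_fixed_ctr_ctrl(value: int, num_fixed: int = 3) -> list[dict]:
    """Split the register value into one decoded field per fixed counter."""
    fields = []
    for idx in range(num_fixed):
        field = (value >> (4 * idx)) & 0xF
        fields.append({
            "counter": idx,
            "os": bool(field & 0x1),
            "usr": bool(field & 0x2),
            "any_thread": bool(field & 0x4),
            "pmi": bool(field & 0x8),
        })
    return fields


def enabled_counters(value: int, num_fixed: int = 3) -> list[int]:
    """Indices of fixed counters that count in ring 0, user mode, or both."""
    return [f["counter"] for f in decode_fixed_ctr_ctrl(value, num_fixed)
            if f["os"] or f["usr"]]


if __name__ == "__main__":
    # 0x330: field 0 is 0x0, fields 1 and 2 are 0x3 (OS and USR both set).
    print(enabled_counters(0x330))
```

For the value 0x330 this reports fixed counters 1 and 2 as enabled in both ring 0 and user mode, with counter 0 left off, which agrees with the description above.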

The assumption is if someone is using them, then anything the kernel does
with them could be inaccurate.

My contacts with HP here tell me that if the power control is set to
OS, then the counters should be unused and not be set (and we have seen
that here at Red Hat).

There isn't much more I can say and I am not really motivated to walk
through all your BIOS options to verify everything. :-)

At least for RHEL kernels, there are supposed to be published HP
whitepapers detailing all this and what to do.

Cheers,
Don
