* Some problems with HP DL380 G8 BIOS and SLES11 SP3
From: Ulrich Windl @ 2014-08-13 15:22 UTC
To: linux-kernel

Hello!

Running the current SLES11 SP3 kernel on a HP DL380 G8 server, there are some kernel messages that indicate a bug either in the kernel or in the HP BIOS. Maybe someone can explain, so I can try to get it fixed by whichever party broke it...

Linux kernel is "3.0.101-0.35-default (geeko@buildhost) (gcc version 4.3.4 [gcc-4_3-branch revision 152973]" (latest).
HP server is "HP ProLiant DL380p Gen8, BIOS P70 02/10/2014" (latest)

During ACPI init I see:
[...]
Reserving 128MB of memory at 752MB for crashkernel (System RAM: 132095MB)
ACPI: RSDP 00000000000f4f00 00024 (v02 HP )
ACPI: XSDT 00000000bddaed00 000D4 (v01 HP ProLiant 00000002 322? 0000162E)
ACPI: FACP 00000000bddaee40 000F4 (v03 HP ProLiant 00000002 322? 0000162E)
ACPI Warning: Invalid length for Pm1aControlBlock: 32, using default 16 (20110413/tbfadt-611)
ACPI Warning: Invalid length for Pm2ControlBlock: 32, using default 8 (20110413/tbfadt-611)
ACPI: DSDT 00000000bddaef40 026DC (v01 HP DSDT 00000001 INTL 20030228)
ACPI: FACS 00000000bddac140 00040
ACPI: SPCR 00000000bddac180 00050 (v01 HP SPCRRBSU 00000001 322? 0000162E)
ACPI: MCFG 00000000bddac200 0003C (v01 HP ProLiant 00000001 00000000)
[...]

HPET id 0 under DRHD base 0xf4ffe000
BIOS requests to not use x2apic
Use 'intremap=no_x2apic_optout' to override BIOS request
Enabled IRQ remapping in xapic mode
x2apic not enabled, IRQ remapping is in xapic mode
Switched APIC routing to physical flat.
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz stepping 04
Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events, Broken BIOS detected, complain to your hardware vendor.
[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
Intel PMU driver.
... version: 3
... bit width: 48
... generic registers: 4
... value mask: 0000ffffffffffff
... max period: 000000007fffffff
... fixed-purpose events: 3
... event mask: 000000070000000f
NMI watchdog enabled, takes one hw-pmu counter.
Booting Node 0, Processors #1
[...]

pci0000:00: Requesting ACPI _OSC control (0x1d)
pci0000:00: ACPI _OSC request failed (AE_SUPPORT), returned control mask: 0x00
ACPI _OSC control for PCIe not granted, disabling ASPM
[...]

pci0000:20: Requesting ACPI _OSC control (0x1d)
pci0000:20: ACPI _OSC request failed (AE_SUPPORT), returned control mask: 0x00
ACPI _OSC control for PCIe not granted, disabling ASPM
[...]

Regards,
Ulrich
P.S. Please CC: me, as I'm not on LKML...
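For anyone who wants to check a saved boot log for the same complaint, here is a minimal sketch that is not part of the original report. It assumes the message format quoted in this message and that both numbers are printed in hexadecimal without a "0x" prefix; the function name find_pmu_firmware_bugs is hypothetical.

```python
import re

# Matches the "[Firmware Bug]" line exactly as it appears in the log quoted
# in this thread. Assumption: MSR address and value are both bare hex.
FIRMWARE_BUG_RE = re.compile(
    r"\[Firmware Bug\]: the BIOS has corrupted hw-PMU resources "
    r"\(MSR ([0-9a-fA-F]+) is ([0-9a-fA-F]+)\)"
)


def find_pmu_firmware_bugs(log_text):
    """Return a list of (msr_address, msr_value) pairs found in log_text."""
    return [
        (int(match.group(1), 16), int(match.group(2), 16))
        for match in FIRMWARE_BUG_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events\n"
        "[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)\n"
        "Intel PMU driver.\n"
    )
    for msr, value in find_pmu_firmware_bugs(sample):
        print(f"MSR {msr:#x} holds {value:#x}")
```

The text to scan could come from a saved copy of the kernel log; how that copy is obtained is left open here.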
* Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
From: Don Zickus @ 2014-08-14 17:46 UTC
To: Ulrich Windl; +Cc: linux-kernel

On Wed, Aug 13, 2014 at 05:22:17PM +0200, Ulrich Windl wrote:
> Hello!
>
> Running the current SLES11 SP3 kernel on a HP DL380 G8 server, there are some kernel messages that indicate a bug either in the kernel or in the HP BIOS. Maybe someone can explain, so I can try to get it fixed by whichever party broke it...
>
> Linux kernel is "3.0.101-0.35-default (geeko@buildhost) (gcc version 4.3.4 [gcc-4_3-branch revision 152973]" (latest).
> HP server is "HP ProLiant DL380p Gen8, BIOS P70 02/10/2014" (latest)

Yes, it is because you are letting the firmware dynamically control your
cpu frequency. In order to accomplish that, they need to use a perf counter
or two, hence the conflict. Set the firmware setting to OS control and the
problem goes away. Contact HP for those instructions, they are very aware
of this problem and recommend OS control to all high end servers.

Cheers,
Don
* Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
From: Ulrich Windl @ 2014-08-18 6:12 UTC
To: Don Zickus; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3912 bytes --]

Don Zickus <dzickus@redhat.com> wrote on 14.08.2014 at 19:46 in message <20140814174658.GV49576@redhat.com>:
> Yes, it is because you are letting the firmware dynamically control your
> cpu frequency. In order to accomplish that, they need to use a perf counter
> or two, hence the conflict. Set the firmware setting to OS control and the
> problem goes away. Contact HP for those instructions, they are very aware
> of this problem and recommend OS control to all high end servers.

Hi!

Thanks for answering, but the BIOS has set power management to "OS control" (see attachment). So I guess it must be something different.

Regards,
Ulrich

[-- Attachment #2: DL380G8-Power.JPG --]
[-- Type: image/jpeg, Size: 73517 bytes --]
* Re: Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
From: Don Zickus @ 2014-08-18 12:44 UTC
To: Ulrich Windl; +Cc: linux-kernel

On Mon, Aug 18, 2014 at 08:12:44AM +0200, Ulrich Windl wrote:
> Hi!
>
> Thanks for answering, but the BIOS has set power management to "OS control" (see attachment). So I guess it must be something different.

Hmm, sounds like it. Regardless, the error message indicates the counters
are in use, most likely by the BIOS. So you can ask HP what is going on.

I assume this is a normal bootup and not a kdump crash kernel, correct?

Cheers,
Don
* Re: Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
From: Ulrich Windl @ 2014-08-18 13:48 UTC
To: Don Zickus; +Cc: linux-kernel

Don Zickus <dzickus@redhat.com> wrote on 18.08.2014 at 14:44 in message <20140818124404.GL49576@redhat.com>:
> Hmm, sounds like it. Regardless, the error message indicates the counters
> are in use, most likely by the BIOS. So you can ask HP what is going on.
>
> I assume this is a normal bootup and not a kdump crash kernel, correct?

Yes, it's a normal boot. I'm afraid the standard hardware support at HP does not care much about such issues (I remember those Xeon bugs that caused memory errors during longer idle phases (in the G7 server) that are fixed by recent microcode updates: HP changed memory modules, and they changed the board, but it took very long until they updated the BIOS).

Is there any more information I can provide to narrow down the problem?

Regards,
Ulrich
* Re: Antw: Re: Some problems with HP DL380 G8 BIOS and SLES11 SP3
From: Don Zickus @ 2014-08-18 15:02 UTC
To: Ulrich Windl; +Cc: linux-kernel

On Mon, Aug 18, 2014 at 03:48:00PM +0200, Ulrich Windl wrote:
> Yes, it's a normal boot. I'm afraid the standard hardware support at HP does not care much about such issues (I remember those Xeon bugs that caused memory errors during longer idle phases (in the G7 server) that are fixed by recent microcode updates: HP changed memory modules, and they changed the board, but it took very long until they updated the BIOS).
>
> Is there any more information I can provide to narrow down the problem?

Not really.. see below..

<snip>

> >> >> ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> >> >> CPU0: Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz stepping 04
> >> >> Performance Events: PEBS fmt1+, 16-deep LBR, IvyBridge events, Broken BIOS detected, complain to your hardware vendor.
> >> >> [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)

What happens here is that we walk the PMU counters to see if any of them is
already enabled. And sure enough, the fixed counters (MSR 38d) have counters
1 and 2 enabled (value 330) before the kernel even touches them. The
assumption is that if someone else is using them, then anything the kernel
does with them could be inaccurate.

My contacts with HP here tell me that if the power control is set to OS,
then the counters should be unused and not be set (and we have seen that
here at Red Hat).

There isn't much more I can say and I am not really motivated to walk
through all your BIOS options to verify everything. :-)

At least with RHEL kernels, there are supposed to be published HP
whitepapers detailing all this and what to do.

Cheers,
Don
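The reading of "MSR 38d is 330" given in this message can be reproduced with a few bit operations. This is a minimal sketch, not the kernel's own check: it assumes the IA32_FIXED_CTR_CTRL layout from the Intel SDM (four control bits per fixed counter: bit 0 enables counting in ring 0, bit 1 in user mode, bit 2 is AnyThread, bit 3 requests a PMI on overflow). The helper names are hypothetical.

```python
# MSR address named in the "[Firmware Bug]" line (IA32_FIXED_CTR_CTRL).
FIXED_CTR_CTRL_MSR = 0x38D


def decode_fixed_ctr_ctrl(value, num_counters=3):
    """Split the register value into one 4-bit control field per counter."""
    fields = []
    for index in range(num_counters):
        nibble = (value >> (4 * index)) & 0xF
        fields.append(
            {
                "counter": index,
                "os": bool(nibble & 0x1),          # count in ring 0
                "usr": bool(nibble & 0x2),         # count in user mode
                "any_thread": bool(nibble & 0x4),  # AnyThread bit
                "pmi": bool(nibble & 0x8),         # interrupt on overflow
            }
        )
    return fields


def enabled_counters(value, num_counters=3):
    """Return the indices of fixed counters that count in any ring."""
    return [
        field["counter"]
        for field in decode_fixed_ctr_ctrl(value, num_counters)
        if field["os"] or field["usr"]
    ]


if __name__ == "__main__":
    # Value reported in the boot log of this thread.
    print(enabled_counters(0x330))  # prints [1, 2]
```

With three fixed counters, as reported by "fixed-purpose events: 3" in the log, the value 330 leaves counter 0 disabled and enables counters 1 and 2 for both ring 0 and user mode, which matches the description above.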
Thread overview: 6+ messages
2014-08-13 15:22 Some problems with HP DL380 G8 BIOS and SLES11 SP3 Ulrich Windl
2014-08-14 17:46 ` Don Zickus
2014-08-18  6:12   ` Antw: " Ulrich Windl
2014-08-18 12:44     ` Don Zickus
2014-08-18 13:48       ` Ulrich Windl
2014-08-18 15:02         ` Don Zickus